21 Fluorescent Protein-Based DNA Staining Dyes

Fluorescent protein–DNA-binding peptides or proteins (FP-DBP) are a powerful means to stain and visualize large DNA molecules on a fluorescence microscope. Here, we constructed 21 kinds of FP-DBPs using various colors of fluorescent proteins and two DNA-binding motifs. From the database of fluorescent proteins (FPbase.org), we chose bright FPs, such as RRvT, tdTomato, mNeonGreen, mClover3, YPet, and mScarlet, which are four to eight times brighter than original wild-type GFP. Additionally, we chose other FPs, such as mOrange2, Emerald, mTurquoise2, mStrawberry, and mCherry, for variations in emitting wavelengths. For DNA-binding motifs, we used HMG (high mobility group) as an 11-mer peptide or a 36 kDa tTALE (truncated transcription activator-like effector). Using 21 FP-DBPs, we attempted to stain DNA molecules and then analyzed fluorescence intensities. Most FP-DBPs successfully visualized DNA molecules. Even with the same DNA-binding motif, the order of FP and DBP affected DNA staining in terms of brightness and DNA stretching. The DNA staining pattern by FP-DBPs was also affected by the FP types. The data from 21 FP-DBPs provided a guideline to develop novel DNA-binding fluorescent proteins.


Introduction
Fluorescent proteins (FPs) have been essential molecular reporters for microscopic visualization in biochemical and cellular applications since their first application in 1994 [1]. FPs have been applied to monitor gene expression, protein localization, protein dynamics, and protein-DNA interactions from the molecular to the cellular level [2]. FPs have distinct advantages: they are easy to use and inexpensive. Most FP genes can be obtained through a non-profit plasmid repository, AddGene [3]. Using genetic engineering tools, an FP gene can be placed at either the N- or C-terminus of a gene of interest [4]. Then, the expression vector can be transported into a host cell with an organelle-targeting signal sequence, which allows the expression of FPs in vivo. Moreover, FP genes can be readily mutated to improve various features, such as excitation and emission wavelengths, brightness, pKa, maturation, lifetime, and photostability. Since the first success of GFP mutation in 1994 [5], many new and improved FPs have been genetically engineered. To date, the FP database (FPbase.org) lists around 800 FPs, and the number is still increasing [6]. Each newly developed mutant FP has improved features. For example, emission wavelengths have been expanded from the green emission at 509 nm [7] to a range spanning 424 nm [8] to 1000 nm [9]. Despite these advantages, FPs also have disadvantages. A primary limitation of FPs is low stability, especially photostability, attributed to protein properties such as denaturation or degradation. Therefore, there have been efforts to solve this problem through genetic engineering. Recently, a highly stable green fluorescent protein, StayGold, was reported to overcome the durability limitation [10].
Figure 1 shows the 21 recombinant FP-DBPs and λ DNA (48.5 kb) stained with them. In our previous studies, HMG and tTALE were linked only with eGFP and mCherry [34,35].
To enhance brightness and expand the range of emission wavelengths, we constructed 21 plasmids from 14 FP genes obtained from AddGene. In our previous study, we linked HMG peptides at both the N- and C-termini of an FP [34], but in this study we connected HMG at the N-terminus of the FP to simplify the cloning process. N-terminal linkage was successful in most cases, as shown in Figure 1a. When HMG-FP could not be expressed, HMG was instead fused to the C-terminus of the FP. For example, when HMG was linked at the N-terminus (HMG-tdTomato), the color of tdTomato was not obtained, whereas the C-terminal fusion (tdTomato-HMG) was expressed successfully. Since tTALE is 36 kDa, it was always linked to the N-terminus of eleven FPs. After forming the constructs, we screened their expression in E. coli BL21. Since tTALE has a complex structure, expression of FPs linked to tTALE was generally challenging, particularly when the FP oligomerization state was not monomeric (Table 1).
Figure 1. [...] [36]. DNA-binding protein motifs are colored black and were aligned to DNA using PDB 2EZD (a) and PDB 4OTO (b) in the software PyMOL.
Table 1 shows the 14 FPs that we used for FP-DBPs and three reference FPs: AausFP1, eGFP, and avGFP. We selected FPs that were as bright as possible. Since AausFP1 is the brightest, we also attempted to construct HMG-AausFP1. Unfortunately, HMG-AausFP1 had a high background, which made it difficult to visualize DNA backbones. Further, HMG-AausFP1 aggregated on the DNA backbone to form bright spots instead of continuous lines. However, we confirmed that AausFP1 was bright when fused with a different protein. For tTALE, monomeric FPs were more likely to be expressed, probably because tTALE is large. This limited expression with AausFP1, RRvT, and tdTomato, which are expected to be brighter than monomeric FPs.

Brightness Optimization of FP-DBPs
The brightness of an FP is critical for single-molecule observation. The brighter an FP is, the smaller the amount of the FP required for detection. We referred to a fluorescent protein database (FPbase.org) for the selection. However, the database does not account for how brightness changes when an FP is coupled to a DBP. Further, the microscope set-up, including the excitation and emission filters, affects the measured intensity of an FP. In addition, the number of FP-DBPs per unit length of DNA affects the fluorescence intensity measured in the image. Given these concerns, we measured the fluorescence intensity of HMG-FPs and tTALE-FPs and compared them to the brightness values reported in the FP database (Figure 2). To measure brightness from the microscopic images, we selected 60 different DNA molecules and measured the brightest fluorescence intensity within each DNA molecule with ImageJ software. [...] (Figure 2b) were brightest. The dashed line, calculated for a linear relationship between FP brightness and FP-DBP fluorescence intensity, can serve as a general guideline. FPs linked to HMG correlated better than FPs linked to tTALE. Because tTALE is larger than HMG, it is more likely to affect the brightness of an FP, and the weaker correlation of the tTALE constructs reflects this. Surprisingly, a few constructs, such as HMG-Emerald, tTALE-mTurquoise2, and tTALE-mKO2, were brighter than expected from the FP database. The brightness jump of tTALE-mTurquoise2 is more prominent when compared to HMG-mTurquoise2 (Figure 2b). A weak dimer, HMG-YPet, and a tandem dimer, tdTomato-HMG, were dimmer. However, tTALE-YPet was one of the brightest tTALE-FPs. Therefore, we concluded that the intensity of an FP-DBP generally depends on the FP brightness, but there are many exceptions. A plausible explanation is that the brightness of the FP was affected by protein folding.
An FP that is linked after a protein motif can sometimes be affected by how well the protein motif folds, and we suspect that tTALE-mTurquoise2 might have benefited from this during folding. To explain this observation, we measured fluorescence spectra using mNeonGreen-conjugated FP-DBPs. Figure 2c demonstrates two important factors that change brightness. First, the two different DBPs affect the fluorescence intensity and spectrum shape. Second, DNA-bound FP-DBPs have enhanced and slightly different fluorescence spectra. Figure 3 demonstrates the positional effects between FP and DBP. RRvT-HMG was about 10% brighter than HMG-RRvT, although the difference was within the error range, even though the amino acid compositions were essentially the same (Figure 3a). Figure 3b compares the effect of FP position on DNA-staining behavior. For HMG-mNeonGreen, DNA was aggregated instead of stretching on a positively charged surface. mNeonGreen-HMG, on the other hand, allowed DNA to be well stretched. These observations indicate that the relative placement of an FP and a DBP can affect both the brightness of the FP and the DNA-staining behavior.
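The linear guideline between database brightness and measured FP-DBP intensity discussed above can be sketched as an ordinary least-squares fit. The following is a minimal illustration, not the analysis used for Figure 2: the function names `linear_guideline` and `above_guideline` are hypothetical, and the numbers in the usage section are placeholders rather than values from this study.

```python
from math import sqrt


def linear_guideline(brightness, intensity):
    """Return (slope, intercept, pearson_r) of a least-squares line.

    brightness: database brightness values, one per construct
    intensity:  measured fluorescence intensities, same order
    """
    n = len(brightness)
    if n != len(intensity) or n < 2:
        raise ValueError("need two equal-length sequences with >= 2 points")
    mean_x = sum(brightness) / n
    mean_y = sum(intensity) / n
    sxx = sum((x - mean_x) ** 2 for x in brightness)
    syy = sum((y - mean_y) ** 2 for y in intensity)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(brightness, intensity))
    if sxx == 0 or syy == 0:
        raise ValueError("values must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy / sqrt(sxx * syy)


def above_guideline(names, brightness, intensity):
    """Names of constructs whose measured intensity lies above the fitted line."""
    slope, intercept, _ = linear_guideline(brightness, intensity)
    return [name for name, x, y in zip(names, brightness, intensity)
            if y > slope * x + intercept]


if __name__ == "__main__":
    # Placeholder numbers for illustration only; NOT data from the paper.
    names = ["construct-A", "construct-B", "construct-C", "construct-D"]
    db_brightness = [10.0, 20.0, 30.0, 40.0]
    measured = [12.0, 19.0, 35.0, 38.0]
    print(linear_guideline(db_brightness, measured))
    print(above_guideline(names, db_brightness, measured))
```

Fitting HMG-FPs and tTALE-FPs separately and comparing the two correlation coefficients would be one way to quantify the statement that HMG constructs follow the database brightness more closely.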
Molecules 2022, 27, x FOR PEER REVIEW

A/T-Rich Specific Staining by tTALE-FP
One of the important advantages of FP-DBP as a dye is its affinity toward a specific DNA sequence. Nonetheless, binding exclusively to a specific sequence at the single-molecule level has been a challenging task [37]. In addition to the optical mapping system described in the introduction, fluorescence in situ hybridization (FISH) has been a tool to obtain sequence information from metaphase chromosomes. Further, DNA points accumulation for imaging in nanoscale topography (DNA-PAINT) can enhance the resolution of DNA images [38,39]. However, hybridization methods rely on DNA melting to expose single-stranded DNA. In contrast, DNA-binding proteins can recognize a sequence without opening the double helix. Thus, we chose proteins to direct specific binding; the transcription activator-like effector (TALE) and the zinc-finger domain (ZnF), for instance, are known for their sequence-specific affinity. Yet, they are also known for false-positive binding [40]. In our previous report, we showed the A/T-specific binding of tTALE-FP by increasing salt concentrations [35]. These results led us to ask whether the A/T specificity would be affected by the FP that is linked to the DBP. To test this, we stained λ phage DNA with five tTALE-FPs: tTALE-Emerald, tTALE-mStrawberry, tTALE-YPet, tTALE-mNeonGreen, and tTALE-mTurquoise2. Since tTALE with Emerald and mCherry bound DNA with A/T specificity [35], tTALE-Emerald was used as a control for the experiments in this paper. Figure 4 shows DNA staining with the five tTALE-FPs at increasing salt concentrations. The DNA images of tTALE-Emerald showed standard A/T-specific staining of DNA. As the salt concentration increased, the 5′-end GC-rich region of the λ phage DNA was broadly destained. The other tTALE-FPs also stained the DNA with distinct patterns. tTALE-mTurquoise2, tTALE-YPet, and tTALE-mStrawberry stained with patterns resembling that of tTALE-Emerald.
Interestingly, the salt concentrations at which the A/T-rich regions appeared distinctly differed among the FPs. In particular, tTALE-mStrawberry caught our attention because its A/T-specific pattern did not appear until 100 mM of salt was applied. Such a strong interaction could help in developing a DNA staining dye that works under relatively high salt conditions. tTALE-mNeonGreen did not generate a typical tTALE-FP pattern. Instead, its pattern appeared to be related to TGTCTGT sites, which the truncated TALE is expected to bind according to the TALE binding rule.
To explain the differences in the DNA staining patterns of different tTALE-FPs, we compared protein sequences and structures. Figure 5a shows a phylogenetic tree generated by MEGA (molecular evolutionary genetic analysis) software. Emerald, mTurquoise2, and YPet are closely related. Figure 5b shows the overlapped structures of these three FPs, which are almost identical to one another. In contrast, mStrawberry and mNeonGreen were significantly apart from the three FPs. Figure 5c shows the overlapped structures of Emerald, mStrawberry, and mNeonGreen, which differ from one another. These differences may explain the different profiles of DNA-bound tTALE-FPs in Figure 4. When we traced the engineering history of the FPs, Emerald, mTurquoise2, and YPet originated from avGFP; mNeonGreen is a derivative of LanYFP; and mStrawberry is a derivative of DsRed.
The similarity among the avGFP derivatives was reflected in the similar salt concentrations at which their A/T-specific patterns appeared. Deviations in the FP sequence would alter how the FP folds after tTALE. If protein sequences are similar to each other, the effects of protein folding would most likely be similar as well.
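To relate a staining pattern to sequence, one can compare it with the local A/T content and with the positions of the TGTCTGT motif mentioned above. The sketch below is an illustrative assumption about how such a map could be computed from a DNA sequence string; the function names and the window size are hypothetical, searching both strands is a choice made here rather than something stated in the text, and no λ DNA sequence is bundled.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif(seq, motif="TGTCTGT"):
    """Sorted 0-based start positions of the motif on either strand.

    Hits on the reverse strand are reported at the forward-strand
    position where the motif's reverse complement begins.
    """
    seq = seq.upper()
    hits = []
    # A set avoids double counting when the motif is its own reverse complement.
    for pattern in {motif, reverse_complement(motif)}:
        start = seq.find(pattern)
        while start != -1:
            hits.append(start)
            start = seq.find(pattern, start + 1)
    return sorted(hits)


def at_fraction_windows(seq, window, step):
    """List of (start, A/T fraction) for windows sliding along the sequence."""
    seq = seq.upper()
    return [
        (i, sum(base in "AT" for base in seq[i:i + window]) / window)
        for i in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Short made-up sequence for illustration; not part of the λ genome.
    demo = "GGCCAATGTCTGTAATTTTAAGGCC"
    print(find_motif(demo))
    print(at_fraction_windows(demo, window=5, step=5))
```

For a real comparison, the window would be chosen to match the optical resolution of the stretched DNA image, which is not specified here.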
In this paper, we developed and characterized 21 FP-DBPs to increase brightness and optimize staining properties. The 21 FP-DBPs were used to stain DNA molecules and to analyze fluorescence intensities. HMG was used to demonstrate the importance of the order in which FP and DBP are linked. Using tTALE-FP, we found that the specificity of tTALE differed depending on the FP. Understanding this interaction between FP and DBP can be a stepping stone toward the engineering and optimization of protein-based DNA staining dyes. These findings will also help genetic engineering efforts to expand DNA staining dyes toward multi-color staining.

Chemicals
DNA primers and oligonucleotides were purchased from COSMOGENETH (Seoul, Korea). Biotin-labeled DNA oligomer, 1 kb, and 100 bp ladder were purchased from Bi-

Molecular Cloning
Using standard subcloning procedures, HMG-FP sequences were inserted into the pET-15b vector using NdeI and BamHI restriction sites, and the constructs were transformed into E. coli BL21 (DE3). For the tTALE-FPs, FP sequences were inserted into the same vector using XmaI, instead of NdeI, together with BamHI. A single colony of the transformed cells was inoculated into fresh LB medium containing ampicillin and incubated for 1 h. After the transformed cells were saturated, they were subsequently cultured to an optical density of ~0.4 at 37 °C with the corresponding antibiotics. Expression of the FP-tagged proteins was induced with IPTG at a final concentration of 1 mM overnight on a shaker at 20 °C and 150 rpm. The cells for protein purification were harvested by centrifugation at 10,000× g for 10 min (subsequent centrifugations were performed under similar conditions), and the residual medium was washed away using the cell lysis buffer (50 mM Na2HPO4, 300 mM NaCl, 10 mM imidazole, pH 8.0). The cells were lysed by ultrasonication for 30 min, and the cell debris was removed by centrifugation at 13,000 rpm for 10 min at 4 °C. The His-tagged FP-DBPs were purified by affinity chromatography with Ni-NTA agarose resin. The mixture of crude extract and resin was kept on a shaking platform at 4 °C for 1.5 h. The lysate containing the protein-bound Ni-NTA agarose resin was loaded onto a column for gravity chromatography and rinsed several times with the Ni-NTA wash buffer (50 mM Na2HPO4, 300 mM NaCl, 20 mM imidazole, pH 8.0). Finally, the bound proteins were eluted using the elution buffer (50 mM Na2HPO4, 300 mM NaCl, 250 mM imidazole, pH 8.0). All proteins were diluted to 10 µg mL⁻¹ in 50% w/w glycerol/1× TE buffer (10 mM Tris, 1 mM EDTA, pH 8.0).

Polydimethylsiloxane (PDMS) Microfluidic Devices
A standard rapid prototyping method was used to create PDMS microfluidic devices for DNA elongation and deposition on a positively charged surface [51]. Briefly, the patterns for microchannels (2.3 µm high and 100 µm wide) were fabricated on a silicon wafer using SU-8 2005 photoresist (Microchem, Newton, MA, USA). The PDMS pre-polymer mixed with a curing agent (10:1 weight ratio) was cast on the patterned wafer and cured at 65 °C for four hours or longer. Cured PDMS was peeled off from the patterned wafer, and the PDMS devices were treated in an air plasma generator (Femto Science Cute Basic, Hwaseong, Korea) for 1 min at 100 W to make the PDMS surface hydrophilic. The PDMS devices were punctured to form an inlet and an outlet. The devices were stored in water before use.

Positively-Charged Surface Preparation
Glass coverslips were stacked in a Teflon rack, soaked in a piranha etching solution (30:70 v/v H2O2/H2SO4) for 3 h, and rinsed with deionized water until the pH reached neutral (pH 7). For the glass surfaces, 350 µL of N-trimethoxymethylsilylpropyl-N,N,N-trimethylammonium chloride in 50% methanol was mixed with 200 mL of deionized water. The acid-cleaned glass coverslips were incubated in this solution for 12 h at 65 °C. Then, they were rinsed with ethanol three times. The surfaces were stored in ethanol before use.

Microscopy
The microscopy system consisted of an inverted microscope (Olympus IX70, Tokyo, Japan) equipped with a 100× Olympus UPlanSApo oil immersion objective lens and illuminated by an LED light source (SOLA SM II light engine, Lumencor, Beaverton, OR, USA). The light was passed through the corresponding filter sets (Table 2) to excite the fluorescent dye. Fluorescence images were captured using a scientific-grade complementary metal-oxide-semiconductor digital camera (2048 × 2048, Prime sCMOS Camera, Photometrics, Tucson, AZ, USA) and stored in 16-bit TIFF format generated by the software Micromanager. ImageJ was utilized for image processing, particularly to stitch the microchannel images (Figure 2b). DNA molecules (15 ng µL⁻¹) were mixed 1:1 with FP-DBPs (5-100 nM) in 1× TE (10 mM Tris, 1 mM EDTA, pH 8). The DNA and FP-DBPs were incubated at room temperature for 10 min. The mixture was loaded on a positively charged surface and allowed to spread between the surface and a glass slide. From the fluorescence images, we randomly selected 60 DNA molecules and measured the brightest fluorescence intensity within each DNA molecule. The measurement was performed with ImageJ software after background subtraction. The 60 values were then averaged to give one fluorescence intensity value per FP-DBP.
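The intensity measurement described above (brightest background-subtracted value per molecule, averaged over the selected molecules) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the ImageJ procedure itself: each molecule is assumed to be available as a list of pixel values along its backbone, the background is assumed to be a single number per molecule, and the standard deviation is an added convenience that the text does not mention. The names `peak_intensity` and `summarize_peaks` are hypothetical.

```python
from statistics import mean, stdev


def peak_intensity(profile, background):
    """Brightest background-subtracted pixel value along one DNA molecule."""
    if not profile:
        raise ValueError("profile must contain at least one pixel value")
    return max(value - background for value in profile)


def summarize_peaks(profiles, backgrounds):
    """Mean and sample standard deviation of per-molecule peak intensities."""
    peaks = [peak_intensity(p, b) for p, b in zip(profiles, backgrounds)]
    spread = stdev(peaks) if len(peaks) > 1 else 0.0
    return mean(peaks), spread


if __name__ == "__main__":
    # Made-up 16-bit pixel values for two molecules; illustration only.
    profiles = [[100, 150, 120], [200, 210, 190]]
    backgrounds = [90, 100]
    print(summarize_peaks(profiles, backgrounds))
```

In the protocol above, 60 such profiles would be passed in for each FP-DBP.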

DNA Imaging on a Positively Charged Surface with Microchannels
DNA molecules (60 ng µL⁻¹) were mixed 1:1 with FP-DBPs (20-70 nM) in 1× TE (10 mM Tris, 1 mM EDTA, pH 8) with NaCl. Before mixing, the NaCl concentration was twice the target concentration. The mixture was incubated on ice for 10 min and then diluted about 200-fold with 1× TE containing NaCl at the target concentration. The PDMS device was washed with ethanol and water. Once the device had dried, it was placed on a positively charged surface. The diluted mixture was loaded through the inlet hole of the device.
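The concentrations that result from the mixing and dilution steps above can be checked with simple arithmetic. The sketch below rests on assumptions that the text does not state explicitly: the quoted DNA and FP-DBP concentrations are taken as stock concentrations before mixing, one of the two equal volumes is assumed to carry NaCl at twice the target while the other carries none, and the dilution is taken as exactly 200-fold. The function name `staining_mix` is hypothetical.

```python
def staining_mix(dna_ng_per_ul, protein_nm, target_nacl_mm, dilution_factor=200):
    """Final concentrations after 1:1 mixing followed by dilution.

    Assumes one stock carries NaCl at 2x the target and the other none,
    and that the diluent already contains NaCl at the target concentration.
    """
    # Equal-volume mixing halves every solute concentration.
    nacl_after_mix = (2 * target_nacl_mm + 0.0) / 2
    dna_after_mix = dna_ng_per_ul / 2
    protein_after_mix = protein_nm / 2
    # Diluting into buffer at the target NaCl leaves the salt unchanged.
    return {
        "nacl_mM": nacl_after_mix,
        "dna_ng_per_uL": dna_after_mix / dilution_factor,
        "protein_nM": protein_after_mix / dilution_factor,
    }


if __name__ == "__main__":
    # 60 ng/µL DNA and 20 nM FP-DBP stocks, 100 mM target NaCl.
    print(staining_mix(60, 20, 100))
```

Under these assumptions, the salt stays at the target value throughout while DNA and protein are diluted 400-fold overall relative to their stocks.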

Fluorescence Intensity Measurement with the Fluorometer
FP-DBP (1 µM) was measured either alone or mixed with 250 pM (7.88 ng µL⁻¹) of λ phage DNA in 10 mM phosphate buffer (pH 8.0). When the FP-DBP samples were mixed with DNA, they were incubated on ice for 10 min before measurement. Using a Hitachi F-7000 fluorometer, emission spectra were collected at each fluorescent protein's maximum excitation wavelength.
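The equivalence between 250 pM and 7.88 ng µL⁻¹ of λ DNA can be reproduced with a short conversion. This is a sketch assuming an average mass of 650 g mol⁻¹ per base pair (a common approximation that is not stated in the text) and the 48.5 kb length given earlier; the function names are hypothetical. The second function estimates how many base pairs are available per FP-DBP at the stated concentrations.

```python
AVG_BP_MASS = 650.0   # g/mol per base pair; common approximation (assumption)
LAMBDA_BP = 48_500    # λ DNA length, from the 48.5 kb stated in the text


def molar_to_ng_per_ul(molar, length_bp=LAMBDA_BP, bp_mass=AVG_BP_MASS):
    """Convert a dsDNA molar concentration (mol/L) to ng/µL."""
    grams_per_litre = molar * length_bp * bp_mass
    return grams_per_litre * 1000.0  # 1 g/L equals 1000 ng/µL


def base_pairs_per_protein(dna_molar, protein_molar, length_bp=LAMBDA_BP):
    """Number of DNA base pairs per FP-DBP molecule in the mixture."""
    return dna_molar * length_bp / protein_molar


if __name__ == "__main__":
    print(round(molar_to_ng_per_ul(250e-12), 2))            # 7.88
    print(round(base_pairs_per_protein(250e-12, 1e-6), 3))  # 12.125
```

With 1 µM FP-DBP and 250 pM λ DNA, this gives roughly one protein per 12 base pairs, assuming both stated values are final concentrations in the cuvette.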