Bioengineering of Genetically Encoded Gene Promoter Repressed by the Flavonoid Apigenin for Constructing Intracellular Sensor for Molecular Events

In recent years, Synthetic Biology has emerged as a new discipline where functions that were traditionally performed by electronic devices are replaced by “cellular devices”; genetically encoded circuits constructed of DNA that are built from biological parts (aka bio-parts). The cellular devices can be used for sensing and responding to natural and artificial signals. However, a major challenge in the field is that the crosstalk between many cellular signaling pathways use the same signaling endogenous molecules that can result in undesired activation. To overcome this problem, we utilized a specific promoter that can activate genes with a natural, non-toxic ligand at a highly-induced transcription level with low background or undesirable off-target expression. Here we used the orphan aryl hydrocarbon receptor (AHR), a ligand-activated transcription factor that upon activation binds to specific AHR response elements (AHRE) of the Cytochrome P450, family 1, subfamily A, polypeptide 1 (CYP1A1) promoter. Flavonoids have been identified as AHR ligands. Data presented here show the successful creation of a synthetic gene “off” switch that can be monitored directly using an optical reporter gene. This is the first step towards bioengineering of a synthetic, nanoscale bio-part for constructing a sensor for molecular events.


Introduction
Recent innovations in Synthetic Biology, including dramatic reductions in the cost of DNA synthesis and sequencing [1,2], have opened many new possibilities for using biological material as a replacement for traditional "electronic devices". As such, biological components can be combined with electronic biosensors for miniaturing, and to improve integration with target tissue. For such integration to occur, a new set of biological parts (or "bio-parts") should be developed. These bio-parts are required to be highly specific and have the ability to communicate with electronics. The input and output is commonly achieved by light [3][4][5][6][7], chemicals [8,9], or electromagnetic irradiation [10][11][12][13][14][15].
The idea that genes could be activated or suppressed was first pioneered in prokaryotic cells due to their ease and simplicity, but gene switches have since been developed for eukaryotic cells as well. Gene switches are natural or synthetic systems that allow initiation, interruption, or termination of target gene expression [16]. Many of the gene switches are based on gene promoters. Promoters consist of DNA sequences usually preceding the gene open reading frame which are essential for gene regulation. There are only a switches are based on gene promoters. Promoters consist of DNA sequences usually preceding the gene open reading frame which are essential for gene regulation. There are only a few examples that stand out for the use of promoters in mammalian cells as switches, which are activated by metabolites [17], ions [18], optogenetics [19] or miRNA [20], but essentially the most effective gene switch is tetracycline-dependent repressor (TetR) [21]. This promoter/repressor system is the most ubiquitous system that is used in mammalian cells. Mainly because it has no crosstalk with other signaling pathways it does not activate any other gene, and consequently has no cellular side effect. In this study, we sought to expand the synthetic promotor toolbox available as bio-parts for future design and construction of bio-electronic hybrid devices.
The aryl hydrocarbon receptor (AHR) is a ligand-activated transcription factor that serves as a sensor of developmental and environmental signals [22], but whose endogenous activators and their roles are poorly understood [23]. Inactivated, the AHR resides in the cytosol bound to inhibitory cofactors, including heat shock protein 90 (hsp90) and aryl hydrocarbon receptor interacting protein (AIP) (Figure 1). Upon activation by a ligand, the cytosolic AHR undergoes a conformational change that frees it from the inhibitory cofactors [24]. This allows the AHR-ligand complex to translocate into the nucleus and associate with the aryl hydrocarbon receptor nucleus translocator (ARNT) protein [25]. This complex then binds to specific DNA sequences on target genes called AHR response elements (AHRE) and is known to activate the transcription of the Cytochrome P450, family 1, subfamily A, polypeptide 1 (CYP1A1) by binding specifically to its upstream promoter. Aryl hydrocarbon receptor (AHR)/Cytochrome P450, family 1, subfamily A, polypeptide 1 (CYP1A1) pathway diagram. Ligand (e.g. apigenin) enters the cell and binds to the receptor complex (AHR, interacting protein (AIP) and heat shock protein 60 (hsp 60). The AHR-ligand complex translocates to the nucleus upon binding to AHR response elements (AHRE). Once in the nucleus, the complex binds to the CYP1A1 promotor (CYP1A1prom) and initiates the transcription and translation of the gene of interest (GOI).
Most known AHR ligands are toxins and xenobiotics, such as 2,3,7,8-tetrachlorodibenzo-B-dioxin (TCDD) [24]. Additionally, flavonoids, one of the most abundant phytochemicals, have been identified as weak AHR ligands [26]. Flavonoids are a class of polyphenolic secondary plant metabolites broadly found in fruits and vegetables. These compounds are generally recognized as health promoting, immuno-modulators and are components of traditional medicines and commercially-available nutraceuticals [27]. Most known AHR ligands are toxins and xenobiotics, such as 2,3,7,8-tetrachlorodibenzo-B-dioxin (TCDD) [24]. Additionally, flavonoids, one of the most abundant phytochemicals, have been identified as weak AHR ligands [26]. Flavonoids are a class of polyphenolic secondary plant metabolites broadly found in fruits and vegetables. These compounds are generally recognized as health promoting, immuno-modulators and are components of traditional medicines and commercially-available nutraceuticals [27].
This gene circuit, while complex, gives three accessible points of modification: the ligand, the receptor, and the promoter. In this study, we discussed our work on each, and how these come together to produce a sensitive synthetic gene switch which can be remotely suppressed, as well as future work for developing a sensitive activator. The applications of this cellular system are broad and include immune or stem cell imaging and tracking, in situ remote activation of gene expression in cells of interest as well as implementing Synthetic Biology in building bio-electronic hybrid sensors.

Plasmid Synthesis and Construction
A 2 Kb region upstream of the CYP1A1 gene is in the human genome part of the promoter region. This sequence was used as the experimental CYP1A1 promoter. To determine an optimal promoter sequence that will drive the expression of a reporter gene in response to flavonoids, a machine learning model, XGBoost [28], was trained using the AHR binding sites found in human breast cancer Michigan Cancer Foundation-7 (MCF-7) cells. The promoter region was cloned into pGlow TOPO TA expression plasmid (Invitrogen, catalog number: K483001) following the product's protocol. Additionally, the reporter gene green fluorescent protein (GFP) was replaced with mScarlet using the NEBuilder HiFi DNA Assembly kit (New England BioLabs catalog number: E5510S) to produce two different expression vectors.
The presence of the aryl hydrocarbon receptor (AHR) was performed using Western Blot analysis using the AHR Antibody (MA1-513) from TermoFisher according to the manufacturer protocol.

Transfection
Cells were seeded in a 24-well plate at a density of 10 5 cells/well and given overnight to properly adhere and subsequently transfected with either the experimental plasmid (pGlow-CYP1A1 prom or pGlow-CYP1A1 prom ::mScarlet), negative control plasmid (pcDNA3.1/V5-His-TOPO/lacZ), or no DNA using Lipofectamine 3000 Transfection Reagent (Invitrogen catalog number: L3000015) following the product protocol. Cells were incubated with the transfection mixture for 24 h.

Flavonoid Treatment
Transfected cells were then seeded in a 96-well plate at a density of 15,000 cells/well and given overnight to properly attach to the well bottom. Quercetin was purchased from Cayman Chemical (catalog number: 10005169) and dissolved in Dimethyl sulfoxide (DMSO) to a 220 mM stock solution (Sigma Aldrich catalog number: D8885-500G), apigenin and naringenin were purchased from Sigma and dissolved in DMSO as 20 mM and 100 mM stock solutions, respectively. Flavonoid stock solutions were kept at −20 • C. Flavonoids were diluted in cell media to achieve 100 µM working concentration, and then each flavonoid, in addition to a 0 µM control (DMSO), was added to each transfected cell group for 48 h. This produced 12 individual experimental groups.

Imaging
Following incubation, cells were washed with Phosphate-buffered saline (PBS; GIBCO catalog number: 10010023) and covered with Fluorobrite DMEM (GIBCO catalog number: A1896702). Images were taken on the Keyence BZ-X800 microscope using the BZ-X GFP filter for the GFP construct or the BZ-X Tetramethylrhodamine (TRITC) filter for the mScarlet construct.

Quantitative Analysis
Following imaging, quantitative analysis was done using the VictorNIVO (PerkinElmer) microplate reader. The settings of excitation: 560 nm, emission: 593 nm were used.

Structure Prediction, Docking and Validation
The amino acid sequence of target protein 4xt2 named as crystal structure of the high affinity ARNT C-terminal PAS domains in complex with a tetrazole-containing antagonist was taken from the Protein Data Bank database (https://www.rcsb.org/structure/4XT2, accessed on 27 April 2021) [29]. To generate a reliable 3d structure 4xt2_A was remodeled using i-tessar target-template alignments.
To improve the quality of predicted model of 4xt2_A, energy minimization was performed with the gromos 96 force field. This force field permits to evaluate the energy of the modelled structure as well as overhaul distorted geometries through energy minimization. All computations during energy minimization were done in vacuum system, without relative force fields.
To assess flavonoid binding to the AHR (PDB: 4xt2), binding sites were predicted in the 3D structure of AHR using the CASTp server [30]. The interaction energy between the AHR and the ligand was scored for evaluation using Autodock Vina Package in mgltools. Line diagrams of flavonoids in AHR binding site generated by LIGPLOT [31]. The diagrams depict the hydrogen-bond interaction patterns and hydrophobic contacts between the ligand(s) and the main-chain or side-chain elements of the protein.

Results and Discussion
A combined approach of molecular and computational biology was used to model and engineer a synthetic gene switch optimized in three ways: ligand, receptor, and promoter.

Identifying the Optimal Promoter Sequence
To determine an optimal promoter sequence that will drive the expression of a reporter gene in response to flavonoids, a machine learning model, XGBoost [28]. was trained on the AHR binding sites found in human breast cancer MCF-7 cells [32]. The resulting binding probabilities from 10 base pair segments containing the 'GCGTG' motif are displayed in Table 1. Screening of a 2Kb promoter region upstream of the transcription start site revealed 5 locations with putative binding probability of the AHR/ARNT complex greater than 0.5, suggesting that the complex bound to these locations would initiate transcription. This finding was validated by a computational model which shows AHR bound with ARNT form the transcription factor complex, visualized by UCSF Chimera Viewer ( Figure S1). Therefore, the 2Kb sequence of the promoter was determined as sufficient and was used for cloning.

Bioengineering an Optical Reporter System
To determine the compatibility of the gene switch with the host cells, we showed that the AHR protein is expressed in HeLa and HEK293FT cells, as confirmed by western blot analyses (Figure 2a). This step was crucial to ensuring that these cells could support this reporter system. Next, a synthetic DNA fragment encoding to the 2Kb region of the human CYP1A1 promoter was cloned into the pGlow plasmid upstream of the EGFP reporter gene. Since flavonoids show autofluorescence in the green spectrum, EGFP was then replaced with a red shifted reporter, mScarlet, via Gibson assembly. The plasmid map is depicted in Figure 2b.

Testing the Gene Switch Reporter System In Vivo
To test the gene switch reporter system in vitro, HeLa cells were chosen for their stronger adherence to the plate, which was important due to the need of rigorous washes. After optimization of the timeline parameters (Supplementary material), cells were transfected with the pGlow-CYP1A1prom::mScarlet plasmid for 24 h, then subsequently treated with 100 M of either quercetin, apigenin or naringenin for 48 hrs. Interestingly, the

Testing the Gene Switch Reporter System In Vivo
To test the gene switch reporter system in vitro, HeLa cells were chosen for their stronger adherence to the plate, which was important due to the need of rigorous washes. After optimization of the timeline parameters (Supplementary material), cells were transfected with the pGlow-CYP1A1 prom ::mScarlet plasmid for 24 h, then subsequently treated with 100 µM of either quercetin, apigenin or naringenin for 48 hrs. Interestingly, the pGlow-CYP1A1 prom ::mScarlet exhibits a basal level activation, which is eliminated by apigenin (Student's t-test; p < 0.01). Additionally, the signal was slightly enhanced by naringenin, and unchanged by quercetin, however, this was not found to be statistically significant. The control groups transfected with either pcDNA3.1/V5-His-TOPO/lacZ or No DNA exhibit no mScarlet expression, as shown in Figure 3. These findings indicate that among the three flavonoids tested, apigenin can be used successfully as a synthetic "off"-switch of gene expression.

Ligand/Receptor Binding Analysis
To assess flavonoid binding to the AHR (PDB: 4xt2), binding sites were predicted in the 3D structure of AHR using the CASTp server, as shown in Figure 4. The interaction energy between the AHR and the ligand was scored for evaluation

Ligand/Receptor Binding Analysis
To assess flavonoid binding to the AHR (PDB: 4xt2), binding sites were predicted in the 3D structure of AHR using the CASTp server, as shown in Figure 4.
The interaction energy between the AHR and the ligand was scored for evaluation using Autodock Vina Package in mgltools. Results are shown in Table 2. The 43L or racemic tetrazolo-tetrahydropyrimidines are a natural substrate that bind with lower energy score, revealing that they have higher binding affinity towards 4xt2 (AHR) than TCDD. Ligand TCDD had the lowest energy score (−5.7 Kcal/mol) after 43L and apigenin which signifies that it has the highest binding affinity towards the active site of 4xt2. The second and third best docked molecules were naringenin and quercetin which showed high binding affinity for the 4xt2.
The results described above revealed the ability to generate a synthetic gene switch using the CYP1A1/AHR pathway. This pathway was chosen because it is dormant in mature cells and is not activated by any known endogenous ligand, only by exogenous-natural or synthetic-compounds. Flavonoids were screened to identify strong activators and suppressors of this gene circuit to allow for sensitive control. Thus far, apigenin was found to be an efficient suppressor. Using computational analysis, we found potential sites on the promoter region of AHR for future improvements of this gene switch. Promoter mutants were simulated and screened for the optimal sequence which would maximize transcription factor-promoter binding. Ligand-receptor binding simulations were performed to identify key amino acids involved with binding in the ligand binding pocket of the AHR. These could be targets for protein mutagenesis to increase binding efficiency. Once all three prongs have been optimized, this fine-tuned synthetic gene switch could be utilized for vast imaging and therapeutic purposes.

Ligand/Receptor Binding Analysis
To assess flavonoid binding to the AHR (PDB: 4xt2), binding sites were predicted in the 3D structure of AHR using the CASTp server, as shown in Figure 4. The interaction energy between the AHR and the ligand was scored for evaluation using Autodock Vina Package in mgltools. Results are shown in Table 2. The 43L or racemic tetrazolo-tetrahydropyrimidines are a natural substrate that bind with lower energy score, revealing that they have higher binding affinity towards 4xt2 (AHR) than TCDD. Ligand TCDD had the lowest energy score (−5.7 Kcal/mol) after 43L and apigenin which signifies that it has the highest binding affinity towards the active site of 4xt2. The second and third best docked molecules were naringenin and quercetin which showed high binding affinity for the 4xt2.  While it is clear that we developed a reliable "off" system with apigenin, future optimization of the "on" system with naringenin is warranted. This would allow the sensitivity to be increased without compromising on the tightness of the "off" switch. This could be accomplished by adding elements such as enhancers, co-factors, and genetic mutations to increase ligand-receptor binding strength and duration. Another direction to explore is to develop the "on" system by optimizing for a different flavonoid, such as luteolin or quercetin, and reducing the basal level.
Here we used a concentration of 100 µM of flavonoids. While 100 µM is a relatively high concentration for dietary interventions and some cellular models, this concentration was selected among those found to successfully validate protein that are directly associated with apigenin in vivo [33]. Moreover, several studies have used 100 µM as part of cellular platforms [34][35][36][37]. Therefore, the use of 100 µM in this study is reasonable. In future studies, dose dependency experiments will be needed to optimize the system.
In this study we have used transient transfection to test the system. Transient transfections allow high expression of transgene for a relatively short time (1-3 days). However, as is evident from this study, not all of the cells are expressing the gene of interest, and there is variability in the expression levels between cells. For long term stability, viral transduction is required. This allows promoter-specific control of reporter genes [38,39]. Alternatively, CRISPR technologies can be used for integration of the transgene directly to the genome in a known location. This will ensure more homogenous expression.
Here, we investigated regulation of the CYP1A1 promoter by flavonoids and discovered that apigenin is a good candidate for promotor repression. Future investigations are required to elucidate the kinetics of this regulation, how long the ligand is efficient, and what the nature of the switch is. It is important to study the kinetics of this process in order to better fine-tune the system.
Our overall goal is to engineer a completely synthetic signal transduction system that will allow controlling gene expression specifically and at will, while eliminating any crosstalk with other cellular components. This is essential for developing the future generation of in vivo Synthetic Biology-hybrid electronic biosensor devices.