# Robust Normalization of Luciferase Reporter Data


## Abstract


## 1. Introduction

## 2. Results and Discussion

`R` code to analyze reporter assay data using EIV and REIV is available on GitHub (Section 3.4.3 and Section 3.4.4). A detailed protocol for using the code to analyze dual-Luciferase reporter data is provided in Section 3.6.

## 3. Materials and Methods

#### 3.1. Cell Culture

#### 3.2. Construction of the Luciferase Reporter Plasmid

`TGG CCT AAC TGG CCG GTA CCT GAG CTC GCT AGC CTC GAG AAC TCC TAC CCA CAG CCG CG` (Fwd) and `TCC ATG GTG GCT TTA CCA ACA GTA CCG GAT TGC CAA GCT TCA GCT TCG GGT CGC GAA TG` (Rev), which include 40 bp of sequence homologous to pGL4.10luc2. PCR amplification was carried out using Q5 High-Fidelity 2X Master Mix (NEB, M0492L) following the manufacturer’s instructions. The following PCR cycling conditions were used: initial denaturation of 30 s at 98 ${}^{\circ}$C; 30 cycles of 30 s at 98 ${}^{\circ}$C, 30 s at 60 ${}^{\circ}$C, and 60 s at 72 ${}^{\circ}$C; and a final extension for 10 min at 72 ${}^{\circ}$C. Gibson Assembly (GA) reactions [19] were carried out using 0.06 pmol of digested vector and 0.18 pmol of insert, for 60 min at 50 ${}^{\circ}$C. NEB high-efficiency competent cells (NEB, E5510S) were transformed according to the manufacturer’s instructions. See Repele et al. [4] for the construction of the reporter vectors having the wildtype and mutant enhancers.

#### 3.3. Transfection and Luciferase Assays

#### 3.4. Tested Normalization Methods

#### 3.4.1. Ratio

#### 3.4.2. Ordinary Least-Squares Regression

`lm` function of R.
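As an illustration of the OLS approach, the sketch below fits firefly against Renilla luminescence with `lm` and reads off the slope as the normalized activity. The data are simulated, and the variable names (`ren`, `luc`, `true_activity`) as well as the choice to fit an intercept are assumptions made for this example; they are not taken from the analysis code.

```r
# Simulated data (illustrative only): firefly signal proportional to Renilla
set.seed(1)
n <- 50
true_activity <- 10
ren <- runif(n, min = 10, max = 100)            # Renilla luminescence (RLU)
luc <- true_activity * ren + rnorm(n, sd = 20)  # firefly luminescence (RLU)

# OLS treats Renilla as error-free and minimizes vertical residuals only
fit <- lm(luc ~ ren)

# The slope is the estimate of the normalized activity
ols_slope <- unname(coef(fit)["ren"])
ols_ci <- confint(fit, "ren", level = 0.95)

print(ols_slope)
print(ols_ci)
```

Because OLS ignores measurement error in the Renilla signal, this sketch only adds noise to the firefly values; it is meant to show the mechanics of the fit rather than a realistic assay.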

#### 3.4.3. Errors-in-Variable Regression

`eiv` in the code on GitHub (https://github.com/mlekkha/LUCNORM).

#### 3.4.4. Robust Errors-in-Variable Regression

`R` package, we implemented it in `R` as follows.

`nloptr` package of R, with parameters `xtol_rel` and `maxeval` set to ${10}^{-7}$ and 1000, respectively.
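The optimizer settings above can be passed to `nloptr` through its `opts` list. The following is a minimal sketch under stated assumptions: the objective (`robust_scale`, the median absolute orthogonal residual), the derivative-free Nelder-Mead algorithm, and the simulated data are stand-ins chosen for illustration and are not the objective or algorithm used in `robusteiv`.

```r
library(nloptr)

# Simulated data with a few gross outliers (illustrative only)
set.seed(2)
n <- 40
ren <- runif(n, min = 10, max = 100)
luc <- 5 * ren + rnorm(n, sd = 15)
luc[1:3] <- luc[1:3] * 4

# Hypothetical robust objective: median absolute orthogonal distance
# of the points from the line luc = a + b * ren
robust_scale <- function(par) {
  a <- par[1]
  b <- par[2]
  r <- (luc - a - b * ren) / sqrt(1 + b^2)
  mad(r, center = 0)
}

# Start from the OLS estimate of (intercept, slope)
start <- unname(coef(lm(luc ~ ren)))

fit <- nloptr(
  x0 = start,
  eval_f = robust_scale,
  opts = list(
    algorithm = "NLOPT_LN_NELDERMEAD",  # assumed algorithm, derivative-free
    xtol_rel = 1e-7,                    # relative tolerance on the parameters
    maxeval = 1000                      # cap on objective evaluations
  )
)

print(fit$solution)    # fitted (intercept, slope)
print(fit$iterations)  # number of evaluations actually used
```

A derivative-free algorithm is used here because a median-based objective is not smooth; a different objective may call for a different algorithm.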

`R` package `boot`. A total of 999 replicates were subsampled using the ordinary simulation, and the function `boot.ci` was used to determine confidence intervals using the basic bootstrap method. REIV regression is implemented as the function `robusteiv` in the code on GitHub (https://github.com/mlekkha/LUCNORM).
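The bootstrap settings described above (999 replicates, ordinary simulation, basic interval) map onto the `boot` package as sketched below. The data are simulated, and the statistic `slope_stat` uses an OLS slope purely as a stand-in; the published analysis bootstraps the REIV estimate instead.

```r
library(boot)

# Simulated firefly/Renilla pairs (illustrative only)
set.seed(3)
n <- 30
dat <- data.frame(Ren = runif(n, min = 10, max = 100))
dat$Luc <- 10 * dat$Ren + rnorm(n, sd = 30)

# Statistic recomputed on each resample; idx holds the resampled row indices.
# Stand-in estimator: OLS slope (assumption, not the REIV estimator).
slope_stat <- function(d, idx) {
  unname(coef(lm(Luc ~ Ren, data = d[idx, ]))["Ren"])
}

# 999 replicates drawn by ordinary (case) resampling
b <- boot(data = dat, statistic = slope_stat, R = 999, sim = "ordinary")

# Basic bootstrap 95% confidence interval for the slope
ci <- boot.ci(b, conf = 0.95, type = "basic")
ci_limits <- ci$basic[4:5]  # lower and upper limits

print(b$t0)
print(ci_limits)
```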

#### 3.5. Generation of Simulated Data

#### 3.6. Procedure for Using REIV to Analyze Dual-Luciferase Reporter Data

#### 3.6.1. Installing Required `R` Packages

The following `R` packages are required:

- `boot`
- `parallel`
- `nloptr`
- `ggplot2`
- `reshape2`
- `RColorBrewer`

They can be installed with the `R` Package Installer or by using the following command on the `R` Console:

`install.packages("<package_name>")`
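To handle all of the required packages in one step, a sketch such as the following could be used. The variable names `required` and `missing_pkgs` are illustrative, and the `interactive()` guard is a design choice of this example so that nothing is downloaded when the script is run non-interactively.

```r
# Packages named in this section
required <- c("boot", "parallel", "nloptr", "ggplot2", "reshape2", "RColorBrewer")

# Determine which of them are not yet installed
# (parallel ships with base R, so it should never appear here)
missing_pkgs <- setdiff(required, rownames(installed.packages()))

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
  # Install only when working at the R Console
  if (interactive()) {
    install.packages(missing_pkgs)
  }
} else {
  message("All required packages are installed.")
}
```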

#### 3.6.2. Downloading REIV Code from GitHub and Loading It into `R`

- Download the code by visiting https://github.com/mlekkha/LUCNORM, clicking the `Clone or download` button and choosing `Download ZIP`.
- Uncompress the downloaded ZIP file.
- Set the working directory in `R` to the location of the downloaded code using `Misc → Change Working Directory…`
- Load the REIV analysis functions into the `R` workspace.

  `source("lucAnalysis.R")`

  `source("robusteiv.R")`
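The same loading steps can also be scripted without changing the working directory through the menu. The helper below, `load_reiv`, is a hypothetical convenience function written for this example (it is not part of the GitHub code); it checks that both files are present before sourcing them.

```r
# Hypothetical helper: source the two REIV files from a given folder
load_reiv <- function(code_dir) {
  files <- file.path(code_dir, c("lucAnalysis.R", "robusteiv.R"))

  # Fail early with a clear message if the folder is wrong
  not_found <- files[!file.exists(files)]
  if (length(not_found) > 0) {
    stop("Missing file(s): ", paste(not_found, collapse = ", "))
  }

  # source() evaluates in the global workspace by default;
  # chdir = TRUE temporarily switches to the folder of each file
  for (f in files) {
    source(f, chdir = TRUE)
  }
  invisible(files)
}

# Example call; the path is an assumption and depends on where the
# ZIP file was uncompressed:
# load_reiv("~/Downloads/LUCNORM-master")
```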

#### 3.6.3. Input Data Format

The `R` code accepts input data in a comma-separated values (CSV) file (see Supplementary Materials File S1 for an example) organized as follows (Table 2). The first and second columns are titled `Luc` and `Ren`, respectively, and contain the measured luminescences. The third column is titled `Construct` and contains the names of the firefly Luciferase constructs that were assayed. The remaining columns contain the names of the conditions and are titled accordingly. In our example data (Supplementary Materials File S1), there are two additional columns titled `OHT` and `Cytokine`, indicating whether the cells were induced with OHT and the name of the cytokine treatment, respectively.
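To make the layout concrete, the sketch below writes a small file in this format and reads it back. The luminescence values are made-up placeholders, and the coding of the `OHT` column as `"yes"`/`"no"` is an assumption; File S1 may use different labels.

```r
# Placeholder data in the described column order (values are invented)
example <- data.frame(
  Luc = c(1520, 1710, 310, 295),
  Ren = c(150, 165, 140, 132),
  Construct = c("Cebpa(7)", "Cebpa(7)", "Cebpa(0)", "Cebpa(0)"),
  OHT = c("yes", "yes", "no", "no"),       # assumed coding of induction
  Cytokine = c("GCSF", "GCSF", "IL3", "IL3")
)

# Write without row names so that Luc stays the first column
csv_path <- file.path(tempdir(), "example_input.csv")
write.csv(example, csv_path, row.names = FALSE)

# Read it back the same way the protocol reads the real input file
data <- read.csv(csv_path)
str(data)
```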

#### 3.6.4. Importing Input Data into the `R` Workspace and Preparing It

- In the `R` Console, read the file into a variable.

  `data <- read.csv("<input_file_name>")`

- (Optional) If any of the construct/condition names are numeric, they must be set as categorical variables. This may be accomplished as follows.

  `data$<variable_name> <- as.factor(data$<variable_name>)`

  For example:

  `data$Construct <- as.factor(data$Construct)`
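The reason for the optional step is that `read.csv` imports purely numeric names as numbers. A short sketch with invented values, in which the constructs are hypothetically named `0` and `7`:

```r
# Invented example in which construct names are numeric
raw <- data.frame(
  Luc = c(900, 950, 400, 420),
  Ren = c(100, 110, 95, 105),
  Construct = c(7, 7, 0, 0)
)
path <- tempfile(fileext = ".csv")
write.csv(raw, path, row.names = FALSE)

data <- read.csv(path)
print(class(data$Construct))  # numeric type, not a category

# Convert to a categorical variable so each value is treated as a name
data$Construct <- as.factor(data$Construct)
print(levels(data$Construct))
```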

#### 3.6.5. Normalizing and Saving Results

- Once the data have been saved in a variable, REIV can be used to compute normalized Luciferase activity and its confidence interval for each combination of construct and condition.

  `norm_activity <- calcSlopesCIs(<data_variable_name>, alpha = <alpha_value>, regmethod = robusteiv, cim = "boot_positive_ci", ignore = c("Luc", "Ren"))`

  Here,
  - `alpha` is the significance threshold. For example, setting `alpha` to $0.05$ will compute 95% confidence intervals.
  - `regmethod` is the normalization method. Use `regmethod = eiv` for EIV normalization.
  - `cim` is the method used for computing the confidence intervals. Use `cim = "gleser_ci"` for EIV normalization.
  - `ignore` specifies which columns/variables are not conditions. If there are other columns in the CSV file that are not conditions, such as annotations, they should be included in the `ignore` vector.

  Example usage:

  `norm_activity <- calcSlopesCIs(data, alpha = 0.05, regmethod = robusteiv, cim = "boot_positive_ci", ignore = c("Luc", "Ren"))`

  The output of `calcSlopesCIs` is a data frame with columns containing the normalized Luciferase activity (`slope`) and the lower and upper confidence limits (`ci_lower` and `ci_upper`). The values may be inspected with the `print` command.

  `print(norm_activity)`

- The normalized activities may be saved to a CSV file for further analysis or visualization using the `write.csv` function.

  `write.csv(<variable_name>, "<output_file_name>")`

  For example:

  `write.csv(norm_activity, "normalized_activities.csv")`
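As a sketch of downstream use, the example below saves a results table, reloads it, and expresses each activity relative to the proximal promoter construct, Cebpa(0). The data frame is a mock stand-in with invented numbers; only the column names `slope`, `ci_lower`, and `ci_upper` come from the description above, and the presence of a `Construct` column in the real output is an assumption.

```r
# Mock stand-in for the output of calcSlopesCIs (numbers are invented)
norm_activity <- data.frame(
  Construct = c("Cebpa(0)", "Cebpa(7)"),
  slope = c(1.0, 3.2),
  ci_lower = c(0.8, 2.6),
  ci_upper = c(1.2, 3.9)
)

# row.names = FALSE avoids an extra unnamed column of row numbers
out_path <- file.path(tempdir(), "normalized_activities.csv")
write.csv(norm_activity, out_path, row.names = FALSE)

# Reload and scale every activity by that of the reference construct
reloaded <- read.csv(out_path)
ref <- reloaded$slope[reloaded$Construct == "Cebpa(0)"]
reloaded$relative_activity <- reloaded$slope / ref

print(reloaded)
```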

## Supplementary Materials

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## Abbreviations

Abbreviation | Definition |
---|---|
PUER | PU.1 Estrogen Receptor |
IL3 | Interleukin 3 |
GCSF | Granulocyte colony stimulating factor |
OLS | Ordinary least-squares |
EIV | Errors-in-variables |
REIV | Robust errors-in-variables |

## References

1. Arnold, C.D.; Gerlach, D.; Stelzer, C.; Boryn, L.M.; Rath, M.; Stark, A. Genome-Wide Quantitative Enhancer Activity Maps Identified by STARR-seq. Science **2013**.
2. Whyte, W.A.; Orlando, D.A.; Hnisz, D.; Abraham, B.J.; Lin, C.Y.; Kagey, M.H.; Rahl, P.B.; Lee, T.I.; Young, R.A. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell **2013**, 153, 307–319.
3. Laslo, P.; Spooner, C.J.; Warmflash, A.; Lancki, D.W.; Lee, H.J.; Sciammas, R.; Gantner, B.N.; Dinner, A.R.; Singh, H. Multilineage transcriptional priming and determination of alternate hematopoietic cell fates. Cell **2006**, 126, 755–766.
4. Repele, A.; Krueger, S.; Bhattacharyya, T.; Tuineau, M.Y.; Manu. The regulatory control of Cebpa enhancers and silencers in the myeloid and red-blood cell lineages. PLoS ONE **2019**, 14, e0217580.
5. Stratowa, C.; Himmler, A.; Czernilofsky, A.P. Use of a luciferase reporter system for characterizing G-protein-linked receptors. Curr. Opin. Biotechnol. **1995**, 6, 574–581.
6. Savkur, R.S.; Bramlett, K.S.; Stayrook, K.R.; Nagpal, S.; Burris, T.P. Coactivation of the human vitamin D receptor by the peroxisome proliferator-activated receptor gamma coactivator-1 alpha. Mol. Pharmacol. **2005**, 68, 511–517.
7. Kato, M.; Sanada, M.; Kato, I.; Sato, Y.; Takita, J.; Takeuchi, K.; Niwa, A.; Chen, Y.; Nakazaki, K.; Nomoto, J.; et al. Frequent inactivation of A20 in B-cell lymphomas. Nature **2009**, 459, 712–716.
8. Jacobs, J.L.; Dinman, J.D. Systematic analysis of bicistronic reporter assay data. Nucleic Acids Res. **2004**, 32, e160.
9. Fan, F.; Wood, K.V. Bioluminescent assays for high-throughput screening. Assay Drug Dev. Technol. **2007**, 5, 127–136.
10. Minkovsky, A.; Sahakyan, A.; Bonora, G.; Damoiseaux, R.; Dimitrova, E.; Rubbi, L.; Pellegrini, M.; Radu, C.G.; Plath, K. A high-throughput screen of inactive X chromosome reactivation identifies the enhancement of DNA demethylation by 5-aza-2′-dC upon inhibition of ribonucleotide reductase. Epigenetics Chromatin **2015**, 8, 42.
11. Smale, S.T. Luciferase assay. Cold Spring Harb. Protoc. **2010**.
12. Figueiredo, M.S.; Brownlee, G.G. Cis-acting elements and transcription factors involved in the promoter activity of the human factor VIII gene. J. Biol. Chem. **1995**, 270, 11828–11838.
13. Walsh, J.C.; DeKoter, R.P.; Lee, H.J.; Smith, E.D.; Lancki, D.W.; Gurish, M.F.; Friend, D.S.; Stevens, R.L.; Anastasi, J.; Singh, H. Cooperative and antagonistic interplay between PU.1 and GATA-2 in the specification of myeloid cell fates. Immunity **2002**, 17, 665–676.
14. Dahl, R.; Walsh, J.C.; Lancki, D.; Laslo, P.; Iyer, S.R.; Singh, H.; Simon, M.C. Regulation of macrophage and neutrophil cell fates by the PU.1:C/EBPalpha ratio and granulocyte colony-stimulating factor. Nat. Immunol. **2003**, 4, 1029–1036.
15. Bertolino, E.; Reinitz, J.; Manu. The analysis of novel distal Cebpa enhancers and silencers using a transcriptional model reveals the complex regulatory logic of hematopoietic lineage specification. Dev. Biol. **2016**, 413, 128–144.
16. Casella, G.; Berger, R.L. Statistical Inference, 2nd ed.; Duxbury Press: Pacific Grove, CA, USA, 2001.
17. Zamar, R.H. Robust Estimation in the Errors-in-Variables Model. Biometrika **1989**, 76, 149–160.
18. Legraverend, C.; Antonson, P.; Flodby, P.; Xanthopoulos, K.G. High level activity of the mouse CCAAT/enhancer binding protein (C/EBP alpha) gene promoter involves autoregulation and several ubiquitous transcription factors. Nucleic Acids Res. **1993**, 21, 1735–1742.
19. Gibson, D.G.; Young, L.; Chuang, R.Y.; Venter, J.C.; Hutchison, C.A., 3rd; Smith, H.O. Enzymatic assembly of DNA molecules up to several hundred kilobases. Nat. Methods **2009**, 6, 343–345.

**Figure 1.** Example firefly and Renilla luminescence data from a myeloid cell line. Firefly luciferase was under the control of the Cebpa promoter, while Renilla luciferase was under the control of the CMV promoter. Luminescence is reported in relative luminescence units (RLUs). The best-fit line (solid) determined by robust errors-in-variable (REIV) regression is shown. Dashed lines represent the 95% confidence interval for slope determined by bootstrapping (Section 3.4.4). Potential outliers are indicated with asterisks.

**Figure 2.** Tests of normalization methods on simulated data. The top panels plot the Zamar criterion, which is 0 for perfect inference of the true activity. The middle panels plot the 90th percentile (${P}_{90}$) of the relative bias. The bottom panels plot the relative median absolute deviation, a measure of precision. (**a**) Performance of the methods plotted against mean transfection efficiency. True activity: $A=10$. Sample size: $N=10$. Renilla errors: ${\sigma}_{11}={\sigma}_{12}=3$. Firefly errors were scaled with activity: ${\sigma}_{21}={\sigma}_{22}=10{\sigma}_{11}$. (**b**) Performance plotted against sample size. Mean transfection efficiency: $\overline{t}=0.25$. The other parameters are the same as in panel (**a**). (**c**) Performance plotted against true relative activity. Sample size: $N=10$. The other parameters are the same as in panel (**b**).

**Figure 3.** Tests of normalization methods on simulated data. The top panels plot the Zamar criterion, which is 0 for perfect inference of the true activity. The middle panels plot the 90th percentile (${P}_{90}$) of the relative bias. The bottom panels plot the relative median absolute deviation, a measure of precision. (**a**) Performance of the methods plotted against standard deviation of the error in Renilla luminescence. Mean transfection efficiency: $\overline{t}=0.25$. True activity: $A=10$. Sample size: $N=10$. There were no outliers: ${\sigma}_{12}={\sigma}_{11}$. Firefly errors were scaled with activity: ${\sigma}_{21}={\sigma}_{22}=10{\sigma}_{11}$. (**b**) Performance plotted against increasing severity of outliers. The standard deviation of the contaminating Renilla errors ${\sigma}_{12}$ was varied to simulate outliers. True activity: $A=3$. Renilla errors: ${\sigma}_{11}=3$. Firefly errors: ${\sigma}_{21}=9$. The standard deviation of firefly contaminating errors was scaled with activity: ${\sigma}_{22}=3{\sigma}_{12}$. The other parameters are the same as in panel (**a**). The Zamar criterion of the ratiometric method was greater than the upper limit of the y-axis.

**Figure 4.** Tests of normalization methods on empirical data. (**a**) Dataset of firefly luminescence driven by the Cebpa promoter and Renilla luminescence driven by the CMV promoter in PUER cells. $N=85$. The dataset was sampled 1000 times with varying sample sizes. Promoter activity was estimated with each method from exactly the same data. (**b**) Boxplots of the inferred activities. The box lines are the first quartile, median, and the third quartile. The whiskers extend to the most extreme values lying within 1.5 times the interquartile range, and any datapoints outside the whiskers are shown as circles. (**c**) Datapoints having Renilla luminescence 8 RLU or less were excluded from the analysis.

**Figure 5.** Performance of normalization methods in detecting the effect of mutations on enhancer activity. (**a**) Design of reporter construct. Cebpa(0) contains the luc2 gene driven by the proximal Cebpa promoter. Cebpa(7) contains an enhancer in addition to the proximal promoter [4]. Binding sites for C/EBP family transcription factors and Gfi1 are shown in the magnified view. C/EBP sites have been mutated in Cebpa(7m1). (**b**) The relative activity of Cebpa(0), Cebpa(7), and Cebpa(7m1) inferred by either the REIV (left panels) or the ratiometric (right panels) methods in uninduced IL3 (progenitor) or induced GCSF (neutrophil) conditions. The activity has been normalized against the activity of the proximal promoter (Cebpa(0)). Error bars are standard errors of the mean. The ratiometric method spuriously detects an activation of Cebpa(7m1) in GCSF conditions (bottom right panel; Welch two sample t-test, $N=10$, $p=0.03$).

**Table 1.** Values of Beta distribution parameters, $\alpha$ and $\beta$, for simulating different mean transfection efficiencies ($\overline{t}$).

$\overline{\mathit{t}}$ | $\mathit{\alpha}$ | $\mathit{\beta}$ |
---|---|---|
0.1 | 2 | 18 |
0.25 | 2 | 6 |
0.5 | 2 | 2 |
0.75 | 6 | 2 |
0.9 | 18 | 2 |
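The parameter pairs in Table 1 follow from the mean of a Beta distribution, $\alpha/(\alpha+\beta)$. The sketch below checks each row and draws simulated transfection efficiencies for one of them; the variable names and the number of draws are choices made for this illustration only.

```r
# Parameter pairs from Table 1
beta_params <- data.frame(
  t_mean = c(0.1, 0.25, 0.5, 0.75, 0.9),
  alpha  = c(2, 2, 2, 6, 18),
  beta   = c(18, 6, 2, 2, 2)
)

# Mean of a Beta(alpha, beta) distribution is alpha / (alpha + beta)
beta_params$implied_mean <- with(beta_params, alpha / (alpha + beta))
print(beta_params)

# Draw transfection efficiencies for a mean efficiency of 0.25
set.seed(4)
t_sim <- rbeta(10000, shape1 = 2, shape2 = 6)
print(mean(t_sim))   # close to 0.25
print(range(t_sim))  # always within (0, 1)
```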

**Table 2.** Organization of the input CSV file.

Luc | Ren | Construct | Condition 1 | Condition 2 | ⋯ |
---|---|---|---|---|---|
Luc luminescence | Ren luminescence | Name | Name | Name | ⋯ |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Repele, A.; Manu.
Robust Normalization of Luciferase Reporter Data. *Methods Protoc.* **2019**, *2*, 62.
https://doi.org/10.3390/mps2030062
