Acute respiratory distress syndrome (ARDS) is a devastating inflammatory lung disease, affecting approximately 10% of all intensive care unit patients, whereof around 40% perish in hospital [1
]. Due to SARS-CoV-2, the incidence for acute respiratory failure has increased enormously in the past year, given that classical and coronavirus disease-2019 (COVID-19)-related ARDS are closely related [2
]. Clearly, altered gene expression plays a critical role in ARDS, and it has been estimated that 10% to 25% of all genes may change their expression in this condition [4
]. This impressive number illustrates the challenge of identifying suitable reference genes, which are indispensable as a yardstick to quantify changes in gene expression.
In order to study ARDS, many different mouse models have been developed [5
]. To establish a procedure for normalisation based on reference genes, here, we examined the three most commonly used acute lung injury (ALI) models—each of them addressing different aetiologies of pulmonary inflammation—in a mouse intensive care unit (MICU) that provides sedation, mechanical ventilation (MV), oxygen support, and pulmonary and cardiovascular monitoring [6
The most frequently used one-hit model appears to be ventilator-induced lung injury (VILI) [5
]. It causes biotrauma and, if ventilation pressure is high enough, also mechanical disruption of the alveolar–capillary barrier in healthy lungs [8
]. This model helps to understand the molecular processes in ventilated ARDS patients, who can develop ventilator-associated lung injury [9
]. In order to study VILI and eliminate the effects of immune cell sequestration, we used isolated perfused lungs in a well-established setup [10
Another popular two-hit model is the instillation of lipopolysaccharide (LPS), a glycolipid derived from the outer membrane of Gram-negative bacteria that binds to the Toll-like receptor 4 and activates the innate immune response [12
]. Although the sterile LPS-driven inflammation cannot be equated with bacterial infection, LPS is used with the attempt to model the potentially devastating immune responses of ARDS patients (cytokine storm) and to induce pulmonary inflammation in a dose-dependent matter [7
A further important two-hit model (i.e., i.t. instillation of low pH (<2.0) hydrochloric acid (HCl)) aims to mimic gastric aspiration. Although the pulmonary consequences of aspiration are not solely related to the acidic pH, this model is highly relevant because gastric aspiration causes more than 10% of all ARDS cases [13
]. Acid produces an initial chemical injury within 1 h, followed by an inflammatory response with features of human ARDS after about 3 h [14
A general weakness of animal studies is that they often translate poorly into the clinical situation, even when performed under the highest standards. In addition, reproducibility is often hampered by variability between different labs even when similar models are used [15
], possibly explained by different experimental setups, genetic animal backgrounds, assay conditions, and other factors. It is therefore of great importance to not introduce further ambiguities into the analysis.
At present, real-time quantitative reverse-transcriptase PCR is the standard tool for detection and quantification of RNA expression and is widely used to analyse inflammatory gene expression. However, despites its wide-spread use, the accuracy and reliability of qPCR depends on good and standardised laboratory practice, as outlined by Bustin and colleagues in 2009 in the MIQE guidelines [16
], as much as it does on proper data analyses. Because qPCR gene expression is usually expressed in relative terms, proper normalisation of the data is of utmost importance. While it is known that the validity of these analyses depends on the stability of the reference gene(s), their selection is often not well-founded. As recently shown for a preterm lamb model, commonly used reference genes such as 18S rRNA
are not necessarily the best choice for this model, suggesting that reference genes need to be determined for each model [17
]. Furthermore, it seems questionable whether a single control reference is sufficient for proper normalisation, as this may lead to erroneous normalisation [18
]. A recent reference gene validation study in a model of Duchenne muscular dystrophy has shown that gene stability changes with experimental conditions and no single gene is stable under all conditions [20
]. As a solution, the use of a reference gene index, containing multiple reference genes, has already been recommended in the early 2000s [19
This led us to the present study in which we analysed the stability of 11 reference genes (Actb
) and their effect on data normalisation in different murine lung injury models. Although animal models and ARDS patients share many common features of inflammation, different stimuli and models also show critical differences with respect to gene expression [4
]. Therefore, a separate reference gene stability analysis is required for each model and was performed here using three different commonly used algorithms: Bestkeeper [21
], Normfinder [22
], and the pairwise approach of geNorm [19
]. We then chose Interleukin 6 (Il-6
) and C-X-C motif ligand 1 (Cxcl-1
) (as murine Interleukin 8 (Il-8
) homologue) mediators, an cytokine well-known to play an important role in ARDS [23
] as the exemplary target gene to examine and document the relevance of reference gene stability on gene normalisation.
Our study highlights the importance of reference gene stability analysis and shows how improper reference genes can lead to misinterpretations. Furthermore, we provide validated reference gene candidates for three popular lung injury models in mice.
RT-qPCR is widely used for gene expression studies due to its high sensitivity and specificity at relatively low cost. Therefore, even with next generation sequencing on the rise, it remains the most widely used method for gene expression analysis of specific biomarkers in molecular biology research. To circumvent possible variations in RNA extraction/yield and reaction efficiency and to compare different experimental conditions, as exemplified in our study, accurate normalisation by appropriate reference genes is of crucial importance. While expression levels of commonly used reference genes are known to vary between different cells, tissues, and conditions, a lack of evidence for usefulness, normalisation by Actb
is frequently relied upon in RT-qPCR studies without further validation, not only in lung research, but generally in vertebrates [25
]. This might have historical reasons or could be due to the fact that these genes have been shown to be stably expressed in specific settings, even though several studies on reference gene stability in the lung show that the suitability of reference genes in lung models cannot be generalised. The stability of Actb
and other frequently used reference genes (e.g., Rps29
, 18s RNA
, and Hprt
) varies strongly in the lung and these candidates can turn out as the best choice, but might just as well be not suitable at all [17
]. For that reason, we analysed a set of eleven candidate reference genes for their suitability to normalise gene expression data obtained from four commonly used experimental approaches in mice in the field of ARDS research [5
In line with the studies above-mentioned, our comparison confirmed that the suitability of a commonly used ‘housekeeper’ cannot be assumed per se. More precisely, in our acid model, Gapdh
were amongst the three least stable genes. In the LPS model, Gapdh
also proved not to be stable, whereas Actb
turned out to be the most stable gene. Vice versa, in the in vivo ventilation model, in which inflammation is less profound than with acid or LPS, Gapdh
was the most stable and Actb
the least stable gene. Our findings correlate with biological functions of these genes. Since Actb
is one of the major components of the cytoskeleton with a role in mechanotransduction [30
], it is reasonable that it is particularly regulated in models with pronounced mechanical stress. Gapdh
, on the other hand, has frequently been shown to be strongly regulated in the course of metabolic reprogramming that occurs under inflammatory conditions [31
], as present in our Acid + MV and LPS + MV models. Regarding the majority of tested genes, the picture was completely different again in the IPL ventilation setting and our data illustrated a similar variability between the four settings for the other tested genes. Our findings clearly demonstrate that an inappropriate normalisation has significant effects on the results and the interpretation of the data, even under conditions where reference gene expression is stable, as indicated by the low M-value in three of the four experimental approaches. Thus, our study demonstrates the significant dimension of bias that can be introduced at this point.
In order to compare normalisation with the least (IN) versus the most suitable (ON) RGI, we chose two exemplary pro-inflammatory mediators: Il-6
, which is a murine Il-8
homologue. Il-6 and Il-8 levels correlate with tidal volume and mortality in human ARDS, as already shown over 20 years ago [32
]. They are equally important in experimental ALI, and belong to the standard readout parameters in most ALI studies [6
]. While Il-6 is a cytokine that elicits inflammation and, moreover, cytokine storm [24
], the chemokine Il-8 is co-responsible for neutrophil recruitment, a hallmark of ARDS/ALI pathogenesis [38
]. In two recent phenotyping approaches of human ARDS, Il-6 and Il-8 have been found to be predictive for the hyperinflammation and the reactive phenotype, both correlating with higher mortality than the respective phenotype with lower Il-6 and Il-8 [39
]. Furthermore, these two mediators are likewise relevant in COVID-19 induced ARDS [41
We were able to show that this normalisation-mediated bias would have led to crucial misinterpretation of the Il-6
data in our models. We observed four distinct issues of inappropriate normalisation in our experiments—loss or gain of statistically significant effects as well as under- or overestimation of effect magnitudes. This is not surprising, considering that reference gene stability can even vary in the same model, when performed under different conditions [25
]. Interestingly, the extent of statistical bias was associated with the overall gene stability in the models. While the more stable models were less susceptible, analysis of the data from the LPS model, where there was a marked difference in stability between IN and ON (mean M-value = 0.530), actually showed a combination of both spurious effects (loss of effects plus underestimation of effect magnitude). Of note, only in one out of eight analyses (Figure 3
Acid + MV), inappropriate normalization did not influence data interpretation. Therefore, our study nicely demonstrates how normalisation against inappropriate reference genes may result in small but potentially biologically relevant effects being undetected or misinterpreted. In practice, inadequate normalisation can lead to a wrong interpretation of an ALI model and even to false conclusions regarding the tested interventions. This is particularly relevant when gene expression analysis is not complemented by protein quantification data. On the other hand, many studies show that gene and protein expression do not always accord well [42
]. Besides the regulatory mechanisms and the time point of sample collection, this might also be explained by the common utilisation of inappropriate reference genes. Obviously, these findings should also be relevant for gene expression studies in humans with COVID-19.
4. Materials and Methods
4.1. Animals and Sample Collection
Female C57BL/6 mice (8–12 weeks, weighing 20–25 g) were kept under standard conditions. All experiments were in accordance with the German animal protection law and European Directive 2010/63/EU and approved by regional governmental authorities (LANUV NRW, permission numbers: AZ84-02.04.2013.A078, AZ84-02.04.2015.A385, AZ84-02.04.2013.A131, and 10509A6).
Three ALI models were carried out: the one-hit VILI model, in vivo and ex vivo in isolated perfused lungs, the two-hit model of acid-induced lung injury and MV (Acid + MV), and the two-hit model of lipopolysaccharide induced lung injury and MV (LPS + MV). Surgical procedures were performed as described [6
]. In brief, mice were initially anaesthetised with pentobarbital sodium (75 mg/kg) and fentanyl (40 µg/kg), followed by tracheotomy with a 20-gauge cannula and connect to the respective ventilator. A catheter for blood pressure monitoring and permanent infusion of 0.9% NaCl (200 µL/h) was inserted into the carotid artery. Blood pressure and ECG were recorded permanently (PowerLab, ADInstruments, Spenbach, Germany). In addition, pulsoxymetry was performed with a tail clip (MouseOx, STARR Life-Science, Oakmont, PA, USA). Body temperature was maintained between 36.5 °C and 37.5 °C by a homeothermic blanket (Harvard Apparatus Holliston, MA, USA).
gives an overview of the model details of the in vivo models. The samples from the in vivo and ex vivo models, used in this study, were generated in previous projects, as cited below. In addition, successful induction of ALI is shown here by the Horovitz index (Table S2
In the VILI model, plateau pressures of 10 cmH2
O (p10), 24 cmH2
O (p24), 27 cmH2
O (p27), and 30 cmH2
O (p30) were used as described previously. Ex vivo ventilation experiments (IPL) were also performed as described before [43
]. After baseline recording with p
= 8 cmH2
O, f = 90 min−1
, and PEEP = 3 cmH2
O, ventilation was continued for 210 min, with high (p
= 25 cmH2
O; OV = over ventilation) versus low pressure (p
= 8 cmH2
O, NV = normal ventilation).
In the model Acid + MV, mice received 50 µL hydrochloric (HCl) with pH = 2.0, pH = 1.8, or pH = 1.5. MV with the flexiVent ventilator (SCIREQ, Montreal, Canada) was performed with VT = 16 mL/kg, f = 80 min−1
, PEEP = 2 cmH2
O and FiO2
= 0.3 and was terminated after 330 min [6
]. In the model LPS + MV, mice were instilled with 0.001 mg/kg, 0.25 mg/kg, or 2.5 mg/kg LPS (LPS Escherichia coli
O111-5MG, Lot #127M4016V, Sigma, St. Louis, MO, USA) and were ventilated as in the acid model, but for 7 h [45
]. In both models, controls were instilled with 0.9% NaCl.
4.2. RNA Isolation, cDNA Synthesis, and qPCR
For gene expression analysis, RNA was isolated fully automated with the QIAcube (QIAGEN GmbH; Hilden, Germany) using the RNeasy®
Mini Kit (QIAGEN GmbH; Hilden, Germany). Therefore, snap-frozen lung tissue (15 mg) was ground in liquid nitrogen followed by homogenisation using the Precellys®
Tissue Homogeniser. RNA concentration and purity was determined spectrophotometrically with the NanoDrop (Peqlab; Erlangen, Germany). The integrity of RNA was assessed either by agarose gel electrophoresis or by the Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA) displaying RNA integrity as the RNA integrity number (RIN) (Figure S1
). One µg of total RNA were transcribed into cDNA using the Maxima Reverse Transcriptase (RT) (Thermo Fisher Scientific; Waltham, MA, USA) and mixed priming oligo-(dT)18:random hexamer ratio of 3:1, v
), according to the manufacturer’s instructions with 15 min RT incubation.
PCR reactions took place in a total volume of 10 µL containing 1 µL cDNA (equates to 25 ng cDNA), 5 µL SYBR-Green Mastermix I (Roche-Diagnostics GmbH, Mannheim, Germany), 3 µL Nuclease-free water, 0.5 µL each of a forward primer (10 µM), and the corresponding reverse primer (10 µM). After heat activation of the polymerase at 95 °C for 5 min, every cycle consisted of denaturation of the double-stranded DNA template at 95 °C for 10 s, annealing of the primers to the single-stranded DNA templates at specific temperature for 10 s, and elongation at 72 °C for 15 s. The target genes Il-6 and Cxcl1 were subjected to a different cycle protocol with the following conditions: single heat activation at 95 °C for 10 min and 40 times 95 °C for 15 s; specific annealing temperature for 30 s and 72 °C for 1 min.
Quantitative PCR was performed in technical triplicates. Reactions were run on a LightCycler®
480 System (Roche, Basel, Switzerland) for 40 cycles using SYBR Green I technology. Amplification efficiency was calculated by slope analyses of standard curves prepared from a defined cDNA-dilution series within each dataset or with LinReg 2016.0 software (Heart Failure Research Centre, Amsterdam, The Netherlands) as described before [46
]. More details are provided in the Supplementary Materials
4.3. Primer Design
Primers were created with the online primer design tool Primer-BLAST from NCBI (see Figure 2
for primer binding sites), and the physiochemical properties of the oligonucleotides (amplification length and Tm
, secondary structures of amplicon, primer-dimer formation) were calculated using OligoAnalyser 3.1 (Integrated DNA Technologies; Leuven, Belgium) and OligoCalc (http://biotools.nubic.northwestern.edu/OligoCalc.html
, (accessed on 21 July 2021)). The online tool uMelt (https://www.dna-utah.org/umelt/quartz/um.php
, (accessed on 21 July 2021)) was used for in silico melt curve analysis. Appropriate primer annealing temperature was established by gradient qPCR. Primer specificity was validated by melt curve analysis and DNA agarose gel electrophoresis.
4.4. Data Analysis
The evaluation of reference gene stabilities was achieved using three validated algorithms: BestKeeper [21
], NormFinder [22
], and geNorm [19
], each with unique advantages (details in Supplementary Materials
). BestKeeper calculates the standard deviation (SD) of each gene and the geometric mean of candidate Cq values (= BestKeeper index) and, in contrast to NormFinder and geNorm, uses raw data (Cq values) instead of relative quantities [21
]. Genes with the lowest SD and highest coefficient of correlation with the BestKeeper index represent the most stable reference genes.
NormFinder is a model-based approach in which both, intra- as well as intergroup variations, are considered to avoid systematic error [22
The advantage of geNorm over the two other algorithms is the additional calculation of the number of genes required for reliable normalisation by using pairwise variation of calculated SD of the expression ratio of two reference genes for the evaluation of stabilities [19
]. Since this is in line with the currently most accepted normalisation procedures using a reference gene index (RGI) containing multiple reference genes instead of single gene normalisation, we used the geNorm results for our study. The relative changes in Il-6
gene expression normalised to either the most stable or least stable RGI (based on geNorm results) was calculated for each mouse model. In geNorm, the gene-stability measure (called M value) is calculated for each measured gene and represents the average pairwise variation of the specific gene compared to all the other reference genes. Higher M values correlate with low and smaller M values with high expression stability.
Each analysis was conducted with 12 samples for ex vivo IPL, 15 samples for the in vivo models, and 11 reference gene candidates. For the target genes, we used n = 6 in the Acid + MV and in the LPS + MV model and n = 5 in the IPL setting. In the VILI study, the following sample numbers were available: n = 3 in the group p10, n = 5 in the groups p24 and p27, and n = 6 in the group p30.
Variance homogeneity was checked with the Bartlett test. The Shapiro–Wilk test was used to test for normal distribution. BoxCox-Y transformation was conducted to achieve homoscedasticity if necessary. One-way ANOVA followed by the Tukey post-hoc test was performed to analyse the parametric data. Non-parametric data were analysed using the Kruskal–Wallis test followed by Dunn’s post-hoc testing. Data are shown as mean + SEM. Statistical analyses were performed with GraphPad Prism 9 (GraphPad, La Jolla, CA, USA) and JMP 10 (Böblingen, Germany).