Identification of Circulating miR-22-3p and miR-93-5p as Stable Endogenous Control in Tuberculosis Study

The diagnosis and prognosis of tuberculosis remains challenging and necessitates the development of a new test that can accurately diagnose and monitor treatment responses. In this regard, miRNA is becoming a potential diagnostic and prognostic biomarker which differentiates treatment respondents from non-respondents for various non-infectious and infectious diseases, including tuberculosis. The concentration of miRNAs varies based on cell type, disease, and site of infection, implicating that selection of an optimal reference gene is crucial, and determines the quantification of transcript level and biological interpretation of the data. Thus, the study evaluated the stability and expression level of five candidate miRNAs (let-7i-5p, let-7a-5p, miRNA-16-5p, miRNA-22-3p and miRNA-93-5p), including U6 Small Nuclear RNA (RNU6B) to normalize circulating miRNAs in the plasma of 68 participants (26 healthy controls, 23 latent, and 19 pulmonary tuberculosis infected) recruited from four health centers and three hospitals in Addis Ababa, Ethiopia. The expression levels of miRNAs isolated from plasma of culture confirmed newly diagnosed pulmonary tuberculosis patients were compared with latently infected and non-infected healthy controls. The qPCR data were analyzed using four independent statistical tools: Best Keeper, Genorm, Normfinder and comparative delta-Ct methods, and the data showed that miRNA-22-3p and miRNA-93-5p were suitable plasma reference miRNAs in a tuberculosis study.


Introduction
Tuberculosis remains a major global health threat and affects about one third of the global population with estimated incidents of 10 million new cases and deaths of 1.2 million people in the year 2018. In the same year, the global report of drug-resistant tuberculosis was also about half a million (417,000 to 556,000 people), in which 3.4% of the new cases and 18% of previously treated cases had the chance to develop multidrug-resistant tuberculosis (MDRTB). Hence, the development of new and novel diagnostic tools is one of the pillars to curb this challenge [1]. Depending on the methods utilized, diagnosis of tuberculosis has various shortcomings. Low sensitivity and specificity, delay in diagnosis, difficulty in diagnosis of child and extrapulmonary tuberculosis, lack of early detection of treatment respondents and non-respondents, and absence of accurate disease progression markers are some of the diagnostic challenges [2]. Thus, improving tuberculosis diagnosis though new and innovative methods is needed to enhance tuberculosis control and management.

Selection of Candidate miRNAs Reference
The candidate references circulating miRNA were initially selected based on the previous studies and their applicability as endogenous controls in various studies [15,17]. Then, the selected miRNAs were counter-checked with the small RNA sequencing data that was conducted on 30 plasma samples (10 pulmonary tuberculosis, 10 latently infected, and 10 health controls), and all the selected miRNA were not reported in the differentially expressed results. Finally, five miRNAs (let-7a-5p, let-7i-5p, miR-16-5p, miR-22-3p, miR-93-5p, and one small RNA (RNU6B) were selected as a candidate endogenous control to normalize the gene expressions of circulating miRNA in the plasma of study participants.

Sample Preparation and RNA Extractions
Ten mL of whole blood collected using EDTA vacutainer ® tubes (Becton-Dickinson, Franklin Lakes, NJ, USA) were centrifuged for 10 min at 3000 rpm to separate the plasma from cells and platelets [18]. Then, the plasma was stored in 1 mL aliquots at -80 • C for RNA extraction. Extraction of RNA was done using the NucleoSpin®miRNA Plasma Kit, (Macherey-Nagel, Düren, Germany) following the manufacturer's protocol and modified by adding a RNA carrier described in previous studies [19]. The extraction of miRNA from fluid specimens has shortcomings. Thus, it has been reported that carriers/co-precipitants, such as glycogen and yeast RNA extract, were useful to increase the concentration by enhancing the recovery of nucleic acids during alcohol precipitations [19,20]. As a result, 1 µg of yeast RNA carrier (Torulla Ambion®, Invitrogen, Rockville, MD, USA) for 300 µL plasma was applied. Then, total RNA extracted from plasma was stored at −80 • C for RT-qPCR analysis.

Reverse Transcriptase and Real-Time Quantitative PCR
The RNA extracted from plasma were reverse-transcribed using the TaqMan ® Reverse Transcription Kit (Applied Biosystems, Waltham, MA, USA) to prepare the cDNA having a final volume of 15 µL, which was a mixture of 7 µL TaqMan Master mix, 5 µL RNA templates, and 3 µL of 5× RT primer. The TaqMan master mix in turn consisted of 0.15 µL 100 nm dTTP, 1 µL of 50 U/µL MultiScribe Reverse Transcriptase, 1.5 µL of 10× reverse Transcriptase buffer, 0.19 µL of 20 U/µL RNase inhibitor, and 4.16 µL of nuclease-free water. Then, reverse transcription was done using a Veriti™ 96-Well Fast Thermal Cycler (Applied Biosystems, Waltham, MA, USA) with a parameter of 16 • C for 30 min, followed by 42 • C for 30 min, 85 • C for 5 min, and 4 • C cooling. The synthesized cDNA was finally stored at −20 • C for qPCR.
The quantitative polymerase reaction (qPCR) of miRNAs were done using a TaqMan ® miRNA assay (Table 1) (Applied Biosystems, Waltham, MA, USA) mixed with TaqMan ® Universal master mix II and cDNA templates. The final reaction volume was 20 µL, containing 10 µL TaqMan ® Universal master mix II, 1 µL of 20× TaqMan ® miRNA assay, and 9 µL of (1-10 ng) cDNA templates corrected with RNase-free water. A plate containing the reaction mixture was sealed and transferred to the CFX96 ® Touch Real-Time PCR Detection System (BioRad ® , Hercules, CA, USA), and amplification was performed, following a reaction cycle of 95 • C for 10 min, followed by denaturation at 95 • C for 15 s and annealing at 60 • C for 60 s.

Data Analysis
The qPCR data of each marker were valued numerically in terms of the mean and standard deviation and t-test to assess the distributions among groups using SPSS version 21 statistical software (SPSS Inc., Chicago, IL, USA) and GraphPad ® Prism 7 (GraphPad Software, San Diego, CA, USA). The stability of endogenous references was evaluated using four independent applications, such as Norm finder [21], which is used for mathematical modeling and was designed by Anderson and colleagues to assess the variation of candidate gene normalization and variation between samples in the subgroups of the sample set [21], as well as Best Keeper [22], which is software that evaluates the stability of reference miRNAs based on standard deviation (SD), correlation coefficient (R), and coefficient of variation (CV). The stability of genes expressed was inferred by high R, Low SD, and CV values, and a reference gene with SD greater than the one is concerned was deemed unacceptable [22]. The third was comparative delta Ct methods, a model useful for identifying housekeeping genes through analyzing the stability of genes by comparing the change in relative expression among different samples. Genes were deemed stable when analysis of ∆Ct values in different samples remained constant [23]. The fourth was Genorm algorisms, which was utilized to assess the stability of candidate miRNA by calculating the pair-wise internal variation in all the proposed reference genes across all the samples tested. The internal control of gene stability was defined by stability value (M). Hence, the gene with the lowest value was the most stable, and in general, a value of ≤1.5 indicated a stably expressed gene [24]. Finally, the selection of the best endogenous controls decided following the comparative ranking methods have been described somewhere else [25].

Base Line Characteristics of Participants
There was a total of 68 (38 male and 30 women) participants, of which 26 healthy controls, 23 latently infected, and 19 smear positive pulmonary tuberculosis patients were recruited in this study ( Table 2). The age distribution ranged from 18 to 62 years, with mean of 30.82 (±10.3) (Figure 1).

qPCR Data Distributions
The qPCR data of all miRNAs were evaluated in terms of mean, homogeneity, and variances using SPSS Version 21 (SPSS Inc., Chicago, IL, USA). The mean deviation of reading in all cases was less than 0.5. (Table 3). In addition, the ANOVA analysis using GraphPad prism version 7 showed that the expression data of miRNAs had no significant differences (p ≥ 0.05) among all the three groups of participants ( Figure 2).

qPCR Data Distributions
The qPCR data of all miRNAs were evaluated in terms of mean, homogeneity, and variances using SPSS Version 21 (SPSS Inc., Chicago, IL, USA). The mean deviation of reading in all cases was less than 0.5. (Table 3). In addition, the ANOVA analysis using GraphPad prism version 7 showed that the expression data of miRNAs had no significant differences (p ≥ 0.05) among all the three groups of participants ( Figure 2). The distribution of qPCR cycle data of all selected miRNAs was evaluated in terms of mean, homogeneity, and variances using SPSS Version 21 over all samples (SPSS Inc., Chicago, IL, USA). The analysis demonstrated that there were no significant differences among the expression data of

Reference Gene Stability Analysis by Best Keeper, Norm Finder, GeNorm, Comparative Delta Ct Methods and Comprehensive Ranking
The distribution of qPCR cycle data of all selected miRNAs was evaluated in terms of mean, homogeneity, and variances using SPSS Version 21 over all samples (SPSS Inc., Chicago, IL, USA). The analysis demonstrated that there were no significant differences among the expression data of miRNAs with statistical p values ≤ 0.05 (Table 3). The data of RNU6B was excluded from the analysis because of their poor Cq values and the reading was above the threshold level (data not shown), whereas the rest of the five miRNAs' expression were evaluated using four independent algorisms: GeNrom, NormFinder, BestKeepers, and Comparative Delta Ct methods.

NormFinder
Assessment of stability of candidate reference miRNAs using NormFinder modeling [21] showed that the stability value of all selected candidate reference miRNAs were less than 1, but miR-22-3p was the most stable reference miRNA with a stability value of 0.485, followed by let-7i-5p and miR-93-5p with values of 0.611 and 0.756, respectively, and miR-16-5p was the least stable reference miRNA, with a stability value of 0.983 (Table 4 and Figure 3).

Best Keepers
The expression data (qPCR data) of all categories were analyzed using BestKeepers software [22]. In terms of correlation coefficient (R), let-7i-5p displayed the highest value, but the standard deviation (SD) was greater than 1. Thus, it could not be taken as a stable marker. However, miR-16-5p, miR-22-3p, and miR-93-5p had SDs of less than one, and as a result, they were considered as stable markers (Table 5A,B). Of these, miR-93-5p was the most stable, with a SD of 0.82; whereas let-7a was the least stable (SD = 1.203) endogenous marker (Figure 4). The correlation analysis explained by the Bestkeeper index (BKI) revealed that all the selected five endogenous genes showed a significant correlation, with p values of 0.001 (Table 5B).

Best Keepers
The expression data (qPCR data) of all categories were analyzed using BestKeepers software [22]. In terms of correlation coefficient (R), let-7i-5p displayed the highest value, but the standard deviation (SD) was greater than 1. Thus, it could not be taken as a stable marker. However, miR-16-5p, miR-22-3p, and miR-93-5p had SDs of less than one, and as a result, they were considered as stable markers (Table 5A,B). Of these, miR-93-5p was the most stable, with a SD of 0.82; whereas let-7a was the least stable (SD = 1.203) endogenous marker (Figure 4). The correlation analysis explained by the Bestkeeper index (BKI) revealed that all the selected five endogenous genes showed a significant correlation, with p values of 0.001 (Table 5B).

Comparative Delta CT
The analysis of qPCR data through comparative delta Ct methods, as described in previous research [23] revealed that miR-22-3p was the most stable endogenous control with an average standard deviation of 0.949, followed by let-7i-5p and miR-93-5p with values of 1.01 and 1.0, respectively, whereas miR-16-5p was the least stable, with an average SD of 1.22 (Table 6 and Figure  5).

Comparative Delta CT
The analysis of qPCR data through comparative delta Ct methods, as described in previous research [23] revealed that miR-22-3p was the most stable endogenous control with an average standard deviation of 0.949, followed by let-7i-5p and miR-93-5p with values of 1.01 and 1.0, respectively, whereas miR-16-5p was the least stable, with an average SD of 1.22 (Table 6 and Figure 5).

Comparative Delta CT
The analysis of qPCR data through comparative delta Ct methods, as described in previous research [23] revealed that miR-22-3p was the most stable endogenous control with an average standard deviation of 0.949, followed by let-7i-5p and miR-93-5p with values of 1.01 and 1.0, respectively, whereas miR-16-5p was the least stable, with an average SD of 1.22 (Table 6 and Figure  5).

Genorm
Candidate miRNA stability analysis using Genorm demonstrated that all five miRNAs were within the acceptable limit (stability value (M) ≤ 1.15) [24]. However, miR-22-3p and miR-93-5p were the two best reference miRNAs, having an equal stability value of 0.629, followed by let7-i-5p and let-7a-5p with values of 0.841 and 0.999, respectively. On the other hand, miRNA-16-5p turned out to be the least stable, with a value of 1.087 (Table 7 and Figure 6).

Genorm
Candidate miRNA stability analysis using Genorm demonstrated that all five miRNAs were within the acceptable limit (stability value (M) ≤ 1.15) [24]. However, miR-22-3p and miR-93-5p were the two best reference miRNAs, having an equal stability value of 0.629, followed by let7-i-5p and let-7a-5p with values of 0.841 and 0.999, respectively. On the other hand, miRNA-16-5p turned out to be the least stable, with a value of 1.087 (Table 7 and Figure 6).   6. Gene stability graph using Genorm.

Comprehensive Ranking
This method was employed to analyze the geometric ranking values that determines the stability of the reference gene. Consequently, miRNA-22-3p and has-let7a-5p were the best and least stable, with ranking values of 1.19 and 4.40, respectively; whereas miR-93-5p and let-71-5p came in second and third, with geometric mean values of 1.73 and 2.83, respectively (Table 8 and Figure 7).

Comprehensive Ranking
This method was employed to analyze the geometric ranking values that determines the stability of the reference gene. Consequently, miRNA-22-3p and has-let7a-5p were the best and least stable, with ranking values of 1.19 and 4.40, respectively; whereas miR-93-5p and let-71-5p came in second and third, with geometric mean values of 1.73 and 2.83, respectively (Table 8 and Figure 7).

Comprehensive Ranking
This method was employed to analyze the geometric ranking values that determines the stability of the reference gene. Consequently, miRNA-22-3p and has-let7a-5p were the best and least stable, with ranking values of 1.19 and 4.40, respectively; whereas miR-93-5p and let-71-5p came in second and third, with geometric mean values of 1.73 and 2.83, respectively (Table 8 and Figure 7).

Discussion
MicroRNA is a class of small RNA (18 to 24 nucleotides) that play regulatory roles in gene expression at the post-transcriptional level, either through translational repression or mRNA degradation [26,27]. They regulate almost one-third of the known protein coding sequences, and are involved in various cellular processes, including cell proliferation, apoptosis, and signaling pathways [4,28]. The mature miRNAs that are found in varieties of bodily fluids, circulating miRNA, also regulate a wide range of processes, both in immune and non-immune cells, and affect their genes expression. Mounting evidence has reported that the level of circulating miRNA, up/down or deregulation are associated with specific physiological conditions. Thus, the variation in gene expression can be utilized as diagnostic and prognostic markers in many non-infectious and infectious diseases, including tuberculosis [11,[29][30][31]. The stability in nature and various external conditions, expressions associated with changes in physiological conditions, abundance in bodily fluids, and noninvasiveness emphasize circulating miRNA as a source of stable biomarkers [9][10][11]32].
The quantitative polymerase chain reaction (qPCR) is the most powerful and common technique used for quantification of nucleic acid molecules that reflects the biology of tested samples. The data accuracy and quality that results from qPCR can be affected by several factors, including biological variability, sample storage conditions, nucleic acid extraction, cDNA synthesis, qPCR data computation, and reference gene (internal control) validation [13]. However, the lack of universal endogenous controls for miRNA in biofluids remains an impediment for accurate analysis of circulating miRNAs' expression. Therefore, the validation of an optimum endogenous control is a crucial step in qPCR experiment that ensures the reliability of data generated as well [13]. In this study, we examined the suitability of six candidate reference genes: let-7i-5p, let-7a-5p, miRNA-16-5p, miRNA-22-3p, and miRNA-93-5p, as well as RNU6B to normalize the expression of target miRNA in plasma.
Although RNU6B and miR-16 are the most common reference genes utilized to normalize circulating miRNA expression, recent studies have reported that both are not stable markers (endogenous control) in serum and plasma samples of all disease types [33,34]. Thus, a wide range of miRNA has been reported as endogenous controls. Of these, miR-93-5p, let-7a, miR-221, miR-26a, miR-191, and miR-320a have been commonly identified stable circulating miRNAs for various types of diseases and pathological conditions [15]. Our observation also confirmed that the expression of RNU6B is highly variable and was excluded from further data analysis, as its readings were above the threshold (data not shown).
We employed multiple algorithms, such as Best Keepers, geNorm, Normfinder, and Comparative delta CT [35], to identify the best suitable circulating miRNAs in the plasma of tuberculosis-infected and non-infected controls. The computation using Normfinder algorithms [21] revealed that all the five selected miRNAs had a stability value of less than 1, indicating that they can be selected as a reference marker. However, miR-22-3p was the most stable, with a value of 0.485 (Table 4 and Figure 3). Similarly, analysis of qPCR data using comparative delta Ct methods [23] also proved that miR-22 was the most stable endogenous, with a gene stability value of 0.949 ( Figure 5 and Table 6). In line with our observations, miR-22 combined with miR-26a and miR-221 were also proposed as reference miRNAs for circulating miRNA in hepatitis B infected patients [36].
Evaluation of miRNAs expression data performed using Bestkeepers modeling [22] revealed that miR-93-5p, miR-22-3p, and miR-16-5p had acceptable standard deviations (i.e., <1). Of them, miR-93-5p was the most stable with a value of 0.82, followed by miR-22-3p and miR-16-5p with SDs of 0.90 and 0.94, respectively (Table 5 and Figure 4). In addition, the computation of qPCR data using geNorm [24] resulted in both miR-93-5p and miR-22-3p as stable reference miRNAs with stability values of 0.629 ( Figure 6 and Table 7), which is comparable to a cohort study conducted to normalize the expression of circulating miRNA in the plasma of tuberculosis-infected and non-infected participants, which reported that miR-93 was the most stable reference gene [17].
It has been well-reviewed that miR-93-5p is one of the most common circulating miRNAs reported as an internal reference control for both cancer and other diseases [15]. Song et al. employed multiple algorithm tools to analyze the stability of circulating microRNA, and reported miR-93, combined with miR-16, as a stable serum miRNA for gastric carcinoma patients and healthy controls [37]. In another study, miR-93-5p, together with miR-25-3p and hsa-miR-106b-5p, were proposed as internal references for serum miRNA in colorectal cancer patients [38]. Similarly, miR-93-5p and miR-425-5p were identified as stable endogenous markers in the plasma of vulvar carcinoma [39] and miR-93 with miR-101-3p in the plasma of individuals associated with major depression disorder [40].
Finally, comprehensive ranking analysis, which examined the overall geometric ranking [35] showed miR-22-3p as the most stable, followed by miR-93-5p with values of 1.19 and 1.73, respectively ( Figure 7 and Table 8), whereas miR-16-5p was the least stable, with a value of 2.83. In general, the ranking order of stability was as follows: miR-22-3p > miR-93-5p > let-7i-5p > let-7a-5p > miR-16-5p (Table 9). Table 9. Overall stability ranking order of miRNAs. MiR-22-3p and miR-93-5p were first and second in ranking order, whereas miR-16-5p was the least stable. In summary, the stability and gene expression of circulating miRNA can be affected by various conditions associated with either the host or environment, or both. Thus, setting an optimal endogenous control is important to get a reliable result that indicates the clinical conditions. Our study implicated that miR-22-3p and miR-93-5p were stably expressed and can be utilized as an endogenous reference to