Colorectal cancer (CRC) is the third most common cancer worldwide and is responsible for approximately 50,000 deaths per year in the United States alone. If detected in the early stages, adenomas can be surgically removed and the prognosis is favorable, with the five-year survival rate reaching 90%. However, once the cancer metastasizes to the lymph nodes and distant organs, the five-year survival rate drops to 60% and 10%, respectively [1]. At more advanced stages, chemoradiation therapy (CRT) may become a viable neoadjuvant or adjuvant treatment option. For instance, in stage IV resectable colon cancer, neoadjuvant chemotherapy may be beneficial in identifying candidates for surgery and improving three-year disease-free survival [2]. However, there is currently no effective method to predict patient response. For rectal cancer, a “watch and wait” approach is commonly employed after surgery or CRT to assess cancer recurrence and determine whether more aggressive treatment should be administered [3]. Novel biomarkers are needed to expedite tumor detection and improve patient stratification for intervention purposes.
Currently, endoscopy and imaging techniques, such as MRI and CT, are the primary tools used to diagnose CRC and assess its progression. Endoscopy is a procedure in which the gastrointestinal tract can be visualized and sampled. Colonoscopy has been shown to decrease CRC incidence by up to 76% and lower mortality by up to 65% [5]. However, it is an invasive technique, which may result in low compliance. In contrast, CT and MRI are non-invasive but are not optimal in terms of cost and reliability [8]. Other non-invasive diagnostic measures, such as the fecal occult blood test (FOBT) and the carcinoembryonic antigen (CEA) blood test, have low sensitivity, ranging from 30% to 60%, and cannot be used alone to detect tumors or predict tumor progression [10].
One promising biomarker candidate for CRC is microRNA (miRNA). miRNAs are short sequences of approximately 22 nucleotides that inhibit gene expression. miRNAs bind to mRNA and either degrade it via the RNA-induced silencing complex (RISC), destabilize it via decapping and deadenylation, or prevent translational initiation and elongation via ribosomal interactions [14]. Several miRNAs are dysregulated in cancer, including a group of miRNAs termed “oncomiRs”. The first oncomiR to be described was miR-21, a regulator of phosphatase and tensin homolog (PTEN) [15], programmed cell death 4 (PDCD4) [16], Sprouty 1 and 2 (SPRY1 and SPRY2) [17], and other tumor suppressors and transcriptional regulators [19]. In mice, miR-21 overexpression promoted tumorigenesis, while miR-21 knockout mice were protected against tumor formation [20]. Human studies have corroborated these findings, as elevated miR-21 levels have been associated with advanced tumor grade and poor prognosis across cancer types [21].
miR-20a is an oncomiR that participates in cell proliferation and cancer progression. Along with miR-17, miR-20b, miR-93, miR-106a, and miR-106b, miR-20a is a member of the miR-17 family, a polycistronic group of functionally related miRNAs that share the same seed sequence [23]. The oncogene MYC induces the miR-17 family, which in turn dysregulates cell cycle progression, apoptosis, and tumor invasion via interactions with PTEN [25], E2F genes [26], and the transforming growth factor beta (TGF-β) pathway [27]. miR-20a has been shown to be upregulated across both solid and hematopoietic cancers [28] and has even been suggested as a diagnostic serum biomarker for gastric [29], nasopharyngeal [30], and prostate cancer [31]. However, miR-20a has been particularly noted for its regulatory role in CRC, where it targets components of the TGF-β signaling cascade. Previous in vitro experiments have used dual luciferase assays to show direct miR-20a-mediated downregulation of homolog of Drosophila mothers against decapentaplegic protein 2 and 4 (SMAD2 and SMAD4) [32], TGF-β receptor 2 (TGFBR2) [34], and cyclin-dependent kinase inhibitor 1A (CDKN1A) [35]. Given its biological function, miR-20a has been measured in several CRC patient cohorts, but findings have been inconsistent.
We systematically review and quantitatively synthesize the evidence pertaining to miR-20a as a CRC biomarker. Specifically, we examine the diagnostic potential of miR-20a by quantifying its expression in feces, serum, and tumor tissue and by evaluating its sensitivity and specificity in CRC detection. In addition, we quantify the prognostic power of miR-20a in predicting the overall survival (OS) of CRC patients via hazard ratio (HR) meta-analysis. Because miR-20a is an oncomiR, we hypothesize that it is upregulated in cancerous tissue and expect it to detect CRC with high sensitivity. We also expect that higher miR-20a levels correlate with poor prognosis, as miR-20a promotes tumor invasion and metastasis [36].
2.1. Study Selection Criteria
Studies that met all of the following criteria were included in the review. Study design: randomized controlled trial (RCT), pre-post study, cross-sectional study, case-control study, or cohort study; study subjects: adults aged 18 years and above with colon cancer, rectal cancer, or CRC described by either tumor-node-metastasis (TNM) staging or pathology reports; main outcome: miR-20a expression in tumor, blood, or fecal samples or with at least one diagnostic or prognostic measure; article type: peer-reviewed publications; and language: English.
Studies were excluded from the review if they were non-English publications, reviews or case studies, or non-peer-reviewed articles (e.g., dissertations or conference proceedings).
2.2. Search Strategy
A keyword search was performed in the PubMed and Web of Science databases. The search algorithm included all possible combinations of the keywords (with wildcard characters) from the following two groups: (1) “miR-20a-5p”, “mir-20a”, “microRNA-20a”, “has-mir-20a”, “mir20a”, “microRNA 20a”, and “mir 20a”; and (2) “colon cancer”, “rectal cancer”, and “colorectal cancer”. Titles and abstracts of the articles identified through the keyword search were screened against the study selection criteria. Potentially relevant articles were retrieved for evaluation of the full texts.
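The pairing of the two keyword groups can be sketched in Python as follows. This is only an illustration: the `AND` operator, the quoting, and the function name `build_queries` are assumptions, and the wildcard characters are not reproduced because the exact patterns are not given in the text.

```python
from itertools import product

# Keyword groups as listed in the search strategy (spelling kept as reported).
MIRNA_TERMS = [
    "miR-20a-5p", "mir-20a", "microRNA-20a", "has-mir-20a",
    "mir20a", "microRNA 20a", "mir 20a",
]
CANCER_TERMS = ["colon cancer", "rectal cancer", "colorectal cancer"]


def build_queries(group_a, group_b):
    """Return one query string per (group_a term, group_b term) pair.

    Joining the two terms with AND is an assumed convention, not a detail
    stated in the review.
    """
    return [f'"{a}" AND "{b}"' for a, b in product(group_a, group_b)]


queries = build_queries(MIRNA_TERMS, CANCER_TERMS)
for query in queries[:3]:
    print(query)
```

With seven terms in the first group and three in the second, the sketch produces 21 query strings.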
A cited reference search (i.e., forward reference search) and a reference list search (i.e., backward reference search) were conducted based on the articles from the keyword search. Articles identified through the forward/backward reference search were further screened and evaluated using the same study selection criteria. The reference search was repeated on all newly identified articles until no additional relevant article was found. The two authors of this review, Laura Moody and Svyatoslav Dvoretskiy, jointly determined the inclusion/exclusion of all articles retrieved in full text, and discrepancies were resolved through discussion.
2.3. Data Extraction
Relevant data were extracted according to the following general categories: author(s), publication year, study design, specimen type, technical methodology, sample size, and participant characteristics. Diagnostic and prognostic measures were also extracted. miR-20a expression was recorded as a fold change, median ± standard deviation (SD), mean ± SD, or simply as up- or down-regulated. The sensitivity and specificity of miR-20a as a diagnostic biomarker of CRC were assessed using receiver operating characteristic (ROC) curves. We extracted the area under the ROC curve (AUROC) along with its 95% confidence interval (CI). Prognostic measures included disease-free survival (DFS) time and OS time, as reported in Kaplan-Meier curves and HRs.
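As a reminder of what the extracted AUROC measures, the sketch below computes it as the probability that a randomly chosen CRC sample has a higher miR-20a level than a randomly chosen control, counting ties as one half. The function name `auroc` and the expression values are hypothetical and are not taken from any included study.

```python
def auroc(case_values, control_values):
    """Return the area under the ROC curve by pairwise comparison.

    Each (case, control) pair scores 1 if the case value is higher,
    0.5 if the two are tied, and 0 otherwise.
    """
    if not case_values or not control_values:
        raise ValueError("both groups need at least one value")
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical relative expression values, for illustration only.
example_auroc = auroc([2.0, 4.0], [1.0, 3.0])
print(example_auroc)
```

In this toy example three of the four case-control pairs are ordered correctly, so the value is 0.75. An AUROC of 0.5 corresponds to no discrimination between groups.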
A meta-analysis was performed to estimate the pooled effect size of miR-20a expression fold change, AUROC, and HR. Study heterogeneity was assessed using the I² index. The level of heterogeneity represented by the I² index was interpreted as modest (I² ≤ 25%), moderate (25% < I² ≤ 50%), substantial (50% < I² ≤ 75%), or considerable (I² > 75%). A fixed-effect model was estimated when modest to moderate heterogeneity was present, and a random-effect model was estimated when substantial to considerable heterogeneity was present. Publication bias was assessed by Begg’s and Egger’s tests. All statistical analyses were conducted using Stata SE version 14.2 (StataCorp, College Station, TX, USA). Specific Stata commands included “metan” and “metabias”. All analyses used two-sided tests, and p-values less than 0.05 were considered statistically significant. The summary AUROC curve was generated using R version 3.3.2 (R Foundation for Statistical Computing, Vienna, Austria).
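The model-selection rule described above can be sketched in Python as follows. This is a minimal illustration, not the Stata “metan” implementation: the function name `pool_effects` is hypothetical, the DerSimonian-Laird estimator for the between-study variance is an assumption (the text does not name the random-effect estimator), and the 95% CI uses a normal approximation.

```python
import math


def pool_effects(effects, std_errors, z=1.96):
    """Pool study effects, choosing the model from the I² index.

    I² <= 50% (modest to moderate)  -> fixed-effect model
    I² >  50% (substantial or more) -> random-effect model
    """
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need at least two studies with matching inputs")

    # Inverse-variance weights and the fixed-effect estimate.
    weights = [1.0 / se ** 2 for se in std_errors]
    w_sum = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / w_sum

    # Cochran's Q and I² = max(0, (Q - df) / Q) * 100.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    tau2 = 0.0
    if i2 <= 50.0:
        model, pooled, se = "fixed", fixed, math.sqrt(1.0 / w_sum)
    else:
        # DerSimonian-Laird between-study variance (assumed estimator).
        c = w_sum - sum(w ** 2 for w in weights) / w_sum
        tau2 = max(0.0, (q - df) / c)
        re_weights = [1.0 / (s ** 2 + tau2) for s in std_errors]
        re_sum = sum(re_weights)
        pooled = sum(w * y for w, y in zip(re_weights, effects)) / re_sum
        model, se = "random", math.sqrt(1.0 / re_sum)

    return {
        "model": model,
        "pooled": pooled,
        "ci": (pooled - z * se, pooled + z * se),
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical effect sizes and standard errors, for illustration only.
print(pool_effects([0.0, 1.0], [0.1, 0.1]))
```

Ratio measures such as HRs would normally be passed in on the log scale and exponentiated after pooling.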
2.5. Study Quality Assessment
The quality of all studies included in the review was evaluated by the following 10 quality assessment criteria adapted from Littell et al. [37]: (1) Was the research question clearly stated? (2) Were the inclusion and exclusion criteria clearly stated? (3) Were the subjects in the study representative of the pathological population? (4) Were the main findings of the study clearly described? (5) Was a control group included, and if so, did it consist of non-tumor specimens from healthy age- and gender-matched subjects? (6) Were diagnostic or prognostic measures clearly defined (e.g., TNM stage or five-year survival rate)? (7) Were samples collected from a relevant source (i.e., tumor tissue, blood, or feces) in a manner to prevent degradation and contamination? (8) Was miRNA expression measured by a validated technique (e.g., miRNA-seq, microarray, or quantitative real-time PCR [q-PCR])? (9) Was a sample size justification via power analysis provided? (10) Were potential confounders properly controlled in the analysis? Each of the 10 criteria was scored on a scale of zero to two, depending on whether the criterion was unmentioned or unmet (0), partially met (1), or completely met (2). The possible total score ranged from zero to 20. The two authors of the review, Laura Moody and Svyatoslav Dvoretskiy, independently scored each study and discussed any disagreement. The study quality score was used to quantify the strength of existing evidence but was not used in the study selection.
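The scoring scheme can be sketched in Python as follows. The function names `total_quality_score` and `disagreements` and the example scores are hypothetical; the sketch only encodes the rules stated above (10 criteria, each scored 0, 1, or 2, with two independent raters).

```python
N_CRITERIA = 10
ALLOWED_SCORES = (0, 1, 2)  # unmet or unmentioned, partially met, completely met


def total_quality_score(scores):
    """Sum the 10 criterion scores into a total between 0 and 20."""
    if len(scores) != N_CRITERIA:
        raise ValueError("exactly 10 criterion scores are required")
    if any(score not in ALLOWED_SCORES for score in scores):
        raise ValueError("each criterion must be scored 0, 1, or 2")
    return sum(scores)


def disagreements(rater_a, rater_b):
    """Return the 1-based criterion numbers on which two raters differ."""
    return [i + 1 for i, (a, b) in enumerate(zip(rater_a, rater_b)) if a != b]


# Hypothetical scores from two raters for one study.
rater_a = [2, 2, 1, 2, 1, 2, 2, 2, 0, 1]
rater_b = [2, 2, 1, 2, 0, 2, 2, 2, 0, 1]
print(total_quality_score(rater_a), disagreements(rater_a, rater_b))
```

Listing the criteria on which the raters differ mirrors the step in which disagreements were discussed before a final score was assigned.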
This study synthesized existing evidence on miR-20a as a diagnostic and prognostic biomarker of CRC. Diagnostic and prognostic efficacy was examined in a total of 5014 CRC patients in 32 studies. Overall, miR-20a was found to be a potentially promising diagnostic marker of CRC. In feces, serum, and tumor tissue, a majority of the evaluated studies found an upregulation of miR-20a. On the other hand, high miR-20a expression was not a statistically significant predictor of poor patient prognosis.
In all biological specimens, miR-20a was found to be more highly expressed in cancer patients than in control subjects. These findings are consistent with mechanistic studies that found miR-20a to target numerous genes involved in cell cycle regulation. Tumorigenesis is initially characterized by the inadequacy of DNA repair machinery to maintain appropriate cellular function. Downregulation of tumor suppressor genes and activation of oncogenes lead to a death-resistant, hyperplastic state that results in cancer progression and eventually metastasis. miR-20a is a member of the miR-17 family known for its oncogenic properties. Indeed, miR-20a overexpression in cell lines has been shown to promote cell cycle progression [35], while inhibition of miR-20a induces an E2F1-associated DNA damage response and G1 checkpoint activation [70]. Thus, it is likely that the high levels of miR-20a observed in the meta-analysis reflect an inability to check cellular growth, resulting in a malignant state.
The potential of miR-20a as a positive diagnostic biomarker of CRC was further supported by evaluating sensitivity and specificity through the AUROC analysis. While the gold standard for CRC diagnosis remains colonoscopy, non-invasive tests have been considered for screening at-risk patients. Carcinoembryonic antigen (CEA) and cancer antigen 19-9 (CA19-9) are two such blood tests that have already been implemented in the clinic. One meta-analysis evaluated the use of CEA in detecting CRC recurrence and found an AUROC of 0.75 [71]. Similarly, studies examining the diagnostic validity of CA19-9 have observed AUROC values between 0.69 and 0.77 [72]. Our meta-analysis yielded a pooled AUROC of 0.70 (95% CI: 0.63–0.78). This value is comparable to those of CEA and CA19-9, suggesting that miR-20a may be useful in CRC diagnosis.
In the current review, CRC prognosis was assessed through HRs for OS. All studies reported statistically significant HR values greater than one for both DFS and OS, but only two studies evaluated DFS, so a meta-analysis of DFS was not performed. Five studies were pooled for the meta-analysis of HR for OS. The pooled estimate pointed toward earlier death in patients with high miR-20a expression than in those with lower expression. However, although the effect size was large, it was not significantly greater than one, so we cannot conclude that high miR-20a expression is indicative of lower OS. Previous literature has quantified the prognostic power of other RNAs in CRC. Two meta-analyses found miR-21 to be a predictor of poor prognosis [74]. Beyond miRNA, the prognostic value of mRNA expression has also been explored in CRC patients. For instance, overexpression of the metastasis-associated in colon cancer 1 (MACC1) gene was associated with poor survival [76], while the amphiregulin (AREG) and epiregulin (EREG) genes were found to be favorable prognostic biomarkers [77]. Compared to other RNA prognostic predictors, the pooled HR for OS of miR-20a calculated in the present review was relatively high. However, possibly due to the small number of studies, our modeling results were statistically nonsignificant. Further clinical investigation is needed to determine the efficacy of miR-20a as a prognostic biomarker of CRC.
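To make the significance argument concrete, the sketch below pools HRs on the log scale with inverse-variance weights, recovering each standard error from the reported 95% CI. It is a simplified fixed-effect illustration with hypothetical inputs; the function name `pool_hazard_ratios` and the example numbers are assumptions and do not reproduce the five pooled studies.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% CI


def pool_hazard_ratios(studies):
    """Pool (hr, ci_lower, ci_upper) tuples with inverse-variance weights.

    The standard error of log(HR) is recovered from the 95% CI as
    (ln(upper) - ln(lower)) / (2 * 1.96).
    """
    log_hrs, weights = [], []
    for hr, lower, upper in studies:
        se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
        log_hrs.append(math.log(hr))
        weights.append(1.0 / se ** 2)

    total = sum(weights)
    pooled_log = sum(w * y for w, y in zip(weights, log_hrs)) / total
    pooled_se = math.sqrt(1.0 / total)
    ci_lower = math.exp(pooled_log - Z_95 * pooled_se)
    ci_upper = math.exp(pooled_log + Z_95 * pooled_se)
    return {
        "hr": math.exp(pooled_log),
        "ci": (ci_lower, ci_upper),
        # Significant only if the CI excludes the null value HR = 1.
        "significant": ci_lower > 1.0 or ci_upper < 1.0,
    }


# Hypothetical study: a large HR with a wide CI that still includes 1.
print(pool_hazard_ratios([(2.0, 0.5, 8.0)]))
```

The example shows how a pooled HR well above one can remain statistically nonsignificant when the confidence interval is wide, which is the situation expected with a small number of studies.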
The present meta-analysis had limitations. Only a few studies specifically examined miR-20a in the prognosis and diagnosis of CRC. In addition, some studies did not report sufficient data to be included in the meta-analysis. Much of the data was reported without a standard deviation or the information (e.g., CI) necessary to calculate one. Several studies focused on multiple miRNAs rather than miR-20a alone, and data were often presented as a ratio of one miRNA to another. Thus, only a few studies clearly compared cancer with control samples. This compromised the statistical power to identify the effect of miR-20a as a diagnostic and prognostic biomarker of CRC. Furthermore, many studies did not make a distinction between colon and rectal cancer and did not report tumor location. Given the differences in genetic profile and clinical outcomes between colon and rectal cancer, as well as between proximal and distal colon carcinomas [78], it is unclear whether miR-20a could be a better biomarker in specific tumor subtypes. Finally, several barriers remain to interpreting and implementing miRNA biomarkers in the clinic. The constantly evolving methodology has made it difficult to compare studies, especially in establishing a baseline miRNA expression level. Genome-wide next-generation sequencing allows researchers to simultaneously evaluate multiple miRNAs and has great potential for biomarker discovery, but a standardized procedure has yet to be developed for quantifying miRNA and drawing clinically relevant conclusions. Proper expertise in both data analysis and cancer biology is important for separating signal from noise when identifying CRC biomarkers.