A Three–MicroRNA Signature as a Potential Biomarker for the Early Detection of Oral Cancer

Oral squamous cell carcinoma (OSCC) is often diagnosed at a late stage and may arise through malignant transformation of oral leukoplakia (OL). This study aimed to identify potential plasma microRNAs (miRNAs) for the early detection of oral cancer. Plasma samples from normal individuals and from patients with OL or OSCC were evaluated. Small RNA sequencing was used to screen for miRNAs that were differentially expressed among the groups. Next, these miRNAs were validated in individual samples by quantitative real-time polymerase chain reaction (qRT-PCR) assays in the training phase (n = 72) and validation phase (n = 178). The possible physiological roles of the identified miRNAs were further investigated using bioinformatics analysis. Three miRNAs (miR-222-3p, miR-150-5p, and miR-423-5p) were identified as differentially expressed among groups; miR-222-3p and miR-423-5p were negatively correlated with T stage, lymph node metastasis status, and clinical stage. A high diagnostic accuracy (area under the curve = 0.88) was demonstrated for discriminating OL from OSCC. Bioinformatics analysis revealed that miR-423-5p and miR-222-3p are significantly over-expressed in oral cancer tissues and involved in various cancer pathways. The three-miRNA plasma panel may be useful for monitoring malignant progression from OL to OSCC and may serve as a potential biomarker for the early detection of oral cancer.


Extraction of RNA from plasma
Blood was centrifuged at 3000 × g for 10 min to separate the plasma, which was then stored at −80°C. Hemolysis was monitored based on the optical density at 414 nm [1]. For small RNA sequencing, equal volumes of each individual sample were mixed to obtain a pooled sample. Total RNA was extracted from 3 ml of pooled plasma using phenol-chloroform extraction followed by column purification. RNA concentration was assessed using the Qubit RNA Assay Kit (Life Technologies). For individual qRT-PCR assays, RNA was extracted from 200 µl of plasma with the miRNeasy Serum/Plasma Kit (Qiagen), and synthetic Caenorhabditis elegans miR-39 was added as a spike-in control according to the manufacturer's instructions.

Risk score analysis
To correlate the combined miRNA panel with OL or OSCC risk, each patient was assigned a risk score using a risk score function (RSF). The RSF for patient i was calculated using the following formula: $\mathrm{RSF}_i = \sum_{j} W_j \cdot S_{ij}$. Here, the score ($S_{ij}$) of miRNA j in patient i was weighted by $W_j$, the regression coefficient estimated by the univariate logistic regression model for each miRNA [6,7]. Receiver operating characteristic (ROC) curves of the combined miRNA panel were then generated from the predicted probability (P) for each patient, where P = exp(combined miRNA panel) / [1 + exp(combined miRNA panel)].
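The risk score and predicted probability described above can be illustrated with a minimal Python sketch. It assumes that the "combined miRNA panel" term in the probability formula is the weighted sum itself, with no intercept, which the text does not state. The weights and expression scores below are hypothetical placeholders, not values from this study, and the `auc` helper is a generic rank-based estimate of the area under the ROC curve rather than the software used here.

```python
import math

def risk_score(scores, weights):
    """Weighted sum over miRNAs j: RSF_i = sum_j W_j * S_ij for one patient."""
    return sum(weights[mirna] * scores[mirna] for mirna in weights)

def predicted_probability(rsf):
    """Logistic transform, algebraically equal to exp(x) / (1 + exp(x))."""
    return 1.0 / (1.0 + math.exp(-rsf))

def auc(case_values, control_values):
    """Rank-based AUC: chance that a case scores above a control (ties count 0.5)."""
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))

# Hypothetical regression coefficients (W_j); NOT estimates from the study.
WEIGHTS = {"miR-222-3p": -0.8, "miR-150-5p": 0.5, "miR-423-5p": -0.6}

# Hypothetical expression scores (S_ij) for a single patient.
patient = {"miR-222-3p": 1.0, "miR-150-5p": 2.0, "miR-423-5p": 0.5}

rsf = risk_score(patient, WEIGHTS)
prob = predicted_probability(rsf)
print(f"RSF = {rsf:.3f}, P = {prob:.3f}")
```

Sweeping a threshold over the per-patient P values gives the ROC curve. The pairwise `auc` function summarizes the same curve without plotting it.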

The Cancer Genome Atlas (TCGA) miRNA sequencing data analysis
To verify whether the expression patterns of the identified miRNAs were consistent between plasma and solid tissue, we collected the miRNA expression profiles of solid head and neck cancer tissue from The Cancer Genome Atlas (TCGA, http://cancergenome.nih.gov/) and compared them with those of plasma in our study. The miRNA sequencing data and patients' clinical information from the TCGA Head-Neck Squamous Cell Carcinoma (HNSC) dataset (data version: 2016_01_28) were retrieved from the FireBrowse database (http://firebrowse.org/). The Level 3 miRNA sequencing data with normalized miRNA expression values were analyzed to further elucidate the relationship between miRNA expression and clinical stage. Samples from the oral cavity, oral tongue, floor of mouth, buccal mucosa, base of tongue, or hard palate were selected according to the "anatomic neoplasm subdivision" field in the clinical information. A total of 295 tumor samples and 32 adjacent normal samples were selected for further analysis.
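The site-based sample selection can be sketched as a simple filter over clinical records. This is an illustration only: the field name `anatomic_neoplasm_subdivision`, the record layout, and the example rows are assumptions made for the sketch and may not match the actual FireBrowse file format.

```python
# Anatomic sites listed in the text as eligible for the oral cancer subset.
ORAL_SITES = {
    "oral cavity",
    "oral tongue",
    "floor of mouth",
    "buccal mucosa",
    "base of tongue",
    "hard palate",
}

def select_oral_samples(clinical_rows, site_field="anatomic_neoplasm_subdivision"):
    """Keep only records whose anatomic site is one of the oral sites.

    The field name is a hypothetical label for the "anatomic neoplasm
    subdivision" entry mentioned in the text.
    """
    selected = []
    for row in clinical_rows:
        site = (row.get(site_field) or "").strip().lower()
        if site in ORAL_SITES:
            selected.append(row)
    return selected

# Made-up records for illustration; these are not TCGA data.
rows = [
    {"patient_id": "P1", "anatomic_neoplasm_subdivision": "Oral Tongue"},
    {"patient_id": "P2", "anatomic_neoplasm_subdivision": "Larynx"},
    {"patient_id": "P3", "anatomic_neoplasm_subdivision": "floor of mouth"},
    {"patient_id": "P4", "anatomic_neoplasm_subdivision": None},
]

oral = select_oral_samples(rows)
print([r["patient_id"] for r in oral])
```

Matching is case-insensitive and tolerates missing values, since clinical annotation fields are often inconsistently capitalized or empty.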

Identification of miRNA-targets and functional annotation of target genes
To identify high-confidence miRNA-target interactions (MTIs), experimentally verified MTIs with strong evidence were collected from miRTarBase 6.0 [8]. We also predicted miRNA targets using TargetScanHuman 7.0 and miRDB. Together, the experimentally verified and predicted targets generated a list of genes. The functional annotation tool DAVID [9] was employed to illustrate the biological regulatory roles of these genes based on Gene Ontology [10] and the KEGG pathway database [11]. In addition, Ingenuity Pathway Analysis (IPA) software (Ingenuity Systems, Inc.) was applied to analyze the canonical pathways, networks, and biological functions of the identified miRNAs.

Figures and Tables
Supplementary Figure S1. qPCR confirmation of differentially expressed miRNAs identified by NGS. Each dot represents a miRNA differentially expressed between the specified groups. Inconsistent results between NGS and qPCR are shown.
Figure S2. TCGA analysis of identified miRNAs. The TCGA data set, comprising a total of 295 tumor samples and 32 adjacent normal samples, was analyzed to compare miRNA abundance among groups.
Figure S3. The top 10 most enriched pathways by IPA analysis. Pathways with average z score > 2 or < -2 and -log (p-value) > 1.301 were included.
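The inclusion rule in the Figure S3 caption can be expressed as a short filter. The sketch below assumes the logarithm is base 10, in which case the cutoff of 1.301 corresponds approximately to p < 0.05. The pathway names and values are hypothetical placeholders, not IPA output from this study.

```python
import math

def passes_filter(z_score, p_value, z_cutoff=2.0, neg_log_p_cutoff=1.301):
    """True when |z| exceeds the cutoff and -log10(p) exceeds 1.301.

    Base-10 logarithm is an assumption; the caption does not state the base.
    """
    return abs(z_score) > z_cutoff and -math.log10(p_value) > neg_log_p_cutoff

# Hypothetical pathways: (name, average z score, p-value).
pathways = [
    ("pathway A", 2.5, 0.001),   # strong activation, significant
    ("pathway B", -2.3, 0.01),   # strong inhibition, significant
    ("pathway C", 1.2, 0.0001),  # significant, but |z| too small
    ("pathway D", 3.0, 0.2),     # large |z|, but not significant
]

kept = [name for name, z, p in pathways if passes_filter(z, p)]
print(kept)
```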