Expression Status and Prognostic Value of m6A RNA Methylation Regulators in Lung Adenocarcinoma

N6-methyladenosine (m6A) RNA modification is the most abundant modification method in mRNA, and it plays an important role in the occurrence and development of many cancers. This paper mainly discusses the role of m6A RNA methylation regulators in lung adenocarcinoma (LUAD) to identify novel prognostic biomarkers. The gene expression data of 19 m6A methylation regulators in LUAD patients and its relevant clinical parameters were extracted from The Cancer Genome Atlas (TCGA) database. We selected three significantly differentially expressed m6A regulators in LUAD to construct the risk signature, and evaluated its prognostic prediction efficiency using the receiver operating characteristic (ROC) curve. Kaplan–Meier survival analysis and Cox regression analysis were used to identify the independent prognostic significance of the risk signature. The ROC curve indicated that the area under the curve (AUC) was 0.659, which means that the risk signature had a good prediction efficiency. The results of the Kaplan–Meier survival analysis and Cox regression analysis showed that the risk score can be used as an independent prognostic factor for LUAD. In addition, we explored the differential signaling pathways and cellular processes related to m6A methylation regulators in LUAD.


Background
Lung cancer is the leading cause of cancer-related deaths worldwide. (1) There are many risk factors for lung cancer, with smoking, environmental, and occupational exposure as the most common risk factors.
(2) In the past few decades, medical technology has made great progress, but the treatment effect for lung cancer patients is not ideal. The 5-year survival rate for lung cancer is reported to be 19%, one of the lowest ve-year survival rates, while adenocarcinoma, the most common histological subtype of lung cancer, is more aggressive and has a poorer prognosis. (3)(4)(5) To relieve the current clinical treatment pressure and improve the prognosis of patients, it is necessary to nd reliable prognostic markers to optimize the treatment regimen for lung adenocarcinoma (LUAD).
N6-methyladenosine (m 6 A) is a methylated modi cation of RNA molecules that was rst discovered in 1974. (6) As of the end of 2017, more than 150 post-transcriptional modi cations have been identi ed in all organisms, and m 6 A is the most common internal mRNA modi cation found in eukaryotes and plays a key role in a variety of basic biological processes such as cell differentiation, tissue development, and tumorigenesis. (7)(8)(9)(10) In mammals, about 0.1-0.4% of adenosine in isolated RNA is m 6 A modi ed, accounting for about 50% of the total methylated ribonucleotide. (11) Dominissini et al. (12) used a new method of antibi-mediated capture and massively parallel sequencing, m 6 A-seq, to nd that m 6 A is clustered in the termination codon, the 3' untranslated region (3' UTR), and the internal exon. m 6 A methylation is a dynamic reversible process catalyzed by three types of proteases: Writers (Methyltransferase complex, including METTL3/14/16, WTAP, IC3H13, ZC3H13, RBM15/15B, and KIAA1429), Erasers (Demethylases, including FTO and ALKBH5) and Readers (including YTHDF1/2/3, IGF2BP1/2/3, YTHDC1/2, HNRNPC, HNRNPG, and HNRNPA2B1). (11,13,14) Writers mediate the methylation modi cation process of RNA to "write" the methylation modi cation to RNA, and Readers are responsible for "reading" the information of RNA methylation modi cation, and participate in downstream RNA translation and degradation processes, and then rely on Erasers Mediating the process of RNA demethylation modi cation can "erased" the RNA methylation modi cation signal, thereby making the m 6 A modi cation process dynamic and reversible. (15) At present, many studies have shown that m 6 A methylation regulators are closely related to the occurrence and development of tumors. For example, Taketo et al. (16) indicated that METL3, as an m 6 A regulator, is up-regulated in patients with pancreatic cancer and is an effective target in the treatment of the patients. Maetal. et al. (17) found that down-regulation of METTL14 expression is a poor prognostic factor for hepatocellular carcinoma and is closely related to tumor metastasis. However, there is still insu cient information about the role of m 6 A RNA methylation regulators in LUAD. Therefore, in this study, RNA sequencing data were obtained from The Cancer Genome Atlas (TCGA), and the expression data of 19 m 6 A methylation regulators in 535 LUAD patients and 59 normal people were systematically analyzed, as well as their association with clinicopathological characteristics. We used least absolute shrinkage and selection operator (LASSO) Cox regression algorithm to analyze 19 m 6 A methylation regulators, and selected IGF2BP1, HNRNPC, and HNRNPA2B1 to construct the minimum standard risk signature, and the Kaplan-Meier survival analysis, univariate, and multivariate Cox regression analysis were used to identify the predictive effect of the risk signature on LUAD patient prognosis. Gene Set Enrichment Analysis (GSEA), Gene Ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis were used for further functional annotation.

Data Acquisition
All data in this study were downloaded from the TCGA (https://cancergenome.nih.gov/) database, including gene expression data (RNA-seq) and corresponding clinical information of 535 LUAD patients and 59 normal people. If any parameter value was missing, the entire patient data will be excluded from the analysis. After screening, 479 clinical data of LUAD patients were retained. (Additional le 1: Table  S1) 2.2 Selection of m 6 A methylation regulators and analysis of their differential expression The TCGA database provides the expression data of 19 m 6 A methylation regulators, including YTHDF3,

Correlation Analysis Of Ma Methylation Regulators
In order to further study the correlation between m 6 A methylation regulators, the co-expression correlation analysis was carried out, and the results were visualized by the "corrlot" package. We performed univariate Cox regression analysis on the expression of 19 m 6 A RNA methylation regulators in 479 LUAD tumor tissues, and genes with P < 0.05 were considered to be signi cantly associated with the survival of LUAD patients.

Construction And Veri cation Of Risk Signature
To verify the prognostic effect of m 6  And patients were classi ed into low-risk and high-risk group according to the median of risk scores, and performed principal component analysis (PCA) on the grouping results. PCA was performed by the Limma package, and the results were visualized by the ggplot2 package. Survival package (21) was used to compare the overall survival (OS) rate of the high-risk and low-risk groups by the Kaplan-Meier method. Then we constructed a receiver operating characteristic (ROC) curve (22) to evaluated the prediction e ciency of the risk signature.

Analysis Of Prognostic Ability Of The Three-gene Signature
The "pheatmap" package was used to generate heat maps to visually analyze the expression differences of the three genes in the high-risk and low-risk groups, as well as the expression differences of the three genes in LUAD patients with different clinicopathological characteristics. Univariate, and multivariate independent prognostic analysis of the risk score were performed to identify the prognostic value of the risk signature. GSEA, were used to annotate the differential signaling pathways and cellular processes between the two groups. GO (23) enrichment, and KEGG (24) pathway analysis were used to analyze the differentially expressed genes (DEGs) between the high-risk group and the low-risk group. When the P value was less than 0.05, the enrichment pathway was considered to be statistically signi cant.

Statistical Analysis
One-way analysis of variance was used to compare the expression of m 6 A RNA methylation regulators in tumor tissues of TCGA lung adenocarcinoma patients. The relationship between m 6 A RNA methylation regulator and clinicopathological characteristics of LUAD patients was analyzed by t-test. OS was de ned as the time interval from the date of diagnosis to the date of death. The Kaplan-Meier method was used for OS analysis to conduct a bilateral log-rank test. All statistical analyses were performed using R software (version 3.6.2). P < 0.05 was considered statistically signi cant.

Expression of m 6 A RNA methylation regulators in LUAD
We compared the expression levels of 19 m 6 A RNA methylation regulators in of LUAD tissues and 59 normal tissues extracted from the TCGA database. The results as shown in Fig. 1a and b, we found that there was a signi cant difference in the expression level of YTHDF3, YTHDF2, YTHDF1, KIAA1429, HNRNPA2B1, RBM15, METTL3, HNRNPC, IGF2BP2, IGF2BP3, IGF2BP1, FTO, ZC3H13, WTAP, METTL14 between LUAD carcinoma and normal tissues. Among these regulators, the expression level of YTHDF3, YTHDF2, KIAA1429, HNRNPA2B1, RBM15, METTL3, HNRNPC, YTHDF1, IGF2BP2, IGF2BP3, IGF2BP1 in LUAD was signi cantly higher than that in normal tissues, while FTO, ZC3H13, WTAP, METTL14 was lower than that in normal tissues.

Construction Of The Ma-related Risk Signature
To explore the prognostic value of the three-gene risk signature, we divided the LUAD patients obtained from TCGA into a low-risk group and a high-risk group based on the median risk score. As shown in Fig. 3a, as the risk score increases, the number of deaths in the high-risk group is signi cantly higher than that in the low-risk group. We also performed PCA on the risk signature to compare the differences between the two groups. The results showed that the distribution directions of the two groups were different and there was a clear boundary, suggesting that the risk signature can divide LUAD patients into two groups. (Fig. 3b) Then we constructed a Kaplan-Meier survival curve to analyze the OS rate of the two groups of patients. The results are shown in Fig. 3c, there is a signi cant difference in the OS rate between the high-risk group and the low-risk group (P = 1.257e-04). The OS rate of LUAD patients in the low-risk group is signi cantly higher than that of the LUAD patients in the high-risk group. In the followup, the ROC curve was established to evaluate the e ciency of the risk signature for predicting the 5-year survival rate of LUAD patients. As shown in Fig. 3d, the AUC was 0.659, indicating that the risk signature has a good predictive e ciency on the 5-year survival rate of LUAD patients.

Prognostic Analysis Of The Ma-related Risk Signature
After dividing the high-risk group and the low-risk group according to the median risk score, we further compared the clinicopathological characteristics and the expression of three genes between the two groups of patients. The heat map in Fig. 4a showed that there were signi cant differences at T, stage, and status between the high-risk group and the low-risk group, and the expression of IGF2BP1, HNRNPC, and HNRNPA2B1 in the high-risk group was up-regulated, while in the low-risk group was down-regulated.
To verify whether the risk signature can be used as an independent prognostic indicator of LUAD, we conducted univariate and multivariate independent prognostic analysis on the risk score. As shown in Fig. 4b and c, the results of univariate Cox independent prognostic analysis showed that risk score (P 0.001), stage (P 0.001), T (P 0.001), N (P 0.001) were signi cantly related to OS in LUAD patients. The results of multivariate Cox independent prognosis analysis showed that risk score (P < 0.001), stage (P = 0.002), N (P = 0.031), which means the risk signature can be used as an independent prognostic factor for LUAD.

Functional Enrichment Analysis
We used GSEA to analyze the signal pathways of different prognosis between the high-risk group and the low-risk group. There are dramatic differences in the expression of genes involved in some pathways between the two groups. For example, genes involved in the cell cycle, genetic material metabolism, gene replication, and translation process are signi cantly up-regulated in the high-risk group, (Fig. 5) and these pathways are obviously related to the occurrence and development of cancer. After GESA analysis, we screened out a total of 378 DEGs (Log FC > 1, P < 0.05) between the high-risk and low-risk groups. To explore the differences in signal pathways and cellular processes between the two groups, we performed GO enrichment and KEGG pathway analysis on DEGs. The GO analysis results showed that DEGs were mainly enriched in tubulin binding, microtubule binding, and ATPase activity, while the KEGG analysis results showed that DEGs are mainly enriched in the cell cycle. (Fig. 6a and b)

Discussion
Lung adenocarcinoma is the most common subtype of non-small-cell lung cancer, accounting for almost 50% of all lung cancers, and its incidence is increasing year by year and overall survival is lower than that of most cancers. (5) Traditional treatment methods include surgery, radiotherapy, and chemotherapy. (25) The choice of treatment depends on the type of cancer (small cell or non-small cell), stage of development, and genetic characteristics. (26) If it can be detected at an early stage, surgical removal of non-small cell lung cancer can provide a good prognosis. However, more than 70% of patients with nonsmall cell lung cancer are diagnosed with advanced or metastatic disease. (27) Therefore, there is an urgent need to explore potential prognostic biomarkers and promising targets.
As the most common modi cation in human mRNA, m 6 A methylation modi cation is involved in regulating mRNA processing, translation, and stability. (28) Many studies have con rmed that m 6 A RNA modi cation is related to tumor proliferation, differentiation, tumorigenesis, invasion, and metastasis. (17,29,30) However, the role of m 6 A methylation regulators in the occurrence and development of LUAD needs to be further clari ed. Therefore, in this study, we explored the role of m 6 A RNA methylation regulators in LUAD. Different m 6 A RNA methylation regulators have different effects on the same cancer, which may be related to their differential expression in different cancers. For example, as an "eraser" in the process of m 6 A RNA methylation, FTO is highly expressed in breast, liver, and gastric cancer tissues compared to normal tissues, and is associated with poor prognosis, while FTO expression in bladder cancer tissues is lower than that in normal tissues. (31)(32)(33)(34) Interestingly, in this study, we analyzed the data of 479 LUAD patients extracted from the TCGA database and showed that the 19 m 6 A RNA methylation regulators we studied had abnormal expression in LUAD patients. Among them, the expression of FTO in cancer tissues of LUAD patients is signi cantly lower than that in normal tissues, which further validates the above viewpoint.
Insulin-like growth factor 2 mRNA-binding proteins (IGF2BPs, IGF2BP1/2/3), as a newly discovered m 6 A regulator in recent years, promotes the stability and storage of its target mRNA (such as MYC) in an m 6 Adependent manner, and play a carcinogenic effect in cancer. (35,36) Interestingly, we performed univariate Cox regression analysis on 19 m 6 A regulatory factors and found that IGF2BPs were signi cantly related to the survival of LUAD patients, and were signi cantly highly expressed in cancer tissues of LUAD patients. Therefore, we believe that the high expression of IGF2BPs may be one of the factors leading to the occurrence and development of LUAD in patients.
Next, we selected three m 6 A RNA methylation regulators (IGF2BP1, HNRNPC, and HNRNPA2B1) to construct a minimum standard risk signature, which had a good predictive effect on the prognosis of LUAD patients. The higher the signature-based risk score, the worse the prognosis. At the same time, we divided 479 LUAD patients into high-risk and low-risk groups based on the median risk score, and veri ed the results with PCA and showed that the results of grouping according to the median risk score have a clear boundary, and Kaplan-Meier survival analysis results show that the signature can signi cantly distinguish LUAD patients with different OS, and also has a good predictive e ciency on the prognosis of LUAD patients (AUC = 0.659). Besides, We further performed univariate and multivariate Cox independent prognostic analysis of the risk score, and the results showed that the risk score is an independent prognostic factor of LUAD. The results of the GESA analysis showed that the poor prognosis of the highrisk group may be related to cell cycle, genetic material metabolism, gene replication, translation process, and other cellular processes and signaling pathways. The results of the GO and KEGG analysis of DEGs between the two groups suggest that m 6 A modi cation in LUAD mainly affects the cell cycle, cell senescence, and other biological processes, and thus has an effect on the development and development of LUAD. These results can be used as a theoretical basis for further research on the pathogenesis of LUAD and the establishment of new risk classi cation and prognostic models.

Conclusions
In summary, our research shows that m 6 A RNA methylation regulators are closely related to the occurrence, development, and poor prognosis of LUAD. Three selected m 6 A RNA methylation regulators (IGF2BP1, HNRNPC, and HNRNPA2B1) were used to construct a risk signature, which was veri ed to be an independent prognostic factor of LUAD. However, the speci c role of IGF2BP1, HNRNPC, and HNRNPA2B1 in LUAD needs further experimental veri cation. Red represents up-regulation and green represents down-regulation. (b) Vioplot visualizing the differentially expressed m6A RNA methylation regulators in LUAD. * P 0.05 ** P 0.01 *** P 0.001.

Figure 2
The correlation among 19 selected m6A RNA methylation regulators. (a) Spearman correlation analysis of m6A RNA methylation regulator in LUAD. (b) The hazard ratio (HR) and 95% con dence interval (CI) of m6A RNA methylation regulator calculated by univariate COX regression analysis.  The relationship between risk score and clinicopathological characteristics of LUAD. (a) Heat map of the expression of three m6A RNA methylation regulators and the distribution of clinicopathological variables between the high-risk and low-risk groups. * P 0.05 ** P 0.01 *** P 0.001. (b) (c) Univariate and multivariate independent prognostic analysis between clinicopathological characteristics and risk scores and OS of patients. The horizontal line in the color module represents the con dence interval of each factor.
Page 17/18 Function enrichment analysis of risk signature based on GSEA. The genes involved in the six functional pathways in the gure were signi cantly up-regulated in the LUAD high-risk group.