Identification of an Amino Acid Metabolism-Related Gene Signature for Predicting Prognosis in Lung Adenocarcinoma

Abstract: Background: Dysregulation of amino acid metabolism (AAM) is an important factor in cancer progression. This study aimed to investigate the prognostic value of AAM-related genes in lung adenocarcinoma (LUAD). Methods: The mRNA expression profiles of LUAD datasets from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) were used as the training and validation sets, respectively. After identifying the differentially expressed AAM-related genes, an AAM-related gene signature (AAMRGS) was constructed and validated. Additionally, we systematically analyzed the differences in immune cell infiltration, biological pathways, immunotherapy response, and drug sensitivity between the two AAMRGS subgroups. Results: The prognosis-related signature was constructed from key AAM-related genes. LUAD patients were divided into AAMRGS-high and AAMRGS-low groups. Patients in the two subgroups differed in prognosis, tumor microenvironment (TME), biological pathways, and sensitivity to chemotherapy and immunotherapy. The area under the receiver operating characteristic (ROC) curve and the calibration curves showed good predictive ability for the nomogram. Analysis of immune cell infiltration revealed that the TME of the AAMRGS-low group was in a state of immune activation. Conclusion: We constructed an AAMRGS that could effectively predict prognosis and guide treatment strategies for patients with LUAD.


Introduction
LUAD is the most prevalent type of lung cancer, accounting for 40% of all lung cancer cases [1]. Its 5-year survival rate is only 15% [2]. Early-stage LUAD can be detected by computed tomography (CT) [3] and then cured by radical surgical resection. However, in more than 50% of patients, LUAD has already metastasized at initial diagnosis [4]. In recent decades, with further study of molecular targets such as EGFR, ALK, and PD-1/PD-L1, targeted therapy and immunotherapy for LUAD have achieved remarkable results [5]. However, owing to the heterogeneity of LUAD, only a small proportion of patients benefit from such therapy [6]. These patients initially control their disease with targeted therapy or immunotherapy, but resistance eventually develops [7]. The existing TNM staging does not reflect the molecular features of LUAD [8]. Therefore, it is necessary to identify new therapeutic targets through large-scale data analysis to provide individualized treatment plans for patients with LUAD.
Metabolic reprogramming is a core characteristic of cancer [9]. Compared with normal cells, cancer cells require large amounts of energy for proliferation, invasion, and metastasis. The metabolism of amino acids and their derivatives is one of the key factors in cancer progression [10]. In addition, the function of immune cells in the TME is affected by amino acid metabolism, leading to tumor immune escape [11]. As shown in previous studies, glutamine is an important nutrient for cancer cells. It not only participates in the energy metabolism of cancer cells but also protects cells from oxidative stress as an antioxidant [12]. Rober et al. found that blocking the metabolism of glutamine could effectively inhibit the metabolic activity of cancer cells without affecting the activity of effector T cells, so glutamine-targeted therapy may become an effective measure to improve the efficacy of immunotherapy [13]. Using bioinformatics, Liu et al. found that glutamine metabolism significantly affected the TME status and immunotherapy efficacy in LUAD [14]. Branched-chain amino acids (BCAAs; valine, leucine, and isoleucine) are essential amino acids that promote the expression of mitochondria-related genes to enhance tumor proliferation and activate the mTOR signaling pathway to stimulate tumor growth [15]. Kayo et al. showed that the immunosuppressive effect of Treg cells decreased in mice after their intake of BCAAs was reduced [16]. These findings imply that tumor development is significantly influenced by amino acid metabolism, which has therefore emerged as a potential target for cancer therapy. However, the potential value of amino acid metabolism-related genes in LUAD has not been analyzed.
To better understand the prognosis of LUAD patients and give them more individualized care, we developed an AAMRGS to evaluate immune cell infiltration and the response to immunotherapy and chemotherapy in LUAD patients.

Data Collection
After excluding patients with missing survival information, mRNA expression data in transcripts per million (TPM) and the corresponding clinical information of 496 LUAD and 59 normal samples were obtained from the TCGA. The TPM values were then converted to log2(TPM + 1) for subsequent analysis. GSE72094 and GSE31210 from the GEO were collected as validation sets. AAM-related genes were obtained from the Molecular Signatures Database (MSigDB) [17] (Table S1).
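The log2(TPM + 1) conversion can be illustrated with a minimal sketch. The function name and the TPM values below are hypothetical and serve only to show the transform; the pseudocount of 1 keeps genes with zero TPM at zero after transformation.

```python
import math


def log2_tpm_plus_one(tpm_values):
    """Apply log2(TPM + 1) to a list of non-negative TPM values."""
    if any(v < 0 for v in tpm_values):
        raise ValueError("TPM values must be non-negative")
    return [math.log2(v + 1.0) for v in tpm_values]


# Hypothetical TPM values for one gene across four samples.
example_tpm = [0.0, 1.0, 3.0, 1023.0]
transformed = log2_tpm_plus_one(example_tpm)
print(transformed)  # [0.0, 1.0, 2.0, 10.0]
```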

Differentially Expressed AAM-Related Genes and Enrichment Analysis
The "limma" R package (version 3.50.3) was employed for differential expression analysis between the LUAD and normal tissues from the TCGA [18]. The thresholds were set to a false discovery rate (FDR) < 0.05 and |log2 fold change| ≥ 1. The "clusterProfiler" R package (version 4.2.2) was then used to perform GO and KEGG enrichment analyses of the differentially expressed AAM-related genes [19].
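The filtering step can be sketched as follows. This is not the limma workflow itself: it assumes that per-gene p-values and log2 fold changes have already been computed, and that the FDR is the Benjamini-Hochberg adjustment (an assumption, since the text does not name the method). Gene names and numbers are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log2_fc, p_values, fdr_cutoff=0.05, lfc_cutoff=1.0):
    """Keep genes with FDR < fdr_cutoff and |log2 fold change| >= lfc_cutoff."""
    fdr = benjamini_hochberg(p_values)
    selected = {}
    for gene, lfc, q in zip(genes, log2_fc, fdr):
        if q < fdr_cutoff and abs(lfc) >= lfc_cutoff:
            selected[gene] = "up" if lfc > 0 else "down"
    return selected


# Hypothetical input: four genes with log2 fold changes and raw p-values.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
log2_fc = [1.8, -1.2, 0.4, 2.5]
p_values = [0.001, 0.01, 0.002, 0.6]
degs = select_degs(genes, log2_fc, p_values)
print(degs)  # {'GENE_A': 'up', 'GENE_B': 'down'}
```

GENE_C is dropped because its fold change is too small, and GENE_D because its adjusted p-value is too large.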

Construction and Validation of the AAMRGS
To identify prognosis-related genes (p < 0.05), a univariate Cox regression analysis of the differentially expressed AAM-related genes was carried out. Subsequently, LASSO and multivariate Cox regression analyses were employed to identify the final key genes and their corresponding coefficients, which define the risk score: $\text{risk score} = \mathrm{Coef}_1 \times \mathrm{Gene}_1^{\mathrm{exp}} + \mathrm{Coef}_2 \times \mathrm{Gene}_2^{\mathrm{exp}} + \cdots + \mathrm{Coef}_n \times \mathrm{Gene}_n^{\mathrm{exp}}$, where $\mathrm{Coef}_i$ is the regression coefficient of gene $i$ and $\mathrm{Gene}_i^{\mathrm{exp}}$ is its expression level. Patients in the TCGA and GEO cohorts were divided by risk score into a low-risk group (35%) and a high-risk group (65%). Kaplan-Meier (KM) survival analysis and log-rank tests were then performed. Time-dependent ROC curves were used to assess the predictive power of the signature.
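A minimal sketch of the risk-score calculation and the 35%/65% grouping is given below. The gene names, coefficients, and expression values are hypothetical placeholders, and the grouping rule (the lowest-scoring 35% of patients, rounded to the nearest integer, form the low-risk group) is an assumption, since the exact cut-point procedure is not stated.

```python
def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient * expression over signature genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def split_by_fraction(scores, low_fraction=0.35):
    """Label the lowest-scoring `low_fraction` of patients 'low', the rest 'high'."""
    n_low = round(low_fraction * len(scores))
    ranked = sorted(scores, key=scores.get)
    low_ids = set(ranked[:n_low])
    return {pid: ("low" if pid in low_ids else "high") for pid in scores}


# Hypothetical coefficients and expression values (not taken from the paper).
coefficients = {"GENE_1": 0.5, "GENE_2": -0.25}
patients = {
    "P1": {"GENE_1": 2.0, "GENE_2": 4.0},
    "P2": {"GENE_1": 4.0, "GENE_2": 2.0},
    "P3": {"GENE_1": 6.0, "GENE_2": 0.0},
}
scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = split_by_fraction(scores)
print(scores)  # {'P1': 0.0, 'P2': 1.5, 'P3': 3.0}
print(groups)  # {'P1': 'low', 'P2': 'high', 'P3': 'high'}
```

Under this assumed rounding rule, a cohort of 496 patients is split into 174 low-risk and 322 high-risk patients.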

Development of a Nomogram
To assess whether the AAMRGS was a reliable prognostic marker for LUAD patients, univariate and multivariate Cox regression analyses were performed. Based on the results of the multivariate Cox regression, a nomogram was established using the "rms" R package (version 6.3-0). The predictive power of the nomogram was assessed by calibration curves and time-dependent ROC curves.

Gene Set Enrichment Analysis
To understand the differences in biological function and pathways between the AAMRGS-high and AAMRGS-low groups, GO and KEGG enrichment analyses were performed using GSEA (version: 4.2.3; Broad Institute, Inc., USA) [17]. The threshold was set at p < 0.05.

Immune Cell Infiltration Analysis
Single sample gene set enrichment analysis (ssGSEA) was employed with the "GSVA" R package (version 1.42.0) to analyze the degree of infiltration of 28 immune cells [20].
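The idea behind ssGSEA can be sketched for a single sample and a single gene set. This is a simplified illustration rather than the GSVA implementation: it uses a rank-weighted running sum with an assumed exponent of 0.25, breaks ties arbitrarily, and omits the normalization across samples that a full implementation may apply. All gene names and values are hypothetical.

```python
def ssgsea_score(expression, gene_set, alpha=0.25):
    """Simplified single-sample enrichment score.

    expression: dict mapping gene -> expression value for one sample.
    gene_set:   iterable of genes (e.g. markers of one immune cell type).
    """
    # Rank genes so that the most highly expressed gene has the largest rank.
    ascending = sorted(expression, key=expression.get)
    rank_of = {gene: r for r, gene in enumerate(ascending, start=1)}
    ordered = ascending[::-1]  # most highly expressed gene first

    members = set(gene_set) & set(expression)
    n_out = len(ordered) - len(members)
    if not members or n_out == 0:
        raise ValueError("gene set must overlap with, but not cover, the genes")

    total_weight = sum(rank_of[g] ** alpha for g in members)
    cum_in = 0.0   # weighted cumulative fraction of genes inside the set
    cum_out = 0.0  # cumulative fraction of genes outside the set
    score = 0.0
    for gene in ordered:
        if gene in members:
            cum_in += rank_of[gene] ** alpha / total_weight
        else:
            cum_out += 1.0 / n_out
        score += cum_in - cum_out  # accumulate the running difference
    return score


# Hypothetical sample: genes a-c are highly expressed, d-f are weakly expressed.
sample = {"a": 10.0, "b": 9.0, "c": 8.0, "d": 3.0, "e": 2.0, "f": 1.0}
score_top = ssgsea_score(sample, {"a", "b"})
score_bottom = ssgsea_score(sample, {"e", "f"})
print(score_top > 0, score_bottom < 0)  # True True
```

A gene set concentrated among the most highly expressed genes receives a positive score, whereas a set concentrated among the weakly expressed genes receives a negative score.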

Prediction of Immunotherapy Response
The immunophenoscore (IPS) of LUAD cases was acquired from The Cancer Immunome Atlas (TCIA) database [21]. The IPS is an effective indicator for predicting the sensitivity of LUAD patients to anti-PD-1 and anti-CTLA-4 immunotherapy. Furthermore, we used one immunotherapy dataset to evaluate the predictive ability of the signature: the IMvigor210 cohort of 298 patients with locally advanced or metastatic urothelial carcinoma who received anti-PD-L1 (atezolizumab) immunotherapy [22]. As with the TCGA and GEO cohorts, the IMvigor210 cohort was divided into a low-risk group (35%) and a high-risk group (65%).

Drug Sensitivity Analysis
To analyze the sensitivity of the two AAMRGS subgroups to chemotherapeutic drugs, the "pRRophetic" R package (version 0.5) was utilized to calculate the half-maximal inhibitory concentration (IC50) of popular chemotherapeutic drugs [23].

Statistical Analysis
All statistical analyses were performed using SPSS (version 26.0; IBM, Chicago, IL, USA) and R software (version: 4.1.3; The University of Auckland, Auckland, New Zealand). Differences between the two AAMRGS subgroups were calculated with the Wilcoxon rank-sum test. Cox regression analysis was used to identify risk factors for prognosis in patients with LUAD. Spearman rank correlation was used for correlation analysis. The threshold of all statistical analyses was p < 0.05.
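The Wilcoxon rank-sum comparison between two subgroups can be sketched as below. This is a minimal illustration that uses the normal approximation without continuity or tie correction, so p-values may differ slightly from those returned by statistical software, which may use an exact test for small samples. The input values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_rank_sum(x, y):
    """Return (U statistic of x, two-sided p-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)  # no tie correction
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return u, p_value


# Hypothetical infiltration scores for two subgroups.
low_group = [1.1, 2.3, 2.9, 3.8, 4.0]
high_group = [4.5, 5.2, 6.1, 6.8, 7.4]
u_stat, p_val = wilcoxon_rank_sum(low_group, high_group)
print(u_stat, p_val < 0.05)  # 0.0 True
```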

Identification and Enrichment Analysis of Differentially Expressed AAM-Related Genes
Sixty-four differentially expressed AAM-related genes were identified in TCGA (Figure 1A), of which 25 were downregulated and 39 were upregulated (Figure 1B). Subsequently, these 64 AAM-related genes were used for enrichment analysis. The results of the GO and KEGG analyses revealed that the AAM-related genes were concentrated in various amino acid metabolism pathways (Figure 1C,D).

Prognostic Value of AAMRGS
The LUAD patients were split into two groups: an AAMRGS-high group (n = 322) and an AAMRGS-low group (n = 174). The prognosis of the AAMRGS-high group was poorer than that of the AAMRGS-low group (p = 0.00099, Figure 3A). Figure 3B displays the risk score and survival status distribution; significantly more patients died in the AAMRGS-high group than in the AAMRGS-low group. The expression of the six key genes in the two subgroups was plotted in a heatmap (Figure 3C). The area under the curve (AUC) values of the AAMRGS for predicting survival at 1, 3, and 5 years were 0.735, 0.692, and 0.651, respectively (Figure 3D).
Subsequently, we computed the risk scores of LUAD cases from GSE72094 and GSE31210 using the same formula. The patients were split into the AAMRGS-high group and the AAMRGS-low group. Overall survival (OS) was poorer in the AAMRGS-high group than in the AAMRGS-low group (p < 0.05, Figure 4A,B). The risk score and survival status distribution, as well as the heatmap of the expression of the six genes, were similar to those of the TCGA cohort (Figure 4C-F). The AUC values of the signature for predicting survival showed good predictive ability (Figure 4G,H).
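The survival curves compared in this section are Kaplan-Meier (product-limit) estimates. A minimal sketch of the estimator for one group is shown below; the follow-up times and event indicators are hypothetical, and the log-rank comparison between groups is not included.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times:  follow-up time for each patient.
    events: 1 if the patient died at that time, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical follow-up times (months) and event indicators for five patients.
times = [5, 8, 12, 12, 20]
events = [1, 0, 1, 1, 0]
km_curve = kaplan_meier(times, events)
for t, s in km_curve:
    print(t, round(s, 4))
# 5 0.8
# 12 0.2667
```

Censored patients (event = 0) leave the risk set without lowering the survival estimate, which is why the patient censored at month 8 only reduces the denominator at month 12.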

Development of a Nomogram Based on AAMRGS
To evaluate whether the AAMRGS was a reliable prognostic marker for LUAD patients, we performed univariate and multivariate Cox regression analyses successively (Figure 5A,B). The results indicated that the AAMRGS was a dependable prognostic factor for LUAD patients (HR = 2.276, 95% CI: 1.639-3.161, p < 0.001). Subsequently, a nomogram was established to predict OS at 1, 3, and 5 years on the basis of the results of the multivariate Cox regression (Figure 5C). The nomogram's AUC values for predicting 1-, 3-, and 5-year survival were 0.752, 0.750, and 0.716, respectively (Figure 5D). The calibration curves also showed that the predicted and actual OS values at 1, 3, and 5 years were largely consistent (Figure 5E). These outcomes indicated the good predictive ability of the nomogram.
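The reported hazard ratio and confidence interval can be checked for internal consistency with a short calculation. The sketch assumes a Wald-type interval that is symmetric on the log scale (the usual form for Cox models, although the text does not state it explicitly); variable names are illustrative.

```python
import math

# Values reported for the AAMRGS in the Cox regression.
hr, ci_low, ci_high = 2.276, 1.639, 3.161

beta = math.log(hr)  # log hazard ratio (regression coefficient)
# Under the Wald assumption, the CI spans beta +/- 1.96 * SE on the log scale.
se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
z = beta / se
p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))

# The geometric mean of the CI bounds should recover the hazard ratio.
hr_from_ci = math.sqrt(ci_low * ci_high)

print(round(beta, 3), round(se, 3), round(z, 2))
print(round(hr_from_ci, 3), p_two_sided < 0.001)
```

Under this assumption, the midpoint of the interval on the log scale reproduces the reported HR, and the implied p-value is below 0.001, in line with the reported result.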

Differences in Biological Function and Pathways
GSEA was performed for the two AAMRGS subgroups. According to the GO enrichment analysis, the AAMRGS-high group was mostly enriched in cell cycle and DNA replication processes (Figure 6A). In the AAMRGS-low group, enrichment in B cell-mediated immune processes and antigen presentation was observed (Figure 6B). The KEGG results indicated that the AAMRGS-high group was enriched in the DNA replication, cell cycle, and p53 signaling pathways (Figure 6C), while the AAMRGS-low group showed enrichment mainly in the metabolic processes of a variety of nutrients (Figure 6D).

Analysis of Immune Cell Infiltration and Prediction of Immunotherapy Response
To investigate the connection between the AAMRGS and the TME, we analyzed the relationship between the risk score and the level of immune cell infiltration. The ssGSEA results revealed that activated CD4 T cells, gamma delta T cells, natural killer T cells, neutrophils, regulatory T cells, and type 2 T helper cells showed significantly higher infiltration in the AAMRGS-high group. In the AAMRGS-low group, eosinophils and mast cells showed significantly higher infiltration (Figure 7A). The TIMER database is an online analysis website that uses RNA-seq data to analyze the infiltration of immune cells in tumors [24]. Using this database, we examined the association between the six key genes and six kinds of immune cells (B cells, CD8+ T cells, CD4+ T cells, macrophages, neutrophils, and dendritic cells). The results are shown in Figure S1. To assess the ability of the AAMRGS to forecast the effectiveness of immunotherapy, we analyzed differences in IPS between the two AAMRGS subgroups. For the CTLA4-negative/PD-1-negative and CTLA4-positive/PD-1-negative IPS categories, the AAMRGS-low group was more sensitive to treatment (Figure 7B,C). The sensitivity to immunotherapy was not discernibly different between the two subgroups for the CTLA4-negative/PD-1-positive and CTLA4-positive/PD-1-positive categories (Figure 7D,E). Subsequently, we utilized the immunotherapy cohort (IMvigor210) to predict patients' responses to immunotherapy. Patients in the AAMRGS-low group had a better prognosis (Figure 7F), and more of them responded to immunotherapy (Figure 7G). The above results suggest that the AAMRGS may be a marker of immunotherapy efficacy.

Relationship between AAMRGS and Chemotherapy Drugs
Chemotherapy is the main option for patients with advanced LUAD who are not responsive to targeted therapy and immunotherapy. We explored the associations of multiple chemotherapeutic agents with the AAMRGS. Patients in the AAMRGS-high group were more responsive to cisplatin, docetaxel, doxorubicin, etoposide, gemcitabine, paclitaxel, and vinblastine (Figure 8A-G). However, in contrast to the other drugs, the relative sensitivity of the two groups was reversed for vinorelbine (Figure 8H). In addition, we analyzed the sensitivity of all LUAD patients to chemotherapy drugs as a whole (Figure S2). These results imply that patients in the AAMRGS-high group would benefit more from chemotherapy.

Discussion
Cancers are generally considered to arise from gene mutations in cells that subsequently affect tumor biological behavior, including proliferation, invasion, and metastasis, through alterations at the protein level [25]. In this process, the metabolic reprogramming of amino acids plays a major role [10]. Because malignant tumors have the ability to proliferate indefinitely, they need a large number of amino acids to participate in their own synthesis. Therefore, limiting amino acid intake and targeting key enzymes in amino acid synthesis may become new avenues for cancer therapy. At present, asparaginase has been developed for the treatment of childhood acute lymphoblastic leukemia (ALL) [26]. However, studies targeting amino acid metabolism-related genes in LUAD are rarely mentioned.
In this study, we screened the differentially expressed AAM-related genes in TCGA and constructed an AAMRGS including six genes (CPS1, AZIN2, GNMT, PSPH, RIMKLA, and SMOX) through univariate, LASSO, and multivariate Cox regression analyses. The carbamoyl phosphate synthase encoded by CPS1 is the first rate-limiting enzyme of the urea cycle. CPS1 knockdown in LUAD inhibits the JAK/STAT pathway, which is involved in tumor proliferation, differentiation, apoptosis, and immune regulation [27]. Therefore, CPS1 has the potential to become a new therapeutic target. In non-small cell lung cancer (NSCLC), AZIN2 overexpression promotes cisplatin resistance [28]. GNMT is a tumor suppressor gene in liver cancer, and its high expression can inhibit tumor proliferation [29]. Overexpression of PSPH in NSCLC is linked to a poor prognosis [30]. RIMKLA is involved in glutamine metabolism, and its role in LUAD is unclear. The AAMRGS showed a good ability to predict prognosis. According to KM survival analysis, a higher risk score was linked to a poorer outcome. Two external GEO datasets verified the accuracy of the signature. Univariate and multivariate Cox regression analyses also suggested that the AAMRGS was a reliable prognostic marker in LUAD. These results provide a reliable basis for the practical application of the AAMRGS. Furthermore, a nomogram was created for easy clinical application.
The GSEA results indicated that tumor proliferation and the cell cycle were active in the AAMRGS-high group, which suggests that AAM-related genes play a key role in the metabolic processes of tumors, while the AAMRGS-low group was associated with immune-related pathways. Previous studies have shown that amino acid metabolism can reshape the TME by affecting the proliferation and activation of immune cells [31][32][33]. Immune cells in the TME are conducive to strengthening the efficacy of immunotherapy. In this study, ssGSEA showed that eosinophils and mast cells, which have antitumor immune functions, showed significant infiltration in the AAMRGS-low group, while neutrophils, regulatory T cells, and type 2 T helper cells, which play an immunosuppressive role, showed significant infiltration in the AAMRGS-high group. This supports the view that amino acid metabolism can alter the degree of immune cell infiltration, which in turn affects prognosis. The AAMRGS can effectively distinguish different immune states.
In view of the importance of immunotherapy in cancer treatment, we then analyzed the connection between the AAMRGS and immunotherapy. On the basis of the IPS results, we found that the AAMRGS-low group was more likely to benefit from immunotherapy, largely because the TME was in a state of immune activation. Immunotherapy has become the first-line treatment for various cancers [34]. However, the benefit of immunotherapy is limited in most patients [35]. New biomarkers for immunotherapy efficacy are urgently needed. Our study found that the AAMRGS could predict the response to immunotherapy: the AAMRGS-low group had a better response to anti-PD-L1 immunotherapy. However, some patients with advanced LUAD are not sensitive to targeted drugs or immunotherapy. The traditional chemotherapy strategy is the first choice for such patients [36]. In this study, we analyzed the IC50 of traditional chemotherapy drugs, and the results indicated that the patients in the AAMRGS-high group were more responsive to most chemotherapy drugs. The above results may provide a reasonable treatment strategy for LUAD patients. Previous research [37] comprehensively evaluated the expression status of multiple genes through various algorithms, which provided an important basis for precision and individualized treatment. This study provides evidence supporting the use of the AAMRGS in clinical practice. In the future, samples from LUAD patients could be tested to guide adjuvant treatment and provide more appropriate treatment for patients.
This study has some limitations. First, the validation data come only from public databases, and the signature still needs to be validated by multicenter clinical research. Second, owing to the lack of immunotherapy data for LUAD patients, the predictive value of the AAMRGS for immunotherapy could only be verified in another type of cancer. Additionally, the individual and combined effects of the six genes constituting the AAMRGS require further experimental analysis and verification.

Conclusions
In summary, we analyzed the role of AAM-related genes in LUAD and constructed the AAMRGS to predict prognosis. The AAMRGS could effectively distinguish survival and the state of the TME in LUAD patients. Patients in the AAMRGS-high group were more sensitive to chemotherapy, while those in the AAMRGS-low group were more sensitive to immunotherapy. This provides new insight into the individualized treatment of patients with LUAD.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/genes13122295/s1, Figure S1: Relationship between key AAM-related genes and immune cell infiltration from TIMER database. Figure S2: Sensitivity of LUAD samples to eight chemotherapeutic drugs. Table S1: Amino acid metabolism-related genes. Table S2: Univariate Cox regression analysis of differentially expressed AAM-related genes.
Author Contributions: W.C. and H.L.: conceptualization, data curation, formal analysis, writing (original draft), and writing (review and editing). C.W., L.Z., T.Z., Z.C. and W.O.: data collection and writing (review and editing). S.W.: conceptualization, supervision, funding acquisition, project administration, and writing (review and editing). All authors have read and agreed to the published version of the manuscript.
Funding: This study was supported by the Guangdong Association Study of Thoracic Oncology (GASTO-202201).

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: This article does not contain any studies with animals or humans performed by any of the authors.

Data Availability Statement: The source data and statistical programs for the analysis are available at https://www.jianguoyun.com/p/DRfwZc0QwPGFChj7yd8EIAA (accessed on 24 November 2022).