Single-Cell RNA Sequencing Analysis of Gene Regulatory Network Changes in the Development of Lung Adenocarcinoma

Lung cancer is a highly heterogeneous disease. Cancer cells and other cells within the tumor microenvironment interact to determine disease progression, as well as response to or escape from treatment. Understanding the regulatory relationship between cancer cells and their tumor microenvironment in lung adenocarcinoma is of great significance for exploring the heterogeneity of the tumor microenvironment and its role in the genesis and development of lung adenocarcinoma. This work uses public single-cell transcriptome data (distant normal, nLung; early LUAD, tLung; advanced LUAD, tL/B), to draft a cell map of lung adenocarcinoma from onset to progression, and provide a cell-cell communication view of lung adenocarcinoma in the different disease stages. Based on the analysis of cell populations, it was found that the proportion of macrophages was significantly reduced in the development of lung adenocarcinoma, and patients with lower proportions of macrophages exhibited poor prognosis. We therefore constructed a process to screen an intercellular gene regulatory network that reduces any error generated by single cell communication analysis and increases the credibility of selected cell communication signals. Based on the key regulatory signals in the macrophage-tumor cell regulatory network, we performed a pseudotime analysis of the macrophages and found that signal molecules (TIMP1, VEGFA, SPP1) are highly expressed in immunosuppression-associated macrophages. These molecules were also validated using an independent dataset and were significantly associated with poor prognosis. Our study provides an effective method for screening the key regulatory signals in the tumor microenvironment and the selected signal molecules may serve as a reference to guide the development of diagnostic biomarkers for risk stratification and therapeutic targets for lung adenocarcinoma.


Introduction
Lung cancer is the most common cancer worldwide [1]. Non-small cell lung cancer (NSCLC) accounts for 85% of lung cancer cases, with a 5-year survival rate of less than 16% [2]. NSCLC mainly contains two subtypes, lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC). A recent study showed that LUAD and LUSC differed in age, sex, clinical stage, tumor site, histological grade, treatment, and 5-year overall survival, cant differences in gene regulatory networks, played a key role in the transformation of macrophages into immunosuppressive phenotypes using pseudotime analysis, and may affect the occurrence and development of LUAD. In addition, we found that the tumor microenvironment and gene regulatory network of lung squamous cell carcinoma retained similar features but were different from LUAD. Overall, we demonstrate that communication between tumor cells and macrophages alters the functional status of the macrophages; the resulting regulatory signals were significantly associated with poor prognosis in lung adenocarcinoma patients.

Data Collection and Preprocessing
To describe the composition and functional status of lung adenocarcinoma (LUAD) during tumor progression, single cell transcriptome profiles from the lung tissues of distant normal (nLung), early LUAD (tLung) and advanced LUAD (tL/B) were collected from the dataset GSE131907 in the GEO database [26]. Sequencing data were mapped to the GRCh38 human reference genome using the Cell Ranger toolkit (version 2.1.0). Quality measures were applied to the raw gene-cell-barcode matrix for each cell based on: mitochondrial genes (≤20%, unique molecular identifiers (UMIs), and gene count (ranging from 100 to 150,000 and 200 to 10,000). In total, 42,995 cells of lung tissues from distant normal (nLung), 45,149 cells of lung tissues from early LUAD (tLung) and 12,073 cells of lung tissues from advanced LUAD (tL/B) were merged.
To validate the regulatory network changes during lung adenocarcinoma development another set of single cell transcriptome data (GSE123902 [27], GSE117570 [28], GSE148071 [29]) was used as a validation dataset. After quality control, a total of 34,920 cells were selected for subsequent analysis (including 15,701, 14,984, and 4235 cells from distant normal regions of lung samples, early LUAD, and advanced LUAD). The single-cell data of the validation group were used to construct the gene regulatory network through the same analysis process. In addition, in order to observe the heterogeneity of LUAD and LUSC, a LUSC's single cell transcriptome data (http://lungcancer.chenlulab.com (accessed on 22 March 2023)) was analyzed through the same pipeline. A total of 18,117 cells were selected for subsequent analysis (including 6338, 7918 and 3861 cells from normal tissue samples of LUSC, early LUSC and advanced LUSC) after quality control.

Data Integration, Unsupervised Dimensional Reduction and Clustering, and Cell Type Identification
scRNA-seq data were normalized using the "NormalizeData" function; and were scaled using the "ScaleData" function. The top 3000 highly variable genes were identified using the "FindVariableFeatures" function. Next, we used the "RunPCA" function to reduce the dimension of the scRNA-seq data. To integrate cells within a shared space from different datasets for unsupervised clustering, "RunHarmony" of Harmony (version 0.1.0) [30] was used to identify anchors and run integration steps and eliminate batch effects. Pcs were selected by ranking the principal components using the ElbowPlot function in the Seurat R package. This function works by randomly permuting a subset of data, and calculating projected PCA scores. When an inflection point reached the 20th pc, then, the first 20 principal components (PCs) were utilized in UMAP (uniform manifold approximation and projection) analysis using "RUNUMAP". Subsequently, a single cell map at 0.2 resolution was presented by the "FindClusters" function. Further, the "FindAllMarkers" function was used to detect gene expression markers. The above analysis was performed using the Seurat (version 4.1.1) [31] R package. Afterwards, we used the R package SingleR (version 1.6.1) [32], CellMarker dataset [33] and marker genes of cells to annotate cell types in our study.

Identification of Malignant Cells from Patients with Adenocarcinoma of the Lung
In order to isolate malignant tumor cells from patients with lung adenocarcinoma, CNV aberrations were inferred from the patterns of chromosomal gene expression by inferCNV (version 1.8.1) [34]. Since the data were 10× Genomics single-cell data, usually the cutoff is set = 0.1, denoise = T. The expression profiles of lung tissues from distant normal regions were used as a reference, the early LUAD and the advanced LUAD were used as the observation group.

Functional Enrichment Analysis
R package clusterProfiler (version 4.0.5) [35] was used for Gene Ontology (GO) annotation and enrichment analyses. p-value (p.adjust value) was calculated using the Benjamini and Hochberg method [36]. A p.adjust value < 0.05 was considered significant. In the enrichment analysis related to the regulatory network, p-values were not adjusted in order to retain more variables for multivariate analysis. Graphic visualization was enabled using ggplot2 (version 3.2.1).

Survival Analysis of the Proportion of Macrophages in Patients with Lung Adenocarcinoma
The LUAD mRNA expression data and associated clinical data in TCGA were downloaded from the UCSC Xena (http://xena.ucsc.edu/ (accessed on 10 July 2022)) database. The proportional distribution of 22 infiltrated immune cell types in patients was obtained by CIBERSORT [37] in TCGA-LUAD data. The patients were assigned to one of two groups (high risk and low risk) according to the proportion of infiltrated immune cells. Then, the relationship between the proportion of macrophage cells with survival was evaluated using the R package survival (version 3.3.1) [38]. The "surv_cutpoint" algorithm of R package survminer (version 0.4.9) was used to calculate the optimal threshold, and all survival analyses adopted this approach. In order to retain more variables for multivariate analysis, p-values were not adjusted.
In order to confirm the effect of tumor ligand signaling in the regulatory network on the occurrence and development of lung adenocarcinoma, the GEPIA2 [39] tool was used to evaluate the relationship between ligands in the gene regulatory network and the survival of patients with lung adenocarcinoma.

The Gene Regulatory Network of Tumor Cell-Macrophage Interaction Construction
First, the R package, Nichenet (version 1.0.0) [40], was used to infer the gene regulatory network of tumor cell-macrophage interactions. Malignant cells were designated as sender cells, macrophages were designated as receiver cells, "top_n_ligands" = 30, "condi-tion_reference" was designated as nLung, "condition_oi" was designated as tLung, and then, through the function of "nichenet_seuratobj_aggregate", ligand-target pairs were identified. Further, ligand-receptor (L-R), receptor-transcription factor (R-TF), transcription factor-target (TF-target) were found by the "get_ligand_signaling_path" function. Using the above steps, gene regulatory networks linking nlung and tLung and linking tLung and tL/B were identified second. CellPhoneDB (www.cellphonedb.org (accessed on 7 August 2022)) [41] was used to identify ligand-receptor pairs between tumor cells and macrophages, and those pairs with mean values greater than 0 were reserved. The ligand-receptor pairs in the second step were verified using Nichenet, and these verified interaction pairs and their downstream signaling molecules were retained. Third, pySCENIC (version 0.11.2) [42] was used to identify the TF-target of the macrophages, retaining those regulatory chains in the TF-targets that overlap with the previous step's regulatory network. Finally, the differentially expressed genes (DEGs) were found using the "Findmarker" function of Seurat to screen the gene regulatory network signals (Section 3.3).

Immune Cell Trajectory Is Constructed by Regulating the Signaling Molecules of the Network
In order to validate the signals (receptor, TFs, target) of gene regulatory network 4.0 influenced macrophages during LUAD progression, monocle (version 2.14.0) [43] was used to analyze the gene expression matrix with macrophage cells. We used the signals (receptor, TFs, target) of the gene regulatory network 4.0, to sort cells in pseudotime order. "DDRTree" was applied to reduce the number of dimensions and the visualization functions "plot_cell_trajectory" was used to plot the minimum spanning tree on cells. Finally, cells of nLung were defined as the starting point through "orderCells" function.

Single-Cell Expression Atlas and Cell Typing in Normal Lung, Early and Advanced LUAD Tissue Samples
To describe the composition and function of cells of LUAD during their different statuses, single-cell transcriptome datasets were collected from LUAD with 26 tissue samples from 16 patients, including 11 samples of distant normal lung tissues, 11 early LUAD samples, and 4 advanced LUAD samples (Table S1). After quality control, a total of 100,210 cells were selected for subsequent analysis (including 42,995, 45,149, and 12,073 cells from distant normal tissue of lung samples, nLung; early LUAD, tLung; and advanced LUAD, tL/B, respectively). The R package "Harmony" was used to integrate data from different samples. After reduction and clustering, cells were divided into 16 clusters. The results of the cell types from the R package SingleR were kept as a reference ( Figure S1A  Overall, the distribution of cells in different patients, different sources, and different pathologies was observed. While tumor cell numbers increase as the disease progresses, The epithelial cells of LUAD exhibit significant heterogeneity [44]. The R package infercnv was used to identify non-malignant cells and malignant cells. The hallmarks of cancer cells are aneuploidy and chromosomal copy number variations (CNV) [45]. The expression profiles of distant normal regions of lung tissues were used as a reference, and the early LUAD and the advanced LUAD were used as the observation group; we found that the copy number of cells provided by lung tissues of early LUAD and advanced LUAD in club cells, ciliated cells, AT1 cells, and proliferating cells exhibited significant changes ( Figure S1B). The lack of basal cells and AT1 cells in nLung, and the presence of tumor biomarkers of LUAD [46] were observed in potentially malignant cells. Basal cells and AT2 cells were both highly expressed tumor markers of LUAD ( Figure S1C). In addition, basal cells are highly correlated with proliferating cells, and AT2 cells are highly correlated with club cells ( Figure S1D). Therefore, we speculate that club cells, ciliated cells, AT1 cells, proliferating cells, basal cells, and AT2 cells are malignant cells.
Overall, the distribution of cells in different patients, different sources, and different pathologies was observed. While tumor cell numbers increase as the disease progresses, the largest proportion of tumor cells exists in tL/B, while immune cells in tL/B are depleted ( Figure S1E). The increase in tumor cells may be related to the decrease in immune cells. The proportion of fibroblast cells and endothelial cells did not appear to change. There was no significant change in cell number for different pathological stages ( Figure 1F).

The Role of Macrophages in the Immune Microenvironment
From the above, we have found that the number of immune cells was reduced and the number of tumor cells increased as cell status progresses from normal to early LUAD and advanced LUAD (tLung and tL/B). We then further analyzed the composition of immune cells in the three disease stages and found that macrophages accounted for about 28% of the largest proportion of immune cells in nLung. T cells accounted for the largest proportion of immune cells in tLung and tL/B. Plasma cells were only found in tLung and tL/B ( Figure 2A). Further, we studied the proportion of each immune-cell type in the three stages of LUAD, distant normal tissues of lung (nLung), early LUAD (tLung), and advanced LUAD (tL/B). Compared with the cells in nLung, the proportion of B cells and plasma cells was observed to increase in tLung (p < 0.05); the proportion of NK cells, macrophages and granulocytes was observed to decrease in tLung (p < 0.05). Compared with the cells in tLung, macrophage cells, plasma cells, and DC cells were observed to decrease in tL/B (p < 0.05) (Supplementary Figure S2A,B). Interestingly, the changes in macrophages were more pronounced than in other immune cell types ( Figure 2B). Macrophages are first responders for the immune system, initiating and coordinating a multipronged immune response [47]. Therefore, we speculate that the decrease in macrophages may be an important reason for the occurrence and development of LUAD. Thereafter, we observed the relationship between the number of macrophages and the survival rate of patients, a Cox proportional hazards model was applied, and the patients with a lower proportion of macrophages showed a lower chance of survival ( Figure 2C and Figure S3A). Survival analysis confirmed our speculation.
The above study found that the decrease in macrophages may be associated with the occurrence of lung adenocarcinoma. Moreover, changes in the function of macrophages may also affect the development of lung adenocarcinoma during the process that decreases the number of macrophages. Thus, the "FindMarker" function of the package Seurat was used to identify the differentially expressed genes in the macrophage cell population between nLung and tLung and tL/B. In tLung, the upregulated genes were associated with the positive regulation of the immune system process and its response to hypoxia (e.g., T-cell activation, lymphocyte proliferation and response to decreased oxygen levels), whereas the genes downregulated primarily belonged to proliferation-related pathways (e.g., mononuclear cell proliferation, myeloid cell differentiation). According to the results of functional analysis, macrophages from tLung patients are in a state of promoting immune activation compared with nLung people, and the downregulation of proliferation-related function may be related to the decrease in the proportion of macrophages. The upregulated genes in tL/B were related to the response to positive regulation of angiogenesis and vas-Biomolecules 2023, 13, 671 7 of 17 culature development, and the downregulated genes in tL/B were related to the positive regulation of cell adhesion, antigen processing, and T-cell activation through the GO function enrichment analysis ( Figure 2E and Figure S3B). The results showed that macrophages are in an immunosuppressive state in the tL/B stage and may promote tumor growth by promoting the increase in angiogenesis and vasculature development. In conclusion, during the development of LUAD, the proportion of macrophages gradually decreases, and their function evolves from immune activation (trying to fight against tumors) to immunosuppression, which may create an environment suitable for tumor growth. These results suggest that macrophages may be regulated by tumor cells during the transformation from tLung stage to tL/B stage. of functional analysis, macrophages from tLung patients are in a state of promoting immune activation compared with nLung people, and the downregulation of proliferationrelated function may be related to the decrease in the proportion of macrophages. The upregulated genes in tL/B were related to the response to positive regulation of angiogenesis and vasculature development, and the downregulated genes in tL/B were related to the positive regulation of cell adhesion, antigen processing, and T-cell activation through the GO function enrichment analysis ( Figures 2E and S3B). The results showed that macrophages are in an immunosuppressive state in the tL/B stage and may promote tumor growth by promoting the increase in angiogenesis and vasculature development. In conclusion, during the development of LUAD, the proportion of macrophages gradually decreases, and their function evolves from immune activation (trying to fight against tumors) to immunosuppression, which may create an environment suitable for tumor growth. These results suggest that macrophages may be regulated by tumor cells during the transformation from tLung stage to tL/B stage.

Construction of the Regulatory Network of Macrophages Regulated by Tumor Cells
Macrophages exhibit strong plasticity and functional heterogeneity, and their phenotypic transformation is complex [48]. A recent study found that tumor cells can alter the macrophage phenotype. To further investigate which molecules of tumor cells might mediate the change in macrophage cells, we constructed the gene regulatory network comprised of four elements: ligand-receptor-transcription factor (TF)-target. Gene regulatory

Construction of the Regulatory Network of Macrophages Regulated by Tumor Cells
Macrophages exhibit strong plasticity and functional heterogeneity, and their phenotypic transformation is complex [48]. A recent study found that tumor cells can alter the macrophage phenotype. To further investigate which molecules of tumor cells might mediate the change in macrophage cells, we constructed the gene regulatory network comprised of four elements: ligand-receptor-transcription factor (TF)-target. Gene regulatory networks were constructed in four steps ( Figure 3). First, we applied Nichenet to obtain intercellular signal transduction networks and constructed ligand-receptor-TF-target links (Network 1.0); second, ligand-receptor (L-R) pairs of the gene regulatory network were verified by CellPhoneDB, the overlapping ligand-receptor pairs and their downstream signals were retained (Network 2.0); third, SCENIC was used to verify TF-target pairs of Network 2.0. The verified TF-targets of Network 2.0 were preserved (Network 3.0); fourth, differentially expressed genes (DEGs) were used to filter important signals of Network 3.0 (Network 4.0). After multiple screening of regulatory networks, the number of signaling molecules is reduced, which means that the resulting overlapped common filter, derived from multiple communication tool analyses, can reduce the false positive results with the use of a single tool.    The construction of the gene regulatory network may further improve the accuracy and integrity of the signal transduction process. By comparing the networks of nLung and tLung, it was found that multiple signaling pathways were involved. In Network 4.0, 67 signals of nLung and 84 signals of tLung were retained. Regulatory links based on four ligands (GAS6, TIMP1, VEGFA, TGFB1) were found in nLung Network 4.0 ( Figure 4A). However, in tLung Network 4.0, a total of six ligands (GAS6, TIMP1, VEGFA, TGFB1, LIF, CXCL2) were included. The functional analysis illustrated that the upregulated signals (receptor, TF, target) in tLung were related to the immunological effect and some upregulated targets of tLung were related to the response to decreased oxygen levels ( Figure 4B and Figure S4B). For example, after TIMP1 interacts with FGFR2 (receptor), the expression of ADM mediated by the transcription factor JUN is elevated. A typical hypoxic factor of ADM has been used as an important factor in the tumor microenvironment for predicting clinical prognosis in lung adenocarcinoma [49]. The downregulated molecules in tLung were related to inflammatory response. VEGFA regulates the reduced expression of CD44 through signal transduction. Alveolar macrophages (AMs) are CD44-expressing cells located in the alveolar space that maintain lung homeostasis. When CD44 is downregulated, AMs are unable to bind glycosaminoglycans and hyaluronan, which reduces their viability and leads to a decrease in the number of AMs in the lung [50]. It is worth noting that a regulatory network dominated by LIF-LIFR was found in the tLung network. Recent studies have implicated LIF-LIFR signaling in playing a key role in tumor growth, progression, metastasis, stemness and therapy resistance; the LIF-LIFR axis may be considered as a promising clinical target for cancer therapy [51].
In the comparison of the network of tLung and tL/B, 45 signals of tLung and 38 signals of tL/B were retained. Regulatory links based on five ligands (CCL5, CCL3L3, CCL3, BMP2, SPP1) were found in Network 4.0 of tLung and tL/B ( Figure 5A). The functional analysis illustrated that the downregulated signals (receptor, TF, target) in tL/B were related to immune response (e.g., antigen processing and presentation, IL-17 signaling pathway), and the upregulated targets of tL/B were related to mononuclear cell migration ( Figure 5B and Figure S4D). BMP2 regulates the increased expression of EGR2 through signal transduction; EGR2 was highly expressed in tL/B and was reported to be a conserved marker of alternately activated macrophages (M2 macrophages) [52].   In conclusion, the construction of the gene regulatory network improves the accuracy and integrity of deciphering signaling molecules in intercellular communication. In addition, with respect to the progression of lung adenocarcinoma, the changes in the regulatory network may be closely related to the changes in macrophage cells. These ligand signals of tumor cells may be crucial for the occurrence and development of lung adenocarcinoma. lecules 2023, 13, x FOR PEER REVIEW 10 of 18

The Signal Changes in the Gene Regulatory Network Are Related to the Change in Macrophage State
To further verify the relationship between the changes in signaling molecules during disease progression and macrophage status, we used signal molecules in the gene regulatory network as features for the trajectory analysis of macrophage cells. The R package "monocle" was used to sort individual cells by these signals to construct the tree-like structure of the entire lineage differentiation trajectory ( Figure 6A). Macrophages were divided into seven states (state 1, 2, 3, 4, 5, 6 and 7). The macrophages in nLung were dominated by state 3 and state 4; the macrophages in tLung and tL/B were dominated by state 5 and state 6 ( Figure 6B,C). During the development of lung adenocarcinoma, the proportion of macrophages in states 3 and 4 mainly decreased, while the proportion of macrophages in states 5 and 6 increased, on the premise that the overall proportion of macrophages decreased; this means that the immune effector function of macrophages may be changed ( Figure 6C). Further, we mapped macrophage states 3, 4, 5 and 6 in Seurat and found that states 3 and 4 were concentrated with alveolar macrophages, while states 5 and 6 were concentrated with tumor-associated macrophages ( Figure S6D,E). Pathway analysis indicated that the signaling pathways involved in the antigen processing and presentation and MHC class II protein complex binding were enriched in state 3 and state

The Signal Changes in the Gene Regulatory Network Are Related to the Change in Macrophage State
To further verify the relationship between the changes in signaling molecules during disease progression and macrophage status, we used signal molecules in the gene regulatory network as features for the trajectory analysis of macrophage cells. The R package "monocle" was used to sort individual cells by these signals to construct the tree-like structure of the entire lineage differentiation trajectory ( Figure 6A). Macrophages were divided into seven states (state 1, 2, 3, 4, 5, 6 and 7). The macrophages in nLung were dominated by state 3 and state 4; the macrophages in tLung and tL/B were dominated by state 5 and state 6 ( Figure 6B,C). During the development of lung adenocarcinoma, the proportion of macrophages in states 3 and 4 mainly decreased, while the proportion of macrophages in states 5 and 6 increased, on the premise that the overall proportion of macrophages decreased; this means that the immune effector function of macrophages may be changed ( Figure 6C). Further, we mapped macrophage states 3, 4, 5 and 6 in Seurat and found that states 3 and 4 were concentrated with alveolar macrophages, while states 5 and 6 were concentrated with tumor-associated macrophages ( Figure S6D,E). Pathway analysis indicated that the signaling pathways involved in the antigen processing and presentation and MHC class II protein complex binding were enriched in state 3 and state 4 ( Figure S5A,B). Macrophages, as typical antigen-presenting cells, were enriched in the antigen processing and presentation of exogenous peptide antigen via MHC class II [53]. Highly expressed genes in state 6 were related to the response to oxidative stress. Pathway analysis suggested that cells in state 5 were dependent on negative regulation of the immune system process ( Figure S5C). Furthermore, the aberrantly expressed signaling molecules in the gene regulatory network were significantly expressed in macrophage state 5 and state 6 ( Figure 6D,E). For instance, CD44 expression was downregulated and ADM expression was upregulated in state 5 and state 6. Figure S5A,B). Macrophages, as typical antigen-presenting cells, were enriched in the antigen processing and presentation of exogenous peptide antigen via MHC class II [53]. Highly expressed genes in state 6 were related to the response to oxidative stress. Pathway analysis suggested that cells in state 5 were dependent on negative regulation of the immune system process ( Figure S5C). Furthermore, the aberrantly expressed signaling molecules in the gene regulatory network were significantly expressed in macrophage state 5 and state 6 ( Figure 6D,E). For instance, CD44 expression was downregulated and ADM expression was upregulated in state 5 and state 6.

(
The results of the pseudotime analysis of the macrophages further confirmed that signal molecules in the regulatory network are key factors affecting the change in the functional state of macrophages, and the abnormal expression of ligand signals (TIMP1, VEGFA, TGFB1, LIF, CCL3L3, BMP2, SPP1) in tumor cells may be an important reason for the transformation of macrophages into an immunosuppressive state.

The Independent Validation Set of Lung Adenocarcinoma (nLung, tLung and tL/B) was Used to Verify the Regulatory Network
To further verify the reliability of the gene regulatory network screening method and the consistency of signal molecules, we collected an independent lung adenocarcinoma single-cell transcriptome dataset as a validation set to construct its gene regulatory network. In the independent validation set, we found regulatory chains with GAS6, TIMP1, SPP1, and VEGFA as ligands, which was consistent with our findings above, indicating that these regulatory relations may be ubiquitous in tumor cells and macrophages (Figure The results of the pseudotime analysis of the macrophages further confirmed that signal molecules in the regulatory network are key factors affecting the change in the functional state of macrophages, and the abnormal expression of ligand signals (TIMP1, VEGFA, TGFB1, LIF, CCL3L3, BMP2, SPP1) in tumor cells may be an important reason for the transformation of macrophages into an immunosuppressive state.

The Independent Validation Set of Lung Adenocarcinoma (nLung, tLung and tL/B) Was Used to Verify the Regulatory Network
To further verify the reliability of the gene regulatory network screening method and the consistency of signal molecules, we collected an independent lung adenocarcinoma single-cell transcriptome dataset as a validation set to construct its gene regulatory network. In the independent validation set, we found regulatory chains with GAS6, TIMP1, SPP1, and VEGFA as ligands, which was consistent with our findings above, indicating that these regulatory relations may be ubiquitous in tumor cells and macrophages ( Figure 7A,B). Due to the small number of cells in the validation set, some signals were not validated. Next, a survival analysis of the validated ligands revealed that the high expression ligand signals VEGFA, TIMP1, and SPP1 were significantly associated with a lower chance of survival in patients with lung adenocarcinoma ( Figure 7C). The upregulation of VEGFA can induce angiogenesis and recruit monocytes to the tumor niche, which are then transformed into tumor-associated macrophages (TAM) in the tumor to participate in tumorigenesis [54]. The upregulation of TIMP1 can regulate cell behavior by inducing signaling pathways involved in cell growth, proliferation and survival [55], and previous studies have found that TIMP1 may be a potential biomarker for the pathogenesis of LUAD [23]. The upregulation of SPP1 can promote the proliferation, migration and invasion of lung cancer cells, and increase the resistance of lung cancer to cisplatin [56]. Overall, VEGFA-, TIMP1, and SPP1-mediated regulatory networks may not only be the main cause of macrophage changes, but also these three signals may be markers of malignant changes in lung adenocarcinoma. Targeting VEGFA, TIMP1, and SPP1 may be a potential therapeutic strategy for lung adenocarcinoma.
ligand signals VEGFA, TIMP1, and SPP1 were significantly associated with a lower chance of survival in patients with lung adenocarcinoma ( Figure 7C). The upregulation of VEGFA can induce angiogenesis and recruit monocytes to the tumor niche, which are then transformed into tumor-associated macrophages (TAM) in the tumor to participate in tumorigenesis [54]. The upregulation of TIMP1 can regulate cell behavior by inducing signaling pathways involved in cell growth, proliferation and survival [55], and previous studies have found that TIMP1 may be a potential biomarker for the pathogenesis of LUAD [23]. The upregulation of SPP1 can promote the proliferation, migration and invasion of lung cancer cells, and increase the resistance of lung cancer to cisplatin [56]. Overall, VEGFA-, TIMP1, and SPP1-mediated regulatory networks may not only be the main cause of macrophage changes, but also these three signals may be markers of malignant changes in lung adenocarcinoma. Targeting VEGFA, TIMP1, and SPP1 may be a potential therapeutic strategy for lung adenocarcinoma.
Next, the same method of cell interaction network construction was applied to lung squamous cell carcinoma (LUSC), another type of non-small cell lung cancer. We observed that although the cell composition was similar to that of LUAD, the proportion of cells varied greatly from that of LUAD during the process of change from healthy to early to advanced stage, and the proportion of macrophages tended to increase. Further, by constructing a gene regulatory network for LUSC, we found that there is a common ligand signal GAS6 in tumor progression, but the target genes regulated by GAS6 are obviously different.
Altogether, the results of the validation group further verified the accuracy of our gene regulatory network construction method. The construction of a gene regulatory network is feasible for discovering important signals in cell-cell communication, and also provides a reference for screening the prognostic and therapeutic markers of patients.  Next, the same method of cell interaction network construction was applied to lung squamous cell carcinoma (LUSC), another type of non-small cell lung cancer. We observed that although the cell composition was similar to that of LUAD, the proportion of cells varied greatly from that of LUAD during the process of change from healthy to early to advanced stage, and the proportion of macrophages tended to increase. Further, by constructing a gene regulatory network for LUSC, we found that there is a common ligand signal GAS6 in tumor progression, but the target genes regulated by GAS6 are obviously different.

Discussion
Altogether, the results of the validation group further verified the accuracy of our gene regulatory network construction method. The construction of a gene regulatory network is feasible for discovering important signals in cell-cell communication, and also provides a reference for screening the prognostic and therapeutic markers of patients.

Discussion
Lung adenocarcinoma is currently the leading cause of cancer death, and survival rates for lung cancer patients are poor. Lung adenocarcinoma tissues are highly heterogeneous. Single-cell RNA sequencing (scRNA-seq) analysis can provide a better understanding of cellular collective behavior and mutual regulatory mechanisms within a tissue ecosystem. In our study, transcriptional-level sequencing data of more than 135,000 single cells were collected, and various cell types were analyzed, providing a new perspective for understanding the cell composition characteristics and pathogenesis development in the LUAD.
The immune microenvironment of lung adenocarcinoma plays an important role in the initiation and development of lung adenocarcinoma. We found that the infiltration of macrophages in the immune microenvironment was significantly decreased with disease progression, and the function of macrophages was also significantly altered. In addition, we found, through survival analysis, low infiltration of macrophages is associated with poor survival in patients. Macrophages are an important part of the immune microenvironment. As the disease progresses, tumor-associated macrophage (TAM) subtypes play a dominant role in macrophage cells. TAMs exhibit an immunosuppressive protumor phenotype that promotes tumor progression, metastasis, and resistance to therapy [57,58]. The current study found that TAMs not only manipulate cancer cells toward progression and metastasis but also suppress the immune responses and cause chemoresistance [59]. Therefore, our results provided a supportive reference for the impact of macrophage cells on the occurrence and development of LUAD. It is worth noting that T cells (especially CD8 + T cells) play a very important role in the tumor microenvironment, but when we calculated the proportion of cells, we found that the proportion of T-cell infiltration did not change significantly in the lung microenvironment of patients during the process of change from healthy to early to advanced lung cancer. Moreover, by calculating T-cell immunoinfiltration with prognostic survival of lung adenocarcinoma patients, no significant relationship was found between the proportion of T-cell immunoinfiltration and the survival time of patients. We speculate that the total number of T cells may not have changed, but their function and composition have changed, based on the following experiment. T cells in normal tissues highly expressed T-effector cell markers, but tumor-infiltration of LUAD T cells mainly displayed exhausted and regulatory T-cell features, indicating compromised T-cytotoxic activity [60]. CD4 + T and regulatory T (Treg) cells associated with immune suppression and CD8 + exhausted cells were enriched in the LUAD, showing a shift in T-cell composition and gene expression towards immune suppression during LUAD progression [23].
Cell-cell communication influences cell phenotype and function [61]. At present, most of the cell-cell communication is ligand-receptor based. The construction of a gene regulatory network further elaborates the process of signal transmission between cellcell communication and improves the integrity of cell-cell communication. In addition, the analytical methods of different cell-cell communication tools lead to different scores that are difficult to compare and evaluate. These methods make it difficult to validate the results [51,56]. We used several cell communication tools to evaluate consistency to improve the accuracy and confidence of signaling molecules. Based on the regulatory network analysis, we identified a number of ligands that have important regulatory effects on macrophage function, for instance, VEGFA, TIMP1, SPP1. It was demonstrated that TIMP1 could be a potential biomarker for LUAD pathogenesis through a protein fluorescence immunostaining experiment [24], promoting angiogenesis through the VEGFA signaling pathway and downstream target molecules in LUAD [62]; SPP1 plays a crucial role in mediating macrophage polarization and lung cancer escape, suggesting that these molecules are potential therapeutic targets for LUAD [63].
Similar results were obtained in independent verification sets for the regulatory relationships between the above regulatory molecules. First, we selected the same technical platform for (10× genomics) lung adenocarcinoma single-cell data for analysis using the same process, thereby reducing the error introduced by the different platforms. In the validation set, VEGFA, TIMP1 and SPP1 were still found to be significantly associated with the survival of lung adenocarcinoma patients. Both lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) are subtypes of non-small cell lung cancer (NSCLC), and we then applied the process to the LUSC single cell RNA-seq datasets from 10× genomics platform. We observed that although there were some similarities between the different NSCLC subtypes, there is also a great deal of heterogeneity due to their different tumor microenvironments.

Conclusions
In conclusion, the construction of gene regulatory networks provides a comprehensive view of intercellular communication, which not only enhances the understanding of the molecular integrity of cell communication, but also provides a more credible signal transmission relationship. Important regulatory molecules (VEGFA, TIMP1, SPP1, etc.) of macrophages in lung adenocarcinoma were identified by the gene regulatory network method. Targeting VEGFA, TIMP1, and SPP1 may be potential therapeutic strategies for LUAD. Our research provides evidence of the tumor ecosystem heterogeneity between tLung and tL/B, in terms of cell proportion, macrophage developmental trajectories, in addition to the crosstalk between tumor and macrophages. Moreover, we have illustrated a strategy for constructing gene regulatory networks based on the cell-cell communication analysis of single-cell RNA sequencing data. We proved that such regulatory networks provide robust intercellular regulatory signals in the tumor microenvironment. This strategy may be applied to other cancer studies by single-cell RNA sequencing technology. The exploration of cell-cell communication by way of gene regulatory networks may lead to the discovery of novel therapeutic targets and biomarkers of response for current immunotherapies for LUAD and other cancers.