Landscape of Molecular Crosstalk Perturbation between Lung Cancer and COVID-19

Background: Lung cancer patients have the worst outcomes when affected by coronavirus disease 2019 (COVID-19). The molecular mechanisms underlying the association between lung cancer and COVID-19 remain unknown. The objective of this investigation was to determine whether there is crosstalk in molecular perturbation between COVID-19 and lung cancer, and to identify a molecular signature, molecular networks and signaling pathways shared by the two diseases. Methods: We analyzed publicly available gene expression data from 52 severely affected COVID-19 human lung samples, 594 lung tumor samples and 54 normal disease-free lung samples. We performed network and pathways analysis to identify molecular networks and signaling pathways shared by the two diseases. Results: The investigation revealed a signature of genes associated with both diseases and signatures of genes uniquely associated with each disease, confirming crosstalk in molecular perturbation between COVID-19 and lung cancer. In addition, the analysis revealed molecular networks and signaling pathways associated with both diseases. Conclusions: The investigation revealed crosstalk in molecular perturbation between COVID-19 and lung cancer, and molecular networks and signaling pathways associated with the two diseases. Further research on a population impacted by both diseases is recommended to elucidate molecular drivers of the association between the two diseases.


Introduction
Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is a worldwide pandemic that has caused unprecedented loss of human life and devastated the world economy [1][2][3]. Despite remarkable progress in the management of patients, the disease continues to cause devastation and overwhelm health care systems, as the number of COVID-19-positive patients who require hospitalization and intensive care support continues to rise worldwide [1][2][3]. This dislocation of the global health care infrastructure is of particular concern in the clinical management and treatment of patients with underlying chronic diseases, such as lung cancer [4,5]. Although there are currently no definitive data showing that COVID-19 causes lung cancer, emerging evidence from early studies has shown that lung cancer patients have almost twice the risk of SARS-CoV-2 infection compared to the general population [6,7]. However, it is not clear from the published reports whether there is crosstalk in molecular perturbation between COVID-19 and lung cancer.
In a clinical epidemiology study conducted among 102 patients with lung cancer and COVID-19, researchers at the Memorial Sloan Kettering Cancer Center in New York

ating COVID-19 with lung cancer. We further hypothesized that COVID-19 and lung cancer have shared gene regulatory networks and signaling pathways, which potentially exacerbate the severity of COVID-19 in lung cancer patients. We addressed these hypotheses using publicly available gene expression data derived from lung tissue from patients severely affected by COVID-19 who succumbed to the disease, patients with lung tumors and disease-free control lung tissue.

Materials and Methods
Our experimental design focused on identifying molecular signatures of genes associated with both diseases, signatures of genes uniquely associated with each disease, and molecular networks and signaling pathways associated with both diseases. The scientific premise and rationale was that, among the genes transcriptionally associated with each disease, a subset is associated with both COVID-19 and lung cancer. Thus, molecular crosstalk perturbation between lung cancer and COVID-19 was considered an emergent property of functionally related genes transcriptionally associated with both diseases, interacting in gene regulatory networks and signaling pathways shared by the two diseases. We addressed this knowledge gap using an integrative genomic data analysis approach, combining RNA-Seq data derived from lung tissue of patients severely affected by COVID-19 who succumbed to the disease, lung tissue from patients affected by lung cancer and disease-free normal lung tissue controls. The overall project design and execution workflow, along with sources of RNA-Seq data, are presented in Figure 1. This section provides a concise description of the sources of data and analysis strategies employed in this investigation.

Sources of Gene Expression Data on COVID-19 and Lung Cancer
Gene expression (RNA-Seq) data on COVID-19-affected lung samples (N = 52 samples) were derived from human lung autopsy tissue at the Massachusetts General Hospital and Columbia University Irving Medical Center in New York [21]. Processed data (sequence read counts) along with associated clinical information were downloaded from the Gene Expression Omnibus (GEO) database https://www.ncbi.nlm.nih.gov/geo/ (accessed on 10 February 2022) under accession # GSE150316 [21]. The data set was generated using the Illumina sequencing platform. Details about sample collection, processing, quality control and preparation for sequencing have been published elsewhere by the data originators [21]. Briefly, gene expression data were derived from lung tissue harvested from patients severely affected by COVID-19, who succumbed to the disease and underwent autopsy upon consent for clinical care [21]. All patients were confirmed for SARS-CoV-2 infection through qRT-PCR assays performed by the data originators [21].
Gene expression (RNA-Seq) data on 594 lung tumor samples and 52 control samples along with clinical information were downloaded from The Cancer Genome Atlas (TCGA) [22] via the Genomic Data Commons (GDC) https://portal.gdc.cancer.gov/ (accessed on 10 February 2022) using the data transfer tool [23]. The tumor samples were matched with clinical information for ascertainment of gene expression data. We processed the two data sets and checked them for quality. We then combined the two data sets to create one data matrix with three sample groups (COVID-19, lung tumors and normal lung samples). We performed noise reduction on the combined data set by filtering out genes with zero or very low expression values across samples. The resulting data set was normalized with quantile normalization using R Bioconductor, as implemented in our RNA-Seq data analysis pipeline [24,25].
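The filtering and normalization steps described above can be illustrated with a minimal Python sketch. This is not the authors' R Bioconductor pipeline. The function names, the thresholds `min_count` and `min_samples`, and the tie-handling rule are assumptions made only for illustration, since the text does not state the cut-offs used.

```python
from statistics import mean


def filter_low_expression(counts, min_count=10, min_samples=2):
    """Keep genes with at least `min_count` reads in at least `min_samples` samples.

    `counts` maps a gene name to a list of read counts (one per sample).
    Both thresholds are hypothetical; the paper does not report its cut-offs.
    """
    return {
        gene: row
        for gene, row in counts.items()
        if sum(1 for value in row if value >= min_count) >= min_samples
    }


def quantile_normalize(matrix):
    """Quantile-normalize a genes x samples matrix given as a list of rows.

    Each value is replaced by the mean, across samples, of the values holding
    the same rank. Ties are broken by row order, which is a simplification.
    """
    n_genes = len(matrix)
    n_samples = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_samples)]
    # For each sample, gene indices sorted from lowest to highest expression.
    order = [
        sorted(range(n_genes), key=lambda i, col=col: col[i]) for col in columns
    ]
    # Reference distribution: mean of the r-th smallest value over all samples.
    reference = [
        mean(columns[j][order[j][r]] for j in range(n_samples))
        for r in range(n_genes)
    ]
    normalized = [[0.0] * n_samples for _ in range(n_genes)]
    for j in range(n_samples):
        for rank, i in enumerate(order[j]):
            normalized[i][j] = reference[rank]
    return normalized
```

After normalization every sample (column) has the same empirical distribution, which is what makes expression values comparable across the COVID-19, tumor and normal groups in the combined matrix.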

Bioinformatics Data Analysis
Using normalized data, we compared gene expression levels between COVID-19-affected and normal lung samples and between lung-cancer-affected and normal lung samples by computing p-values using the limma package implemented in R [24,25]. This unbiased analysis was conducted to identify a signature of genes associated with COVID-19 and a signature of genes associated with lung cancer. We then combined the two sets of genes to identify a signature of genes associated with both COVID-19 and lung cancer, and signatures of genes uniquely associated with each disease. The distribution of genes in the three gene signatures was organized using a Venn diagram. We performed additional analysis using gene expression data on genes associated with both diseases, by comparing their expression levels between COVID-19 lung and lung tumor samples to determine their direction of change, which was characterized as either up or downregulated. For each analysis, we controlled for multiple hypothesis testing using the false discovery rate (FDR) procedure [26]. In addition to estimates of p-values, we computed the log2 fold change (logFC), defined as the median of gene expression values minus the gene expression value for each gene. The logFC was used to determine the direction of change, denoted as down for a negative value and up for a positive value. The genes were ranked on p-values, logFC and the FDR. We used a volcano plot to visualize the distribution of p-values and logFC resulting from comparison of gene expression levels within disease and between the two diseases.
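To make the multiple-testing correction and the direction-of-change rule concrete, the following minimal Python sketch may help. It assumes that the FDR procedure cited as [26] is the Benjamini-Hochberg step-up method, and it computes the log fold change in the conventional way, as the difference between group means of log2-scale expression. Both are assumptions for illustration and are not necessarily the exact computations used in this investigation; all function names are hypothetical.

```python
from statistics import mean


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def log2_fold_change(case_log2, control_log2):
    """Difference of group means of log2 expression (assumed definition)."""
    return mean(case_log2) - mean(control_log2)


def direction(log_fc):
    """Label the direction of change from the sign of the log fold change."""
    if log_fc > 0:
        return "up"
    if log_fc < 0:
        return "down"
    return "unchanged"
```

A gene would then be retained in a signature when its p-value passes the chosen threshold, and ranked by p-value, FDR and logFC as described above.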
To determine whether genes associated with both COVID-19 and lung cancer are co-regulated and have similar patterns of expression profiles, we performed hierarchical clustering using the Pearson correlation coefficient as the measure of distance between pairs of genes and complete linkage as the clustering method. Hierarchical clustering was performed using Morpheus [27]. To identify molecular networks and signaling pathways associated with the two diseases, we performed network and pathway analysis using the Ingenuity Pathway Analysis (IPA) software [28]. We mapped the up and downregulated genes that were highly associated with both diseases onto networks and canonical pathways. We used Fisher's exact test in network and pathway analysis to compute estimates of p-values. Additionally, we computed the Z-scores to assess the likelihood and reliability of correctly predicting molecular networks to which the genes belonged. The FDR was used to correct for multiple hypothesis testing in pathway analysis [26]. The predicted molecular networks and signaling pathways were ranked based on Z-scores and log p-values, respectively. To characterize the molecular functions, biological processes and cellular components in which the genes associated with the two diseases are involved, we performed gene ontology analysis [29], as implemented in IPA [28].
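As an illustration of how a Fisher's exact test yields a pathway p-value, the sketch below computes the right-tailed probability of observing at least a given overlap between a gene signature and a pathway. IPA is proprietary software and its internal implementation is not reproduced here; the function name, its parameters and the numbers in the usage note are hypothetical.

```python
from math import comb, log10


def fisher_exact_enrichment(overlap, signature_size, pathway_size, background_size):
    """Right-tailed Fisher's exact test for over-representation.

    Returns P(X >= overlap), where X is the number of pathway genes found in a
    signature of `signature_size` genes drawn from `background_size` genes,
    of which `pathway_size` belong to the pathway (hypergeometric model).
    """
    max_overlap = min(signature_size, pathway_size)
    total = comb(background_size, signature_size)
    tail = sum(
        comb(pathway_size, k)
        * comb(background_size - pathway_size, signature_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def neg_log10(p_value):
    """Transform a p-value to the -log10 scale often used to rank pathways."""
    return -log10(p_value)
```

For example, `fisher_exact_enrichment(3, 3, 4, 10)` evaluates a toy case in which all three signature genes fall in a four-gene pathway out of ten background genes; pathways can then be ranked by `neg_log10` of the resulting p-value.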

Results
Clinical management of lung cancer patients in the COVID-19 pandemic era poses significant challenges. One of the more significant challenges has been the lack of information about the molecular mechanisms underlying the association between the two diseases. This knowledge gap has the potential to disrupt essential oncological services provided to lung cancer patients and lead to suboptimal care with potentially deadly consequences. To address this key knowledge gap and critical unmet medical need, we performed integrative genomic data analysis combining gene expression data from lung tissues derived from COVID-19-affected and lung-cancer-affected individuals to discover a signature of genes associated with both diseases, signatures of genes uniquely associated with each disease and molecular networks and signaling pathways shared by the two diseases. Our findings are summarized in the subsections below.

Discovery of Signatures of Genes Associated with COVID-19 and Lung Cancer
To discover a signature of genes transcriptionally associated with COVID-19 and a signature of genes transcriptionally associated with lung cancer, we compared gene expression levels between lung samples derived from patients severely affected by COVID-19 and normal lung samples, and between lung tumors and normal lung samples.
A comparison of gene expression levels between COVID-19-affected lung and normal lung tissue samples revealed a signature of 12,014 significantly (p < 0.05) differentially expressed genes associated with COVID-19 (Figure 2A). The distribution of estimates of p-values and logFC for all 12,014 genes is presented in a volcano plot in supplementary Figure SF1A. A complete list of all 12,014 significantly (p < 0.05) differentially expressed genes associated with COVID-19, along with their estimates of p-values and logFC, is presented in supplementary Table S1A.
The same analysis, comparing gene expression levels between lung cancer and normal lung samples, revealed a signature of 12,420 significantly (p < 0.05) differentially expressed genes associated with lung cancer (Figure 2B). The distribution of estimates of p-values and logFC for all 12,420 genes is presented in a volcano plot shown in supplementary Figure SF1B. A complete list of all 12,420 genes that were significantly (p < 0.05) associated with lung cancer is presented in supplementary Table S1B. Overall, the investigation confirmed our hypothesis that transcription profiling using lung samples from COVID-19- and lung-tumor-affected individuals could identify signatures of genes associated with each disease (Figure 2).
The primary goal of the investigation was to identify a signature of genes associated with both COVID-19 and lung cancer. To achieve this goal, we combined the set of genes associated with COVID-19 with the set of genes associated with lung cancer and sorted them by estimates of p-values and logFC. If a gene was significant in both diseases, as measured by the estimated p-value (p ≤ 0.05), it was considered to be associated with both diseases. Thus, molecular crosstalk perturbation between COVID-19 and lung cancer in this portion of the investigation was measured by discovering a signature of genes associated with or shared by both diseases.
The results of this investigation are summarized in a Venn diagram in Figure 2. The investigation revealed a signature of 9026 genes transcriptionally associated with both COVID-19 and lung cancer, confirming our hypothesis (Figure 2, see intersection). In addition, the investigation revealed a signature of 2988 genes associated with COVID-19 only (Figure 2A) and a signature of 3394 genes associated with lung cancer only (Figure 2B). Interestingly, a majority of the genes were associated with both diseases.
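The Venn partition described above amounts to simple set operations, sketched here in Python. The function name is hypothetical and any gene names passed to it would be placeholders; only the counts in the final check are taken from the text.

```python
def partition_signatures(covid_genes, cancer_genes):
    """Split two disease signatures into shared and disease-specific gene sets."""
    covid, cancer = set(covid_genes), set(cancer_genes)
    return {
        "shared": covid & cancer,
        "covid_only": covid - cancer,
        "cancer_only": cancer - covid,
    }


# Consistency check on the counts reported in the text: the disease-specific
# signatures are the per-disease totals minus the shared signature.
covid_total, cancer_total, shared_total = 12014, 12420, 9026
covid_only_total = covid_total - shared_total    # 2988 genes
cancer_only_total = cancer_total - shared_total  # 3394 genes
```

The reported numbers are internally consistent: 12,014 and 12,420 genes per disease, with 9026 shared, leave 2988 and 3394 disease-specific genes, respectively.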

Changes in Expression Profiles for Genes Associated with COVID-19 and Lung Cancer
Following the discovery of a signature of 9026 genes associated with both COVID-19 and lung cancer, we conducted additional investigation to determine their differences in patterns of expression and direction of change. We addressed this issue by comparing the expression levels of the 9026 genes between COVID-19 and lung cancer. Note that this analysis framework was crucial in identifying genes with different patterns of expression profiles (i.e., genes upregulated in COVID-19 and downregulated in lung cancer and vice versa). The differences in patterns of expression were determined by the estimates of p-values, whereas the direction of change was determined by the logFC, represented by a negative value for downregulation and positive values for upregulation.
Comparison of gene expression profiles produced a signature of 7599 significantly (p < 0.05) differentially expressed up and downregulated genes associated with both COVID-19 and lung cancer. The remaining 1427 genes did not show differences in patterns of expression profiles between the two diseases. The distribution of estimates of p-values and logFC for all the 7599 genes is presented in a volcano plot in supplementary Figure SF2. Among the significantly differentially expressed up and downregulated genes, 4124 genes were upregulated and 3475 were downregulated between COVID-19 and lung cancer tumors. A list of the top 50 most highly significantly differentially expressed (25 up and 25 downregulated) genes along with their estimates of p-values and logFC is presented in Table 1. A complete list of all the 7599 genes showing differences in patterns of expression profiles between COVID-19 and lung cancer along with their estimates of p-values, logFC and direction of change (up/down) is presented in supplementary Table S2. Overall, the investigation revealed crosstalk in molecular perturbation between COVID-19 and lung cancer.

Similarity in Expression Profiles for Genes Associated with Both COVID-19 and Lung Cancer
Quantitative assessment of differences in gene expression levels provides limited information about the regulatory patterns of the genes perturbed in both diseases. Genes associated with both diseases could still behave differently in each disease. Therefore, to characterize the patterns of gene expression profiles and their direction of change among the genes associated with both COVID-19 and lung cancer, we performed hierarchical clustering, as explained in the Materials and Methods section. We hypothesized that, among the genes associated with both COVID-19 and lung cancer, there are differences in their patterns of expression profiles in the two diseases. Here, we sought to discover genes that were upregulated in lung cancer and downregulated in COVID-19, and genes that were upregulated in COVID-19 and downregulated in lung cancer. For this analysis, we used the top 515 most highly significantly (p < 10^−48) differentially expressed up and downregulated genes associated with both diseases. Note that these genes were selected from the 7599 up and downregulated genes significantly associated with both diseases.
Thus, this analysis includes the genes in Table 1. The selection of the top 515 genes used for hierarchical clustering was crucial to eliminate any spurious patterns of expression profiles. The results showing patterns of expression profiles for all the 515 up and downregulated genes associated with both COVID-19 and lung cancer are presented in Figure 3. Owing to space limitations, the names of genes are not shown in Figure 3. Hierarchical clustering produced two clusters of genes: a cluster of genes that were upregulated in lung cancer and downregulated in COVID-19, and a cluster of genes that were upregulated in COVID-19 and downregulated in lung cancer (Figure 3). This confirmed our hypothesis that, among the genes associated with both COVID-19 and lung cancer, there are differences in patterns of their expression profiles in the two diseases. Out of the 515 genes evaluated, 363 genes were upregulated in COVID-19 and downregulated in lung cancer, and the other 152 were downregulated in COVID-19 and upregulated in lung cancer (Figure 3). A complete list of all the 515 up and downregulated genes is provided in supplementary Table S3. Overall, this portion of the investigation further confirmed the crosstalk in molecular perturbation between COVID-19 and lung cancer by showing that genes associated with both diseases are co-regulated and have similar patterns of expression profiles.
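The clustering approach used here (Pearson correlation as the distance measure, complete linkage as the clustering method) can be sketched in a few lines of Python. This is a naive illustration and not the Morpheus implementation. The function names are hypothetical, and the sketch assumes that no expression profile is constant across samples, since the correlation would then be undefined.

```python
from math import sqrt
from statistics import mean


def pearson_distance(x, y):
    """Return 1 - Pearson correlation between two expression profiles."""
    mx, my = mean(x), mean(y)
    covariance = sum((a - mx) * (b - my) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mx) ** 2 for a in x))
    spread_y = sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - covariance / (spread_x * spread_y)


def complete_linkage(profiles, n_clusters=2):
    """Agglomerative clustering of genes with complete linkage.

    `profiles` maps a gene name to its expression values across samples.
    Clusters are merged until `n_clusters` remain; the result is a list of sets.
    """
    clusters = [{gene} for gene in profiles]

    def cluster_distance(a, b):
        # Complete linkage: distance between the two most dissimilar members.
        return max(
            pearson_distance(profiles[g], profiles[h]) for g in a for h in b
        )

    while len(clusters) > n_clusters:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(
            pairs, key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]])
        )
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters
```

With samples ordered as COVID-19 followed by tumor, genes that are high in COVID-19 and low in tumors correlate positively with one another and negatively with genes showing the opposite pattern, so stopping at two clusters separates the two directions of change.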

Discovery of Molecular Networks and Signaling Pathways Shared by the Two Diseases
To gain insights about the broader biological context in which genes associated with both lung cancer and COVID-19 operate and to determine whether they share the same regulatory programs, we performed network analysis. We hypothesized that genes associated with both COVID-19 and lung cancer are functionally related and interact in gene regulatory networks. We sought to identify molecular networks associated with both COVID-19 and lung cancer and to characterize molecular functions, biological and disease processes, and cellular components in which they are involved. This framework was crucial to determining whether these genes share the same regulatory mechanisms. For this investigation, we mapped the top 515 up and downregulated genes that were co-regulated and highly significantly associated with both diseases onto the networks, as described in the Materials and Methods section.
The investigation revealed 25 gene regulatory networks with Z-scores ranging from 10 to 55, containing genes with overlapping functions. The results showing the top seven gene regulatory networks (merged) are presented in Figure 4. Note that, for clarity of presentation, only the most interconnected genes (≥3 connections) in the networks are shown. Genes and networks with fewer interactions were pruned to remove spurious interactions. The top seven networks (Figure 4), with Z-scores of 40 to 55, contained genes predicted to be involved in organismal injury and abnormalities, gene expression, protein synthesis, RNA damage and repair, connective tissue disorders, amino acid metabolism, cellular assembly and organization, small molecule biochemistry, cancer, DNA replication, recombination and repair, and gastrointestinal disease.
In addition, the analysis revealed gene regulatory networks containing genes predicted to be involved in cell-to-cell signaling and interaction, infectious diseases, ophthalmic disease, hematological disease, immunological disease, RNA post-transcriptional modification, cardiac arteriopathy, cardiac fibrosis, cardiovascular disease, cardiac dilation, cell cycle, cellular assembly and organization, respiratory disease, cell-mediated immune response, cellular movement, cellular function and maintenance, lipid metabolism, drug metabolism, molecular transport, organ morphology, tissue development, tissue morphology, cell morphology, cell death and survival, inflammatory response, organismal survival and carbohydrate metabolism. A complete list of all the predicted molecular networks, the genes they contain and the top diseases and molecular functions they are involved in is presented in supplementary Table S4.
Overall, the results showed that COVID-19 and lung cancer share the same regulatory mechanisms and that network analysis is a powerful approach for revealing molecular crosstalk perturbation between lung cancer and COVID-19. Taken together, the investigation demonstrated that the association between COVID-19 and lung cancer can be considered an emergent property of molecular networks encompassing many functionally related genes, rather than being driven by responses to molecular perturbation in a small number of genes.
To determine whether COVID-19 and lung cancer share the same regulatory mechanisms and signaling pathways, gain further insights about the broader biological context in which genes associated with both diseases operate and discover potential therapeutic targets, we mapped the top 515 up and downregulated genes that were highly significantly associated with both diseases onto canonical pathways.
The results showing the top nine most highly significant signaling pathways associated with both COVID-19 and lung cancer are presented in Figure 5. Additional signaling pathways associated with both diseases discovered included: the Mitotic Roles of Polo-Like Kinase -log(p-value, 2.47E), containing genes CDK1, [...] supplementary Table S5.
In summary, an integrative analysis combining gene expression data from COVID-19 and lung cancer samples produced a signature of genes, molecular networks and signaling pathways associated with both diseases, confirming the crosstalk in molecular perturbation between the two diseases. Thus, in the context of common human diseases, molecular crosstalk perturbation between COVID-19 and lung cancer can be considered an emergent property of molecular networks and signaling pathways associated with both diseases, rather than being driven by responses to changes in a small number of genes dysregulated in only one disease. Taken together, the investigation demonstrates that integrating large-scale, high-dimensional transcriptomic data holds promise for discovering potential drivers of the severity of COVID-19 in individuals with lung cancer and targets for the development of therapeutics.

Discussion
Patients with lung cancer have the worst outcomes when affected by COVID-19 [6,7]. The molecular mechanisms associating COVID-19 with lung cancer are not known. This investigation was conducted to address this knowledge gap. The investigation revealed a signature of genes, molecular networks and signaling pathways associated with both diseases. These findings suggest that COVID-19 and lung cancer have shared regulatory mechanisms. This integrative genomics data analysis framework provides the first step and is crucial to discovering the molecular drivers of COVID-19 severity in individuals with lung cancer and the discovery of potential therapeutic targets. To our knowledge, this is the first study to map the landscape of molecular crosstalk perturbation between COVID-19 and lung cancer.
A number of epidemiological studies revealing poorer outcomes for COVID-19 in lung cancer patients have been reported [30,31]. Those poorer outcomes have been attributed to the compromised immune system resulting from the chemotherapy that lung cancer patients undergo, a risk factor for SARS-CoV-2 infection [30,31]. The novel aspect of our investigation is that it provides new knowledge by discovering a signature of genes, molecular networks and signaling pathways associated with both diseases. This has not been previously reported. To the extent that imbalance in host immune response to SARS-CoV-2 drives the development and progression of COVID-19 [32,33], the genes and pathways discovered in this investigation, if confirmed, could serve as potential clinically actionable molecular markers and therapeutic targets. For example, the immune-responsive cytokines and pro-inflammatory genes associated with both diseases, and the signaling pathways they control, could serve as molecular markers to guide clinical management of individuals with lung cancer affected by COVID-19 [7,34] and the development of novel, more effective therapeutics [35][36][37][38][39].
Some of the major challenges in clinical management of COVID-19 include extrapulmonary manifestations of the disease and its effects on multiple organs, including the lungs [40][41][42]. Extrapulmonary manifestations include thrombotic complications, myocardial dysfunction and arrhythmia, acute coronary syndromes, acute kidney injury, gastrointestinal symptoms, hepatocellular injury, hyperglycemia and ketosis, neurologic illnesses, ocular symptoms and dermatologic complications [40][41][42][43]. Although we did not investigate the association of the discovered genes with extrapulmonary manifestations in COVID-19, the discovery of genes with multiple overlapping functions involved in many biological processes suggests that some of the identified genes and gene regulatory networks may be involved in extrapulmonary activities. Moreover, the lung is likely to function in unison with other organs. Under such conditions, the effects of COVID-19 on the lungs have the potential to trigger a cascade of events likely to affect other organs and lead to extrapulmonary manifestations. Indeed, the lung contains many cell types that play different roles. Although we did not examine individual lung cells, previous studies have shown that transcription profiling could reveal novel mechanisms of SARS-CoV-2 infection in human lung cells [44,45].
Another finding of significance in this investigation was the discovery of gene regulatory networks and signaling pathways associated with both diseases. This suggests that the host-pathogen interactions linking the two diseases are complex. The novel aspect and clinical significance of this finding is that it could increase our understanding of host-pathogen interactions, a critical step in vaccine and drug development [46]. For example, the discovery of the coronavirus pathogenesis signaling pathway in this study has the promise to increase our understanding of the pathogenesis of COVID-19 and the molecular mechanisms driving the disease. Although signatures of genes associated with COVID-19 have been reported [17,21], molecular crosstalk perturbation between COVID-19 and lung cancer has not been reported. This framework is crucial for understanding SARS-CoV-2-host interactions and discovering the molecular mechanisms driving disease severity and poorer outcomes in individuals with lung cancer impacted by COVID-19. Success in understanding the link between COVID-19 and lung cancer has the promise to ensure no disruption to essential oncological services and could guarantee optimal care of lung cancer patients in the COVID-19 pandemic era.
The discovery of the integrin signaling pathway associated with both lung cancer and COVID-19 has clinical significance. In cancer, integrins mediate cell adhesion and transmit mechanical and chemical signals to the cell interior [47]. Deregulation of integrin signaling in cancer empowers tumor cells with the ability to proliferate without restraint and to survive in foreign microenvironments [47]. Integrin signaling drives multiple stem cell functions, including tumor initiation, epithelial plasticity, metastatic reactivation and resistance to oncogene- and immune-targeted therapies [47]. These mechanisms of integrin regulation have the potential to provide a gateway for COVID-19 to drive its adverse effects on lung cancer patients. Thus, the integrin signaling pathway could serve as a potential therapeutic target. The discovery of the mTOR signaling pathway was of particular interest because the relevance of the PI3K-Akt-mTOR signaling axis to COVID-19 and to other chronic conditions, such as obesity, has been reported [48,49]. This is a significant finding because patients with lung cancer, obesity and related chronic diseases affected by COVID-19 tend to have poorer outcomes [49,50], which suggests that this pathway has the promise to serve as a therapeutic target.
The discovery of some rather unexpected connections, such as ophthalmic disease and cardiac fibrosis, in network analysis was of particular interest. Cardiac involvement in patients who recovered from COVID-19 has been reported [51]. Recently, cardiopulmonary recovery after COVID-19 has been reported in a prospective multicenter trial [52]. Ophthalmic manifestations of COVID-19 have been reported [53]. Although the molecular mechanisms associating ophthalmic disease and cardiac fibrosis with COVID-19 are not well characterized, the connections observed in this investigation could partially be explained by the functional versatility of identified key genes.
This investigation shows that COVID-19 and lung cancer have shared regulatory programs and signaling pathways. However, the limitations of the study must be acknowledged. The investigation used data from COVID-19- and lung-cancer-affected individuals, not patients affected by both diseases. In addition, we did not perform mechanistic experiments to confirm the results from computational analysis. This was because such data were not available. With these limitations in mind, the manuscript emphasizes modeling the biological association between COVID-19 and lung cancer and considers this the first step in a long road to discovery of the molecular mechanisms driving the two diseases and adverse outcomes. Such a line of research would require molecular and clinical data on individuals affected by both diseases. In addition, experimental confirmation of genomic discoveries would be necessary. That framework will be crucial to ensure the translation of genomic discoveries into clinical practice to improve clinical management of lung cancer patients in the COVID-19 pandemic era.

Conclusions
A key knowledge gap and critical unmet medical need in the clinical management of lung cancer patients in the COVID-19 pandemic era is the characterization of molecular mechanisms associating the two diseases. Using an integrative genomic data analysis approach, combining gene expression data from individuals affected by COVID-19 and individuals affected by lung cancer, we discovered a signature of genes, molecular networks and signaling pathways associated with both diseases. The investigation demonstrated that integrative data analysis, combining transcriptomic data from COVID-19 and lung cancer, is a powerful approach to deciphering the molecular mechanisms linking the two diseases. Further research on a population affected by both COVID-19 and lung cancer, together with experimental confirmation of the results, is recommended to discover molecular drivers of the association between the two diseases, clinically actionable biomarkers and potential therapeutic targets. Such an investigation will be crucial to ensuring the translation of genomic discoveries into clinical practice to improve essential oncological services and guarantee the optimal care of lung cancer patients in the COVID-19 pandemic era.
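As an illustration only, the partitioning of two disease-specific lists of differentially expressed genes into a shared signature and two disease-unique signatures can be sketched with plain set operations. This is a minimal sketch under stated assumptions, not the pipeline used in this study: the function name `partition_signatures`, the symbol normalization step and the placeholder gene symbols (`GENE_A` to `GENE_D`) are all hypothetical, and the upstream differential expression testing and multiple-testing correction are assumed to have been done already.

```python
from typing import Iterable, NamedTuple


class SignaturePartition(NamedTuple):
    """Three disjoint gene sets derived from two disease-specific lists."""
    shared: frozenset
    covid_only: frozenset
    cancer_only: frozenset


def _normalize(genes: Iterable[str]) -> frozenset:
    # Assumption: symbols are compared case-insensitively after trimming
    # whitespace; empty entries are dropped.
    return frozenset(g.strip().upper() for g in genes if g.strip())


def partition_signatures(covid_genes: Iterable[str],
                         cancer_genes: Iterable[str]) -> SignaturePartition:
    """Split two lists of significant genes into shared and unique sets."""
    covid = _normalize(covid_genes)
    cancer = _normalize(cancer_genes)
    return SignaturePartition(
        shared=covid & cancer,        # genes significant in both diseases
        covid_only=covid - cancer,    # genes significant only in COVID-19
        cancer_only=cancer - covid,   # genes significant only in lung cancer
    )


# Hypothetical placeholder symbols, not genes reported in this study.
toy_covid = ["GENE_A", "GENE_B", "GENE_C"]
toy_cancer = ["GENE_B", "GENE_C", "GENE_D"]
result = partition_signatures(toy_covid, toy_cancer)
print(sorted(result.shared))
```

In practice the two inputs would be the significant gene lists for each disease, such as those provided in Supplementary Tables S1A and S1B, and the size of the shared set could then be compared with the list in Supplementary Table S2.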

Patents
No patents resulted from the work reported in this manuscript.
Supplementary Table S1A. List of the 12,014 significantly differentially expressed genes associated with COVID-19-affected lung tissue.
Supplementary Table S1B. List of the 12,420 significantly differentially expressed genes associated with lung cancer.
Supplementary Table S2. List of the 7599 significantly differentially expressed genes associated with both COVID-19 and lung cancer.
Supplementary Table S3. List of the top 517 most highly significantly differentially expressed genes associated with both COVID-19 and lung cancer. These genes were used in heat map and pathway analysis.
Supplementary Table S4. Predicted molecular networks and genes associated with both COVID-19 and lung cancer along with information on top diseases and molecular functions in which they are involved.
Supplementary Table S5. Predicted signaling pathways and genes associated with both COVID-19 and lung cancer.
Institutional Review Board Statement: Not applicable. This study did not involve the use of humans or animals. All the work was based on de-identified publicly available data for which the sources are provided.
Informed Consent Statement: Not applicable. No human subjects were involved in this study. Only publicly available de-identified data were used in the study.

Data Availability Statement: Original RNA-Seq/gene expression data and clinical information on COVID-19 are available in the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/) under accession number GSE150316. Original RNA-Seq/gene expression data and clinical information on lung cancer and controls used in this study were downloaded from The Cancer Genome Atlas (TCGA) via the Genomic Data Commons (GDC). They are described at https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga and are accessible via the GDC at https://gdc.cancer.gov/. Additional data are shared through supplementary tables referenced in the manuscript and provided as supplementary material to this report.