Multi-Design Differential Expression Profiling of COVID-19 Lung Autopsy Specimens Reveals Significantly Deregulated Inflammatory Pathways and SFTPC Impaired Transcription

The transcriptomic profiling of lung damage associated with SARS-CoV-2 infection may lead to the development of effective therapies to prevent COVID-19-related deaths. We selected a series of 21 autoptic lung samples, 14 of which had positive nasopharyngeal swabs for SARS-CoV-2 and a clinical diagnosis of COVID-19-related death; their pulmonary viral load was quantified with a specific probe for SARS-CoV-2. The remaining seven cases had no documented respiratory disease and were used as controls. RNA from formalin-fixed paraffin-embedded (FFPE) tissue samples was extracted to perform gene expression profiling by means of targeted (Nanostring) and comprehensive RNA-Seq. Two differential expression designs were carried out leading to relevant results in terms of deregulation. SARS-CoV-2 positive specimens presented a significant overexpression in genes of the type I interferon signaling pathway (IFIT1, OAS1, ISG15 and RSAD2), complement activation (C2 and CFB), macrophage polarization (PKM, SIGLEC1, CD163 and MS4A4A) and Cathepsin C (CTSC). CD163, Siglec-1 and Cathepsin C overexpression was validated by immunohistochemistry. SFTPC, the encoding gene for pulmonary-associated surfactant protein C, emerged as a key identifier of COVID-19 patients with high viral load. This study successfully recognized SARS-CoV-2 specific immune signatures in lung samples and highlighted new potential therapeutic targets. A better understanding of the immunopathogenic mechanisms of SARS-CoV-2 induced lung damage is required to develop effective individualized pharmacological strategies.


Introduction
Severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) is responsible for coronavirus disease 2019  pandemic: as of November 2021, over 200 million people have been infected and there have been over 5 million deaths.
It is well-established that COVID-19 progression is driven by an inflammatory syndrome caused by an hyperactivation of the immune system. Therefore, disease severity in patients is due to not only the viral infection, but also to the variability of host innate and adaptive immune response, which can result in a wide spectrum of clinical outcomes, ranging from absence of symptoms to severe disease, multiple organ failure, and ultimately death [1,2].
A combination of cytopathic effect, persistent inflammation and immune system activation is responsible for lung tissue damage, resulting in dyspnea, acute lung injury (ALI) and respiratory distress syndrome (ARDS) [3,4].
Understanding COVID-19 related-pathology and the molecular and cellular mechanisms underlying the physiopathology of the disease is essential to obtain reliable diagnostics and optimize clinical management. Because it is rare to obtain samples from living patients, autoptic specimens are essential to gain knowledge of severe COVID-19 lung pathology. Many nonspecific histopathological findings have been identified in autoptic specimens, such as diffuse alveolar damage (DAD), lymphomonocytic and neutrophilic infiltrate, and pleural effusion [5,6].
In this work, we used histologic, immunohistochemical and transcriptomic analyses of post mortem lung tissues, to gain insight into the immunopathologic landscape of COVID-19 lung disease and putative therapeutic strategies targeting aberrant immune responses.

Patient Selection
Lung tissue samples were obtained from 24 patients (17 cases and 7 controls) who died between 20 March 2020 and 30 May 2020 at Padua University Hospital.
The cases were patients with a SARS-CoV-2 infection confirmed by real-time PCR analysis of nasopharyngeal swab samples taken at the time of hospital admission and who died from COVID-19. The controls were deceased patients without any clinically documented respiratory disease.

Autopsy and Tissue Processing
As a safety precaution against infection, a COVID-19-optimized autopsy protocol was followed, in line with the recently published recommendations of the Italian Health Department [7,8].
A median of 5 (range 3 to 7) lung tissue blocks were sampled. All tissue samples were processed according to standard protocols, with formalin fixation time > 48 h. Paraffinembedded sections of 3 µm thickness were stained with hematoxylin and eosin (H&E). Histological examination of COVID-19 pneumonia was performed taking into account several morphological features, as previously described [9,10].
Three experienced pathologists (MF, FP and FC) jointly evaluated the H&E slides to select the most representative tissue block to perform gene-expression profiling and immunohistochemistry (IHC).

RNA Qualification
Five consecutive 10-µm thick sections from each selected FFPE tissue block were obtained. The RecoverAll™ Total Nucleic Acid Isolation Kit (Thermo Fisher Scientific, Waltham, MA, USA) was used to isolate RNA from the material, according to the manufacturer's instructions. RNA concentration was determined using Qubit ® 2.0 Fluorometer (Life Technology, Carlsbad, CA, USA). RNA quality was assessed on the Agilent Bioanalyzed 2100 (Agilent Technologies, Santa Clara, CA, USA), according to the DV200 parameter.

Nanostring Gene-Expression Profiling
At least 100 ng of total RNA was loaded for hybridization with the nCounter ® Autoimmune Profiling Panel and COVID Plus Panel (NanoString Technologies, Seattle, WA, USA) for 16 h at 65 • C, according to the manufacturer's instructions and quantified by the nCounter ® MAX Analysis System (NanoString Technologies). Raw data processing, quality control, and normalization were performed using the nSolver™ 4.0 Analysis Software (NanoString Technologies). Transcript counts were normalized to housekeeper reference gene expression prior to analysis. Three samples out of the 17 SARS-CoV-2 cases were excluded from the analysis due to low quality of the run.

RNA-Seq Gene-Expression Profiling
Total RNA was profiled using SMARTer ® Stranded Total RNA-Seq kit v2-Pico Input Mammalian (Takara Bio USA, Inc., San Jose, CA, USA). Briefly, 50 ng of total RNA was reverse transcribed to single strand cDNA using random primers. Illumina adapters and indexes were added through PCR using only a limited number of cycles. The PCR products were purified using AMPure Beads and then the ribosomal cDNA was depleted. The resulting ribo-depleted library fragments were enriched in a second round of PCR. The amplified RNA-Seq library was purified by immobilization onto AMPure beads and was eluted in TRIS buffer. Purified RNA-Seq library was quantified with Qubit dsDNA HS kit (Thermo Fisher Scientific), run on an Agilent 4200 TapeStation D1000 HS ScreenTape (Agilent, Santa Clara, CA, USA) to assess size distribution of the library molecules, and pooled to equimolar concentration. Pooled RNA-Seq libraries were sequenced with pairedend 75-bp reads using an Illumina NextSeq 550 sequencer (Illumina, Inc., San Diego, CA, USA) and raw data were stored in BaseSpace Sequence Hub (Illumina).

Bioinformatic Analysis
Raw FASTQ data were mapped using the STAR aligner and gene counts were quantified with Salmon. Viral load and clade were generated through the Illumina DRAGEN COVID Lineage v 3.5.0 pipeline, included in the BaseSpace App suite, and served as metadata for the subsequent differential analysis.
Clades were assigned according to the nomenclature defined by Nextstrain. Raw counts and metadata for each sample of interest were imported into the R environment (R version 4.0.5-Shake and Throw).
Genes with poor count levels were filtered out, keeping the ones with a mean value of at least 10, across all samples. Differential expression analysis was performed using the R package DESeq2 exploiting Wald test for statistical significance and Benjamini-Hochberg method for multiple testing correction.
Two design patterns were considered when performing differential expression analysis. Firstly, COVID-infected gene expression profiles were compared to pure control samples' counts, related to deceased patients without any respiratory disease (virus-vs.-control design). Secondly, the two COVID subgroups, low and high viral load, were tested one against the other (low-vs.-high design).
The expression of CD163 and Siglec-1 was quantified by counting the positive macrophages in five consecutive 40× fields, sampling representative different regions of the sample. The mean value per high power field (HPF) for each sample was calculated by dividing the total number of positive cells by 5. When evaluating the expression of Cathepsin C, a semiquantitative scoring system was used as follows: 0 = absence of staining; 1 = weak staining; 2 = moderate staining and 3 = strong staining.
The differential expression of CD163, Siglec-1 and Cathepsin C between cases and controls was detected by applying the Wilcoxon-Mann-Whitney test. A p-value < 0.05 was considered significant. The statistical analysis was performed using the STATA software (Stata Corporation, College Station, TX, USA).

Autoptic Evaluation of Lung from COVID-19 and Control Patients
We analyzed 24 post mortem lung samples from 17 deceased COVID-19 patients and post mortem lung samples from seven patients who died of cardiovascular diseases and had no clinically documented respiratory disease, nor infectious disease. All COVID-19 patients were confirmed for SARS-CoV-2 infection through qRT-PCR assays performed on nasopharyngeal swab specimens. The primary cause of death in all patients of the COVID-19 cohort was respiratory failure. Clinical summaries of the 24 patients are listed in Table 1. Lungs from COVID-19 patients, macroscopically, were increased in volume, with bilateral interstitial oedema and congestion. The cut surfaces revealed tan-gray consolidations and patchy hemorrhagic areas. Microscopically, lungs showed nonspecific lesions consistent with a diagnosis of DAD: multifocal damage with both exudative and proliferative inflammation, inclusive of hyaline membrane formation, alveolar-capillary barrier injury with red blood cell extravasation, inflammatory cells infiltration into the intra-alveolar space, fibroblast and myofibroblast proliferation, acute fibrinous organizing pneumonia and organizing pneumonia, extracellular matrix deposition, parenchymal remodeling, and pulmonary fibrosis. Pneumocytes squamous metaplasia and microthrombi in capillary beds and arterioles were reported in 3 and 11 cases respectively. Lungs from controls showed, apart from congestion and focal oedema, no other relevant pathological aspects.
Regarding the clinical/laboratory information of COVID-19 patients, main characteristics are summarized in Table 2. No significant clusterization of the samples was obtained by considering clinical data (i.e., timing of intubation, type of therapy), sex or age (75 y and 80 y as cut-offs).

Detection of SARS-CoV-2 and Clades Identification
Based on the quantitative assessment of SARS-CoV-2-related genes' expression, six cases were classified with a high viral load. There was concordance between RNA-Seq and digital multiplexed gene expression technologies for determining samples with high viral load.

Gene Expression Profiling Identified an Hyperinflammatory Status of the Lung Parenchyma and a Dysregulation in the Complement Signaling
One representative lung specimen for each patient was subject to gene expression profiling by using a commercially available nCounter panel (nCounter ® Autoimmune Profiling Panel and COVID Plus Panel), to investigate 752 (plus 20 reference genes) differentially expressed immune and inflammatory-related transcripts. The transcriptomic signature identified by the nCounter analysis was further validated by total RNA-Seq, which is able to test for most of 30,000 transcripts.
Three samples of the COVID-19 cohort were discarded from the analysis due to poor quality of RNA (RNA integrity number-RIN < 3) [11].
Both Nanostring and RNA-Seq analysis were able to discriminate between COVID-19 patients and control samples ( Figure 1). The hierarchical clustering heatmaps and the principal component analysis (PCA) graphs based on the differentially expressed genes (DEGs) depict a clear separation of the two groups with no significant difference between male and females. Cases distribution according to distance-based hierarchical clustering heatmaps and principal component analysis (PCA) graphs related to Nanostring (A) and total RNA-Seq (B) results, respectively. The heatmaps are a color-coding graphical representation of data coming from transcriptome analyses according to differentially expressed genes (DEGs) distribution. These plots were built using the Pheatmap package in R achieved by hierarchical, agglomerative clustering methods. The PCA graph depicts variation within and between the two groups. Both representations for the two methods of analysis showed a clear separation between COVID-19 patients (samples) and controls. DEGs were identified by comparing COVID-19 patients' transcriptomes with the controls' transcriptomes. An adjusted p-value (q-value < 0.05) and fold change (FC) ratio (|log2FC| ≥ 2) were used to determine the DEGs. The volcano plots (Figure 2A,B) highlight the DEGs in virus versus control design both in nCounter and RNA-Seq experiment, and the specific genes with the highest degree of statistical significance in terms of differential expression. The relative expression of the top 20 most deregulated genes for nCounter analysis is shown in Figure 2C, while in Figure 2D the results of virus versus control RNA-Seq experiment are depicted differentiating between controls, and samples with low and high viral load. According to nCounter analysis, the genes with higher expression levels amongst the samples were: PKM, CAPN1, CTSA, IFIT1, OAS1, C2, AP1S1, CCL13, CFB, MPV17, SIGLEC1 and PSMB9. On the other hand, differential expression analysis revealed a lower level of expression in the following genes: TCF4, SNCA, TGFB3, PTK2, CD83, DAB2IP, DLL4 and SMAD3. For the purpose of our study, we divided these DEGs into macrocategories of pathways: INF I signaling pathway (to which IFIT1, OAS1, CCL13 and DAB2IP belong), macrophage activation genes (to which PKM, SIGLEC1, PSMB9 and CD83 belong), complement system (to which C2 and CFB belong), Cathepsin genes (CTSA) and other genes (CAPN1, AP1S1, MPV17, TNF4, SNCA, TGFB3, PTK2, DLL4, SMAD3).
We then expanded the analysis to all significantly deregulated genes in both methodologies. We found an important overlap, confirming the relevance of the following genes. The commonly overexpressed ones were: PKM, IFIT1, OAS1, C2, CFB, SIGLEC1, CD163, ISG15, CTSC, RSAD2 and MS4A4A, while the commonly downregulated ones: TCF4, SNCA, TGFB3, CD83 and DDL4 ( Figure 2E,F). Table 3 lists the common increased and decreased DEGs from both the nCounter and total RNA-Seq analyses. Table 3. Common deregulated genes from nCounter and total RNA-Seq analyses. Plays a key role in the innate immune response to viral infection either via its conjugation to a target protein (ISGylation) or via its action as a free or unconjugated protein.

# Gene ID Gene Name Main Function p-Value
p < 0.01 9 CTSC Cathepsin C Functions as a central coordinator for activation of many serine proteases in immune/inflammatory cells. p < 0.01

Major Cell Type Components of SARS-CoV-2 Associated Inflammation
The nCounter analysis allows to discriminate the main pathways and main cell types involved in the inflammatory insult. In particular, the COVID-19-related samples were characterized by a significant activation of genes related to type I interferon signaling pathway, the complement system, and macrophages ( Figure 3A). Endothelial activation genes and TNF family signaling pathways are downregulated in the samples in comparison to controls. The distribution of various cell types between samples and controls, obtained by processing Nanostring raw data, showed a more abundant population of DCs and macrophages in the COVID-19 related samples ( Figure 3B).

Comparison between Lung Samples with Different Viral Load (Low versus High Design)
Pulmonary viral load was quantified by Nanostring with eight probes specific for SARS-CoV-2, and samples were divided into two groups, high and low viral load, numbering six and eight cases respectively, using the median value as a cutoff between the two groups ( Figure 4). The two groups were moderately distinguished one from another, as regards DEGs identified by RNA-Seq. The PCA plot exemplifies the separation between these two with no significant difference between male and females.
The volcano plot in Figure 4B illustrates the DEGs obtained in low-vs-high design. It shows the names of specific genes with the highest degree of differential expression between the two groups. On the right side of the plot are the overexpressed genes in high viral load patients, while on the left side the underexpressed ones. Amongst the deregulated genes ( Figure 4C), the significantly upregulated ones were IFI44, IFI44L, IFI6, EPSTI1, CLEC4E, MX2, MX1, TLR2, IFIT2, FPR2, ZC3H12A, IFIT3, LOC10798, PARP9, DDX60L, ISG15, GBP2 and GBP5. On the other hand, differential expression analysis revealed a downregulation of SFTPC and WWTR1. Among these, IFI44, IFI44L, IFI6, IFIT2, IFIT3, PARP9, DDX60L, ISG15, GBP2 and GBP5 belong to INF I signaling pathway, while EPSTI1, CLEC4E, MX2, MX1, TLR2, FPR2 and ZC3H12A can be associated with macrophage activation pathways; those not belonging to a specific category are WWTR1 and LOC10798.

Comparison between Lung Samples with Different Viral Load (Low versus High Design)
Pulmonary viral load was quantified by Nanostring with eight probes specific for SARS-CoV-2, and samples were divided into two groups, high and low viral load, numbering six and eight cases respectively, using the median value as a cutoff between the two groups ( Figure 4). The two groups were moderately distinguished one from another, as regards DEGs identified by RNA-Seq. The PCA plot exemplifies the separation be-  Despite not falling into a specific pathway category, SFTPC encodes the pulmonaryassociated surfactant protein C (SP-C), essential for lung function, and is notably by far the most deregulated gene in terms of statistical significance (q-value = 6.41 × 10 −6 ) and log Fold Change score (log2FC = 6.21).

Immunohistochemical Evaluation
Among the most deregulated genes, we performed IHC staining for CD163, Siglec-1 and Cathepsin-C ( Figure 5) on the same 21 samples tested by gene expression profiling. All the three tested proteins confirmed the data obtained by Nanostring/total RNA-Seq profiling.
As for CD163, the mean of controls was 9.6 positive cells/HPF, while the mean of COVID-19 patients was 41.4 (p = 0.0029). As for Siglec-1, the mean of controls was 2.4 positive cells/HPF, while the mean of COVID-19 patients was 25.2 (p = 0.0035).
For cathepsin C IHC results, the median of controls was 1+, while the median of COVID-19 patients was 3+ (p < 0.0001). As for CD163, the mean of controls was 9.6 positive cells/HPF, while the mean of COVID-19 patients was 41.4 (p = 0.0029). As for Siglec-1, the mean of controls was 2.4 positive cells/HPF, while the mean of COVID-19 patients was 25.2 (p = 0.0035).
For cathepsin C IHC results, the median of controls was 1+, while the median of COVID-19 patients was 3+ (p < 0.0001).

Discussion
In this study we have analyzed the differential expression of a large number of immune and inflammatory-related transcripts in autopsy lung specimens of COVID-19 deceased patients, with the aim of identifying an immune signature associated with severe COVID-19. Our data confirm that severe COVID-19 is associated with hyperinflammation and a dysregulation of the innate and adaptive immune response. Specifically, SARS-CoV-2 specimens revealed upregulation of interferon-stimulated genes, monocytes/macrophage activation-associated genes, complement pathway and Cathepsin C.
Type I Interferon (IFN) is a key component to the immediate antiviral response. The IFN system comprises several IFN cytokines inducing hundreds of IFN stimulated effectors genes (ISG) [12]. It is well-established that severe COVID-19 is associated with an impaired IFN-I response. However, while low levels of IFNs have been detected in the peripheral blood of patients with severe COVID-19 [13], the local induction of IFNs and ISGs has been reported in the bronchoalveolar lavage (BAL) of some critically ill patients [14]. This discrepancy has been attributed to the activation of specialized immune cells such as lung-resident dendritic cells (DCs), which produce IFNα in response to SARS-CoV-2 infection [14]. Recent clinical trials have shown that early administration of IFN-α is linked to reduced mortality in hospitalized patients, while late IFN-α therapy leads to increased mortality and delayed recovery. Conceivably, high levels of IFN in severe COVID-19 might have no antiviral effect but promote inflammation and tissue damage [15].
Among the antiviral proteins induced by IFN1, the OAS genes harbor genetic variants that might influence susceptibility to the SARS-CoV infection and progression [16]. These genes are responsible for the production of a host antiviral mediator (2 ,5 -oligoadenylate [2-5A]), that activates the effector enzyme RNAse L. RNAse L degrades a double-stranded RNA generated by the virus as a replication intermediate [17,18]. The OAS genes are also a potential therapeutic target: 2-5A is degraded by endogenous phosphodiesterase 12 (PDE-12). Therapeutic PDE-12 inhibitors are available and increase OAS-mediated antiviral activity [19].
Among the interferon-stimulated genes that were found to be upregulated in our analysis, the gene ISG15 is of particular interest. The ISGylation is a host defense mechanism which involves the post-translational attachment of ubiquitin-like protein ISG15 to host and viral target substrates. SARS-CoV-2 triggers deISGylation to generate free ISG15 through the papain-like protease (PLpro) enzyme [20,21]. PLpro is an attractive drug target because it has multiple essential functions involved in processing viral proteins, with several compounds having shown promising results as an antiviral therapy in vitro [22,23].
Excessive monocyte/macrophage activation is another pillar of SARS-CoV-2 immune response and contributes to hyperinflammation [24]. Single-cell RNA sequencing analysis of BAL collected from patients with severe or mild COVID-19 revealed a marked expansion of the mononuclear phagocytes compartment. According to our results, mononuclear phagocyte composition was characterized by a depletion of tissue-resident alveolar macrophages and an enrichment of inflammatory monocyte-derived macrophages in critical patients [25]. Recent studies found low CD169/Siglec1 and high CD163 expression on circulating monocytes in severe COVID-19 patients [26,27]. The early phases of COVID-19 were characterized by an enrichment of CD169/Siglec1+ monocytes in the peripheral blood [26]. Conversely, our data show how the lung immune microenvironment of deceased COVID-19 patients is characterized by a range of macrophage activation states. In fact, the local enrichment of CD163+ monocytes is associated with anti-inflammatory macrophage functions [28], while the abundance CD169/Siglec1+ monocytes reflects IFN1 pathway activation [29].
Complement has emerged as a key contributor to the pathogenesis of COVID-19associated tissue inflammation and thrombosis and is becoming an attractive therapeutic target [30]. In line with our results, there are several lines of evidence for local deposition of complement proteins and activation products in post mortem tissue samples showing activation of the three pathways (classical, alternative and mannose-binding lectin pathway) [31,32]. The lectin and alternative complement pathways are thought to be associated with COVID-19 mortality. However, the involvement of the complement cascade in COVID-19 seems to be heterogeneous and still to be fully elucidated [33].
Neutrophils play a crucial role in SARS-CoV-2 mediated lung damage by releasing elastase-related serine proteases and reactive oxygen [34]. Interestingly, our analysis found the gene CTSC to be upregulated in lung tissue of deceased COVID-19 patients. CTSC encodes for the cysteine dipeptidyl aminopeptidase Cathepsin C (Cat C), that activates most of tissue-degrading elastase-related serine proteases. For this reason, CatC is an attractive therapeutic target to prevent the irreversible pulmonary failure caused in patients with severe COVID-19 [35].
Differential expression analysis revealed lower levels in samples with higher viral load of SFTPC. Through the infection of type II alveolar cells, SARS-CoV-2 interferes with the production of pulmonary surfactant, of which surfactant protein C is a key component, thus causing an increase in surface tension, lastly driving to alveolar collapse [36]. In accordance with this, our data highlight a higher SFTPC expression in patients with lower viral load. As a direct consequence of this, the use of pulmonary surfactant as an additional therapy for the treatment of ARDS can be supported [36]. Moreover, surfactant protein C plays a role in regulating inflammation via JAK/STAT pathway and this may represent an additional therapeutic benefit [37].
The main limitations of the study are the relatively small sample size and the poor quality of the RNA due to the intrinsic poor preservation of autopsy tissue specimens. COVID-19 patients derived from the first pandemic wave in which autopsy was performed only in small series of patients with limited access and heterogeneous information on the clinical management of the disease and laboratory testing. Moreover, viral pathogenicity has changed over time and what we observed in this series should be validated in more recent cases of COVID-19-related deaths. Furthermore, the evaluation of additional control cohorts bearing forms of infectious and noninfectious interstitial pneumonitis may help to better define which molecular alterations are specific to SARS-CoV-2 infection. Notwithstanding these limitations, the validation of the nCounter analysis by total RNA-Seq, which quantified the expression of more than 30,000 transcripts, represents a strength of our results. Another important point is that the subanalysis according to viral load highlighted the results obtained in the comparison between COVID-19 and control samples, which further validate the finding that the infection was responsible for INF I signaling pathway overexpression and macrophages activation.
In summary, detailed transcriptomic analysis of autopsy lung tissue specimens reveals the presence of a strong proinflammatory signature. Our data highlight the key role of IFN1 as the main driver of the dysregulated inflammatory and immune microenvironment observed in the lungs of COVID-19. The current work is a preliminary study which provides the foundation to evaluate a larger series of autopsies, to better understand the immunopathology of severe COVID-19 and to identify novel therapeutic targets and biomarkers to optimize patient selection and the timing of treatment administration.  Institutional Review Board Statement: All information regarding human material was managed using anonymous numerical codes, and all samples were handled in compliance with the Declaration of Helsinki (https://www.wma.net/what-we-do/medical-ethics/declaration-of-helsinki/ (accessed on 22 February 2022)).

Informed Consent Statement:
Patient consent was waived due to the nature of the research (autopsy material).

Data Availability Statement:
The data presented in our study are available on request from the corresponding author. The data are not publicly available due to our internal policy in terms of privacy restrictions.

Conflicts of Interest:
Aida Freire Valls is an employee of Nanostring Technologies, Inc. The other authors have no competing interest to declare related to the present work.