Combined Large Cell Neuroendocrine Carcinomas of the Lung: Integrative Molecular Analysis Identifies Subtypes with Potential Therapeutic Implications

Simple Summary In this manuscript, we offer an integrated molecular analysis of 44 combined large cell neuroendocrine carcinomas (CoLCNECs) in order to deepen the knowledge about these rare histotypes and to clarify their relationship with lung cancers. In the present state of research, molecular studies are still scant, consisting of small and heterogeneous cohorts, and the genomic landscape is poorly characterized. This study shows that CoLCNECs constitute a standalone group of neuroendocrine neoplasm, with three different molecular profiles, two of which overlap with pure LCNEC or adenocarcinoma. CoLCNECs can be considered an independent histologic category with specific genomic and transcriptomic features, different and therefore not comparable to other lung cancers. Indeed, in addition to a histological re-evaluation of lung cancer classification, our study may help to develop a new diagnostic approach for novel and personalized treatments in CoLCNECs. Abstract Background: Combined large cell neuroendocrine carcinoma (CoLCNEC) is given by the association of LCNEC with adeno or squamous or any non-neuroendocrine carcinoma. Molecular bases of CoLCNEC pathogenesis are scant and no standardized therapies are defined. Methods: 44 CoLCNECs: 26 with adenocarcinoma (CoADC), 7 with squamous cell carcinoma (CoSQC), 3 with small cell carcinoma (CoSCLC), 4 with atypical carcinoid (CoAC) and 4 napsin-A positive LCNEC (NapA+), were assessed for alterations in 409 genes and transcriptomic profiling of 20,815 genes. Results: Genes altered included TP53 (n = 30), RB1 (n = 14) and KRAS (n = 13). Targetable alterations included six KRAS G12C mutations and ALK-EML4 fusion gene. Comparison of CoLCNEC transcriptomes with 86 lung cancers of pure histology (8 AC, 19 ADC, 19 LCNEC, 11 SCLC and 29 SQC) identified CoLCNEC as a separate entity of neuroendocrine tumours with three different molecular profiles, two of which showed a non-neuroendocrine lineage. Hypomethylation, activation of MAPK signalling and association to immunotherapy signature specifically characterized each of three CoLCNEC molecular clusters. Prognostic stratification was also provided. Conclusions: CoLCNECs are an independent histologic category. Our findings support the extension of routine evaluation of KRAS mutations, fusion genes and immune-related markers to offer new perspectives in the therapeutic management of CoLCNEC.

The current WHO classification of lung neoplasms defines CoLCNECs as the association of an LCNEC with a recognizable component of adenocarcinoma (CoADC), squamous cell carcinoma (CoSQC) or, less commonly, spindle cell or giant cell carcinoma [4]. Combinations of LCNEC with SCLC are also reported but the proportion of large cells should be ≥10% of the tumour to define a combined SCLC/LCNEC [1]. Within the spectrum of CoLCNECs, LCNECs that focally express the exocrine marker napsin-A but lack a distinct adenocarcinoma component should also be considered [3]. The association between LC-NEC and atypical carcinoid (AC) has also been reported, suggesting that LCNEC may in some cases be the result of accumulating mutations in a previous AC [5,6].
Molecular studies on this rare entity are scant and consist of small and heterogeneous cohorts. In most studies, only the genomic profile has been investigated, highlighting TP53 as the most frequently altered gene while data on the frequency of mutations affecting KRAS and RB1 genes are discordant [3,[7][8][9]. To date, only two studies have analysed these tumours also at a transcriptomic level but their relationship under the umbrella of lung cancers remains unclear [8,10]. Indeed, knowledge about the molecular pathogenesis of combined neuroendocrine/non-neuroendocrine tumours is largely based on studies suggesting a common genetic origin of the two components, especially when the neuroendocrine one is a poorly differentiated carcinoma [9,11,12]. A recent integrated molecular study showed that the histologically different components of the same tumour shared a similar subclonal architecture, especially at a transcriptomic level [10].
Remarkably a therapeutic strategy directed only to one tumour component could be ineffective. In the era of precision medicine, a detailed molecular characterization could guide the therapeutic approach.
We performed an integrated molecular analysis of a well-characterized CoLCNEC series to deepen the knowledge of these rare forms and clarify their relationship among lung cancers.

Cases
A retrospective series (1988-2018) of 44 surgically-resected primary CoLCNEC was collected from 6 Italian institutions with the approval of the local ethics committees (Supplementary Methods).
In addition, 86 surgically-resected primary lung tumours with pure histological features were retrieved for histotypes-based comparison of expression profiles, including 19 adenocarcinomas (ADC), 19 large cell neuroendocrine carcinoma (LCNEC), 11 small cell lung carcinomas (SCLC), 29 squamous cell carcinomas (SQC), and 8 atypical carcinoids (AC). No patient had received preoperative therapy. All cases were formalin-fixed and paraffin-embedded (FFPE) for histological evaluation. Reclassification according to WHO 2021 criteria [1] and confirmation of histological diagnosis were performed by consensus among pathologists before evaluation of each case. Tumours were staged according to the 8th edition of the TNM classification [13]. Eight non-neoplastic lung tissues served as controls for transcriptomic analysis.

Immunohistochemistry
Monoclonal antibodies and staining steps are listed in Table S1. Neuroendocrine markers synaptophysin (Syn) and chromogranin A (ChgA) were used to confirm the neuroendocrine nature and to properly estimate the LCNEC percentage in each CoLCNEC with a non-neuroendocrine counterpart. Napsin-A (NapA) and TTF-1 were used to identify ADC components, p40 to detect squamous components and OTP for carcinoid components. The tumour proliferative index was evaluated only for the neuroendocrine components by counting the percentage of Ki67-positive cells in areas of strongest nuclear labelling [14]. Immunoreactivity and scores for Somatostatin receptor 2A (SSTR2A) were evaluated using a two-tiered system as suggested by Volante et al.: negative for scores of 0 and 1 and positive for 2 and 3 positive [15].

Mutational and Copy Number Variation Status of 409 Cancer Genes
DNA was obtained from an FFPE tumour using 10 consecutive 4-µm sections and the QIAamp DNA FFPE Tissue Kit (Qiagen, Milan, Italy) and qualified according to Simbolo et al. [16]. The Oncomine Tumor Mutational Load (TML) panel (Thermo Fisher Scientific, Milan, Italy) with a next-generation sequencing assay was used (Supplementary Methods). The assay covers 1.65 Mb including the exons of 409 cancer-related genes.

Tumor Mutational Load and Mutational Signatures
TML and mutational spectrum for each sample were evaluated using the Oncomine TML 5.10 plugin on IonReporter (Thermo Fisher Scientific) (Supplementary Methods).

Fusion Gene Detection
The FusionPlex Solid Tumor Panel (ArcherDX, Boulder, CO, USA) was used to screen for fusions in 57 genes (Supplementary Methods).

Gene Expression Analysis by Next-Generation Sequencing
RNA was prepared using ReliaPrep FFPE Total RNA Miniprep System (Promega, Milan, Italy), quantified using Qubit RNA HS Assay Kit (Thermo Fisher), and qualified using RIN analysis of Agilent RNA 6000 Nano Kit on Agilent 2100 Bioanalyzer (Agilent Technologies). RNA with RIN > 5 and concentration over 10 ng/µL was considered suitable. The Ampliseq Transcriptome Human Gene Expression Kit (Thermo Fisher Scientific, MA, USA) was used to analyse the expression status of 20,815 human RefSeq genes. The expression data analysis was subjected to quality control according to Law et al. [17] (Supplementary Methods).

Gene Set Enrichment Analysis
To determine the biological processes differently enriched among all the clusters, we used the GAGE R package [18] and ssGSEA score [19]. We downloaded c2, c5, c6 and H pathways from MSigDB [19,20] and identified the cluster-specific enriched gene sets using the normalized and batch-corrected count matrix. We assessed the ssGSEA score for each pair of samples and gene sets. We performed a z-score normalization of the pathway scores for each sample (Supplementary Methods). A positive correlation between the sample and the specific pathway is represented by a z-score > 0 in a range from −1 to 1. We considered only the differently related pathways (p-value < 0.05 according to Benjamini-Hochberg correction for multiple comparisons). All samples were grouped according to their molecular class.

Statistical Analysis
The associations between clinical, immunophenotypical and molecular features and CoLCNEC groups were assessed using the Fisher exact test or the Kruskal-Wallis test, as appropriate. Correction for multiple comparisons was performed according to Benjamini-Hochberg. Overall survival (OS) was assessed from diagnosis to death or last follow-up by the Kaplan-Meier method. The log-rank test was used to assess the survival difference between patient groups. Cox proportional regression analysis was used to assess the association between clinical-pathological features and OS. Hazard ratios (HR) are presented with a 95% confidence interval (CI).
Data analysis was performed using the R environment for statistical computing and graphics (R Foundation, Vienna, Austria-Version 3.6.2) and MedCalc for Windows version 15.6 (MedCalc Software, Ostend, Belgium). All tests were two-sided and p-values < 0.05 were considered significant.

Mutational and Copy Number Status of 409 Genes
All 44 cases were analysed for 409 genes included in the TML assay panel. Sequencing achieved an average coverage of 522 × (119 − 1582×) (Table S4).
Fifty-three genes contained mutations and/or copy number alterations (Table S5). Mutations were 133 (80 missense, 28 nonsense, 13 splice site alterations, 11 frameshifts and 1 frame deletion; Table S6). Mutations were found in at least one gene in all cases and 17 genes were altered in ≥3 cases ( Figure 2A; Table 2).     A validation by IHC for p53 and rb1 was performed and immunostaining for both markers appeared homogeneous and coherent with the mutational pattern ( Figure 2B).
Copy number variation status was estimated for all 409 genes by using sequencing data. Focal amplification was observed in 15 genes (Table 2). MYC was the most frequently amplified gene (4/44; 9.1%). CDKN2A/B locus showed frequent homozygous deletion (14/44; 31.8%) followed by RB1 (3/44; 6.8%). Based on the chromosomal position of each gene, the status of chromosome arms was defined ( Figure S1). Gains in chromosomes 1 were the most frequent alteration observed (25/44; 56.8%) followed by gains in chromosome 8 ( (Table S7). The mutational signatures did not show any specific pattern.

Fusion Genes
RNA sequencing with the 57-genes panel identified one CoADC with an ALK-EML4 fusion gene, confirmed by FISH analysis that showed positivity in 77% of neoplastic cells in both components ( Figure S2).

Gene Expression Profiles
Adequate transcriptome sequencing data were obtained for 38 CoLCNECs and compared with the expression profiles of 86 histologically pure lung cancers and 8 nonneoplastic lung samples (see Materials and Methods).
Unsupervised hierarchical clustering analysis using the Cola package [22] identified 1356 most informative genes (Table S8) and 10 clusters ( Figure 2C): one cluster including all non-neoplastic lung samples (N), six clusters each including mainly one single-histology lung cancer subtype (AC, ADC, LC, SCLC and SQC), and three clusters including 34/38 CoLCNECs (CL4, CL7, CL9). The remaining 4 CoLCNECs were 2 CoADCs clustered with the ADC group and 2 CoACs clustering in the SCLC group. The 2 CoADC had >70% of adenocarcinoma component and harboured a KRAS mutation in one case and an ALK-EML4 fusion gene in the other. The 2 CoACs had an LCNEC component of 90% and 30%, respectively, and both cases displayed a TP53 alteration associated with RB1 inactivation in one case.

Survival Analysis
The median OS (mOS) was 21 months (95% CI  in the overall cohort, 39 months for CL4, 17 months for CL7 and 11 for CL9. Patients in CL9 had significantly shorter OS than patients in CL4 (p = 0.035; Figure 3D).

Discussion
We performed a genomic and transcriptomic characterization of a cohort of 44 wellcharacterized CoLCNECs, including 4 CoACs, 26 CoADCs, 4 NapA+, 7 CoSQCs and 3 CoSCLCs. In addition to morphology, immunohistochemistry was crucial for the correct identification of the components, especially for the rare LCNEC NapA+ subtype that shows only immunohistochemical napsin-A positivity but no evidence of a distinct conventional ADC pattern [3].
Genomic analysis showed that TP53, RB1 and KRAS genes were frequently altered. A defined distribution of these and other alterations identified in the different histological subtypes of CoLCNEC was reported. Indeed, all CoSCLCs showed simultaneous alteration of TP53 and RB1, while CoADCs showed frequent alterations of KRAS, KEAP1 and STK11 or the presence of ALK-EML4 fusion gene in line with the genomic profile of histologically pure SCLC and ADC, respectively. CoSQC, LCNEC NapA+ and CoAC were mainly driven by TP53 and RB1 alterations. To date, the genomic profile of CoLCNECs is based on a few studies, in which this rare subtype has been included in cohorts of pure LCNECs [3,[7][8][9]. In particular, George et al. [8] reported 8 CoSCLC and 5 CoSQC showing TP53 and RB1 alteration in almost all cases. No genomic studies exist on CoACs, while 14 LCNEC NapA+ were recently investigated by Rekhtman et al. [3] showing mutation in KRAS and/or STK11 in 11 cases, similar to our results.
Our study is to our knowledge the first comparing the transcriptomes of CoLCNECs with those of histologically pure lung cancers (AC, ADC, LCNEC, SCLC and SQC). This comparison suggests that CoLCNEC is a standalone entity composed of three distinct clusters: CL4, CL7 and CL9. CL4 showed an expression profile hierarchically linked to neuroendocrine carcinomas (LCNECs and SCLCs), with the greatest number of cases with simultaneous inactivation of TP53 and RB1, and a transcriptomic profile associated with hypomethylation-related signatures. Of interest, none of the markers previously identified as specific to pure histology LCNEC [8] or SCLC [29] (ASCL1, DLL3, NOTCHs, NEUROD1, POU2F3, YAP1) has been identified as differentially expressed in any CoLCNEC cluster while the direct comparison with other lung cancer molecular profiles highlighted a strong similarity to carcinoids described by Alcala et al. [23] and Laddah et al. [24] suggesting a strong neuroendocrine lineage of this cluster. Conversely, CL7 and CL9 transcriptomes were hierarchically closer to non-neuroendocrine lineage. CL7, mainly composed of CoADC showed the greatest number of cases affected by mutation in KRAS and/or STK11. As expected by such a genomic asset, this cluster showed a transcriptomic profile strongly associated with gene sets linked to the activation of the MAPK pathway, making it a suitable candidate for MEK inhibitor therapy. This is further reinforced by the fact that 3 out of 6 KRAS mutations in this cluster were G12C, amenable to target therapy. In this cluster, the non-LCNEC counterpart also influenced the expression profile which resulted in an intermediate between a pure adenocarcinoma and an LCNEC type I [8] described by George and colleagues. Finally, all cases in CL9, which mainly included CoSQC, were driven by TP53 alterations. The GSEA also showed a relationship between CL9, EMT, inflammation and CTLA4 immunotherapy signature, suggesting that this molecular cluster might be a candidate for immunotherapy. Additionally in this cluster, the non-LCNEC counterpart strongly affected the global molecular lineage of the cluster exhibiting a nonneuroendocrine lineage.
Our study also suggests that these three CoLCNEC clusters have diverse clinical behaviour, with CL4 having a better overall survival with respect to CL9.

Conclusions
This study shows that CoLCNECs are an independent histologic category within lung cancers with mixed genomic and transcriptomic profiles that carry both prognostic significance and potential therapeutic vulnerabilities.
Our results support the extension of molecular profiling in the diagnostic routine, especially for LCNECs combined with an adenocarcinoma component to evaluate the presence of druggable KRAS G12C mutations and fusion genes. Furthermore, transcriptome-based characterization suggests immunotherapy for the worst prognosis molecular group opening up to new therapeutic perspectives.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/cancers14194653/s1, Figure S1: The panel depicts detail of copy number variation (CNV) in whole chromosomes. The consensus of chromosome CNV was represented in red for copy gain events while blue was used for loss events; Figure S2: ALK-EML4 fusion gene identification and validation. (A) Schematic diagram of the domains of wild-type ALK, wildtype EML4, and the identified ALK-EML4 fusion protein. (B) Representative results of break-apart FISH signals for ALK in cancer cells (sample ID 291). Associated red signals (red arrowheads) and green signals (green arrowheads) indicate no split involving the ALK gene while lonely red signals (white arrows) indicate genomic rearrangement of ALK; Figure S3: Gene set enrichment analysis (GSEA). Heatmaps of: (A) hallmark gene sets from MSigDB collections, (B) oncogenic signature gene sets (C6) and (C) curated gene sets (C2) inferred by gene expression significantly enriched in any of the 3 molecular classes of combined large cell neuroendocrine carcinomas (CoLCNECs). ssGSEA was used to obtain the enrichment score, representing the degree to which the genes in a particular gene set are co-ordinately up-or downregulated; Figure S4: Forest plot of hazard ratios: A forest plot showing the hazard ratio and 95% confidence intervals of Stage, Period of diagnosis and Cluster Table S1: Antibody sources and dilutions; Table S2: Clinico-pathological features of 44 combined large cell neuroendocrine carcinomas (CoLCNECs); Table S3: Immunohistochemical features of 44 combined large cell neuroendocrine carcinomas (CoLCNECs) grouped according to histotype; Table S4: Coverage detail of sequencing analysis performed for each of 44 combined large cell neuroendocrine carcinomas (CoLCNECs); Table S5: Differences in mutation prevalence for 53 mutated genes among co-LCNECs grouped according to histotype. Genes are listed according to their p-value; Table S6: Detail of mutations detected in 44 combined large cell neuroendocrine carcinomas (CoLCNECs). Related to Figure 1; Table S7: Tumor mutational load (TML) and signatures of somatic mutations (mutational spectrum) of 44 combined large cell neuroendocrine carcinomas (CoLCNECs); Table S8: List of genes used in unsupervised clustering analysis and differential expression analysis among clusters identified; Table S9: List of gene set included in Hallmark, C2, C5 and C6 MSigDB and GSEA score for each cluster;   Data Availability Statement: All data generated or analyzed during this study are included in this article and its supplementary material files. Further enquiries can be directed to the corresponding author.