Histological and Somatic Mutational Profiles of Mismatch Repair Deficient Endometrial Tumours of Different Aetiologies

Simple Summary
Endometrial cancers can arise due to an error in a form of DNA mending known as mismatch repair. This can happen because of an error in the cancer itself (somatic) or due to an inherited error (Lynch syndrome). Treatment trials have considered endometrial cancers caused by either of these errors as identical. As it is easier to recruit people with Lynch syndrome, they may be overrepresented in these trials despite being less numerous in clinical practice. This would not be an issue if somatic and Lynch syndrome-related endometrial cancers were similar at a molecular level. The data presented herein, however, indicate that these two routes to mismatch repair deficiency, although sharing many similarities, lead to endometrial cancers with distinct molecular and pathological features. This may explain the range of outcomes observed in clinical trials of endometrial cancers with mismatch repair errors.

Abstract
Background: Mismatch repair deficient (MMRd) tumours may arise from somatic events acquired during carcinogenesis or in the context of Lynch syndrome (LS), an inherited cancer predisposition condition caused by germline MMR pathogenic variants. Our aim was to explore whether sporadic and hereditary MMRd endometrial cancers (EC) display distinctive tumour biology. Methods: Clinically annotated LS-EC were collected. Histological slide review was performed centrally by two specialist gynaecological pathologists. Mutational analysis was by a bespoke 75-gene next-generation sequencing panel. Comparisons were made with sporadic MMRd EC. Multiple correspondence analysis was used to explore similarities and differences between the cohorts. Results: After exclusions, 135 LS-EC underwent independent histological review, and 64 underwent mutational analysis. Comparisons were made with 59 sporadic MMRd EC. Most tumours were of endometrioid histological subtype (92% LS-EC and 100% sporadic MMRd EC, respectively, p = NS). Sporadic MMRd tumours had significantly fewer tumour infiltrating lymphocytes (p ≤ 0.0001) and showed more squamous/mucinous differentiation than LS-EC (p = 0.04/p = 0.05). PTEN mutations were found in 88% sporadic MMRd and 61% LS-EC, respectively (p < 0.001). Sporadic MMRd tumours had significantly more mutations in PDGFRA, ALK, IDH1, CARD11, CIC, MED12, CCND1, PTPN11, RB1 and KRAS, while LS-EC showed more mutations affecting SMAD4 and ARAF. LS-EC showed a propensity for TGF-β signalling disruption. Cluster analysis found that wild-type PTEN associates predominantly with LS-EC, whilst co-occurring mutations in PTEN, PIK3CA and KRAS predict sporadic MMRd EC. Conclusions: Whilst MMRd EC of hereditary and sporadic aetiology may be difficult to distinguish by histology alone, differences in infiltrating immune cell counts and mutational profile may predict heterogeneous responses to novel targeted therapies and warrant further study.


Introduction
The Cancer Genome Atlas (TCGA) categorises endometrial cancers (EC) into four molecular subgroups that more accurately predict clinical outcomes than histological subtype [1]. Approximately one quarter are mismatch repair deficient (MMRd) [2], usually because of hypermethylation of the promoter region of MLH1 [2], an almost exclusively somatic event [3,4]. Less commonly, MMRd EC arises because of Lynch syndrome (LS), an autosomal dominant hereditary condition affecting up to 3% of all EC patients [2]. MMRd tumours have an intermediate prognosis and gain reduced benefit from standard chemotherapeutic agents [5,6] but are sensitive to immune checkpoint inhibitors [7,8], with high rates of durable responses described [9]. The recent decision by the United States Food and Drug Administration (FDA) to license immunotherapy treatments for MMRd tumours irrespective of their site of origin is, therefore, an exciting development [10]; however, the trials that informed this recommendation considered all MMRd tumours to be equal [8,11]. Sporadic and hereditary causes of MMRd reflect different underlying biology that may, in turn, influence treatment response and survival outcomes [12], and failure to account for their potential differences may be an important source of confounding [12]. This is particularly important because participants in drug registration clinical trials are predominantly those with LS-associated rather than sporadic MMRd tumours, whereas, in routine clinical practice, the reverse is true [2]. LS-associated carcinogenesis is a constant threat in individuals with inherited dysfunctional mismatch repair because DNA replication errors occur regularly during normal growth, repair and regeneration [13]. The resulting nonsense proteins, so-called frameshift peptides, are highly immunogenic, and their associated cancers are rapidly cleared by a functional immune system.
Therefore, surviving cancers must develop in the context of a strong anti-cancer immune response, which drives adaptation and immune escape [14]. By contrast, sporadic MMRd EC develops in a mostly unprimed immune microenvironment [12]. It is, therefore, likely that sporadic and hereditary MMRd EC exploit different routes to carcinogenesis and may comprise a heterogeneous group of tumours with different biology and clinical outcomes. An improved understanding of the similarities and differences characterising MMRd EC of sporadic and hereditary origin may therefore inform therapeutic innovations, targeted treatments and personalised care.
The aim of this study was to assess pathological features and somatic mutational profiles of a large cohort of LS-EC and compare these with sporadic MMRd EC sourced from TCGA. The lack of previous studies in this area reflects historically poor routine testing of EC for MMRd, limiting the size of LS-proven tumour cohorts available for comparison with TCGA data.

Materials and Methods
Women with a germline pathogenic MMR variant [15] consistent with LS and a histologically confirmed diagnosis of EC were identified from two large gynaecological cancer centres in Manchester (UK) and Leiden (NL) and through collaboration with the patient support group Lynch Syndrome UK. Two formalin-fixed, paraffin-embedded (FFPE) blocks (one tumour, one normal tissue) were obtained from the hysterectomy specimen for next-generation sequencing alongside a representative haematoxylin and eosin (H&E) stained slide for pathology review. Where hysterectomy material was not available, tissue blocks and slides of the diagnostic biopsy were obtained. Sporadic MMRd EC that were microsatellite instability high (MSI-H) [16] and MLH1 hypermethylated formed the comparator cohort and were sourced from TCGA via the cBioportal (http://www.cbioportal.org/ accessed on 14 March 2020) using the Nature 2013 data.

Pathology Review
Tumour morphology was assessed independently by two specialist gynaecological pathologists (TB and JB) using World Health Organisation criteria [17]. Review was limited to one representative H&E slide per case to be comparable to the TCGA cohort. Pathological features of interest were histological subtype, grade, mucinous differentiation, squamous differentiation, lymphovascular space invasion (LVSI), myometrial invasion, tumour infiltrating lymphocytes (TILs) and fixation quality. Myometrial invasion was categorised according to Quick et al. [18] and LVSI extent according to Bosse et al. [19]. TILs were scored as a percentage of the stromal compartment as per Salgado et al. [20]. Fixation quality was graded as good or poor, according to the reviewing pathologist's opinion. The LS-EC underwent slide review, and digital images of the sporadic MMRd EC comparator cohort were reviewed on TCGA cBioportal. Pathologists were blinded to the original pathology report and each other's report, as well as germline and somatic mutational data. Discordant cases were settled by consensus review and discussion. Tumour stage was taken from the original pathology report or TCGA cBioportal.

Immunohistochemistry
Immunohistochemistry was carried out on 4 µm tissue sections from representative LS-EC tumour blocks. For MMR protein immunohistochemistry, 0.3% H2O2/methanol was used to inactivate endogenous peroxidases. This was followed by antigen retrieval in boiling 10 mmol/L Tris-EDTA pH 9.0. Sections were incubated overnight with primary antibodies against MSH6 (clone EPR3945, 1:800, Genetex) and PMS2 (clone EP51, 1:25, DAKO). Sections stained for PMS2 underwent incubation at room temperature with Envision FLEX + Linker (DAKO) for 20 min. All sections were subsequently incubated with a secondary antibody (poly-HRP-GAM/R/R; DPV0110HRP; Immunologic). Diaminobenzidine-tetrahydrochloride (DAB) was used as a chromogen. Sections were counterstained with Mayer's haematoxylin, dehydrated and mounted. The proportion of stained tumour epithelial component and the intensity of staining were scored by two expert independent observers using tumour stroma as internal control [21].
p53 immunohistochemistry was carried out in the Manchester University NHS Foundation Trust (MFT) Clinical Pathology Laboratory using the automated Ventana BenchMark ULTRA IHC/in situ hybridisation (ISH) staining module (Ventana Co., Tucson, AZ, USA) and ultraView 3,3'-diaminobenzidine version 3 detection system. Tissue sections (4 µm) were baked at 70 °C for 30 min, deparaffinised and incubated in EZPrep (Ventana Co., Tucson, AZ, USA) before washing with TRIS-based reaction buffer. Antigen retrieval used TRIS-ethylenediamine tetraacetic acid (EDTA)-boric acid buffer and cell conditioner 1 for 36 min. Sections were then incubated with ultraviolet inhibitor blocking solution for 4 min before applying DO-7 monoclonal p53 antibody (DAKO) at 1:50 dilution for 36 min. Sections were incubated with horseradish peroxidase-linked secondary antibody, H2O2 and DAB chromogen, and copper for 8, 8 and 4 min, respectively. Slides were washed, counterstained with Harris haematoxylin, dehydrated and coverslipped. p53 staining was scored using British Association of Gynaecological Pathologists protocols by two independent observers; discordant results were resolved by collaborative review and discussion, with a senior author (JB) having the final call [22].

DNA Extraction and Next Generation Sequencing
Tumour DNA was obtained by core biopsy of tumour blocks from several different tumour regions and compared with core biopsies from normal tissue blocks (4 × 0.6 mm of each). Where there was less material available, DNA was extracted from tissue microdissected from five 10 µm slides. DNA extraction was performed on the automated Versant Tissue Preparation platform (TPS, Siemens Healthcare Diagnostics, Erlangen, Germany) as previously described [23]. DNA concentration was confirmed by fluorometer (Qubit dsDNA HS, Life Technologies, Carlsbad, CA, USA), with >50 ng needed for analysis. A custom AmpliSeq next-generation sequencing panel (Cancer Hotspot Panel v4) was designed to capture the genes most frequently mutated in endometrial cancer described in COSMIC [24]. The bespoke panel comprised 75 genes (32 exonic regions and 43 hotspots; Section S2, Supplementary Materials). Primer sequences are available on request. Sequencing libraries were prepared using AmpliSeq methodology according to the manufacturer's recommendations (Thermo Fisher, Waltham, MA, USA) using 10 ng of DNA. Libraries were sequenced on the Ion Torrent GeneStudio S5 platform and a 540 chip (Thermo Fisher, Waltham, MA, USA).

Data Analysis
The unaligned BAM files generated by the sequencer were mapped against the human reference genome (GRCh37/hg19) using the TMAP 5.0.7 software with default parameters (https://github.com/iontorrent/TS accessed on 16 June 2020). Variant calling used the Torrent Variant Caller (TVC) 5.0.2 according to the recommended somatic variant caller parameters. Integrative Genomics Viewer (IGV) was used for visual inspection of variants [25]. Unless otherwise stated, all sequenced positions had a depth of more than 100 reads and a minimum base-pair quality of 20, and variants are reported with an allele frequency of >0.25. Variants were imported into the local in-house variant database Genetic Assistant (GA; version 1.4.5; SoftGenetics), which assigns variant annotations, functional prediction, conservation scores and disease-associated information to each variant. A five-class system was used to categorise mutations. Classes were assigned through a systematic search of the literature (PubMed) and of general or locus-specific databases (Mycancergenome, Alamut Visual, NCBI dbSNP, NCBI ClinVar, COSMIC, Jackson Laboratory database, LOVD, MD Anderson, IARC TP53 database). Class I and II variants were considered benign. Class III variants were of unknown significance, meaning that it was not possible to define their downstream effect on protein function. Class IV and V variants were considered pathogenic. Only pathogenic/driver mutations were included in the analysis. Copy number variants (gains, amplifications, deletions and LOH) were studied with an in-house developed copy number variation analysis tool and visualised with the NGSE Shiny app in R (version 3.3.1) (https://git.lumc.nl/druano/NGSE accessed on 16 June 2020). For the sporadic MMRd EC, somatic mutational data were taken directly from TCGA via the cBioportal. Only genes included in the bespoke 75-gene endometrial cancer panel used for the LS-EC were analysed.
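The reporting and inclusion rules described above can be sketched in a few lines of Python. This is an illustration only and not the pipeline used in the study: the `Variant` record, its field names and the helper functions are hypothetical, and only the thresholds (depth of more than 100 reads, base quality of at least 20, allele frequency above 0.25, classes IV and V treated as pathogenic) are taken from the text.

```python
from dataclasses import dataclass

# Thresholds as stated in the Data Analysis section.
MIN_DEPTH = 100              # depth must exceed this number of reads
MIN_BASE_QUALITY = 20        # minimum base-pair quality
MIN_ALLELE_FREQUENCY = 0.25  # allele frequency must exceed this value
PATHOGENIC_CLASSES = {4, 5}  # class IV and V variants


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal record for one called variant."""
    gene: str
    depth: int
    base_quality: int
    allele_frequency: float
    variant_class: int  # 1 to 5 in the five-class system


def passes_quality(variant: Variant) -> bool:
    """True if the variant meets the depth, quality and frequency thresholds."""
    return (
        variant.depth > MIN_DEPTH
        and variant.base_quality >= MIN_BASE_QUALITY
        and variant.allele_frequency > MIN_ALLELE_FREQUENCY
    )


def select_for_analysis(variants: list[Variant]) -> list[Variant]:
    """Keep variants that pass quality filters and are class IV or V."""
    return [
        v for v in variants
        if passes_quality(v) and v.variant_class in PATHOGENIC_CLASSES
    ]


if __name__ == "__main__":
    # Invented example values, for illustration only.
    calls = [
        Variant("PTEN", depth=850, base_quality=30, allele_frequency=0.41, variant_class=5),
        Variant("KRAS", depth=90, base_quality=30, allele_frequency=0.50, variant_class=5),
        Variant("APC", depth=400, base_quality=28, allele_frequency=0.30, variant_class=3),
    ]
    for kept in select_for_analysis(calls):
        print(kept.gene)
```

In this sketch the quality filter and the pathogenicity filter are kept as separate steps so that class III variants can still be counted descriptively before being excluded from the comparative analysis.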

Statistical Analysis
Data tidying and consolidation were conducted by VBA scripting and conditional formulae in Microsoft Excel 2010. Statistical analysis was performed using GraphPad Prism v7 (La Jolla, CA, USA) for comparative statistics and R 3.6.0 with the RStudio programming environment for clustering analysis. FactoMineR was used in addition to base R and the tidyverse suite. For overall comparisons of percentages, Student's t-test or ANOVA was used. Individual comparisons of percentages were carried out with the N-1 Chi-squared test [26]. For all descriptive analyses, the alpha was set at 0.05. Clinical features from the two cohorts were unified, and data matrices were constructed for the 75-gene panel. Four matrices were populated and scored based on (1) a binary "presence/absence" of any mutation; (2) an ordinal "Passenger/Driver" mutation type; (3) an ordinal "Missense/Inframe/Truncation" mutation type; and (4) an ordinal "Ranked Severity" scoring system that integrated Passenger/Driver status with Mutation Type. Multiple Correspondence Analysis (MCA) [27] was employed to project the binary or ordinal mutation scorings into low-dimensional Euclidean space to determine whether mutation signatures predict disease grade, disease stage, histological subtype, squamous differentiation, or mucinous differentiation. The four matrices were subset by LS status to provide four additional matrices for LS-only analyses. These eight datasets formed the basis of the bioinformatics analysis. No more than five patient outliers were removed for any MCA analysis (and no more than three during a single iteration). If necessary, the four least informative genes were removed prior to final MCA output. Dendrogrammatic clustering was used to assess for mutational substructure in the LS and sporadic MMRd cohorts by standardising and scaling mutation scorings.
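For readers unfamiliar with the N-1 Chi-squared test, the following sketch shows the calculation for a 2 × 2 table: the usual Pearson statistic is multiplied by (N − 1)/N and referred to a chi-squared distribution with one degree of freedom. The function name is hypothetical, and the example counts are an assumption back-calculated from the percentages reported in the abstract (88% of 59 sporadic MMRd and 61% of 64 LS-EC tumours with PTEN mutations); they are not taken from the study data.

```python
import math


def n_minus_1_chi_squared(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """N-1 chi-squared test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, two-sided p-value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    # Pearson chi-squared scaled by (N - 1) / N.
    statistic = (n - 1) * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For one degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


if __name__ == "__main__":
    # Assumed counts: 52/59 sporadic MMRd and 39/64 LS-EC with a PTEN mutation.
    stat, p = n_minus_1_chi_squared(52, 7, 39, 25)
    print(f"chi-squared = {stat:.2f}, p = {p:.4f}")
```

With these assumed counts the p-value falls below 0.001, which is in line with the value reported for the PTEN comparison.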

Results
Pathological Features
In total, 166 diagnostic slides and/or surgical specimens were received from proven LS-EC cases treated between 1982 and 2016 (Figure 1). In total, 65 sporadic MMRd EC (MSI hypermutated) were identified in the TCGA; six were excluded due to MSI-L (n = 1) or normal MLH1 methylation status (n = 5); thus, 59 tumours formed the comparator cohort. Concordance for histological subtype was high between the two pathologists, with an overall Cohen's kappa of 0.82. There were no serous tumours in either group. The LS-EC cases were younger than their sporadic MMRd counterparts, showed higher TIL counts and were more likely to demonstrate broad front myometrial invasion. The sporadic MMRd tumours were exclusively of endometrioid histological subtype with a tendency towards higher grade, squamous or mucinous differentiation and LVSI (Table 1).
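As a reminder of how the concordance statistic is derived, Cohen's kappa compares the observed agreement between two raters with the agreement expected by chance from their marginal label frequencies. The sketch below is a generic illustration; the function name and the toy subtype labels are invented and do not reproduce the study's review data.

```python
from collections import Counter


def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Proportion of cases on which the raters agree.
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Invented toy labels for ten cases.
    pathologist_1 = ["endometrioid"] * 8 + ["clear cell"] * 2
    pathologist_2 = ["endometrioid"] * 7 + ["clear cell"] * 3
    print(round(cohens_kappa(pathologist_1, pathologist_2), 2))
```

Because kappa corrects for chance agreement, it is a stricter measure than raw percentage agreement when one category, such as endometrioid histology, dominates the cohort.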

Somatic Mutations
After filtering, the mean number of mutations in the 75 genes sequenced was 4 and 5 per LS-EC and sporadic MMRd tumour, respectively (Figures S1-S3). In the 64 LS-EC tumours, there were 246 variants, of which 28%, 36% and 35% were class III, IV and V, respectively. The most common mutation type was missense (76.4%), followed by truncating (11%). The most common base pair substitution was cytosine to thymine (41%) (Figures S4-S7). No mutation was detected in two LS-EC tumours. For the 59 sporadic MMRd tumours, 289 variants were detected. Direct mutation class comparison was not possible due to different classification methodologies; however, 54% were considered driver variants. The most common type of variant was missense (79.2%), followed by frameshift (17.7%).

Clustering Analysis
Dendrogrammatic clustering revealed mutations in PTEN, PIK3CA, KRAS and CTNNB1 as the most important mutational events. Wild-type PTEN associated predominantly with LS, whilst co-occurring mutations in PTEN, PIK3CA and KRAS predicted sporadic MMRd (Figure 3). No associations were observed between disease grade, squamous or mucinous differentiation and mutational profile. Within the LS-EC cohort, PTEN, PIK3CA, KRAS, TP53 and APC mutation status were the most important mutational events (Figure 4). No subclusters were associated with pathological features, possibly due to class imbalances within histology and grade. There was also no association of subclusters with LS MMR genotype, suggesting that the gene panel is positioned downstream of MMR functional ablation. Further analysis is shown in Figures S8-S11.

Discussion
All MMRd cancers are considered equal in treatment trials [8,11,28] despite a lack of evidence for this assumption. We sought to explore the validity of the assumption by comparing the genotypic and phenotypic characteristics of a large cohort of proven LS-EC with sporadic MMRd endometrial tumours from TCGA. All sporadic MMRd and most LS-EC tumours were of endometrioid histological subtype. Sporadic MMRd tumours had significantly fewer tumour infiltrating lymphocytes and showed more squamous/mucinous differentiation than LS-EC. There were similar mutational landscapes in MMRd tumours regardless of aetiology, although co-occurring mutations in PTEN, PIK3CA and KRAS were more common in sporadic MMRd and perturbations of TGF-β signalling more common in LS-EC. Our comprehensive interrogation of the phenotype and genotype of MMRd EC of different aetiologies revealed many shared features; however, differences in immune landscapes and mutational profiles may predict heterogeneous responses to treatment and divergent clinical outcomes. Future clinical trials should consider subgroup analysis of women with MMRd tumours of hereditary and sporadic aetiology to investigate this further.
The association between endometrioid histological subtype and MMRd endometrial tumours is well established [29]; however, few previous studies have reviewed such a large bank of proven LS-EC and looked for similarities and differences with sporadic MMRd endometrial tumours. The striking difference in infiltrating immune cells between MMRd tumours of hereditary and sporadic aetiology is consistent with previous work [12] and supports the concept of tumour evolution in the context of longstanding immune pressures in LS-EC [30]. Our observation that LS-EC, but not sporadic MMRd EC, was associated with disruption of immune signalling pathways lends further support to this theory. The immunological landscape plays a crucial role in determining tumour fate, response to treatment and survival outcomes [30]. A primed immune microenvironment may explain why women with LS-EC have better recurrence-free survival than those with MLH1 hypermethylated endometrial tumours [31].
A previous study by Libera et al. evaluated twenty LS-EC cases and five sporadic MMRd endometrial tumours using a 16-gene sequencing panel [32]. They found an association between KRAS mutation and sporadic MMRd tumours and noted a preponderance of ARID1A pathogenic variants in their cohort. Our study builds on these findings and establishes key differences between MMRd tumours of different aetiologies, with a triad of co-occurring somatic mutations in PTEN, PIK3CA and KRAS being a common finding only in the sporadic MMRd tumours, implicating a reliance on MAPK and PI(3)K signalling. LS-EC seems to arise independently of PTEN mutation, which is interesting given how common such mutations are generally and in sporadic MMRd [1,33]. TP53 mutations were prevalent in the LS-EC cohort, a consequence of defective DNA repair and widespread genomic instability [1]. It is interesting that both MLH1 and PTEN are prone to epigenetic silencing through promoter methylation, and therefore, the concordance of these two mutations may indicate a shared aetiology [34]. However, our pipeline did not capture epigenetic changes, and it is most likely that MMR dysfunction and not hypermethylation was the driving mechanism [34]. APC mutations were detected in both cohorts despite being uncommon in endometrial cancer [1]. Further, pathogenic APC mutations were found in grade 3 disease, suggesting that the prevailing theory that such mutations are only present in pre-cancerous or low-grade disease does not hold in MMRd EC [35].
Our study has several key strengths. First, the LS-EC cases were all from clinically confirmed pathogenic MMR variant carriers (InSiGHT Class V; https://www.insightgroup.org accessed on 8 January 2021) and together comprised the largest cohort of LS-EC reported in the literature. Two expert gynaecological pathologists reviewed all morphology, using slides or digital images as appropriate, with discrepancies settled by consensus. Somatic next-generation sequencing was conducted to clinical laboratory standards with a high allele frequency threshold of >0.25 to reduce false positives. The use of TCGA data for sporadic MMRd cases ensured robust comparison with the LS-EC cases. Limitations of the study include pathology review being restricted to one representative H&E slide per case. This was to enable a fair comparison of the LS-EC and sporadic MMRd cases, for which only one digitised slide was available on the cBioportal. Stage was taken from original pathology reports due to inequitable access to the full resected specimen across LS-EC and sporadic MMRd cases. We recognise that our cohort has a limited number of non-endometrioid tumours. This limits the application of our findings to non-endometrioid MMRd EC. However, this lack of non-endometrioid ECs is also of interest as it highlights their rarity in MMRd ECs. Recruitment through Lynch Syndrome UK favoured those who survived their EC, introducing selection bias against aggressive/non-survivable disease; however, of those for whom stage was recorded, 19% had stage III disease, comparable with the sporadic MMRd cohort (18.6%). Our LS cohort originated from two European sites, and the generalisability of our findings to non-European populations is unclear. Sequencing formalin-fixed tissue can create artefacts, particularly when using very old samples [36]. However, recent studies have endorsed its use [37].
We have not included EC with somatic path_MMR gene mutations, which account for around 3% of all EC, whereas somatic MMR silencing through hypermethylation of the promoter region of MLH1 accounts for 16% [2]. Therefore, our cohort represents the most clinically relevant subgroup of somatic MMRd, but it does not provide complete representation [38]. Finally, panel sequencing instead of whole-genome sequencing may miss pathogenic variants [39], and restricted sampling of a single tumour block for mutational analysis may not adequately address potential tumour heterogeneity in EC [40].

Conclusions
This is the most comprehensive comparison of proven LS-EC and sporadic MMRd endometrial tumours conducted to date. We provide detailed information about the pathological features and somatic mutational profile of a large cohort of MMRd endometrial tumours of different aetiologies. There are many similarities in pathological features and mutational landscape across tumours of sporadic and hereditary origin, with key differences in PTEN mutations, the immune microenvironment and disrupted immunological signalling likely reflecting different routes to carcinogenesis. These differences may underlie differential treatment responses and clinical outcomes across the two groups.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/cancers13184538/s1, Table S1: A percentage breakdown of oncogenic pathways affected in LS vs. MSI-H MLH1 methylated samples; Table S2: Somatic mutations in genes covered by The Leiden Endometrial onco-panel found in LS (from de-novo NGS sequencing) and MSI-H/MLH1 methylated (from TCGA data) ECs; Table S3: Somatic mutations in genes covered by The Leiden Endometrial onco-panel found in LS broken down by the germline pathogenic variation; Table S4: Somatic mutations in genes covered by The Leiden Endometrial onco-panel found in path_MLH1; Table S5: A comparison of somatic mutations in genes covered by The Leiden Endometrial onco-panel found in our LS cohort vs. the molecular cohorts as taken from the Cancer Genome Atlas; Table S6: Genes included in the analysis; Table S7: Genes with no mutations in the Lynch cohort; Figure S1: The percentage of mutations by type in the unfiltered sequencing output; Figure S2: The percentage of mutations by class in the unfiltered sequencing output; Figure S3: The percentage of mutations by class in the unfiltered sequencing output; Figure S4: The percentage base changes in the unfiltered sequencing output; Figure S5: The percentage of mutations by type in the filtered sequencing output; Figure S6: The percentage of mutations by class in the filtered sequencing output; Figure S7: The percentage base changes in the filtered sequencing output; Figure S8