High-Plex and High-Throughput Digital Spatial Profiling of Non-Small-Cell Lung Cancer (NSCLC)

Simple Summary Characterizing the tumour microenvironment (TME) has become increasingly important to understand the cellular interactions that may be at play for effective therapies. In this study, we used a novel spatial profiling tool, the Nanostring GeoMX Digital Spatial Profiler (DSP) technology, to profile non-small-cell lung cancer (NSCLC) for protein markers across immune cell typing, immune activation, drug targets, and tumour modules. Comparative analysis was performed between the tumour, adjacent tissue, and microenvironment to identify markers enriched in these areas with spatial resolution. Our study reveals that this methodology can be a powerful tool for determining the expression of a large number of protein markers from a single tissue slide. Abstract Profiling the tumour microenvironment (TME) has been informative in understanding the underlying tumour–immune interactions. Multiplex immunohistochemistry (mIHC) coupled with molecular barcoding technologies have revealed greater insights into the TME. In this study, we utilised the Nanostring GeoMX Digital Spatial Profiler (DSP) platform to profile a non-small-cell lung cancer (NSCLC) tissue microarray for protein markers across immune cell profiling, immuno-oncology (IO) drug targets, immune activation status, immune cell typing, and pan-tumour protein modules. Regions of interest (ROIs) were selected that described tumour, TME, and normal adjacent tissue (NAT) compartments. Our data revealed that paired analysis (n = 18) of matched patient compartments indicate that the TME was significantly enriched in CD27, CD3, CD4, CD44, CD45, CD45RO, CD68, CD163, and VISTA relative to the tumour. Unmatched analysis indicated that the NAT (n = 19) was significantly enriched in CD34, fibronectin, IDO1, LAG3, ARG1, and PTEN when compared to the TME (n = 32). Univariate Cox proportional hazards indicated that the presence of cells expressing CD3 (hazard ratio (HR): 0.5, p = 0.018), CD34 (HR: 0.53, p = 0.004), and ICOS (HR: 0.6, p = 0.047) in tumour compartments were significantly associated with improved overall survival (OS). We implemented both high-plex and high-throughput methodologies to the discovery of protein biomarkers and molecular phenotypes within biopsy samples, and demonstrate the power of such tools for a new generation of pathology research.


Introduction
Non-small-cell lung cancer (NSCLC) accounts for 85% of lung cancers, and is the leading cause of cancer-related deaths [1]. Patients are often diagnosed at an advanced stage, where the immediate prognosis is poor, resulting in a five-year survival rate of less than 20% [2,3]. With the emerging success of immune checkpoint blockade leading to durable responses and prolonged survival in 15-40% of cases, there is now a need for predictive biomarkers to guide patient selection for targeted therapies [4]. The use of comprehensive tumoural information to inform clinical decision-making is becoming increasingly important [5][6][7][8][9]. Studies in the tumour microenvironment (TME) have revealed that a high degree of T-cell infiltration into the tumour provides fertile grounds for effective immunotherapies [10]. As such, the immune contexture (type, density, and location, as well as phenotypic and functional profile of immune cells) has been used to understand a greater depth of the tumour-immune cell interactions, which may provide cues into predictive biomarkers of the response to immune checkpoint therapy (anti PD-1/PD-L1) [11,12].
While traditional immunohistochemistry (IHC) techniques allow for the spatial profiling of cells in the tumour, this is often lost when tumours are analysed using bulk tissue genomic approaches. Moreover, the actual cellular proportions, cellular heterogeneity, and deeper spatial distribution are lacking in characterisation. Spatial and immunological composition with cellular status can aid in identifying micro-niches within the TME [13]. The classification of the immune context within the TME lays the foundation for addressing how the immunological composition and status (activated/suppressed) may dictate response to therapy. Therefore, to address this need, imaging and tissue sampling is required simultaneously to analyse tumour tissue and immune proteins with spatial resolution.

Tissue Microarray
This study has QUT Human Research Ethics Committee (UHREC) approval (#2000000494). The NSCLC Tissue Micro Array (TMA) (HLugA180Su03), containing 92 cases with concordant histologically normal adjacent tissue, was obtained from US Biomax, Inc. (Rockville, MD, USA), including associated clinical information. H&E images were demarcated by a pathologist for tumour and non-tumour regions in each core. The tissue microarray was purchased from US Biomax (commercial source). These companies keep the informed consent of the patient samples used to create the microarrays.

Nanostring GeoMX Digital Spatial Profiler: Tissue Microarray
The slides were profiled using Technology Access Program (TAP) by Nanostring Technologies (Seattle, WA, United States). In brief, immunofluorescent staining was performed on the TMA with tissue morphology markers (PanCK, CD3, CD45, and DAPI) in parallel with DNA-barcoded antibodies within the immune cell profiling, IO drug targets, immune activation status, immune cell typing, and pan-tumour protein panels, as shown in Table 1. Geometric (circular) and custom regions of interest (ROIs) were selected based on visualisation markers to generate tumour (PanCK+) and TME (PanCK−) areas, from which barcodes were liberated by UV light using the GeoMx DSP instrument, then hybridised and counted on the Ncounter system.

Nanostring GeoMX Digital Spatial Profiler: Data Analysis
Patient data presented in Table 2 was generated in R studio [14] using the package "gtsummary" [15]. Remote access to the GeoMx DSP analysis suite (GEOMX-0069) allowed inspection, quality control (QC), normalisation, and differential expression to be performed. Briefly, each ROI was tagged with metadata for its compartment and patient pairing, in order to allow pairwise comparisons. Raw data was exported and plotted in R using "ggplot2" [16] for raw counts, a signal relative to IgG controls, and an evaluation of the Pearson correlation coefficient (R) between normalisation parameters using the "ggpubr" package [17]. Normalisation using Histone H3 and S6 proteins was performed in GeoMx DSP analysis suite. Differential expression between paired compartments was evaluated by paired t-tests with a Benjamini-Hochberg correction, while differential expression between unpaired compartments was performed by a Mann-Whitney test with Benjamini-Hochberg correction, and results were plotted in R studio using "ggplot2". Relative expression data was exported from the GeoMx DSP analysis suite, hierarchical clustering performed using the R package "complexHeatmap" [18], and univariate Cox proportional hazards regression was performed using "survivalAnalysis" [19] package.

Region of Interest (ROI) Selection
Ninety-six ROIs in total were selected that were representative of 45 tumours, 32 TMEs, and 19 histologically normal adjacent tissues from the cohort of patients described in Table 2. Images of H&E-stained cores were demarcated by a pathologist and were utilised alongside Nanostring immunofluorescent staining for morphology markers PanCK, CD45, CD3, and DAPI to draw ROIs indicative of a tumour (CK+) or TME (CK-/CD3+). Figure 1 provides an example of this strategy where tumour and TME ROIs were able to be identified within the same tumour core. Of all the samples collected, comparisons from the same patient could be made between eight TME-NAT pairs, 14 NAT-tumour pairs, and 18 tumour-TME pairs. Figure 2 provides an overview of the tumour and immune ROI selection, as well as representative expression profiles for a number of associated markers.

Data Quality Control
Quality control was performed within the GeoMx DSP analysis suite, to ensure the Ncounter quantification of probes was within specification. Raw probe counts per ROI were inspected to ensure comparable ranges of the signal, and to evaluate systemic variability ins sample groups. ROIs generated median counts within the range of 10 2 and 10 3 , with observably lower median counts for ROIs 13, 67, and 96 ( Figure S1). Raw probe counts were then inspected within TME, tumours, and NAT, as targets were expected to vary by respective tissue compartment (e.g., immune markers in TME vs. tumours). Robust counts were observed for abundant targets, including histone H3, SMA, S6, GAPDH, fibronectin, cytokeratin, CD44, CD68, β-2-microglobulin (B2M), HLA-DR, CD45, and B7-H3 (beyond axis range in Figure S2); however, the remaining probes shown exhibited raw counts below 200. Overall, raw counts from NAT ROIs shown in Figure S2 appeared to be lower than others for compartments, while tumour ROIs generated higher signals for most lowly-abundant probes. Of note, background isotype control IgG probes possessed counts between 50 to 150 ( Figure S2), and while rabbit (Rb) IgG exhibited similar counts between tumour and TME compartments, mouse (Ms) IgG showed higher counts for tumours then TME. This suggests that background correction may not be the best strategy for the normalisation of lowly-expressed targets, as these targets are expressed at background or just above background levels, making their quantification challenging. Target signals relative to Ms and Rb IgG control probes was therefore evaluated to identify probes from which data should be considered with caution. Probes shown in Figure 3 whose median signal from all compartments relative to IgG Cancers 2020, 12, 3551 5 of 16 was less than 1 (pink region) were thus followed with caution, and 31 of the 55 probes above CD25 in Figure 3 below were considered robust for further analysis.

Data Quality Control
Quality control was performed within the GeoMx DSP analysis suite, to ensure the Ncounter quantification of probes was within specification. Raw probe counts per ROI were inspected to ensure comparable ranges of the signal, and to evaluate systemic variability ins sample groups. ROIs generated median counts within the range of 10 2 and 10 3 , with observably lower median counts for ROIs 13, 67, and 96 ( Figure S1). Raw probe counts were then inspected within TME, tumours, and NAT, as targets were expected to vary by respective tissue compartment (e.g., immune markers in TME vs tumours). background or just above background levels, making their quantification challenging. Target signals relative to Ms and Rb IgG control probes was therefore evaluated to identify probes from which data should be considered with caution. Probes shown in Figure 3 whose median signal from all compartments relative to IgG was less than 1 (pink region) were thus followed with caution, and 31 of the 55 probes above CD25 in Figure 3 below were considered robust for further analysis.  Probe counts relative to rabbit (Rb) and mouse (Ms) IgG controls. Counts of each probe were normalised to mean counts of Rb and Ms IgG within each ROI. The mean of these normalised values per probe was plotted to evaluate the robustness of the target protein signal to isotype background controls.

Data Normalisation
The method of normalisation between ROIs was assessed by examining correlations between histone H3, S6, GAPDH, and IgG background control probes, under the assumption that normalisers should correlate between ROIs and be unrelated to underlying biology. Housekeeping proteins included in the assay (GAPDH, histone H3, and S6) were plotted to determine which pairs best correlated across ROIs ( Figure 4A-C). Histone H3 and S6 exhibited the strongest Pearson correlation coefficient (R = 0.7) ( Figure 4C), and were thus examined further for correlation to IgG background to confirm independence from tissue biology. Ms and Rb IgG strongly correlated with each other across ROIs (R = 0.92) ( Figure 4D), and the means of these IgG counts showed strong correlation with means of histone H3 and S6 housekeeping controls (R = 0.91), indicating that IgG controls, histone H3, and S6 were unrelated to underlying biology, and could act as appropriate normalisers across ROIs ( Figure 4E). coefficient (R = 0.7) ( Figure 4C), and were thus examined further for correlation to IgG background to confirm independence from tissue biology. Ms and Rb IgG strongly correlated with each other across ROIs (R = 0.92) ( Figure 4D), and the means of these IgG counts showed strong correlation with means of histone H3 and S6 housekeeping controls (R = 0.91), indicating that IgG controls, histone H3, and S6 were unrelated to underlying biology, and could act as appropriate normalisers across ROIs ( Figure  4E). In addition to the normalisation by IgG and traditional housekeeping members, ROI area and nuclei count were inspected for their utility to normalise DSP data. Figure 5A-C illustrates the relationship that the ROI area possessed with histone H3/S6, IgG, and nuclei. A number of ROIs varied In addition to the normalisation by IgG and traditional housekeeping members, ROI area and nuclei count were inspected for their utility to normalise DSP data. Figure 5A-C illustrates the relationship that the ROI area possessed with histone H3/S6, IgG, and nuclei. A number of ROIs varied by area; however, some possessed maximum sized geometry, and these ROIs varied significantly in their relationship with other normalisation parameters, indicating that ROI area was not a useful normalisation method in this experiment. Similarly, nuclei counts were evaluated relative to IgG and histone H3/S6 means ( Figure 5D,E), where some trend was evident but lacked the robustness of either IgG or histone H3/S6 normalisation. Histone H3/S6 means were therefore utilised for normalisation, and henceforth the comparative quantification of probes. Some ROIs contained the maximum area (horizontal dots in A-C) and exhibited significant variance in the secondary parameter, indicating that area was not a suitable normalisation method. Nuclei counts (D-E) demonstrated a trend with the secondary parameter; however, correlation was not as significant as that observed for IgG or histone H3/S6. NAT: normal adjacent tissue; TME: tumour microenvironment; Tumour: Tumour region.

Data Analysis
Hierarchal clustering by the Ward D2 method [20] was first used to explore normalised data; however, expression appeared to vary significantly within classes of compartments, such that clear distinction between the NAT, tumour, and TME was not evident (Figure 6). K-means clustering to further group ROIs into classes showed most NATs grouping together ( Figure 6, left), characterised by higher expression of most genes except for PanCK, EpCAM, and Ki-67. Another class consisting of both the TME and tumour (middle Figure 6) was characterised by lower expression of most proteins, with some ROIs expressing high levels of Ki-67 and EpCAM, whereas a third class was characterised by relatively heterogenous expression of all proteins (Figure 6, right).
Cancers 2020, 12, x FOR PEER REVIEW 10 of 15 some ROIs expressing high levels of Ki-67 and EpCAM, whereas a third class was characterised by relatively heterogenous expression of all proteins (Figure 6, right). Figure 6. Clustered heatmap of relative expression of proteins per ROI. Ward D2 clustering was applied, followed by K-means clustering to delineate differences between expression profiles among compartments. NAT: normal adjacent tissue; TME: tumour microenvironment; tumour: tumour region.
Global correlation matrices for target protein expression within the TME ( Figure S3) and tumour ( Figure S4) indicated a large number of significant (p ≤ 0.001) positive correlations.
Global correlation matrices for target protein expression within the TME ( Figure S3) and tumour ( Figure S4) indicated a large number of significant (p ≤ 0.001) positive correlations.
Differential protein expression was then evaluated between patient matched and unmatched compartments (Figure 7). Interestingly, matched TME and NAT (n = 8) did not exhibit significant differences ( Figure 7A), while matched TME-tumour pairs (n = 18) indicated an expected enrichment of CD27, CD3, CD4, CD44, CD45, CD45RO, CD68, CD163, and VISTA within the TME, while tumour regions were enriched in Ki-67, EpCAM, and cytokeratin ( Figure 7B). When incorporating all samples, irrespective of patient pairing, several proteins appeared to be downregulated in TME relative to NAT, including CD34, fibronectin, IDO1, LAG3, ARG1, and PTEN ( Figure 7C). TME-tumour comparisons remained similar to the paired data, whereas CD3, CD45RO, VISTA, and CD163 were enriched in the TME relative to the tumour ( Figure 7D). Assessment of the association between protein expression and survival was also explored through an unadjusted, univariate Cox proportional hazards regression. Interestingly, expression data from immune ROIs indicated that the presence of EpCAM and cytokeratin was associated with better patient OS ( Figure  8), while the presence of CD34, CD3, and ICOS in tumour ROIs was associated with better patient OS. When placed in a multivariate model to adjust for age, AJCC, and TNM tumour staging variables, those markers found to be significant in a univariate model no longer reached significance levels (data not shown). The number of samples did not permit higher-level multivariate analysis and statistical modelling of covariate prognostic signatures. When incorporating all samples, irrespective of patient pairing, several proteins appeared to be downregulated in TME relative to NAT, including CD34, fibronectin, IDO1, LAG3, ARG1, and PTEN ( Figure 7C). TME-tumour comparisons remained similar to the paired data, whereas CD3, CD45RO, VISTA, and CD163 were enriched in the TME relative to the tumour ( Figure 7D). Assessment of the association between protein expression and survival was also explored through an unadjusted, univariate Cox proportional hazards regression. Interestingly, expression data from immune ROIs indicated that the presence of EpCAM and cytokeratin was associated with better patient OS (Figure 8), while the presence of CD34, CD3, and ICOS in tumour ROIs was associated with better patient OS. When placed in a multivariate model to adjust for age, AJCC, and TNM tumour staging variables, those markers found to be significant in a univariate model no longer reached significance levels (data not shown). The number of samples did not permit higher-level multivariate analysis and statistical modelling of covariate prognostic signatures.
Cancers 2020, 12, x FOR PEER REVIEW 12 of 15 Figure 8. Cox proportional hazards of compartment-specific protein expression, ranked by association with overall survival. Log2 protein expression was modelled against follow-up time for the tumour and TME. Hazard ratio (HR) < 1 was associated with better patient outcome; HR > 1 was associated with poorer patient outcome.

Discussion
The Nanostring GeoMx DSP platform [13,21] offers a novel solution for high-plex digital quantification of proteins and mRNA from fixed and fresh frozen tissues with spatial resolution [22]. It has been recently applied to triple-negative breast cancer (TNBC) [23], NSCLC [24], and melanoma [25]. However, the implementation and interpretation of such high-plex discovery is still in its infancy. The application of such technologies to large numbers of patient samples in the TMA format potentially provides unparalleled insight into spatial cell types, biomarkers, and the interactions that may underlie the disease biology. In this study, we quantified proteins across the current DSP immune cell profiling, IO drug targets, immune activation status, immune cell typing, and pan-tumour protein modules to understand the presence of these markers in tumours, tumour microenvironments, and histologically normal adjacent tissue compartments. We present a users' experience where 96 ROIs were collected from a single TMA-containing tumour and NAT cores, with data processed and analysed within the GeoMx DSP analysis suite.
In conventional IHC and multiplex IHC, information can be obtained from the entirety of sections or TMA cores, giving a global perspective of marker expression and allowing post-hoc segmentation to inform on distribution. The DSP approach differs in that, while visualisation markers may inform on tumour/non-tumour regions and areas of immune cell infiltrate, ROIs are limited to a maximum of 600 µm geometric shapes. In this study, circular ROIs and several custom-drawn ROIs were used, meaning that "tumour" ROIs innately contained immune infiltrate, and that "immune" ROIs needed to be completely separate from the tumour to be defined, and may represent tumour-adjacent "stromal" immune infiltrate rather than an activated "tumour microenvironment" immune infiltrate. The DSP platform does allow for "masking" or "compartmentalization" within ROIs, enabling the signal to be obtained directly from tumour cells and from the immediate stromal space into which they have proliferated, at µm resolution [24]. However, this approach was not used in this study, and is a salient point to be considered for future analyses using the platform.
Here, we also demonstrate that there is a need to empirically determine the method of normalisation and identify probes which lack robust signal-to-noise. We demonstrated that both IgG Figure 8. Cox proportional hazards of compartment-specific protein expression, ranked by association with overall survival. Log2 protein expression was modelled against follow-up time for the tumour and TME. Hazard ratio (HR) < 1 was associated with better patient outcome; HR > 1 was associated with poorer patient outcome.

Discussion
The Nanostring GeoMx DSP platform [13,21] offers a novel solution for high-plex digital quantification of proteins and mRNA from fixed and fresh frozen tissues with spatial resolution [22]. It has been recently applied to triple-negative breast cancer (TNBC) [23], NSCLC [24], and melanoma [25]. However, the implementation and interpretation of such high-plex discovery is still in its infancy.
The application of such technologies to large numbers of patient samples in the TMA format potentially provides unparalleled insight into spatial cell types, biomarkers, and the interactions that may underlie the disease biology. In this study, we quantified proteins across the current DSP immune cell profiling, IO drug targets, immune activation status, immune cell typing, and pan-tumour protein modules to understand the presence of these markers in tumours, tumour microenvironments, and histologically normal adjacent tissue compartments. We present a users' experience where 96 ROIs were collected from a single TMA-containing tumour and NAT cores, with data processed and analysed within the GeoMx DSP analysis suite.
In conventional IHC and multiplex IHC, information can be obtained from the entirety of sections or TMA cores, giving a global perspective of marker expression and allowing post-hoc segmentation to inform on distribution. The DSP approach differs in that, while visualisation markers may inform on tumour/non-tumour regions and areas of immune cell infiltrate, ROIs are limited to a maximum of 600 µm geometric shapes. In this study, circular ROIs and several custom-drawn ROIs were used, meaning that "tumour" ROIs innately contained immune infiltrate, and that "immune" ROIs needed to be completely separate from the tumour to be defined, and may represent tumour-adjacent "stromal" immune infiltrate rather than an activated "tumour microenvironment" immune infiltrate. The DSP platform does allow for "masking" or "compartmentalization" within ROIs, enabling the signal to be obtained directly from tumour cells and from the immediate stromal space into which they have proliferated, at µm resolution [24]. However, this approach was not used in this study, and is a salient point to be considered for future analyses using the platform.
Here, we also demonstrate that there is a need to empirically determine the method of normalisation and identify probes which lack robust signal-to-noise. We demonstrated that both IgG background control probes as well as histone H3 and S6 housekeeping probes correlated across ROIs, while area and nuclei varied significantly and were thus less reliable for normalising data for quantification. Existing studies that have used area as a normaliser [24] have also utilised a signal-to-noise ratio cut-off >3, suggesting that our particular TMA may have exhibited disproportionately high background or low overall signal, as a significant number of probes were within range of IgG control probes. Without validation through IHC, it is difficult to interpret the meaning of probes that give signal within range of isotype IgG controls. It is noteworthy that, for example, PD-L1 counts fell below background where an abundance should be expected in a subset of NSCLC tissues, which highlights the importance of orthogonal validation when using a discovery technique. It is perhaps for this reason that many significant correlations were observed within compartments for markers that possessed low signal-to-noise, and these observations require additional validation.
It is important to note that the use of traditional methods of housekeeping normalisation in such datasets require deeper investigation. Evidence exists within our data for systematic lower expression in NAT samples, for which a single normalisation approach, including all samples will arbitrarily overestimate normalized NAT counts. This is critical in differential analysis where it should be assumed that most targets are not differentially expressed, and is better controlled for by global scaling methods, such as the "Trimmed Mean of M-values: (TMM) in edgeR package [26], and "Relative Log Expression" (RLE) in DESeq2 package [27]. Such methods require more advanced informatics processing beyond the DSP analysis suite.
With this in mind, it was notable that when differential analysis was applied by paired t-test to a limited number of patient pairs, NAT was indistinguishable from TME. A clear distinction between matched tumour and TME was evident, though, and was indicated by the increased presence of several key markers within the TME. Such markers included CD44, CD45, T cell lineage (CD3, CD4), memory T cells (CD45RO), monocyte/macrophage lineage (CD163, CD163), and costimulatory immune checkpoints (CD27, VISTA). When incorporating all samples, irrespective of patient matching, immunosuppressive molecules LAG3 [28] and IDO1 [29] were, perhaps counter-intuitively, significantly depleted in the TME relative to NAT, indicating the requirement for patient matching to make meaningful comparisons. Furthermore, mixed-model differential analysis should be performed to control for patient matching, where t-tests available for single-slide analysis within the DSP analysis suite are not wholly appropriate.
Nevertheless, the sheer scale of high-plex analyses appropriately applied to large numbers of cases through TMAs is an incredibly powerful tool for spatial biology. The DSP protein modules include key markers that describe multiple immune cell types, immune checkpoints, and experimental targets that enable a more comprehensive understanding of the immunological parameters that influence patient outcome. While overall survival was the only clinical endpoint investigated in this study, the emergence of patient cohorts treated with immunotherapies means that such assays may be used to track patient progression and outcomes, and indicate potential biomarkers for patients most likely to respond to these therapies. Despite some limitations in the absolute definition of tumour and TME compartments in our study, we were able to identify that the presence of CD3, CD34, and ICOS expressing cells in tumour compartments were associated with better patient OS in an unadjusted univariate Cox proportional hazards model. However, these findings require validation.
It is interesting to note that the enrichment of CD3 in tumour regions was associated with improved OS in this study, independently of CD4 T helper cells and cytotoxic CD8 T lymphocytes. Several markers significantly correlated with CD3 expression in the tumour compartment, including CD40, CD44, CD14, B2M, Tim-3, CD8, CD45RO, and ICOS, potentially implicating other cell lineages in immune-associated anti-tumour activity. Of note is the correlation between CD3 and ICOS, both of which were independently prognostic within tumour compartments, highlighting the potential power of such multiplex discovery.
Furthermore, limitations include the retrospective nature of the study, the need for orthogonal validation, and an increase in comparative groups.

Conclusions
In summary, the application of such novel platforms to provide comprehensive snapshots of clinical material enables an unprecedented insight into molecular phenotypes that may be indicative of response to emerging therapies, and ultimately, patient outcome. We propose the development of appropriate normalization methods to overcome systematic variation and low signal-to-noise, and indicate the requirement for larger sample numbers to overcome the limitations of multiple testing in discovery approaches. By combining such high-plex approaches with TMAs and orthogonal validation through multispectral IHC, a new field of biomarker discovery is developing that offers to change the way clinical pathology is performed.