Article

Unsupervised Clustering of Cell Populations in Germinal Centers Using Multiplexed Immunofluorescence

1
Department of Laboratory Medicine and Pathology, Institute of Pathology, Lausanne University Hospital, University of Lausanne, CH-1011 Lausanne, Switzerland
2
Biomedical Data Science Center, Lausanne University Hospital, Lausanne University, CH-1011 Lausanne, Switzerland
3
Service of Immunology and Allergy, Department of Medicine, Lausanne University Hospital, Lausanne University, CH-1011 Lausanne, Switzerland
*
Authors to whom correspondence should be addressed.
Biology 2025, 14(5), 530; https://doi.org/10.3390/biology14050530
Submission received: 27 February 2025 / Revised: 22 April 2025 / Accepted: 28 April 2025 / Published: 11 May 2025
(This article belongs to the Section Immunology)

Simple Summary

Immune reactivity in follicles/germinal centers determines the development of pathogen- and immunogen-induced antibody responses and is a critical factor in the pathogenesis of chronic diseases such as HIV infection and certain lymphomas. Delineation of the follicular/germinal center immune landscape provides critical information for understanding this immune reactivity. To this end, the development of imaging assays and computational tools for imaging data analysis is of great importance. Our data support the use of computational clustering of cell types defined by multiplex imaging, as well as the application of complementary methodologies, for the high-dimensional characterization of the follicular/germinal center immune landscape.

Abstract

Follicles (Fs)/Germinal Centers (GCs) in tonsils and lymph nodes are dynamic microenvironments where diverse immune cell populations interact for the development of antibody responses against pathogens. The accurate in situ phenotypic analysis of these immune cells is a prerequisite for the comprehensive understanding of GC development. In this study, we explore unsupervised clustering approaches for distinguishing cell populations within F/GCs using marker expression data. We evaluate multiple clustering algorithms and find that k-means clustering provides the most effective separation of distinct cell subsets. Additionally, we investigate the predictive potential of common GC markers (CD3, CD4, CD20 and BCL6) for PD-1 expression, an important immune checkpoint regulator. Our analysis demonstrates that PD-1 expression can be reliably inferred using these markers, suggesting potential applications for automated cell classification in immunological studies. This approach enhances our ability to analyze immune cell heterogeneity and may contribute to improved understanding of GC dynamics in health and disease. Our findings support the use of computational clustering for high-dimensional immune profiling.

1. Introduction

Multiplex immunofluorescence (mIF) is a technique that allows the simultaneous detection and quantification of multiple protein markers within a single tissue section. This method improves on traditional immunohistochemistry (IHC) by overcoming its limitation of labeling only one marker per tissue section, thus providing a more comprehensive analysis of cell composition, cellular functions, and cell-to-cell interactions [1]. By visualizing multiple biomarkers at the cellular level, mIF provides a comprehensive view of the tumor microenvironment, cell phenotypes, and other complex tissue architectures. This technique is crucial to understanding the interactions between different cell types and the spatial organization of tissues, which are essential for accurate disease diagnosis and prognosis. However, a major bottleneck in mIF data analysis is the accurate assignment of cell types based on the expression of measured markers at the cellular level [2]. In this context, the use of unsupervised clustering in mIF is increasingly important for pathology research and clinical applications due to its ability to enhance the characterization of complex cellular phenotypes [1,3]. Unsupervised clustering algorithms could play a pivotal role in the analysis of the high-dimensional data generated by such methods. These algorithms can identify patterns and structures within the data without prior labeling or knowledge of the data, allowing the discovery of novel tissue architectures and cellular interactions that might not be evident through traditional analysis methods. This approach is particularly valuable in pathology, where understanding the spatial distribution and interaction of cells can lead to better insights into disease mechanisms and the development of targeted therapies.
Such algorithms have been shown to be effective in segregating different cell populations in flow cytometry data [4] as well as in spatial mIF data [5]. Furthermore, unsupervised clustering facilitates the integration of multiplex immunofluorescence data with other high-throughput techniques, such as spatial transcriptomics, to provide a more holistic view of tissue biology. This integration can improve the identification of microanatomical domains and enhance the predictive power of diagnostic models, ultimately contributing to more personalized and effective clinical interventions. As the field of digital pathology continues to evolve, the optimization and standardization of these analytical techniques will be essential for their routine application in clinical settings.
The development of humoral responses against a pathogen or immunogen requires the coordinated function of several cell types (stromal cells, innate and adaptive immune cell types) [6] as well as soluble mediators [7]. Follicular helper CD4 T cells (Tfh) and germinal center (GC) B cells are the main immune cell types mediating the development of antigen specific antibodies [8]. Tonsils represent a prototype lymphoid organ for studying follicular/germinal center immune dynamics. Furthermore, lymphoid organs are important anatomical sites for chronic diseases like lymphomas [9] and HIV infection [10]. Therefore, the comprehensive analysis of F/GC immune dynamics is of great interest for the understanding of disease pathogenesis and the development of novel therapeutic targets. We used mIF data from human tonsils to validate unsupervised methods for the clustering of relevant cell types/biomarkers.

2. Materials and Methods

2.1. Human Material-Ethical Approval

The tissue samples used in this study were obtained from (i) the archives of the Institute of Pathology of Lausanne University Hospital, Switzerland (cancer-free, HIV-free reactive LNs) and their use was approved by the Ethical Committee of the Canton de Vaud, Switzerland (2021-01161) (Table A1) and (ii) from the Hospital de l’Enfance of Lausanne (tonsils were obtained from anonymized children during routine tonsillectomy) and their use was approved by the Canton de Vaud-CER-VD, Switzerland (PB_2016-02436 (201/11)). Tissues from participants with written consent were used while all procedures were in accordance with the Declaration of Helsinki.

2.2. Tissue Processing and Staining

Fresh tissues were fixed in formalin overnight as soon as possible after biopsy and processed for the preparation of formalin-fixed, paraffin-embedded (FFPE) blocks using standard procedures. The blocks were sequentially cut into 4–5 μm sections and prepared on Superfrost glass slides (Thermo Scientific, Waltham, MA, USA, Ref. J1800AMNZ), dried overnight and stored at 4 °C. Before staining, the slides were heated on a metal hotplate (Stretching Table, Medite, Burgdorf, OTS 40.2025, Ref. 9064740715) at 65 °C for 30 min. Tissue sections were stained with titrated antibodies using a Ventana Discovery Ultra Autostainer (Roche Diagnostics, Ventana Medical Systems, Tucson, AZ, USA). Tissues were deparaffinized and hydrated, and the protein epitopes were retrieved by applying the standard Ventana Discovery protocols. Before all antibody incubation steps, tissues were blocked using Antibody Diluent/Block from Akoya (ARD1001EA, Akoya Biosciences, Marlborough, MA, USA). The cycling staining/imaging approach we developed consisted of two staining cycles. During the first staining cycle, unconjugated primary antibodies and primary antibodies conjugated to Alexa Fluor dyes were used (CD3, CD4, PD1, CD20, Ki67, CD57) (Table A2). Both unconjugated and conjugated antibodies were diluted in Antibody Diluent/Block and incubated sequentially for 90 min at room temperature (RT). Alexa Fluor-conjugated antibodies were diluted in Antibody Diluent/Block and incubated for 45 min at RT. To avoid any unwanted binding of secondary antibodies to conjugated antibodies, we first incubated the unconjugated antibodies, followed by the secondary antibodies and lastly the conjugated antibodies. The samples were then counterstained with SYTO45 (1/10,000 dilution in TBS-T, CatNo 10297192, Thermo Fisher Scientific) for 40 min, rinsed in soapy water and mounted using DAKO mounting medium (Dako/Agilent, Santa Clara, CA, USA, Ref. S302380-2).
After imaging the first cycle, slide coverslips were carefully de-mounted using warm ddH2O, and slides were washed briefly in PBS. Then, the first cycle’s antibodies were stripped off using Cell Conditioning Solution (CC2, 950-223, Roche Diagnostics) for 10 min at 100 °C. After the stripping step, slides were washed 2× (ca. 10 s) in ddH2O and 1× for 5 min in PBS, and the second cycle of staining followed. During the second cycle, an Opal dye (Opal 7-color Automation IHC kit, from Akoya, Ref. NEL821001KT T) was used to amplify the signal of second-cycle primary antibodies (Bcl6). More specifically, tissue sections were sequentially subjected to antibody blocking, staining with primary antibodies, incubation with secondary HRP-conjugated antibodies (DISCOVERY OmniMap anti-Ms HRP/760-4310, DISCOVERY OmniMap anti-Rb HRP/760-4311) for 16 min, detection with optimized fluorescent Opal tyramide signal amplification (TSA) dyes and repeated antibody denaturation cycles. The samples were then counterstained with SYTO40 (1/10,000 dilution in TBS-T, CatNo, Thermo Fisher Scientific) for 40 min, rinsed in soapy water and mounted using DAKO mounting medium (Dako/Agilent, Santa Clara, CA, USA, Ref. S302380-2). Alternatively, tonsillar tissue sections were stained using the Ventana system with titrated primary antibodies (same clones as above) and appropriate secondary HRP-labeled antibodies, with repeated cycles of antibody denaturation (Table A2). Optimized fluorescent Opal tyramide signal amplification (TSA) dyes were used for the detection of all target proteins. For the visualization of cell nuclei, the sections were counterstained with Spectral DAPI. Following staining, the sections were rinsed in soapy water and mounted using DAKO mounting medium from Dako/Agilent, Santa Clara.

2.3. Data Acquisition

Images were acquired using a Leica Stellaris 8 SP8 confocal system (Leica Microsystems CMS GmbH, Mannheim, Germany), equipped with Leica Application Suite X (LAS-X)-4.6.1.27508 software, at 512 × 512-pixel density, 0.75× optical zoom and a z-step of 1 μm using a 20× objective (NA). At least 70% of each section was imaged to ensure an accurate representation and minimize selection bias. Tissues stained with a single antibody-fluorophore combination were used to create a compensation matrix via the Leica LAS-AF Channel Dye Separation module (Leica Microsystems CMS GmbH, Mannheim, Germany), which was used to correct fluorophore spillover (when present), as per the user’s manual. When the dye separation results were not optimal, the manual LAS-AF Channel Dye Separation module was employed. Alternatively, multispectral images (MSI) were acquired using the Vectra Polaris 1.0 imaging system (Akoya Biosciences, Marlborough, MA, USA) at a resolution of 5 μm/pixel (20×).

2.4. Image Alignment and Registration-Cell Segmentation

The images generated during the two cycles were aligned using SimpleITK [11] as an Imaris extension (Imaris software version 9.9.0, Bitplane). To facilitate registration, we utilized one common channel present in both imaging cycles (SYTO). After successful alignment, the Surface Creation module of Imaris was used to generate 3-dimensional segmented surfaces (based on the nuclear signal) from the spillover-corrected images. The segmented cells were then processed with the Imaris filtering module, using different combinations of filter types based on the mean and median channel intensities, to exclude artifacts characterized by uniform staining across the segmented area. Areas with uniform staining were excluded from all tissues. The resulting data, such as the average voxel intensities for all channels and the volume and sphericity of the 3-dimensional surfaces, were exported.

2.5. Histocytometry

Acquired images were further processed (Imaris, version 9.8, Oxford Instruments, Abingdon, UK for confocal images and Phenochart 1.0.12 and InForm, version 2.4.8 software (Akoya) for Polaris images) for cell segmentation as previously described [12,13]. A csv report was generated, containing the spatial coordinates (X, Y) of each segmented cell along with its mean intensity for each fluorophore used. After conversion to Flow Cytometry Standard (FCS) format, the data were uploaded to the FlowJo 10 software for further analysis using HistoFlowCytometry [13,14]. Data are reported as frequency of total imaged cells per follicular area.

2.6. Marker Spillover Correction

To evaluate whether spillover played a role in cell population identification, we applied REDSEA [15] with probability maps generated by Ilastik [16]. SYTO was specified as the nuclear marker, and parameters were set as for the MIBI example data. Spillover correction was performed for CD20 and CD3. We used Ilastik v1.4 to classify pixels as nuclear or non-nuclear based on SYTO expression. All available features provided by Ilastik were used for training, with sigma values of 0.3 and 0.7 selected. The classification was exported as a probability map tiff image for the nuclear class.

2.7. Cell Phenotyping with Mass Cytometry (CyTOF)

Tonsillar single cell suspensions were used for the phenotypic characterization of relevant immune cell types by CyTOF. Cryopreserved tonsil cells were thawed and resuspended in complete RPMI medium (10% heat inactivated FBS, Life Technologies, Gibco (Waltham, MA, USA), 100 IU/mL penicillin, and 100 μg/mL streptomycin, BioConcept) and rested for 6 h at 37 °C with 5 U/mL Benzonase (Thermo Fisher, Waltham, MA, USA) to minimize cell aggregation. Cells were washed (0.5% BSA-PBS, Sigma, St. Louis, MO, USA) and incubated for 20 min with a 50 μL cocktail of titrated metal-conjugated antibodies against cell surface markers (Table A3). Cell-ID™-103 Rh Intercalator at a final concentration of 1 μM was used for cell viability staining. Following a washing and fixation step (10 min at RT with 2.4% paraformaldehyde (PFA), Thermo Fisher), cells were permeabilized (45 min at 4 °C with the Foxp3 Fixation/Permeabilization kit, eBioscience, Waltham, MA, USA), washed and stained (30 min, 4 °C) with a 50 μL cocktail of intracellular metal-conjugated antibodies. For TCF1 assessment, a phycoerythrin-conjugated antibody was used (Clone 7F11A10, Cat. N. 655208, Biolegend, San Diego, CA, USA) and then detected with a metal-conjugated anti-PE antibody (145Nd, Standard Biotools, Billerica, MA, USA). Cells were washed and fixed for 10 min at RT with 2.4% PFA. Total cells were identified by DNA intercalation (1 mM Cell-ID Intercalator, Standard Biotools) in 1% PFA and 0.3% saponin (Sigma) at 4 °C, overnight. For CyTOF analysis, cells were washed (×3, MilliQ water, Burlington, MA, USA) and resuspended at 0.5 × 10^6 cells/mL in 0.1% EQ™ Four Element Calibration Beads solution (Standard Biotools). Data were acquired using a Helios mass cytometer instrument (Standard Biotools), using a flow rate of 0.030 mL/min and an event rate of 300 cells/s. Flow cytometry standard (FCS) files were normalized to EQ Four Element calibration beads using CyTOF software, version 7.0.8493.
For conventional cytometric analysis of immune cell populations, FCS files were imported into Cytobank data analysis software for processing with more in-depth analysis performed in R using the OpenCyto and cytofkit packages.

2.8. Bioinformatic Analysis (Tonsilar scRNA Data)

To examine gene expression in doublets in single cell RNA-seq (scRNA-seq) data, we downloaded the unfiltered counts tables from an atlas of tonsil scRNA-seq [17]. We selected cells from 9 donors processed at the same hospital. For each sample, we chose “gates” by manually inspecting a density scatterplot of counts for CD3D versus MS4A1 (the gene encoding CD20). Using these gates, we selected CD3D- and MS4A1-positive “doublets”, CD3D-positive MS4A1-negative cells (putative T-cells), and MS4A1-positive CD3D-negative cells (putative B-cells). With the selected B- and T-cell populations as input, we used scDblFinder v 1.18.0 [18] to generate three times as many artificial B-cell-T-cell doublets as there were cells. We then randomly subsampled the artificial doublets to 500 cells per sample. Using Seurat v5.2.1 [19], we performed differential expression analyses between the real and artificial doublets using the default Wilcoxon test. We also examined average gene expression in the members of the Gene Ontology category “leukocyte cell adhesion” (GO:0007159). Analyses were performed using R version 4.4.2.
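The gating step described above can be illustrated with a minimal Python sketch. It only shows the logic of assigning cells to T, B and "doublet" gates from CD3D and MS4A1 counts; it is not the R workflow used in the study. The function name `gate_cells` and the threshold values are hypothetical, since the actual gates were chosen per sample by visual inspection.

```python
import numpy as np


def gate_cells(cd3d_counts, ms4a1_counts, cd3d_gate=1, ms4a1_gate=1):
    """Label each cell from its CD3D and MS4A1 counts.

    The gate values are placeholders (assumption); in practice they
    would be chosen per sample from a density scatterplot.
    """
    cd3d = np.asarray(cd3d_counts)
    ms4a1 = np.asarray(ms4a1_counts)
    t_pos = cd3d >= cd3d_gate
    b_pos = ms4a1 >= ms4a1_gate

    labels = np.full(cd3d.shape, "negative", dtype=object)
    labels[t_pos & ~b_pos] = "T"        # CD3D-positive, MS4A1-negative
    labels[~t_pos & b_pos] = "B"        # MS4A1-positive, CD3D-negative
    labels[t_pos & b_pos] = "doublet"   # positive for both genes
    return labels


# Toy counts for four cells (illustrative values only)
labels = gate_cells([5, 0, 4, 0], [0, 7, 3, 0])
print(list(labels))
```

Cells labeled "T" and "B" would then serve as input for artificial doublet generation, while the "doublet" gate holds the candidate real conjugates.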

2.9. Statistical Analysis and Cell Population Clustering

We used K-Means, Affinity Propagation, Mean Shift Clustering, Agglomerative Clustering, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), OPTICS Clustering, and Birch Clustering and evaluated their performance using the Silhouette coefficient and the Calinski-Harabasz and Davies-Bouldin indices. The computations were performed using the numpy [20], Scikit-learn [21] and pycaret [22] libraries with python version 3.11.7. Figures were created using the matplotlib [23], seaborn [24] and Vega-Altair [25] libraries. For clustering algorithms that require the number of clusters as input, this number was set to 4. Other inputs were kept at their default values. For marker prediction, we used the pycaret classification module with all available default models, including boosting algorithms, tree-based models, and linear models. The models were trained on data from one lymph node and one tonsil selected randomly and evaluated on the remaining data. The models were selected based on AUC to avoid biases introduced by class imbalance. Scaling of cell marker expression was performed using the StandardScaler class from Scikit-learn. The models were interpreted using SHapley Additive exPlanations (SHAP) [26].
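As an illustration of this evaluation scheme, the following is a minimal sketch using Scikit-learn on synthetic data. The simulated intensities, the population centers and the helper names (`make_synthetic_cells`, `compare_clusterers`) are assumptions made for the example and do not reproduce the study data or pipeline; only three of the listed algorithms are shown.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, Birch, KMeans
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.preprocessing import StandardScaler


def make_synthetic_cells(n_per_group=200, seed=0):
    """Simulate per-cell mean intensities for 4 hypothetical populations."""
    rng = np.random.default_rng(seed)
    # Columns: CD3, CD4, PD1, CD20, Ki67, Bcl6 (arbitrary units, assumed)
    centers = np.array([
        [8.0, 7.0, 6.0, 1.0, 1.0, 2.0],  # Tfh-like
        [8.0, 7.0, 1.0, 1.0, 1.0, 1.0],  # other CD4 T-like
        [1.0, 1.0, 1.0, 8.0, 7.0, 6.0],  # proliferating B-like
        [1.0, 1.0, 1.0, 8.0, 1.0, 6.0],  # non-proliferating B-like
    ])
    groups = [c + rng.normal(0.0, 1.0, size=(n_per_group, c.size)) for c in centers]
    return np.vstack(groups)


def compare_clusterers(X, n_clusters=4, seed=0):
    """Fit several clusterers on scaled data and score each partition."""
    X_scaled = StandardScaler().fit_transform(X)
    models = {
        "KMeans": KMeans(n_clusters=n_clusters, n_init=10, random_state=seed),
        "Agglomerative": AgglomerativeClustering(n_clusters=n_clusters),
        "Birch": Birch(n_clusters=n_clusters),
    }
    scores = {}
    for name, model in models.items():
        labels = model.fit_predict(X_scaled)
        scores[name] = {
            # Higher is better for Silhouette and Calinski-Harabasz
            "silhouette": float(silhouette_score(X_scaled, labels)),
            "calinski_harabasz": float(calinski_harabasz_score(X_scaled, labels)),
            # Lower is better for Davies-Bouldin
            "davies_bouldin": float(davies_bouldin_score(X_scaled, labels)),
        }
    return scores


scores = compare_clusterers(make_synthetic_cells())
for name, s in scores.items():
    print(name, {k: round(v, 3) for k, v in s.items()})
```

The three metrics point in different directions (higher Silhouette and Calinski-Harabasz, lower Davies-Bouldin), so they are best read together rather than reduced to one ranking.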

3. Results

3.1. Unsupervised Clustering Analysis of Human Tonsillar Germinal Center Immune Cell Types

Human tonsils represent a ‘prototype’ lymphoid organ for the study of F/GC immune landscape [27]. We started our investigation by staining tonsillar tissues with a ‘cycling’ mIF (8-plex, Table A2) assay using antibodies against molecules expressed by the main F/GC immune cell types (anti-CD3 and anti-CD4, T cell markers and anti-CD20, a B cell marker) (Figure 1a). Further, tissues were stained for PD-1, a prototype marker for Tfh cells [28], Ki67, a proliferation marker and Bcl-6 a master regulator of both Tfh and GC B cells [29] (Figure 1b). Although Bcl-6 expression was widely distributed across the follicular area, the expression of Ki67 was more polarized with some areas highly enriched in Ki67high cells (Figure 1b). A distinct pattern for localizing PD-1high cells in areas less populated by Ki67high cells was consistently found (Figure 1b).
Cellular events showing a ‘co-localization’ of CD3 and CD20 were evident across the follicular areas analyzed (Figure 2), presumably depicting areas where nearby cells have interconnected membrane structures.
For each segmented cellular event, the position (X, Y) coordinates along with the raw intensities of the markers used were extracted and used for downstream analysis. A consistent correlation among PD1, CD3, and CD4 was found, with Pearson’s R coefficients ranging from 0.69 to 0.71 (Figure 3a), while Ki67 and CD20 had the weakest correlation with other markers (Pearson’s R range: [−0.3, 0.33] and [−0.14, 0.2] respectively). Then, different clustering algorithms were applied for further unsupervised analysis of the imaging data. The performance of each algorithm was evaluated based on its score for Silhouette, Calinski-Harabasz and Davies-Bouldin profiles (Figure 3b).
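The pairwise marker correlations reported here can be computed from a cells-by-markers intensity table. The sketch below uses numpy on simulated values; the shared "T cell signal" and the helper name `marker_correlations` are assumptions for illustration and do not reflect the measured data.

```python
import numpy as np


def marker_correlations(intensities, marker_names):
    """Return Pearson's R for every marker pair (rows = cells, columns = markers)."""
    corr = np.corrcoef(intensities, rowvar=False)
    return {
        (a, b): float(corr[i, j])
        for i, a in enumerate(marker_names)
        for j, b in enumerate(marker_names)
        if i < j
    }


# Simulated intensities: CD3 and CD4 share a common signal, CD20 is independent
rng = np.random.default_rng(1)
t_signal = rng.normal(size=500)
cd3 = t_signal + 0.5 * rng.normal(size=500)
cd4 = t_signal + 0.5 * rng.normal(size=500)
cd20 = rng.normal(size=500)

pairs = marker_correlations(np.column_stack([cd3, cd4, cd20]), ["CD3", "CD4", "CD20"])
for pair, r in pairs.items():
    print(pair, round(r, 2))
```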
The K-means algorithm provided the best separation of the data according to the Silhouette profile and the Davies-Bouldin and Calinski-Harabasz indices, with Silhouette scores averaging 0.28 (range: 0.27–0.29). The other metrics used to evaluate the cell clusters also highlighted the superiority of the K-means algorithm, with a higher Calinski-Harabasz index and a lower Davies-Bouldin index (Figure 3b). In line with the aforementioned mIF (Figure 1), CD4 was found to be co-expressed with CD3, and a similar profile was found for PD1 (Figure 3a). Bcl6 was found mainly co-expressed with CD20high cells with or without expression of Ki67. Details about the different clusters and the number of cells in each group per tonsil sample are listed in Table 1.
Despite the variation in the expression of a given marker within a specific cell type, the resulting marker expression signatures are consistent with known cell types found in healthy germinal centers. Specifically, our approach correctly identifies a population with high prevalence of Ki67highCD20high events, enriched in the Dark Zone as well as Ki67lowCD20high B cells, enriched in the Light Zone [30] (Figure 4a). With respect to CD3 expression, we observed a low but measurable ‘co-expression’ with CD20 (Figure 4b), in line with the aforementioned mIF. Again, CD4 was found to be co-expressed with CD3 while a similar profile was found for PD-1 too (Figure 4b). Bcl6 was found to be mainly co-expressed with CD20high cells with or without Ki67 expression (Figure 4b). Despite the noise and marker spillage, our unsupervised clustering analysis can accurately identify follicular/germinal center immune cell types.
Then, we sought to apply our clustering approach to a second set of mIF (5-plex, Table A2) imaging data generated by an alternative platform (Polaris) (Figure 5a). To this end, tissue sections from two tonsils were used. Follicular areas were enriched in CD20 positive B cells (Figure 5a). Similar to the cycling mIF generated data (Figure 1b), PD-1highCD3high Tfh cells were found exclusively within the follicular areas (Figure 5a). We observed a low co-expression between CD3 and CD20 (Figure 5b). Our unsupervised clustering for the measured markers (CD20, CD3, PD-1, Ki67) showed that Ki67 was mainly associated with B cells while PD-1 was mainly co-expressed with CD3 (Figure 5c).
Then, imaging data (Polaris mIF) were analyzed with HistoCytometry, an established methodology for the analysis of imaging data using the FlowJo platform [14,31,32] (Figure 6a). Analysis of two tonsils showed a CD20highCD3high cell population (9 and 11.5% of total follicular cells) (Figure 6a and Supplemental Figure S1) in line with the aforementioned identified clusters (Figure 3a). Although in lower frequency compared to CD20lowCD3high, we found a high representation of CD3highPD-1high Tfh cells in CD20highCD3high cell compartment (Figure 6a and Supplemental Figure S1). In line with our unsupervised clustering analysis, Ki67 was found to be associated with B cells while PD-1 expression was almost exclusively associated with CD3 T cells (Figure 6a and Supplemental Figure S1).

3.2. Characterization of the Follicular CD3/CD20 ‘Conjugates’

We sought to further investigate the phenotype of the CD3/CD20 ‘conjugates’. To this end, the expression of several molecules in relevant tonsillar immune cell subsets was investigated in tissue-derived cell suspension with CyTOF (Figure 6b). The CD3/CD20 ‘conjugates’ were also evident in the CyTOF analysis (Figure 6b). Clustering analysis revealed a gradually increased expression of Tfh cell markers (e.g., PD-1, ICOS, TIGIT) between CD3highCD19high DNAdim/low and CD3highCD19high DNAhigh T cells (Figure 6c).
We further characterized the molecular profile of the CD3/CD20 ‘conjugates’ using a publicly available tonsillar dataset (HCATonsilData). In line with the imaging and CyTOF data, ‘conjugates’ were identified based on the co-expression of the CD3 and CD20 genes (Figure 7a). The comparative analysis of gene expression between in silico (artificial) CD3/CD20 ‘conjugates’ and the detected (real) ones revealed a high expression of Tfh cell markers in the ‘real conjugates’ (Figure 7b). However, no significant difference was found between the two types of ‘conjugates’ when the gene expression of several adhesion molecules was investigated (Figure 7c). Application of the REDSEA [15] algorithm was able to further ‘clean’ the CD3/CD20 ‘conjugates’ (Figure 7d).

3.3. Unsupervised Clustering Analysis of Human Lymph Node Germinal Center Immune Cell Types

Tonsils are a unique lymphoid organ with respect to their anatomical site, structure and cellular composition. Despite the shared phenotypes between human tonsils and reactive lymph nodes [33], different immune dynamics are possible between the two organs, especially in the setting of a disease. We applied our approach to human lymph nodes characterized by follicular hyperplasia. Marker expression was most strongly correlated among PD1, CD3 and CD4 (Figure 8), similar to the results in Figure 3 and consistent with known cell phenotypes in GCs. The Pearson’s R-value between CD57 and PD-1, however, was noticeably lower for lymph nodes compared to tonsils.
Similarly to the results for tonsils, the best-performing clustering algorithm for our lymph node samples was K-means (Figure 8), with higher Silhouette and Calinski-Harabasz indices and a lower Davies-Bouldin index. This approach also identified clusters characterized by the co-expression of CD3, CD4, and PD-1 (Figure 9).
Bcl6 was mainly co-expressed with CD20, with or without Ki67. As expected, our approach showed that most follicular cells are B cells followed by CD3 T cells (Table 2). The cell population is smaller in lymph nodes’ germinal centers compared to the results of Table 1, mainly due to the difference in follicular/germinal center size between the two tissue types.

3.4. PD-1 Marker Prediction

Then we asked whether mIF measured markers could predict the expression of PD-1. Among all models tested, the best performances for the prediction of PD-1 expression were recorded for the Gradient Boosting Classifier and the Light Gradient Boosting Machine, with AUC values of 0.9467 and 0.9456 (see Table 3 and Figure 10).
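A minimal sketch of this kind of marker prediction is given below. It uses Scikit-learn's `GradientBoostingClassifier` directly rather than pycaret, and trains on one simulated sample before evaluating on another, loosely mirroring the train/evaluate split by tissue sample. The simulated marker distributions and the helper `simulate_sample` are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import StandardScaler


def simulate_sample(n_cells, seed):
    """Simulate marker intensities and a binary PD-1 label for one sample."""
    rng = np.random.default_rng(seed)
    is_t = rng.random(n_cells) < 0.3  # assumed T cell fraction
    cd3 = np.where(is_t, 6.0, 1.0) + rng.normal(size=n_cells)
    cd4 = np.where(is_t, 5.0, 1.0) + rng.normal(size=n_cells)
    cd20 = np.where(is_t, 1.0, 6.0) + rng.normal(size=n_cells)
    bcl6 = rng.normal(3.0, 1.0, size=n_cells)
    # Toy rule: a random subset of T cells is PD-1 positive
    pd1_pos = is_t & (rng.random(n_cells) < 0.6)
    X = np.column_stack([cd3, cd4, cd20, bcl6])
    return X, pd1_pos.astype(int)


# Train on one simulated sample, evaluate on a separate one
X_train, y_train = simulate_sample(1000, seed=0)
X_test, y_test = simulate_sample(1000, seed=1)

scaler = StandardScaler().fit(X_train)  # fit scaling on training data only
model = GradientBoostingClassifier(random_state=0)
model.fit(scaler.transform(X_train), y_train)

proba = model.predict_proba(scaler.transform(X_test))[:, 1]
auc = roc_auc_score(y_test, proba)
print(f"AUC on held-out sample: {auc:.3f}")
```

AUC is computed from predicted probabilities rather than hard labels, which keeps the score informative when PD-1 positive cells are a minority.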
As expected from Figure 3 and Figure 8, the feature importance analysis highlighted the predictive role of CD3 and CD4 for the best performing model (see Figure 10). This is in line with PD1 being expressed mainly by follicular T cells.
The results of recursive feature elimination (RFE) are shown in Figure 10. Only 3 features are necessary to reach our classification score, with model performance plateauing when additional variables are included. This suggests that cellular phenotype can be reliably inferred from very few markers.
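Recursive feature elimination can be sketched as follows with Scikit-learn's `RFE` class. The data are simulated so that the label depends on only three of six markers; this dependency and the marker ordering are assumptions chosen to make the behavior visible, not a description of the study data.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import RFE

rng = np.random.default_rng(0)
n_cells = 600
markers = ["CD3", "CD4", "CD20", "Bcl6", "Ki67", "CD57"]

# Simulated standardized intensities (rows = cells, columns = markers)
X = rng.normal(size=(n_cells, len(markers)))

# Toy label: depends on CD3, CD4 and CD20 only, plus a little noise
score = X[:, 0] + X[:, 1] - X[:, 2]
y = (score + 0.3 * rng.normal(size=n_cells) > 0).astype(int)

# Repeatedly drop the least important feature until 3 remain
selector = RFE(GradientBoostingClassifier(random_state=0), n_features_to_select=3)
selector.fit(X, y)

selected = [m for m, keep in zip(markers, selector.support_) if keep]
print("Selected markers:", selected)
```

Repeating the fit for different values of `n_features_to_select` and recording a held-out score for each would give the kind of performance-versus-feature-count curve described above.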

4. Discussion

Given the complexity of the cellular and molecular network operating in situ, mIF supported by unsupervised computational analysis tools is of great importance for the delineation of the immune landscape and the possible impact of disease on specific components of this network. As a starting point for the establishment of relevant pipelines, we have focused our analysis on well characterized anatomical sites (follicles/germinal centers) and associated immune phenotypes (Tfh, B cells) using an 8-plex mIF assay. We have successfully applied an unsupervised clustering method for analyzing mIHC data in pathology. Our results indicate that a robust unsupervised clustering approach, built on open-source methods, yields cellular marker expression profiles consistent with known cell phenotypes. We found that, among the explored unsupervised clustering algorithms, K-means provides the most robust performance. The superior performance of K-means in the unsupervised clustering of cells based on mIF expression can be primarily attributed to its spherical cluster assumption and centroid-based approach. This methodology aligns remarkably well with the biological reality of cell populations expressing specific markers, which often exhibit a Gaussian distribution around a population mean. The identified cell clusters aligned well with known cell types in healthy germinal centers, including dark zone B cells, light zone B cells, and CD3/CD4 T cells. In addition to clustering analysis, we explored the ability of the generated set of imaging data to reliably predict the expression of a given marker. Machine learning models, particularly the Gradient Boosting Classifier and the Light Gradient Boosting Machine, showed high accuracy in predicting PD1 expression based on other markers, indicating consistent marker expression.
Expansion (use of more biomarkers) and validation of the predictive value of a data set for specific molecules and specific diseases would be highly supportive for diagnostic purposes, especially if the capacity to run high-dimensional staining panels is limited. While the presented results are promising, several limitations should be addressed in future research. The lack of a gold standard for cellular clustering markers makes the ultimate validation of a comparison between clustering algorithms challenging. The use of general-purpose clustering metrics, together with the investigation of the coherence of cell marker expression within cell populations through marker prediction, can overcome that limitation at least in part. However, more investigations using other metrics are needed to further assess the validity of those results. The suboptimal Silhouette profile of the K-means clustering algorithm could be attributed, at least in part, to noisy data and marker spillage between cells. We also observed a ‘co-expression’ of T cell markers (CD3, CD4) by CD20high B cells. Several mechanisms could contribute to this profile. The close proximity of T and B cells in the germinal center, especially in the LZ, could lead to signal spillover between the two cell types [15]. This could potentially be avoided by focusing on nuclear markers in further studies. Data acquisition at higher resolution (e.g. use of a 40X or 63X lens), application of algorithms such as REDSEA [15] and improvement of cell segmentation by using both cell membrane (e.g. use of anti-CD45RA) and nuclear staining could correct, at least in part, the detection of cell/cell conjugates. Interestingly, analysis of mIF data using the HistoCytometry platform provided similar profiles with respect to CD3/CD20 ‘conjugates’ and the co-expression among CD3, CD20, PD-1 and Ki-67, further validating our approach.
However, we should mention that part of the identified ‘conjugates’ may represent ongoing in vivo processes such as membrane ‘exchange’ between the two cell types, a process enhanced in SIV infection [34] and/or carrying of dead B blebs by Tfh cells [35]. Our CyTOF data, addressing relevant cell phenotypes in single cell suspension, suggest that part of these T/B cell ‘conjugates’ may represent stable Tfh/B cell interactions taking place in vivo [35], rather than an imaging artifact. Interestingly, the CD3highCD20highDNAhigh cells are characterized by high expression of both Tfh and B cell markers. Furthermore, the transcriptomic analysis of publicly available data further confirmed the presence of these CD3/CD20 ‘conjugates’. Of note, these conjugates express relatively high levels of Tfh cell related markers. Further, the CD3/CD20 ‘conjugates’ express considerable levels of adhesion molecules (although not significantly different, there is a higher average expression of ICAM1, CD40L, and CCL5 in the ‘real’ compared to the ‘artificial’ conjugates) that could mediate, at least in part, the formation of the ‘conjugates’. Our in situ imaging analysis and the cell suspension analysis call for further investigation of the possible role of these T/B cell conjugates as biomarkers for the pathogenesis of diseases like chronic HIV infection or SLE. We propose that a comprehensive analysis should take into account the quantification of these doublets together with the estimation of ‘single cell’ data after the application of algorithms like REDSEA. Finally, the lack of an external independent dataset to validate the predictive capacity of our model for PD-1 expression is an important limitation of this part of our analysis. This should be evaluated in further investigations to assess whether these results can be replicated. In conclusion, this study provides a robust framework for unsupervised analysis of mIHC data in pathology.
It demonstrates the potential of combining advanced imaging techniques with machine learning algorithms to enhance our understanding of complex tissue microenvironments. However, for such a method to be introduced in a diagnostic pipeline, the generalization of these results should be confirmed by analyzing relevant data from different laboratories/Institutes.
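To make the link between marker spillover and a suboptimal silhouette profile concrete, the following is a minimal sketch of the silhouette coefficient in dependency-free Python. It is an illustration only and not the pipeline used in this study: the six cells, their two marker intensities, and the two label vectors are hypothetical values chosen for the example.

```python
import math
from collections import defaultdict


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            # Convention: a singleton cluster contributes a score of 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical (CD3, CD20) intensities for six cells; not data from this study.
cells = [(0.9, 0.1), (0.8, 0.2), (0.85, 0.15), (0.1, 0.9), (0.2, 0.8), (0.15, 0.85)]
clean_labels = [0, 0, 0, 1, 1, 1]
# Same cells, but one T-cell-like profile is assigned to the B-cell cluster,
# mimicking how spillover can blur cluster assignment.
noisy_labels = [0, 0, 1, 1, 1, 1]

clean = silhouette_score(cells, clean_labels)
noisy = silhouette_score(cells, noisy_labels)
print(f"clean: {clean:.3f}, noisy: {noisy:.3f}")
```

A single misassigned cell is enough to lower the mean score, since that cell receives a negative coefficient. In practice a library routine such as scikit-learn's `silhouette_score` would normally be preferred over a hand-written loop.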

5. Conclusions

This study demonstrates the successful application of an unsupervised clustering method for analyzing mIHC data in pathology. We found that, among the explored unsupervised clustering algorithms, K-means provided the most robust performance. The identified cell clusters aligned well with known cell types in healthy germinal centers, including dark zone B cells, light zone B cells, and CD3+/CD4+ T cells. Supervised machine learning models, particularly boosting algorithms, showed high precision in predicting PD-1 expression based on other markers, indicating consistent marker expression. We should emphasize that this is a pilot study introducing alternative approaches for the analysis of imaging data. Given the importance of Tfh cells in chronic diseases like HIV, lymphomas and SLE, it is of great interest to develop computational tools that could improve their molecular characterization in order to better understand their role in disease pathogenesis as well as their in vivo manipulation in the context of immunotherapy. Our data suggest that a ‘limited’ mIF assay (like the presented 8-plex assay) can reliably predict the cell density of Tfh cells in relevant LN tissues. Further, the spatial positioning between germinal center B cells (proliferating or not) and Tfh cells could be analyzed by applying algorithms that calculate minimum distances among different cell types. Following such an initial screening, further in situ deep phenotypic characterization of Tfh cells using more biomarkers (e.g., CXCR5, ICOS, TIGIT) should be applied to samples of interest. Given the availability of tissue-derived cells, the imaging data could be coupled to the characterization of Tfh and B cells, and particularly the possible CD3/CD20 conjugates, at the protein (CyTOF or flow cytometry) and/or transcriptomic (scRNA) level. 
The comparison between the prevalence of Tfh and B cells obtained by the different platforms could also indicate whether or not the imaged tissue level is representative of the whole tissue with respect to these cell subsets (‘sampling error’). Our complementary approach for the characterization of CD3/CD20 ‘conjugates’, where many of the T cells are Tfh cells, could provide valuable information with respect to the biological significance of such conjugates, which likely reflect actual in situ cognate and non-cognate cellular interactions. While these results are promising, several limitations should be addressed in future research. The lack of a gold standard for cellular clustering makes it challenging to definitively validate the results. The generalizability of the study is limited by the use of samples from a single institution. Suboptimal silhouette profiling suggests room for improvement in clustering accuracy, possibly through enhanced imaging techniques and marker expression analysis. In conclusion, this study provides a robust framework for unsupervised analysis of mIHC data in pathology. It demonstrates the potential of combining advanced imaging techniques with machine learning algorithms to enhance our understanding of complex tissue microenvironments. Future work should focus on validating these methods across multiple institutions and improving clustering accuracy to further advance the field of digital pathology.
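The minimum-distance analysis mentioned above can be sketched as follows. This is a minimal, brute-force illustration under stated assumptions: the function name `min_distances` and the centroid coordinates are hypothetical, and the study does not specify which algorithm would be used.

```python
import math
import statistics


def min_distances(source_cells, target_cells):
    """For each source cell, return the distance to its nearest target cell."""
    if not target_cells:
        raise ValueError("target_cells must not be empty")
    return [
        min(math.dist(src, tgt) for tgt in target_cells)
        for src in source_cells
    ]


# Hypothetical cell centroids (x, y) in micrometres; not data from this study.
tfh_cells = [(10.0, 10.0), (40.0, 25.0), (70.0, 60.0)]
gc_b_cells = [(13.0, 14.0), (38.0, 25.0), (100.0, 100.0)]

distances = min_distances(tfh_cells, gc_b_cells)
print("nearest GC B cell per Tfh cell (um):", [round(d, 2) for d in distances])
print("median minimum distance (um):", round(statistics.median(distances), 2))
```

The brute-force comparison scales with the product of the two population sizes. For whole-follicle data with thousands of cells, a spatial index such as a k-d tree (for example `scipy.spatial.cKDTree`) would likely be a better fit.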

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/biology14050530/s1, Figure S1: HistoCytometry gating scheme for the identification of relevant cell types in the second tonsil imaged with the Polaris mIF platform.

Author Contributions

S.B. analyzed mIF data, designed and performed clustering analysis and drafted the manuscript; S.G., M.O. and C.B. performed mIF experiments, analyzed and interpreted data; C.B. edited the manuscript; H.L. performed scRNA data analysis; C.F. performed CyTOF experiments and data analysis; G.P. interpreted CyTOF data and revised the manuscript; R.G. supervised the snRNA analysis and revised the manuscript; C.P. designed and supervised the study and revised the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Swiss National Science Foundation (SNF, 310030_204226) and NIAID (UM1 AI164561 and 1R01AI179476-01 subcontracts) to C.P. and by the Institute of Pathology, Department of Laboratory Medicine and Pathology, Lausanne University Hospital and Lausanne University, Lausanne, Switzerland.

Institutional Review Board Statement

All procedures were performed in accordance with the Declaration of Helsinki and approved by the appropriate Institutional Review Board/Ethical Committee. The use of these samples was formally sanctioned by the Canton de Vaud-CER-VD, Switzerland for LNs (2021-01161) and tonsils (PB_2016-02436 (201/11)). Written informed consent was obtained from all participants.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Code and preprocessed data used for this study can be found in our GitHub repository: https://github.com/LTI-CHUV/clustering_MIF (accessed on 24 February 2025).

Acknowledgments

The authors would like to thank Laurence de Leval for assisting with control lymph nodes and Natalie Piazzon (Tissue Biobank), Damien Maison and Emilie Lingre, Institute of Pathology, CHUV, for their help with tissue processing.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
mIF | multiplex immunofluorescence
IHC | immunohistochemistry
LN | lymph node
TS | Tonsil
GC | germinal center
LZ | Light zone
DZ | Dark zone
Tfh | T-follicular helper cells

Appendix A

Table A1. Demographic data for the used human lymph nodes.
Sample | Patient Age | Patient Gender | Localization
LN1 | 28 | Female | inguinal
LN2 | 22 | Female | cervical

Appendix B

Table A2. Multiplex immunofluorescence assays used.
Specificity | Clone | Cat No | Conjugated/Unconjugated | Fluorophore | Cycle
Cycling confocal mIF (8-plex) 2 cycles
CD3 | OTI3E10 (IgG2b) | TA506064 | Unconjugated | Alexa-546 (Secondary) | 1
CD4 | Polyclonal | FAB8165N-100 | Conjugated | Alexa-700 | 1
CD20 | L26 |  | Conjugated | eF615 | 1
Ki67 | B56 | 561281 | Conjugated | V450 | 1
Bcl6 | GI191E/A8 | 760-4241 | Unconjugated | Opal-690 (Secondary) | 2
PD-1 | Polyclonal | FAB7115G | Conjugated | Alexa-488 | 1
CD57 | QA17A04 | 393326 | Conjugated | BV421 | 1, 2
Mouse IgG2b | Polyclonal | A-21143 | Conjugated | Alexa-546 | 1
Mouse IgG (OmniMap) | Polyclonal | 760-4310 | Conjugated | (HRP substrate: Opal-690) | 2
SYTO45 |  | 10297192 |  |  | 1, 2
Polaris mIF (5-plex) Order 1-cycle
PD-1 | NAT105 | ACI 3137 AK, CK | Unconjugated | Opal 620 | 1
CD3 | OTI3E10 | TA506064 | Unconjugated | Opal 570 | 2
CD20 | L26 | NCL-L-CD20-L26 | Unconjugated | Opal 520 | 3
Ki67 | MIB-1 | M7240 | Unconjugated | Opal 480 | 4
DAPI |  |  |  |  | 5
Table A3. Mass Cytometry (CyTOF) phenotypic characterization of tonsillar-derived cells. Information for the antibodies used is shown.
Antibody | Metal | Clone | Company | Cat. N. | Amount
Cell Viability | 103Rh | Cell-ID | Standard Biotools | 201103A | 0.6 μL
CD8 | 1123ln | RPA-T8 | Biolegend | 301018 | 0.6 μL
CD4 | 115ln | RPA-T4 | Biolegend | 300515 | 0.6 μL
CCR6 | 141Pr | 11A49 | Standard Biotools | 3141014A | 1 μL
CD19 | 142Nd | HIB19 | Standard Biotools | 3142001B | 1.6 μL
CCR6 | 144Nd | G034E3 | Standard Biotools | 3141003A | 1.88 μL
CD20 | 147Sm | 2H7 | Standard Biotools | 3147001B | 0.67 μL
ICOS | 148Nd | C398.4A | Standard Biotools | 3148019B | 1 μL
CCR4 | 149Sm | L291H4 | Standard Biotools | 3149029A | 1.5 μL
CD40L | 152Sm | TRAP1 | BD Biosciences | 555698 | 3 μL
TIGIT | 153Eu | MBSA43 | Standard Biotools | 3153019B | 2 μL
CD3 | 154Sm | UCHT1 | Standard Biotools | 3154003B | 0.5 μL
CD27 | 155Gd | L128 | Standard Biotools | 3155001B | 1 μL
CXCR3 | 156Gd | G025H7 | Standard Biotools | 3156004B | 0.8 μL
CCR7 | 159Td | G043H7 | Standard Biotools | 3159003B | 1 μL
CD30 | 162Dy | BY88 | Biolegend | 333902 | 3 μL
CXCR5 | 164Dy | RF8B2 | Standard Biotools | 3164029B | 1.2 μL
CD45RO | 165Ho | UCHL1 | Standard Biotools | 3165011B | 0.7 μL
CD38 | 167Er | HIT2 | Standard Biotools | 3167001B | 0.6 μL
CD45RA | 170Er | HI100 | Standard Biotools | 3171001B | 1.6 μL
HLA-DR | 174Yb | L243 | Standard Biotools | 3174001B | 0.7 μL
CD279/PD-1 | 175Lu | EH12.2H7 | Standard Biotools | 3175008B | 1 μL
CD127 | 176Yb | A019D5 | Standard Biotools | 3176004B | 0.7 μL
CTLA4 | 174Pt | Ipilimumab | BMS | ACB3065 | 2.5 μL
CD57 | 198Pt | NK-1 | BD Biosciences | 555618 | 1 μL
CD16 | 209Bi | 3G8 | Standard Biotools | 3209002B | 1.5 μL
Anti-PE | 145Nd | PE001 | Standard Biotools | 3145006B | 2 μL
GATA-3 | 146Nd | TWAJ | ThermoFisher | 14-9966-82 | 1 μL
Tbet | 161Dy | 4B10 | Standard Biotools | 3161015B | 2 μL
Bcl6 | 163Dy | K112-91 | Standard Biotools | 3163012B | 2 μL
Ki67 | 168Tm | Ki-67 | Standard Biotools | 3168001B | 1.4 μL
Blimp1 | 169Tm | 646702 | Bio-techne | MAB36081 | 3 μL
Granzyme B | 171Yb | GB11 | Standard Biotools | 3171002B | 2 μL
Bcl2 | 173Yb | 100 | Biolegend | 658702 | 2 μL

References

  1. Tan, W.C.C.; Nerurkar, S.N.; Cai, H.Y.; Ng, H.H.M.; Wu, D.; Wee, Y.T.F.; Lim, J.C.T.; Yeong, J.; Lim, T.K.H. Overview of multiplex immunohistochemistry/immunofluorescence techniques in the era of cancer immunotherapy. Cancer Commun. 2020, 40, 135–153.
  2. Kuswanto, W.; Nolan, G.; Lu, G. Highly multiplexed spatial profiling with CODEX: Bioinformatic analysis and application in human disease. Semin. Immunopathol. 2022, 45, 145–157.
  3. Harms, P.W.; Frankel, T.L.; Moutafi, M.; Rao, A.; Rimm, D.L.; Taube, J.M.; Thomas, D.; Chan, M.P.; Pantanowitz, L. Multiplex Immunohistochemistry and Immunofluorescence: A Practical Update for Pathologists. Mod. Pathol. 2023, 36, 100197.
  4. Hickey, J.W.; Tan, Y.; Nolan, G.P.; Goltsev, Y. Strategies for Accurate Cell Type Identification in CODEX Multiplexed Imaging Data. Front. Immunol. 2021, 12, 727626.
  5. Lacinski, R.A.; Dziadowicz, S.A.; Melemai, V.K.; Fitzpatrick, B.; Pisquiy, J.J.; Heim, T.; Lohse, I.; Schoedel, K.E.; Llosa, N.J.; Weiss, K.R.; et al. Spatial multiplexed immunofluorescence analysis reveals coordinated cellular networks associated with overall survival in metastatic osteosarcoma. Bone Res. 2024, 12, 55.
  6. Cyster, J.G.; Allen, C.D.C. B Cell Responses: Cell Interaction Dynamics and Decisions. Cell 2019, 177, 524–540.
  7. Choi, J.; Crotty, S.; Choi, Y.S. Cytokines in Follicular Helper T Cell Biology in Physiologic and Pathologic Conditions. Immune Network 2024, 24, e8.
  8. Crotty, S. T Follicular Helper Cell Biology: A Decade of Discovery and Diseases. Immunity 2019, 50, 1132–1148.
  9. Küppers, R. Mechanisms of B-cell lymphoma pathogenesis. Nat. Rev. Cancer 2005, 5, 251–262.
  10. Estes, J.D.; Kityo, C.; Ssali, F.; Swainson, L.; Makamdop, K.N.; Prete, G.Q.D.; Deeks, S.G.; Luciw, P.A.; Chipman, J.G.; Beilman, G.J.; et al. Defining total-body AIDS-virus burden with implications for curative strategies. Nat. Med. 2017, 23, 1271–1276.
  11. Radtke, A.J.; Kandov, E.; Lowekamp, B.; Speranza, E.; Chu, C.J.; Gola, A.; Thakur, N.; Shih, R.; Yao, L.; Yaniv, Z.R.; et al. IBEX: A versatile multiplex optical imaging approach for deep phenotyping and spatial analysis of cells in complex tissues. Proc. Natl. Acad. Sci. USA 2020, 117, 33455–33465.
  12. Georgakis, S.; Ioannidou, K.; Mora, B.B.; Orfanakis, M.; Brenna, C.; Muller, Y.D.; Estrada, P.M.D.R.; Sharma, A.A.; Pantaleo, G.; de Leval, L.; et al. Cellular and molecular determinants mediating the dysregulated germinal center immune dynamics in systemic lupus erythematosus. Front. Immunol. 2025, 16, 1530327.
  13. Moysi, E.; Estrada, P.M.D.R.; Torres-Ruiz, F.; Reyes-Terán, G.; Koup, R.A.; Petrovas, C. In Situ Characterization of Human Lymphoid Tissue Immune Cells by Multispectral Confocal Imaging and Quantitative Image Analysis; Implications for HIV Reservoir Characterization. Front. Immunol. 2021, 12, 683396.
  14. Gerner, M.Y.; Kastenmuller, W.; Ifrim, I.; Kabat, J.; Germain, R.N. Histo-cytometry: A method for highly multiplex quantitative tissue imaging analysis applied to dendritic cell subset microanatomy in lymph nodes. Immunity 2012, 37, 364–376.
  15. Bai, Y.; Zhu, B.; Rovira-Clave, X.; Chen, H.; Markovic, M.; Chan, C.N.; Su, T.H.; McIlwain, D.R.; Estes, J.D.; Keren, L.; et al. Adjacent Cell Marker Lateral Spillover Compensation and Reinforcement for Multiplexed Images. Front. Immunol. 2021, 12, 652631.
  16. Berg, S.; Kutra, D.; Kroeger, T.; Straehle, C.N.; Kausler, B.X.; Haubold, C.; Schiegg, M.; Ales, J.; Beier, T.; Rudy, M.; et al. ilastik: Interactive machine learning for (bio)image analysis. Nat. Methods 2019, 16, 1226–1232.
  17. Massoni-Badosa, R.; Aguilar-Fernández, S.; Nieto, J.C.; Soler-Vila, P.; Elosua-Bayes, M.; Marchese, D.; Kulis, M.; Vilas-Zornoza, A.; Bühler, M.M.; Rashmi, S.; et al. An atlas of cells in the human tonsil. Immunity 2024, 57, 379–399.
  18. Germain, P.L.; Lun, A.; Meixide, C.G.; Macnair, W.; Robinson, M.D. Doublet identification in single-cell sequencing data using scDblFinder. F1000Research 2022, 10, 979.
  19. Hao, Y.; Stuart, T.; Kowalski, M.H.; Choudhary, S.; Hoffman, P.; Hartman, A.; Srivastava, A.; Molla, G.; Madad, S.; Fernandez-Granda, C.; et al. Dictionary learning for integrative, multimodal and scalable single-cell analysis. Nat. Biotechnol. 2024, 42, 293–304.
  20. Harris, C.R.; Millman, K.J.; van der Walt, S.J.; Gommers, R.; Virtanen, P.; Cournapeau, D.; Wieser, E.; Taylor, J.; Berg, S.; Smith, N.J.; et al. Array programming with NumPy. Nature 2020, 585, 357–362.
  21. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
  22. Ali, M. PyCaret: An Open Source, Low-Code Machine Learning Library in Python; Version 1.0.0; PyCaret: Online, 2020; Available online: https://www.pycaret.org (accessed on 2 May 2025).
  23. Hunter, J.D. Matplotlib: A 2D graphics environment. Comput. Sci. Eng. 2007, 9, 90–95.
  24. Waskom, M.L. seaborn: Statistical data visualization. J. Open Source Softw. 2021, 6, 3021.
  25. Satyanarayan, A.; Moritz, D.; Wongsuphasawat, K.; Heer, J. Vega-Lite: A Grammar of Interactive Graphics. IEEE Trans. Vis. Comput. Graph. 2017, 23, 341–350.
  26. Lundberg, S.M.; Lee, S.I. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems 30 (NeurIPS 2017); Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2017; pp. 4765–4774.
  27. Orfanakis, M.; Molyvdas, A.; Petrovas, C. In Situ Characterization of Human Follicular Helper CD4 T Cells. In Intracellular Pathogens: Methods and Protocols; Springer: Berlin/Heidelberg, Germany, 2024; pp. 281–293.
  28. Kamphorst, A.O.; Ahmed, R. Manipulating the PD-1 pathway to improve immunity. Curr. Opin. Immunol. 2013, 25, 381–388.
  29. Yu, D.; Vinuesa, C.G. The elusive identity of T follicular helper cells. Trends Immunol. 2010, 31, 377–383.
  30. Mesin, L.; Ersching, J.; Victora, G.D. Germinal Center B Cell Dynamics. Immunity 2016, 45, 471–482.
  31. Li, W.; Germain, R.N.; Gerner, M.Y. High-dimensional cell-level analysis of tissues with Ce3D multiplex volume imaging. Nat. Protoc. 2019, 14, 1708–1733.
  32. Wang, V.G.; Liu, Z.; Martinek, J.; Foroughi Pour, A.; Zhou, J.; Boruchov, H.; Ray, K.; Palucka, K.; Chuang, J.H. Computational immune synapse analysis reveals T-cell interactions in distinct tumor microenvironments. Commun. Biol. 2024, 7, 1201.
  33. Padhan, K.; Moysi, E.; Noto, A.; Chassiakos, A.; Ghneim, K.; Perra, M.M.; Shah, S.; Papaioannou, V.; Fabozzi, G.; Ambrozak, D.R.; et al. Acquisition of optimal TFH cell function is defined by specific molecular, positional, and TCR dynamic signatures. Proc. Natl. Acad. Sci. USA 2021, 118, e2016855118.
  34. Samer, S.; Chowdhury, A.; Salinas, T.R.W.; Estrada, P.M.D.R.; Reuter, M.; Tharp, G.; Bosinger, S.; Cervasi, B.; Auger, J.; Gill, K.; et al. Lymph-Node-Based CD3+ CD20+ Cells Emerge from Membrane Exchange between T Follicular Helper Cells and B Cells and Increase Their Frequency following Simian Immunodeficiency Virus Infection. J. Virol. 2023, 97, e0176022.
  35. Allen, C.D.C.; Okada, T.; Tang, H.L.; Cyster, J.G. Imaging of Germinal Center Selection Events during Affinity Maturation. Science 2007, 315, 528–531.
Figure 1. Compartmentalization of immune cell types in tonsillar follicular areas. (a) Representative confocal immunofluorescence images (mIFs) showing the expression of CD20 (cyan), a B cell marker, CD3 (red) and CD4 (green), T cell markers and a merged CD3/CD4 image, centered in a F/GC area. (b) Representative mIF images showing the expression of Bcl6 (magenta), Ki67 (blue), PD1 (yellow), as well as merged images (Bcl6/Ki67, Ki67/PD-1) in a human tonsillar follicular area.
Figure 2. Representative confocal images showing the expression of CD3 (red), CD20 (green), and nuclear staining (blue) in a tonsillar follicle (scale bar: 70 μm) (left panel) as well as zoomed areas (scale bars: 10 μm and 5 μm, middle panel). The interconnected/fused membranes between cells are also shown (right panels). The white arrow points to a CD3/CD20 ‘conjugate’ in a zoomed area (white circle). Imaris modules were used for the generation of ‘isothermic’ surfaces for the visualization of the CD3 (red)/CD20 (green) membranes.
Figure 3. (a) Calculated correlations between markers used for the characterization of tonsillar follicular/GC immune cell types. Correlation is represented by Pearson’s R. (b) Clustering scores for tonsil germinal center cell populations. Silhouette and Calinski-Harabasz: higher values indicate a better clustering result. Davies-Bouldin: lower values indicate a better clustering result. Bars indicate the standard deviation between GC areas.
Figure 4. Identification of immune cell clusters in tonsillar follicular areas (cycling mIF). (a) Distinct follicular/GC areas in tonsillar follicles can be identified based on the expression of CD20 (red) and Ki67 (green). Areas enriched in Ki67-positive CD20 cells represent the Dark Zone (DZ), while the Light Zone (LZ) area is less populated by Ki67-positive B cells. The Mantle Zone is characterized by the expression of CD20 and the absence of proliferating, Ki67-positive cells. Data from two tonsils (upper panel) are shown. Two zoomed follicular areas (white circles) are shown in the lower panel. (b) F/GCs from two tonsils were used for our analysis. The digital representation of identified clusters, using the K-means algorithm, is shown for both tonsils, as well as the relative expression of individual analyzed markers in each immune cell cluster.
Figure 5. Identification of immune cell clusters in tonsillar follicular areas (Polaris mIF). (a) Representative images showing the expression of CD3 (red), CD20 (cyan), Ki67 (blue) and PD-1 (yellow) in tonsillar follicles (scale bar: 200 μm). A zoomed follicular area (white circle, scale bar 50) is also shown in the right panels. (b) A 2D plot representation of CD3 and CD20 expression in one tonsillar follicular area. (c) The K-means algorithm was applied and the relative expression of individual analyzed markers in each immune cell cluster is shown.
Figure 6. F/GCs represent areas with high densities of Tfh and B cells in close proximity. (a) mIF imaging data were analyzed by the HistoCytometry pipeline. The gating scheme for the identification of relevant immune cell subsets is shown. (b) CyTOF generated 2D plots showing the identified CD19 and CD3 cell subsets in a tonsillar single cell suspension (upper panel). Cell suspensions from four tonsils were used. (c) Clustering analysis showing the expression of individual molecules in CD19high (B cells), CD3highCD19low T cells as well as CD3highCD19high T cells with high or dim/low DNA content (lower panel). A color bar index is also shown.
Figure 7. Transcriptomic characterization of CD3/CD20 ‘conjugates’. (a) accumulated data showing the relative frequency of CD20highCD3low (white), CD20lowCD3high (light gray) and CD20highCD3high (gray) in tonsils (n = 9). (b) the gene expression of Tfh cell biomarkers (PD-1, ICOS, TIGIT, TOX2) in the two types of CD3/CD20 ‘conjugates’ is shown. (c) dot plot showing the gene expression of leukocyte cell adhesion molecules in the two types of ‘conjugates’. (d) Flow cytometry 2D plots showing the co-expression of CD20 and CD3 proteins in one tonsillar (upper panel) and one LN (lower panel) follicular area before (left) and after (right) the application of the REDSEA algorithm.
Figure 8. (a) Calculated correlations between markers used to characterize lymph node follicular/GC immune cell types using Pearson’s R. (b) Clustering scores for lymph node germinal center cell populations. Silhouette and Calinski-Harabasz: higher values indicate a better clustering result. Davies-Bouldin: lower values indicate a better clustering result.
Figure 9. Results for follicular area in lymph nodes. (a) Example of expression of CD3, CD20, Bcl6 and Ki67, (b) Identification of immune cell clusters in lymph node follicular areas. Follicles/GCs from two reactive lymph nodes were used for our analysis. Coordinates-based representation of cell clusters produced with the Kmeans algorithm for both lymph nodes (left). Relative expression of individual analyzed markers in each immune cell cluster.
Figure 10. Model statistics: (a) AUC curves for the Gradient Boosting Classifier (best-performing model), with class 0.0 = PD-1-negative cells and class 1.0 = PD-1-positive cells. (b) Feature importance for the best-performing model. (c) RFE for the same model with 95% confidence interval.
Table 1. Cell population clusters characterized by tonsil sample.
Sample | Num. GC 1 | Clusters | Number of Cells
TS1 | 6 | Ki67highCD20+ | 3841
TS1 | 6 | CD20high | 3662
TS1 | 6 | CD3high | 2239
TS1 | 6 | Other | 1756
TS2 | 5 | CD20high | 13,419
TS2 | 5 | Other | 10,244
TS2 | 5 | Ki67highCD20+ | 6831
TS2 | 5 | CD3high | 6545
1 Number of GC used for analysis.
Table 2. Cell population clusters characterized by lymph node sample.
Sample | Num. GC 1 | Clusters | Number of Cells
LN1 | 4 | CD20high | 2260
LN1 | 4 | Ki67highCD20+ | 1943
LN1 | 4 | CD3high | 1347
LN2 | 4 | CD20high | 2395
LN2 | 4 | Other | 1660
LN2 | 4 | CD3high | 1519
LN2 | 4 | Ki67highCD20+ | 1153
1 Number of GC used for analysis.
Table 3. Performance of the ten best-performing models in classifying PD-1 expression using lymph node and tonsil data.
Model | Accuracy | AUC | Recall | Prec.
Gradient Boosting Classifier | 0.9269 | 0.9467 | 0.5297 | 0.6873
Light Gradient Boosting Machine | 0.9263 | 0.9456 | 0.5251 | 0.6843
Ada Boost Classifier | 0.9242 | 0.9425 | 0.5070 | 0.6787
Extreme Gradient Boosting | 0.9236 | 0.9410 | 0.5157 | 0.6622
Extra Trees Classifier | 0.9264 | 0.9405 | 0.4923 | 0.6991
Ridge Classifier | 0.9208 | 0.9403 | 0.3400 | 0.7293
Linear Discriminant Analysis | 0.9224 | 0.9403 | 0.5828 | 0.6307
SVM-Linear Kernel | 0.9218 | 0.9395 | 0.4521 | 0.6958
Logistic Regression | 0.9244 | 0.9394 | 0.4816 | 0.6825
Random Forest Classifier | 0.9258 | 0.9390 | 0.4999 | 0.6919
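The accuracy, recall and precision columns of Table 3 can be read more easily with a small worked example. The sketch below computes the three metrics from binary labels. The function name and the 100 hypothetical cells are assumptions for illustration and are not taken from the study data. They were chosen to show that accuracy can stay high while recall is modest when the positive class is rare; whether class imbalance explains the values in Table 3 is not stated in the source.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, recall and precision for a binary classifier (positive class = 1)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Hypothetical labels for 100 cells: 10 PD-1 positive (1), 90 PD-1 negative (0).
y_true = [1] * 10 + [0] * 90
# Hypothetical predictions: 5 of the 10 positives are found, 2 negatives are miscalled.
y_pred = [1] * 5 + [0] * 5 + [1] * 2 + [0] * 88

metrics = classification_metrics(y_true, y_pred)
print({name: round(value, 3) for name, value in metrics.items()})
```

In this toy case only half of the positive cells are recovered, yet 93 of 100 cells are classified correctly because negatives dominate. For this reason recall and precision are worth reading alongside accuracy and AUC.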
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Burgermeister, S.; Orfanakis, M.; Georgakis, S.; Brenna, C.; Lindsay, H.; Fenwick, C.; Pantaleo, G.; Gottardo, R.; Petrovas, C. Unsupervised Clustering of Cell Populations in Germinal Centers Using Multiplexed Immunofluorescence. Biology 2025, 14, 530. https://doi.org/10.3390/biology14050530
