# Biomarkers of Tumor Heterogeneity in Glioblastoma Multiforme Cohort of TCGA

## Abstract

## Simple Summary

## 1. Introduction

## 2. Methods

#### 2.1. Preprocessing the WSIs

#### 2.2. Nuclear Segmentation

#### 2.3. Image Normalization

#### 2.4. Computation of the Morphometric Indices

#### 2.5. Association of the Morphometric Indices with Survival

#### 2.5.1. Representation

#### 2.5.2. Distance Metrics and Clustering

#### 2.5.3. Statistical Analysis of Morphometric Indices for Biomarker Validation

#### 2.5.4. Computing Resources

## 3. Results

#### 3.1. Biomarker Discovery

#### 3.1.1. Biomarkers of Nuclear Morphometric Indices

#### 3.1.2. Biomarkers of Morphometric Indices Preconditioned on Genomics Signature

## 4. Discussion

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

**Figure 1.** Each WSI is represented in the context of tumor heterogeneity for biomarker discovery: (**a**) a WSI is partitioned into patches of 224-by-224 pixels, and each patch is screened for pen marks or other aberrations; (**b**) nuclei are segmented in each patch; (**c**) H&E optical density is normalized in each patch; (**d**) nuclear organization is quantified in each patch; (**e**,**f**) indices computed from the nuclei and their organization are used for the dictionary- and PDF-based representations; (**g**) morphometric indices predictive of survival are identified.
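The tiling step in panel (a) can be sketched as follows. This is a minimal illustration, assuming non-overlapping 224-by-224 patches and that partial patches at the right and bottom edges are discarded; neither choice is stated in the caption, and the function name `patch_origins` is hypothetical. Screening for pen marks is not shown.

```python
from typing import Iterator, Tuple

PATCH = 224  # patch edge length in pixels, as given in the caption


def patch_origins(width: int, height: int, size: int = PATCH) -> Iterator[Tuple[int, int]]:
    """Yield (x, y) top-left corners of non-overlapping size-by-size patches.

    Assumption: partial patches at the right and bottom edges are dropped.
    """
    for y in range(0, height - size + 1, size):
        for x in range(0, width - size + 1, size):
            yield (x, y)


# A hypothetical 1000-by-500 pixel region holds 4 columns and 2 rows of patches.
origins = list(patch_origins(1000, 500))
print(len(origins))  # 8
```

Each origin would then be used to crop one patch for the per-patch steps in panels (b) through (d).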

**Figure 2.** H&E staining is heterogeneous between patients. Two patches from two WSIs show distinct staining signatures. They are normalized for quantifying HOD and visualized in RGB space.
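The caption does not describe the normalization procedure itself. As background only, the sketch below shows the standard Beer-Lambert conversion from pixel intensity to optical density, OD = -log10(I / I0), on which optical-density measurements are generally based. The 8-bit RGB input, the white background value of 255, and the function name are assumptions made for illustration.

```python
import math
from typing import Sequence, Tuple


def rgb_to_optical_density(rgb: Sequence[int], background: int = 255) -> Tuple[float, ...]:
    """Convert one 8-bit RGB pixel to per-channel optical density.

    OD = -log10(I / I0). Intensities are clamped to at least 1 so that a
    fully black channel does not produce log(0).
    """
    return tuple(-math.log10(max(channel, 1) / background) for channel in rgb)


white = rgb_to_optical_density((255, 255, 255))  # no absorption, OD is zero
dark = rgb_to_optical_density((64, 32, 128))     # darker channels give higher OD
print(white, dark)
```

Optical density is additive in stain concentration, which is why stain quantification is usually carried out in this space rather than on raw RGB values.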

**Figure 3.** Dictionary-based learning identified two and three subpopulations (i.e., clusters) of patients based on the cellularity and eccentricity indices, respectively. Top row: computed similarity matrices. Middle row: the cumulative distribution function (CDF) of the similarity matrices indicates the quality of the chosen number of clusters for each index (e.g., a flat horizontal line indicates a low number of misclassified samples between clusters). Bottom row: silhouette plots of 800,000 randomly sampled nuclei show the similarity of patients within a cluster (e.g., a silhouette score less than 1); the red dashed line indicates the average silhouette score.
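For readers unfamiliar with the silhouette score shown in the bottom row, the sketch below computes it for one-dimensional values with absolute difference as the distance. It is an illustration of the standard definition, not the authors' code; the toy values, the labels, and the function name are assumptions.

```python
from typing import Dict, List, Sequence


def silhouette_scores(values: Sequence[float], labels: Sequence[int]) -> List[float]:
    """Per-sample silhouette score s = (b - a) / max(a, b).

    a: mean distance to the other members of the sample's own cluster.
    b: smallest mean distance to the members of any other cluster.
    Samples in singleton clusters, or when only one cluster exists, score 0.
    """
    clusters: Dict[int, List[int]] = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    scores: List[float] = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1 or len(clusters) < 2:
            scores.append(0.0)
            continue
        a = sum(abs(values[i] - values[j]) for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(abs(values[i] - values[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return scores


# Two well-separated toy clusters give scores close to, but below, 1.
scores = silhouette_scores([0.0, 1.0, 10.0, 11.0], [0, 0, 1, 1])
average = sum(scores) / len(scores)  # analogous to the red dashed line
print(scores, average)
```

Scores near 1 indicate tight, well-separated clusters; scores near 0 or below indicate samples that sit between clusters.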

**Figure 4.** Representative patches showing low, medium, and high eccentricities corresponding to clusters 1, 2, and 3 from the dictionary-based method.

**Figure 5.** Representative patches showing low and high cellularities corresponding to clusters 1 and 2 from the dictionary-based method.

**Figure 6.** Steps in the dictionary-based method for representing heterogeneity: (**a**) each WSI is partitioned into patches; (**b**) each patch is quantified in terms of nuclear indices and organization; (**c**) each computed index (e.g., HOD content, nuclear size) is aggregated across the entire cohort for dictionary-based learning of alphabets (four in this example); and (**d**) each WSI is then represented as a composition of the learned alphabets.
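Steps (c) and (d) can be sketched as follows. The caption does not name the dictionary-learning algorithm, so this example substitutes a plain one-dimensional k-means (Lloyd's algorithm) as a stand-in; the function names, the quantile initialization, and the toy values are all assumptions, and two alphabets are used instead of four to keep the example small.

```python
from typing import List, Sequence


def learn_alphabets(values: Sequence[float], k: int, iters: int = 50) -> List[float]:
    """Learn k alphabet centers from pooled cohort values with 1-D k-means.

    Assumption: k-means stands in for the unspecified dictionary learner.
    Centers are initialized at evenly spaced quantiles for determinism.
    """
    ordered = sorted(values)
    centers = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    for _ in range(iters):
        groups: List[List[float]] = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        updated = [sum(g) / len(g) if g else centers[j] for j, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return centers


def encode_slide(patch_values: Sequence[float], centers: Sequence[float]) -> List[float]:
    """Represent one slide as the fraction of its patches in each alphabet."""
    if not patch_values:
        raise ValueError("a slide needs at least one patch value")
    counts = [0] * len(centers)
    for v in patch_values:
        counts[min(range(len(centers)), key=lambda j: abs(v - centers[j]))] += 1
    return [c / len(patch_values) for c in counts]


# Hypothetical per-patch index values for two slides.
slide_a = [1.0, 1.2, 0.8, 5.0]
slide_b = [5.1, 4.9, 5.0, 1.0]
alphabets = learn_alphabets(slide_a + slide_b, k=2)
print(encode_slide(slide_a, alphabets), encode_slide(slide_b, alphabets))
```

The resulting composition vectors, one per slide, are the kind of representation that a subsequent clustering step could compare across patients.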

**Figure 7.** Optimal transport identifies subpopulations of patients, based on the PDF representation, for survival analysis. Top row: similarity matrices identified by linkage analysis. Bottom row: Kaplan–Meier plots, hazard ratios, and computed p-values for three computed morphometric indices: nuclear size, solidity, and total chromatin.
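The caption does not give the optimal-transport formulation. As a hedged illustration, the sketch below uses the one-dimensional case, where the Wasserstein-1 distance between two histograms on the same equally spaced bins reduces to the summed absolute difference of their cumulative distributions. The function name, the shared-bin assumption, and the toy histograms are illustrative only.

```python
from typing import List, Sequence


def wasserstein_1d(p: Sequence[float], q: Sequence[float], bin_width: float = 1.0) -> float:
    """Wasserstein-1 distance between two histograms on identical bins.

    Both histograms are normalized to sum to 1, then the absolute gap
    between their running CDFs is accumulated bin by bin.
    """
    if len(p) != len(q):
        raise ValueError("histograms must share the same bins")
    sum_p, sum_q = sum(p), sum(q)
    cdf_gap = 0.0
    total = 0.0
    for pi, qi in zip(p, q):
        cdf_gap += pi / sum_p - qi / sum_q
        total += abs(cdf_gap)
    return total * bin_width


# Hypothetical per-slide histograms of one morphometric index.
slides = [[1, 0, 0], [0.5, 0.5, 0], [0, 0, 1]]
distances: List[List[float]] = [[wasserstein_1d(a, b) for b in slides] for a in slides]
print(distances)
```

A pairwise distance matrix of this kind is the usual input to hierarchical (linkage-based) clustering, after which survival can be compared between the resulting patient groups.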

**Figure 8.** The forest plot indicates biomarkers associated with the subpopulation at risk using the PDF-based representation without any genomic preconditioning. The asterisks \*\*, \*\*\*, and \*\*\*\* denote the number of stratifications per morphometric index.

**Figure 9.** Using the PDF method, preconditioned on the classical subtype, the forest plot indicates the subpopulation at risk. The asterisks \*\*, \*\*\*, and \*\*\*\* denote the number of stratifications per morphometric index.

**Figure 10.** Using the PDF method, preconditioned on high EGFR expression, the forest plot indicates the subpopulation at risk. For example, Area cluster two has a 52% decreased risk of death compared to Area cluster zero. The asterisks \*\*\*\* denote the number of stratifications per morphometric index.
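The percentage quoted in the caption can be related to a hazard ratio as follows. This is a worked reading that assumes the percentage is derived from the hazard ratio in the usual way; the value 0.48 is inferred from the quoted 52% and is not stated in the caption.

```latex
\mathrm{HR} = \frac{h_{\text{cluster 2}}(t)}{h_{\text{cluster 0}}(t)},
\qquad
\text{change in risk} = (\mathrm{HR} - 1) \times 100\%
% Example: a 52% decreased risk means (\mathrm{HR} - 1) \times 100\% = -52\%,
% which corresponds to \mathrm{HR} = 0.48 relative to the reference cluster.
```

A hazard ratio below 1 therefore marks a cluster with lower risk than the reference cluster, and a ratio above 1 marks a higher-risk cluster.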

Morphometric Index | Number of Clusters | p-Value
---|---|---
(a) PDF model | |
Area | 2 | 0.026
Area | 3 | 0.016
Area | 4 | 0.013
Mean HOD | 3 | 0.016
Mean HOD | 4 | 0.006
Solidity | 3 | 0.014
Solidity | 4 | 0.007
Total HOD | 2 | 0.044
Total HOD | 3 | 0.037
Total HOD | 4 | 0.025
(b) Dictionary model | |
Cellularity | 2 | 0.008
Cellularity | 3 | 0.040
Eccentricity | 2 | 0.002
Eccentricity | 3 | 0.005
Eccentricity | 4 | 0.011
Mean HOD | 2 | 0.019

**Table 2.** Predicted morphometric biomarkers and their p-values for the combined model without genomic preconditioning.

Nuclear Morphometric Index | Number of Clusters | p-Value
---|---|---
Cellularity | 2 | 0.025
Eccentricity | 2 | 0.007
Eccentricity | 3 | 0.013
Eccentricity | 4 | 0.019
Mean HOD | 4 | 0.028

**Table 3.** Predicted morphometric biomarkers for the PDF- and dictionary-based models preconditioned on genomic subtypes.

Nuclear Morphometric Index | Number of Clusters | p-Value (Neural) | p-Value (Proneural) | p-Value (Mesenchymal) | p-Value (Classical)
---|---|---|---|---|---
(a) PDF method | | | | |
Area | 2 | 0.021 | - | - | -
Area | 3 | 0.020 | - | - | 0.009
Area | 4 | 0.018 | - | - | 0.006
Mean HOD | 4 | 0.024 | - | - | -
Solidity | 3 | 0.006 | - | - | -
Solidity | 4 | <0.001 | - | 0.009 | -
Total HOD | 2 | - | - | - | 0.019
Total HOD | 3 | - | - | - | 0.008
Total HOD | 4 | - | - | - | 0.008
(b) Dictionary method | | | | |
Area | 4 | - | - | - | 0.040
Total HOD | 2 | <0.001 | - | - | -
Total HOD | 3 | 0.008 | - | - | -
Total HOD | 4 | 0.003 | - | - | -

**Table 4.** Predicted morphometric biomarkers and their p-values for the combined model preconditioned on genomic subtypes.

Nuclear Morphometric Index | Number of Clusters | p-Value (Neural) | p-Value (Proneural) | p-Value (Mesenchymal) | p-Value (Classical)
---|---|---|---|---|---
Area | 2 | 0.04 | - | - | -
Area | 4 | - | - | - | 0.043
Mean HOD | 4 | - | 0.031 | - | -
Solidity | 3 | 0.010 | - | - | -
Solidity | 4 | 0.004 | - | 0.048 | -
Total HOD | 2 | 0.001 | - | - | -
Total HOD | 3 | 0.012 | - | - | -
Total HOD | 4 | 0.004 | - | - | 0.036

**Table 5.** Predicted morphometric biomarkers for the PDF- and dictionary-based models preconditioned on patients with high EGFR expression.

Nuclear Morphometric Index | Number of Clusters | p-Value
---|---|---
(a) PDF model | |
Total HOD | 4 | 0.048
(b) Dictionary model | |
Area | 2 | 0.007
Area | 3 | 0.009
Area | 4 | 0.025

**Table 6.** Predicted morphometric biomarkers for the PDF- and dictionary-based models preconditioned on patients with low EGFR expression.

Nuclear Morphometric Index | Number of Clusters | p-Value
---|---|---
(a) PDF model | |
Area | 4 | 0.031
Cellularity | 4 | 0.018
(b) Dictionary model | |
Cellularity | 2 | 0.035
Total HOD | 2 | 0.001
Total HOD | 3 | 0.003
Total HOD | 4 | 0.009

**Table 7.** Predicted morphometric biomarkers and their p-values for the combined model preconditioned on the EGFR transcript.

Nuclear Morphometric Index | Number of Clusters | p-Value
---|---|---
(a) Biomarkers for patients with matched transcriptome data | |
Cellularity | 2 | 0.047
Cellularity | 3 | 0.033
Cellularity | 4 | 0.019
Eccentricity | 2 | 0.040
Mean HOD | 3 | 0.010
Mean HOD | 4 | 0.004
(b) Biomarkers of patients stratified with high EGFR expression | |
Area | 3 | 0.004
Area | 4 | 0.005
Cellularity | 3 | 0.031
Cellularity | 4 | 0.009
(c) Biomarkers of patients with low EGFR expression | |
Cellularity | 2 | 0.034
Cellularity | 4 | 0.018
Mean HOD | 3 | 0.015
Mean HOD | 4 | 0.021
Total HOD | 2 | 0.001
Total HOD | 3 | 0.002

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Winkelmaier, G.; Koch, B.; Bogardus, S.; Borowsky, A.D.; Parvin, B.
Biomarkers of Tumor Heterogeneity in Glioblastoma Multiforme Cohort of TCGA. *Cancers* **2023**, *15*, 2387.
https://doi.org/10.3390/cancers15082387
