# Segmentation Uncertainty Estimation as a Sanity Check for Image Biomarker Studies

## Abstract

## Simple Summary

## 1. Introduction

#### 1.1. Background

#### 1.2. Problem

- $s_{a}^{2}$: the intra-patient variance due to protocol ‘a’ uncertainty;
- $\mathrm{COV}_{a,b}$: the covariance of protocol ‘a’ and ‘b’ uncertainties.
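As an illustration of the two quantities defined above, the following minimal sketch estimates an intra-patient variance and a covariance from repeated feature deviations of a single patient. The data layout (paired deviations attributed to protocols ‘a’ and ‘b’), the helper names, and the numbers are hypothetical and are not taken from the study. The final check relies only on the general identity that the variance of a sum equals the sum of the variances plus twice the covariance.

```python
from statistics import mean


def sample_variance(xs):
    """Unbiased sample variance (denominator n - 1)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def sample_covariance(xs, ys):
    """Unbiased sample covariance of two equally long sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


# Hypothetical deviations of one feature for one patient, attributed to
# uncertainty in protocol 'a' and protocol 'b' (illustrative numbers only).
dev_a = [0.10, -0.20, 0.05, 0.15, -0.10]
dev_b = [0.02, -0.05, 0.04, 0.01, -0.02]

s2_a = sample_variance(dev_a)            # plays the role of s_a^2
s2_b = sample_variance(dev_b)            # plays the role of s_b^2
cov_ab = sample_covariance(dev_a, dev_b)  # plays the role of COV_{a,b}

# Variance of the combined deviation, computed directly from the sums.
total = sample_variance([a + b for a, b in zip(dev_a, dev_b)])

print(f"s2_a={s2_a:.5f}  s2_b={s2_b:.5f}  cov_ab={cov_ab:.5f}  total={total:.5f}")
```

When the covariance term is positive, the combined variance exceeds the sum of the individual variances, which is why the covariance cannot be ignored when several protocols vary together.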

## 2. Materials and Methods

#### 2.1. Datasets

$^{3}$ using Plastimatch v1.9.0 (https://plastimatch.org/, last accessed on 1 October 2021).

#### 2.2. Estimation of Segmentation Uncertainty and the Effect Size of the Radiomic Features

$^{3}$ voxel size using PyRadiomics v3.0.1 [2]. First-order statistics, shape, gray level co-occurrence matrix (GLCM), gray level run length matrix (GLRLM), gray level dependence matrix (GLDM), gray level size zone matrix (GLSZM), and neighboring gray tone difference matrix (NGTDM) features were extracted, for 104 radiomic features in total.

#### 2.3. Prognostic Model Simulation

- #MajC: number of samples of the majority class for a subject;
- #MinC: number of samples of the minority class for a subject.
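The two counts above can be obtained per subject as in the following minimal sketch. It assumes that each sample of a subject carries a binary class label (for example, one label per perturbed segmentation, which is an assumption about the setup). The function names are hypothetical, and the normalized difference returned by `agreement` is only an illustrative score; it is not necessarily the definition of δ used in Equation (5).

```python
from collections import Counter


def class_counts(labels):
    """Return (#MajC, #MinC) for one subject from its binary sample labels."""
    if not labels:
        raise ValueError("at least one sample label is required")
    counts = Counter(labels)
    n0, n1 = counts[0], counts[1]  # Counter yields 0 for a missing class
    return max(n0, n1), min(n0, n1)


def agreement(labels):
    """Illustrative agreement score (an assumption, not Equation (5)).

    1.0 means every sample of the subject falls in the same class;
    0.0 means the samples are split evenly between the two classes.
    """
    maj, minc = class_counts(labels)
    return (maj - minc) / (maj + minc)


# Hypothetical labels for the samples of a single subject.
subject_labels = [1, 1, 0, 1, 1]
maj_c, min_c = class_counts(subject_labels)
score = agreement(subject_labels)
print(maj_c, min_c, score)
```

A subject whose samples all land in one class has #MinC equal to zero, so any score built from these counts reaches its maximum in that case.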

## 3. Results

## 4. Discussion

#### 4.1. General Discussion

#### 4.2. Limitations and Future Perspective

$^{3}$ voxel size limits the number of augmented binary masks, because we sample from a finite set of voxels in the uncertainty ring. To mitigate this, segmentation uncertainty could be estimated from raw RTSTRUCT contour coordinates rather than from binary masks [16]. We do not, however, expect this to fundamentally change the analysis in this paper.
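The limitation described above can be illustrated with a toy 2D example. The sketch below is not the procedure used in this study: it assumes that the uncertainty ring is the one-voxel dilation of a mask minus its one-voxel erosion (4-connectivity), and that each ring voxel is included independently with probability `p`. All function names are hypothetical. With a ring of n voxels, at most 2^n distinct augmented masks exist, so a coarse voxel grid bounds the diversity of the augmentation.

```python
import random

# 4-connected neighborhood in 2D (an assumption made for this sketch).
NEIGHBORS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def dilate(mask):
    """Add every voxel that touches the mask."""
    return mask | {(r + dr, c + dc) for r, c in mask for dr, dc in NEIGHBORS}


def erode(mask):
    """Keep only voxels whose four neighbors all lie inside the mask."""
    return {(r, c) for r, c in mask
            if all((r + dr, c + dc) in mask for dr, dc in NEIGHBORS)}


def uncertainty_ring(mask):
    """Voxels between the eroded core and the dilated outline."""
    return dilate(mask) - erode(mask)


def sample_mask(mask, rng, p=0.5):
    """Draw one augmented mask: the core plus a random subset of the ring."""
    core = erode(mask)
    ring = sorted(uncertainty_ring(mask))  # sorted for reproducibility
    return core | {voxel for voxel in ring if rng.random() < p}


# Toy binary mask: a 3 x 3 square stored as a set of (row, col) voxels.
mask = {(r, c) for r in range(1, 4) for c in range(1, 4)}
ring = uncertainty_ring(mask)

rng = random.Random(0)
augmented = [sample_mask(mask, rng) for _ in range(5)]

# Upper bound on the number of distinct augmented masks for this ring.
max_distinct = 2 ** len(ring)
print(len(ring), max_distinct)
```

Working from contour coordinates instead of a voxel grid would remove this discrete bound, because the perturbed outline would no longer be restricted to a finite set of voxels.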

## 5. Conclusions

**Figure 1.** The workflow of image biomarker analysis, from image acquisition to modeling. Each procedure can be performed using a different protocol (set of parameters), which induces uncertainty in the image biomarker model performance. Partially adapted from: https://med.stanford.edu/bmrgroup/Research/AcqRecon.html (accessed on 14 January 2022), https://www.nature.com/articles/srep03529 (accessed on 14 January 2022) [6].

**Figure 2.** Image biomarker value distributions in populations A and B given relatively low (low-opacity curves) and high (full-opacity curves) intra-population variance.

**Figure 6.** Influence of segmentation uncertainty in the high-η setup (setup B): two sample realizations of the testing set result in different performances.

**Figure 7.** Segmentation uncertainty and patient stratification agreement δ (Equation (5)) in setups A (**a**, low η) and B (**b**, high η).

| Feature (Setup A, low η) | η | Feature (Setup B, high η) | η |
|---|---|---|---|
| first-order Maximum | 0.0 | glszm GrayLevelVariance | 0.2944 |
| gldm GrayLevelNonUniformity | 0.0108 | glrlm RunEntropy | 0.3066 |
| glrlm GrayLevelNonUniformity | 0.0115 | glcm MCC | 0.3216 |
| ngtdm Coarseness | 0.0129 | glszm GrayLevelNonUniformityNormalized | 0.4005 |
