# EndoNet: A Model for the Automatic Calculation of H-Score on Histological Slides

## Abstract

## 1. Introduction

## 2. Methods

#### 2.1. Data

- ${x}_{i},{x}_{j}$ = point locations;
- s = scale parameter, which equals the mean square nuclei radius.

#### 2.2. General Architecture

#### 2.3. Detection Model Architecture

#### 2.4. Training of Detection Model

- $y$ = true label;
- $\widehat{y}$ = predicted label;
- $\delta $ = the threshold where the Huber loss function transitions from quadratic to linear.
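The symbols listed above match the standard definition of the Huber loss. The equation itself is not reproduced in this section, so the following is only a minimal sketch of the textbook form, assuming scalar labels and a hypothetical default of `delta = 1.0`. It is not necessarily the exact variant or value used to train the detection model.

```python
def huber_loss(y: float, y_hat: float, delta: float = 1.0) -> float:
    """Textbook Huber loss for one (true, predicted) pair.

    Quadratic for residuals up to `delta`, linear beyond it.
    The default `delta` is an illustrative assumption.
    """
    residual = abs(y - y_hat)
    if residual <= delta:
        # Quadratic region: behaves like half the squared error.
        return 0.5 * residual ** 2
    # Linear region: grows like the absolute error, offset so the
    # two pieces meet continuously at residual == delta.
    return delta * (residual - 0.5 * delta)


if __name__ == "__main__":
    print(huber_loss(1.0, 0.5))  # small residual, quadratic branch
    print(huber_loss(3.0, 0.0))  # large residual, linear branch
```

The linear branch limits the influence of large residuals, which is the usual reason for choosing this loss over a plain squared error.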

#### 2.5. Pre-Training

#### 2.6. H-Score Module

#### 2.7. Statistical Testing

## 3. Results

#### 3.1. Pre-Training Results

#### 3.2. Training

#### 3.3. H-Score

## 4. Discussion

## 5. Conclusions

## Supplementary Materials

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## References

**Figure 2.** Architecture of the EndoNet model. Tiles pass through an Image-to-Image model that converts them into heatmaps. The Keypoint Extractor obtains the coordinates and classes of the nuclei centers and passes them to the H-score Module, which calculates the H-score in stroma and epithelium.

**Figure 3.** Pre-training process with SimCLR [28]. Here, $t\in \tau$ and ${t}^{\prime}\in \tau$ are two augmentations taken from the same family of augmentations. $f(\cdot)$ is a base encoding network and $g(\cdot)$ is a projection head that maps a hidden representation to another space, where the contrastive loss is applied. $X$ is the initial image, ${X}_{i}$ and ${X}_{j}$ are the augmented images, ${h}_{i}$ and ${h}_{j}$ are the hidden representations of the corresponding augmented images, and ${z}_{i}$ and ${z}_{j}$ are the outputs of the projection head. The optimization task is to maximize the agreement between ${z}_{i}$ and ${z}_{j}$.

**Figure 4.** Distributions of pixels of (**a**) the whole tile; (**b**) blue and brown nuclei, stained in their colors; (**c**) blue and brown nuclei, with brown nuclei shown in red and blue nuclei shown in blue.

**Figure 7.** Absolute error for “Model small”, “Model big”, and “QuPath”. A significant difference was found between “Model big” and “QuPath” for epithelium. Circles are outliers in the distributions, and ∗ indicates a statistically significant difference between the distributions.

**Table 1.** Resulting metrics for the SimCLR pre-trained model and the ImageNet-based model, computed on the test dataset, with confidence interval bounds for the mean difference between the models.

| Metric | SimCLR Pre-Trained | ImageNet | Confidence Interval (Lower Bound) | Confidence Interval (Upper Bound) |
|---|---|---|---|---|
| Stroma AP | 0.8577 | 0.8544 | −0.00666 | 0.01840 |
| Epithelium AP | 0.7576 | 0.7256 | 0.00461 | 0.07010 |
| mAP | 0.8077 | 0.7900 | −0.00024 | 0.04115 |

| Metric | EndoNuke | PathLab | Combined |
|---|---|---|---|
| Stroma AP | 0.85 | 0.83 | 0.85 |
| Epithelium AP | 0.69 | 0.84 | 0.69 |
| mAP | 0.77 | 0.84 | 0.77 |

**Table 3.** Calculated thresholds of the “Value” dimension in HSV space for each slide for both annotators. “Left” is the threshold that separates strong from moderate staining and “Right” is the threshold that separates moderate from weak staining, as in Figure 6. Slide 4 is shared by both annotators and is used to calculate the agreement level.

| Annotator | Slide | Left | Right |
|---|---|---|---|
| 1st | 1 | 80 | 120 |
| 1st | 2 | 80 | 125 |
| 1st | 3 | 80 | 120 |
| 1st | 4 | 80 | 125 |
| 2nd | 4 | 80 | 135 |
| 2nd | 5 | 80 | 120 |
| 2nd | 6 | 75 | 130 |
| 2nd | 7 | 80 | 130 |
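To make the role of the two thresholds concrete, the sketch below maps a nucleus's HSV “Value” to a staining intensity and then computes an H-score with the standard definition (1 × % weak + 2 × % moderate + 3 × % strong, giving a range of 0 to 300). Several points are assumptions rather than details taken from the paper: that a lower “Value” means stronger staining, that boundary values fall into the weaker class, and that negative (blue) nuclei have already been identified by a separate step and receive weight 0. The function names are hypothetical.

```python
from typing import Iterable


def intensity_weight(value: int, left: int, right: int) -> int:
    """Map the HSV 'Value' of a positively stained nucleus to a weight.

    Assumption: darker pixels (lower Value) mean stronger staining.
    `left` separates strong from moderate, `right` moderate from weak.
    """
    if value < left:
        return 3  # strong staining
    if value < right:
        return 2  # moderate staining
    return 1  # weak staining


def h_score(weights: Iterable[int]) -> float:
    """H-score from per-nucleus weights (0 = negative, 1..3 = positive)."""
    weights = list(weights)
    if not weights:
        raise ValueError("at least one nucleus is required")
    # Sum of weight * percentage of nuclei carrying that weight.
    return 100.0 * sum(weights) / len(weights)


if __name__ == "__main__":
    # Thresholds of slide 1, first annotator (Left = 80, Right = 120).
    positive_values = [60, 100, 130]  # illustrative Values, not real data
    weights = [intensity_weight(v, 80, 120) for v in positive_values]
    weights.append(0)  # one negative nucleus
    print(h_score(weights))  # 150.0
```

Because the thresholds differ between slides, the same nucleus "Value" can fall into different intensity classes on different slides, which is why the thresholds are listed per slide.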

**Table 4.** Calculated H-score in stroma and epithelium for each slide for both annotators. The model scores for each slide are calculated using the thresholds from Table 3. The “Man.” H-score is based on the pathologists' keypoint annotations, the “Model small” H-score is based on annotations produced by our model on the same tiles, the “Model big” H-score is based on annotations produced by our model on a larger number of tiles from the same slides, and the “QP” H-score is calculated in QuPath (version 0.4.4).

| Annotator | Slide | Stroma: Man. | Stroma: Model Small | Stroma: Model Big | Stroma: QP | Epithelium: Man. | Epithelium: Model Small | Epithelium: Model Big | Epithelium: QP |
|---|---|---|---|---|---|---|---|---|---|
| 1st | 1 | 137 | 120 | 122 | 205 | 164 | 145 | 161 | 203 |
| 1st | 2 | 149 | 165 | 164 | 219 | 180 | 182 | 181 | 193 |
| 1st | 3 | 138 | 128 | 116 | 138 | 144 | 144 | 128 | 112 |
| 1st | 4 | 183 | 178 | 179 | 201 | 137 | 159 | 141 | 161 |
| 2nd | 4 | 187 | 181 | 184 | 201 | 150 | 167 | 159 | 176 |
| 2nd | 5 | 131 | 109 | 114 | 100 | 150 | 142 | 138 | 169 |
| 2nd | 6 | 198 | 165 | 158 | 164 | 57 | 88 | 65 | 9 |
| 2nd | 7 | 180 | 168 | 188 | 182 | 202 | 198 | 219 | 278 |
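The absolute errors compared in Figure 7 can be recomputed directly from Table 4. The sketch below does so for each method against the manual H-score and reports the mean per tissue type. It treats each table row as a separate observation (so slide 4 appears once per annotator), which is an assumption, and it only summarizes the errors. It does not reproduce the statistical test used in the paper.

```python
# Rows follow Table 4 in order: 1st annotator (slides 1-4),
# then 2nd annotator (slides 4-7).
STROMA = {
    "Man.": [137, 149, 138, 183, 187, 131, 198, 180],
    "Model small": [120, 165, 128, 178, 181, 109, 165, 168],
    "Model big": [122, 164, 116, 179, 184, 114, 158, 188],
    "QP": [205, 219, 138, 201, 201, 100, 164, 182],
}
EPITHELIUM = {
    "Man.": [164, 180, 144, 137, 150, 150, 57, 202],
    "Model small": [145, 182, 144, 159, 167, 142, 88, 198],
    "Model big": [161, 181, 128, 141, 159, 138, 65, 219],
    "QP": [203, 193, 112, 161, 176, 169, 9, 278],
}


def mean_abs_error(reference, predicted):
    """Mean absolute difference between two equally long score lists."""
    if len(reference) != len(predicted):
        raise ValueError("score lists must have the same length")
    errors = [abs(r - p) for r, p in zip(reference, predicted)]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    for tissue, scores in (("Stroma", STROMA), ("Epithelium", EPITHELIUM)):
        for method in ("Model small", "Model big", "QP"):
            mae = mean_abs_error(scores["Man."], scores[method])
            print(f"{tissue:<10} {method:<11} {mae:.3f}")
```

On these rows, the mean absolute error for epithelium is 8.75 for “Model big” and 34.625 for “QP”, which is in line with the significant difference reported in Figure 7.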

