Understanding the Impact of Evaluation Metrics in Kinetic Models for Consensus-Based Segmentation

Raffaella Fiamma Cabini; Horacio Tettamanti; Mattia Zanella

doi:10.3390/e27020149

¹ Euler Institute, Università della Svizzera Italiana, 6900 Lugano, Switzerland

² Department of Mathematics “F. Casorati”, University of Pavia, 27100 Pavia, Italy

* Author to whom correspondence should be addressed.

Entropy 2025, 27(2), 149; https://doi.org/10.3390/e27020149

This article belongs to the Section Multidisciplinary Applications

Abstract

In this article, we extend a recently introduced kinetic model for consensus-based segmentation of images. In particular, we will interpret the set of pixels of a 2D image as an interacting particle system that evolves in time in view of a consensus-type process obtained by interactions between pixels and external noise. Thanks to a kinetic formulation of the introduced model, we derive the large time solution of the model. We will show that the parameters defining the segmentation task can be chosen from a plurality of loss functions that characterize the evaluation metrics.

Keywords:

kinetic equations; interacting particle systems; consensus models; image segmentation; clustering

1. Introduction

The primary objective of image segmentation is to partition an image into distinct pixel regions that exhibit homogeneous characteristics, including spatial proximity, intensity values, color variations, texture patterns, brightness levels, and contrast differences, thereby enabling more effective analysis and interpretation of the visual data. The application of image segmentation methods plays an important role in clinical research by facilitating the study of anatomical structures, highlighting regions of interest, and measuring tissue volume [1,2,3,4,5,6]. In this context, the accurate recognition of areas affected by pathologies can have a great impact on more precise early diagnosis and monitoring in a great variety of diseases that range from brain tumors to skin lesions.

Over the past few decades, a variety of computational strategies and mathematical approaches have been developed to address image segmentation challenges. Among these, deep learning techniques and neural networks have emerged as some of the most widely used methods in contemporary image segmentation tasks [7,8,9,10,11,12,13,14,15]. Leveraging a set of examples, these techniques are capable of approximating the complex nonlinear relationship between inputs and desired outputs. While deep learning models excel in complex segmentation problems, their dependence on large annotated datasets remains a significant challenge, particularly in fields such as biomedical imaging, where data availability is limited and manual labeling can be both expensive and time-consuming. A different approach is based on clustering methods [16,17,18,19,20,21]. These methods group pixels with similar characteristics, effectively partitioning the image into distinct regions. Clustering-based methods offer an attractive alternative to deep learning techniques as they do not require supervised training and therefore can be used on small unlabeled datasets. In this direction, a kinetic approach for unsupervised clustering problems for image segmentation has been introduced in [22,23]. In these works, microscopic-consensus-type models have been connected to image segmentation tasks by considering the pixels of an image as an interacting system where each particle is characterized by its space position and a feature determining the gray level. A virtual interaction between the particles will then determine the asymptotic formation of a finite number of clusters. Hence, a segmentation mask is generated by assigning the mean of their gray levels to each cluster of particles and by applying a binary threshold. 
Among the various nonlinear compromise terms that have been proposed in the literature, we will consider the Hegselmann–Krause model described in [24], where it is supposed that each agent may only interact with other agents that are sufficiently close. This type of interaction is classically known as a bounded confidence interaction function. As a result, two pixels will interact based on their distance in space and their gray level. The approach developed in [22] is based on the methods of kinetic theory for consensus formation. In recent decades, following the first model developed in [25,26,27,28], several approaches have been designed to investigate the emergence of patterns and collective structures for large systems of agents/particles [29,30,31,32]. To this end, the flexibility of kinetic-type equations has been of paramount importance to link the microscopic scale and the macroscopic observable scale [33,34,35,36,37,38,39].

In order to construct a data-oriented pipeline, we calibrate the resulting model by exploiting a family of existing evaluation metrics to obtain the relevant information from a ground truth image [40,41,42,43,44,45]. The main development of this study, compared to the one described in [46], relies on the fact that we evaluate multiple metrics to quantify segmentation error, which is crucial for the optimization of the internal model parameters. In particular, we will concentrate on the Standard Volumetric Dice Similarity Coefficient (Volumetric Dice), a volumetric measure based on the quotient between the intersection of the obtained segmented images and their total volume, and the Surface Dice Similarity Coefficient, which is analogous to Volumetric Dice but exploits the surface of the segmented images [46]. Furthermore, we test the Jaccard Index, which is an alternative option to evaluate the volumetric similarity between two segmentation masks, and the

F_{β}

-measure, which is a performance metric that facilitates balance between precision and sensitivity. In this paper, we describe these metrics in detail and analyze how such choices regarding evaluation metrics influence the parameter optimization process. Furthermore, we discuss the most suitable metrics for the final assessment of the produced segmentations. This expanded evaluation provides novel insights into the impact of evaluation metrics on model performance and enhances our understanding of how to efficiently optimize the introduced segmentation pipeline.

In more detail, the manuscript is organized as follows. In Section 2, we introduce an extension of the Hegselmann–Krause model in 2D and present the structure of the emerging steady states for different values of the model parameters. Next, we present a description of the model based on a kinetic-type approach. Furthermore, we show how this model can be extended and applied to the image segmentation problem. In Section 3, we present a Direct Simulation Monte Carlo (DSMC) method to approximate the evolution of the system and introduce possible optimization methods to produce segmentation masks for particular images. To this end, we introduce the definition of the principal optimization metrics used in the context of biomedical images and their principal characteristics. In Section 4, we show the results for a simple case of segmenting a geometrical image with a blurry background and compare the results obtained for different choices regarding the diffusion function. Finally, we present the results obtained for various brain tumor images and discuss how the choice regarding different metrics may affect the final result. We show that the

F_{β}

-measure does not produce consistent results for different values of

β

. We reproduce the expected relationship between the Volumetric Dice Coefficient and Jaccard Index and show that both metrics plus the Surface Dice Coefficient yield similar results. Nevertheless, we argue that, for this type of image, the Surface Dice Coefficient produces more accurate loss values, and its definition is more representative compared to the Volumetric Dice Coefficient and Jaccard Index.

2. Consensus Modeling and Applications to Image Segmentation

In recent years, there has been growing interest in exploring consensus formation within opinion models to gain a deeper understanding of how social forces affect nonlinear aggregation processes in multiagent systems. To this end, various models have been proposed considering different scenarios and hypotheses on how the pairwise interactions may lead to the emergence of a position. For a finite number of particles, the dynamics are usually defined in terms of first-order differential equations with the general form

\frac{d x_{i}}{d t} = \frac{1}{N} \sum_{j = 1}^{N} P (x_{i}, x_{j}) (x_{j} - x_{i}),

(1)

where

x_{i} (t) \in R^{d}

,

d \geq 1

characterizes the position of the agent

i = 1, \dots, N

at time

t \geq 0

, and

P (\cdot, \cdot) \geq 0

tunes the interaction between the agents

x_{i}, x_{j} \in R^{d}

; see, e.g., [24,30,32,47,48].

In addition to microscopic-agent-based models, in the limit of an infinite number of agents, it is possible to derive the evolution of distribution functions characterizing the collective behavior of interacting systems. These approaches, typically grounded in kinetic-type partial differential equations (PDEs), are capable of bridging the gap between microscopic forces and the emerging properties of the system; see [37].

2.1. The 2D-Bounded Confidence Model

We now consider the bidimensional case

d = 2

, and we specify the interaction function based on the so-called bounded confidence model. In more detail, we consider

N \geq 2

agents and define their opinion variable through a vector

x = (x_{i} (t), y_{i} (t)) \in R^{2}

, characterized by initial states

{x_{1} (0), \dots, x_{N} (0)}

. Agents will modify their opinion as a result of the interaction with other agents only if

| x_{i} - x_{j} | \leq Δ

, where

Δ \geq 0

is a given confidence level. Hence, we can write (1) as follows

\frac{d}{d t} x_{i} = \frac{1}{N} \sum_{j = 1}^{N} P_{Δ} (x_{i}, x_{j}) (x_{j} - x_{i}),

(2)

where

P_{Δ} (x_{i}, x_{j}) = χ (| x_{i} - x_{j} | \leq Δ) : R^{2} \to {0, 1}

with

χ (A)

being the characteristic function of the set

A \subseteq R^{2}

. We can easily observe that the mean position of the ensemble of agents is conserved in time, indeed

\frac{d}{d t} \sum_{i = 1}^{N} x_{i} = \frac{1}{N} \sum_{i, j = 1}^{N} χ (| x_{i} - x_{j} | \leq Δ) (x_{j} - x_{i}) = 0,

(3)

thanks to the symmetry of the considered bounded confidence interaction function. The bounded confidence model converges to a steady configuration, meaning that the system achieves consensus in finite time. The structure of the steady state depends on the value of

Δ

; see [38].
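The cancellation behind (3) can be spelled out by relabeling the summation indices. The short derivation below is a sketch that uses only the symmetry of the bounded confidence interaction function; the symbol S is introduced here purely for convenience.

```latex
\begin{aligned}
S &:= \sum_{i,j=1}^{N} P_{\Delta}(x_i, x_j)\,(x_j - x_i) \\
  &= \sum_{i,j=1}^{N} P_{\Delta}(x_j, x_i)\,(x_i - x_j)
     && \text{(swap the labels } i \leftrightarrow j\text{)} \\
  &= -\sum_{i,j=1}^{N} P_{\Delta}(x_i, x_j)\,(x_j - x_i) = -S
     && \text{(symmetry of } P_{\Delta}\text{)}.
\end{aligned}
```

Hence S = 0, and the mean position \frac{1}{N} \sum_{i} x_{i} (t) stays equal to its initial value.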

Furthermore, to account for random fluctuations provided by external factors in the opinion of agents, we may consider a diffusion component as follows:

d x_{i} = \frac{1}{N} \sum_{j = 1}^{N} P_{Δ} (x_{i}, x_{j}) (x_{j} - x_{i}) d t + \sqrt{2 σ^{2}} d W_{i},

(4)

where

{W_{i}}_{i = 1}^{N}

is a set of independent Wiener processes. The impact of the diffusion is weighted by the variable

σ^{2} > 0

. To visualize the interplay between consensus forces and diffusion, we depict in Figure 1 the steady configuration of the model (4) for different combinations of the model parameters. For

σ^{2} = 0

, the system forms a finite number of clusters depending on the value of

Δ > 0

, as illustrated in Figure 1a. For values of the diffusion coefficient

σ^{2} > 0

, the number of clusters of the system varies as depicted in Figure 1b. The right panel of Figure 1b shows the scenario in which the diffusion effect becomes comparable to the tendency of agents to cluster. Finally, in Figure 1c, for

σ = 0.05

, the diffusion effect dominates the grouping tendency, resulting in a homogeneous steady state distribution.

Figure 1. Large time distribution of the 2D-bounded confidence model for different parameters characterizing the compromise propensity and the diffusion for N =

10^{5}

particles in

[0, T]

with

T = 100

and

Δ t = 0.01

. In (a), the final state converges to a number of clusters depending on the value of

Δ

. As we reduce the range of interaction, more clusters are created. In rows (b,c), we can see the interplay between the tendency of particles to aggregate and diffuse. In the first column, we see that the steady state converges to a Gaussian distribution with a standard deviation provided by

σ^{2}

. In the second column, for (b,c), we see that the final states differ greatly in their structure. Finally, the last column shows the final states in the case where the diffusion considerably surpasses the aggregation tendency.
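To experiment with the interplay between aggregation and diffusion described above, the particle system (4) can be discretized with an Euler–Maruyama step. The following is a minimal sketch, not the code used to produce Figure 1: the function name, the default arguments, and the dense pairwise evaluation (quadratic cost in the number of particles, so only suitable for small N) are assumptions made for illustration.

```python
import numpy as np

def simulate_bounded_confidence(x0, delta, sigma2, dt=0.01, n_steps=100, seed=0):
    """Euler-Maruyama discretization of the N-particle system (4).

    x0 has shape (N, 2); delta is the confidence level and sigma2 the
    diffusion weight. Illustrative sketch with O(N^2) cost per step.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)               # work on a copy
    n = x.shape[0]
    for _ in range(n_steps):
        diff = x[None, :, :] - x[:, None, :]    # diff[i, j] = x_j - x_i
        dist = np.linalg.norm(diff, axis=2)
        p = (dist <= delta).astype(float)       # P_Delta(x_i, x_j)
        drift = (p[:, :, None] * diff).sum(axis=1) / n
        noise = np.sqrt(2.0 * sigma2 * dt) * rng.standard_normal(x.shape)
        x = x + dt * drift + noise
    return x
```

With sigma2 = 0 the update reduces to an explicit Euler step of (2), so the mean position is conserved up to floating-point error, in agreement with (3).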

2.2. Kinetic Models for Consensus Dynamics

In the limit

N \to + \infty

, it can be shown that the empirical density

f^{(N)} (x, t) = \frac{1}{N} \sum_{i = 1}^{N} δ (x - x_{i} (t))

of the system of particles (4) converges to a continuous density

f (x, t) : R^{2} \times R_{+} \to R_{+}

solution to the following mean-field equation

\begin{matrix} \partial_{t} f (x, t) & = \nabla_{x} \cdot [Ξ [f] (x, t) f (x, t) + σ^{2} \nabla_{x} f (x, t)] \\ f (x, 0) & = f_{0} (x) \end{matrix}

(5)

where

Ξ [f] (x, t)

is defined as follows

Ξ [f] (x, t) = \int_{R^{2}} P_{Δ} (x, x_{*}) (x - x_{*}) f (x_{*}, t) d x_{*};

(6)

see, e.g., [49].

We can derive (5) using a kinetic approach by writing

x : = x_{i} (t)

and

x_{*} : = x_{j} (t)

for a generic pair

(i, j)

of interacting agents/particles, and we approximate the time derivative in (4) in a time step

ϵ = Δ t > 0

, through an Euler–Maruyama approach, in the same spirit as [34,50]. Hence, we recover the binary interaction rule

\begin{matrix} x^{'} & = x + ϵ P_{Δ} (x, x_{*}) (x_{*} - x) + \sqrt{2 σ^{2}} η \\ x_{*}^{'} & = x_{*} + ϵ P_{Δ} (x_{*}, x) (x - x_{*}) + \sqrt{2 σ^{2}} η_{*}, \end{matrix}

(7)

where

x^{'} = x_{i} (t + ϵ)

,

x_{*}^{'} = x_{j} (t + ϵ)

and

η, η_{*}

are two independent centered 2D Gaussian random variables such that

⟨ η ⟩ = ⟨ η_{*} ⟩ = 0, \quad ⟨ η^{2} ⟩ = ϵ

(8)

where

⟨ \cdot ⟩

denotes the expectation with respect to the distribution of

η

. Furthermore, in (7), we shall consider

P_{Δ} (x, x_{*}) = χ (| x - x_{*} | \leq Δ)

. We can remark that, if

σ = 0

, since

P_{Δ} \in [0, 1]

and

ϵ \in (0, 1)

, we obtain

\begin{matrix} ⟨x^{'} + x_{*}^{'}⟩ & = x + x_{*} + Δ t (P_{Δ} (x, x_{*}) - P_{Δ} (x_{*}, x)) (x_{*} - x) \\ & = x + x_{*} \end{matrix}

(9)

since the interaction function

P_{Δ}

is symmetric, consistent with (3). This shows that the mean position is conserved at every interaction. Finally, we have

\begin{matrix} ⟨ {| x^{'} |}^{2} + {| x_{*}^{'} |}^{2} ⟩ = {| x |}^{2} + {| x_{*} |}^{2} - 2 Δ t P_{Δ} (x, x_{*}) {| x_{*} - x |}^{2} + o (Δ t) \end{matrix}

(10)

and the mean energy is dissipated at each interaction since

P_{Δ} \geq 0

. Hence, we consider the distribution function

f = f (x, t) : R^{2} \times R_{+} \to R_{+}

such that

f (x, t) d x

represents the fraction of agents/particles in

[x_{1}, x_{1} + d x_{1}) \times [x_{2}, x_{2} + d x_{2})

at time

t \geq 0

. The evolution of f as a result of binary interaction scheme (7) is obtained by a Boltzmann-type equation, which reads in weak form

\begin{matrix} \frac{d}{d t} \int_{R^{2}} φ (x) f (x, t) d x = \\ ⟨\int_{R^{4}} (φ (x^{'}) - φ (x)) f (x, t) f (x_{*}, t) d x d x_{*}⟩, \end{matrix}

(11)

φ (\cdot)

being a test function. As observed in [39], when

Δ t = ϵ \to 0^{+}

, we can observe that the binary scheme (7) becomes quasi-invariant, and we can introduce the following expansion

⟨φ (x^{'}) - φ (x)⟩ = ⟨ x^{'} - x ⟩ \cdot \nabla_{x} φ (x) + \frac{1}{2} ⟨{(x^{'} - x)}^{T} H [φ] (x^{'} - x)⟩ + R_{ϵ} (x, x_{*})

(12)

R_{ϵ} (x, x_{*})

being a remainder term and

H [φ]

the Hessian matrix. Hence, scaling

τ = ϵ t

and the distribution

f_{ϵ} (x, τ) = f (x, τ / ϵ)

, we may plug (12) into (11) to obtain

\begin{matrix} \frac{d}{d τ} \int_{R^{2}} φ (x) f_{ϵ} (x, τ) d x = & \frac{1}{ϵ} \int_{R^{4}} ⟨ x^{'} - x ⟩ \cdot \nabla_{x} φ (x) f_{ϵ} (x, τ) f_{ϵ} (x_{*}, τ) d x d x_{*} + \\ & \frac{1}{2 ϵ} \int_{R^{4}} ⟨ {(x^{'} - x)}^{T} H [φ] (x^{'} - x) ⟩ f_{ϵ} (x, τ) f_{ϵ} (x_{*}, τ) d x d x_{*} + \\ & \frac{1}{ϵ} \int_{R^{4}} R_{ϵ} (x, x_{*}) f_{ϵ} (x, τ) f_{ϵ} (x_{*}, τ) d x d x_{*} \end{matrix}

Following [22], see also [37], we can prove that

\int_{R^{4}} R_{ϵ} (x, x_{*}) f_{ϵ} (x, τ) f_{ϵ} (x_{*}, τ) d x d x_{*} \to 0^{+},

as

ϵ \to 0^{+}

. Hence, integrating back by parts the first two terms, we obtain (5). In more detail, we can prove that

f_{ϵ}

converges up to extraction of a subsequence to a probability density

f (x, τ)

that is a weak solution to the nonlocal Fokker–Planck Equation (5).
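To make the passage from the expansion (12) to the Fokker–Planck structure more concrete, the two moments entering (12) can be computed directly from the binary rule (7). The sketch below assumes that the two components of η are uncorrelated, each with variance ϵ (one reading of (8) that the text does not spell out), and that f_{ϵ} has unit mass.

```latex
\begin{aligned}
\langle x' - x \rangle
  &= \epsilon\, P_{\Delta}(x, x_*)\,(x_* - x), \\
\tfrac{1}{2}\,\big\langle (x' - x)^{T} H[\varphi]\,(x' - x) \big\rangle
  &= \sigma^{2} \epsilon \,\operatorname{tr} H[\varphi](x) + O(\epsilon^{2}).
\end{aligned}
```

Dividing by ϵ and letting ϵ tend to zero, the weak formulation becomes

```latex
\frac{d}{d\tau} \int_{\mathbb{R}^{2}} \varphi(x)\, f(x, \tau)\, dx
  = -\int_{\mathbb{R}^{2}} \Xi[f](x, \tau) \cdot \nabla_x \varphi(x)\, f(x, \tau)\, dx
    + \sigma^{2} \int_{\mathbb{R}^{2}} \operatorname{tr} H[\varphi](x)\, f(x, \tau)\, dx ,
```

with Ξ[f] as in (6). Integrating by parts once in the first term and twice in the second gives \partial_{τ} f = \nabla_{x} \cdot (Ξ [f] f + σ^{2} \nabla_{x} f), which is the drift-diffusion form of the mean-field equation.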

2.3. Application to Image Segmentation

An application of the Hegselmann–Krause model for data clustering problems has been proposed in [23]. The idea is to extend the 2D model by characterizing each particle with an internal feature

c_{i} \in [0, 1]

that represents the gray color of the ith pixel. Therefore, we interpret each pixel in the image as a particle characterized by a position vector and the static feature c as shown in Figure 2.

Figure 2. A schematic representation of the proposed model, where each pixel is interpreted as a particle

(x_{i}, y_{i}, c_{i})

, with

c_{i}

being a static feature in the interval

[0, 1]

that represents the gray level.

To address the segmentation task, we can define a dynamic feature for the system of pixels through an interaction function that accounts for alignment processes among pixels with sufficiently similar features. In particular, let us consider the following:

P_{Δ_{1}, Δ_{2}} (x_{i}, x_{j}, c_{i}, c_{j}) = χ (| x_{i} - x_{j} | \leq Δ_{1}) χ (| c_{i} - c_{j} | \leq Δ_{2}) .

(13)

Therefore, the time-continuous evolution for the system of pixels is provided by

\begin{matrix} \frac{d}{d t} x_{i} & = \frac{1}{N} \sum_{j = 1}^{N} P_{Δ_{1}, Δ_{2}} (x_{i}, x_{j}, c_{i}, c_{j}) (x_{j} - x_{i}) \\ \frac{d}{d t} c_{i} & = 0 \end{matrix}

(14)

In this case, we introduced two confidence bounds

Δ_{1} \geq 0

,

Δ_{2} \geq 0

, taking into account the position and the gray level of the pixels, respectively. In this way, the interactions between the pixels will generate a large time distribution that is characterized by several clusters depending on the values of

Δ_{1}

and

Δ_{2}

. Hence, consistent with k-means methods, see, e.g., [44], a pixel belongs to a cluster

C_{μ} = {x_{i} : ∥ x_{i} - μ ∥ \leq α}

, with

α > 0

being the pixel size, if it is sufficiently close to the local quantity

μ \in R^{2}

. We highlight how we are only interested in clustering with respect to the space variable.
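The interaction function (13) is the product of two indicator functions, one on the spatial distance and one on the gray-level distance. A minimal sketch of it follows; the function and argument names are illustrative and do not come from the original work.

```python
import numpy as np

def interaction(x_i, x_j, c_i, c_j, delta1, delta2):
    """P_{Delta1,Delta2} from (13): returns 1.0 only if the two pixels are
    within delta1 in space AND within delta2 in gray level, else 0.0."""
    gap = np.asarray(x_i, dtype=float) - np.asarray(x_j, dtype=float)
    close_in_space = np.linalg.norm(gap) <= delta1
    close_in_gray = abs(c_i - c_j) <= delta2
    return 1.0 if (close_in_space and close_in_gray) else 0.0
```

Because both conditions depend only on distances, the function is symmetric under exchange of the two pixels, which is the property used for the conservation of the mean position.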

These dynamics are represented in Figure 3.

Figure 3. Representation of the evolution of pixels as they tend to aggregate in different clusters.

Biomedical images are often subject to ambiguities arising from various sources of uncertainty related to clinical factors and potential bottlenecks in data acquisition processes [2,9]. These uncertainties can be broadly categorized into aleatoric uncertainty, stemming from inherent stochastic variations in the data collection process, and epistemic uncertainty, relating to uncertainties in model parameters, potentially leading to deviations in the results. Aleatoric uncertainties pose significant challenges in image segmentation as image processing models must contend with limitations in the raw acquisition data. Addressing these uncertainties is critical, and the study of uncertainty quantification in image segmentation is an expanding field aimed at developing robust segmentation algorithms capable of mitigating erroneous outcomes. To this end, in [22], an extension of (14) has been proposed to consider segmentation of biomedical images. In particular, the particle model (15) integrates a nonconstant stochastic term to take into account aleatoric uncertainties arising from the data acquisition process. These uncertainties may include factors such as motion artifacts or field inhomogeneities in magnetic resonance imaging (MRI). The authors of [22] modified Equation (14) as follows:

\begin{matrix} d x_{i} & = \frac{1}{N} \sum_{j = 1}^{N} P_{Δ_{1}, Δ_{2}} (x_{i}, x_{j}, c_{i}, c_{j}) (x_{j} - x_{i}) d t + \sqrt{2 σ^{2} D (c_{i})} d W_{i} \\ \frac{d}{d t} c_{i} & = 0 \end{matrix}

(15)

where

{W_{i}}_{i = 1}^{N}

is a set of independent Wiener processes,

P_{Δ_{1}, Δ_{2}} (\cdot, \cdot, \cdot, \cdot) \in [0, 1]

is the interaction function defined in (13), and

D (c) \geq 0

quantifies the impact of diffusion related to the value of the feature

c \in [0, 1]

. Since the aleatoric uncertainties are expected to appear far away from the static feature’s boundaries, only diffusion functions that are maximal at the center and satisfy

D (0) = D (1) = 0

are considered. Similarly to (7), we may introduce the following binary interaction scheme by writing

(x, x_{*}) : = (x_{i} (t), x_{j} (t))

, with a random couple of pixels having features

(c, c_{*}) : = (c_{i} (t), c_{j} (t))

. We obtain

\begin{matrix} x^{'} & = x + ϵ P_{Δ_{1}, Δ_{2}} (x, x_{*}, c, c_{*}) (x_{*} - x) + \sqrt{2 σ^{2} D (c)} η \\ x_{*}^{'} & = x_{*} + ϵ P_{Δ_{1}, Δ_{2}} (x_{*}, x, c_{*}, c) (x - x_{*}) + \sqrt{2 σ^{2} D (c_{*})} η_{*} \\ c_{*}^{'} & = c_{*} \\ c^{'} & = c, \end{matrix}

(16)

where

(x^{'}, x_{*}^{'}) : = (x_{i} (t + Δ t), x_{j} (t + Δ t))

and

(c^{'}, c_{*}^{'}) : = (c_{i} (t + Δ t), c_{j} (t + Δ t))

. At the statistical level, as in [22], we may follow the approach described in Section 2.2. Hence, we introduce the distribution function

f = f (x, c, t) : R^{2} \times [0, 1] \times R_{+} \to R_{+}

, such that

f (x, c, t) d x

represents the fraction of agents/particles in

[x_{1}, x_{1} + d x_{1}) \times [x_{2}, x_{2} + d x_{2})

characterized by a feature

c \in [0, 1]

at time

t \geq 0

. The evolution of f, whose interaction follows the binary scheme (16), is provided by the following Boltzmann-type equation:

\begin{matrix} \frac{d}{d t} \int_{0}^{1} \int_{R^{2}} φ (x, c) f (x, c, t) d x d c = \\ ⟨\int_{{[0, 1]}^{2}} \int_{R^{4}} (φ (x^{'}, c) - φ (x, c)) f (x, c, t) f (x_{*}, c_{*}, t) d x d x_{*} d c d c_{*}⟩, \end{matrix}

(17)

Hence, since the feature is not evolving in time, we can proceed as in Section 2.2 to derive in the quasi-invariant limit for

ϵ \to 0^{+}

the corresponding Fokker–Planck-type PDE

\begin{matrix} \partial_{t} f (x, c, t) = \nabla_{x} \cdot [Ξ_{Δ_{1}, Δ_{2}} [f] (x, c, t) f (x, c, t) + σ^{2} D (c) \nabla_{x} f (x, c, t)] \end{matrix}

(18)

where

Ξ_{Δ_{1}, Δ_{2}} [f] (x, c, t) = \int_{0}^{1} \int_{R^{2}} P_{Δ_{1}, Δ_{2}} (x, x_{*}, c, c_{*}) (x - x_{*}) f (x_{*}, c_{*}, t) d x_{*} d c_{*} .

3. Evaluation Metrics and Parameter Estimation

In this section, we present a classical Direct Simulation Monte Carlo (DSMC) method to numerically approximate the evolution of (17) as a quasi-invariant approximation of the Fokker–Planck Equation (18). The resulting numerical algorithm is fundamental to estimate consistent parameters from MRI images. To this end, we present several loss metrics with the aim to compare the result of our model-based approach with existing methods for biomedical image segmentation. In this work, we focus exclusively on binary metrics. For evaluation of segmentation with multiple labels, we direct the reader to [45] for a detailed presentation of various metrics.

3.1. DSMC Algorithm for Image Segmentation

The numerical approximation of Boltzmann-type equations has been deeply investigated in recent decades; see, e.g., [51,52]. The approximation of this class of equations is particularly challenging due to the curse of dimensionality brought up by the multidimensional integral of the collision operator, and the presence of multiple scales. Furthermore, the preservation of relevant physical quantities is essential for a correct description of the underlying physical problem [53].

In view of its computational efficiency, in the following, we will adopt a DSMC approach. Indeed, the computational cost of this method is

O (N)

, where N represents the number of particles. Next, we describe the DSMC method based on a Nanbu–Babovsky scheme [52]. We begin by randomly selecting

N / 2

pairs of particles and making them evolve following the binary scheme presented in (7). We consider a time interval

[0, T]

, which we divide into

N_{t}

intervals of size

Δ t > 0

. The DSMC approach for the introduced kinetic equation is based on a first-order forward time discretization. In the following, we will always consider the case

Δ t = ϵ > 0

such that all the particles are going to interact; see [52] for more details. We introduce the stochastic rounding of a positive real number x as

Sround (x) = \begin{cases} ⌊ x ⌋ + 1 & \text{with probability } x - ⌊ x ⌋ \\ ⌊ x ⌋ & \text{with probability } 1 - x + ⌊ x ⌋ \end{cases}

(19)

where

⌊ x ⌋

is the integer part of x. The random variable

η

is sampled from a 2D Gaussian distribution centered at zero and a diagonal covariance matrix.
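The stochastic rounding (19) is straightforward to implement. The sketch below is one possible version; the function name `sround` and the optional `rng` argument are choices made here for illustration.

```python
import math
import random

def sround(x, rng=random):
    """Stochastic rounding (19): return floor(x) + 1 with probability
    x - floor(x), and floor(x) otherwise."""
    lower = math.floor(x)
    frac = x - lower                 # fractional part, in [0, 1)
    return lower + 1 if rng.random() < frac else lower
```

By construction the expected value of `sround(x)` equals x, so the number of interacting pairs is correct on average even when N / 2 is not an integer.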

3.2. Generation of Model-Oriented Segmentation Masks

In this section, we present the procedure to estimate the segmentation masks of brain tumor images. The procedure described in this section closely follows the methodology presented in [22]. For a given image, we define the feature’s values in relation to the gray level of each pixel. In more detail, for a given pixel

i \in {1, \dots, N}

, we define

c_{i} = \frac{C_{i} - {min}_{i = 1, \dots, N} C_{i}}{{max}_{i = 1, \dots, N} C_{i} - {min}_{i = 1, \dots, N} C_{i}} \in [0, 1],

C_{i}

,

i = 1, \dots, N

being the gray value of the original image. Therefore, the value

c_{i} = 1

represents a white pixel and

c_{i} = 0

represents a black pixel.
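The min-max rescaling of the gray values can be sketched in a few lines. The handling of a constant image, for which the formula is undefined, is an assumption added here and is not discussed in the text.

```python
import numpy as np

def normalize_gray(image):
    """Rescale raw gray values C_i to features c_i in [0, 1] (min-max)."""
    c = np.asarray(image, dtype=float)
    c_min, c_max = c.min(), c.max()
    if c_max == c_min:
        # Constant image: the ratio is 0/0. Returning zeros is a choice
        # made for this sketch only.
        return np.zeros_like(c)
    return (c - c_min) / (c_max - c_min)
```

For instance, raw values 10, 20, 30, 50 are mapped to 0, 0.25, 0.5, 1.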

In particular, for this work, we used the brain tumor dataset that consists of 3D multi-parametric MRI of patients affected by glioblastoma or lower-grade glioma, publicly available in the context of the Brain Tumor Image Segmentation Challenge http://medicaldecathlon.com/ (accessed on 28 November 2024). The acquisition sequences include

T_{1}

-weighted, post-Gadolinium contrast

T_{1}

-weighted,

T_{2}

-weighted, and

T_{2}

Fluid-Attenuated Inversion Recovery volumes. Each MRI scan is accompanied by a corresponding ground truth segmentation mask, which is a binary image where anatomical regions of interest are highlighted as white pixels while all other areas are represented as black pixels. These ground truth segmentation masks were manually delineated by experienced radiologists and specifically identify three structures: “tumor core”, “enhancing tumor”, and “whole tumor”. We evaluate the performance of the DSMC algorithm for two different segmentation tasks: “tumor core” and “whole tumor” annotations. For the first task, we use a single slice in the axial plane of the post-Gadolinium contrast

T_{1}

-weighted scans, while, for the second task, we use a single slice in the axial plane of the

T_{2}

-weighted scans. The procedure to generate the segmentation masks is as follows:

1. We begin by associating each pixel with a position vector $(x_{i}, y_{i})$ and with a static feature $c_{i}$. We scale the position vector to the domain $[- 1, 1] \times [- 1, 1]$ and the static feature to $[0, 1]$.
2. We apply a DSMC approach as described in Algorithm 1 to numerically approximate the large-time solution of the Boltzmann-type model defined in (17). This approach enables pixels to aggregate into clusters based on their Euclidean distance and gray color level.
3. The segmentation masks are generated by assigning to the original position of each pixel the mean value of the cluster it belongs to. Thus, we generate a multi-level mask composed of a number of homogeneous regions.
4. Finally, we obtain the binary mask by defining a threshold $\tilde{c}$ such that

$c_{i} = \begin{cases} 1 & \text{if } c \geq \tilde{c} \\ 0 & \text{if } c < \tilde{c} \end{cases}$

(20)

For all the following experiments, $\tilde{c}$ is defined as the 10th percentile of pixels in the image that belong to the region of interest. This percentile was chosen as an optimal value for brain tumor images; however, it could also be considered as a parameter to be optimized within the process outlined in the section on parameter optimization.
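The binarization rule (20) and the percentile-based threshold can be sketched as follows. The helper `percentile_threshold` encodes one possible reading of the threshold choice, namely the q-th percentile of the gray levels of the pixels flagged by a region-of-interest mask; this reading, like the function names, is an assumption and not a statement of the original implementation.

```python
import numpy as np

def binarize(multi_level_mask, c_tilde):
    """Rule (20): 1 where the cluster-mean gray level is >= c_tilde, else 0."""
    values = np.asarray(multi_level_mask, dtype=float)
    return (values >= c_tilde).astype(np.uint8)

def percentile_threshold(image, roi_mask, q=10.0):
    """ASSUMED reading of the threshold: q-th percentile of the gray
    levels of the pixels selected by the boolean array roi_mask."""
    values = np.asarray(image, dtype=float)[np.asarray(roi_mask, dtype=bool)]
    return float(np.percentile(values, q))
```

Treating q as an argument reflects the remark above that the percentile could itself be optimized.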

Algorithm 1 DSMC algorithm for Boltzmann equation

1: Given N particles $(x_{n}^{0}, c_{n}^{0})$, with $n = 1, \dots, N$, computed from the initial distribution $f_{0} (x, c)$;
2: for $t = 1$ to $N_{t}$ do
3: set $n_{p} = Sround (N / 2)$;
4: sample $n_{p}$ pairs $(i, j)$ uniformly without repetition among all possible pairs of particles at time step $t$;
5: for each pair $(i, j)$, sample $η$, $η_{*}$;
6: for each pair $(i, j)$, compute the data change

$\begin{matrix} Δ x_{i}^{t} = & ϵ P_{Δ_{1}, Δ_{2}} (x_{i}^{t}, x_{j}^{t}, c_{i}^{0}, c_{j}^{0}) (x_{j}^{t} - x_{i}^{t}) + \sqrt{2 σ^{2} D (c_{i}^{0})} η \\ Δ x_{j}^{t} = & ϵ P_{Δ_{1}, Δ_{2}} (x_{j}^{t}, x_{i}^{t}, c_{j}^{0}, c_{i}^{0}) (x_{i}^{t} - x_{j}^{t}) + \sqrt{2 σ^{2} D (c_{j}^{0})} η_{*} \end{matrix}$

(21)

and compute

$x_{i, j}^{t + 1} = x_{i, j}^{t} + Δ x_{i, j}^{t}$

(22)
7: end for
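Algorithm 1 can be turned into a short vectorized routine. The following is a sketch under stated assumptions rather than the authors' implementation: the diffusion function D(c) = c(1 - c) is a hypothetical choice that merely satisfies D(0) = D(1) = 0 with a maximum at the center, N is taken to be even so that Sround(N / 2) = N / 2, and the noise is drawn with variance ϵ per component.

```python
import numpy as np

def diffusion(c):
    # Hypothetical D(c): zero at c = 0 and c = 1, maximal at c = 1/2.
    return c * (1.0 - c)

def dsmc(x0, c, delta1, delta2, sigma2, eps=0.01, n_steps=100, seed=0):
    """Sketch of Algorithm 1 for positions x0 (N, 2) and static features c (N,)."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)                 # copy, x0 is left untouched
    c = np.asarray(c, dtype=float)
    n = x.shape[0]
    n_pairs = n // 2                              # assumes N even (no Sround)
    for _ in range(n_steps):
        perm = rng.permutation(n)                 # disjoint random pairs (i, j)
        i, j = perm[:n_pairs], perm[n_pairs:2 * n_pairs]
        close = np.linalg.norm(x[i] - x[j], axis=1) <= delta1
        similar = np.abs(c[i] - c[j]) <= delta2
        p = (close & similar).astype(float)[:, None]   # P_{Delta1,Delta2}
        # eta, eta_*: centered Gaussians, variance eps per component (assumed)
        eta_i = np.sqrt(eps) * rng.standard_normal((n_pairs, 2))
        eta_j = np.sqrt(eps) * rng.standard_normal((n_pairs, 2))
        amp_i = np.sqrt(2.0 * sigma2 * diffusion(c[i]))[:, None]
        amp_j = np.sqrt(2.0 * sigma2 * diffusion(c[j]))[:, None]
        dx_i = eps * p * (x[j] - x[i]) + amp_i * eta_i      # first line of (21)
        dx_j = eps * p * (x[i] - x[j]) + amp_j * eta_j      # second line of (21)
        x[i] += dx_i                                        # update (22)
        x[j] += dx_j
    return x
```

Each time step touches every particle at most once, which is where the linear cost in N comes from. The features c are never modified, consistent with the last two lines of (16).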

Following this procedure, we apply two morphological refinement steps to remove small regions that have been misclassified as foreground parts and to fill small regions that have been incorrectly categorized as background pixels. We begin by labeling all the connected pixels in the foreground and reassigning to the background those whose number of pixels is less than a certain threshold. Then, we repeat the same procedure but for the pixels in the background. To this end, we use the scikit-image Python library that detects distinct objects of a binary image [54]. This enables us to obtain more precise segmentation masks by reducing small imperfections. This entire process is illustrated in Figure 4.

Figure 4. Summary of the segmentation process. The first image shows the input image. By means of Algorithm 1, we generate the multi-level mask where we reassign each pixel’s gray level to the mean value of the cluster it is assigned to. The binary mask is produced as a result of the binarization process. The final mask is the result after the two morphological refinement steps have been applied.
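The two morphological refinement steps can be illustrated without any imaging library. The sketch below replaces the library labeling routine with a small breadth-first search over 4-connected neighbors; the function names, the choice of 4-connectivity, and the use of a single size threshold for both steps are assumptions made for illustration.

```python
from collections import deque
import numpy as np

def label_components(mask):
    """Label 4-connected components of True pixels in a 2D boolean array.
    Returns (labels, sizes); labels start at 1, 0 marks unlabeled pixels."""
    mask = np.asarray(mask, dtype=bool)
    labels = np.zeros(mask.shape, dtype=int)
    sizes, current = {}, 0
    for start in zip(*np.nonzero(mask)):
        if labels[start]:
            continue                      # already part of a component
        current += 1
        labels[start] = current
        queue, size = deque([start]), 0
        while queue:
            r, col = queue.popleft()
            size += 1
            for nr, nc in ((r + 1, col), (r - 1, col), (r, col + 1), (r, col - 1)):
                if (0 <= nr < mask.shape[0] and 0 <= nc < mask.shape[1]
                        and mask[nr, nc] and not labels[nr, nc]):
                    labels[nr, nc] = current
                    queue.append((nr, nc))
        sizes[current] = size
    return labels, sizes

def refine(mask, min_size):
    """Remove foreground components smaller than min_size, then fill
    background components (holes) smaller than min_size."""
    out = np.asarray(mask, dtype=bool).copy()
    for value in (True, False):           # foreground first, then background
        labels, sizes = label_components(out == value)
        for lab, size in sizes.items():
            if size < min_size:
                out[labels == lab] = not value
    return out
```

On a 4 × 4 foreground square with a one-pixel hole and an isolated one-pixel speck elsewhere, `refine(mask, 3)` fills the hole and removes the speck.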

Parameter Optimization

In this section, we outline the procedure for optimizing the parameters

Δ_{1} > 0

,

Δ_{2} > 0

, and

σ^{2} > 0

that best approximate the ground truth segmentation masks. The goal is to identify the parameter configuration that minimizes the discrepancy between the computed and ground truth masks, measured through a predefined loss metric. To achieve this, we solve the following minimization problem:

\min_{Δ_{1}, Δ_{2}, σ^{2} > 0} \mathrm{Loss} (S_{g}, S_{t}) = \min_{Δ_{1}, Δ_{2}, σ^{2} > 0} (1 - \mathrm{Metric} (S_{g}, S_{t}))

(23)

where

S_{g}

is the ground truth segmentation mask and

S_{t}

is the segmentation mask computed by the model. The different loss metrics quantify the discrepancy between the masks, with lower values indicating greater similarity. Accordingly, the metric function, detailed in Section 3.3, measures the similarity between the two masks, with higher values indicating better agreement. The relationship

\mathrm{loss} = 1 - \mathrm{metric}

is satisfied when the metric is defined to take a value of 1 for perfect agreement and 0 for complete mismatch.

To solve the optimization problem (23), we used the Hyperopt package [55]. This optimization method randomly samples the parameter configurations from predefined distributions and selects the configuration that minimizes the

\mathrm{loss}

metric. This sampling process is repeated for a predefined number of iterations. In this work, we sample the values of our parameters from the following distributions:

\begin{matrix} Δ_{1} \sim U (Δ x, 0.7) \\ Δ_{2} \sim U (0.05, 0.3) \\ σ^{2} \sim \text{log-uniform} (e^{- 5}, 1) \end{matrix}

(24)

where

Δ x

represents the distance between the initial positions of the pixels at

t = 0

. We perform 300 iterations of the optimization process. To ensure reproducibility and correctly compare the different results obtained, the random seed for parameter sampling is fixed.
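The random sampling strategy described above can be mimicked with a plain NumPy random search. This is a dependency-free stand-in and not the Hyperopt interface: `segment` is a hypothetical callable that maps the three parameters to a binary mask, and the Volumetric Dice coefficient of Section 3.3.1 is used as the illustrative metric, with the value 1 for two empty masks adopted as a convention.

```python
import numpy as np

def dice(s_g, s_t):
    """Volumetric Dice (25) between two binary masks."""
    s_g, s_t = np.asarray(s_g, dtype=bool), np.asarray(s_t, dtype=bool)
    total = s_g.sum() + s_t.sum()
    if total == 0:
        return 1.0                        # convention for two empty masks
    return 2.0 * np.logical_and(s_g, s_t).sum() / total

def sample_parameters(rng, dx):
    """One draw from the distributions in (24)."""
    return {
        "delta1": float(rng.uniform(dx, 0.7)),
        "delta2": float(rng.uniform(0.05, 0.3)),
        "sigma2": float(np.exp(rng.uniform(-5.0, 0.0))),  # log-uniform on [e^-5, 1]
    }

def random_search(segment, s_g, dx, n_iter=300, seed=0):
    """Keep the draw that minimizes Loss = 1 - Metric, as in (23).
    A fixed seed makes the sequence of draws reproducible."""
    rng = np.random.default_rng(seed)
    best_loss, best_params = np.inf, None
    for _ in range(n_iter):
        params = sample_parameters(rng, dx)
        loss = 1.0 - dice(s_g, segment(**params))
        if loss < best_loss:
            best_loss, best_params = loss, params
    return best_params, best_loss
```

In practice `segment` would wrap the full pipeline (DSMC evolution, multi-level mask, binarization, and refinement), so each iteration costs one complete segmentation.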

3.3. Segmentation Metrics

Next, we introduce the principal optimization metrics used for evaluating a binary segmentation mask. We define

{S_{g}^{0}, S_{g}^{1}, S_{t}^{0}, S_{t}^{1}}

, where

S_{g}^{0}

and

S_{g}^{1}

represent the sets of pixels that belong to the background and foreground of the ground truth segmentation mask, respectively. The same applies for

S_{t}^{0}, S_{t}^{1}

but for the binary mask we want to evaluate. One could also wish to assess the validity of a segmentation mask with multiple labels; we refer to [45] for an introduction to the subject. Figure 5 presents a summary of the key terms used in the definitions of metrics.

Figure 5. Representation of the relevant areas between the predicted

S_{t}^{1}

and ground truth

S_{g}^{1}

segmentation masks.

B_{t}^{τ}

and

B_{g}^{τ}

represent the corresponding boundaries with a

τ

threshold. (a) Intersection area or true positive (TP). (b) Union area. (c) False positive (FP). (d) False negative (FN). (e) Intersection of boundaries at

τ = 0

. (f) Intersection of boundaries at

τ > 0

.

3.3.1. Volumetric and Surface Dice Indexes

The Volumetric Dice Index, also known as the Standard Volumetric Dice Similarity Coefficient, first introduced in [42], is the most used metric when evaluating volumetric segmentation masks. It is defined as follows:

DICE = \frac{2 | S_{g}^{1} \cap S_{t}^{1} |}{| S_{g}^{1} | + | S_{t}^{1} |}

(25)

where

| \cdot |

indicates the total number of pixels of the considered region. This metric is equal to one if there is a perfect overlap between the two segmentation masks and null if both segmentation masks are completely disjoint. Since the Volumetric Dice Coefficient is the most commonly used metric, especially in the biomedical field, the results are highly interpretable and can be compared with those obtained in other studies. However, when assessing surface segmentation masks, the Volumetric Dice Coefficient can yield suboptimal results. This limitation arises because the Volumetric Dice Coefficient evaluates the similarity between segmentation masks based on pixel overlap without considering the spatial accuracy of the boundaries. Specifically, it treats all pixel displacements equally without considering how far a segmentation error might be from the true boundary of the object. This means that segmentation masks with minor errors spread across multiple areas and those with a major error in a single area might receive similar scores. To address this limitation, the Surface Dice Similarity Coefficient was presented in [5] as a metric that can assess the accuracy of segmentation masks by considering the similarity of their boundaries. We define

ζ : I \to \mathbb{R}^{2}

as a parameterization of

\partial S_{i}

, the boundary of the segmentation mask

S_{i}

. The border region

B_{i}^{(τ)}

, which is a region around the boundary

\partial S_{i}

with tolerance

τ

, is defined as

B_{i}^{(τ)} = \{ x \in \mathbb{R}^{2} : \exists \, y \in I \ \text{s.t.} \ \| x - ζ (y) \| \leq τ \}

(26)

where

τ

is a positive real number that defines the maximum allowable distance from the boundary

\partial S_{i}

for a point x to be considered part of the border region

B_{i}^{(τ)}

. The Surface Dice Similarity Coefficient between

S_{t}

and

S_{g}

with tolerance

τ

is defined as

R_{g, t}^{(τ)} = \frac{2 |B_{g}^{(τ)} \cap B_{t}^{(τ)}|}{|B_{g}^{(τ)}| + |B_{t}^{(τ)}|}

(27)

The coefficient

R_{g, t}^{(τ)}

ranges from 0 to 1. A score of 1 indicates a perfect overlap between the two surfaces, while a score of 0 indicates no overlap. A larger value of

τ

results in a wider border region, making the metric more tolerant to small deviations in the boundary.
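A discrete version of (26) and (27) can be sketched as follows. This is one possible discretization, given under stated assumptions, and not necessarily the one used for the experiments: the boundary is taken as the foreground pixels with at least one 4-neighbour outside the mask, distances are Euclidean distances between pixel indices, and two masks with empty border regions are assigned a score of 1 by convention. All function names are hypothetical.

```python
import numpy as np


def boundary(mask):
    """Foreground pixels with at least one 4-neighbour outside the mask."""
    mask = mask.astype(bool)
    m = np.pad(mask, 1, constant_values=False)
    interior = (m[1:-1, 1:-1] & m[:-2, 1:-1] & m[2:, 1:-1]
                & m[1:-1, :-2] & m[1:-1, 2:])
    return mask & ~interior


def border_region(mask, tau):
    """Pixels within Euclidean distance tau of the boundary, as in (26)."""
    region = np.zeros(mask.shape, dtype=bool)
    b = np.argwhere(boundary(mask))
    if b.size == 0:
        return region
    pix = np.argwhere(np.ones(mask.shape, dtype=bool))
    # Brute-force squared distances; adequate for small images only.
    d2 = ((pix[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    close = pix[d2.min(axis=1) <= tau ** 2]
    region[tuple(close.T)] = True
    return region


def surface_dice(mask_g, mask_t, tau):
    """Surface Dice coefficient (27) between two binary masks."""
    bg, bt = border_region(mask_g, tau), border_region(mask_t, tau)
    denom = bg.sum() + bt.sum()
    if denom == 0:
        return 1.0  # assumed convention when both border regions are empty
    return 2.0 * (bg & bt).sum() / denom


# Toy example: a 16 x 16 square and the same square shifted by one pixel.
g = np.zeros((32, 32), dtype=bool)
g[8:24, 8:24] = True
t = np.roll(g, 1, axis=1)

sd_tau0 = surface_dice(g, t, tau=0.0)
sd_tau1 = surface_dice(g, t, tau=1.0)
print(sd_tau0, sd_tau1)
```

On this toy pair, the score with τ equal to one pixel is larger than the score with τ = 0, which reflects the increased tolerance to small boundary displacements. For full-size images, a distance transform would be preferable to the brute-force distance matrix.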

3.3.2. Jaccard Index

The Jaccard Index (JAC) [43], similar to the Volumetric Dice Coefficient, measures the similarity between two segmentation masks by quantifying the overlap between the computed mask and the ground truth. It is defined as the ratio between the intersection and the union of the foreground’s segmentation masks

JAC = \frac{| S_{g}^{1} \cap S_{t}^{1} |}{| S_{g}^{1} \cup S_{t}^{1} |} .

(28)

The JAC Index and the Volumetric Dice Coefficient are closely related since we have

JAC = \frac{DICE}{2 - DICE}, \qquad DICE = \frac{2 JAC}{1 + JAC} .

(29)

Relation (29) shows that each index is an increasing function of the other. While both are widely used for measuring segmentation similarity, they generally take different values on the same pair of masks. To understand the implications of these differences, we can analyze how their absolute and relative errors are related.
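The definitions (25) and (28), together with the relation (29), can be checked numerically on a toy pair of masks. The sketch below assumes binary NumPy arrays with a non-empty foreground; the function names and the test masks are illustrative only.

```python
import numpy as np


def dice(mask_g, mask_t):
    """Volumetric Dice index (25); assumes at least one foreground pixel."""
    inter = np.logical_and(mask_g, mask_t).sum()
    return 2.0 * inter / (mask_g.sum() + mask_t.sum())


def jaccard(mask_g, mask_t):
    """Jaccard index (28); assumes a non-empty union."""
    inter = np.logical_and(mask_g, mask_t).sum()
    union = np.logical_or(mask_g, mask_t).sum()
    return inter / union


# Toy example: a 16 x 16 square and the same square shifted by one pixel.
g = np.zeros((32, 32), dtype=bool)
g[8:24, 8:24] = True
t = np.roll(g, 1, axis=1)

d, j = dice(g, t), jaccard(g, t)
print(d, j, d / (2.0 - d), 2.0 * j / (1.0 + j))
```

The last two printed values reproduce the Jaccard and Dice indexes, respectively, as stated in (29).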

Definition 1

(Absolute Approximation). A similarity S is absolutely approximated by

\tilde{S}

with error

ϵ \geq 0

if the following holds for all y and

\tilde{y}

:

| S (y, \tilde{y}) - \tilde{S} (y, \tilde{y}) | \leq ϵ

Definition 2

(Relative Approximation). A similarity S is relatively approximated by

\tilde{S}

with error

ϵ \geq 0

if the following holds for all y and

\tilde{y}

:

\frac{\tilde{S} (y, \tilde{y})}{1 + ϵ} \leq S (y, \tilde{y}) \leq \tilde{S} (y, \tilde{y}) \cdot (1 + ϵ) .

The following result holds.

Proposition 1.

JAC and Volumetric Dice approximate each other with a relative error of 1 and an absolute error of

3 - 2 \sqrt{2}

.

We direct the reader to [46] for a deeper comparison between the Jaccard and Volumetric Dice indexes.
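For the reader's convenience, the constants in Proposition 1 can be recovered directly from (29). The following is a short sketch of the argument, with the shorthand J = JAC and D = DICE; it is not a reproduction of the proof in [46].

```latex
% Absolute error: by (29), for J in [0, 1],
D - J = \frac{2J}{1 + J} - J = \frac{J (1 - J)}{1 + J} \geq 0 ,
\qquad
\frac{d}{dJ} \left[ \frac{J (1 - J)}{1 + J} \right] = \frac{1 - 2J - J^{2}}{(1 + J)^{2}} .
% The derivative vanishes at J = \sqrt{2} - 1, where
\max_{J \in [0, 1]} (D - J)
  = \frac{(\sqrt{2} - 1)(2 - \sqrt{2})}{\sqrt{2}}
  = 3 - 2 \sqrt{2} \approx 0.172 .
% Relative error: since 1 \leq 1 + J \leq 2,
J \leq D = \frac{2J}{1 + J} \leq 2J ,
\qquad \text{hence} \qquad
\frac{D}{2} \leq J \leq 2D
\quad \text{and} \quad
\frac{J}{2} \leq D \leq 2J ,
% which is Definition 2 with \epsilon = 1.
```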

3.3.3. $F_{β}$ -Measure

The

F_{β}

-measure is commonly used as an information retrieval metric [41,56]. To define this metric, we first introduce two terms: positive predictive value (PPV) and true positive rate (TPR), which are also known as precision and sensitivity, respectively. The precision metric quantifies the proportion of correctly predicted foreground pixels (true positives, TPs) out of all pixels predicted as foreground (TPs + false positives, FPs). The sensitivity measures the proportion of actual foreground pixels (TPs) correctly identified by the model out of all actual foreground pixels (TPs + false negatives, FNs). These two metrics can be expressed as follows:

\begin{matrix} Precision = PPV = \frac{TP}{TP + FP} \\ Sensitivity = TPR = \frac{TP}{TP + FN} \end{matrix}

(30)

The precision metric indicates how many of the predicted foreground pixels are actually correct. The sensitivity metric, on the other hand, measures how many of the actual foreground pixels were correctly predicted by the model.

We can define the

F_{β}

-measure as a combination of precision and sensitivity, with a parameter

β

that controls the trade-off between these two metrics. Specifically, the

F_{β}

-measure is given by

{FMS}_{β} = \frac{(β^{2} + 1) \cdot PPV \cdot TPR}{β^{2} \cdot PPV + TPR}

(31)

We may observe that, if

β = 1

, we obtain the Volumetric Dice metric.

To understand the impact of

β

in the

F_{β}

-measure, we can substitute the definitions of PPV and TPR into (31), which results in the following

{FMS}_{β} = \frac{(β^{2} + 1) {TP}^{2}}{(β^{2} + 1) {TP}^{2} + TP (β^{2} FN + FP)}

(32)

If

β > 1

, the

F_{β}

-measure emphasizes minimizing false negatives (maximizing sensitivity), which can lead to more false positives (lower precision). If

β < 1

, the

F_{β}

-measure focuses on minimizing false positives (maximizing precision), potentially increasing the number of false negatives (lower sensitivity).

Furthermore, it can be noticed that

\lim_{β \to \infty} \frac{(β^{2} + 1) {TP}^{2}}{(β^{2} + 1) {TP}^{2} + TP (β^{2} FN + FP)} = Sensitivity = \frac{TP}{TP + FN}

(33)

since, for

β \gg 1

, the contribution of the false positives becomes negligible compared with that of the false negatives, and we recover the TPR metric defined in (30).

In summary, thanks to the

β

parameter, the

F_{β}

-measure offers a flexible way to evaluate segmentation models by enabling a tunable balance between precision and sensitivity. It provides a useful metric when dealing with class imbalances, especially in the field of medical imaging, where the relative importance of false positives and false negatives can vary according to each segmentation task.
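The formulas (30) to (33) can be verified on a small numerical example. In the sketch below, the pixel counts are hypothetical and the function names are introduced only for illustration.

```python
def precision(tp, fp):
    """Positive predictive value, first line of (30)."""
    return tp / (tp + fp)


def sensitivity(tp, fn):
    """True positive rate, second line of (30)."""
    return tp / (tp + fn)


def f_beta(tp, fp, fn, beta):
    """F_beta-measure written in terms of PPV and TPR, as in (31)."""
    ppv, tpr = precision(tp, fp), sensitivity(tp, fn)
    b2 = beta ** 2
    return (b2 + 1) * ppv * tpr / (b2 * ppv + tpr)


def f_beta_counts(tp, fp, fn, beta):
    """The same measure written in terms of pixel counts, as in (32)."""
    b2 = beta ** 2
    return (b2 + 1) * tp ** 2 / ((b2 + 1) * tp ** 2 + tp * (b2 * fn + fp))


# Hypothetical pixel counts, chosen only for illustration.
tp, fp, fn = 900, 50, 150

for beta in (0.25, 0.5, 1.0, 1.5):
    print(beta, f_beta(tp, fp, fn, beta), f_beta_counts(tp, fp, fn, beta))

dice_value = 2 * tp / (2 * tp + fp + fn)  # Dice index (25) from the same counts
print(dice_value, f_beta(tp, fp, fn, 1.0))
print(sensitivity(tp, fn), f_beta(tp, fp, fn, 1e3))
```

The two implementations agree for every β, the case β = 1 coincides with the Dice index computed from the same counts, and a very large β brings the measure close to the sensitivity, in line with (33).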

4. Numerical Results

4.1. Impact of Different Diffusion Functions

In this section, we study the impact of choosing different diffusion functions

D (c)

in images consisting of a blurry background and a geometric shape in the center, as shown in Figure 6. The objective is to detect the shape of the geometric figure and to compare how the choice of different diffusion functions affects the value of the model parameters,

Δ_{1}, Δ_{2}

, and

σ^{2}

, where the optimization process is identical to the one introduced in the section on parameter optimization. To this end, we chose the following diffusion functions:

\begin{array}{ll} D_{1} (c) = c (1 - c), & D_{2} (c) = 4 c^{2} (1 - c)^{2}, \\ D_{3} (c) = \begin{cases} \frac{c}{2} & \text{if } c \leq 0.5 \\ \frac{c}{2} (1 - c) & \text{if } c > 0.5 \end{cases} & D_{4} (c) = 64 c^{4} (1 - c)^{4} . \end{array}

(34)
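The four diffusion functions in (34) can be written down and compared on a grid of feature values. In the sketch below, the function names are hypothetical and the second branch of the third function is implemented exactly as it is written in (34).

```python
import numpy as np


def d1(c):
    return c * (1 - c)


def d2(c):
    return 4 * c ** 2 * (1 - c) ** 2


def d3(c):
    # Piecewise definition, transcribed from (34).
    return np.where(c <= 0.5, c / 2, c / 2 * (1 - c))


def d4(c):
    return 64 * c ** 4 * (1 - c) ** 4


c = np.linspace(0.0, 1.0, 1001)
tol = 1e-12  # guard against round-off at points where two functions coincide

print("D1 >= D3 on the grid:", bool(np.all(d1(c) >= d3(c) - tol)))
print("D2 >= D4 on the grid:", bool(np.all(d2(c) >= d4(c) - tol)))
print("values at the endpoints:", d1(c[[0, -1]]), d2(c[[0, -1]]), d4(c[[0, -1]]))
```

The first two functions and the fourth vanish at c = 0 and c = 1, so that no diffusion acts at the extreme feature values.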

Figure 6. Images used to test different diffusion functions. The first column displays the original images, the second column presents the expected segmentation mask, and the third column shows the resulting binary mask. Each picture consists of

(256, 256)

pixels. For the optimization procedure, we set

T = 200

and

Δ t = 0.1

. We set the number of iterations to 50. Row (a) shows the image with a square on a blurry background, while row (b) displays a similar image but with a circle. Only one resulting binary mask is reported for each image because all the tests described in this section yield the same segmentation mask.

We direct the reader to Figure 7 for a summary of the various introduced diffusion functions in (34).

Figure 7. Diffusion functions defined in (34) to assess the variability related to a given feature’s level.

For both the square and circle images, the Surface Dice Coefficient was used to optimize the parameters with a tolerance equal to the length of 1 pixel. Both images have a shape of

(256, 256)

pixels. The final time was set to

T = 200

with

Δ t = 0.1

. The resulting binary mask was the same for all choices of diffusion functions, obtaining the same loss function value. The results are shown in Figure 6. In the case of the square in Figure 6a, we can see from Table 1 that, for

D_{1} (c)

and

D_{3} (c)

, the values of

Δ_{1}

do not differ greatly. In the case of

Δ_{2}

, we obtain a slightly smaller value for

D_{1} (c)

compared to the one obtained for

D_{3} (c)

and a larger value of the parameter

σ^{2} > 0

for

D_{3} (c)

compared to the one obtained for

D_{1} (c)

. If we look at Figure 7, we notice that

D_{1} (c) \geq D_{3} (c)

. Therefore, a larger value of the diffusion functions is balanced by a smaller value of

σ^{2}

to obtain a similar diffusion effect. This holds also for

D_{1} (c)

and

D_{3} (c)

for the circle in Figure 6b. Furthermore, comparing

D_{2} (c)

and

D_{4} (c)

for the square image, we can see that the resulting parameters are smaller for

D_{2} (c)

compared with those obtained with

D_{4} (c)

. This is consistent because, again, we can see from Figure 7 that

D_{2} (c) \geq D_{4} (c)

. If we now compare

D_{2} (c)

and

D_{4} (c)

for the circle image, we can see that the value of

σ^{2}

is similar in this case. Nevertheless, in this case, the difference is provided by the values of

Δ_{1}

and

Δ_{2}

, which are both smaller for

D_{2} (c)

. This indicates that, for different diffusion functions, the optimal parameters adjust to yield similar results. The most straightforward compensation is to obtain similar values of

Δ_{1}

and

Δ_{2}

and a lower value of

σ^{2}

for the diffusion function that has a higher value, as in the case of the square image. However, the example of the circle image shows us that we can also obtain different combinations of parameters so as to counter the effect of a larger diffusion function.

Table 1. Parameters obtained for different diffusion functions for the square and circle images. The loss metric used to obtain these parameters was the Surface Dice Coefficient with a tolerance equal to the length of 1 pixel.

Table 2 reports the parameters obtained by optimizing three different metrics, using as a diffusion function

D_{1} (c)

for the square image. For all the cases, the resulting Surface Dice Coefficient was equal to one, indicating perfect overlap between the computed and ground truth segmentation masks. The resulting binary masks were the same for the three metrics and are equivalent to the ones shown in Figure 6. For the Volumetric and Surface Dice coefficients, the parameters obtained were identical. For the Jaccard Index, instead, the resulting parameters were smaller. The loss is zero for both the Jaccard Index and the Volumetric Dice Coefficient, consistent with the relationship in (29).

Table 2. Parameters obtained for the square image by optimizing the Jaccard Index and the Volumetric and Surface Dice coefficients. For the Surface Dice Coefficient, the tolerance was set to the length of 1 pixel. The loss obtained was zero for the three cases.

4.2. Determining the Final Time

In this section, we specify the criteria that we implemented to determine the final time

T > 0

. As described in Section 3.1, we approximate the solution of (18) through a DSMC approach, even though we have no analytical insight into the form of the steady state. The objective is to find the values of the final time

T > 0

such that a numerical steady state can be defined. We stress that the time taken to reach the equilibrium state for different initial conditions is not the same, so we need to determine the time parameters for all the images we want to analyze. To this end, if

f^{n} (x, c)

is the approximation of the density at time

t^{n} = n Δ t

, we define

\mathcal{T} = \int_{\mathbb{R}^{2} \times [0, 1]} | f^{n + 1} (x, c) - f^{n} (x, c) | \, d x \, d c,

(35)

which represents an index of variation between two successive time steps of the reconstructed kinetic density. As the solution evolves, this quantity decreases and tends to zero as the equilibrium state is reached, as illustrated in Figure 8 for the case of the square image with a blurry background. Hence, we may introduce a stopping criterion based on the condition

\mathcal{T} < δ

for some

δ > 0

. When this condition is satisfied, the reconstructed density is considered to be an approximation of the steady state.

Figure 8. Evolution of

\mathcal{T}

, where the kinetic density is that considered in Figure 6a. The image consists of

(256, 256)

pixels. We can observe how

\mathcal{T}

decreases until condition

\mathcal{T} < δ

is reached with

δ = 0.005

.

The same procedure was conducted for all images presented in this work so as to fulfill the condition presented in this section.
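The stopping rule based on the index of variation in (35) can be sketched as follows. The sketch rests on several assumptions that are not taken from this work: the density is reconstructed with a normalized histogram on the unit cube, the integral is replaced by a sum over the histogram bins, and the update rule is a deterministic toy relaxation of the features toward their mean, used in place of the DSMC scheme. All names are hypothetical.

```python
import numpy as np


def density(x, c, bins=(8, 8, 10)):
    """Normalized histogram approximating the kinetic density f(x, c)."""
    sample = np.column_stack([x, c])
    hist, _ = np.histogramdd(sample, bins=bins,
                             range=[(0, 1), (0, 1), (0, 1)])
    return hist / hist.sum()


def variation(f_new, f_old):
    """Discrete counterpart of the integral in (35)."""
    return np.abs(f_new - f_old).sum()


def run_until_steady(x, c, step, delta=0.005, max_steps=10_000):
    """Iterate `step` until the variation index falls below delta."""
    f_old = density(x, c)
    value = np.inf
    for n in range(1, max_steps + 1):
        x, c = step(x, c)
        f_new = density(x, c)
        value = variation(f_new, f_old)
        if value < delta:
            return n, value
        f_old = f_new
    return max_steps, value


def toy_step(x, c, dt=0.1):
    """Toy update (NOT the DSMC scheme): features relax toward their mean."""
    return x, c + dt * (c.mean() - c)


rng = np.random.default_rng(0)
x0 = rng.random((5000, 2))   # particle positions in the unit square
c0 = rng.random(5000) ** 2   # initial features in [0, 1]

n_steps, final_value = run_until_steady(x0, c0, toy_step)
print(n_steps, final_value)
```

The threshold 0.005 matches the value of δ reported in Figure 8. With a stochastic scheme, the histogram difference between successive steps also contains sampling noise, so the number of particles and the bin width would need to be chosen with the threshold in mind.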

4.3. Optimization Metrics for Biomedical Image Segmentation

In this section, we study the impact that the different optimization metrics have on the resulting binary masks for the core and whole tumor. We also analyze the parameters obtained for the different optimization metrics. Both brain tumor images consist of

N = (240, 240)

pixels, and, for the optimization procedure, we set

T = 300

and

Δ t = 0.01

for both the core and whole tumor for all the optimization metrics addressed in this section. For each segmentation mask generated, we evaluated 300 different combinations of parameters. Figure 9 shows the segmentation masks obtained for both the whole and core tumor by optimizing the Jaccard Index and the Volumetric Dice Coefficient. In Table 3, the resulting parameters and the loss obtained for both optimization metrics are presented; in this case, the value reported as the loss is the similarity score itself, equal to 1 for a perfect overlap and 0 if the masks are totally disjoint. First, we can observe that the loss values obtained with both metrics satisfy (29), as expected. Consistently with (29), the loss obtained is greater for the Volumetric Dice Coefficient for both segmentation masks. Furthermore, the parameter

Δ_{1}

obtained with both optimization metrics is similar for both the core and whole tumor. Nevertheless, we can see that, for the whole tumor, the

Δ_{2}

parameter obtained with the Jaccard Index is larger than the one obtained with the Volumetric Dice Coefficient. For the case of the core tumor instead, the

Δ_{2}

parameter is larger for the Volumetric Dice Coefficient. If we compare this to the values obtained for

σ^{2}

in both cases for both metrics, we can see that a larger diffusion value is countered by a smaller value of

Δ_{2}

so as to obtain similar segmentation masks, as demonstrated in Figure 9.

Figure 9. Segmentation masks obtained by optimizing the Jaccard Index and the Volumetric Dice Coefficient. Panel (a) shows the results for the core tumor and panel (b) shows the results for the whole tumor. Both images consist of

240 \times 240

pixels. For the optimization procedure, we set

T = 300

and

Δ t = 0.01

. In both cases, we considered 300 iterations of the optimization algorithm. In both cases, the loss reported by the Jaccard Index was smaller compared to that obtained with the Volumetric Dice Coefficient. Furthermore, it can be noticed that the losses reported satisfy (29) as expected. From the values of the parameters, we can observe that a larger value of the diffusion is countered by a smaller value of

Δ_{2}

.

Table 3. Parameters obtained for the whole and core tumor using the Volumetric Dice Coefficient, Jaccard Index, and Surface Dice Coefficient. The loss reported is 1 for perfect overlap and 0 for complete deviation.

For the Surface Dice Coefficient, the tolerance

τ

was set to the length of 1 pixel, both when used as the optimization loss and when used as the evaluation metric. Figure 10 shows the resulting binary mask obtained with the Surface Dice Coefficient and the Volumetric Dice Coefficient for the core and whole tumor. In the case of the whole tumor, the loss obtained with the Surface Dice Coefficient is smaller than that obtained with the Jaccard Index and the Volumetric Dice Coefficient. For the core tumor, the loss obtained with the Surface Dice Coefficient is similar to that reported by the Jaccard Index, and both are smaller than that obtained with the Volumetric Dice Coefficient. For the whole tumor, we can see that the resulting parameters are similar for all the optimization metrics. Nevertheless, for the core tumor, we can notice that the parameters obtained with the Surface Dice Coefficient differ compared to the ones obtained with the Jaccard Index and the Volumetric Dice Coefficient. In particular, we obtained a smaller value for

σ^{2}

and slightly larger value for

Δ_{1}

. This indicates that a smaller value for the diffusion of the particles is compensated by enabling the particles to aggregate with others that are slightly more separated than with the Volumetric Dice Coefficient and the Jaccard Index. Given that both the Volumetric Dice Coefficient and the Jaccard Index measure the superposition between two volumes (in this case, two surfaces), they do not represent the proximity between two surfaces, making the Surface Dice Coefficient more suitable as a loss metric when comparing two different surfaces.

Figure 10. Segmentation masks obtained by optimizing the Surface and Volumetric Dice coefficients. Panel (a) shows the results for the core tumor and panel (b) shows the results for the whole tumor. Both images consist of

240 \times 240

pixels. For the optimization procedure, we set

T = 300

and

Δ t = 0.01

. In both cases, we considered 300 iterations of the optimization algorithm. For the Surface Dice Coefficient, we set the tolerance

τ

equal to the length of 1 pixel. Given that both the Volumetric Dice Coefficient and Jaccard Index are a measure of the superposition between the two surfaces and do not account for the proximity between the two surfaces at every given point, the Surface Dice Coefficient represents a more suitable metric when comparing two different surfaces.

For the

F_{β}

-measure, we can see in Figure 11 the binary masks obtained for different values of

β

for the core and whole tumor. For the case of the core tumor, we can observe that, for

β = 0.25

, we obtain areas of misclassified pixels in the tumor region. This can also be seen from Table 4, where the number of false negatives is larger and the number of false positives is smaller compared to the results obtained for larger values of

β

. If we recall (32), we can see that, for low values of

β

, the false negatives are multiplied by a factor of

β^{2}

, thus having a smaller weight compared to the false positives. As we increase the value of

β

, we can notice from both Table 4 and Figure 11 that modifying the value of

β

has no impact on the resulting binary mask. This also holds true for the whole tumor as no difference can be noticed in the results obtained for different values of

β

. Finally, in Figure 12, we see the loss reported for different values of

β

, where a loss equal to 1 represents a perfect overlap. First, it can be noticed that we obtain the highest value of the loss for

β = 0.25

, which would suggest the most accurate result; however, this mask also contains a larger number of false negatives. Again, this can be explained from (32), where low values of

β

reduce the impact of a large number of false negatives on the resulting loss. Secondly, we observe that the loss decreases for larger values of

β

. This behavior arises because, for a fixed mask, the expression in (32) decreases whenever the false negatives outnumber the false positives as we increase

β

, while the resulting segmentation masks remain unchanged, as shown in Table 4. This shows that the

F_{β}

-measure may not be a reliable metric for these types of segmentation masks and this segmentation method, and that modifying the value of

β

provides no advantage.
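The dependence of the measure on β for a fixed segmentation mask can be made explicit with a short computation based on (32). The following sketch assumes TP > 0 and uses the shorthand b = β²; it is added here as a worked step and relies only on the definitions above.

```latex
% Dividing numerator and denominator of (32) by TP > 0 and setting b = \beta^{2}:
{FMS}_{β} = F(b) = \frac{(b + 1) \, TP}{(b + 1) \, TP + b \, FN + FP} .
% Differentiating with respect to b for fixed TP, FP, FN:
\frac{dF}{db}
  = \frac{TP \left[ (b + 1) TP + b \, FN + FP \right] - (b + 1) \, TP \, (TP + FN)}
         {\left[ (b + 1) TP + b \, FN + FP \right]^{2}}
  = \frac{TP \, (FP - FN)}{\left[ (b + 1) TP + b \, FN + FP \right]^{2}} .
% Hence, for a fixed mask, the measure decreases with \beta when FN > FP,
% increases when FN < FP, and is independent of \beta when FN = FP.
```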

Figure 11. Segmentation masks obtained for the

F_{β}

-loss metric. Panel (a) shows the segmentation masks obtained for

β = 0.25, 0.5, 0.75,

and

1.5

for the core tumor and (b) shows the segmentation masks obtained using the same values of

β

for the whole tumor. Both images consist of

240 \times 240

pixels. For the optimization procedure, we set

T = 300

and

Δ t = 0.01

. In both cases, we considered 300 iterations of the optimization algorithm. In (a), we can observe that, for

β = 0.25

, the resulting segmentation masks display areas of misclassified pixels, while, for larger values of

β

, the resulting segmentation mask does not differ. In (b), no zoomed area is shown as the segmentation masks display no visible differences for the different values of

β

. This is also evident in Table 4 by observing the number of false positives (FPs), false negatives (FNs), and true positives (TPs) obtained for both images.

Table 4. Parameters obtained for the

F_{β}

-measure for different values of

β

. The loss reported is 1 for perfect overlap and 0 for complete deviation. The numbers of false positives (FPs), false negatives (FNs), and true positives (TPs) are presented for the resulting segmentation masks for each value of

β

.

Figure 12. Relationship between the

F_{β}

-loss value and the

β

value for both the core and whole tumor images. As

β

increases, the

F_{β}

-loss decreases, showing that, for lower values of

β

, we should obtain a more precise segmentation mask as the loss indicated in this figure is 1 for perfect overlap. Nevertheless, the resulting binary mask is less accurate for lower values of

β

, showing that this is not an appropriate metric for optimizing the consensus-based model.

5. Conclusions

In this paper, we presented a consensus-based kinetic method and demonstrated how this model can be applied to the problem of image segmentation. A pixel in a 2D image is interpreted as a particle that interacts with the others through a consensus-type process, which enables us to identify different clusters and generate an image segmentation. We developed a procedure that enables us to approximate the ground truth segmentation masks of different brain tumor images. Furthermore, we presented and evaluated different optimization metrics and studied their impact on the results obtained. In particular, we found that the Jaccard Index and the Volumetric and Surface Dice coefficients are appropriate metrics to optimize our model. Nevertheless, given that the Surface Dice Coefficient measures the discrepancy between the boundaries of two surfaces, it provides a better representation than the Jaccard Index and the Volumetric Dice Coefficient, which account only for absolute differences and do not capture pointwise differences. Furthermore, we assessed the use of

F_{β}

-loss as a potential optimization metric. We found that both the loss values and corresponding results were difficult to interpret as low loss values often corresponded to low accuracy, making this metric challenging to apply effectively for optimization in this context. Future research will focus on the case of multidimensional features to deal with color images as RGB color models are defined by 3D features specifying red, green, and blue values. As a result, we plan to define a pipeline for learning model parameters depending on these multidimensional characteristics, aiming to enhance accuracy and applicability in real-world scenarios.

Author Contributions

Conceptualization, M.Z. and R.F.C.; methodology, M.Z. and R.F.C.; software, R.F.C. and H.T.; validation, M.Z., R.F.C. and H.T.; formal analysis, M.Z. and H.T.; investigation, R.F.C., H.T. and M.Z.; resources, R.F.C. and H.T.; data curation, R.F.C. and H.T.; writing—original draft preparation, R.F.C., M.Z. and H.T.; visualization, R.F.C. and H.T.; supervision, M.Z.; project administration, M.Z.; funding acquisition, M.Z. All authors have read and agreed to the published version of the manuscript.

Funding

M.Z. is member of GNFM (Gruppo Nazionale di Fisica Matematica) of INdAM, Italy, and acknowledges support of PRIN2022PNRR project No. P2022Z7ZAJ, European Union—NextGenerationEU. M.Z. acknowledges partial support by ICSC—Centro Nazionale di Ricerca in High Performance Computing, Big Data and Quantum Computing, funded by European Union—NextGenerationEU.

Data Availability Statement

All data are publicly available at http://medicaldecathlon.com/ (accessed on 28 November 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

Agosti, A.; Shaqiri, E.; Paoletti, M.; Solazzo, F.; Bergsland, N.; Colelli, G.; Savini, G.; Muzic, S.; Santini, F.; Deligianni, X.; et al. Deep learning for automatic segmentation of thigh and leg muscles. Magn. Reson. Mater. Phys. Biol. Med. 2021, 35, 467–483. [Google Scholar] [CrossRef]
Barbano, R.; Arridge, S.; Jin, B.; Tanno, R. Uncertainty quantification in medical image synthesis. In Biomedical Image Synthesis and Simulation: Methods and Applications; The MICCAI Society book Series, Biomedical Image Synthesis and Simulation; Academic Press: Cambridge, MA, USA, 2022; pp. 601–641. [Google Scholar]
Coupé, P.; Manjón, J.; Fonov, V.; Pruessner, J.; Robles, M.; Collins, D. Patch-based segmentation using expert priors: Application to hippocampus and ventricle segmentation. NeuroImage 2011, 54, 940–954. [Google Scholar] [CrossRef] [PubMed]
Medaglia, A.; Colelli, G.; Farina, L.; Bacila, A.; Bini, P.; Marchioni, E.; Figini, S.; Pichiecchio, A.; Zanella, M. Uncertainty quantification and control of kinetic models of tumour growth under clinical uncertainties. Int. J. Non-Linear Mech. 2022, 141, 103933. [Google Scholar] [CrossRef]
Nikolov, S.; Blackwell, S.; Mendes, R.; Fauw, J.; Meyer, C.; Hughes, C.; Askham, H.; Romera-Paredes, B.; Karthikesalingam, A.; Chu, C.; et al. Deep learning to achieve clinically applicable segmentation of head and neck anatomy for radiotherapy. arXiv 2018, arXiv:1809.04430. [Google Scholar]
Sharma, N.; Aggarwal, L. Automated medical image segmentation techniques. J. Med. Phys. 2010, 35, 3. [Google Scholar] [CrossRef] [PubMed]
Hesamian, M.; Jia, W.; He, X.; Kennedy, P. Deep learning techniques for medical image segmentation: Achievements and challenges. J. Digit. Imaging 2019, 32, 582–596. [Google Scholar] [CrossRef] [PubMed]
Isensee, F.; Jaeger, P.; Kohl, S.; Petersen, J.; Maier-Hein, K. nnU-Net: A self-configuring method for deep learning-based biomedical image segmentation. Nat. Methods 2021, 18, 203–211. [Google Scholar] [CrossRef]
Kwon, Y.; Won, J.; Kim, B.; Paik, M. Uncertainty quantification using Bayesian neural networks in classification: Application to biomedical image segmentation. Comput. Stat. Data Anal. 2020, 142, 106816. [Google Scholar] [CrossRef]
Liu, X.; Song, L.; Liu, S.; Zhang, Y. A review of deep-learning-based medical image segmentation methods. Sustainability 2021, 13, 1224. [Google Scholar] [CrossRef]
Lizzi, F.; Agosti, A.; Brero, F.; Cabini, R.F.; Fantacci, M.E.; Figini, S.; Lascialfari, A.; Laruina, F.; Oliva, P.; Piffer, S.; et al. Quantification of pulmonary involvement in COVID-19 pneumonia by means of a cascade of two U-nets: Training and assessment on multiple datasets using different annotation criteria. Int. J. Comput. Assist. Radiol. Surg. 2022, 17, 229–237. [Google Scholar] [CrossRef] [PubMed]
Lizzi, F.; Postuma, I.; Brero, F.; Cabini, R.F.; Fantacci, M.E.; Lascialfari, A.; Oliva, P.; Rinaldi, L.; Retico, A. Quantification of pulmonary involvement in COVID-19 pneumonia: An upgrade of the LungQuant software for lung CT segmentation. Eur. Phys. J. Plus. 2023, 138, 326. [Google Scholar] [CrossRef]
Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the 18th International Conference, Munich, Germany, 5–9 October 2015; pp. 234–241. [Google Scholar]
Yu, Z.; Au, O.; Zou, R.; Yu, W.; Tian, J. An adaptive unsupervised approach toward pixel clustering and color image segmentation. Pattern Recognit. 2010, 43, 1889–1906. [Google Scholar] [CrossRef]
Zhou, Z.; Rahman Siddiquee, M.; Tajbakhsh, N.; Liang, J. Unet++: A nested u-net architecture for medical image segmentation. In Deep Learning in Medical Image Analysis And Multimodal Learning for Clinical Decision Support; Springer: Cham, Switzerland, 2018; pp. 3–11. [Google Scholar]
Cordier, N.; Delingette, H.; Ayache, N. A patch-based approach for the segmentation of pathologies: Application to glioma labelling. IEEE Trans. Med. Imaging 2015, 35, 1066–1076. [Google Scholar] [CrossRef] [PubMed]
Frigui, H.; Krishnapuram, R. A robust competitive clustering algorithm with applications in computer vision. IEEE Trans. Pattern Anal. Mach. Intell. 1999, 21, 450–465. [Google Scholar] [CrossRef]
Jain, A.; Murty, M.; Flynn, P. Data clustering: A review. ACM Comput. Surv. 1999, 31, 264–323. [Google Scholar] [CrossRef]
Kayal, S. Unsupervised image segmentation using the Deffuant-Weisbuch model from social dynamics. Signal Image Video Process. 2017, 11, 1405–1410. [Google Scholar] [CrossRef]
Pizzagalli, D.; Gonzalez, S.; Krause, R. A trainable clustering algorithm based on shortest paths from density peaks. Sci. Adv. 2019, 5, eaax3770. [Google Scholar] [CrossRef]
Quetti, F.M.; Figini, S.; Ballante, E. A Bayesian Approach to Clustering via the Proper Bayesian Bootstrap: The Bayesian Bagged Clustering (BBC) algorithm. arXiv 2024, arXiv:2409.08954. [Google Scholar]
Cabini, R.; Pichiecchio, A.; Lascialfari, A.; Figini, S.; Zanella, M. A kinetic approach to consensus-based segmentation of biomedical images. Kinet. Relat. Models 2025, 18, 286–311. [Google Scholar] [CrossRef]
Herty, M.; Pareschi, L. Visconti, G. Mean field models for large data–clustering problems. Netw. Heterog. Media. 2020, 15, 463. [Google Scholar] [CrossRef]
Hegselmann, R.; Krause, U. Opinion dynamics and bounded confidence models, analysis, and simulation. J. Artif. Soc. Soc. Simul. 2002, 5, 3. [Google Scholar]
Deffuant, G.; Neau, D.; Amblard, F.; Weisbuch, G. Mixing beliefs among interacting agents. Adv. Complex Syst. 2000, 3, 87–98. [Google Scholar] [CrossRef]
DeGroot, M. Reaching a consensus. J. Am. Stat. Assoc. 1974, 69, 118–121. [Google Scholar] [CrossRef]
French, J., Jr. A formal theory of social power. Psychol. Rev. 1956, 63, 181–194. [Google Scholar] [CrossRef] [PubMed]
Sznajd-Weron, K.; Sznajd, J. Opinion evolution in closed communities. Int. J. Mod. Phys. C 2000, 11, 1157–1165. [Google Scholar] [CrossRef]
Borra, D.; Lorenzi, T. Asymptotic analysis of continuous opinion models under bounded confidence. Commun. Pure Appl. Anal. 2013, 12, 1487–1499. [Google Scholar] [CrossRef]
Castellano, C.; Fortunato, S.; Loreto, V. Statistical physics of social dynamics. Rev. Mod. Phys. 2009, 81, 591–646. [Google Scholar] [CrossRef]
Fagioli, S.; Favre, G. Opinion formation on evolving network: The DPA method applied to a nonlocal cross-diffusion PDE-ODE system. Eur. J. Appl. Math. 2024, 35, 748–775. [Google Scholar] [CrossRef]
Motsch, S.; Tadmor, E. Heterophilious dynamics enhances consensus. SIAM Rev. 2014, 56, 577–621. [Google Scholar] [CrossRef]
Albi, G.; Pareschi, L.; Toscani, G.; Zanella, M. Recent advances in opinion modeling: Control and social influence. In Active Particles Volume 1, Advances in Theory, Models, and Applications; Bellomo, N., Degond, P., Tadmor, E., Eds.; Birkhäuser: Cham, Switzerland, 2017. [Google Scholar]
Carrillo, J.A.; Fornasier, M.; Rosado, J.; Toscani, G. Asymptotic flocking dynamics for the kinetic Cucker-Smale model. SIAM J. Math. Anal. 2010, 42, 218–236. [Google Scholar] [CrossRef]
Düring, B.; Wolfram, M.-T. Opinion dynamics: Inhomogeneous Boltzmann-type equations modelling opinion leadership and political segregation. Proc. R. Soc. Lond. A 2015, 471. [Google Scholar] [CrossRef]
Fagioli, S.; Radici, E. Opinion formation systems via deterministic particles approximation. Kinet. Relat. Models 2021, 14, 45–76. [Google Scholar] [CrossRef]
Pareschi, L.; Toscani, G. Interacting Multiagent Systems: Kinetic Equations and Monte Carlo Methods; Oxford University Press: Oxford, UK, 2013. [Google Scholar]
Pareschi, L.; Tosin, A.; Toscani, G.; Zanella, M. Hydrodynamic models of preference formation in multi-agent societies. J. Nonlin. Sci. 2019, 29, 2761–2796. [Google Scholar] [CrossRef]
Toscani, G. Kinetic models of opinion formation. Commun. Math. Sci. 2006, 4, 481–496. [Google Scholar] [CrossRef]
Auricchio, G.; Codegoni, A.; Gualandi, S.; Toscani, G.; Veneroni, M. On the equivalence between Fourier-based and Wasserstein metrics. Rend. Lincei Mat. Appl. 2020, 31, 627–649. [Google Scholar]
Chinchor, N. MUC-4 Evaluation Metrics. 1992. Available online: https://aclanthology.org/M92-1002.pdf (accessed on 28 November 2024).
Dice, L. Measures of the Amount of Ecologic Association Between Species. Ecology 1945, 26, 297–302. [Google Scholar] [CrossRef]
Jaccard, P. The distribution of the flora in the alpine zone.1. New Phytol. 1912, 11, 37–50. [Google Scholar] [CrossRef]
Mittal, H.; Pandey, A.; Saraswat, M.; Kumar, S.; Pal, R.; Modwel, G. A comprehensive survey of image segmentation: Clustering methods, performance parameters, and benchmark datasets. Multimed. Tools Appl. 2022, 81, 35001–35026. [Google Scholar] [CrossRef] [PubMed]
Taha, A.; Hanbury, A. Metrics for evaluating 3D medical image segmentation: Analysis, selection, and tool. BMC Med. Imaging 2015, 15, 29. [Google Scholar] [CrossRef]
Bertels, J.; Eelbode, T.; Berman, M.; Vandermeulen, D.; Maes, F.; Bisschops, R.; Blaschko, M. Optimizing the Dice Score and Jaccard Index for Medical Image Segmentation: Theory & Practice. arXiv 2019, arXiv:1911.01685. [Google Scholar]
Albi, G.; Pareschi, L.; Zanella, M. On the optimal control of opinion dynamics on evolving networks. In Proceedings of the 27th IFIP TC 7 Conference, CSMO 2015, Sophia Antipolis, France, 29 June–3 July 2015; IFIP Advances in Information and Communication, Technology. Bociu, L., Désidéri, J.A., Habbal, A., Eds.; Springer: Cham, Switzerland, 2016; Volume 494. [Google Scholar]
Nugent, A.; Gomes, S.N.; Wolfram, M.-T. Steering opinion dynamics through control of social networks. arXiv 2024, arXiv:2404.09849. [Google Scholar] [CrossRef]
Carrillo, J.A.; Fornasier, M.; Toscani, G.; Vecil, F. Particle, kinetic, and hydrodynamic models of swarming. In Mathematical Modeling of Collective Behavior in Socio-Economic and Life Sciences; Modeling and Simulation in Science, Engineering and Technology; Naldi, G., Pareschi, L., Toscani, G., Eds.; Birkhäuser: Basel, Switzerland, 2010. [Google Scholar]
Piccoli, B.; Tosin, A.; Zanella, M. Model-based assessment of the impact of driver-assist vehicles using kinetic theory. Z. Angew. Math. Phys. 2020, 71, 152. [Google Scholar] [CrossRef]
Dimarco, G.; Pareschi, L. Numerical methods for kinetic equations. Acta Numer. 2014, 23, 369–520. [Google Scholar] [CrossRef]
Pareschi, L.; Russo, G. An Introduction to Monte Carlo Methods for the Boltzmann Equation. ESAIM Proc. 1999, 10, 35–76. [Google Scholar] [CrossRef]
Pareschi, L.; Zanella, M. Structure preserving schemes for nonlinear Fokker-Planck equations and applications. J. Sci. Comput. 2018, 74, 1575–1600. [Google Scholar] [CrossRef]
Van Der Walt, S.; Schönberger, J.; Nunez-Iglesias, J.; Boulogne, F.; Warner, J.; Yager, N.; Gouillart, E.; Yu, T. Scikit-image: Image processing in python. PeerJ 2014, 2, e453. [Google Scholar] [CrossRef] [PubMed]
Bergstra, J.; Komer, B.; Eliasmith, C.; Yamins, D.; Cox, D. Hyperopt: A Python library for model selection and hyperparameter optimization. Comput. Sci. Discov. 2015, 8, 014008. [Google Scholar] [CrossRef]
Sasaki, Y. The truth of the F-measure. Teach Tutor Mater 2007, 1, 1–5. [Google Scholar]

Figure 1. Large time distribution of the 2D-bounded confidence model for different parameters characterizing the compromise propensity and the diffusion for $N = 10^{5}$ particles in $[0, T]$ with $T = 100$ and $Δ t = 0.01$. In (a), the final state converges to a number of clusters depending on the value of $Δ$. As we reduce the range of interaction, more clusters are created. In rows (b,c), we can see the interplay between the tendency of particles to aggregate and diffuse. In the first column, we see that the steady state converges to a Gaussian distribution with a standard deviation provided by $σ^{2}$. In the second column, for (b,c), we see that the final states differ greatly in their structure. Finally, the last column shows the final states in the case where the diffusion surpasses considerable aggregation tendency.

Figure 2. A schematic representation of the proposed model, where each pixel is interpreted as a particle $(x_{i}, y_{i}, c_{i})$, with $c_{i}$ being a static feature in the interval $[0, 1]$ that represents the grey level.

Figure 3. Representation of the evolution of pixels as they tend to aggregate in different clusters.

Figure 4. Summary of the segmentation process. The first image shows the input image. By means of Algorithm 1, we generate the multi-level mask, where we reassign each pixel’s gray level to the mean value of the cluster it is assigned to. The binary mask is produced as a result of the binarization process. The final mask is the result after the two morphological refinement steps have been applied.

Figure 5. Representation of the relevant areas between the predicted $S_{t}^{1}$ and ground truth $S_{g}^{1}$ segmentation masks. $B_{t}^{τ}$ and $B_{g}^{τ}$ represent the corresponding boundaries with a $τ$ threshold. (a) Intersection area or true positive (TP). (b) Union area. (c) False positive (FP). (d) False negative (FN). (e) Intersection of boundaries at $τ = 0$. (f) Intersection of boundaries at $τ > 0$.
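
The areas listed in Figure 5 can be illustrated with a minimal sketch in which a mask is a set of (row, column) pixel coordinates. Everything in it is an assumption made for illustration rather than the article's implementation: the helper names (`boundary`, `within`, `region_counts`, `surface_dice`), the 4-neighbour definition of a boundary pixel, the use of Euclidean distance for the tolerance $τ$, and the common form of the Surface Dice as the share of boundary pixels lying within $τ$ of the other boundary. Both boundaries are assumed to be non-empty.

```python
def boundary(mask):
    """Pixels of `mask` with at least one 4-neighbour outside the mask."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return {(r, c) for (r, c) in mask
            if any((r + dr, c + dc) not in mask for dr, dc in steps)}


def within(points, targets, tau):
    """Points whose Euclidean distance to some target pixel is <= tau."""
    return {p for p in points
            if any((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 <= tau ** 2
                   for q in targets)}


def region_counts(pred, truth):
    """Sizes of the areas in panels (a)-(d): TP, union, FP, FN."""
    return {"TP": len(pred & truth), "union": len(pred | truth),
            "FP": len(pred - truth), "FN": len(truth - pred)}


def surface_dice(pred, truth, tau):
    """Share of boundary pixels lying within tau of the other boundary."""
    b_pred, b_truth = boundary(pred), boundary(truth)
    matched = (len(within(b_pred, b_truth, tau))
               + len(within(b_truth, b_pred, tau)))
    return matched / (len(b_pred) + len(b_truth))


# Toy example: a 4x4 ground-truth square and a prediction shifted by one column.
truth = {(r, c) for r in range(4) for c in range(4)}
pred = {(r, c + 1) for r in range(4) for c in range(4)}

counts = region_counts(pred, truth)
print(counts)                        # {'TP': 12, 'union': 20, 'FP': 4, 'FN': 4}
print(surface_dice(pred, truth, 0))  # 0.5
print(surface_dice(pred, truth, 1))  # 1.0
```

In this toy case a one-pixel shift halves the boundary agreement at $τ = 0$, while a tolerance of one pixel absorbs it completely, which is the difference between panels (e) and (f).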

Figure 6. Images used to test different diffusion functions. The first column displays the original images, the second column presents the expected segmentation mask, and the third column shows the resulting binary mask. Each picture consists of $(256, 256)$ pixels. For the optimization procedure, we set $T = 200$ and $Δ t = 0.1$. We set the number of iterations to 50. Row (a) shows the image with a square on a blurry background, while row (b) displays a similar image but with a circle. Only one resulting binary mask is reported for each image because all the tests described in this section yield the same segmentation mask.

Figure 7. Diffusion functions defined in (34) to assess the variability related to a given feature’s level.

Figure 8. Evolution of $T$, where the kinetic density is that considered in Figure 6a. The image consists of $(256, 256)$ pixels. We can observe how $T$ decreases until condition $T < δ$ is reached with $δ = 0.005$.

Figure 9. Segmentation masks obtained by minimizing the Jaccard Index and the Volumetric Dice Coefficient. (a) Shows the results for the core tumor and (b) shows the results for the whole tumor. Both images consist of $240 \times 240$ pixels. For the optimization procedure, we set $T = 300$ and $Δ t = 0.01$. In both cases, we considered 300 iterations of the optimization algorithm. In both cases, the loss reported by the Jaccard Index was smaller compared to that obtained with the Volumetric Dice Coefficient. Furthermore, it can be noticed that the losses reported satisfy (29) as expected. From the values of the parameters, we can observe that a larger value of the diffusion is countered by a smaller value of $Δ_{2}$.

Figure 10. Segmentation masks obtained by minimizing the Surface and Volumetric Dice coefficients. (a) Shows the results for the core tumor and (b) shows the results for the whole tumor. Both images consist of $240 \times 240$ pixels. For the optimization procedure, we set $T = 300$ and $Δ t = 0.01$. In both cases, we considered 300 iterations of the optimization algorithm. For the Surface Dice Coefficient, we set the tolerance $τ$ equal to the length of 1 pixel. Given that both the Volumetric Dice Coefficient and Jaccard Index are a measure of the superposition between the two surfaces and do not account for the proximity between the two surfaces at every given point, the Surface Dice Coefficient represents a more suitable metric when comparing two different surfaces.

Figure 11. Segmentation masks obtained for the $F_{β}$-loss metric. (a) Shows the segmentation masks obtained for $β = 0.25, 0.5, 0.75$, and $1.5$ for the core tumor and (b) shows the segmentation masks obtained using the same values of $β$ for the whole tumor. Both images consist of $240 \times 240$ pixels. For the optimization procedure, we set $T = 300$ and $Δ t = 0.01$. In both cases, we considered 300 iterations of the optimization algorithm. In (a), we can observe that, for $β = 0.25$, the resulting segmentation masks display areas of misclassified pixels, while, for larger values of $β$, the resulting segmentation mask does not differ. In (b), no zoomed area is shown as the segmentation masks display no visible differences for the different values of $β$. This is also evident in Table 4 by observing the number of false positives (FPs), false negatives (FNs), and true positives (TPs) obtained for both images.

Figure 12. Relationship between the $F_{β}$-loss value and the $β$ value for both the core and whole tumor images. As $β$ increases, the $F_{β}$-loss decreases, showing that, for lower values of $β$, we should obtain a more precise segmentation mask as the loss indicated in this figure is 1 for perfect overlap. Nevertheless, the resulting binary mask is less accurate for lower values of $β$, showing that this is not an appropriate metric for optimizing the consensus-based model.

Table 1. Parameters obtained for different diffusion functions for the square and circle images. The loss metric used to obtain these parameters was the Surface Dice Coefficient with a tolerance equal to the length of 1 pixel.

Square
	$Δ_{1}$	$Δ_{2}$	$σ^{2}$
$D_{1} (c)$	0.884	0.310	0.889
$D_{2} (c)$	0.351	0.054	0.047
$D_{3} (c)$	0.817	0.407	1.341
$D_{4} (c)$	0.442	0.081	0.624
Circle
	$Δ_{1}$	$Δ_{2}$	$σ^{2}$
$D_{1} (c)$	0.435	0.341	1.829
$D_{2} (c)$	0.013	0.160	2.717
$D_{3} (c)$	0.408	0.268	2.693
$D_{4} (c)$	0.154	0.228	2.572

Table 2. Parameters obtained for the square image by minimizing the Jaccard Index and the Volumetric and Surface Dice coefficients. For the Surface Dice Coefficient, the tolerance was set to the length of 1 pixel. The loss obtained was zero for the three cases.

Square
	$Δ_{1}$	$Δ_{2}$	$σ^{2}$
Vol. Dice	0.884	0.310	0.889
Surf. Dice	0.884	0.310	0.889
JAC	0.442	0.081	0.624

Table 3. Parameters obtained for the whole and core tumor using the Volumetric Dice Coefficient, Jaccard Index, and Surface Dice Coefficient. The loss reported is 1 for perfect overlap and 0 for complete deviation.

Whole Tumor
Opt. Function	$Δ_{1}$	$Δ_{2}$	$σ^{2}$	Loss
Vol. Dice	0.4972	0.0888	2.6867	0.9292
JAC	0.5075	0.1187	2.3631	0.8672
Surf. Dice	0.6383	0.0579	2.6504	0.7447
Core Tumor
Opt. Function	$Δ_{1}$	$Δ_{2}$	$σ^{2}$	Loss
Vol. Dice	0.3795	0.1254	2.1808	0.9360
JAC	0.3823	0.1004	2.7001	0.8796
Surf. Dice	0.6841	0.0760	1.4155	0.8727
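
As a reading aid for the Vol. Dice and JAC rows of Table 3, the sketch below computes both indexes from confusion counts and checks the standard identity $J = D / (2 - D)$, which implies $J \le D$ for any single pair of masks. The function names and the counts are hypothetical, chosen only for illustration, and the standard definitions are assumed to coincide with those of Section 3.3.

```python
def dice(tp, fp, fn):
    """Volumetric Dice coefficient: 2 TP / (2 TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)


def jaccard(tp, fp, fn):
    """Jaccard index: TP / (TP + FP + FN)."""
    return tp / (tp + fp + fn)


def jaccard_from_dice(d):
    """Jaccard index recovered from a Dice value via J = D / (2 - D)."""
    return d / (2 - d)


# Hypothetical confusion counts for one predicted/ground-truth pair.
tp, fp, fn = 90, 10, 20
d = dice(tp, fp, fn)
j = jaccard(tp, fp, fn)

print(round(d, 4))                     # 0.8571
print(j)                               # 0.75
print(round(jaccard_from_dice(d), 4))  # 0.75
```

The identity holds for a fixed pair of masks. The Vol. Dice and JAC rows of Table 3 come from separate optimizations, so their reported values need not satisfy it exactly.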

Table 4. Parameters obtained for the $F_{β}$-measure for different values of $β$. The loss reported is 1 for perfect overlap and 0 for complete deviation. The numbers of false positives (FPs), false negatives (FNs), and true positives (TPs) are presented for the resulting segmentation masks for each value of $β$.

Whole Tumor
	$Δ_{1}$	$Δ_{2}$	$σ^{2}$	FP	FN	TP	Loss
$β = 0.25$	0.6873	0.1707	2.2395	134	347	3170	0.9559
$β = 0.5$	0.3351	0.1080	2.7051	134	350	3167	0.9470
$β = 0.75$	0.5939	0.2304	2.6718	134	350	3167	0.9373
$β = 1.5$	0.5316	0.1092	2.7105	136	349	3168	0.9179
$β = 5.0$	0.5662	0.1225	2.7043	136	349	3168	0.9032
$β = 10.0$	0.6061	0.2835	2.1243	136	349	3168	0.9013
Core Tumor
	$Δ_{1}$	$Δ_{2}$	$σ^{2}$	FP	FN	TP	Loss
$β = 0.25$	0.6575	0.2725	0.0257	9	206	849	0.9763
$β = 0.5$	0.3989	0.0637	1.8094	25	107	948	0.9582
$β = 0.75$	0.4073	0.0942	1.6972	25	105	950	0.9460
$β = 1.5$	0.5444	0.2077	2.3545	25	105	950	0.9220
$β = 5.0$	0.5587	0.1742	2.6864	25	105	950	0.9032
$β = 10.0$	0.6137	0.2425	1.9757	25	105	950	0.9012
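
The Loss column of Table 4 can be cross-checked against the FP, FN, and TP columns. The sketch below assumes the standard definition $F_{β} = (1 + β^{2})\,TP / ((1 + β^{2})\,TP + β^{2}\,FN + FP)$, taken here to match the one in Section 3.3.3. The function name `f_beta` is hypothetical, and the rows are copied from the core tumor block of the table. Under this assumption, the recomputed values agree with the reported ones to within about $10^{-3}$.

```python
def f_beta(tp, fp, fn, beta):
    """F-beta score from confusion counts (1 = perfect overlap)."""
    b2 = beta ** 2
    return (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)


# (beta, FP, FN, TP, reported loss), core tumor rows of Table 4.
core_rows = [
    (0.25, 9, 206, 849, 0.9763),
    (0.5, 25, 107, 948, 0.9582),
    (0.75, 25, 105, 950, 0.9460),
    (1.5, 25, 105, 950, 0.9220),
    (5.0, 25, 105, 950, 0.9032),
    (10.0, 25, 105, 950, 0.9012),
]

# Largest gap between the recomputed score and the tabulated one.
max_gap = 0.0
for beta, fp, fn, tp, reported in core_rows:
    value = f_beta(tp, fp, fn, beta)
    max_gap = max(max_gap, abs(value - reported))
    print(f"beta={beta:>5}: recomputed {value:.4f}, reported {reported:.4f}")

print(f"largest gap: {max_gap:.1e}")
```

Because FN exceeds FP in every row, the score falls as $β$ grows even where the counts are identical: a larger $β$ gives more weight to false negatives.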

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
