Article

Understanding the Impact of Evaluation Metrics in Kinetic Models for Consensus-Based Segmentation

by Raffaella Fiamma Cabini 1, Horacio Tettamanti 2 and Mattia Zanella 2,*
1 Euler Institute, Università della Svizzera Italiana, 6900 Lugano, Switzerland
2 Department of Mathematics “F. Casorati”, University of Pavia, 27100 Pavia, Italy
* Author to whom correspondence should be addressed.
Entropy 2025, 27(2), 149; https://doi.org/10.3390/e27020149
Submission received: 29 November 2024 / Revised: 15 January 2025 / Accepted: 24 January 2025 / Published: 1 February 2025
(This article belongs to the Section Multidisciplinary Applications)

Abstract
In this article, we extend a recently introduced kinetic model for consensus-based segmentation of images. In particular, we will interpret the set of pixels of a 2D image as an interacting particle system that evolves in time in view of a consensus-type process obtained by interactions between pixels and external noise. Thanks to a kinetic formulation of the introduced model, we derive the large time solution of the model. We will show that the parameters defining the segmentation task can be chosen from a plurality of loss functions that characterize the evaluation metrics.

1. Introduction

The primary objective of image segmentation is to partition an image into distinct pixel regions that exhibit homogeneous characteristics, including spatial proximity, intensity values, color variations, texture patterns, brightness levels, and contrast differences, thereby enabling more effective analysis and interpretation of the visual data. The application of image segmentation methods plays an important role in clinical research by facilitating the study of anatomical structures, highlighting regions of interest, and measuring tissue volume [1,2,3,4,5,6]. In this context, the accurate recognition of areas affected by pathologies can have a great impact on more precise early diagnosis and monitoring in a great variety of diseases that range from brain tumors to skin lesions.
Over the past few decades, a variety of computational strategies and mathematical approaches have been developed to address image segmentation challenges. Among these, deep learning techniques and neural networks have emerged as some of the most widely used methods in contemporary image segmentation tasks [7,8,9,10,11,12,13,14,15]. Leveraging a set of examples, these techniques are capable of approximating the complex nonlinear relationship between inputs and desired outputs. While deep learning models excel in complex segmentation problems, their dependence on large annotated datasets remains a significant challenge, particularly in fields such as biomedical imaging, where data availability is limited and manual labeling can be both expensive and time-consuming. A different approach is based on clustering methods [16,17,18,19,20,21]. These methods group pixels with similar characteristics, effectively partitioning the image into distinct regions. Clustering-based methods offer an attractive alternative to deep learning techniques as they do not require supervised training and therefore can be used on small unlabeled datasets. In this direction, a kinetic approach for unsupervised clustering problems for image segmentation has been introduced in [22,23]. In these works, microscopic-consensus-type models have been connected to image segmentation tasks by considering the pixels of an image as an interacting system where each particle is characterized by its space position and a feature determining the gray level. A virtual interaction between the particles will then determine the asymptotic formation of a finite number of clusters. Hence, a segmentation mask is generated by assigning the mean of their gray levels to each cluster of particles and by applying a binary threshold. 
Among the various nonlinear compromise terms that have been proposed in the literature, we will consider the Hegselmann–Krause model described in [24], where it is supposed that each agent may only interact with other agents that are sufficiently close. This type of interaction is classically known as a bounded confidence interaction function. As a result, two pixels will interact based on their distance in space and their gray level. The approach developed in [22] is based on the methods of kinetic theory for consensus formation. In recent decades, following the first model developed in [25,26,27,28], several approaches have been designed to investigate the emergence of patterns and collective structures for large systems of agents/particles [29,30,31,32]. To this end, the flexibility of kinetic-type equations has been of paramount importance to link the microscopic scale and the macroscopic observable scale [33,34,35,36,37,38,39].
In order to construct a data-oriented pipeline, we calibrate the resulting model by exploiting a family of existing evaluation metrics to obtain the relevant information from a ground truth image [40,41,42,43,44,45]. The main development of this study, compared to the one described in [46], relies on the fact that we evaluate multiple metrics to quantify segmentation error, which is crucial for the optimization of the internal model parameters. In particular, we will concentrate on the Standard Volumetric Dice Similarity Coefficient (Volumetric Dice), a volumetric measure based on the quotient between the intersection of the obtained segmented images and their total volume, and the Surface Dice Similarity Coefficient, which is analogous to Volumetric Dice but exploits the surface of the segmented images [46]. Furthermore, we test the Jaccard Index, which is an alternative option to evaluate the volumetric similarity between two segmentation masks, and the $F_\beta$-measure, which is a performance metric that facilitates balance between precision and sensitivity. In this paper, we describe these metrics in detail and analyze how such choices regarding evaluation metrics influence the parameter optimization process. Furthermore, we discuss the most suitable metrics for the final assessment of the produced segmentations. This expanded evaluation provides novel insights into the impact of evaluation metrics on model performance and enhances our understanding of how to efficiently optimize the introduced segmentation pipeline.
In more detail, the manuscript is organized as follows. In Section 2, we introduce an extension of the Hegselmann–Krause model in 2D and present the structure of the emerging steady states for different values of the model parameters. Next, we present a description of the model based on a kinetic-type approach. Furthermore, we show how this model can be extended and applied to the image segmentation problem. In Section 3, we present a Direct Simulation Monte Carlo (DSMC) method to approximate the evolution of the system and introduce possible optimization methods to produce segmentation masks for particular images. To this end, we introduce the definition of the principal optimization metrics used in the context of biomedical images and their principal characteristics. In Section 4, we show the results for a simple case of segmenting a geometrical image with a blurry background and compare the results obtained for different choices regarding the diffusion function. Finally, we present the results obtained for various brain tumor images and discuss how the choice regarding different metrics may affect the final result. We show that the $F_\beta$-measure does not produce consistent results for different values of $\beta$. We reproduce the expected relationship between the Volumetric Dice Coefficient and Jaccard Index and show that both metrics plus the Surface Dice Coefficient yield similar results. Nevertheless, we argue that, for this type of image, the Surface Dice Coefficient produces more accurate loss values, and its definition is more representative compared to the Volumetric Dice Coefficient and Jaccard Index.

2. Consensus Modeling and Applications to Image Segmentation

In recent years, there has been growing interest in exploring consensus formation within opinion models to gain a deeper understanding of how social forces affect nonlinear aggregation processes in multiagent systems. To this end, various models have been proposed considering different scenarios and hypotheses on how the pairwise interactions may lead to the emergence of a position. For a finite number of particles, the dynamics are usually defined in terms of first-order differential equations with the general form
$$\frac{d x_i}{dt} = \frac{1}{N} \sum_{j=1}^{N} P(x_i, x_j)\,(x_j - x_i), \tag{1}$$
where $x_i(t) \in \mathbb{R}^d$, $d \ge 1$, characterizes the position of agent $i = 1, \dots, N$ at time $t \ge 0$, and $P(\cdot,\cdot) \ge 0$ tunes the interaction between the agents $x_i, x_j \in \mathbb{R}^d$; see, e.g., [24,30,32,47,48].
In addition to microscopic-agent-based models, in the limit of an infinite number of agents, it is possible to derive the evolution of distribution functions characterizing the collective behavior of interacting systems. These approaches, typically grounded in kinetic-type partial differential equations (PDEs), are capable of bridging the gap between microscopic forces and the emerging properties of the system; see [37].

2.1. The 2D-Bounded Confidence Model

We now consider the bidimensional case $d = 2$, and we specify the interaction function based on the so-called bounded confidence model. In more detail, we consider $N \ge 2$ agents and define their opinion variable through a vector $\mathbf{x}_i = (x_i(t), y_i(t)) \in \mathbb{R}^2$, characterized by initial states $\{\mathbf{x}_1(0), \dots, \mathbf{x}_N(0)\}$. Agents will modify their opinion as a result of the interaction with other agents only if $|\mathbf{x}_i - \mathbf{x}_j| \le \Delta$, where $\Delta \ge 0$ is a given confidence level. Hence, we can write (1) as follows
$$\frac{d}{dt}\mathbf{x}_i = \frac{1}{N} \sum_{j=1}^{N} P_\Delta(\mathbf{x}_i, \mathbf{x}_j)\,(\mathbf{x}_j - \mathbf{x}_i), \tag{2}$$
where $P_\Delta(\mathbf{x}_i, \mathbf{x}_j) = \chi(|\mathbf{x}_i - \mathbf{x}_j| \le \Delta) : \mathbb{R}^2 \to \{0, 1\}$, with $\chi(A)$ being the characteristic function of the set $A \subseteq \mathbb{R}^2$. We can easily observe that the mean position of the ensemble of agents is conserved in time; indeed,
$$\frac{d}{dt} \sum_{i=1}^{N} \mathbf{x}_i = \frac{1}{N} \sum_{i,j=1}^{N} \chi(|\mathbf{x}_i - \mathbf{x}_j| \le \Delta)\,(\mathbf{x}_j - \mathbf{x}_i) = 0, \tag{3}$$
thanks to the symmetry of the considered bounded confidence interaction function. The bounded confidence model converges to a steady configuration, meaning that the system reaches consensus in finite time. The structure of the steady state depends on the value of $\Delta$; see [38].
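The conservation of the mean position can be checked numerically. The following is a minimal sketch, assuming an explicit Euler discretization of the bounded confidence dynamics; the function names, the step size `dt`, and the random initial states are illustrative choices, not taken from the article.

```python
import random

def bounded_confidence_step(points, delta, dt):
    """One explicit Euler step of the deterministic bounded confidence model."""
    n = len(points)
    new_points = []
    for xi, yi in points:
        dx = dy = 0.0
        for xj, yj in points:
            # The interaction is active only within the confidence level delta.
            if (xi - xj) ** 2 + (yi - yj) ** 2 <= delta ** 2:
                dx += xj - xi
                dy += yj - yi
        new_points.append((xi + dt * dx / n, yi + dt * dy / n))
    return new_points

def mean_position(points):
    """Arithmetic mean of a list of 2D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

rng = random.Random(0)  # fixed seed, illustrative initial data in [-1, 1]^2
agents = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(50)]
mean_before = mean_position(agents)
for _ in range(200):
    agents = bounded_confidence_step(agents, delta=0.5, dt=0.1)
mean_after = mean_position(agents)
```

Because the interaction is symmetric, the pairwise contributions cancel in the sum, so `mean_before` and `mean_after` agree up to floating-point error.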
Furthermore, to account for random fluctuations provided by external factors in the opinion of agents, we may consider a diffusion component as follows:
$$d\mathbf{x}_i = \frac{1}{N} \sum_{j=1}^{N} P_\Delta(\mathbf{x}_i, \mathbf{x}_j)\,(\mathbf{x}_j - \mathbf{x}_i)\,dt + \sqrt{2\sigma^2}\,d\mathbf{W}_i, \tag{4}$$
where $\{\mathbf{W}_i\}_{i=1}^{N}$ is a set of independent Wiener processes. The impact of the diffusion is weighted by the variable $\sigma^2 > 0$. To visualize the interplay between consensus forces and diffusion, we depict in Figure 1 the steady configuration of the model (4) for different combinations of the model parameters. For $\sigma^2 = 0$, the system forms a finite number of clusters depending on the value of $\Delta > 0$, as illustrated in Figure 1a. For values of the diffusion coefficient $\sigma^2 > 0$, the number of clusters of the system varies as depicted in Figure 1b. The right panel of Figure 1b shows the scenario in which the diffusion effect becomes comparable to the tendency of agents to cluster. Finally, in Figure 1c, for $\sigma = 0.05$, the diffusion effect dominates the grouping tendency, resulting in a homogeneous steady state distribution.

2.2. Kinetic Models for Consensus Dynamics

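In the limit $N \to +\infty$, it can be shown that the empirical density
$$f^{(N)}(\mathbf{x}, t) = \frac{1}{N} \sum_{i=1}^{N} \delta(\mathbf{x} - \mathbf{x}_i(t))$$
of the system of particles (4) converges to a continuous density $f(\mathbf{x}, t) : \mathbb{R}^2 \times \mathbb{R}_+ \to \mathbb{R}_+$, solution to the following mean-field equation
$$\partial_t f(\mathbf{x}, t) = \nabla_{\mathbf{x}} \cdot \left[ \Xi[f](\mathbf{x}, t)\, f(\mathbf{x}, t) + \sigma^2 \nabla_{\mathbf{x}} f(\mathbf{x}, t) \right], \qquad f(\mathbf{x}, 0) = f_0(\mathbf{x}), \tag{5}$$
where $\Xi[f](\mathbf{x}, t)$ is defined as follows
$$\Xi[f](\mathbf{x}, t) = \int_{\mathbb{R}^2} P_\Delta(\mathbf{x}, \mathbf{x}_*)\,(\mathbf{x} - \mathbf{x}_*)\, f(\mathbf{x}_*, t)\, d\mathbf{x}_*; \tag{6}$$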
see, e.g., [49].
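We can derive (5) using a kinetic approach by writing $\mathbf{x} := \mathbf{x}_i(t)$ and $\mathbf{x}_* := \mathbf{x}_j(t)$ for a generic pair $(i, j)$ of interacting agents/particles, and approximating the time derivative in (4) over a time step $\epsilon = \Delta t > 0$ through an Euler–Maruyama approach, in the same spirit as [34,50]. Hence, we recover the binary interaction rule
$$\begin{aligned} \mathbf{x}' &= \mathbf{x} + \epsilon P_\Delta(\mathbf{x}, \mathbf{x}_*)\,(\mathbf{x}_* - \mathbf{x}) + \sqrt{2\sigma^2}\,\eta \\ \mathbf{x}_*' &= \mathbf{x}_* + \epsilon P_\Delta(\mathbf{x}_*, \mathbf{x})\,(\mathbf{x} - \mathbf{x}_*) + \sqrt{2\sigma^2}\,\eta_*, \end{aligned} \tag{7}$$
where $\mathbf{x}' = \mathbf{x}_i(t + \epsilon)$, $\mathbf{x}_*' = \mathbf{x}_j(t + \epsilon)$ and $\eta$, $\eta_*$ are two independent 2D centered Gaussian random variables such that
$$\langle \eta \rangle = \langle \eta_* \rangle = 0, \qquad \langle \eta^2 \rangle = \epsilon, \tag{8}$$
where $\langle \cdot \rangle$ denotes the integration with respect to the distribution of $\eta$. Furthermore, in (7), we shall consider $P_\Delta(\mathbf{x}, \mathbf{x}_*) = \chi(|\mathbf{x} - \mathbf{x}_*| < \Delta)$. We can remark that, if $\sigma = 0$, since $P_\Delta \in [0, 1]$ and $\epsilon \in (0, 1)$, we obtain
$$\mathbf{x}' + \mathbf{x}_*' = \mathbf{x} + \mathbf{x}_* + \Delta t\,\big(P_\Delta(\mathbf{x}, \mathbf{x}_*) - P_\Delta(\mathbf{x}_*, \mathbf{x})\big)(\mathbf{x}_* - \mathbf{x}) = \mathbf{x} + \mathbf{x}_* \tag{9}$$
since the interaction function $P_\Delta$ is symmetric, consistent with (3). This shows that the mean position is conserved at every interaction. Finally, we have
$$|\mathbf{x}'|^2 + |\mathbf{x}_*'|^2 = |\mathbf{x}|^2 + |\mathbf{x}_*|^2 - 2 \Delta t\, P_\Delta\, |\mathbf{x} - \mathbf{x}_*|^2 + o(\Delta t) \tag{10}$$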
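and the mean energy is dissipated at each interaction since $P_\Delta \ge 0$. Hence, we consider the distribution function $f = f(\mathbf{x}, t) : \mathbb{R}^2 \times \mathbb{R}_+ \to \mathbb{R}_+$ such that $f(\mathbf{x}, t)\,d\mathbf{x}$ represents the fraction of agents/particles in $[x_1, x_1 + dx_1) \times [x_2, x_2 + dx_2)$ at time $t \ge 0$. The evolution of $f$ as a result of the binary interaction scheme (7) is obtained by a Boltzmann-type equation, which reads in weak form
$$\frac{d}{dt} \int_{\mathbb{R}^2} \varphi(\mathbf{x}) f(\mathbf{x}, t)\, d\mathbf{x} = \left\langle \int_{\mathbb{R}^4} \big(\varphi(\mathbf{x}') - \varphi(\mathbf{x})\big) f(\mathbf{x}, t) f(\mathbf{x}_*, t)\, d\mathbf{x}\, d\mathbf{x}_* \right\rangle, \tag{11}$$
$\varphi(\cdot)$ being a test function. As observed in [39], when $\Delta t = \epsilon \to 0^+$, the binary scheme (7) becomes quasi-invariant, and we can introduce the following expansion
$$\langle \varphi(\mathbf{x}') - \varphi(\mathbf{x}) \rangle = \langle \mathbf{x}' - \mathbf{x} \rangle \cdot \nabla_{\mathbf{x}} \varphi(\mathbf{x}) + \frac{1}{2} \left\langle (\mathbf{x}' - \mathbf{x})^T H[\varphi]\, (\mathbf{x}' - \mathbf{x}) \right\rangle + R_\epsilon(\mathbf{x}, \mathbf{x}_*), \tag{12}$$
$R_\epsilon(\mathbf{x}, \mathbf{x}_*)$ being a remainder term and $H[\varphi]$ the Hessian matrix. Hence, scaling $\tau = \epsilon t$ and the distribution $f_\epsilon(\mathbf{x}, \tau) = f(\mathbf{x}, \tau/\epsilon)$, we may plug (12) into (11) to obtain
$$\begin{aligned} \frac{d}{d\tau} \int_{\mathbb{R}^2} \varphi(\mathbf{x}) f_\epsilon(\mathbf{x}, \tau)\, d\mathbf{x} = {} & \frac{1}{\epsilon} \int_{\mathbb{R}^4} \langle \mathbf{x}' - \mathbf{x} \rangle \cdot \nabla_{\mathbf{x}} \varphi(\mathbf{x})\, f_\epsilon(\mathbf{x}, \tau) f_\epsilon(\mathbf{x}_*, \tau)\, d\mathbf{x}\, d\mathbf{x}_* \\ & + \frac{1}{2\epsilon} \int_{\mathbb{R}^4} \left\langle (\mathbf{x}' - \mathbf{x})^T H[\varphi]\, (\mathbf{x}' - \mathbf{x}) \right\rangle f_\epsilon(\mathbf{x}, \tau) f_\epsilon(\mathbf{x}_*, \tau)\, d\mathbf{x}\, d\mathbf{x}_* \\ & + \frac{1}{\epsilon} \int_{\mathbb{R}^4} R_\epsilon(\mathbf{x}, \mathbf{x}_*)\, f_\epsilon(\mathbf{x}, \tau) f_\epsilon(\mathbf{x}_*, \tau)\, d\mathbf{x}\, d\mathbf{x}_*. \end{aligned}$$
Following [22], see also [37], we can prove that
$$\frac{1}{\epsilon} \int_{\mathbb{R}^4} R_\epsilon(\mathbf{x}, \mathbf{x}_*)\, f_\epsilon(\mathbf{x}, \tau) f_\epsilon(\mathbf{x}_*, \tau)\, d\mathbf{x}\, d\mathbf{x}_* \to 0$$
as $\epsilon \to 0^+$. Hence, integrating back by parts the first two terms, we obtain (5). In more detail, we can prove that $f_\epsilon$ converges, up to extraction of a subsequence, to a probability density $f(\mathbf{x}, \tau)$ that is a weak solution to the nonlocal Fokker–Planck Equation (5).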
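The passage to the limit can be made explicit with a short computation. The sketch below evaluates the two moments of the binary rule (7) that enter the expansion, under the assumption (not stated in the text) that the two components of $\eta$ are uncorrelated with variance $\epsilon$ each; $\operatorname{tr} H[\varphi]$ denotes the trace of the Hessian, that is, the Laplacian of $\varphi$.

```latex
\begin{aligned}
\langle \mathbf{x}' - \mathbf{x} \rangle
  &= \epsilon\, P_\Delta(\mathbf{x},\mathbf{x}_*)\,(\mathbf{x}_* - \mathbf{x}), \\
\big\langle (\mathbf{x}'-\mathbf{x})^T H[\varphi]\,(\mathbf{x}'-\mathbf{x}) \big\rangle
  &= 2\sigma^2 \epsilon \,\operatorname{tr} H[\varphi](\mathbf{x}) + O(\epsilon^2), \\
\lim_{\epsilon \to 0^+} \frac{d}{d\tau}\int_{\mathbb{R}^2} \varphi\, f_\epsilon \, d\mathbf{x}
  &= \int_{\mathbb{R}^4} P_\Delta(\mathbf{x},\mathbf{x}_*)\,(\mathbf{x}_*-\mathbf{x})\cdot
     \nabla_{\mathbf{x}}\varphi(\mathbf{x})\, f(\mathbf{x},\tau)\, f(\mathbf{x}_*,\tau)\,
     d\mathbf{x}\, d\mathbf{x}_* \\
  &\quad + \sigma^2 \int_{\mathbb{R}^2} \operatorname{tr} H[\varphi](\mathbf{x})\,
     f(\mathbf{x},\tau)\, d\mathbf{x}.
\end{aligned}
```

In the last line, the diffusion term uses the fact that $f$ integrates to one in $\mathbf{x}_*$. Integrating by parts moves the derivatives from $\varphi$ onto $f$: the first integral yields the drift $\nabla_{\mathbf{x}} \cdot (\Xi[f] f)$ and the second yields the diffusion $\sigma^2 \nabla_{\mathbf{x}} \cdot \nabla_{\mathbf{x}} f$.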

2.3. Application to Image Segmentation

An application of the Hegselmann–Krause model for data clustering problems has been proposed in [23]. The idea is to extend the 2D model by characterizing each particle with an internal feature $c_i \in [0, 1]$ that represents the gray color of the $i$th pixel. Therefore, we interpret each pixel in the image as a particle characterized by a position vector and the static feature $c$ as shown in Figure 2.
To address the segmentation task, we can define a dynamic feature for the system of pixels through an interaction function that accounts for alignment processes among pixels with sufficiently similar features. In particular, let us consider the following:
$$P_{\Delta_1, \Delta_2}(\mathbf{x}_i, \mathbf{x}_j, c_i, c_j) = \chi(|\mathbf{x}_i - \mathbf{x}_j| \le \Delta_1)\, \chi(|c_i - c_j| \le \Delta_2). \tag{13}$$
Therefore, the time-continuous evolution for the system of pixels is provided by
$$\begin{aligned} \frac{d}{dt} \mathbf{x}_i &= \frac{1}{N} \sum_{j=1}^{N} P_{\Delta_1, \Delta_2}(\mathbf{x}_i, \mathbf{x}_j, c_i, c_j)\,(\mathbf{x}_j - \mathbf{x}_i) \\ \frac{d}{dt} c_i &= 0. \end{aligned} \tag{14}$$
In this case, we introduced two confidence bounds $\Delta_1 \ge 0$, $\Delta_2 \ge 0$, taking into account the position and the gray level of the pixels, respectively. In this way, the interactions between the pixels will generate a large time distribution that is characterized by several clusters depending on the values of $\Delta_1$ and $\Delta_2$. Hence, consistent with k-means methods, see, e.g., [44], a pixel belongs to a cluster $C_\mu = \{\mathbf{x}_i : \|\mathbf{x}_i - \mu\| \le \alpha\}$, with $\alpha > 0$ being the pixel size, if it is sufficiently close to the local quantity $\mu \in \mathbb{R}^2$. We highlight how we are only interested in clustering with respect to the space variable.
These dynamics are represented in Figure 3.
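The two-threshold interaction function can be written in a few lines. This is a minimal sketch; the function name and argument order are illustrative and not taken from the article.

```python
import math

def interaction(xi, xj, ci, cj, delta1, delta2):
    """Return 1 if pixels i and j interact under both confidence bounds, else 0."""
    close_in_space = math.dist(xi, xj) <= delta1  # Euclidean distance between positions
    close_in_gray = abs(ci - cj) <= delta2        # difference between gray levels
    return 1 if (close_in_space and close_in_gray) else 0
```

Two nearby pixels with very different gray levels therefore do not attract each other, which is what keeps regions of different intensity from merging.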
Biomedical images are often subject to ambiguities arising from various sources of uncertainty related to clinical factors and potential bottlenecks in data acquisition processes [2,9]. These uncertainties can be broadly categorized into aleatoric uncertainty, stemming from inherent stochastic variations in the data collection process, and epistemic uncertainty, relating to uncertainties in model parameters, potentially leading to deviations in the results. Aleatoric uncertainties pose significant challenges in image segmentation as image processing models must contend with limitations in the raw acquisition data. Addressing these uncertainties is critical, and the study of uncertainty quantification in image segmentation is an expanding field aimed at developing robust segmentation algorithms capable of mitigating erroneous outcomes. To this end, in [22], an extension of (14) has been proposed to consider segmentation of biomedical images. In particular, the particle model (15) integrates a nonconstant stochastic part to take into account aleatoric uncertainties arising from the data acquisition process. These uncertainties may include factors such as motion artifacts or field inhomogeneities in magnetic resonance imaging (MRI). The authors of [22] modified Equation (14) as follows:
$$\begin{aligned} d\mathbf{x}_i &= \frac{1}{N} \sum_{j=1}^{N} P_{\Delta_1, \Delta_2}(\mathbf{x}_i, \mathbf{x}_j, c_i, c_j)\,(\mathbf{x}_j - \mathbf{x}_i)\, dt + \sqrt{2\sigma^2 D(c_i)}\, d\mathbf{W}_i \\ \frac{d}{dt} c_i &= 0, \end{aligned} \tag{15}$$
where $\{\mathbf{W}_i\}_{i=1}^{N}$ is a set of independent Wiener processes, $P_{\Delta_1, \Delta_2}(\cdot, \cdot, \cdot, \cdot) \in [0, 1]$ is the interaction function defined in (13), and $D(c) \ge 0$ quantifies the impact of diffusion related to the value of the feature $c \in [0, 1]$. Since the aleatoric uncertainties are expected to appear far away from the static feature’s boundaries, only diffusion functions that are maximal at the center and satisfy $D(0) = D(1) = 0$ are considered. Similarly to (7), we may introduce the following binary interaction scheme by writing $(\mathbf{x}, \mathbf{x}_*) := (\mathbf{x}_i(t), \mathbf{x}_j(t))$, with a random couple of pixels having features $(c, c_*) := (c_i(t), c_j(t))$. We obtain
$$\begin{aligned} \mathbf{x}' &= \mathbf{x} + \epsilon P_{\Delta_1, \Delta_2}(\mathbf{x}, \mathbf{x}_*, c, c_*)\,(\mathbf{x}_* - \mathbf{x}) + \sqrt{2\sigma^2 D(c)}\,\eta \\ \mathbf{x}_*' &= \mathbf{x}_* + \epsilon P_{\Delta_1, \Delta_2}(\mathbf{x}_*, \mathbf{x}, c_*, c)\,(\mathbf{x} - \mathbf{x}_*) + \sqrt{2\sigma^2 D(c_*)}\,\eta_* \\ c_*' &= c_* \\ c' &= c, \end{aligned} \tag{16}$$
where $(\mathbf{x}', \mathbf{x}_*') := (\mathbf{x}_i(t + \Delta t), \mathbf{x}_j(t + \Delta t))$ and $(c', c_*') := (c_i(t + \Delta t), c_j(t + \Delta t))$. At the statistical level, as in [22], we may follow the approach described in Section 2.2. Hence, we introduce the distribution function $f = f(\mathbf{x}, c, t) : \mathbb{R}^2 \times [0, 1] \times \mathbb{R}_+ \to \mathbb{R}_+$, such that $f(\mathbf{x}, c, t)\,d\mathbf{x}$ represents the fraction of agents/particles in $[x_1, x_1 + dx_1) \times [x_2, x_2 + dx_2)$ characterized by a feature $c \in [0, 1]$ at time $t \ge 0$. The evolution of $f$, whose interaction follows the binary scheme (16), is provided by the following Boltzmann-type equation:
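$$\frac{d}{dt} \int_0^1 \int_{\mathbb{R}^2} \varphi(\mathbf{x}, c) f(\mathbf{x}, c, t)\, d\mathbf{x}\, dc = \left\langle \int_{[0,1]^2} \int_{\mathbb{R}^4} \big(\varphi(\mathbf{x}', c') - \varphi(\mathbf{x}, c)\big) f(\mathbf{x}, c, t) f(\mathbf{x}_*, c_*, t)\, d\mathbf{x}\, d\mathbf{x}_*\, dc\, dc_* \right\rangle. \tag{17}$$
Hence, since the feature is not evolving in time, we can proceed as in Section 2.2 to derive in the quasi-invariant limit for $\epsilon \to 0^+$ the corresponding Fokker–Planck-type PDE
$$\partial_t f(\mathbf{x}, c, t) = \nabla_{\mathbf{x}} \cdot \left[ \Xi_{\Delta_1, \Delta_2}[f](\mathbf{x}, c, t)\, f(\mathbf{x}, c, t) + \sigma^2 D(c)\, \nabla_{\mathbf{x}} f(\mathbf{x}, c, t) \right] \tag{18}$$
where
$$\Xi_{\Delta_1, \Delta_2}[f](\mathbf{x}, c, t) = \int_0^1 \int_{\mathbb{R}^2} P_{\Delta_1, \Delta_2}(\mathbf{x}, \mathbf{x}_*, c, c_*)\,(\mathbf{x} - \mathbf{x}_*)\, f(\mathbf{x}_*, c_*, t)\, d\mathbf{x}_*\, dc_*. \tag{19}$$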

3. Evaluation Metrics and Parameter Estimation

In this section, we present a classical Direct Simulation Monte Carlo (DSMC) method to numerically approximate the evolution of (17) as a quasi-invariant approximation of the Fokker–Planck Equation (18). The resulting numerical algorithm is fundamental to estimate consistent parameters from MRI images. To this end, we present several loss metrics with the aim to compare the result of our model-based approach with existing methods for biomedical image segmentation. In this work, we focus exclusively on binary metrics. For evaluation of segmentation with multiple labels, we direct the reader to [45] for a detailed presentation of various metrics.

3.1. DSMC Algorithm for Image Segmentation

The numerical approximation of Boltzmann-type equations has been deeply investigated in recent decades; see, e.g., [51,52]. The approximation of this class of equations is particularly challenging due to the curse of dimensionality brought up by the multidimensional integral of the collision operator, and the presence of multiple scales. Furthermore, the preservation of relevant physical quantities is essential for a correct description of the underlying physical problem [53].
In view of its computational efficiency, in the following, we will adopt a DSMC approach. Indeed, the computational cost of this method is $O(N)$, where $N$ represents the number of particles. Next, we describe the DSMC method based on a Nanbu–Babovsky scheme [52]. We begin by randomly selecting $N/2$ pairs of particles and making them evolve following the binary scheme presented in (7). We consider a time interval $[0, T]$, which we divide into $N_t$ intervals of size $\Delta t > 0$. The DSMC approach for the introduced kinetic equation is based on a first-order forward time discretization. In the following, we will always consider the case $\Delta t = \epsilon > 0$ such that all the particles are going to interact; see [52] for more details. We introduce the stochastic rounding of a positive real number as
$$\mathrm{Sround}(x) = \begin{cases} \lfloor x \rfloor + 1 & \text{with probability } x - \lfloor x \rfloor \\ \lfloor x \rfloor & \text{with probability } 1 - x + \lfloor x \rfloor \end{cases} \tag{20}$$
where $\lfloor x \rfloor$ is the integer part of $x$. The random variable $\eta$ is sampled from a 2D Gaussian distribution centered at zero with a diagonal covariance matrix.
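The stochastic rounding is unbiased: its expected value equals $x$. A minimal sketch follows; the function name `sround` and the optional `rng` argument are illustrative.

```python
import math
import random

def sround(x, rng=random):
    """Stochastic rounding: round up with probability equal to the fractional part."""
    floor_x = math.floor(x)
    frac = x - floor_x
    # rng.random() is uniform on [0, 1), so this branch is taken with probability frac.
    return floor_x + 1 if rng.random() < frac else floor_x
```

For an integer input the fractional part is zero, so the value is returned unchanged.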

3.2. Generation of Model-Oriented Segmentation Masks

In this section, we present the procedure to estimate the segmentation masks of brain tumor images. The procedure described in this section closely follows the methodology presented in [22]. For a given image, we define the feature’s values in relation to the gray level of each pixel. In more detail, for a given pixel $i \in \{1, \dots, N\}$, we define
$$c_i = \frac{C_i - \min_{i=1,\dots,N} C_i}{\max_{i=1,\dots,N} C_i - \min_{i=1,\dots,N} C_i} \in [0, 1], \tag{21}$$
$C_i$, $i = 1, \dots, N$, being the gray value of the original image. Therefore, the value $c_i = 1$ represents a white pixel and $c_i = 0$ represents a black pixel.
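The rescaling of gray values to $[0, 1]$ is a min-max normalization. A minimal sketch, assuming the image is given as a flat list of gray values; the guard for a constant image is an added assumption, since the formula is undefined in that case.

```python
def normalize_gray_levels(gray_values):
    """Min-max rescale raw gray values so that black maps to 0 and white to 1."""
    lo, hi = min(gray_values), max(gray_values)
    if hi == lo:
        # Assumption: a constant image has no well-defined normalization.
        raise ValueError("constant image: the normalization is undefined")
    return [(g - lo) / (hi - lo) for g in gray_values]
```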
In particular, for this work, we used the brain tumor dataset that consists of 3D multi-parametric MRI of patients affected by glioblastoma or lower-grade glioma, publicly available in the context of the Brain Tumor Image Segmentation Challenge http://medicaldecathlon.com/ (accessed on 28 November 2024). The acquisition sequences include $T_1$-weighted, post-Gadolinium contrast $T_1$-weighted, $T_2$-weighted, and $T_2$ Fluid-Attenuated Inversion Recovery volumes. Each MRI scan is accompanied by a corresponding ground truth segmentation mask, which is a binary image where anatomical regions of interest are highlighted as white pixels while all other areas are represented as black pixels. These ground truth segmentation masks were manually delineated by experienced radiologists and specifically identify three structures: “tumor core”, “enhancing tumor”, and “whole tumor”. We evaluate the performance of the DSMC algorithm for two different segmentation tasks: “tumor core” and “whole tumor” annotations. For the first task, we use a single slice in the axial plane of the post-Gadolinium contrast $T_1$-weighted scans, while, for the second task, we use a single slice in the axial plane of the $T_2$-weighted scans. The procedure to generate the segmentation masks is as follows:
  • We begin by associating each pixel with a position vector $(x_i, y_i)$ and with a static feature $c_i$. We scale the vector position to a domain $[-1, 1] \times [-1, 1]$ and the static feature to $[0, 1]$.
  • We apply a DSMC approach as described in Algorithm 1 to numerically approximate the large-time solution of the Boltzmann-type model defined in (17). This approach enables pixels to aggregate into clusters based on their Euclidean distance and gray color level.
  • The segmentation masks are generated by assigning to the original position of each pixel the mean values of the clusters they belong to. Thus, we generate a multi-level mask composed of a number of homogeneous regions.
  • Finally, we obtain the binary mask by defining a threshold $\tilde{c}$ such that
    $$c_i = \begin{cases} 1 & \text{if } c \ge \tilde{c} \\ 0 & \text{if } c < \tilde{c} \end{cases} \tag{22}$$
    For all the following experiments, $\tilde{c}$ is defined as the 10th percentile of pixels in the image that belong to the region of interest. This percentile was chosen as an optimal value for brain tumor images; however, it could also be considered as a parameter to be optimized within the process outlined in the section on parameter optimization.
Algorithm 1 DSMC algorithm for Boltzmann equation
1: Given $N$ particles $(\mathbf{x}_n^0, c_n^0)$, with $n = 1, \dots, N$, computed from the initial distribution $f_0(\mathbf{x}, c)$;
2: for $t = 1$ to $N_t$ do
3:     set $n_p = \mathrm{Sround}(N/2)$;
4:     sample $n_p$ pairs $(i, j)$ uniformly without repetition among all possible pairs of particles at time step $t$;
5:     for each pair $(i, j)$, sample $\eta$, $\eta_*$;
6:     for each pair $(i, j)$, compute the data change
       $$\begin{aligned} \Delta \mathbf{x}_i^t &= \epsilon P_{\Delta_1, \Delta_2}(\mathbf{x}_i^t, \mathbf{x}_j^t, c_i^0, c_j^0)\,(\mathbf{x}_j^t - \mathbf{x}_i^t) + \sqrt{2\sigma^2 D(c_i^0)}\,\eta \\ \Delta \mathbf{x}_j^t &= \epsilon P_{\Delta_1, \Delta_2}(\mathbf{x}_j^t, \mathbf{x}_i^t, c_j^0, c_i^0)\,(\mathbf{x}_i^t - \mathbf{x}_j^t) + \sqrt{2\sigma^2 D(c_j^0)}\,\eta_* \end{aligned}$$
       and compute
       $$\mathbf{x}_{i,j}^{t+1} = \mathbf{x}_{i,j}^t + \Delta \mathbf{x}_{i,j}^t$$
7: end for
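Algorithm 1 can be sketched in plain Python as follows. Several choices here are assumptions made for illustration rather than details given in the article: pairs are drawn as disjoint couples, each component of the noise has variance $\epsilon$, and the diffusion function $D(c) = c(1 - c)$ is a hypothetical example that vanishes at $c = 0$ and $c = 1$ and is maximal at the center.

```python
import math
import random

def diffusion(c):
    """Hypothetical diffusion weight D(c): zero at c = 0 and c = 1, maximal at c = 0.5."""
    return c * (1.0 - c)

def sround(x, rng):
    """Stochastic rounding of a non-negative real number."""
    floor_x = math.floor(x)
    return floor_x + 1 if rng.random() < x - floor_x else floor_x

def dsmc(positions, features, delta1, delta2, sigma2, eps, n_steps, diffusion, seed=0):
    """Evolve pixel positions with the binary interaction rule; features stay fixed."""
    rng = random.Random(seed)
    pos = [list(p) for p in positions]
    n = len(pos)
    for _ in range(n_steps):
        # Number of interacting pairs, capped so that no particle is used twice.
        n_pairs = min(sround(n / 2, rng), n // 2)
        order = rng.sample(range(n), 2 * n_pairs)  # assumption: disjoint pairs
        for k in range(n_pairs):
            i, j = order[2 * k], order[2 * k + 1]
            near = math.dist(pos[i], pos[j]) <= delta1
            alike = abs(features[i] - features[j]) <= delta2
            p = 1.0 if (near and alike) else 0.0
            amp_i = math.sqrt(2.0 * sigma2 * diffusion(features[i]))
            amp_j = math.sqrt(2.0 * sigma2 * diffusion(features[j]))
            old_i, old_j = pos[i][:], pos[j][:]
            for d in range(2):
                # Assumption: independent noise components with variance eps.
                eta_i = rng.gauss(0.0, math.sqrt(eps))
                eta_j = rng.gauss(0.0, math.sqrt(eps))
                pos[i][d] = old_i[d] + eps * p * (old_j[d] - old_i[d]) + amp_i * eta_i
                pos[j][d] = old_j[d] + eps * p * (old_i[d] - old_j[d]) + amp_j * eta_j
    return pos

# Toy usage: four pixels with the same gray level and no noise collapse to their mean.
pixels = [(-0.5, 0.0), (0.5, 0.0), (0.0, 0.4), (0.0, -0.4)]
grays = [0.5, 0.5, 0.5, 0.5]
final = dsmc(pixels, grays, delta1=5.0, delta2=1.0, sigma2=0.0,
             eps=0.5, n_steps=50, diffusion=diffusion)
```

Each time step processes at most $N/2$ pairs, which matches the linear cost in the number of particles mentioned above.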
Following this procedure, we apply two morphological refinement steps to remove small regions that have been misclassified as foreground parts and to fill small regions that have been incorrectly categorized as background pixels. We begin by labeling all the connected pixels in the foreground and reassigning to the background those whose number of pixels is less than a certain threshold. Then, we repeat the same procedure but for the pixels in the background. To this end, we use the scikit-image Python library that detects distinct objects of a binary image [54]. This enables us to obtain more precise segmentation masks by reducing small imperfections. This entire process is illustrated in Figure 4.
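The two refinement passes can be illustrated without any imaging library. The sketch below is a pure-Python stand-in and does not use the scikit-image API; the 4-connectivity, the function names, and the `min_size` parameter are assumptions made for illustration.

```python
from collections import deque

def connected_components(mask, value):
    """Return the 4-connected regions of cells equal to `value` as lists of (row, col)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != value or seen[r][c]:
                continue
            seen[r][c] = True
            queue, region = deque([(r, c)]), []
            while queue:  # breadth-first flood fill from the seed cell
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and mask[ny][nx] == value):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions

def refine(mask, min_size):
    """Remove small foreground regions, then fill small background regions."""
    out = [row[:] for row in mask]
    for value in (1, 0):  # foreground pass first, background pass second
        for region in connected_components(out, value):
            if len(region) < min_size:
                for y, x in region:
                    out[y][x] = 1 - value
    return out

# Toy usage: one isolated foreground pixel and one single-pixel hole.
noisy = [
    [1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 0, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
cleaned = refine(noisy, min_size=2)
```

In the toy mask, the isolated pixel in the top-left corner is reassigned to the background and the hole inside the square is filled.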

Parameter Optimization

In this section, we outline the procedure for optimizing the parameters $\Delta_1 > 0$, $\Delta_2 > 0$, and $\sigma^2 > 0$ that best approximate the ground truth segmentation masks. The goal is to identify the parameter configuration that minimizes the discrepancy between the computed and ground truth masks, measured through a predefined loss metric. To achieve this, we solve the following minimization problem:
$$\min_{\Delta_1, \Delta_2, \sigma^2 > 0} \mathrm{Loss}(S_g, S_t) = \min_{\Delta_1, \Delta_2, \sigma^2 > 0} \big(1 - \mathrm{Metric}(S_g, S_t)\big) \tag{23}$$
where $S_g$ is the ground truth segmentation mask and $S_t$ is the segmentation mask computed by the model. The different loss metrics quantify the discrepancy between the masks, with lower values indicating greater similarity. Accordingly, the metric function, detailed in Section 3.3, measures the similarity between the two masks, with higher values indicating better agreement. The relationship $\mathrm{loss} = 1 - \mathrm{metric}$ is satisfied when the metric is defined to take a value of 1 for perfect agreement and 0 for complete mismatch.
To solve the optimization problem (23), we used the Hyperopt package [55]. This optimization method randomly samples the parameter configurations from predefined distributions and selects the configuration that minimizes the loss metric. This sampling process is repeated for a predefined number of iterations. In this work, we sample the values of our parameters from the following distributions:
$$\Delta_1 \sim \mathcal{U}(\Delta x, 0.7), \qquad \Delta_2 \sim \mathcal{U}(0.05, 0.3), \qquad \sigma^2 \sim \text{log-uniform}(e^{-5}, 1) \tag{24}$$
where $\Delta x$ represents the distance between the initial positions of the pixels at $t = 0$. We perform 300 iterations of the optimization process. To ensure reproducibility and correctly compare the different results obtained, the random seed for parameter sampling is fixed.
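The sampling scheme can be illustrated with a plain random search written with the standard library. This is a stand-in for the idea, not the Hyperopt API; the function names and the toy loss are hypothetical, and in practice the loss would run the segmentation pipeline and return one minus the chosen metric.

```python
import math
import random

def sample_parameters(rng, dx):
    """Draw one configuration from the three sampling distributions."""
    return {
        "delta1": rng.uniform(dx, 0.7),
        "delta2": rng.uniform(0.05, 0.3),
        "sigma2": math.exp(rng.uniform(-5.0, 0.0)),  # log-uniform on [e^-5, 1]
    }

def random_search(loss, dx, n_iter=300, seed=0):
    """Keep the sampled configuration with the smallest loss; the seed fixes the run."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_iter):
        params = sample_parameters(rng, dx)
        value = loss(params)
        if value < best_loss:
            best_params, best_loss = params, value
    return best_params, best_loss

def toy_loss(params):
    """Hypothetical smooth loss, used only to exercise the search loop."""
    return (params["delta1"] - 0.3) ** 2 + (params["delta2"] - 0.1) ** 2

best_params, best_loss = random_search(toy_loss, dx=0.01)
```

Fixing the seed makes two runs with the same loss return identical configurations, which is what allows results obtained with different metrics to be compared.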

3.3. Segmentation Metrics

Next, we introduce the principal optimization metrics used for evaluating a binary segmentation mask. We define $\{S_g^0, S_g^1, S_t^0, S_t^1\}$, where $S_g^0$ and $S_g^1$ represent the sets of pixels that belong to the background and foreground of the ground truth segmentation mask, respectively. The same applies for $S_t^0$, $S_t^1$ but for the binary mask we want to evaluate. One could also wish to assess the validity of a segmentation mask with multiple labels; we refer to [45] for an introduction to the subject. Figure 5 presents a summary of the key terms used in the definitions of metrics.

3.3.1. Volumetric and Surface Dice Indexes

The Volumetric Dice Index, also known as the Standard Volumetric Dice Similarity Coefficient, first introduced in [42], is the most used metric when evaluating volumetric segmentation masks. It is defined as follows:
$$\mathrm{DICE} = \frac{2\,|S_g^1 \cap S_t^1|}{|S_g^1| + |S_t^1|} \tag{25}$$
where $|\cdot|$ indicates the total number of pixels of the considered region. This metric is equal to one if there is a perfect overlap between the two segmentation masks and zero if the two segmentation masks are completely disjoint. Since the Volumetric Dice Coefficient is the most commonly used metric, especially in the biomedical field, the results are highly interpretable and can be compared with those obtained in other studies. However, when assessing surface segmentation masks, the Volumetric Dice Coefficient can yield suboptimal results. This limitation arises because the Volumetric Dice Coefficient evaluates the similarity between segmentation masks based on pixel overlap without considering the spatial accuracy of the boundaries. Specifically, it treats all pixel displacements equally without considering how far a segmentation error might be from the true boundary of the object. This means that segmentation masks with minor errors spread across multiple areas and those with a major error in a single area might receive similar scores. To address this limitation, the Surface Dice Similarity Coefficient was presented in [5] as a metric that can assess the accuracy of segmentation masks by considering the similarity of their boundaries. We define $\zeta : I \to \mathbb{R}^2$ as a parameterization of $\partial S_i$, the boundary of the segmentation mask $S_i$. The border region $B_i^{(\tau)}$, which is a region around the boundary $\partial S_i$ with tolerance $\tau$, is defined as
$$B_i^{(\tau)} = \left\{ x \in \mathbb{R}^2 : \exists\, y \in I \ \text{s.t.} \ \|x - \zeta(y)\| \le \tau \right\} \tag{26}$$
where $\tau$ is a positive real number that defines the maximum allowable distance from the boundary $\partial S_i$ for a point $x$ to be considered part of the border region $B_i^{(\tau)}$. The Surface Dice Similarity Coefficient between $S_t$ and $S_g$ with tolerance $\tau$ is defined as
$$R_{g,t}^{(\tau)} = \frac{2\,|B_g^{(\tau)} \cap B_t^{(\tau)}|}{|B_g^{(\tau)}| + |B_t^{(\tau)}|}. \tag{27}$$
$R_{g,t}^{(\tau)}$ ranges from 0 to 1. A score of 1 indicates a perfect overlap between the two surfaces, while a score of 0 indicates no overlap. A larger value of $\tau$ results in a wider border region, making the metric more tolerant to small deviations in the boundary.
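Both coefficients can be computed on small pixel grids with a few set operations. The sketch below is illustrative: it reads the surface coefficient as the overlap of the two border regions, takes boundary pixels to be foreground pixels with a 4-neighbour outside the foreground, and returns 1 for two empty masks. These conventions and all function names are assumptions, not details given in the article.

```python
def foreground(mask):
    """Set of (row, col) coordinates of the foreground pixels of a binary mask."""
    return {(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v == 1}

def volumetric_dice(mask_g, mask_t):
    """Twice the foreground overlap divided by the total foreground size."""
    g, t = foreground(mask_g), foreground(mask_t)
    if not g and not t:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * len(g & t) / (len(g) + len(t))

def boundary(mask):
    """Foreground pixels with at least one 4-neighbour outside the foreground."""
    fg = foreground(mask)
    return {(r, c) for (r, c) in fg
            if any(n not in fg for n in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)))}

def border_region(mask, tau):
    """Grid pixels within Euclidean distance tau of a boundary pixel."""
    rows, cols = len(mask), len(mask[0])
    edge = boundary(mask)
    return {(r, c) for r in range(rows) for c in range(cols)
            if any((r - br) ** 2 + (c - bc) ** 2 <= tau ** 2 for br, bc in edge)}

def surface_dice(mask_g, mask_t, tau):
    """Dice-type overlap of the two border regions with tolerance tau."""
    bg, bt = border_region(mask_g, tau), border_region(mask_t, tau)
    if not bg and not bt:
        return 1.0  # assumed convention, as above
    return 2.0 * len(bg & bt) / (len(bg) + len(bt))

# Toy usage: a 2x2 square and the same square shifted one pixel to the right.
truth = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
shifted = [[0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0]]
dice_value = volumetric_dice(truth, shifted)
surface_value = surface_dice(truth, shifted, tau=1.0)
```

The brute-force distance test in `border_region` is quadratic in the number of pixels and is only meant for small examples.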

3.3.2. Jaccard Index

The Jaccard Index (JAC) [43], similar to the Volumetric Dice Coefficient, measures the similarity between two segmentation masks by quantifying the overlap between the computed mask and the ground truth. It is defined as the ratio between the intersection and the union of the foreground segmentation masks
$$\mathrm{JAC} = \frac{|S_g^1 \cap S_t^1|}{|S_g^1 \cup S_t^1|}.$$
The JAC Index and the Volumetric Dice Coefficient are closely related since we have
$$\mathrm{JAC} = \frac{\mathrm{DICE}}{2 - \mathrm{DICE}}, \qquad \mathrm{DICE} = \frac{2\,\mathrm{JAC}}{1 + \mathrm{JAC}}. \tag{29}$$
By (29), each index is an increasing function of the other. While both are widely used for measuring segmentation similarity, they can therefore produce slightly different values for the same pair of masks. To understand the implications of these differences, we can analyze how their absolute and relative errors are related.
Definition 1 
(Absolute Approximation). A similarity $S$ is absolutely approximated by $\tilde{S}$ with error $\epsilon \ge 0$ if the following holds for all $y$ and $\tilde{y}$:
$$|S(y, \tilde{y}) - \tilde{S}(y, \tilde{y})| \le \epsilon$$
Definition 2 
(Relative Approximation). A similarity $S$ is relatively approximated by $\tilde{S}$ with error $\epsilon \ge 0$ if the following holds for all $y$ and $\tilde{y}$:
$$\frac{\tilde{S}(y, \tilde{y})}{1 + \epsilon} \le S(y, \tilde{y}) \le \tilde{S}(y, \tilde{y}) \cdot (1 + \epsilon).$$
The following result holds.
Proposition 1. 
JAC and Volumetric Dice approximate each other with a relative error of 1 and an absolute error of $3 - 2\sqrt{2}$.
We direct the reader to [46] for a deeper comparison between the Jaccard and Volumetric Dice indexes.
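The bounds in Proposition 1 can be checked numerically from (29). The sketch below is an illustrative check rather than part of the original analysis; the function names and the grid resolution are assumptions made for this example.

```python
import math


def dice_from_jaccard(jac):
    """DICE as a function of JAC, second identity in (29)."""
    return 2 * jac / (1 + jac)


def jaccard_from_dice(dice):
    """JAC as a function of DICE, first identity in (29)."""
    return dice / (2 - dice)


# Scan JAC over (0, 1] on a uniform grid (resolution chosen arbitrarily).
jac_grid = [k / 10_000 for k in range(1, 10_001)]

# Largest absolute gap DICE - JAC and largest ratio DICE / JAC on the grid.
max_abs_gap = max(dice_from_jaccard(j) - j for j in jac_grid)
max_ratio = max(dice_from_jaccard(j) / j for j in jac_grid)

print(max_abs_gap, 3 - 2 * math.sqrt(2))
print(max_ratio)
```

The largest gap is attained near $\mathrm{JAC} = \sqrt{2} - 1$, and the ratio $\mathrm{DICE}/\mathrm{JAC} = 2/(1 + \mathrm{JAC})$ stays below 2 and approaches it as $\mathrm{JAC} \to 0$, which corresponds to $\epsilon = 1$ in Definition 2.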

3.3.3. $F_\beta$-Measure

The $F_\beta$-measure is commonly used as an information retrieval metric [41,56]. To define this metric, we first introduce two terms: positive predictive value (PPV) and true positive rate (TPR), which are also known as precision and sensitivity, respectively. Precision quantifies the proportion of correctly predicted foreground pixels (true positives, TPs) out of all pixels predicted as foreground (TPs + false positives, FPs). Sensitivity measures the proportion of actual foreground pixels correctly identified by the model (TPs) out of all actual foreground pixels (TPs + false negatives, FNs). These two metrics can be expressed as follows
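$$\mathrm{Precision} = \mathrm{PPV} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}, \qquad \mathrm{Sensitivity} = \mathrm{TPR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}} \tag{30}$$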
The precision metric indicates how many of the predicted foreground pixels are actually correct. The sensitivity metric, on the other hand, measures how many of the actual foreground pixels were correctly predicted by the model.
We can define the $F_\beta$-measure as a combination of precision and sensitivity, with a parameter $\beta$ that controls the trade-off between these two metrics. Specifically, the $F_\beta$-measure is given by
$$\mathrm{FMS}_\beta = \frac{(\beta^2 + 1) \cdot \mathrm{PPV} \cdot \mathrm{TPR}}{\beta^2 \cdot \mathrm{PPV} + \mathrm{TPR}} \tag{31}$$
We may observe that, if $\beta = 1$, we obtain the Volumetric Dice metric.
To understand the impact of $\beta$ in the $F_\beta$-measure, we can substitute the definitions of PPV and TPR into (31), which results in the following
$$\mathrm{FMS}_\beta = \frac{(\beta^2 + 1)\,\mathrm{TP}^2}{(\beta^2 + 1)\,\mathrm{TP}^2 + \mathrm{TP}\,(\beta^2\,\mathrm{FN} + \mathrm{FP})} \tag{32}$$
If $\beta > 1$, the $F_\beta$-measure emphasizes minimizing false negatives (maximizing sensitivity), which can lead to more false positives (lower precision). If $\beta < 1$, the $F_\beta$-measure focuses on minimizing false positives (maximizing precision), potentially increasing the number of false negatives (lower sensitivity).
Furthermore, it can be noticed that
$$\lim_{\beta \to \infty} \frac{(\beta^2 + 1)\,\mathrm{TP}^2}{(\beta^2 + 1)\,\mathrm{TP}^2 + \mathrm{TP}\,(\beta^2\,\mathrm{FN} + \mathrm{FP})} = \mathrm{Sensitivity} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}} \tag{33}$$
since for $\beta \gg 0$ the contribution of the false positives becomes negligible compared with that of the false negatives, and we recover the TPR metric defined in (30).
In summary, thanks to the $\beta$ parameter, the $F_\beta$-measure offers a flexible way to evaluate segmentation models by enabling a tunable balance between precision and sensitivity. It provides a useful metric when dealing with class imbalances, especially in the field of medical imaging, where the relative importance of false positives and false negatives can vary according to each segmentation task.
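The role of $\beta$ can be illustrated with a short Python sketch that evaluates the $F_\beta$-measure both from PPV and TPR, as in (31), and directly from the pixel counts. The function names and the counts TP = 80, FP = 10, FN = 30 are hypothetical values chosen only for illustration; they are not taken from the experiments.

```python
def precision(tp, fp):
    """PPV = TP / (TP + FP)."""
    return tp / (tp + fp)


def sensitivity(tp, fn):
    """TPR = TP / (TP + FN)."""
    return tp / (tp + fn)


def f_beta(tp, fp, fn, beta):
    """F_beta written in terms of PPV and TPR, as in (31)."""
    ppv, tpr = precision(tp, fp), sensitivity(tp, fn)
    b2 = beta ** 2
    return (b2 + 1) * ppv * tpr / (b2 * ppv + tpr)


def f_beta_from_counts(tp, fp, fn, beta):
    """The same quantity after substituting PPV and TPR by the pixel counts."""
    b2 = beta ** 2
    return (b2 + 1) * tp ** 2 / ((b2 + 1) * tp ** 2 + tp * (b2 * fn + fp))


def dice_from_counts(tp, fp, fn):
    """Volumetric Dice expressed with counts: 2 TP / (2 TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical counts with more false negatives than false positives.
TP, FP, FN = 80, 10, 30

scores = {beta: f_beta(TP, FP, FN, beta) for beta in (0.25, 0.5, 1, 2, 4, 1000)}
for beta, value in scores.items():
    print(beta, round(value, 4))
```

With these counts the value at $\beta = 1$ coincides with the Volumetric Dice Coefficient, and for very large $\beta$ it approaches the sensitivity. Because the counts are held fixed and FN exceeds FP here, the score decreases as $\beta$ grows even though the underlying masks do not change.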

4. Numerical Results

4.1. Impact of Different Diffusion Functions

In this section, we study the impact of choosing different diffusion functions $D(c)$ in images consisting of a blurry background and a geometric shape in the center, as shown in Figure 6. The objective is to detect the shape of the geometric figure and to compare how the choice of diffusion function affects the values of the model parameters $\Delta_1$, $\Delta_2$, and $\sigma^2$. The optimization process is identical to the one introduced in the section on parameter optimization. To this end, we chose the following diffusion functions:
$$D_1(c) = c(1-c), \qquad D_2(c) = 4c^2(1-c)^2, \qquad D_3(c) = \begin{cases} c^2 & \text{if } c \le 0.5, \\ c^2(1-c) & \text{if } c > 0.5, \end{cases} \qquad D_4(c) = 64c^4(1-c)^4. \tag{34}$$
We direct the reader to Figure 7 for a summary of the various introduced diffusion functions in (34).
For both the square and circle images, the Surface Dice Coefficient with a tolerance equal to the length of 1 pixel was used to optimize the parameters. Both images consist of $(256, 256)$ pixels. The final time was set to $T = 200$ with $\Delta t = 0.1$. The resulting binary mask was the same for all choices of diffusion function, with the same loss function value. The results are shown in Figure 6. In the case of the square in Figure 6a, we can see from Table 1 that the values of $\Delta_1$ obtained for $D_1(c)$ and $D_3(c)$ do not differ greatly. For $\Delta_2$, we obtain a slightly smaller value for $D_1(c)$ than for $D_3(c)$, and the parameter $\sigma^2 > 0$ is larger for $D_3(c)$ than for $D_1(c)$. If we look at Figure 7, we notice that $D_1(c) \ge D_3(c)$. Therefore, a larger diffusion function is balanced by a smaller value of $\sigma^2$ to obtain a similar diffusion effect. The same holds for $D_1(c)$ and $D_3(c)$ for the circle in Figure 6b. Furthermore, comparing $D_2(c)$ and $D_4(c)$ for the square image, we can see that the resulting parameters are smaller for $D_2(c)$ than for $D_4(c)$. This is again consistent with Figure 7, where $D_2(c) \ge D_4(c)$. For the circle image, the value of $\sigma^2$ is similar for $D_2(c)$ and $D_4(c)$; here, the difference lies in the values of $\Delta_1$ and $\Delta_2$, which are both smaller for $D_2(c)$. This indicates that, for different diffusion functions, the optimal parameters adjust to yield similar results. The most direct compensation is to keep similar values of $\Delta_1$ and $\Delta_2$ and to lower $\sigma^2$ for the larger diffusion function, as in the case of the square image. However, the example of the circle image shows that other combinations of parameters can also counter the effect of a larger diffusion function.
Table 2 reports the parameters obtained by optimizing three different metrics with the diffusion function $D_1(c)$ for the square image. In all cases, the resulting Surface Dice Coefficient was equal to one, indicating perfect overlap between the computed and ground truth segmentation masks. The resulting binary masks were the same for the three examples and are equivalent to the ones shown in Figure 6. For the Volumetric and Surface Dice coefficients, the parameters obtained were identical. For the Jaccard Index, however, the resulting parameters differed, being smaller in this case. The loss is zero for both the Jaccard Index and the Volumetric Dice Coefficient, consistent with the relationship in (29), which gives $\mathrm{JAC} = 1$ exactly when $\mathrm{DICE} = 1$.

4.2. Determining the Final Time

In this section, we specify the criteria that we implemented to determine the final time $T > 0$. As defined in Section 3.1, we approximate the solution of (18) through a DSMC approach even though we have no analytical insight into the form of the steady state. The objective is to find a final time $T > 0$ at which a numerical steady state can be defined. We stress that the time taken to reach the equilibrium state differs between initial conditions, so the time parameters must be determined for every image we want to analyze. To this end, if $f^n(x, c)$ is the approximation of the density at time $t^n = n \Delta t$, we define
$$\mathcal{T} = \int_{\mathbb{R}^2 \times [0,1]} \left| f^{n+1}(x, c) - f^n(x, c) \right| \, dx \, dc,$$
which represents an index of variation between two successive time steps of the reconstructed kinetic density. As the solution evolves, this quantity decreases and tends to zero as the equilibrium state is reached, as illustrated in Figure 8 for the case of the square image with a blurry background. Hence, we may introduce a stopping criterion based on the condition $\mathcal{T} < \delta$ for some $\delta > 0$. When this condition is satisfied, the reconstructed density is considered to be an approximation of the steady state.
The same procedure was conducted for all images presented in this work so as to fulfill the condition presented in this section.
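The stopping rule described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the experiments: the density is stored as cell values on a uniform grid, the integral is replaced by a sum weighted by the cell volume, and `toy_step` is a hypothetical stand-in for one DSMC update (a linear relaxation toward a fixed profile). The threshold 0.005 matches the value of $\delta$ quoted in the caption of Figure 8.

```python
def l1_variation(f_new, f_old, cell_volume):
    """Discrete analogue of the integral of |f^{n+1} - f^n| over the grid cells."""
    return cell_volume * sum(abs(a - b) for a, b in zip(f_new, f_old))


def evolve_until_steady(f0, step, dt, delta, cell_volume, max_steps=100_000):
    """Advance f with `step` until the variation index drops below delta.

    Returns the final density, the final time n * dt, and the index history.
    """
    f = list(f0)
    history = []
    for n in range(1, max_steps + 1):
        f_next = step(f, dt)
        variation = l1_variation(f_next, f, cell_volume)
        history.append(variation)
        if variation < delta:
            return f_next, n * dt, history
        f = f_next
    raise RuntimeError("no numerical steady state reached within max_steps")


# Hypothetical stand-in for one DSMC update: relaxation toward a fixed profile.
TARGET = [0.0, 0.0, 2.0, 2.0]


def toy_step(f, dt):
    return [value + dt * (target - value) for value, target in zip(f, TARGET)]


f_final, t_final, history = evolve_until_steady(
    f0=[1.0, 1.0, 1.0, 1.0],
    step=toy_step,
    dt=0.1,
    delta=0.005,
    cell_volume=0.25,
)
print(t_final, history[-1])
```

Returning the history of the index makes it straightforward to plot its decay, in the spirit of Figure 8, and to check that the threshold is crossed only at the last step.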

4.3. Optimization Metrics for Biomedical Image Segmentation

In this section, we study the impact of the different optimization metrics on the resulting binary masks for the core and whole tumor. We also analyze the parameters obtained for each metric. Both brain tumor images consist of $N = (240, 240)$ pixels, and, for the optimization procedure, we set $T = 300$ and $\Delta t = 0.01$ for both the core and whole tumor and for all the optimization metrics addressed in this section. For each segmentation mask generated, we evaluated 300 different combinations of parameters. Figure 9 shows the segmentation masks obtained for both the whole and core tumor by optimizing the Jaccard Index and the Volumetric Dice Coefficient. Table 3 reports the resulting parameters and the loss obtained for both optimization metrics; in this case, the loss is equal to 1 for a perfect overlap and 0 if the masks are totally disjoint. First, we can observe that the loss values obtained with both metrics satisfy (29), as expected. For both segmentation masks, the loss obtained is greater for the Volumetric Dice Coefficient. Furthermore, the parameter $\Delta_1$ obtained with both optimization metrics is similar for both the core and whole tumor. Nevertheless, for the whole tumor, the $\Delta_2$ parameter obtained with the Jaccard Index is larger than the one obtained with the Volumetric Dice Coefficient, whereas, for the core tumor, the $\Delta_2$ parameter is larger for the Volumetric Dice Coefficient. Comparing these with the values obtained for $\sigma^2$, we can see that, in both cases, a larger diffusion value is countered by a smaller value of $\Delta_2$, so that similar segmentation masks are obtained, as shown in Figure 9.
For the Surface Dice Coefficient, the tolerance $\tau$ was set to the length of 1 pixel, both when used as the optimization loss and when used as the evaluation metric. Figure 10 shows the binary masks obtained with the Surface Dice Coefficient and the Volumetric Dice Coefficient for the core and whole tumor. For the whole tumor, the loss obtained with the Surface Dice Coefficient is smaller than that obtained with the Jaccard Index and the Volumetric Dice Coefficient. For the core tumor, the loss obtained with the Surface Dice Coefficient is similar to that of the Jaccard Index, and both are smaller than that obtained with the Volumetric Dice Coefficient. For the whole tumor, the resulting parameters are similar for all the optimization metrics. For the core tumor, however, the parameters obtained with the Surface Dice Coefficient differ from those obtained with the Jaccard Index and the Volumetric Dice Coefficient: we obtained a smaller value for $\sigma^2$ and a slightly larger value for $\Delta_1$. This indicates that a smaller diffusion of the particles is compensated by enabling the particles to aggregate with others that are slightly farther away than in the Volumetric Dice and Jaccard cases. Since both the Volumetric Dice Coefficient and the Jaccard Index measure the superposition between two volumes (in this case, two surfaces), they do not capture the proximity between two surfaces, which makes the Surface Dice Coefficient more suitable as a loss metric when comparing two different surfaces.
For the $F_\beta$-measure, Figure 11 shows the binary masks obtained for different values of $\beta$ for the core and whole tumor. For the core tumor, we can observe that $\beta = 0.25$ produces areas of misclassified pixels in the tumor region. This can also be seen from Table 4, where the number of false negatives is larger and the number of false positives is smaller than for larger values of $\beta$. Recalling (31), for low values of $\beta$ the false negatives are multiplied by a factor of $\beta^2$ and thus carry a smaller weight than the false positives. As we increase $\beta$, both Table 4 and Figure 11 show that its value has no further impact on the resulting binary mask. This also holds for the whole tumor, where no difference can be noticed in the results obtained for different values of $\beta$. Finally, Figure 12 reports the loss for different values of $\beta$, where a loss equal to 1 represents a perfect overlap. First, the highest loss value is obtained for $\beta = 0.25$, which would suggest the most accurate result; this mask, however, contains a larger number of false negatives. Again, this follows from (31), where low values of $\beta$ reduce the impact of a large number of false negatives on the resulting loss. Second, the loss decreases for larger values of $\beta$ while the resulting segmentation masks remain unchanged, as shown in Table 4. This behavior arises because, for fixed masks, increasing $\beta$ only increases the weight $\beta^2$ assigned to the false negatives in (31). This shows that the $F_\beta$-measure may not be a reliable metric for these types of segmentation masks and this segmentation method, and that modifying the value of $\beta$ provides no advantage.

5. Conclusions

In this paper, we presented a consensus-based kinetic method and demonstrated how this model can be applied to the problem of image segmentation. Each pixel of a 2D image is interpreted as a particle that interacts with the others through a consensus-type process, which enables us to identify different clusters and generate an image segmentation. We developed a procedure that enables us to approximate the ground truth segmentation masks of different brain tumor images. Furthermore, we presented and evaluated different optimization metrics and studied their impact on the results obtained. In particular, we found that the Jaccard Index and the Volumetric and Surface Dice coefficients are appropriate metrics to optimize our model. Nevertheless, given that the Surface Dice Coefficient measures the discrepancy between the boundaries of two surfaces, it is a better representation than the Jaccard Index and the Volumetric Dice Coefficient, which account only for absolute differences and do not capture pointwise differences. We also assessed the use of the $F_\beta$-loss as a potential optimization metric. We found that both the loss values and the corresponding results were difficult to interpret, as low loss values often corresponded to low accuracy, making this metric challenging to apply effectively for optimization in this context. Future research will focus on the case of multidimensional features to deal with color images, as RGB color models are defined by 3D features specifying red, green, and blue values. To this end, we plan to define a pipeline for learning model parameters depending on these multidimensional characteristics, aiming to enhance accuracy and applicability in real-world scenarios.

Author Contributions

Conceptualization, M.Z. and R.F.C.; methodology, M.Z. and R.F.C.; software, R.F.C. and H.T.; validation, M.Z., R.F.C. and H.T.; formal analysis, M.Z. and H.T.; investigation, R.F.C., H.T. and M.Z.; resources, R.F.C. and H.T.; data curation, R.F.C. and H.T.; writing—original draft preparation, R.F.C., M.Z. and H.T.; visualization, R.F.C. and H.T.; supervision, M.Z.; project administration, M.Z.; funding acquisition, M.Z. All authors have read and agreed to the published version of the manuscript.

Funding

M.Z. is member of GNFM (Gruppo Nazionale di Fisica Matematica) of INdAM, Italy, and acknowledges support of PRIN2022PNRR project No. P2022Z7ZAJ, European Union—NextGenerationEU. M.Z. acknowledges partial support by ICSC—Centro Nazionale di Ricerca in High Performance Computing, Big Data and Quantum Computing, funded by European Union—NextGenerationEU.

Data Availability Statement

All data are publicly available at http://medicaldecathlon.com/ (accessed on 28 November 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Agosti, A.; Shaqiri, E.; Paoletti, M.; Solazzo, F.; Bergsland, N.; Colelli, G.; Savini, G.; Muzic, S.; Santini, F.; Deligianni, X.; et al. Deep learning for automatic segmentation of thigh and leg muscles. Magn. Reson. Mater. Phys. Biol. Med. 2021, 35, 467–483.
  2. Barbano, R.; Arridge, S.; Jin, B.; Tanno, R. Uncertainty quantification in medical image synthesis. In Biomedical Image Synthesis and Simulation: Methods and Applications; The MICCAI Society book Series, Biomedical Image Synthesis and Simulation; Academic Press: Cambridge, MA, USA, 2022; pp. 601–641.
  3. Coupé, P.; Manjón, J.; Fonov, V.; Pruessner, J.; Robles, M.; Collins, D. Patch-based segmentation using expert priors: Application to hippocampus and ventricle segmentation. NeuroImage 2011, 54, 940–954.
  4. Medaglia, A.; Colelli, G.; Farina, L.; Bacila, A.; Bini, P.; Marchioni, E.; Figini, S.; Pichiecchio, A.; Zanella, M. Uncertainty quantification and control of kinetic models of tumour growth under clinical uncertainties. Int. J. Non-Linear Mech. 2022, 141, 103933.
  5. Nikolov, S.; Blackwell, S.; Mendes, R.; Fauw, J.; Meyer, C.; Hughes, C.; Askham, H.; Romera-Paredes, B.; Karthikesalingam, A.; Chu, C.; et al. Deep learning to achieve clinically applicable segmentation of head and neck anatomy for radiotherapy. arXiv 2018, arXiv:1809.04430.
  6. Sharma, N.; Aggarwal, L. Automated medical image segmentation techniques. J. Med. Phys. 2010, 35, 3.
  7. Hesamian, M.; Jia, W.; He, X.; Kennedy, P. Deep learning techniques for medical image segmentation: Achievements and challenges. J. Digit. Imaging 2019, 32, 582–596.
  8. Isensee, F.; Jaeger, P.; Kohl, S.; Petersen, J.; Maier-Hein, K. nnU-Net: A self-configuring method for deep learning-based biomedical image segmentation. Nat. Methods 2021, 18, 203–211.
  9. Kwon, Y.; Won, J.; Kim, B.; Paik, M. Uncertainty quantification using Bayesian neural networks in classification: Application to biomedical image segmentation. Comput. Stat. Data Anal. 2020, 142, 106816.
  10. Liu, X.; Song, L.; Liu, S.; Zhang, Y. A review of deep-learning-based medical image segmentation methods. Sustainability 2021, 13, 1224.
  11. Lizzi, F.; Agosti, A.; Brero, F.; Cabini, R.F.; Fantacci, M.E.; Figini, S.; Lascialfari, A.; Laruina, F.; Oliva, P.; Piffer, S.; et al. Quantification of pulmonary involvement in COVID-19 pneumonia by means of a cascade of two U-nets: Training and assessment on multiple datasets using different annotation criteria. Int. J. Comput. Assist. Radiol. Surg. 2022, 17, 229–237.
  12. Lizzi, F.; Postuma, I.; Brero, F.; Cabini, R.F.; Fantacci, M.E.; Lascialfari, A.; Oliva, P.; Rinaldi, L.; Retico, A. Quantification of pulmonary involvement in COVID-19 pneumonia: An upgrade of the LungQuant software for lung CT segmentation. Eur. Phys. J. Plus. 2023, 138, 326.
  13. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the 18th International Conference, Munich, Germany, 5–9 October 2015; pp. 234–241.
  14. Yu, Z.; Au, O.; Zou, R.; Yu, W.; Tian, J. An adaptive unsupervised approach toward pixel clustering and color image segmentation. Pattern Recognit. 2010, 43, 1889–1906.
  15. Zhou, Z.; Rahman Siddiquee, M.; Tajbakhsh, N.; Liang, J. Unet++: A nested u-net architecture for medical image segmentation. In Deep Learning in Medical Image Analysis And Multimodal Learning for Clinical Decision Support; Springer: Cham, Switzerland, 2018; pp. 3–11.
  16. Cordier, N.; Delingette, H.; Ayache, N. A patch-based approach for the segmentation of pathologies: Application to glioma labelling. IEEE Trans. Med. Imaging 2015, 35, 1066–1076.
  17. Frigui, H.; Krishnapuram, R. A robust competitive clustering algorithm with applications in computer vision. IEEE Trans. Pattern Anal. Mach. Intell. 1999, 21, 450–465.
  18. Jain, A.; Murty, M.; Flynn, P. Data clustering: A review. ACM Comput. Surv. 1999, 31, 264–323.
  19. Kayal, S. Unsupervised image segmentation using the Deffuant-Weisbuch model from social dynamics. Signal Image Video Process. 2017, 11, 1405–1410.
  20. Pizzagalli, D.; Gonzalez, S.; Krause, R. A trainable clustering algorithm based on shortest paths from density peaks. Sci. Adv. 2019, 5, eaax3770.
  21. Quetti, F.M.; Figini, S.; Ballante, E. A Bayesian Approach to Clustering via the Proper Bayesian Bootstrap: The Bayesian Bagged Clustering (BBC) algorithm. arXiv 2024, arXiv:2409.08954.
  22. Cabini, R.; Pichiecchio, A.; Lascialfari, A.; Figini, S.; Zanella, M. A kinetic approach to consensus-based segmentation of biomedical images. Kinet. Relat. Models 2025, 18, 286–311.
  23. Herty, M.; Pareschi, L.; Visconti, G. Mean field models for large data–clustering problems. Netw. Heterog. Media. 2020, 15, 463.
  24. Hegselmann, R.; Krause, U. Opinion dynamics and bounded confidence models, analysis, and simulation. J. Artif. Soc. Soc. Simul. 2002, 5, 3.
  25. Deffuant, G.; Neau, D.; Amblard, F.; Weisbuch, G. Mixing beliefs among interacting agents. Adv. Complex Syst. 2000, 3, 87–98.
  26. DeGroot, M. Reaching a consensus. J. Am. Stat. Assoc. 1974, 69, 118–121.
  27. French, J., Jr. A formal theory of social power. Psychol. Rev. 1956, 63, 181–194.
  28. Sznajd-Weron, K.; Sznajd, J. Opinion evolution in closed communities. Int. J. Mod. Phys. C 2000, 11, 1157–1165.
  29. Borra, D.; Lorenzi, T. Asymptotic analysis of continuous opinion models under bounded confidence. Commun. Pure Appl. Anal. 2013, 12, 1487–1499.
  30. Castellano, C.; Fortunato, S.; Loreto, V. Statistical physics of social dynamics. Rev. Mod. Phys. 2009, 81, 591–646.
  31. Fagioli, S.; Favre, G. Opinion formation on evolving network: The DPA method applied to a nonlocal cross-diffusion PDE-ODE system. Eur. J. Appl. Math. 2024, 35, 748–775.
  32. Motsch, S.; Tadmor, E. Heterophilious dynamics enhances consensus. SIAM Rev. 2014, 56, 577–621.
  33. Albi, G.; Pareschi, L.; Toscani, G.; Zanella, M. Recent advances in opinion modeling: Control and social influence. In Active Particles Volume 1, Advances in Theory, Models, and Applications; Bellomo, N., Degond, P., Tadmor, E., Eds.; Birkhäuser: Cham, Switzerland, 2017.
  34. Carrillo, J.A.; Fornasier, M.; Rosado, J.; Toscani, G. Asymptotic flocking dynamics for the kinetic Cucker-Smale model. SIAM J. Math. Anal. 2010, 42, 218–236.
  35. Düring, B.; Wolfram, M.-T. Opinion dynamics: Inhomogeneous Boltzmann-type equations modelling opinion leadership and political segregation. Proc. R. Soc. Lond. A 2015, 471.
  36. Fagioli, S.; Radici, E. Opinion formation systems via deterministic particles approximation. Kinet. Relat. Mod. 2021, 14, 45–76.
  37. Pareschi, L.; Toscani, G. Interacting Multiagent Systems: Kinetic Equations and Monte Carlo Methods; Oxford University Press: Oxford, UK, 2013.
  38. Pareschi, L.; Tosin, A.; Toscani, G.; Zanella, M. Hydrodynamic models of preference formation in multi-agent societies. J. Nonlin. Sci. 2019, 29, 2761–2796.
  39. Toscani, G. Kinetic models of opinion formation. Commun. Math. Sci. 2006, 4, 481–496.
  40. Auricchio, G.; Codegoni, A.; Gualandi, S.; Toscani, G.; Veneroni, M. On the equivalence between Fourier-based and Wasserstein metrics. Rend. Lincei Mat. Appl. 2020, 31, 627–649.
  41. Chinchor, N. MUC-4 Evaluation Metrics. 1992. Available online: https://aclanthology.org/M92-1002.pdf (accessed on 28 November 2024).
  42. Dice, L. Measures of the Amount of Ecologic Association Between Species. Ecology 1945, 26, 297–302.
  43. Jaccard, P. The distribution of the flora in the alpine zone.1. New Phytol. 1912, 11, 37–50.
  44. Mittal, H.; Pandey, A.; Saraswat, M.; Kumar, S.; Pal, R.; Modwel, G. A comprehensive survey of image segmentation: Clustering methods, performance parameters, and benchmark datasets. Multimed. Tools. Appl. 2022, 81, 35001–35026.
  45. Taha, A.; Hanbury, A. Metrics for evaluating 3D medical image segmentation: Analysis, selection, and tool. BMC Med. Imaging 2015, 15, 29.
  46. Bertels, J.; Eelbode, T.; Berman, M.; Vandermeulen, D.; Maes, F.; Bisschops, R.; Blaschko, M. Optimizing the Dice Score and Jaccard Index for Medical Image Segmentation: Theory & Practice. arXiv 2019, arXiv:1911.01685.
  47. Albi, G.; Pareschi, L.; Zanella, M. On the optimal control of opinion dynamics on evolving networks. In Proceedings of the 27th IFIP TC 7 Conference, CSMO 2015, Sophia Antipolis, France, 29 June–3 July 2015; IFIP Advances in Information and Communication, Technology. Bociu, L., Désidéri, J.A., Habbal, A., Eds.; Springer: Cham, Switzerland, 2016; Volume 494.
  48. Nugent, A.; Gomes, S.N.; Wolfram, M.-T. Steering opinion dynamics through control of social networks. arXiv 2024, arXiv:2404.09849.
  49. Carrillo, J.A.; Fornasier, M.; Toscani, G.; Vecil, F. Particle, kinetic, and hydrodynamic models of swarming. In Mathematical Modeling of Collective Behavior in Socio-Economic and Life Sciences; Modeling and Simulation in Science, Engineering and Technology; Naldi, G., Pareschi, L., Toscani, G., Eds.; Birkhäuser: Basel, Switzerland, 2010.
  50. Piccoli, B.; Tosin, A.; Zanella, M. Model-based assessment of the impact of driver-assist vehicles using kinetic theory. Z. Angew. Math. Phys. 2020, 71, 152.
  51. Dimarco, G.; Pareschi, L. Numerical methods for kinetic equations. Acta Numer. 2014, 23, 369–520.
  52. Pareschi, L.; Russo, G. An Introduction to Monte Carlo Methods for the Boltzmann Equation. ESAIM Proc. 1999, 10, 35–76.
  53. Pareschi, L.; Zanella, M. Structure preserving schemes for nonlinear Fokker-Planck equations and applications. J. Sci. Comput. 2018, 74, 1575–1600.
  54. Van Der Walt, S.; Schönberger, J.; Nunez-Iglesias, J.; Boulogne, F.; Warner, J.; Yager, N.; Gouillart, E.; Yu, T. Scikit-image: Image processing in python. PeerJ 2014, 2, e453.
  55. Bergstra, J.; Komer, B.; Eliasmith, C.; Yamins, D.; Cox, D. Hyperopt: A Python library for model selection and hyperparameter optimization. Comput. Sci. Discov. 2015, 8, 014008.
  56. Sasaki, Y. The truth of the F-measure. Teach Tutor Mater 2007, 1, 1–5.
Figure 1. Large time distribution of the 2D-bounded confidence model for different parameters characterizing the compromise propensity and the diffusion for $N = 10^5$ particles in $[0, T]$ with $T = 100$ and $\Delta t = 0.01$. In (a), the final state converges to a number of clusters depending on the value of $\Delta$. As we reduce the range of interaction, more clusters are created. In rows (b,c), we can see the interplay between the tendency of particles to aggregate and diffuse. In the first column, we see that the steady state converges to a Gaussian distribution with a standard deviation provided by $\sigma^2$. In the second column, for (b,c), we see that the final states differ greatly in their structure. Finally, the last column shows the final states in the case where the diffusion considerably surpasses the aggregation tendency.
Figure 2. A schematic representation of the proposed model, where each pixel is interpreted as a particle $(x_i, y_i, c_i)$, with $c_i$ being a static feature in the interval $[0, 1]$ that represents the grey level.
Figure 3. Representation of the evolution of pixels as they tend to aggregate in different clusters.
Figure 4. Summary of the segmentation process. The first image shows the input image. By means of Algorithm 1, we generate the multi-level mask, where each pixel's gray level is reassigned to the mean value of the cluster it is assigned to. The binary mask is produced as a result of the binarization process. The final mask is the result after the two morphological refinement steps have been applied.
Figure 5. Representation of the relevant areas between the predicted $S_t^1$ and ground truth $S_g^1$ segmentation masks. $B_t^{(\tau)}$ and $B_g^{(\tau)}$ represent the corresponding boundaries with a $\tau$ threshold. (a) Intersection area or true positive (TP). (b) Union area. (c) False positive (FP). (d) False negative (FN). (e) Intersection of boundaries at $\tau = 0$. (f) Intersection of boundaries at $\tau > 0$.
Figure 5. Representation of the relevant areas between the predicted S t 1 and ground truth S g 1 segmentation masks. B t τ and B g τ represent the corresponding boundaries with a τ threshold. (a) Intersection area or true positive (TP). (b) Union area. (c) False positive (FP). (d) False negative (FN). (e) Intersection of boundaries at τ = 0 . (f) Intersection of boundaries at τ > 0 .
Entropy 27 00149 g005aEntropy 27 00149 g005b
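The overlap areas in Figure 5 can be made concrete with a short sketch. It assumes binary masks stored as nested lists of 0/1 and uses the standard definitions of the Jaccard Index, TP / (TP + FP + FN), and the Volumetric Dice Coefficient, 2TP / (2TP + FP + FN). The function names and the toy masks are hypothetical and not taken from the article.

```python
def confusion_counts(pred, truth):
    """Count (TP, FP, FN) between two binary masks given as nested lists of 0/1."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, g in zip(pred_row, truth_row):
            if p and g:
                tp += 1          # pixel in both masks (intersection)
            elif p:
                fp += 1          # predicted foreground, ground-truth background
            elif g:
                fn += 1          # ground-truth foreground missed by the prediction
    return tp, fp, fn


def jaccard(tp, fp, fn):
    """Intersection over union; two empty masks are treated as a perfect match."""
    union = tp + fp + fn
    return tp / union if union else 1.0


def volumetric_dice(tp, fp, fn):
    """Twice the intersection over the sum of the two mask areas."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


# Toy masks (illustrative only).
pred = [[0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 1, 0]]
truth = [[0, 1, 1, 0],
         [0, 1, 1, 1],
         [0, 0, 0, 0]]

tp, fp, fn = confusion_counts(pred, truth)
dice = volumetric_dice(tp, fp, fn)
jac = jaccard(tp, fp, fn)
print(tp, fp, fn, dice, jac)
```

For any pair of masks these two scores are linked by the identity J = D / (2 - D), so the Jaccard score never exceeds the Dice score. On the toy masks, D = 0.8 and J = 4/6.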
Figure 6. Images used to test different diffusion functions. The first column displays the original images, the second column presents the expected segmentation mask, and the third column shows the resulting binary mask. Each image consists of $(256, 256)$ pixels. For the optimization procedure, we set $T = 200$ and $\Delta t = 0.1$, and we fix the number of iterations at 50. Row (a) shows the image with a square on a blurry background, while row (b) displays a similar image but with a circle. Only one resulting binary mask is reported for each image because all the tests described in this section yield the same segmentation mask.
Figure 7. Diffusion functions defined in (34) to assess the variability related to a given feature’s level.
Figure 8. Evolution of $T$, where the kinetic density is that considered in Figure 6a. The image consists of $(256, 256)$ pixels. We can observe how $T$ decreases until the condition $T < \delta$ is reached, with $\delta = 0.005$.
Figure 9. Segmentation masks obtained by minimizing the Jaccard Index and the Volumetric Dice Coefficient. Panel (a) shows the results for the core tumor and panel (b) shows the results for the whole tumor. Both images consist of $240 \times 240$ pixels. For the optimization procedure, we set $T = 300$ and $\Delta t = 0.01$. In both cases, we considered 300 iterations of the optimization algorithm, and the loss reported by the Jaccard Index was smaller than that obtained with the Volumetric Dice Coefficient. Furthermore, the reported losses satisfy (29), as expected. From the values of the parameters, we can observe that a larger value of the diffusion is countered by a smaller value of $\Delta_2$.
Figure 10. Segmentation masks obtained by minimizing the Surface and Volumetric Dice coefficients. Panel (a) shows the results for the core tumor and panel (b) shows the results for the whole tumor. Both images consist of $240 \times 240$ pixels. For the optimization procedure, we set $T = 300$ and $\Delta t = 0.01$. In both cases, we considered 300 iterations of the optimization algorithm. For the Surface Dice Coefficient, we set the tolerance $\tau$ equal to the length of 1 pixel. Given that both the Volumetric Dice Coefficient and the Jaccard Index measure the superposition between the two surfaces and do not account for the proximity between them at every given point, the Surface Dice Coefficient represents a more suitable metric when comparing two different surfaces.
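To illustrate why a boundary-based score with tolerance $\tau$ behaves differently from an overlap score, the following is a minimal sketch of one common form of the Surface Dice Coefficient: the fraction of boundary pixels of each mask that lie within Euclidean distance $\tau$ of the other mask's boundary. The boundary rule (4-neighbourhood), the brute-force distance search, and all function names are assumptions made for this example; the article's definition in terms of $B_t^\tau$ and $B_g^\tau$ may differ in detail.

```python
from math import hypot


def boundary(mask):
    """Foreground pixels with a 4-neighbour that is background or outside the image."""
    rows, cols = len(mask), len(mask[0])
    points = []
    for i in range(rows):
        for j in range(cols):
            if not mask[i][j]:
                continue
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                outside = not (0 <= ni < rows and 0 <= nj < cols)
                if outside or not mask[ni][nj]:
                    points.append((i, j))
                    break
    return points


def count_within(points, others, tau):
    """Number of points at Euclidean distance <= tau from at least one point in others."""
    return sum(
        1 for (i, j) in points
        if any(hypot(i - a, j - b) <= tau for (a, b) in others)
    )


def surface_dice(pred, truth, tau):
    """Share of boundary pixels of both masks matched within tolerance tau."""
    b_pred, b_truth = boundary(pred), boundary(truth)
    total = len(b_pred) + len(b_truth)
    if total == 0:
        return 1.0  # two empty boundaries are treated as a perfect match
    matched = count_within(b_pred, b_truth, tau) + count_within(b_truth, b_pred, tau)
    return matched / total


def square(top, left, size=3, n=6):
    """n x n binary image containing a size x size square (illustrative helper)."""
    return [[1 if top <= i < top + size and left <= j < left + size else 0
             for j in range(n)] for i in range(n)]


truth = square(1, 1)
pred = square(1, 2)  # same square shifted right by one pixel

print(surface_dice(pred, truth, tau=0))  # strict boundary coincidence
print(surface_dice(pred, truth, tau=1))  # tolerance of one pixel length
```

In this toy case a one-pixel shift gives a score of 0.5 at $\tau = 0$ and 1.0 at $\tau = 1$, whereas the volumetric overlap of the two squares stays at 2·6/(9+9) ≈ 0.67 regardless of the tolerance. The brute-force search is quadratic in the number of boundary pixels and is meant for illustration only.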
Figure 11. Segmentation masks obtained for the $F_\beta$-loss metric. Panel (a) shows the segmentation masks obtained for $\beta = 0.25$, $0.5$, $0.75$, and $1.5$ for the core tumor, and panel (b) shows the segmentation masks obtained using the same values of $\beta$ for the whole tumor. Both images consist of $240 \times 240$ pixels. For the optimization procedure, we set $T = 300$ and $\Delta t = 0.01$. In both cases, we considered 300 iterations of the optimization algorithm. In (a), we can observe that, for $\beta = 0.25$, the resulting segmentation mask displays areas of misclassified pixels, while, for larger values of $\beta$, the resulting segmentation masks do not differ. In (b), no zoomed area is shown, as the segmentation masks display no visible differences for the different values of $\beta$. This is also evident in Table 4 from the numbers of false positives (FPs), false negatives (FNs), and true positives (TPs) obtained for both images.
Figure 12. Relationship between the $F_\beta$-loss value and the $\beta$ value for both the core and whole tumor images. As $\beta$ increases, the $F_\beta$-loss decreases. Since the loss shown in this figure equals 1 for perfect overlap, lower values of $\beta$ would be expected to yield a more precise segmentation mask. Nevertheless, the resulting binary mask is less accurate for lower values of $\beta$, showing that this is not an appropriate metric for optimizing the consensus-based model.
Table 1. Parameters obtained for different diffusion functions for the square and circle images. The loss metric used to obtain these parameters was the Surface Dice Coefficient with a tolerance equal to the length of 1 pixel.

| Image | Diffusion function | $\Delta_1$ | $\Delta_2$ | $\sigma^2$ |
|---|---|---|---|---|
| Square | $D_1(c)$ | 0.884 | 0.310 | 0.889 |
| Square | $D_2(c)$ | 0.351 | 0.054 | 0.047 |
| Square | $D_3(c)$ | 0.817 | 0.407 | 1.341 |
| Square | $D_4(c)$ | 0.442 | 0.081 | 0.624 |
| Circle | $D_1(c)$ | 0.435 | 0.341 | 1.829 |
| Circle | $D_2(c)$ | 0.013 | 0.160 | 2.717 |
| Circle | $D_3(c)$ | 0.408 | 0.268 | 2.693 |
| Circle | $D_4(c)$ | 0.154 | 0.228 | 2.572 |
Table 2. Parameters obtained for the square image by minimizing the Jaccard Index and the Volumetric and Surface Dice coefficients. For the Surface Dice Coefficient, the tolerance was set to the length of 1 pixel. The loss obtained was zero for the three cases.

| Image | Opt. Function | $\Delta_1$ | $\Delta_2$ | $\sigma^2$ |
|---|---|---|---|---|
| Square | Vol. Dice | 0.884 | 0.310 | 0.889 |
| Square | Surf. Dice | 0.884 | 0.310 | 0.889 |
| Square | JAC | 0.442 | 0.081 | 0.624 |
Table 3. Parameters obtained for the whole and core tumor using the Volumetric Dice Coefficient, Jaccard Index, and Surface Dice Coefficient. The loss reported is 1 for perfect overlap and 0 for complete deviation.

| Image | Opt. Function | $\Delta_1$ | $\Delta_2$ | $\sigma^2$ | Loss |
|---|---|---|---|---|---|
| Whole Tumor | Vol. Dice | 0.4972 | 0.0888 | 2.6867 | 0.9292 |
| Whole Tumor | JAC | 0.5075 | 0.1187 | 2.3631 | 0.8672 |
| Whole Tumor | Surf. Dice | 0.6383 | 0.0579 | 2.6504 | 0.7447 |
| Core Tumor | Vol. Dice | 0.3795 | 0.1254 | 2.1808 | 0.9360 |
| Core Tumor | JAC | 0.3823 | 0.1004 | 2.7001 | 0.8796 |
| Core Tumor | Surf. Dice | 0.6841 | 0.0760 | 1.4155 | 0.8727 |
Table 4. Parameters obtained for the $F_\beta$-measure for different values of $\beta$. The loss reported is 1 for perfect overlap and 0 for complete deviation. The numbers of false positives (FPs), false negatives (FNs), and true positives (TPs) are presented for the resulting segmentation masks for each value of $\beta$.

| Image | $\beta$ | $\Delta_1$ | $\Delta_2$ | $\sigma^2$ | FP | FN | TP | Loss |
|---|---|---|---|---|---|---|---|---|
| Whole Tumor | 0.25 | 0.6873 | 0.1707 | 2.2395 | 134 | 347 | 3170 | 0.9559 |
| Whole Tumor | 0.5 | 0.3351 | 0.1080 | 2.7051 | 134 | 350 | 3167 | 0.9470 |
| Whole Tumor | 0.75 | 0.5939 | 0.2304 | 2.6718 | 134 | 350 | 3167 | 0.9373 |
| Whole Tumor | 1.5 | 0.5316 | 0.1092 | 2.7105 | 136 | 349 | 3168 | 0.9179 |
| Whole Tumor | 5.0 | 0.5662 | 0.1225 | 2.7043 | 136 | 349 | 3168 | 0.9032 |
| Whole Tumor | 10.0 | 0.6061 | 0.2835 | 2.1243 | 136 | 349 | 3168 | 0.9013 |
| Core Tumor | 0.25 | 0.6575 | 0.2725 | 0.0257 | 9 | 206 | 849 | 0.9763 |
| Core Tumor | 0.5 | 0.3989 | 0.0637 | 1.8094 | 25 | 107 | 948 | 0.9582 |
| Core Tumor | 0.75 | 0.4073 | 0.0942 | 1.6972 | 25 | 105 | 950 | 0.9460 |
| Core Tumor | 1.5 | 0.5444 | 0.2077 | 2.3545 | 25 | 105 | 950 | 0.9220 |
| Core Tumor | 5.0 | 0.5587 | 0.1742 | 2.6864 | 25 | 105 | 950 | 0.9032 |
| Core Tumor | 10.0 | 0.6137 | 0.2425 | 1.9757 | 25 | 105 | 950 | 0.9012 |
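The score column of Table 4 can be related to the FP, FN, and TP counts with a small sketch. It assumes the standard definition of the $F_\beta$-measure, $(1+\beta^2)\,TP / ((1+\beta^2)\,TP + \beta^2\,FN + FP)$; whether the article uses exactly this form is an assumption, and the function name is hypothetical. The counts are copied from two rows of Table 4.

```python
def f_beta(tp, fp, fn, beta):
    """Standard F-beta score from confusion counts (1 means perfect overlap)."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 1.0


# (FP, FN, TP, beta) taken from Table 4.
whole_tumor_row = (134, 350, 3167, 0.5)   # tabulated score: 0.9470
core_tumor_row = (9, 206, 849, 0.25)      # tabulated score: 0.9763

for fp, fn, tp, beta in (whole_tumor_row, core_tumor_row):
    print(beta, round(f_beta(tp, fp, fn, beta), 4))
```

Under this assumed definition the two rows above evaluate to values that agree with the tabulated ones to about three decimal places. Because $\beta^2$ weights the false negatives, small values of $\beta$ make the score insensitive to missed pixels, which is consistent with the core tumor mask at $\beta = 0.25$ having far more false negatives (206) than at larger $\beta$ (105 to 107) while still scoring highest.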
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Cabini, R.F.; Tettamanti, H.; Zanella, M. Understanding the Impact of Evaluation Metrics in Kinetic Models for Consensus-Based Segmentation. Entropy 2025, 27, 149. https://doi.org/10.3390/e27020149

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.
