Level-Set-Based Kidney Segmentation from DCE-MRI Using Fuzzy Clustering with Population-Based and Subject-Specific Shape Statistics

The segmentation of dynamic contrast-enhanced magnetic resonance images (DCE-MRI) of the kidney is a fundamental step in the early and noninvasive detection of acute renal allograft rejection. In this paper, a new and accurate DCE-MRI kidney segmentation method is proposed. In this method, fuzzy c-means (FCM) clustering is embedded into a level set method, with the fuzzy memberships being iteratively updated during the level set contour evolution. Moreover, population-based shape (PB-shape) and subject-specific shape (SS-shape) statistics are both exploited. The PB-shape model is trained offline from ground-truth kidney segmentations of various subjects, whereas the SS-shape model is trained on the fly using the segmentation results that are obtained for a specific subject. The proposed method was evaluated on the real medical datasets of 45 subjects and reports a Dice similarity coefficient (DSC) of 0.953 ± 0.018, an intersection-over-union (IoU) of 0.91 ± 0.033, and 1.10 ± 1.4 in the 95-percentile of Hausdorff distance (HD95). Extensive experiments confirm the superiority of the proposed method over several state-of-the-art level set methods, with an average improvement of 0.7 in terms of HD95. It also offers an HD95 improvement of 9.5 and 3.8 over two deep neural networks based on the U-Net architecture. The accuracy improvements have been experimentally found to be more prominent on low-contrast and noisy images.


Introduction
Acute rejection is the most frequent cause of graft failure after kidney transplantation [1]. However, acute renal rejection is treatable, and early detection is critical in order to ensure graft survival. The diagnosis of renal transplant dysfunction using traditional blood and urine tests is inaccurate because the failure can be detected after losing 60% of the kidney function [1]. In this respect, the DCE-MRI technique has achieved an increasingly important role in measuring the physiological parameters of the kidney and follow-up patients. DCE-MRI data acquisition is carried out through injecting the patient with a contrast agent and, during the perfusion, the kidney images are captured quickly and repeatedly at three second intervals. The contrast agent perfusion leads to contrast variation in the acquired images. Consequently, the intensity of the images at the beginning of the sequence is low (pre-contrast interval), gradually increases until reaching its maximum (post-contrast interval), and then decreases slowly (late-contrast interval). Figure 1 shows a time sequence of DCE-MRI kidney images of one of the patients that was taken during the contrast agent perfusion. Accurate kidney segmentation from these images is an important first step for a complete noninvasive characterization of the renal status. However, segmenting shows a time sequence of DCE-MRI kidney images of one of the patients that was taken during the contrast agent perfusion. Accurate kidney segmentation from these images is an important first step for a complete noninvasive characterization of the renal status. However, segmenting the kidneys is challenging due to the motion that is made by the patient's breathing, the contrast variation, and the low spatial resolution of DCE-MRI acquisitions [1,2].

Pre-contrast
Post-contrast Late-contrast = 1 = 2 = 47 = 48 = 81 Figure 1. Contrast variation in DCE-MRI images of a patient's kidney scanned at different time instants after bolus injection. The concentration of the contrast agent in the kidney tissue is low at the beginning of the acquisition process, yielding low-intensity images (pre-contrast interval), reaches its maximum, generating high-intensity images (post-contrast interval), and then decreases slowly, resulting again in low-intensity images (late-contrast interval).
In order to overcome these problems, several researchers have proposed multiple techniques to segment the kidney from DCE-MRI images. A careful examination of the related literature reveals that level-set-based segmentation methods [3][4][5][6][7][8] have been the most popular for this purpose. In these methods, a deformable model adapts to the shape of the kidney, its evolution being constrained by the image properties and prior knowledge of the expected kidney shape. In [3], the authors developed a DCE-MRI kidney segmentation method employing prior kidney shape and gray-level distribution density in the level set speed function in order to constrain the evolution of the level set contour. However, their method had large segmentation errors on noisy and low-contrast images. Thus, in [4,5], Khalifa et al. proposed a speed function combining the intensity information, the shape prior information, and the spatial information modeled by 2nd-and 4th-order Markov Gibbs random field (MGRF) models, respectively. In order to circumvent the issue of the rather similar appearance between the kidney and the background tissues, Liu et al. [6] proposed to remove the intensity information from the speed function in [5] and to use a 5th-order MGRF to model the spatial information.
Incorporating the shape information into the level set method typically requires a separate registration step [3][4][5][6] to align an input DCE-MRI image to the shape prior model in order to compensate for the motion that is caused by the patient's breathing and movement during data acquisition. In a different manner, Hodneland et al. [7] proposed a new model that jointly combines the segmentation and registration into the level set's energy function and applied it to segment kidneys from 4D DCE-MRI images. From another perspective, the level set contour evolution is guided by deriving a partial differential equation in the direction that minimizes a predefined cost functional containing several weighting parameters that need manual tuning [3][4][5][6][7]. In contrast, Eltanboly et al. [8] proposed a level set segmentation method employing the gray-level intensity and shape information without using weighting parameters. Some work [9,10] has also been carried out in addressing the intensity inhomogeneity and the low contrast problems of DCE-MRI images that are caused during the acquisition process. Based on fractional calculus, Al-Shamasneh et al. [9] proposed a local fractional entropy model to enhance the contrast of DCE-MRI images. Later, in [10], they presented a fractional Mittag-Leffler energy function based on the Chan-Vese algorithm for segmenting the kidneys from low-contrast and degraded MR images. Contrast variation in DCE-MRI images of a patient's kidney scanned at different time instants t after bolus injection. The concentration of the contrast agent in the kidney tissue is low at the beginning of the acquisition process, yielding low-intensity images (pre-contrast interval), reaches its maximum, generating high-intensity images (post-contrast interval), and then decreases slowly, resulting again in low-intensity images (late-contrast interval).
In order to overcome these problems, several researchers have proposed multiple techniques to segment the kidney from DCE-MRI images. A careful examination of the related literature reveals that level-set-based segmentation methods [3][4][5][6][7][8] have been the most popular for this purpose. In these methods, a deformable model adapts to the shape of the kidney, its evolution being constrained by the image properties and prior knowledge of the expected kidney shape. In [3], the authors developed a DCE-MRI kidney segmentation method employing prior kidney shape and gray-level distribution density in the level set speed function in order to constrain the evolution of the level set contour. However, their method had large segmentation errors on noisy and low-contrast images. Thus, in [4,5], Khalifa et al. proposed a speed function combining the intensity information, the shape prior information, and the spatial information modeled by 2nd-and 4th-order Markov Gibbs random field (MGRF) models, respectively. In order to circumvent the issue of the rather similar appearance between the kidney and the background tissues, Liu et al. [6] proposed to remove the intensity information from the speed function in [5] and to use a 5th-order MGRF to model the spatial information.
Incorporating the shape information into the level set method typically requires a separate registration step [3][4][5][6] to align an input DCE-MRI image to the shape prior model in order to compensate for the motion that is caused by the patient's breathing and movement during data acquisition. In a different manner, Hodneland et al. [7] proposed a new model that jointly combines the segmentation and registration into the level set's energy function and applied it to segment kidneys from 4D DCE-MRI images. From another perspective, the level set contour evolution is guided by deriving a partial differential equation in the direction that minimizes a predefined cost functional containing several weighting parameters that need manual tuning [3][4][5][6][7]. In contrast, Eltanboly et al. [8] proposed a level set segmentation method employing the gray-level intensity and shape information without using weighting parameters. Some work [9,10] has also been carried out in addressing the intensity inhomogeneity and the low contrast problems of DCE-MRI images that are caused during the acquisition process. Based on fractional calculus, Al-Shamasneh et al. [9] proposed a local fractional entropy model to enhance the contrast of DCE-MRI images. Later, in [10], they presented a fractional Mittag-Leffler energy function based on the Chan-Vese algorithm for segmenting the kidneys from low-contrast and degraded MR images.
More recently, convolutional neural networks (CNNs) have been successfully used for several image segmentation tasks, including kidney segmentation. For example, Lundervold et al. [11] developed a CNN-based approach for segmenting kidneys from 3D DCE-MRI data using a transfer learning technique from a network that was trained for brain hippocampus segmentation. Haghighi et al. [12] employed two cascaded U-Net models [13] to segment kidneys from 4D DCE-MRI data. Later on, Milecki et al. [14] developed a 3D unsupervised CNN-based approach for the same reason. Bevilacqua et al. [15] presented two different CNN-based approaches for accurate kidney segmentation from MRI data. On the other hand, the authors in [16] integrated a mono-objective genetic algorithm and deep learning for an MRI kidney segmentation task. Isensee et al. [17] presented the top scoring model in the CHAOS challenge [18] for an abdominal organs segmentation task, in which they used an nnU-Net model to segment the left and right kidneys from MRI data. The CHAOS challenge dataset includes the data of 80 different subjects, including 40 CTs and 40 MRIs. Each sequence contains an average of 90 scans in CT and 36 in MRI in the DICOM format.
The research gap is as follows: The common stumbling block facing CNN methods is that they typically require annotated data of a large size in order to train the network, which is often difficult to obtain in the medical field. Thus, the aforementioned deep learning methods struggle to achieve high segmentation accuracy. On the other hand, the level-setbased kidney segmentation methods [3][4][5][6][7][8] have proved their effectiveness in achieving a superior performance with more accurate segmentation. However, unfortunately, almost all of them need accurate level set contour initialization to be performed manually by the user. Inaccurate initialization may cause a drop in the segmentation accuracy or even cause the method to fail. In order to overcome this problem, in [19] we have presented an automated DCE-MRI kidney segmentation, called FCMLS, based on FCM clustering [20] and level sets [21]. In our FCMLS method, we constrain the contour evolution by the shape prior information and the intensity information that are represented in the fuzzy memberships. In addition, in order to ensure the robustness of the FCMLS method against contour initialization, we employ smeared-out Heaviside and Dirac delta functions in the level set method. The FCMLS method has indeed demonstrated its efficiency in segmenting the kidneys from DCE-MRI images. However, it still has some limitations. First, its performance drops on low-contrast images, such as those in the pre-and late-contrast parts of the time sequence in Figure 1. Second, the FCM algorithm is used for computing the fuzzy memberships of the image pixels before the level set evolution begins. Once the level set starts evolving, the obtained memberships are not changed, and this might be not accurate enough in some cases.
In order to enhance the segmentation accuracy of FCMLS, and to improve its robustness on low-contrast images, we have developed a new kidney segmentation method, named the FML method, in [22]. In this method, we model the correlation between neighboring pixels into the level set's objective functional by a Markov random field energy term. We also embed the FCM algorithm into the level set method and iteratively update the fuzzy memberships of the image pixels during contour evolution. The experimental results have confirmed the improved accuracy and robustness of this method. However, the integration of the Markov random field model within the level set formulation has increased the computational complexity of the FML method significantly.
In this paper, we follow a different strategy in order to improve the segmentation performance of our previous method without sacrificing the computational complexity. The shape information plays a key role in kidney segmentation since human kidneys tend to have a common shape, with between-subject variations. Thus, we seek to take full advantage of this in our new level set formulation by exploiting the level set method's flexibility to accommodate the shape information about the target object that is to be segmented [23]. Inspired by [24], we employ PB-shape and SS-shape models for kidney segmentation. The PB-shape model is built offline from a range of kidney images from various subjects that are manually segmented by human experts, whereas the SS-shape model is constructed on the fly from the segmented kidneys of a specific patient.
This new methodology is able to generate high segmentation accuracy because the PB-shape model is used on images with high contrast in the post-contrast interval of the image sequence. Moreover, the SS-shape model that is generated from those accurate segmentations is employed on the more challenging, lower contrast images from pre-and late-contrast intervals of the sequence, as it more accurately reflects the kidney's shape from the same patient. Our early work on this new methodology has been drafted in [25], on which we build and develop several novel contributions in the present paper. First, we embed FCM clustering into the level set evolution. Thus, the kidney/background fuzzy memberships are computed and updated every time the level set contour evolves. Second, the representation of the shape information in [25] is based on a 1st-order shape method, which might be inaccurate when some kidney pixels are not observed at all in the images that are used to construct the shape model. In this paper, we adopt an efficient Bayesian parameter estimation method [26] in computing the PB-shape and SS-shape models, which more accurately accounts for the kidney pixels that are possibly not observed during the model building. Third, we propose an automated and time-efficient, yet effective, strategy to determine the images from the patient's sequence, to which the PB-shape model, the SS-shape model, or both of the models blended together are applied.
The proposed method is used to segment the kidneys of 45 subjects from DCE-MRI sequences, and the segmentation accuracy is assessed using the Dice similarity coefficient (DSC), the intersection-over-union (IoU), and the 95-percentile of Hausdorff distance (HD95) metrics [2,27]. Our experimental results prove that the proposed method can achieve high accuracy, even on noisy and low-contrast images, with no need for tuning the weighting parameters. The experiments also show that the segmentation accuracy is not affected by changing the position of the initial level set contour, which demonstrates the high consistency of the proposed method. We compare our method's segmentation accuracy with several state-of-the-art level set methods, as well as our own earlier methods [19,22,25]. Furthermore, we compare its performance against the base U-Net model and one of its modifications named BCDU-Net [28], which is trained for the same kidney segmentation task. The two networks are trained from scratch on our DCE-MRI data, which are augmented with the KiTS19 challenge dataset [29]. This dataset contains 300 subjects' data, where 210 out of all of the data are publicly released for training and the remaining 90 subjects are held out for testing. Each subject has a sequence of high quality CT scans, with their ground-truth labels that are manually segmented by medical students. It also includes a chart review that illustrates all of the relevant clinical information about this patient. All of the CT images and segmented annotations are provided in an anonymized NIFTI format. The comparison results confirm that the proposed method outperforms all of the other methods.
The remainder of this paper is structured as follows: Section 2 introduces the mathematical formulation of the proposed kidney segmentation method. Then, Section 3 provides the experimental results and the comparisons. Finally, a discussion and the conclusions are presented in Section 4.

Materials and Methods
In this section, we present the formulation of the proposed segmentation method in detail.

Materials
DCE-MRI data are collected from 45 subjects who underwent kidney transplantation at Mansoura University Hospital, Egypt. In order to acquire the data, a dose of 0.2 mL/kg BW of Gd-DTPA contrast agent was injected intravenously at a rate of 3-4 mL/s. Meanwhile, the kidney is scanned quickly and repeatedly, at 3 s intervals, using a 1.5T MRI scanner with a phased-array torso surface coil. The transition of the contrast agent results in a variation in the contrast of the images. Therefore, each subject has a dataset of about 80 repeated temporal frames, which are 256 × 256 pixels in size. Each image in the sequence is manually segmented by an expert radiologist at the hospital. A sample sequence of one subject is shown in Figure 1.

Problem Statement and Notations
In DCE-MRI, to evaluate the transplanted kidney function, the kidney needs to be accurately segmented from each image separately. Let I t , t = 1, . . . , N, be a time-point image captured at time t from a DCE-MRI sequence of length N. I t (x, y) is the intensity of a pixel (x, y) in the image domain Ω. The target is to label each pixel (x, y) in the image as kidney (K) or background (B).

Level-Set-Based Segmentation Model with Fuzzy Clustering and Shape Statistics
Given a DCE-MRI time-point image I t , the level set contour ∂Ω partitions the domain Ω of the image into a kidney region Ω K and background region Ω B . At any time t, ∂Ω corresponds to the level set of a higher-dimensional function φ t (x, y), i.e., The function φ is defined as the shortest Euclidean distance between every pixel (x, y) in the image and the contour. The distance is positive for the pixels inside of the contour, negative outside, and zero on the contour. The level set contour iteratively evolves in the direction minimizing the following energy function: where λ 1 and λ 2 are positive normalizing parameters that control the impact of the energy terms. E F CM (φ(x, y)) is an FCM-based energy function computed from the input image I t to attract the contour towards the position of the kidney in the image, which is defined as follows: where Hφ ε = H ε (φ(x, y)) is the smeared-out Heaviside function, which is defined as follows: where the parameter ε determines the degree of smearing. L(φ) in (1) is a length term that is responsible for keeping the level set contour φ(x, y) smooth and defined, as follows: where δφ ε = δ ε (φ(x, y)) is the Dirac delta function, which is the derivative of Hφ ε , and is given as follows: (2) represents either kidney (for L = K) or background (for L = B) energy function of the pixel (x, y) in the image and is defined as follows: where µ L (x, y) is the kidney/background fuzzy membership degrees of the pixel (x, y). P L (x, y) and S L (x, y) are prior probabilities of the pixel (x, y) derived from PBshape and SS-shape models, respectively. The weight factor ω t is used to control the contribution of both models in the segmentation operation. Information about how the value of ω t is computed for each image in the sequence is explained in Section 2.6.
According to the calculus of variations, the minimization of the function in (1), with respect to φ, is given as follows: Finally, the level set contour is iteratively evolved to the boundary of the object as follows: where integer n is a number of time steps, as follows: t = nτ for τ > 0. It is worth noting that using the smeared-out Heaviside and Dirac delta function is important in order to obtain a global minimizer for the function in (1), irrespective of the level set initialization in the image [21].

FCM Membership Function
Given an image I t , the FCM clustering algorithm divides the pixels in the image domain Ω into two separate clusters, kidney and background, as shown in Figure 2. According to this algorithm, the optimal centroid values of the clusters and the corresponding membership degrees are obtained by iteratively minimizing an objective function of the following form [30]: where C L is the centroid value of kidney (L = K) or background (L = B) clusters, and µ L (x, y) ∈ [0, 1] is the fuzzy membership degree of the pixel (x, y) in the cluster L and satisfies the condition of µ K (x, y) + µ B (x, y) = 1. || . || represents the Euclidean distance between the pixel's intensity and cluster's centroid.
Bioengineering 2022, 9, x FOR PEER REVIEW 7 of 19 Overall, the membership values of the pixels to a specific cluster depend on the distances between the intensity of the pixels and the cluster centroid. This means that the pixels are assigned high membership values (close to 1) to a certain cluster when their intensities are close to the centroid value and low membership values (close to 0) when they are far from the centroid. As illustrated in Figure 2, the higher the brightness of a pixel is in the kidney/background cluster, the higher its probability to belonging to this cluster. As shown in Figure 2, relying only on fuzzy membership is often not enough to obtain accurate kidney segmentation, especially on low-contrast images. Thus, we incorporate the shape prior information with fuzzy memberships to control the level set evolution.

Statistical Kidney Shape Model
Some earlier approaches (e.g., [6,25]) employ a 1st-order shape method in the construction of a kidney shape model. The major drawback of this method appears when a pixel is classified as kidney or background in all of the training images. In such cases, the pixel-wise probability of the observed label will be exactly 1, and the unobserved label's probability will be exactly 0, which is often unreasonable. To circumvent this issue, we In our earlier method [25], the FCM algorithm is used to compute the fuzzy memberships of the image pixels before the level set evolution begins. Once the level set starts evolving, the obtained memberships are not changed, and this might be not accurate enough in some cases. We improve this approach in the present paper. First, the kidney and background centroid values are initially defined as the mean of the pixel intensities inside and outside of the initial level set contour, respectively. Then, the kidney/background fuzzy membership degrees of each pixel (x, y) are iteratively updated during the level set evolution as follows: Similarly, the centroid values of the kidney/background clusters are computed as follows: . As such, the per-pixel fuzzy memberships and kidney/background centroids are coupled with the level set function via (10) and (11) and are updated in each evolution step.
Overall, the membership values of the pixels to a specific cluster depend on the distances between the intensity of the pixels and the cluster centroid. This means that the pixels are assigned high membership values (close to 1) to a certain cluster when their intensities are close to the centroid value and low membership values (close to 0) when they are far from the centroid. As illustrated in Figure 2, the higher the brightness of a pixel is in the kidney/background cluster, the higher its probability to belonging to this cluster. As shown in Figure 2, relying only on fuzzy membership is often not enough to obtain accurate kidney segmentation, especially on low-contrast images. Thus, we incorporate the shape prior information with fuzzy memberships to control the level set evolution.

Statistical Kidney Shape Model
Some earlier approaches (e.g., [6,25]) employ a 1st-order shape method in the construction of a kidney shape model. The major drawback of this method appears when a pixel is classified as kidney or background in all of the training images. In such cases, the pixel-wise probability of the observed label will be exactly 1, and the unobserved label's probability will be exactly 0, which is often unreasonable. To circumvent this issue, we adopt the Bayesian parameter estimation method [26] in the construction of the PB-shape and SS-shape models in our work here. For the PB-shape model, a number N of DCE-MRI kidney images are selected from varying subjects, and one among them is considered as a reference image. These images are mutually registered to the selected target image, assuming 2D affine transformation by the maximization of mutual information [31] (Figure 3). Then, the co-aligned images are manually segmented by an expert. Finally, the obtained ground-truth segmentations are used to build the shape model, as follows: For each pixel (x, y) in the co-aligned ground-truth images, when kidney and background labels are both observed, the empirical kidney/background probability of this pixel is computed as follows [26,32]: where O(x, y) denotes the number of observed labels, which in this case equals 2, because both labels are observed. N L (x, y) indicates how many times the label L is observed. β is a pseudo count added to the count of each observed label, and is the total number of possible region labels (kidney and background). On the other hand, when a kidney or background label is observed in all of the images, O(x, y) equals 1 and the probability of the observed label is computed from (12), while the probability of the unobserved label is computed as follows: both labels are observed.
( , ) indicates how many times the label is observed. is a pseudo count added to the count of each observed label, and ℓ is the total number of possible region labels (kidney and background). On the other hand, when a kidney or background label is observed in all of the images, ( , ) equals 1 and the probability of the observed label is computed from (12), while the probability of the unobserved label is computed as follows: According to the above steps, an example PB-shape model is shown in Figure 3. The same methodology is also adopted in the construction of the SS-shape model, but from a set of images selected on the fly from the specific patient's sequence being segmented. Figure 3. PB-shape model constructed using the Bayesian parameter estimation method: Some DCE-MRI kidney images before (a) and after (b) affine registration. Column (c) shows manually segmented kidneys after alignment. Column (d) shows the PB-shape model constructed before (top) and after (bottom) affine registration.

Sequence Partitioning and the Weight Factor
In order to segment the kidney of a specific patient, we partition the patient's DCE-MRI sequence into three subsets. The already-constructed PB-shape model is employed to segment the kidneys from the images in the first subset 1 . The obtained kidney segmentations are used to construct the SS-shape model, which is blended with the PB-shape model to segment the kidneys from the images in the second subset 2 . The images in the third subset 3 are segmented using only the SS-shape model.
We propose to employ an automated, fast approach for this sequence partitioning. First, all of the images in a given patient sequence are co-aligned via affine transformations to the reference image used in the PB-shape model construction. Then, for each image in the sequence, the mean of the pixel intensities in the kidney region is computed using the PB-shape model as follows: Note that this step does not require any kidney segmentation beforehand, thus can be carried out before starting our segmentation method. Figure 4 shows these mean values across the sequence in Figure 1. A number ℕ 1 of images with the highest mean values (indicated by red circles in Figure 4) is selected to constitute the subset 1 . Images of Figure 3. PB-shape model constructed using the Bayesian parameter estimation method: Some DCE-MRI kidney images before (a) and after (b) affine registration. Column (c) shows manually segmented kidneys after alignment. Column (d) shows the PB-shape model constructed before (top) and after (bottom) affine registration.
According to the above steps, an example PB-shape model is shown in Figure 3. The same methodology is also adopted in the construction of the SS-shape model, but from a set of images selected on the fly from the specific patient's sequence being segmented.

Sequence Partitioning and the Weight Factor
In order to segment the kidney of a specific patient, we partition the patient's DCE-MRI sequence into three subsets. The already-constructed PB-shape model is employed to segment the kidneys from the images in the first subset S 1 . The obtained kidney segmentations are used to construct the SS-shape model, which is blended with the PB-shape model to segment the kidneys from the images in the second subset S 2 . The images in the third subset S 3 are segmented using only the SS-shape model.
We propose to employ an automated, fast approach for this sequence partitioning. First, all of the images in a given patient sequence are co-aligned via affine transformations to the reference image used in the PB-shape model construction. Then, for each image I t in the sequence, the mean of the pixel intensities in the kidney region is computed using the PB-shape model as follows: Note that this step does not require any kidney segmentation beforehand, thus can be carried out before starting our segmentation method. Figure 4 shows these mean values across the sequence in Figure 1. A number N 1 of images with the highest mean values (indicated by red circles in Figure 4) is selected to constitute the subset S 1 . Images of length N 2 whose m t values come next (indicted by black diamonds) are selected to form the subset S 2 . Finally, the remaining N 3 images (∑ 3 i=1 N i = N) in the sequence constitute the subset S 3 . Accordingly, the weight factor ω t in (6) is computed for each image in these subsets as follows: where i is the index of the image I t in S 2 , as decreasingly ordered by its m t value. The dashed green line in Figure 4 shows the values of ω t across the subject's sequence shown in Figure 1.
Bioengineering 2022, 9, x FOR PEER REVIEW 9 of 19 length ℕ 2 whose values come next (indicted by black diamonds) are selected to form the subset 2 . Finally, the remaining ℕ 3 images (∑ ℕ 3 =1 = ℕ) in the sequence constitute the subset 3 . Accordingly, the weight factor in (6) is computed for each image in these subsets as follows: where is the index of the image in 2 , as decreasingly ordered by its t value. The dashed green line in Figure 4 shows the values of across the subject's sequence shown in Figure 1. Note that this partitioning procedure collects the high contrast images of the post-contrast interval of the MRI sequence in 1 , thus allowing the PB-shape model alone ( = 1) to accurately segment the kidneys from the 1 images. The SS-shape model is constructed from the segmented kidneys from the 1 images and used together with the PB-shape model (while is gradually decreasing) to segment the images in 2 . The SS-shape model is incrementally updated on the fly while working on the 2 images (Figure 5). As a new segmentation becomes available, it is added onto the set employed that is to update the SS-shape model. Once all of the 2 images are segmented, the SS-shape model is not updated anymore.  Note that this partitioning procedure collects the high contrast images of the postcontrast interval of the MRI sequence in S 1 , thus allowing the PB-shape model alone (ω t = 1) to accurately segment the kidneys from the S 1 images. The SS-shape model is constructed from the segmented kidneys from the S 1 images and used together with the PBshape model (while ω t is gradually decreasing) to segment the images in S 2 . The SS-shape model is incrementally updated on the fly while working on the S 2 images ( Figure 5). As a new segmentation becomes available, it is added onto the set employed that is to update the SS-shape model. Once all of the S 2 images are segmented, the SS-shape model is not updated anymore. Note that this partitioning procedure collects the high c post-contrast interval of the MRI sequence in 1 , thus allowing th ( = 1) to accurately segment the kidneys from the 1 images constructed from the segmented kidneys from the 1 images and PB-shape model (while is gradually decreasing) to segment SS-shape model is incrementally updated on the fly while working ure 5). As a new segmentation becomes available, it is added ont is to update the SS-shape model. Once all of the 2 images are s model is not updated anymore.  Note that the partitioning procedure keeps the more challenging, lower contrast images from the pre-and late-contrast intervals of the sequence in S 3 . However, the SS-shape model is solely (ω t = 0) able to precisely segment those S 3 images as it more accurately captures the kidney's shape of this specific patient.
Finally, a flowchart of the proposed kidney segmentation method is shown in Figure 6.
Bioengineering 2022, 9, x FOR PEER REVIEW 10 of 19 Note that the partitioning procedure keeps the more challenging, lower contrast images from the pre-and late-contrast intervals of the sequence in 3 . However, the SS-shape model is solely ( = 0) able to precisely segment those 3 images as it more accurately captures the kidney's shape of this specific patient.
Finally, a flowchart of the proposed kidney segmentation method is shown in Figure  6.

Results
The performance of the proposed method was evaluated on DCE-MRI datasets of 45 subjects. The segmentation accuracy was assessed using DSC (mean ± standard deviation), IoU (mean ± standard deviation), and HD95 (mean ± standard deviation) metrics [2]. The PB-shape model was trained from the 30 ground-truth images of 30 different subjects. The parameters of the proposed method were experimentally set as follows: = 1.5, λ 1 = 6, λ 2 = 6, ℕ 1 = 20, ℕ 2 = 10, and = 1. The values of all of the parameters were not changed or further tuned in all of the conducted experiments.

Method Performance with Comparisons to Other Methods
We first evaluated the performance of the proposed method on the gathered DCE-MRIs. Figure 7 depicts the segmentation process by our method for two different images. It shows the level set contour evolution during the segmentation procedure after different iterations. The figure also shows the final segmentation result. As shown in Figure 7, the proposed method can efficiently drive the contour towards the boundary of the kidneys in the images.
We then compared the segmentation performance of this new method against the following previous methods: FCMLS [19], FML [22], and PBPSFL [25]. In our experiments, we initialized-on purpose-the level set contour extremely far away from the kidney in all of the methods. We reported the performances on all of images and also on a particular set of low-contrast images (the first 5 images from each subject, totaling 225 images) in terms of DSC, IoU, and HD95 in Table 1. Table 1. Comparison between the segmentation performance of the proposed method and our previous methods.

Results
The performance of the proposed method was evaluated on DCE-MRI datasets of 45 subjects. The segmentation accuracy was assessed using DSC (mean ± standard deviation), IoU (mean ± standard deviation), and HD95 (mean ± standard deviation) metrics [2]. The PB-shape model was trained from the 30 ground-truth images of 30 different subjects. The parameters of the proposed method were experimentally set as follows: ε = 1.5, λ 1 = 6, λ 2 = 6, N 1 = 20, N 2 = 10, and β = 1. The values of all of the parameters were not changed or further tuned in all of the conducted experiments.

Method Performance with Comparisons to Other Methods
We first evaluated the performance of the proposed method on the gathered DCE-MRIs. Figure 7 depicts the segmentation process by our method for two different images. It shows the level set contour evolution during the segmentation procedure after different iterations. The figure also shows the final segmentation result. As shown in Figure 7, the proposed method can efficiently drive the contour towards the boundary of the kidneys in the images.
We then compared the segmentation performance of this new method against the following previous methods: FCMLS [19], FML [22], and PBPSFL [25]. In our experiments, we initialized-on purpose-the level set contour extremely far away from the kidney in all of the methods. We reported the performances on all of images and also on a particular set of low-contrast images (the first 5 images from each subject, totaling 225 images) in terms of DSC, IoU, and HD95 in Table 1.

Method
All Images Low-Contrast Images DSC IoU HD95 DSC IoU HD95 FCMLS [19] 0.941 ± 0.042 0.89 ± 0.056 1.87 ± 6.21 0.88 ± 0.137 0.80 ± 0.156 8.18 ± 22.8 PBPSFL [25] 0.952 ± 0.041 0.90 ± 0.043 1.11 ± 1.7 0.923 ± 0.13 0.88 ± 0.056 1.93 ± 2.32 FML [22] 0.956 ± 0.019 0.91 ± 0.035  The results in Table 1 demonstrate the improvement of the proposed method over our previous methods by achieving the highest mean DSC and IoU values and the lowest mean HD95 values, with a noticeable advantage on the low-contrast images. The lower standard deviation values of all of the evaluation metrics confirm the new method's more consistent performance compared to the other methods. Figure 8 shows a qualitative comparison between these methods on two low-contrast images from two different subjects. Clearly, the proposed method achieves notably better segmentation accuracy than the other methods.  The results in Table 1 demonstrate the improvement of the proposed method over our previous methods by achieving the highest mean DSC and IoU values and the lowest mean HD95 values, with a noticeable advantage on the low-contrast images. The lower standard deviation values of all of the evaluation metrics confirm the new method's more consistent performance compared to the other methods. Figure 8 shows a qualitative comparison between these methods on two low-contrast images from two different subjects. Clearly, the proposed method achieves notably better segmentation accuracy than the other methods.
In order to further confirm the high-performance of the proposed method over the PBPSFL method [25], the two methods are used to segment the kidneys from the images that were corrupted by additive Gaussian noise (mean 0, variance 0.01, image intensities are normalized to range [0, 1]). Figure 9 visually compares the segmentation performances of both of the methods on a number of noisy images, while quantitative comparison results are given in Table 2.
The proposed method clearly outperforms the PBPSFL method in the presence of noise. It has a higher mean DSC and IoU, and lower mean HD95 values. While the two methods share the idea of using both the PB-shape and the SS-shape models, the higher performance of the new method can be attributed to the better shape model that was constructed, as explained in Section 2.5, and to updating the FCM memberships during the level set evolution. FCMLS method [19], (c) PBPSFL method [25], (d) FML method [22], and (e) proposed method.
In order to further confirm the high-performance of the proposed method over the PBPSFL method [25], the two methods are used to segment the kidneys from the images that were corrupted by additive Gaussian noise (mean 0, variance 0.01, image intensities are normalized to range [0, 1]). Figure 9 visually compares the segmentation performances of both of the methods on a number of noisy images, while quantitative comparison results are given in Table 2.   [19], (c) PBPSFL method [25], (d) FML method [22], and (e) proposed method. FCMLS method [19], (c) PBPSFL method [25], (d) FML method [22], and (e) proposed method.
In order to further confirm the high-performance of the proposed method over the PBPSFL method [25], the two methods are used to segment the kidneys from the images that were corrupted by additive Gaussian noise (mean 0, variance 0.01, image intensities are normalized to range [0, 1]). Figure 9 visually compares the segmentation performances of both of the methods on a number of noisy images, while quantitative comparison results are given in Table 2.  The proposed method clearly outperforms the PBPSFL method in the presence of  The efficiency of the proposed method is further demonstrated by comparing its accuracy against those of a number of state-of-the-art level-set-based methods. Table 3 compares the accuracy of the results of all of the images by the proposed method and our previous FCMLS [19], PBPSFL [25], FML [22] methods, as well as by the shape-based (SB) method [33], the vector level set (VLS) [34], the 2nd-order MGRF level set (2nd-MGRF) [4], and a parametric kernel graph cut (PKGC) [35]. The DSC values of the PKGC and the 2nd-MGRF methods are reported in [5,6], using the same DCE-MRI datasets that were used in our study. As neither the output segmented kidneys that were obtained by these two methods nor the faithful implementations of the two methods are available to us, we are not able to compute/report the IoU and HD95 values of the two methods. Clearly, as shown in Table 3, the proposed method achieves the best segmentation accuracy compared with to other methods. Table 3. Comparison between the segmentation accuracy of the proposed method and the existing methods.

Ablation Experiments
We have performed an ablation study in order to assess the contribution of each component to the proposed method's performance. We then evaluated the effect of some user-supplied parameters on the obtained segmentation accuracy. In our ablation study, we compared three scenarios. First, we evaluated the performance of our level-set-based method incorporating only fuzzy memberships and the PB-shape model. The FCM algorithm was used to compute the fuzzy memberships of the input image before the level set evolution began, and the memberships were not changed afterwards. In the second scenario, the FCM algorithm was embedded into the level set method, and the fuzzy memberships were updated as the level set evolved. The third scenario represented our complete approach, integrating the SS-shape model with the PB-shape model and the embedded fuzzy memberships. In all three of the scenarios, we assessed the impact on the final segmentation accuracy. The quantitative and qualitative comparison results are reported in Table 4 and Figure 10. While the segmentation accuracy in Table 4 for all of the images improved from Scenario 1 to Scenario 2 to, eventually, Scenario 3, the impact was more prominent on lowcontrast images. Updating the fuzzy memberships during the contour evolution improved the segmentation accuracy of the low-contrast images by about 3%, 4%, and 2.5 mm in terms of the mean DSC, IoU, and HD95, respectively. Moreover, the incorporation of the SS-shape model yielded a further improvement of about 3% in DSC, 5% in IoU, and 1.6 mm in HD95. As shown in Figure 10, the proposed method in Scenario 3 could more efficiently segment and catch the boundary of the target kidneys, thus generating more accurate segmentations. Overall, the results in Table 4 and Figure 10 highlight the benefit of the proposed integration of the embedded fuzzy memberships and the SS-shape model along with the PB-shape model into our level set framework. Then, we studied the effect of some user-supplied parameters on the proposed method. We first investigated the impact of changing the values of ℕ 1 and ℕ 2 on the segmentation accuracy. Table 5 reports these results for three combinations of ℕ 1 and ℕ 2 values, demonstrating that the proposed method achieved the best segmentation accuracy when ℕ 1 = 20 and ℕ 2 = 10 as this allows more images to build the SS-shape model.  While the segmentation accuracy in Table 4 for all of the images improved from Scenario 1 to Scenario 2 to, eventually, Scenario 3, the impact was more prominent on low-contrast images. Updating the fuzzy memberships during the contour evolution improved the segmentation accuracy of the low-contrast images by about 3%, 4%, and 2.5 mm in terms of the mean DSC, IoU, and HD95, respectively. Moreover, the incorporation of the SS-shape model yielded a further improvement of about 3% in DSC, 5% in IoU, and 1.6 mm in HD95. As shown in Figure 10, the proposed method in Scenario 3 could more efficiently segment and catch the boundary of the target kidneys, thus generating more accurate segmentations. Overall, the results in Table 4 and Figure 10 highlight the benefit of the proposed integration of the embedded fuzzy memberships and the SS-shape model along with the PB-shape model into our level set framework.
Then, we studied the effect of some user-supplied parameters on the proposed method. We first investigated the impact of changing the values of N 1 and N 2 on the segmentation accuracy. Table 5 reports these results for three combinations of N 1 and N 2 values, demonstrating that the proposed method achieved the best segmentation accuracy when N 1 = 20 and N 2 = 10 as this allows more images to build the SS-shape model. Afterwards, we evaluated the proposed method's performance against different level set initializations. Figure 11 shows the accuracy of the proposed method on a sample DCE-MRI image with the level set contour that was initialized in different positions in the image. From the visual results and the reported DSC values in Figure 11, the segmentation accuracy was not changed in all cases. This confirms the method's high and consistent performance, regardless of where the level set contour is initialized in the image. Afterwards, we evaluated the proposed method's performance against different level set initializations. Figure 11 shows the accuracy of the proposed method on a sample DCE-MRI image with the level set contour that was initialized in different positions in the image. From the visual results and the reported DSC values in Figure 11, the segmentation accuracy was not changed in all cases. This confirms the method's high and consistent performance, regardless of where the level set contour is initialized in the image. DSC = 0.985 DSC = 0.985 DSC = 0.985 Figure 11. Segmentation results using the proposed method on a DCE-MRI image with different level set contour initializations (top row, red outlines). The segmented kidney (bottom row, red outlines), for any initialization, closely matches the ground-truth (green outlines), as evidenced by the associated DSC.

Comparison to U-Net-Based Deep Neural Networks
In recent years, CNN in general, and the U-Net architecture in particular [13], have been applied to various medical image segmentation problems with good results [11][12][13][14][15][16][17][18]. Therefore, we have compared the proposed method versus a base U-Net CNN and one of its variants named BCDU-Net [28]. Both of the networks were trained from scratch on data from 18 subjects, were validated against data from 12 subjects, and were tested using the remaining 15 subjects. In order to prevent the models from overfitting, the data of each subject were augmented by performing the following operations on each image in the sequence: vertical and horizontal flipping, random x − and y-translations, rotation by ±45°, ±90°, 180° angles, and noising by adding Gaussian noise with zero mean and a variance of 0.01, 0.02, and 0.05 (the image intensities were normalized to range [0,1]). The augmentation results in a total number of 16,404 images for training and 10,980 images for validation.
In order to further enlarge the data, following [18,36], we used high quality CT scans of 210 subjects from a KiTS19 dataset [29]. We manually split each image into two sub-images of size 256 × 256 for the left and right kidneys. This eventually increased the number of training and validation images to 40,050 and 10,980, respectively. The two networks were trained for 200 epochs using the Adam optimizer and an initial learning rate 0.0001, which decays by a factor of 0.1 whenever the validation loss is not reduced for 10 consecutive epochs. In order to further avoid overfitting, we used dropout regularization with a 50% ratio during the training. The training was carried out on a workstation with dual Figure 11. Segmentation results using the proposed method on a DCE-MRI image with different level set contour initializations (top row, red outlines). The segmented kidney (bottom row, red outlines), for any initialization, closely matches the ground-truth (green outlines), as evidenced by the associated DSC.

Comparison to U-Net-Based Deep Neural Networks
In recent years, CNN in general, and the U-Net architecture in particular [13], have been applied to various medical image segmentation problems with good results [11][12][13][14][15][16][17][18]. Therefore, we have compared the proposed method versus a base U-Net CNN and one of its variants named BCDU-Net [28]. Both of the networks were trained from scratch on data from 18 subjects, were validated against data from 12 subjects, and were tested using the remaining 15 subjects. In order to prevent the models from overfitting, the data of each subject were augmented by performing the following operations on each image in the sequence: vertical and horizontal flipping, random x− and y-translations, rotation by ±45 • , ±90 • , 180 • angles, and noising by adding Gaussian noise with zero mean and a variance of 0.01, 0.02, and 0.05 (the image intensities were normalized to range [0, 1]). The augmentation results in a total number of 16,404 images for training and 10,980 images for validation.
In order to further enlarge the data, following [18,36], we used high quality CT scans of 210 subjects from a KiTS19 dataset [29]. We manually split each image into two sub-images of size 256 × 256 for the left and right kidneys. This eventually increased the number of training and validation images to 40,050 and 10,980, respectively. The two networks were trained for 200 epochs using the Adam optimizer and an initial learning rate 0.0001, which decays by a factor of 0.1 whenever the validation loss is not reduced for 10 consecutive epochs. In order to further avoid overfitting, we used dropout regularization with a 50% ratio during the training. The training was carried out on a workstation with dual 2.20 GHz Intel Xeon Silver 4114 CPUs with 128 GB of RAM and two Nvidia GPUs in a Python environment, using Keras API with Tensorflow backend. The trained networks were then used to segment the kidney from the test subjects. Table 6 presents a comparison between the segmentation accuracies that were obtained by the proposed method, the U-Net model, and the BCDU-Net model with three dense blocks. Table 6. Comparison between the segmentation performance of the proposed method versus U-Net and BCDU-Net models.

Method
All Images Low-Contrast Images It can be seen from Table 6 that the BCDU-Net model had a considerably better performance than the base U-Net model, yet our method performed notably better than it. The mean DSC and IoU of the proposed method were higher than those of the U-Net and BCDU-Net models. Moreover, the mean HD95 values definitely showed the gap between the accuracy of our method and the two models. The HD95 metric characterized the divergence between the boundary surfaces of the segmentation result and the groundtruth kidney [2,27]. As such, unlike DSC, it was more sensitive to the shape deviations of the segmentation result against the ground-truth [27]. On the other hand, the standard deviations of DSC, IOU, and HD95 indicated that the proposed method was much more consistent and stable than the U-Net and BCDU-Net models. The improvement was more profound on the low-contrast images. It is important to mention that the behavior of the proposed method is easier to explain, as well as the interpretation of the results, compared to the deep U-Nets. For example, obtaining rather a noisy kidney contour from the segmentation result would suggest increasing the weighting factor λ 1 in our method, as a corrective action.

5.
The experiments that were performed on 45 subjects demonstrate the accuracy of the proposed method and its robustness against noise, low contrast, and contour initialization with no need for tuning the method's parameters. The comparisons with several state-of-the-art level set methods, and two CNN based on the U-Net architecture, confirm the superior and consistent performance of the proposed method.
Nevertheless, the proposed method has some limitations. First, incorporating the shape information into the level set method requires a prerequisite registration step in order to align an input DCE-MRI image to the shape prior model in order to compensate for the motion due to patient's breathing and movement during the data acquisition. Any errors occurring in this alignment step would affect the segmentation performance. Second, similar to all of the level-set-based methods, the level set contour evolution in our method is guided by a partial differential equation containing several weighting parameters. Moreover, we use a weight factor that controls the contribution of the two shape statistics in the segmentation procedure. All of these weighting parameters require proper setting. Third, our new method takes about 7 s to segment one image of a size of 256 × 256 pixels, which is not suitable for real-time operation yet.
Our current research is directed towards improving the proposed method and alleviating its limitations. In our experiments, the values of the weighting parameters are experimentally chosen and then fixed throughout all of the conducted experiments without further tuning. We plan to investigate other weighting strategies in order to systematically find out the proper values for these weights, similarly to the scheme that was proposed in [37]. We also plan to investigate combining kidney segmentation and registration into the level set's energy function. Simultaneously solving this issue for both of the tasks would diminish the propagation of errors from one task to the other. Last but not least, in an attempt to improve the time performance of the proposed method, we are working on converting the MATLAB code to C++ code that is optimized for GPU computing.