On a Variational and Convex Model of the Blake–Zisserman Type for Segmentation of Low-Contrast and Piecewise Smooth Images

This paper proposes a new variational model for the segmentation of low-contrast and piecewise smooth images. The model is motivated by the two-stage image segmentation work of Cai–Chan–Zeng (2013) for the Mumford–Shah model. To deal with low-contrast images more effectively, especially in treating higher-order discontinuities, we follow the idea of the Blake–Zisserman model instead of the Mumford–Shah model. Two practical ideas are introduced here: first, a convex relaxation idea is used to derive an implementable formulation, and second, a game reformulation is proposed to reduce the strong dependence on coupling parameters. The proposed model is then analysed for existence of solutions and solved by an ADMM algorithm. Numerical experiments show that the new model outperforms current state-of-the-art models on some challenging, low-contrast images.


Introduction
Image segmentation is a widely studied yet still challenging subject, especially for new and emerging imaging modalities where low contrast and extremely strong noise may be present. Of course, extremely simple images with clear contrast, no noise and no blur may be segmented by simple methods, such as thresholding the image intensity values.
Real-life images inevitably contain noise and low contrast, which poses a challenge for simple algorithms. Variational segmentation models generally provide more robust solutions for complex images and can be loosely divided into two categories: edge-based and region-based models. Well-known edge-based methods include Kass et al. [1] and Caselles et al. [2]. Region-based models generally trace back to the pioneering work of Mumford and Shah (MS) [3], with simplified variants such as Chan-Vese [4,5] being the most widely used.
In the last few years, when mentioning segmentation of challenging images, we would automatically recommend machine-learning-based approaches such as U-Net [6] and ResNet [7]. However, such approaches are data-dependent, and often the networks are tailored to a specific task. Firstly, they require training data which may not be available (or reliably available) at all. Secondly, we cannot yet conduct automatic transfer learning from one subject area to another to overcome the lack of sufficient training data; e.g., an aircraft identification network cannot be adapted to the identification of livers in medical imaging. A reliable way of overcoming the lack of sufficient training data is weakly or semi-supervised learning, which uses a small set of training data (in a supervised way) and a larger set of data without annotations (in an unsupervised way) [8,9]. Here, 'unsupervised' means that a suitable segmentation model is required; developing such a model is our aim. This paper addresses the fundamental problem of how to segment low-contrast images in which the image features of interest have piecewise smooth intensities. The difficulties of these two problems, namely low contrast and piecewise smooth features, are well-known challenges. Low contrast implies that edge information obtained from image gradients alone is not sufficient to detect weak jumps. Moreover, many well-known models such as [4] and its variants assume that the input image has approximately piecewise constant intensities; piecewise smooth features imply that these models cannot segment such features, or that a feature would be split into sub-regions (or multiple phases) according to its intensity distribution, which already makes the segmentation incorrect. Many approximation models based on the MS model [3] can deal with the segmentation of piecewise smooth features, but not necessarily with images displaying low contrast.
Therefore, this paper considers the Blake-Zisserman model [10], which can improve on the MS model [3]. Like the MS model [3], which was never solved directly, the model [10] cannot be implemented directly and exactly.
The rest of the paper is organised as follows. Section 2 briefly reviews related segmentation models. Section 3 introduces our new model and a game theory reformulation to facilitate subsequent solutions. Proof of the solution existence of the game formulation is given. Section 4 presents our numerical algorithm for the game formulation, and Section 5 shows numerical experiments. Brief conclusions are drawn in Section 6.

Related Works
The above-mentioned Mumford-Shah model [3] minimises the following: given an input (possibly noisy) image f : Ω ⊂ R² → R,

min_{g,Γ} E_MS(g, Γ) = (λ/2) ∫_Ω (g − f)² dx + (μ/2) ∫_{Ω\Γ} |∇g|² dx + H¹(Γ),   (1)

where, most importantly, the segmentation is defined by the unknown boundary Γ, g : Ω → R is a piecewise smooth approximation of f, and H¹(Γ) denotes the one-dimensional Hausdorff measure (i.e., the length of the boundary). In the literature, there are many follow-up works of this paper, proposed to make revised models implementable numerically, and successful results have been obtained; see [11][12][13], among others. However, for images that have weak edges possibly buried in noise and blur, Mumford-Shah type models may fail to capture the 'discontinuities of second kind', i.e., gradient discontinuities; this failure may be called the staircasing effect for gradients. The Blake-Zisserman (BZ) type model [10], though less well known and published earlier than [3], can be very useful for a class of challenging images where MS is less effective; e.g., see [14,15]. The functional of a BZ model takes the form

min_{g,Γ,Γ_∇} E_BZ(g, Γ, Γ_∇) = (λ/2) ∫_Ω (g − f)² dx + (μ/2) ∫_{Ω\(Γ∪Γ_∇)} |∇²g|² dx + α H¹(Γ) + β H¹(Γ_∇ \ Γ),   (2)

where g, ∇g ∈ BV(Ω). Here, Γ_∇ is the discontinuity set of ∇g. As with the original formulation (1), the BZ model (2) is theoretical and not in a readily solvable form. This paper will propose an approximate and solvable model. Our work is motivated by Cai-Chan-Zeng [12], who derived a solvable and convex model for (1); we now review this model briefly. As a first step of reformulation of (1), Cai-Chan-Zeng [12] rewrite (1) in the equivalent form

min E(g₁, g₂, Σ) = (λ/2) ∫_Σ (g₁ − f)² dx + (λ/2) ∫_{Ω\Σ} (g₂ − f)² dx + (μ/2) ∫_Σ |∇g₁|² dx + (μ/2) ∫_{Ω\Σ} |∇g₂|² dx + H¹(∂Σ),   (3)

where Γ is assumed to be a Jordan curve forming the boundary ∂Σ of the closed domain Σ = Inside(Γ). Hence, g₁, g₂ are defined inside and outside of Γ, respectively. Of course, both g₁, g₂ can be smoothly extended to the entire domain Ω. A key observation in [12], motivated by [5], is that the term H¹(Γ), which is the length of Γ, may be approximated by ∫_Ω |∇g₁| dx.
Then, viewing the smooth functions g₁, g₂ as a single function g, the model of [12] is the following:

min_{g ∈ W^{1,2}(Ω)} (λ/2) ∫_Ω (g − f)² dx + (μ/2) ∫_Ω |∇g|² dx + ∫_Ω |∇g| dx.   (4)

We now propose a solvable model based on the Blake-Zisserman model (2). Assume the given image is f, and our approximation is g ∈ W^{1,2}(Ω), with ∇g ∈ (W^{1,2}(Ω))².
Motivated by the work of [12], we respectively approximate the key quantities H¹(Γ), H¹(Γ_∇ \ Γ) by ∫_Ω |∇g| dx, ∫_Ω |∇(∇g)| dx. Therefore, our initial minimisation model takes the form

min_g (λ/2) ∫_Ω (g − f)² dx + (μ/2) ∫_Ω |∇(∇g)|² dx + α ∫_Ω |∇g| dx + β ∫_Ω |∇(∇g)| dx.   (5)

While (5) is well defined in terms of solvability, to facilitate the choice of the coupling parameters, we now consider a game formulation. A game formulation encourages independent players to compete with each other. Here, each player is a sub-problem in an optimisation formulation; see [16]. Independence means that the parameters of the sub-problems do not have to rely on each other.
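As an illustration, the competing terms of such a first-plus-second-order model can be evaluated on a discrete image using simple finite differences. The following is a minimal numpy sketch, not the paper's implementation: the weights lam, alpha, beta and the replicate-boundary forward-difference scheme are illustrative assumptions.

```python
import numpy as np

def grad(u):
    """First-order forward differences with replicated (Neumann-like) boundary."""
    gx = np.diff(u, axis=1, append=u[:, -1:])
    gy = np.diff(u, axis=0, append=u[-1:, :])
    return gx, gy

def model_energy(g, f, lam=1.0, alpha=1.0, beta=1.0):
    """Discrete analogue of a first-plus-second-order energy:
    fidelity + TV of g + TV applied to the components of the gradient of g."""
    gx, gy = grad(g)
    fidelity = 0.5 * lam * np.sum((g - f) ** 2)
    tv1 = alpha * np.sum(np.sqrt(gx ** 2 + gy ** 2))
    gxx, gxy = grad(gx)
    gyx, gyy = grad(gy)
    tv2 = beta * np.sum(np.sqrt(gxx ** 2 + gxy ** 2 + gyx ** 2 + gyy ** 2))
    return fidelity + tv1 + tv2
```

Note that a linear intensity ramp has nonzero first-order TV but essentially no second-order penalty, which is why the second-order term favours piecewise smooth (rather than only piecewise constant) reconstructions.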

The New Model and Reformulation as a Nash Game
In this work, we are interested in a particular case of a two-player game formulation. Instead of optimising the single energy (5), we consider a game reformulation in which two individuals, or 'players', are involved. The first player controls the variable g, and the second will be introduced by using the idea of operator splitting [17] to reduce the high-order derivatives in (5) to first-order terms and to simplify the subsequent solution. The solution to this game is a Nash equilibrium, whose existence must be established. For important techniques and results in game theory and its connections to partial differential equations (PDEs) for other problems, the reader is directed to [18][19][20][21].
With W = W^{1,2}(Ω) × (W^{1,2}(Ω))² and a coupling parameter θ > 0, the second player controls a vector field G playing the role of ∇g, and the two energies take the form

E₁(g, G) = (λ/2) ∫_Ω (g − f)² dx + α ∫_Ω |∇g| dx + (θ/2) ∫_Ω |∇g − G|² dx,   (6)

E₂(g, G) = (μ/2) ∫_Ω |∇G|² dx + β ∫_Ω |∇G| dx + (θ/2) ∫_Ω |∇g − G|² dx.   (7)

A pair (g*, G*) ∈ W is called a Nash equilibrium for the game involving the two energies E₁(·) and E₂(·), defined on W, if E₁(g*, G*) ≤ E₁(g, G*) for all admissible g and E₂(g*, G*) ≤ E₂(g*, G) for all admissible G. One could consider optimising only the single energy E₁ + E₂; however, for the theoretical analysis, the ellipticity of the summed energy is not guaranteed because of the coupling term between g and G. Hence, the existence of minimisers is not straightforward. However, we emphasise that in the game formulation the energies E₁(·, G) and E₂(g, ·) are partially elliptic, i.e., elliptic with respect to the variables g and G, respectively. This is a very important property which eases the proof of the existence of a Nash equilibrium.
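To make the game structure concrete, a Nash equilibrium of a two-player game of this kind is often computed by an alternating best-response iteration, in which each player minimises its own energy while the other variable is held fixed. The sketch below is generic, not the specific algorithm of this paper: the two solver arguments are placeholders for the partial minimisations of E₁ and E₂.

```python
import numpy as np

def best_response_iteration(solve_E1, solve_E2, g0, G0, max_iter=200, tol=1e-8):
    """Alternating best-response iteration for a two-player game.
    solve_E1(G) returns the minimiser over g of E1(g, G);
    solve_E2(g) returns the minimiser over G of E2(g, G).
    Any fixed point of this iteration is a Nash equilibrium."""
    g, G = np.asarray(g0, float), np.asarray(G0, float)
    for _ in range(max_iter):
        g_new = solve_E1(G)
        G_new = solve_E2(g_new)
        if np.linalg.norm(g_new - g) + np.linalg.norm(G_new - G) < tol:
            return g_new, G_new
        g, G = g_new, G_new
    return g, G
```

For instance, in the toy quadratic game E₁(g, G) = (g − G/2)², E₂(g, G) = (G − g/2)², each best response is available in closed form and the iteration contracts to the unique equilibrium g* = G* = 0.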

Proposition 1. The two-player game (6) and (7) admits at least one Nash equilibrium (g*, G*) ∈ W.

Proof of Proposition 1.
Since E₁(·, G) is partially strictly convex, partially elliptic and weakly lower semi-continuous with respect to the variable g, and E₂(g, ·) is partially strictly convex, partially elliptic and weakly lower semi-continuous with respect to the variable G, the proof is a straightforward and direct application of the Nash theorem [22].

Numerical Algorithms and Implementation
In this section, we detail the numerical algorithm to solve our game model and show how we utilise the outputs to obtain a segmentation result.

Stage One: Solution of the Main Model Using ADMM
The discretised version of our two-player game model (6) and (7) is given as follows:

E₁(g, G) = (λ/2)‖g − f‖² + α‖∇g‖₁ + (θ/2)‖∇g − G‖²,
E₂(g, G) = (μ/2)‖∇G‖² + β‖∇G‖₁ + (θ/2)‖∇g − G‖²,

where the gradient ∇ is discretised using backward differences with zero Neumann boundary conditions. We aim to solve the coupled problem using the split-Bregman variant of the alternating direction method of multipliers (ADMM) [23], which is commonly used for problems containing L¹ regularisation. In order to do this, we introduce a new variable into each sub-problem: v = ∇g in E₁ and w = ∇G in E₂. Applying split-Bregman to enforce the constraints, with penalty parameters γ₁, γ₂ and Bregman variables b₁, b₂, gives us:

g^(k+1) = argmin_g (λ/2)‖g − f‖² + (θ/2)‖∇g − G^(k)‖² + (γ₁/2)‖∇g − v^(k) + b₁^(k)‖²,
v^(k+1) = argmin_v α‖v‖₁ + (γ₁/2)‖∇g^(k+1) − v + b₁^(k)‖²,
G^(k+1) = argmin_G (μ/2)‖∇G‖² + (θ/2)‖∇g^(k+1) − G‖² + (γ₂/2)‖∇G − w^(k) + b₂^(k)‖²,
w^(k+1) = argmin_w β‖w‖₁ + (γ₂/2)‖∇G^(k+1) − w + b₂^(k)‖².

We detail briefly how to solve each of the sub-problems.

g sub-problem: We aim to solve the minimisation problem for fixed v^(k), G^(k), b₁^(k), which amounts to solving the following equation for g:

(λI + (θ + γ₁)∇ᵀ∇) g = λf + ∇ᵀ(θG^(k) + γ₁(v^(k) − b₁^(k))).

This can be solved by using discrete Fourier transforms F:

g^(k+1) = F⁻¹( F(λf + ∇ᵀ(θG^(k) + γ₁(v^(k) − b₁^(k)))) / (λ + (θ + γ₁)F(∇ᵀ∇)) ),

where the division is pointwise and F(∇ᵀ∇) denotes the eigenvalues of the discrete operator ∇ᵀ∇.
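The Fourier-domain solve of a linear system of the type (λI + c∇ᵀ∇)g = rhs can be sketched in a few lines. For simplicity, this illustration assumes periodic boundary conditions, under which the DFT exactly diagonalises the 5-point Laplacian (with Neumann conditions, a discrete cosine transform would play the same role); the function name and parameters are illustrative.

```python
import numpy as np

def solve_screened_poisson_fft(rhs, lam, mu):
    """Solve (lam*I + mu*grad^T grad) g = rhs, i.e. (lam*I - mu*Laplacian) g = rhs,
    assuming periodic boundary conditions so the DFT diagonalises the Laplacian."""
    m, n = rhs.shape
    ky = 2.0 * np.pi * np.fft.fftfreq(m)
    kx = 2.0 * np.pi * np.fft.fftfreq(n)
    KX, KY = np.meshgrid(kx, ky)
    # eigenvalues of the 5-point discrete Laplacian: (2cos k - 2) per axis
    lap_eig = (2.0 * np.cos(KX) - 2.0) + (2.0 * np.cos(KY) - 2.0)
    g_hat = np.fft.fft2(rhs) / (lam - mu * lap_eig)
    return np.real(np.fft.ifft2(g_hat))
```

Since lap_eig ≤ 0 everywhere, the pointwise denominator stays bounded below by lam, so the solve is well conditioned for any lam > 0.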
v sub-problem: We aim to solve the minimisation problem for fixed g^(k+1), b₁^(k), which is solved analytically by a generalised shrinkage formula:

v^(k+1) = shrink(s^(k), α/γ₁) := (s^(k)/|s^(k)|) max(|s^(k)| − α/γ₁, 0),   (9)

where s^(k) = ∇g^(k+1) + b₁^(k). The associated Bregman update is:

b₁^(k+1) = b₁^(k) + ∇g^(k+1) − v^(k+1).

G sub-problem: We aim to solve the minimisation problem for fixed g^(k+1), w^(k), b₂^(k), whose solution is defined by the following:

(θI + (μ + γ₂)∇ᵀ∇) G = θ∇g^(k+1) + γ₂∇ᵀ(w^(k) − b₂^(k)).

To find the solution G, we apply discrete Fourier transforms F componentwise:

G^(k+1) = F⁻¹( F(θ∇g^(k+1) + γ₂∇ᵀ(w^(k) − b₂^(k))) / (θ + (μ + γ₂)F(∇ᵀ∇)) ).

w sub-problem: We aim to solve the minimisation problem for fixed G^(k+1), b₂^(k), which, similar to (9), is solved by using a shrinkage formula:

w^(k+1) = shrink(r^(k), β/γ₂),

where r^(k) = ∇G^(k+1) + b₂^(k), and the associated Bregman update is b₂^(k+1) = b₂^(k) + ∇G^(k+1) − w^(k+1).
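The generalised shrinkage (vector soft-thresholding) formula used by both the v and w sub-problems has a simple closed form; a small numpy version might look as follows. The shape convention (last axis holds the gradient components) is an assumption of this sketch.

```python
import numpy as np

def shrink(s, tau):
    """Generalised (vector) soft-shrinkage, applied pointwise:
    shrink(s, tau) = s / |s| * max(|s| - tau, 0), with the convention 0 where s = 0.
    The last axis of s holds the components of the gradient field."""
    norm = np.sqrt(np.sum(s ** 2, axis=-1, keepdims=True))
    # guard against division by zero at pixels where the field vanishes
    scale = np.maximum(norm - tau, 0.0) / np.maximum(norm, 1e-12)
    return scale * s
```

Each pixel whose gradient magnitude is below the threshold tau is set exactly to zero, which is what produces sparse gradients (piecewise smooth output) in split-Bregman iterations.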

Stage Two: Segmentation of f by Thresholding g
In order to acquire a segmentation result for f , we take the minimiser g from stage one and threshold it according to some suitably defined threshold parameter(s). As in [12], the advantage of this method is that changing the threshold value(s) does not require the re-computation of the optimisation done in stage one.
There are two strategies that can be employed to define the threshold(s). The first is to use the k-means algorithm, which is an automatic method that partitions a given input into K clusters, for K ≥ 2. The second is to define the threshold value(s) manually, which generally provides better results. As the threshold values are applied after optimisation, a wide range of values can easily be tried and the best selected. In our experiments, we use manual threshold values for two-phase segmentation, whereas for multiphase segmentation with multiple threshold values, we use k-means to simplify the process.
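The two thresholding strategies can be sketched in a few lines of numpy. Both functions below are illustrative: the deterministic k-means initialisation and the use of midpoints between cluster centres as thresholds are assumptions of this sketch, not the exact procedure of the paper.

```python
import numpy as np

def threshold_segment(g, thresholds):
    """Assign each pixel of the smooth minimiser g a phase label via sorted thresholds."""
    return np.digitize(g, np.sort(np.asarray(thresholds, float)))

def kmeans_thresholds(g, K, n_iter=100):
    """1-D k-means (Lloyd's algorithm) on the intensities of g.
    Returns the K-1 midpoints between sorted cluster centres as thresholds."""
    x = g.ravel().astype(float)
    centres = np.linspace(x.min(), x.max(), K)  # deterministic initialisation
    for _ in range(n_iter):
        labels = np.argmin(np.abs(x[:, None] - centres[None, :]), axis=1)
        for k in range(K):
            if np.any(labels == k):
                centres[k] = x[labels == k].mean()
    centres = np.sort(centres)
    return 0.5 * (centres[:-1] + centres[1:])
```

Because thresholding is applied after the optimisation, different threshold values (manual or k-means) can be tried on the same minimiser g without re-running stage one.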

Numerical Results
In this section, we display some examples of the performance of our model and compare it with a number of models, namely the CCZ model [12] and the CRCV, CNC and T-ROF models, and also with a deep learning model. We first show some visual comparisons, where noise is added to the original image, and then perform a quantitative analysis on a dataset. Note that all the models above (and ours) except for the CRCV model are capable of multiphase segmentation, whereas the CRCV model (in the Chan-Vese framework) is only capable of two-phase segmentation. For this reason, in the experiments, we only include the CRCV model in two-phase examples.

Qualitative Results
In Figure 1, we show an ultrasound image. We add additive Gaussian noise with mean 0 and standard deviation 10. We display the outputs of all the competing models with the segmentation result overlaid on the original image, and for all models but the CRCV we also show the segmentation result after thresholding (for the CRCV model, the binary output shown first is already the segmentation result). We see that our model is better at segmenting the object in the image; in particular, our segmentation effectively captures the "tail" part at the top of the object, whereas the CCZ model fails to segment it well. The CRCV and CNC models segment the tail but fail to remove the noise. We note that the T-ROF model is the best competing model but does not quite segment all of the tail.
Similarly, in Figure 2, we show another two-phase segmentation example, where we take the clean image and add Gaussian noise with mean 0 and standard deviation 25. It is clear that none of the competing models are as good as ours. Our result manages to preserve more detail in general, notably the strand at the top and the curved structure at the bottom of the image, without being susceptible to the noise. In Figures 3-5, we show some examples of multiphase segmentation on MRI images of the brain. In all cases, we add Gaussian noise with mean 0 and standard deviation 17 and use the noisy image as input for all models but the CRCV model (as this is a two-phase model only). The output is then given as input to the k-means algorithm with K = 4. We show the clustering output in the final column of the relevant figures. We see that our model is better at finding some of the finer edges; for example, the white matter segmentation from our model is in general more detailed than that of the competing models.

Quantitative Analysis
To assess our method quantitatively, we run our model on 20 images in the Digital Retinal Images for Vessel Extraction (DRIVE) dataset (https://drive.grand-challenge.org/ accessed 25 October 2021). We use the manual segmentation image as the clean image and add additive Gaussian noise with mean 0 and standard deviation 100 to obtain the input image, as shown in Figures 6-9a,b, respectively. We display the outputs of the competing models and our model, as well as those of a deep learning model (abbreviated as DL). We trained a U-Net [6] network on 15 of the images (and used the other five as a validation set), where the noisy image served as input and training used the binary cross-entropy loss against the clean image. The results are good; however, we lack a large dataset to obtain the impressive results that deep learning approaches usually provide. Figures 6-9 show four examples from the given dataset; however, we run on all 20 available images to provide quantitative analysis. We use the DICE coefficient and the JACCARD similarity coefficient as quantitative measures to evaluate the performance of segmentation results. Given a binary segmentation result Σ from a model and a ground truth segmentation GT, the DICE coefficient is given as

DICE(Σ, GT) = 2|Σ ∩ GT| / (|Σ| + |GT|),

and the JACCARD similarity coefficient is given as

JACCARD(Σ, GT) = |Σ ∩ GT| / |Σ ∪ GT|.

In Table 1, we show the mean and standard deviation of the DICE and JACCARD scores on the dataset. We see clearly from these results that our model is more effective than the Cai model. We note that the numerical values provided for the DL method are computed on all 20 images in the dataset, although the DL model was trained on 15 of them. This is somewhat of an unfair comparison; nevertheless, the numerical values for our approach are still larger than those for the DL approach. Figure 10 shows boxplots of the quantitative results on the data, for further visualisation.
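Both measures are straightforward to compute for binary masks; a small numpy version is given below for reference. They satisfy the identity JACCARD = DICE / (2 − DICE), a convenient sanity check.

```python
import numpy as np

def dice(seg, gt):
    """DICE(seg, gt) = 2|seg AND gt| / (|seg| + |gt|) for binary masks."""
    seg, gt = seg.astype(bool), gt.astype(bool)
    inter = np.logical_and(seg, gt).sum()
    return 2.0 * inter / (seg.sum() + gt.sum())

def jaccard(seg, gt):
    """JACCARD(seg, gt) = |seg AND gt| / |seg OR gt| for binary masks."""
    seg, gt = seg.astype(bool), gt.astype(bool)
    inter = np.logical_and(seg, gt).sum()
    return inter / np.logical_or(seg, gt).sum()
```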

Conclusions
In this paper, we have developed a convex relaxed game formulation of the less well-known Blake-Zisserman model in order to segment images with low contrast and strong noise. The advantages of the game formulation are that the existence of a Nash equilibrium can be proved and that there is less dependence on the coupling parameters: the parameters of each sub-problem do not rely on each other and so can be tuned appropriately and separately. The game model was implemented using a fast split-Bregman algorithm, and numerical experiments show improvements in segmentation results over competing models, especially over the well-known Mumford-Shah type methods for low-contrast images.