Blind Additive Gaussian White Noise Level Estimation from a Single Image by Employing Chi-Square Distribution

The additive Gaussian white noise (AGWN) level in real-life images is usually unknown, for which the empirical setting will make the denoising methods over-smooth fine structures or remove noise incompletely. The previous noise level estimation methods are easily lost in accurately estimating them from images with complicated structures. To cope with this issue, we propose a novel noise level estimation scheme based on Chi-square distribution, including the following key points: First, a degraded image is divided into many image patches through a sliding window. Then, flat patches are selected by using a patch selection strategy on the gradient maps of those image patches. Next, the initial noise level is calculated by employing Chi-square distribution on the selected flat patches. Finally, the stable noise level is optimized by an iterative strategy. Quantitative, with association, to qualitative results of experiments on synthetic real-life images validate that the proposed noise level estimation method is effective and even superior to the state-of-the-art methods. Extensive experiments on noise removal using BM3D further illustrate that the proposed noise level estimation method is more beneficial for achieving favorable denoising performance with detail preservation.


Introduction
In image denoising, additive Gaussian white noise level is an important parameter, but it is usually unknown in real-life images [1,2]. A denoising method with accurate noise levels may generate comfortable results with abundant richness [3]. The prior ways usually set it empirically, which may result in undesirable maps: the method with a high noise level may over-smooth rich structures, while AGWN may still exist in the denoising result with a low level [4,5].
To provide accurate noise levels, researchers have developed a number of noise level estimation methods [6,7]. Among them, the patch-based methods are widely used due to their easy implantation and low computational burden. However, noise levels may be overestimated or underestimated with the usage of inaccurate homogeneous patches, which are sensitive to the complexity of images and noise levels.
To cope with the problems, this paper presents a novel method of noise level estimation, which combines a simple and effective patch-based method with the Chi-square distribution. Figure 1 shows the flowchart of our proposed method. The input image is first divided into image patches through a sliding window. Then, we select the flat patches using a flat patch selection strategy with the texture maps extraction. Next, the initial noise level is calculated by employing Chi-square distribution on the selected flat patches. Finally, the estimation performance is boosted via an iterative strategy. The main contributions are detailed as follows:

1.
A novel patch-based noise level estimation method based on Chi-square distribution is proposed.

2.
An optimization iteration scheme is proposed to improve the accuracy and stability of the noise level estimation strategy.

3.
Quantitative results associated with the qualitative results of the experiments are used to verify the effectiveness of the proposed noise level strategy. The remainder of this paper is organized as follows. Section 2 elaborates the literature of existing noise level estimation methods. The proposed noise level estimation method is described in Section 3. Section 4 shows our experimental results, and we conclude this paper in Section 5.

Literature Review
Currently, the noise level estimation methods are roughly divided into the following three categories: • Filter-based methods: These methods extract the differential image by convolving the noisy image with a specially designed filter, and then use the filtered differential image as the noise map to estimate the noise level [8,9]. For example, Immerkaer [10] designed an image structure insensitive method to filter noisy images and estimate noise level by averaging the convolved images. It performed well on estimating noise levels, but lost in structural images [11]. To address this issue, Rank et al. [12] combined histogram statistics with a filter-based approach to generate the stable noise level. However, it produced an overestimation noise level from texture images [13].
To reduce the adverse effects caused by the image structures, Tai et al. [14,15] applied a Laplacian operator to remove strong edge pixels before filtering so as to improve the accuracy at low noise levels. • Transform-based methods: Instead of using spatial information, these methods estimate noise level by transforming an image into other spaces [16,17]. For example, Donoho [18] proposed a mean absolute deviation (MAD) method to estimate the noise level on the wavelet domain. They treated all the coefficients of the highest frequency subband as noise, and estimated the standard deviation. This method performed well on estimating high noise level, but the error is increased when noise level is low [19,20].
Recently, the models based on the singular value decomposition (SVD) were widely used in noise level estimation [21,22]. For example, Wei et al. [23] used singular value tail data in SVD to estimate the noise level, which minimizes the interference of image structures. However, image details and noise cannot be completely separated at the end of the singular value for images with rich structures, so the noise level is invariably overestimated [24]. • Patch-based methods: In these methods, a noisy image is initially decomposed into a group of patches. Then, the homogeneous patches are selected via various statis-tical techniques for noise level estimation [25][26][27]. For example, Pyatykh et al. [28] proposed a method based on principal component analysis (PCA), which viewed the smallest eigenvalue of the image patch covariance matrix as the noise level. Since the minimum eigenvalue of PCA about the noisy image patch does not always satisfy the null hypothesis, it is easy to cause instability or overestimation of the noise level estimation [29][30][31]. In order to optimize the above methods, Liu et al. [32] proposed an automatic noise level estimation method by adaptively selecting effective image patches for covariance calculation. It effectively reduces the obvious overestimation of the low noise level based on the PCA method, but there is still underestimation in the case of high noise level [33].
The above mentioned methods have achieved considerable progress in improving the accuracy of noise level estimation, but they also have their respective drawbacks [34,35]. For example, filter-based methods have a large error when the structures in an image are dense and have a highly computational complexity [36,37]. Transform-based methods always produce over-estimation results when noise levels are low [38,39]. Patch-based methods may generate underestimation results at high noise levels [17,40,41].

Image Decomposition into Patches
Assuming that an image is degraded by additive white Gaussian noise (AWGN) with unknown standard deviation σ, the model can be defined as where X is the latent clean image, N is AWGN: N ∼ N (0, σ 2 ), and Y is the noisy image. For a pure flat image, the contaminated image Y F can be expressed as where X F represents the ground truth image. As X F is purely flat, its gradient maps are zero. Then the noise level of the noisy image Y F can be easily obtained by the following formula: where P and Q are the height and width of the noisy image Y F , y(i, j) represents gray value of Y F at the point (i, j),Ȳ F is the average of Y F , Num is the number of pixels, and σ f is the estimated noise level. Unfortunately, natural images usually have rich structures, so it is inaccurate to estimate the noise level directly using Equation (3). To accurately estimate the noise level of a degraded image, we first decompose the image into many image patches through a sliding window (such as 6 × 6 in this paper). The patch y k can be defined as where S is the number of image patches. x k is the k-th clear patch from X, and each patch is defined by its center pixel. y k denotes noisy patch corrupted by Gaussian noise n k with zero-mean and noise level σ n . For the follow-up work, we used the set {Y k }| S k=1 to represent the noisy patches, which can be subdivided into contaminated detail and latent flat patches.

Flat Patches Selection
In this work, we will select latent flat patches from set {Y k }| S k=1 and define them as set {Z l }| C l=1 , where C represents the number of flat patches. Comparing the details and flat of the patches, we find that the flatness of the patches can be well expressed by the gradient features maps.
The gradient maps of y k can be calculated with the following function where H h and H v are the horizontal h and vertical v gradient matrices, respectively. G h and G v respectively represent the horizontal and vertical gradient strength maps. By substituting Equation (4) into Equation (5), it is rewritten as The gradient texture intensity ε k of the image patch y k is formulated as where K and J respectively represent the height and width of the image patch y k . In order to set a reasonable threshold to select flat patches, we conduct a more detailed study on the gradient texture intensity ε k . To simplify the process, Equation (7) can be rewritten as where x k is the latent clean patch, , n k represents the Gaussian noise of patch and B(n k ) = The gradient texture intensity component A(x k ) is determined by the patch x k and B(n k ) is generated by AGWN. For flat patches from {Z l }| C l=1 , x l is absolutely flat so that A(x) = 0. Therefore, the gradient texture intensity ε f lat of the patch can be expressed as where x f lat is the latent flat patch and n f lat is AGWN. Since B(n k ) is only affected by AGWN and patch size, we can fit on the extremely flat image with known conditions to calculate the gradient texture intensity threshold. Firstly, the image is divided into patches, and then the gradient texture intensity of each patch is counted by Equation (9). The gradient texture intensity threshold ε δ can be calculated from where B(n k (i, j)) represents the noise gradient texture intensity component of the noise patch n k (i, j). According to the statistics of a large number of noise patches, the maximum value of B(n k (i, j)) is selected as ε δ . So we have Equation (11) shows that the left side is a good approximation of the gradient texture intensity of the flat patches. In the selection of the flat patch on the natural image, we will use the parameter ε δ as the threshold. If the gradient texture intensity ε k is less than the threshold ε δ , such that the patch can be regarded as a flat patch and used for noise level estimation. For the patch z l in the selected flat patch set {Z l }| C l=1 , we have where z l is the latent flat patch of the noisy image Y, x l is the flat patch of X, and n l is AGWN.

Image Noise Level Estimation
Ideally, according to Equation (12), since z l is the flat image patch, its average is equal to the mean of the clean patch x l , and the variance is provided by the noise n l . We can draw the conclusions where µ is the average of all patches {Z l }| C l=1 , µ l (l ∈ (1, C)) is the mean of the l-th flat patch z l , and σ 2 l is the variance of the l-th flat patch z l . Equation (13) can be deduced as In the ideal case, σ 2 = σ 2 1 = σ 2 2 = · · · = σ 2 C , so Equation (14) is simplified as Assuming M l = (Z l − µ l ), M 1 , M 2 , · · · , M C are mutually independent and identically distributed random variables. Then we can obtain For all flat patches, the distribution U = ∑ C l=1 G 2 l follows the Chi-square distribution with the degree of freedom C, and can be denoted as χ 2 (C). Then, the calculation formula of the noise level is defined as follows: with the confidence level T.

Noise Level Estimation Optimization
In order to improve the accuracy of noise level estimation, we use the iterative noise level σ t as a parameter in combination with the gradient texture intensity threshold ε δ to further select the flat patches {Z l }| C l=1 . As shown in Algorithm 1, first, the initial noise level σ 0 is estimated by the method in Section 3.3. Next, the threshold value ε δ is obtained through the gradient feature and iterative noise level σ t . After, the flat patches {Z l }| C l=1 are selected according to ε δ . Then, the noise level σ (t+1) is estimated by Chi-square distribution on the flat patches {Z l }| C l=1 . This process is iterated until the estimated noise level σ (t+1) converges to a fixed point. The iterative convergence criterion is where γ = 10 −3 . σ t is the estimation of the noise level in the t-th iteration while σ t+1 is the estimation result in the (t + 1)-th iteration. If the termination condition is satisfied, the estimated final noise level σ end is output.
Algorithm 1 Noise level estimation optimization.
Step 1: Step 2: Generating the gradient texture intensity dataset {E k }| S k=1 , where ε k is calculated using Equation (7) from the patch dataset {Y k }| S k=1 ; Step 3: Estimated initial noise level σ 0 by Section 3.3; Step 4: Calculating the threshold ε δ combine with σ t using Equation (10); // The selection of flat patches // The patch y k is regarded as the flat image patch Z l ; else The patch y k is considered a detail patch and is removed; end if end for // Image noise level calculation // for t = 0 to Iter do Calculating σ (t+1) with {Z l }| C l=1 using Equation (17); Generating Φ using Equation (18); if Φ ≥ γ then σ t = σ (t+1) and back to step 4; else σ end = σ (t+1) and break; end if end for Output: Estimated noise level σ end .

Experimental Results and Discussions
In this section, extensive experiments are conducted to evaluate the performance of our proposed method. In Section 4.1, we first perform simulation experiments by superimposing different noise levels on three synthetic flat images to verify the correctness of our theory. The selection of flat patches and the stability of the iterative strategy are then examined as described in Sections 4.2 and 4.3. Finally, we compare our model with the state of the art to demonstrate its powerful performance. Furthermore, in Sections 4.5 and 4.6, we test the practical applicability of our method by combining it with the BM3D denoising method.
The complete experiments of the image noise level estimation algorithms were performed under the MATLAB 2019b software on a Windows 10 with Intel i7 eight-core, CPU 3.0 GHz, and 8 GB RAM memory. In addition, the comparison models used in our experiments, such as Olsen [8], Tai [14], Donoho [18], are generic versions without tuning parameters.

Feasibility Study on Test Flat Images
To verify the feasibility of the proposed noise level estimation theory, three different flat images shown in Figure 2 are selected to be experimented. These images are 1024 × 1024 pixels, and the gray values are respectively 64, 128, and 192. The sliding window size for the image patches is 6 × 6 and T = 1 − e −6 .
The noise level estimation curves are shown in Figure 3, from which we can see that the noise levels estimated by the proposed scheme are very close to the ground-truth noise levels, indicating that the variance distribution of flat image patches obeys the Chi-square distribution, which validates the feasibility of the proposed method.

Analysis of the Flat Patch Selection
The accuracy of the flat patches selection is key to our method. Some patches with its gradient texture strength maps and the threshold of noisy image Stable (σ = 5) are shown in Figure 4. According to Equation (11), the patches are viewed as flat patches when ε k < ε δ = 36,944, where ε δ is calculated by the extremely flat image with iterative noise level σ t+1 .
As shown in Figure 5, the visual results of four representative images are displayed. We experimented on these images at four different standard noise levels (σ = 5, 10, 15, and 20). One can observe that the image is divided into numerous patches, and the flat patches are shown in the green area of the labeled images. At different noise levels, our method can adaptively select the flat patches, which can be helpful to estimate the noise level more accurately. The method of flat patch selection will be used in Sections 4.4-4.6.

Discussion of the Iterative Model
In this section, we test the impact of the iterative model on our method. Firstly, the gradient texture intensity threshold ε δ is calculated through the iterative noise level σ t . Secondly, the iterative noise level σ t+1 is obtained according to the Chi-square distribution. Finally, the final estimated noise level σ end is output according to the termination condition Φ and Iter, where Φ is calculated by Equation (18) and Iter = 10.
As illustrated in Figure 6, they are the experimental results of our optimization method. Figure 6a-c shows the estimated noise level along with iteration on the Stable image respectively with σ = 5 and σ = 15; both of them were experimented 100 times. Accordingly, Figure 6b-d shows their average estimated noise level along with iteration. These experimental results show that the noise level with the iterative strategy in our method can be converged to a fixed point, which verifies its effectiveness.

Comparisons of Experiments on Synthetic Images
We tested the proposed noise level estimation method on six well-known images (which are from the BSD image database [42]) shown in Figure 7. The proposed method is compared with state-of-the-arts, such as Donoho [18], Immerkar [10], Olsen [8], Pyatykh [28] and Tai [14].
The comparisons of noise level estimation results are shown in Figure 8, where the noise level range from 1 to 50. We can observe that the proposed method performs better than the state of the art on noise level estimation for each image under the same noise level. In particular, the proposed method can also yield pleasing results on images with rich textures (such as the Koala image shown in Figure 8f. In addition, the Table 1 shows the average error of estimated noise levels from six images, which is calculated as E err = |E n −Ē n |. At each noise level, the first two best results are bold. The results show that our method produces smaller average errors in most cases than other advanced noise level estimation methods, demonstrating that our method has powerful capacity in accurately estimating noise level. Compared with the Tai algorithm, our method is slightly inferior when the noise level is 15, where the reason may be that some noised structural patches are mis-viewed as flat patches for noise level estimation. However, our method gains superiority when the noise level is other.      Figure 7. The bold font indicates the first two best results at each noise level.

Combined with BM3D on Synthetic Images
We also test noise level estimations with the classic block-matching and 3D collaborative filtering (BM3D) [43] method for blind AWGN noise removal. Two synthetic images with different noise levels are tested, and the visual results are compared in Figures 9 and 10. However, their differences at each level on each image are small, resulting in the subtle differences of visual results, especially in the enlarged areas. The quantitative denoising results (the peak signal-to-noise ratio (PSNR) [44] is adopted to assess their ability for noise reduction and the structural similarity index (SSIM) [45] is exploited to assess their capability for structure preservation) are respectively presented in Tables 2 and 3. We combine BM3D with true noise level to obtain the reconstructed image and use its PSNR and SSIM values as benchmarks to evaluate noise level estimation methods. It can be seen that our method tends to produces competitive results that are closer to the benchmarks than the state-of-the-art methods. Generally, combining BM3D with our method produces the best results compared to the images produced by other methods, but the performance on some images (such as Church and Koala) is a little worse than other methods. The reason is that the BM3D method over-smooths the irregular structures because the self-similarity property of these two noised images is not exhibited.  [8], (d) [10], (e) [14], (f) [18], (g) [28], and (h) ours. Table 2. Average PSNR scores and its standard deviation of the images in Figure 7; each was experimented 100 times. The bold font denotes the best results.  Table 3. Average SSIM scores and its standard deviation of the images in Figure 7; each was experimented 100 times. The bold font denotes the best results.  [8] 0.8994 ± 0.0007 0.8011 ± 0.0015 0.6819 ± 0.0020 0.8766 ± 0.0010 0.8046 ± 0.0009 0.7141 ± 0.0019 0.9109 ± 0.0007 0.8184 ± 0.0013 0.6948 ± 0.0026 S. Pyatykh [28] 0.8997 ± 0.0009 0.7985 ± 0.0019 0.6805 ± 0.0020 0.8750 ± 0.  [8], (d) [10], (e) [14], (f) [18], (g) [28], and (h) ours.

Combined with BM3D on Real-World Images
To further verify the effectiveness of our method, we conduct experiments on reallife X-ray angiograms, as shown in Figures 11 and 12. From the figures, several observations can be concluded: First, when the noise level is empirically set too small, although the blood vessels are preserved well, the noise will be removed incompletely, as shown in Figures 11c and 12c. Second, when the noise level is empirically set too large, the noise is thoroughly removed but the fine vessels are over-smoothed, as shown in Figures 11d and 12d. In contrast, the proposed method can perfectly reduce noise as well as preserving rich vessels, as shown in Figures 11b and 12b. The extensively experimental results illustrate that our method can yield satisfactory results, which are beneficial for applications.

Conclusions
Image noise levels are unknown in the real-life world, thus how to robustly and accurately estimate the blind noise level from real-world images is important for image denoising methods to remove noise. This paper proposed a novel noise level estimation method by using Chi-square distribution on image patches. The key procedures of the proposed method, including image decomposition into patches, flat patches selection, image noise level estimation based on Chi-square distribution, and robust noise level estimation by an iterative strategy, are discussed in detail. The results of the experiments on flat images, synthetic maps, and real-world images respectively validated the feasibility, the effectiveness, and the further application of the proposed method. Although the proposed method obtained feasible and competitive results, its performance may decrease when the images have rich structures. In the future work, we will further optimize the selection of flat patches to estimate noise levels more accurately.