Communication

Fast Noise Level Estimation via the Similarity within and between Patches

1 Institute of Robotics and Intelligent Systems, School of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan 430081, China
2 School of Mechanical and Electrical Engineering, Xinxiang University, Xinxiang 453003, China
3 Institute for Infocomm Research (I2R), A*STAR, Singapore 138632, Singapore
* Author to whom correspondence should be addressed.
Electronics 2024, 13(13), 2556; https://doi.org/10.3390/electronics13132556
Submission received: 28 May 2024 / Revised: 24 June 2024 / Accepted: 25 June 2024 / Published: 28 June 2024

Abstract

Patch-based noise level estimation (NLE) is often inaccurate and inefficient because of the harsh criteria required to select a small number of homogeneous patches. In this paper, a fast image NLE method based on a global search for similar pixels is proposed to solve this problem. Specifically, the mean square distance (MSD) is first expressed in terms of the standard deviation (std) and mean of image patches, and these two values are calculated and stored in advance. A 2D statistical histogram and a summed area table are then adopted to speed up the search for similar patches, and the most similar pixels are selected from the similar patches to obtain an initial estimate. Finally, we correct the bias of the initial estimate by re-injecting noise for a second round of estimation. Experimental results show that the proposed method outperforms state-of-the-art techniques in fast NLE and guided denoising.

1. Introduction

As a low-level machine vision operation, image noise level estimation (NLE) is an essential task that is widely used in many fields, such as denoising [1,2], compression [3], segmentation [4], image quality assessment [5], optical flow [6], image deblurring [7], object recognition [8], feature extraction [9], and super-resolution [10], where algorithm parameters need to be adjusted according to the noise level. The performance of these algorithms strongly depends on the noise level, yet in practice the noise level is often unknown beforehand. Therefore, developing an efficient, fast NLE technique is an important endeavor in image processing.
The commonly used strategies in NLE are the selection of homogeneous regions and noise statistical analysis. Among the methods that select homogeneous regions [11,12,13,14,15,16,17,18,19,20,21], filtering-based methods [11,12,13,14,15] typically require a special filter to eliminate strong textures in the image, such as edges and other structural content; most of the remaining noise is pixel-independent, but because of changes in local texture, the noise level is easily overestimated, whereas if only a small number of patches are selected, the noise level is underestimated. In order to find sufficient homogeneous regions for NLE, the authors of [16,17,18] use filters to measure pixel differences between patches, but their performance is largely affected by thresholds. Our previous work [19] searched for homogeneous regions by calculating the mode of patches to obtain stable estimation results, but this method is prone to overestimation for low-noise images with complex textures. As for typical representation methods, patch-based principal component analysis (PCA) [20,21] has also achieved good results, but because the smallest eigenvalues still include some noise components, the estimated noise variance is usually smaller than the actual variance. PCA-based estimators therefore tend to underestimate the variance, especially for low-frequency images with high noise levels.
Among statistical methods, high-order moment invariance methods [22,23,24,25,26] usually require solving complex nonlinear optimization problems. In the global SVD method [22], the noise level is estimated by counting the number of small singular values, but the noise subspace is difficult to find and requires a large amount of statistical data. Other statistical strategies [23] estimate the noise level by establishing the relationship between the noise variance and image statistics. The method in [24] utilizes the symmetry of the Gaussian distribution of noise to establish a relationship between patch redundancy eigenvalues and the noise: when the mean and median of the patch eigenvalues are equal, the eigenvalue at that position is the noise level, so NLE is converted into an estimate of the dimensionality of patch redundancy. However, this method performs poorly on images with complex textures. In summary, the accuracy of these methods depends on the number of homogeneous patches or on the fine design of the model.
The key idea of the above methods is to separate out the contribution of the image structure to the NLE results. However carefully patches are selected, the smallest unit is still a patch, which retains structural information of the image. Therefore, Hou et al. [2] proposed a method that estimates the noise level based on the non-local self-similarity of pixels (NLE-NSS) and obtained better performance. However, this method is still a local strategy, i.e., it essentially searches for similar patches within a neighborhood, and averaging all the local estimates overestimates the noise level.
Therefore, a novel method to accelerate the search for similar patches and pixels globally to estimate the noise level is proposed in this paper. Specifically, the mean square distance (MSD) is first expressed in the form of both the standard deviation (std) and mean of image patches. Then, the patch features, including the std and mean, are pre-calculated and stored to search for similar patches and pixels quickly. Finally, the NLE is extended from local to global. In particular, we further correct the deviation of the estimation results. The experimental results show that the proposed method is superior to the state-of-the-art methods in NLE and guided denoising. The contributions of this paper are as follows:
  • We determine the MSD of image patches, and prove that it is more accurate than the Euclidean distance and can be expressed as the mean and std of the patches;
  • We propose a pixel-level method to select similar pixels for fast NLE based on the 2D statistical histogram and summed area table;
  • We propose to correct the initial estimation results by re-injecting noise to achieve more accurate NLE.
The rest of this paper is organized as follows. In Section 2, we propose our fast NLE method. We give the estimation and denoising results in Section 3. Finally, we conclude this paper in Section 4.

2. The Proposed Method

The additive white Gaussian noise (AWGN) model can be formulated as $y = x + n$, where $y$ is the observed noisy image, $x$ is the original clean image, and $n$ is the corrupting noise, which is independent and identically distributed (i.i.d.) Gaussian, $n \sim N(0, \sigma^2)$; $\sigma$ is the standard deviation (std) of the noise. Furthermore, $\sigma$ is the noise level parameter to be estimated, as expanded on below.
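As an illustration, the AWGN model can be simulated directly; the image size and noise level below are arbitrary choices for this sketch, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical clean image x (values are arbitrary for this sketch).
x = rng.uniform(0.0, 255.0, size=(64, 64))

sigma = 10.0                              # noise level (std) to simulate
n = rng.normal(0.0, sigma, size=x.shape)  # i.i.d. Gaussian noise n ~ N(0, sigma^2)
y = x + n                                 # observed noisy image y = x + n

# The empirical std of (y - x) approaches sigma as the image grows.
```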

2.1. Mean Square Distance

To solve the time-consuming problem caused by searching for similar patches and pixels, we express the MSD as the std and mean of the image patch.
Suppose the noisy image $y$ contains $N$ image patches $\{y_v\} \subset \mathbb{R}^{n \times n}$ $(v = 1, 2, \ldots, N)$ of size $n \times n$. Considering a reference patch $y_u$ $(u = 1, 2, \ldots, N)$, in order to find a sufficient number of similar patches, we express the MSD between $y_u$ and any candidate patch $y_v$ as follows:
$$d_{uv} = \|y_u - y_v\|^2 = E\big[(y_u - y_v)^2\big] = E\Big[\big((y_u - E[y_u]) - (y_v - E[y_v]) + (E[y_u] - E[y_v])\big)^2\Big] = (\mu_u - \mu_v)^2 + \sigma_u^2 + \sigma_v^2 - 2\,\mathrm{cov}(y_u, y_v), \quad (1)$$
where $\mu_u$ and $\sigma_u^2$ are the mean and variance of $y_u$ (and similarly $\mu_v$, $\sigma_v^2$ for $y_v$), and $\mathrm{cov}(y_u, y_v)$ is the covariance of $y_u$ and $y_v$. Note the following statistical relationship:
$$\mathrm{cov}(y_u, y_v) = E\big[(y_u - \mu_u)(y_v - \mu_v)\big] = \mu_{uv} - 2\mu_u\mu_v + \mu_u\mu_v = \mu_{uv} - \mu_u\mu_v, \quad (2)$$
where $\mu_{uv}$ is the mean of the product of $y_u$ and $y_v$. Substituting Equation (2) and the correlation coefficient $\eta_{uv} = \mathrm{cov}(y_u, y_v)/(\sigma_u\sigma_v)$ into Equation (1), we derive the following:
$$d_{uv} = (\mu_u - \mu_v)^2 + (\sigma_u - \eta_{uv}\sigma_v)^2 + (1 - \eta_{uv}^2)\,\sigma_v^2. \quad (3)$$
The patches most similar to the reference patch $y_u$ are those with $\mu_u = \mu_v$ and $\sigma_u = \sigma_v$. Therefore, similar patches can be found entirely from their statistical features, i.e., their mean and std. Unlike $\mu_{uv}$, the quantities $\mu_v$ and $\sigma_v$ can be computed and stored independently; however, $\mu_{uv}$ is also needed as an indicator of patch similarity. To express $\mu_{uv}$ in terms of $\mu_u$, $\mu_v$, $\sigma_u$, and $\sigma_v$, we give the following theorem, Theorem 1, by considering the quasi-Gaussian property [27,28,29] of noisy patches:
Theorem 1. 
Assuming that the noisy image patches $y_u$ and $y_v$ can be fitted via Gaussian distributions and approximately obey $y_u \sim N_u(\mu_u, \sigma_u^2)$ and $y_v \sim N_v(\mu_v, \sigma_v^2)$, respectively, the product $y_u y_v$ obeys the (scaled) Gaussian distribution $y_u y_v \sim N_{uv}(\mu_{uv}, \sigma_{uv}^2)$, where
$$\mu_{uv} = \frac{\mu_u \sigma_v^2 + \mu_v \sigma_u^2}{\sigma_u^2 + \sigma_v^2}, \qquad \sigma_{uv}^2 = \frac{\sigma_u^2 \sigma_v^2}{\sigma_u^2 + \sigma_v^2}. \quad (4)$$
The detailed proof of Theorem 1 is as follows:
Proof of Theorem 1. 
The key to Theorem 1 is that the product of two Gaussian functions is still a Gaussian function. According to the assumption, the probability density functions (PDFs) of $y_u$ and $y_v$ are as follows:
$$f_u(y) = \frac{1}{\sqrt{2\pi}\,\sigma_u} \exp\!\left(-\frac{(y-\mu_u)^2}{2\sigma_u^2}\right), \qquad f_v(y) = \frac{1}{\sqrt{2\pi}\,\sigma_v} \exp\!\left(-\frac{(y-\mu_v)^2}{2\sigma_v^2}\right).$$
Their product is as follows:
$$h(y) = f_u(y)\,f_v(y) = \frac{1}{2\pi \sigma_u \sigma_v} \exp\!\left[-\left(\frac{(y-\mu_u)^2}{2\sigma_u^2} + \frac{(y-\mu_v)^2}{2\sigma_v^2}\right)\right].$$
Consider the exponent and expand it into a quadratic function of $y$:
$$\beta = \frac{(y-\mu_u)^2}{2\sigma_u^2} + \frac{(y-\mu_v)^2}{2\sigma_v^2} = \frac{(\sigma_u^2+\sigma_v^2)\,y^2 - 2(\mu_u\sigma_v^2+\mu_v\sigma_u^2)\,y + \mu_u^2\sigma_v^2 + \mu_v^2\sigma_u^2}{2\sigma_u^2\sigma_v^2}.$$
Compare $\beta$ with the ordinary Gaussian form:
$$P(y) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(y-\mu)^2}{2\sigma^2}\right) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{y^2 - 2\mu y + \mu^2}{2\sigma^2}\right).$$
Completing the square puts $\beta$ into this ordinary Gaussian form:
$$\beta = \frac{y^2 - 2y\,\dfrac{\mu_u\sigma_v^2+\mu_v\sigma_u^2}{\sigma_u^2+\sigma_v^2} + \left(\dfrac{\mu_u\sigma_v^2+\mu_v\sigma_u^2}{\sigma_u^2+\sigma_v^2}\right)^2}{2\,\dfrac{\sigma_u^2\sigma_v^2}{\sigma_u^2+\sigma_v^2}} + \frac{\mu_u^2\sigma_v^2+\mu_v^2\sigma_u^2 - \dfrac{(\mu_u\sigma_v^2+\mu_v\sigma_u^2)^2}{\sigma_u^2+\sigma_v^2}}{2\sigma_u^2\sigma_v^2} = \frac{\left(y - \dfrac{\mu_u\sigma_v^2+\mu_v\sigma_u^2}{\sigma_u^2+\sigma_v^2}\right)^2}{2\,\dfrac{\sigma_u^2\sigma_v^2}{\sigma_u^2+\sigma_v^2}} + \frac{(\mu_u-\mu_v)^2}{2(\sigma_u^2+\sigma_v^2)}.$$
Letting $\sigma_{uv}^2 = \dfrac{\sigma_u^2\sigma_v^2}{\sigma_u^2+\sigma_v^2}$ and $\mu_{uv} = \dfrac{\mu_u\sigma_v^2+\mu_v\sigma_u^2}{\sigma_u^2+\sigma_v^2}$, we have
$$\beta = \frac{(y-\mu_{uv})^2}{2\sigma_{uv}^2} + \frac{(\mu_u-\mu_v)^2}{2(\sigma_u^2+\sigma_v^2)}.$$
Substituting this into $h(y)$ gives the following:
$$h(y) = \frac{1}{\sqrt{2\pi}\,\sigma_{uv}} \exp\!\left(-\frac{(y-\mu_{uv})^2}{2\sigma_{uv}^2}\right) \cdot \frac{1}{\sqrt{2\pi(\sigma_u^2+\sigma_v^2)}} \exp\!\left(-\frac{(\mu_u-\mu_v)^2}{2(\sigma_u^2+\sigma_v^2)}\right).$$
Therefore, when normalization can be ignored, this suffices to prove that $h(y)$ is a Gaussian function,
$$h(y) = \frac{G_{uv}}{\sqrt{2\pi}\,\sigma_{uv}} \exp\!\left(-\frac{(y-\mu_{uv})^2}{2\sigma_{uv}^2}\right),$$
where the scaling factor $G_{uv}$ is itself a Gaussian PDF in $\mu_u$ with mean $\mu_v$ and standard deviation $\sqrt{\sigma_u^2+\sigma_v^2}$:
$$G_{uv} = \frac{1}{\sqrt{2\pi(\sigma_u^2+\sigma_v^2)}} \exp\!\left(-\frac{(\mu_u-\mu_v)^2}{2(\sigma_u^2+\sigma_v^2)}\right).$$
In other words, the distribution function of the product is a compressed or amplified Gaussian distribution: $G_{uv}$ is the scaling factor, and the integral of the product density is not equal to 1, but its mean and variance remain as stated. □
According to Theorem 1, $\mu_{uv}$ can be expressed in terms of the means $\mu_u$, $\mu_v$ and standard deviations $\sigma_u$, $\sigma_v$ of $y_u$ and $y_v$. It should be noted that Theorem 1 requires both $y_u$ and $y_v$ to approximately obey Gaussian distributions, which does not necessarily hold for non-homogeneous noisy image patches. Therefore, Theorem 1 is used here only to replace $\mu_{uv}$ approximately. With this approximation, Equation (3) is transformed into the following:
$$d_{uv} = \mu_u^2 + \mu_v^2 + \sigma_u^2 + \sigma_v^2 - 2\,\frac{\mu_u\sigma_v^2 + \mu_v\sigma_u^2}{\sigma_u^2 + \sigma_v^2}. \quad (5)$$
Now, the similarity of two patches can be calculated using the new distance, i.e., the MSD, which depends only on the patches' statistical characteristics, namely their mean and standard deviation. The remaining problem of finding similar patches is thus reduced to designing an efficient search strategy.
From Figure 1, it can be seen that the actual MSD is at least two orders of magnitude smaller than the Euclidean distance. After normalization (i.e., dividing all data by the maximum value), the two trends are basically the same, but the Euclidean distance is more sensitive to the data, especially to noise: enough similar patches can be selected only by increasing the distance threshold, and increasing the threshold strongly affects which regions are selected. The MSD is more robust in selecting similar patches. Therefore, we can search for similar means and stds to obtain similar patches. In the following subsection, a 2D histogram and a summed area table are introduced for the fast searching of similar patches.
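The statistical form of the MSD above can be sketched directly in code; the function below is an illustrative Python version, not the authors' implementation, with $\mu_{uv}$ replaced by the product-Gaussian mean of Theorem 1.

```python
import numpy as np

def msd_from_stats(mu_u, sigma_u, mu_v, sigma_v):
    """Approximate MSD between two noisy patches computed only from
    their means and stds, with mu_uv taken from Theorem 1."""
    mu_uv = (mu_u * sigma_v**2 + mu_v * sigma_u**2) / (sigma_u**2 + sigma_v**2)
    return mu_u**2 + mu_v**2 + sigma_u**2 + sigma_v**2 - 2.0 * mu_uv

# Patches with identical statistics (mu = 0, sigma = 1) give d = 2.0,
# and the distance grows as the means drift apart.
print(msd_from_stats(0.0, 1.0, 0.0, 1.0))  # -> 2.0
```

Note that only the two per-patch statistics enter the computation, which is what makes pre-computing and binning them worthwhile.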

2.2. Image Patch Feature Statistics

2.2.1. 2D Statistical Histogram to Represent Statistical Features

For the image patch set $\{y_v\} \subset \mathbb{R}^{n \times n}$ $(v = 1, 2, \ldots, N)$, we can obtain the mean $\mu_v$ and std $\sigma_v$ of each patch $y_v$. Let $N_\mu$ and $N_\sigma$ denote the numbers of bins for $\mu_v$ and $\sigma_v$, respectively; the variables $\mu_v$ and $\sigma_v$ can then be correlated using the 2D statistical histogram $H(\mu, \sigma) \in \mathbb{R}^{N_\mu \times N_\sigma}$ shown in Figure 2b. For example, $H(\mu_1, \sigma_2)$ indicates the number of patches that satisfy $\mu = \mu_1$ and $\sigma = \sigma_2$ simultaneously. The 2D statistical histogram directly reveals the distribution of patch features.
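Building this histogram can be sketched as follows; the bin counts (64 × 64) and the direct sliding-window extraction are assumptions of this sketch (in practice, box filters give the per-patch mean and std faster).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def patch_feature_histogram(y, n=7, n_mu=64, n_sigma=64):
    """Compute the mean and std of every n x n sliding patch of y and
    bin them into the 2D statistical histogram H(mu, sigma)."""
    patches = sliding_window_view(y, (n, n)).reshape(-1, n * n)
    mu = patches.mean(axis=1)
    sigma = patches.std(axis=1)
    H, mu_edges, sigma_edges = np.histogram2d(mu, sigma, bins=(n_mu, n_sigma))
    return H, mu_edges, sigma_edges
```

Every patch falls into exactly one (mean, std) bin, so the histogram entries sum to the number of patches.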

2.2.2. Summed Area Table for Fast Calculation of the Number of Similar Patches

For a reference patch $y_u$, the number of similar patches can be found directly from the 2D statistical histogram. Homogeneous regions, such as the “sky” region in Figure 2a, contain a large number of similar patches, whereas regions of special texture contain very few. If few similar patches are found, iteratively searching the whole feature domain for sufficient similar patches is time-consuming. To solve this problem, the summed area table [25] is employed to improve the search speed.
A summed area table, denoted $SAT(x, y)$, stores the sum of the values located in the upper-left rectangular region of location $H[x, y]$, and is defined as follows [25]:
$$SAT(x, y) = \sum_{x' \le x,\; y' \le y} H[x', y'], \quad (6)$$
where $x$ and $y$ represent the coordinates in the 2D statistical histogram, and $x'$ and $y'$ represent coordinates no larger than $x$ and $y$, respectively. As shown in Figure 3, the sum over a rectangular region can then be obtained easily via
$$\sum_{L_1^x \le x \le L_4^x}\; \sum_{L_1^y \le y \le L_4^y} H[x, y] = L_4 + L_1 - L_2 - L_3, \quad (7)$$
where $L_1$, $L_2$, $L_3$, and $L_4$ are the $SAT$ values at the four corners of the rectangular region (for simplicity, the coordinate variables $x$ and $y$ are omitted). The 2D histogram counts the mean, std, and coordinates of all patches in the image. Assume that the current patch's mean and std are $(\mu_1, \sigma_2)$, i.e., $(x, y)$ in Equation (6). If the patch lies in a smooth region, there are many similar patches, and a sufficient number of them can be found at and near $(\mu_1, \sigma_2)$ in the 2D histogram; in this case, the difference in time consumption between a direct search and using $SAT$ is small. If the patch lies in a region of special texture, such as corners or other regions with fast texture changes, there are fewer similar patches, and we need to count patches over $(\mu_1, \sigma_2)$ and larger areas of the 2D histogram. Since $SAT$ stores the cumulative patch counts in advance, by setting the search range $L_1$, $L_2$, $L_3$, $L_4$ around $(\mu_1, \sigma_2)$, $SAT$ can deliver the required count without an iterative search. As seen from Equation (7), counting similar patches with the summed area table involves only three addition and subtraction operations.
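The table construction and the four-corner lookup of Equations (6) and (7) can be sketched as follows; the integer bin coordinates and inclusive ranges are conventions of this sketch.

```python
import numpy as np

def summed_area_table(H):
    """SAT[x, y] = sum of H over the upper-left rectangle, Equation (6)."""
    return H.cumsum(axis=0).cumsum(axis=1)

def count_in_range(sat, x1, y1, x2, y2):
    """Patch count for feature bins [x1, x2] x [y1, y2] (inclusive) via
    the four-corner identity of Equation (7): L4 + L1 - L2 - L3."""
    total = sat[x2, y2]
    if x1 > 0:
        total -= sat[x1 - 1, y2]
    if y1 > 0:
        total -= sat[x2, y1 - 1]
    if x1 > 0 and y1 > 0:
        total += sat[x1 - 1, y1 - 1]
    return total
```

After the one-time cumulative sums, any rectangle query costs a constant three additions/subtractions, which is what makes enlarging the search range around $(\mu_1, \sigma_2)$ cheap.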

2.3. Similar Patch and Pixel Search

We only need a mean filter and an std filter to quickly obtain the mean and std of all patches in noisy images. The 2D statistical histogram is used to count these mean and std values, the summed area table is used to quickly search for enough similar patches, and the MSD is used to find the most similar patches. The search strategies for similar patches and pixels in this paper are as follows.
Horizontal Search for Similar Patches. Suppose we want to find the $m$ patches $Y_u = [y_{v,1}, y_{v,2}, \ldots, y_{v,m}]$ (including $y_u$ itself, i.e., $y_u = y_{v,1}$) that are most similar to $y_u$ in the patch set $\{y_v\}$ $(v = 1, 2, \ldots, N)$. We give priority to the case where the means are equal, i.e., $\mu_u = \mu_v$, and then gradually increase and decrease the std to search for similar patches. If the number of patches is insufficient, we simply enlarge the search ranges of $\mu_u$ and $\sigma_u$, i.e., $L_1$, $L_2$, $L_3$, and $L_4$, to quickly find enough similar patches. The summed area table is used to count the similar patches and speed up the search, i.e., $m = SAT(x, y)$. It should be noted that this search strategy is global over the original image in the feature domain of the mean and std.
The formation of $Y_u$ is based on patch-level similarity, and we call this the horizontal search for similar patches. For two similar patches, pixels in the same row are, as a matter of priority, similar to some extent. However, for rare image textures and details, some pixels change greatly with changes in shape, and the similarity of such pixels at the same position is no longer reasonable. To solve this problem, we stretch the patches in $Y_u = [y_{v,1}, y_{v,2}, \ldots, y_{v,m}]$ column-wise into vectors to form $I_u = [I_{v,1}, I_{v,2}, \ldots, I_{v,m}]$ and carefully select the most similar pixels from $I_u$.
Vertical Search for Similar Pixels. For each row of $I_u$, there are $m$ pixels at the same location of different patches that are similar to each other. To refine the similarity of pixels, consider the $i$-th row of $I_u$, denoted $I_v^i$, and calculate the distance $d_v^{ij} = \|I_v^i - I_v^j\|^2$ between the $i$-th row $I_v^i$ and the $j$-th row $I_v^j$. Then, select the $q$ rows $\{I_v^{i_1}, I_v^{i_2}, \ldots, I_v^{i_q}\}$ that are most similar to $I_v^i$ (including $I_v^i$ itself), and aggregate these rows into the matrix $I_u^{iq} \in \mathbb{R}^{q \times m}$:
$$I_u^{iq} = \begin{bmatrix} I_v^{i_1,1} & \cdots & I_v^{i_1,m} \\ \vdots & \ddots & \vdots \\ I_v^{i_q,1} & \cdots & I_v^{i_q,m} \end{bmatrix}, \quad (8)$$
where $\{i_1, i_2, \ldots, i_q\} \subset \{1, 2, \ldots, n\}$. For each row $I_v^i$ of $I_u$, the most similar pixel matrix $I_u^{iq}$ of $q$ rows can be found. Traversing all rows of $I_u$ yields the set $\{I_u^{iq}\}$ $(i = 1, 2, \ldots, n;\ u = 1, 2, \ldots, N)$ of similar pixels across different regions of the entire image; we can then estimate the noise level from each $I_u^{iq}$, because each $I_u^{iq}$ is composed of the most similar rows.
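The vertical search can be sketched as follows; treating $I_u$ as an array whose rows each hold $m$ similar pixels is an assumption of this illustrative sketch, not the authors' code.

```python
import numpy as np

def vertical_search(I_u, i, q):
    """Return the q rows of I_u most similar to row i (row i included),
    i.e., the matrix of similar pixels, plus the squared distances
    d between row i and each selected row."""
    d = ((I_u - I_u[i]) ** 2).sum(axis=1)  # squared distance to every row j
    order = np.argsort(d)[:q]              # row i itself has distance 0
    return I_u[order], d[order]
```

Stacking the selected rows for every $i$ and every reference patch $u$ yields the set of similar-pixel matrices used for estimation.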

2.4. Noise Level Estimation

Since the pixels in the selected $q$ rows of $I_u^{iq}$ are very similar to each other, the authors of [2] explained that such a similar pixel search scheme can convert non-Gaussian real noise into quasi-Gaussian noise; therefore, the std among them can be viewed as the noise level. For simplicity, we assume that the noise follows a Gaussian distribution with std $\sigma$. Let the distances between the $i$-th row of $I_v$ and its $q$ most similar rows be $d_v^{i i_1}, d_v^{i i_2}, \ldots, d_v^{i i_q}$ ($i_1 = i$); the local noise level (LNL) $\sigma_u$ can then be computed as follows:
$$\sigma_u = \sqrt{\frac{1}{n(q-1)} \sum_{t=2}^{q} \sum_{i=1}^{n} \frac{1}{m} \left(d_v^{i i_t}\right)^2}. \quad (9)$$
In NLE-NSS [2], the neighborhood-based strategy for finding similar patches easily selects less similar patches when similar textures are widely scattered, which leads to an overestimation of the local noise level. In contrast, our feature-domain search for similar patches is global and extends down to the pixel domain; thus, the proposed method is more effective than the method in [2].
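As a sketch of the local noise level computation, the squared row distances can be pooled as in the formula above; the array layout (one row of squared distances per image row) is our reading of the formula, not the authors' code.

```python
import numpy as np

def local_noise_level(sq_dists, m):
    """Pool the squared distances between each row and its q-1 most
    similar rows into one local std estimate. sq_dists has shape
    (n, q - 1); m is the number of pixels per row."""
    n, q_minus_1 = sq_dists.shape
    return float(np.sqrt(sq_dists.sum() / (m * n * q_minus_1)))

# If every squared row distance equals m * 2.0, the estimate is sqrt(2).
```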
On the other hand, in order to make the proposed method more robust for NLE without overestimation, we extend the NLE from a local region to a global one. To achieve this, the initial global noise level (GNL) is obtained by averaging the local estimates $\sigma_u$ as follows:
$$\sigma_g = \frac{1}{N} \sum_{u=1}^{N} \sigma_u. \quad (10)$$
In addition, the image content might magnify or attenuate the noise effects to some extent. Although our method obtains a good estimation result, the preliminary experimental results in Figure 4 show that our estimation has a bias, and the relationship between our estimate and the true noise level is approximately linear. To further reduce estimation bias, we suggest a rectification procedure based on noise injection.
Specifically, we formulate the relationship between the initial estimation, σ g , and the underlying true noise level, σ , using the following linear model:
$$\sigma_g = \alpha \sigma. \quad (11)$$
Clearly, there are two unknown variables, $\alpha$ and $\sigma$, in (11). Uncovering these two unknowns from a single equation is an ill-posed problem. To tackle this challenge, we employ a noise injection strategy to generate an additional equation. We inject the same type of i.i.d. noise with variance $\sigma_t^2$ into the noisy image and carry out an additional round of noise estimation. Due to the independence between the original noise and the injected noise, we have
$$\sigma_{g,1}^2 = \alpha^2 \sigma^2, \qquad \sigma_{g,2}^2 = \alpha^2 (\sigma^2 + \sigma_t^2), \quad (12)$$
where $\sigma_{g,1}^2$ and $\sigma_{g,2}^2$ denote the estimated noise variances before and after injection, respectively. Here, the value of $\alpha$ is assumed to be invariant across the two rounds of NLE. This assumption is reasonable because $\alpha$ depends only on the original image content; as shown in Figure 4, even when the noise level increases after the second injection of noise, the estimation bias still satisfies the linear relationship in Figure 4a as long as the original image remains unchanged. Therefore, we keep $\alpha$ fixed. From (12), it is easy to solve for $\sigma^2$ as follows:
$$\sigma^2 = \frac{\sigma_{g,1}^2\, \sigma_t^2}{\sigma_{g,2}^2 - \sigma_{g,1}^2}. \quad (13)$$
Conceptually, the variance of the injected noise can be set arbitrarily. However, too heavy an injection tends to wipe out the effects of the image content in practice. One reasonable choice is to set the injected noise variance equal to the first estimate, i.e., $\sigma_t^2 = \sigma_{g,1}^2$. Note that the noise level becomes larger after noise injection, which implies $\sigma_{g,2}^2 > \sigma_{g,1}^2$. Then, the underlying true noise variance can be estimated via the following:
$$\hat{\sigma}_g^2 = \frac{\sigma_{g,1}^4}{\sigma_{g,2}^2 - \sigma_{g,1}^2}. \quad (14)$$
However, due to the simplicity of this linear model, the rectified noise level may still deviate slightly from the true noise level. To obtain a more robust estimate, we fuse the estimates before and after noise injection through a linear convex combination,
$$\hat{\sigma}^2 = \beta_0 \hat{\sigma}_g^2 + \beta_1 \sigma_g^2, \quad (15)$$
where $\hat{\sigma}$ denotes the final estimate, and $\beta_0$ and $\beta_1$ are weighting factors satisfying $\beta_0 + \beta_1 = 1$. In our proposed scheme, $\beta_0 = 0.613$ and $\beta_1 = 0.387$, obtained from an off-line training procedure on over 100 noisy images from the BSDS500 [30] benchmark dataset.
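The rectification and fusion steps above, Equations (14) and (15), can be sketched as follows; this is an illustrative Python version with the trained weights from the text, not the authors' program.

```python
import numpy as np

def rectified_estimate(sigma_g1, sigma_g2, beta0=0.613, beta1=0.387):
    """Noise-injection correction with sigma_t^2 = sigma_g1^2
    (Equation (14)), followed by the convex fusion of the two rounds
    of estimation (Equation (15))."""
    hat_var = sigma_g1**4 / (sigma_g2**2 - sigma_g1**2)  # Equation (14)
    fused_var = beta0 * hat_var + beta1 * sigma_g1**2    # Equation (15)
    return float(np.sqrt(fused_var))

# If the linear model holds exactly with alpha = 1 and sigma = 10,
# injecting sigma_t^2 = 100 gives sigma_g2^2 = 200, and the rectified
# estimate recovers sigma = 10.
print(rectified_estimate(10.0, np.sqrt(200.0)))  # -> 10.0
```

When the estimator is unbiased ($\alpha = 1$), both terms of the fusion agree, so the correction leaves an already-accurate estimate unchanged.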

2.5. Algorithm and Complexity Analysis

The proposed method is summarized as Algorithm 1:
Algorithm 1 Estimating the Noise Level
Inputs: observed image $y \in \mathbb{R}^{h \times w}$; parameters $n$, $m$, $q$
1: Apply the mean filter $MEAN(y, [n, n])$ and the std filter $STD(y, [n, n])$ to calculate the mean $\mu$ and std $\sigma$ of each patch
2: Accumulate the statistics $\mu$ and $\sigma$ into the 2D histogram $H(\mu, \sigma)$
3: for $u = 1 : N$
  Search for the $m$ most similar patches using the summed area table $SAT(x, y)$ on $H(\mu, \sigma)$ and stretch them into $I_u$
  Calculate the $q$ most similar rows of each row to obtain $\{I_u^{iq}\}$
  Calculate the local noise level $\sigma_u$ from $\{I_u^{iq}\}$
  end for
4: Obtain the initial estimate $\sigma_g$ from all the local noise levels $\sigma_u$
5: return the final estimate $\hat{\sigma}$, obtained via Equations (14) and (15) with the secondary estimation correction
We analyze the complexity of the proposed algorithm step by step as follows:
(1) The complexity of calculating the patch means $\mu_v$ and stds $\sigma_v$ is $O(N)$, and establishing the 2D statistical histogram $H(\mu, \sigma)$ is $O(N)$;
(2) The complexity of searching for similar patches is $O(rN)$, where $r$ is the number of times the mean and std ranges are gradually increased or decreased;
(3) The complexity of searching for similar pixels is $O(n^2 r N)$;
(4) The complexity of computing $\sigma_u$ is $O(N)$; the complexity of computing $\sigma_g$ and $\hat{\sigma}$, as well as each lookup of the summed area table $SAT(x, y)$, is $O(1)$.
In summary, NLE is performed twice, so the overall complexity of the proposed method is $O(2n^2 r N)$. In the most relevant method [2], each calculation of the local noise level requires searching for similar patches in a neighborhood of size $W \times W$, giving an overall complexity of $O(n W^2 N)$. Meanwhile, the proposed method only needs to enlarge the mean and std ranges each time; our experiments show that a sufficient number of similar patches can be found after only a few enlargements, so that $W^2 > 2nr$ in practice. Therefore, the proposed method is more efficient than the method in [2].

3. Experimental Results

To evaluate the performance of the proposed fast NLE (FNLE) method, we compare it with seven state-of-the-art methods: Liu's method [14], Wu's method [19], Yang's method [12], Zoran's method [15], the PCA method [20], Hou's method [2], and Gupta's method [17]. For a fair comparison, all methods are implemented in Matlab R2014a on a laptop running Windows 10, with 8 GB of RAM and a 2.4 GHz Intel Core i5-6300U CPU. The parameters of our FNLE are set as follows: the image patch size is $n = 7$, the minimum number of similar patches is $m = 64$, and the number of rows of similar pixels is $q = 8$.
Furthermore, we consider two benchmark datasets: TID2008 [26] and BSDS500 [30]. The TID2008 dataset [26] has been widely used for the evaluation of full-reference image quality assessment metrics and contains 25 uncompressed reference images, but it has been argued that all images in TID2008 contain small or large homogeneous regions [24]. To further demonstrate the robustness of our method on texture-rich images, we compare the performance of different methods on a more challenging dataset, the BSDS500 dataset with 200 test images [30]. All images are injected with additive white Gaussian noise (AWGN) at different noise levels, $\sigma = \{1, 3, 5, 10, 20, 30, 40, 50\}$, i.e., all pixels of the images are polluted by noise. The different methods are then used to estimate the noise levels of the noisy images.
To further evaluate the performance of the proposed method on real noisy images, two benchmark datasets of real-world images are considered: the Cross-Channel (CC) dataset [31] and the Darmstadt Noise Dataset (DND) [32].
The CC dataset [31] includes noisy images of 11 static scenes captured by Canon 5D Mark 3, Nikon D600, and Nikon D800 cameras. The real-world noisy images were collected in a controlled indoor environment; each scene was shot 500 times using the same camera and settings, and the average of the 500 shots is taken as the “ground truth”. The authors cropped 15 images of size 512 × 512 to evaluate the different NLE methods.
The DND dataset [32] includes 50 different scenes captured by Sony A7R, Olympus E-M10, Sony RX100 IV, and Huawei Nexus 6P devices. Each scene contains a pair of noisy and “ground truth” clean images: the noisy images are collected under higher ISO values with shorter exposure times, while the “ground truth” images are captured under lower ISO values with correspondingly longer exposure times. For each scene, the authors cropped 100 bounding boxes of size 512 × 512 to evaluate the different NLE methods.
With the “ground truth” and noisy images, the true noise variance can easily be calculated from the difference between the two images. Following [19], when handling color images, the noise level is estimated for each channel separately, and the final NLE is obtained by averaging over the channels. The evaluation indicators include the mean (Mean), standard deviation (Std), and root mean square error (RMSE) of the estimated results: the Mean evaluates the accuracy of an estimator, the Std evaluates its robustness, and the RMSE evaluates its overall performance. Note that the smaller the bias, Std, or RMSE, the better the estimator.
Table 1 and Figure 5a,b show the estimation results of the various methods on synthetic noise, and Figure 5c,d show the estimation results on real-world noisy images. It can be seen from Table 1 and Figure 5a–d that the proposed FNLE method is slightly inferior to Liu's and Wu's methods when the noise level is in the range of 1–5 but better than the other methods in all other cases.
Image denoising is a very important preliminary step for many computer vision methods, and impressive improvements have been made in this area over the past few decades. However, many well-performing algorithms treat the noise level $\sigma$ as a known parameter. BM3D [1] is one such algorithm that requires the noise level as an input parameter. We apply our NLE method to guide the BM3D image denoising algorithm and choose the peak signal-to-noise ratio (PSNR) and SSIM as evaluation indexes. Note that the larger the PSNR or SSIM, the better the denoising performance.
Table 2 shows the PSNR and SSIM results of BM3D denoising guided by the noise level estimates of the different algorithms on the CC and DND datasets. It can be seen from Table 2 that the proposed FNLE estimates real-world noisy images more accurately and yields better denoising performance, with higher PSNR and SSIM values. Therefore, the proposed FNLE method is more suitable than the other methods for estimating the noise of real-world images. In addition, even without program optimization, the proposed method achieves the highest operating efficiency.
In addition, Figure 6 and Figure 7 show the visual comparisons of the images denoised using BM3D with different noise level parameters. It can be seen that the proposed FNLE method achieves higher PSNR and SSIM values, with better visual quality than that of the other methods.

4. Conclusions

In order to overcome the influence of complex image texture on NLE, a fast image NLE method based on similar pixels is proposed in this paper. Specifically, the MSD is expressed as the std and mean of image patches. A 2D statistical histogram and summed area table are adopted to search for similar patches and pixels quickly. In particular, we extend the estimation from local to global and correct the deviation of the initial estimation. The experimental results show that the proposed method is superior to the state-of-the-art methods in terms of fast NLE and guided denoising. In particular, the proposed method is more suitable for NLE of real-world images.

Author Contributions

Conceptualization, J.W.; J.W., M.J., S.W. and S.X. conceptualized the study and contributed to the article’s organization; J.W., M.J., S.W. and S.X. contributed to the discussion of the simulation results; J.W., M.J., S.W. and S.X. drafted the manuscript, which was revised by all authors. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Henan Province University Science and Technology Innovation Talent Project (Project No. 21HASTIT021); the Major Science and Technology Projects in Xinxiang City (Project No. 21ZD009); the Science and Technology Tackling Key Issues Project in Henan Province (Project No. 242102220005); the National Natural Science Foundation of China (Grant Nos. 61775172 and 61371190); and the Hubei Key Technical Innovation Project (Project No. ZDCX2019000025).

Data Availability Statement

The datasets and comparison programs used in this paper can be downloaded from the corresponding reference papers’ websites or the authors’ homepages. The program code for the proposed method will be uploaded to the authors’ personal homepage for free download after publication.

Acknowledgments

The authors gratefully thank the Section Managing Editor and the anonymous reviewers for their outstanding comments and suggestions, which greatly helped improve the technical quality and presentation of this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Dabov, K.; Foi, A.; Katkovnik, V.; Egiazarian, K. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Trans. Image Process. 2007, 16, 2080–2095.
2. Hou, Y.; Xu, J.; Liu, M.; Liu, G.; Liu, L.; Zhu, F.; Shao, L. NLH: A Blind Pixel-Level Non-Local Method for Real-World Image Denoising. IEEE Trans. Image Process. 2020, 29, 5121–5135.
3. Li, B.; Ng, T.T.; Li, X.; Tan, S.; Huang, J. Revealing the Trace of High-Quality JPEG Compression Through Quantization Noise Analysis. IEEE Trans. Inf. Forensics Secur. 2017, 10, 558–573.
4. Guo, F.F.; Wang, X.X.; Shen, J. Adaptive fuzzy c-means algorithm based on local noise detecting for image segmentation. IET Image Process. 2016, 10, 272–279.
5. Jiang, P.; Zhang, J.-Z. No-reference image quality assessment based on local maximum gradient. J. Electron. Inf. Technol. 2015, 37, 2587–2593.
6. Scharr, H.; Spies, H. Accurate optical flow in noisy image sequences using flow adapted anisotropic diffusion. Signal Process. Image Commun. 2005, 20, 537–553.
7. Nan, Y.; Quan, Y.; Ji, H. Variational-EM-Based Deep Learning for Noise-Blind Image Deblurring. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020.
8. Heidari-Gorji, H.; Ebrahimpour, R.; Zabbah, S. A temporal hierarchical feedforward model explains both the time and the accuracy of object recognition. Sci. Rep. 2021, 11, 5640.
9. Pratap, T.; Kokil, P. Efficient Network Selection for Computer-aided Cataract Diagnosis Under Noisy Environment. Comput. Methods Programs Biomed. 2021, 200, 105927.
10. Gureyev, T.E.; Paganin, D.M.; Kozlov, A.; Nesterets, Y.I.; Quiney, H.M. Complementary aspects of spatial resolution and signal-to-noise ratio in computational imaging. Phys. Rev. A 2018, 97, 053819.
11. Katase, H.; Yamaguchi, T.; Fujisawa, T.; Ikehara, M. Image noise level estimation by searching for smooth patches with discrete cosine transform. In Proceedings of the 2016 IEEE 59th International Midwest Symposium on Circuits and Systems (MWSCAS), Abu Dhabi, United Arab Emirates, 16–19 October 2016.
12. Yang, S.M.; Tai, S.C. Fast and reliable image-noise estimation using a hybrid approach. J. Electron. Imaging 2010, 19, 033007.
13. Kokil, P.; Pratap, T. Additive white gaussian noise level estimation for natural images using linear scale-space features. Circuits Syst. Signal Process. 2021, 40, 353–374.
14. Liu, X.; Tanaka, M.; Okutomi, M. Noise level estimation using weak textured patches of a single noisy image. In Proceedings of the IEEE International Conference on Image Processing, Orlando, FL, USA, 30 September–3 October 2012; pp. 665–668.
15. Zoran, D.; Weiss, Y. Scale invariance and noise in natural images. In Proceedings of the IEEE 12th International Conference on Computer Vision (ICCV), Kyoto, Japan, 29 September–2 October 2009; pp. 2209–2216.
16. Wu, M.W.; Jin, Y.; Li, Y.; Song, T.; Kam, P.Y. Maximum-Likelihood, Magnitude-Based, Amplitude and Noise Variance Estimation. IEEE Signal Process. Lett. 2021, 28, 414–418.
17. Gupta, P.; Bampis, C.G.; Jin, Y.; Bovik, A.C. Natural scene statistics for noise estimation. In Proceedings of the 2018 IEEE Southwest Symposium on Image Analysis and Interpretation (SSIAI), Las Vegas, NV, USA, 8–10 April 2018.
18. Fang, Z.; Yi, X. A novel natural image noise level estimation based on flat patches and local statistics. Multimed. Tools Appl. 2019, 78, 17337–17358.
19. Wu, J.X.; Xie, S.L.; Li, Z.G.; Wu, S.Q. Image noise level estimation via kurtosis test. J. Electron. Imaging 2022, 31, 033015.
20. Pyatykh, S.; Hesser, J.; Zheng, L. Image noise level estimation by principal component analysis. IEEE Trans. Image Process. 2013, 22, 687–699.
21. Jiang, P.; Wang, Q.; Wu, J. Efficient Noise-Level Estimation Based on Principal Image Texture. IEEE Trans. Circuits Syst. Video Technol. 2020, 30, 1987–1999.
22. Liu, W.; Lin, W. Additive white Gaussian noise level estimation in SVD domain for images. IEEE Trans. Image Process. 2013, 22, 872–883.
23. Tang, C.; Yang, X.; Zhai, G. Noise Estimation of Natural Images via Statistical Analysis and Noise Injection. IEEE Trans. Circuits Syst. Video Technol. 2015, 25, 1283–1294.
24. Chen, G.; Zhu, F.; Heng, P.A. An efficient statistical method for image noise level estimation. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 477–485.
25. Crow, F. Summed-Area Tables for Texture Mapping; SIGGRAPH: Chicago, IL, USA, 1984; Volume 84, pp. 207–212.
26. Tampere Image Database 2008 (TID 2008). Available online: http://www.ponomarenko.info/tid2008.htm (accessed on 10 February 2024).
27. Foi, A.; Trimeche, M.; Katkovnik, V.; Egiazarian, K. Practical poissonian-gaussian noise modeling and fitting for single-image rawdata. IEEE Trans. Image Process. 2008, 17, 1737–1754.
28. Liu, G.; Zhong, H.; Jiao, L. Comparing Noisy Patches for Image Denoising: A Double Noise Similarity Model. IEEE Trans. Image Process. 2015, 24, 862–872.
29. Deledalle, C.J.; Denis, L.; Tupin, F. How to compare noisy patches? patch similarity beyond gaussian noise. Int. J. Comput. Vis. 2012, 99, 86–102.
30. Arbelaez, P.; Maire, M.; Fowlkes, C.; Malik, J. Contour detection and hierarchical image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 33, 898–916.
31. Nam, S.; Hwang, Y.; Matsushita, Y.; Kim, S.J. A holistic approach to cross-channel image noise modeling and its application to image denoising. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 1683–1691.
32. Plotz, T.; Roth, S. Benchmarking denoising algorithms with real photographs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017.
Figure 1. MSD and Euclidean distance comparison of 50 pairs of patches (patch size n = 7) randomly selected from a noisy image. (a) Actual distance; (b) normalized distance.
Figure 2. (a) Original image; (b) 2D statistical histogram of mean ( μ ) and std ( σ ) of patches in (a).
Figure 3. (a) Summed area table; (b) sum of a rectangular region.
Figure 4. NLE bias rectification on 100 synthetic noisy images. (a) The estimate obtained with the proposed method without rectification slightly deviating from the ideal estimation, MSE = 0.312; (b) the estimate obtained with the proposed method with rectification almost coinciding with the ideal estimation, MSE = 0.147.
Figure 5. Comparison of estimation performance of various estimators for synthetic noise and real-world noise images on TID 2008 [26], BSDS 500 [30], CC [31] and DND [32] datasets. (a,b) The estimation performance of different estimators for synthetic noise images on the TID 2008 and BSDS 500 datasets, respectively; (c,d) the estimation performance of different estimators for real-world noise images on the CC and DND datasets, respectively.
Figure 6. Denoised images using BM3D with different noise level parameters. (a) Original image (PSNR/SSIM: 29.62/0.7858) from CC image dataset [31]. (bi) Denoised images of Liu [14] + BM3D: 29.83/0.7896, Wu [19] + BM3D: 34.58/0.9463, Yang [12] + BM3D: 31.14/0.8534, Zoran [15] + BM3D: 30.56/0.8265, Pyatykh [20] + BM3D: 30.53/0.8264, Hou [2] + BM3D: 33.74/0.9323, Gupta [17] + BM3D: 33.38/0.9248, and FNLE+BM3D: 34.70/0.9502.
Figure 7. Denoised images using BM3D with different noise level parameters. (a) Original image (PSNR/SSIM: 30.21/0.8867) from DND image dataset [32]. (bj) Denoised images of Liu [14] + BM3D: 30.28/0.8891, Wu [19] + BM3D: 32.74/0.9433, Yang [12] + BM3D: 30.24/0.8890, Zoran [15] + BM3D: 30.50/0.8977, Pyatykh [20] + BM3D: 30.39/0.8933, Hou [2] + BM3D: 31.06/0.9162, Gupta [17] + BM3D: 30.58/0.8981, and FNLE+BM3D: 32.79/0.9434.
Table 1. Comparison of AWGN estimation results (each cell reports the mean/std of the estimates). The best results are highlighted in bold.

| Dataset | Noise Level | Liu [14] | Wu [19] | Yang [12] | Zoran [15] | Pyatykh [20] | Hou [2] | Gupta [17] | FNLE |
|---|---|---|---|---|---|---|---|---|---|
| BSDS 500 | 1 | 0.99/0.25 | 1.39/0.32 | 2.05/1.33 | 0.72/1.59 | 1.25/0.53 | 3.89/1.53 | 3.00/1.70 | 1.44/0.48 |
| BSDS 500 | 3 | 2.84/0.26 | 3.14/0.28 | 3.75/1.03 | 2.77/1.46 | 3.16/0.31 | 5.05/0.62 | 4.05/1.30 | 3.14/0.32 |
| BSDS 500 | 5 | 4.91/0.14 | 5.07/0.14 | 5.61/0.86 | 4.67/1.38 | 5.15/0.21 | 5.91/0.73 | 5.66/1.10 | 5.07/0.18 |
| BSDS 500 | 10 | 9.92/0.18 | 10.07/0.10 | 10.43/0.63 | 9.56/1.25 | 10.13/0.17 | 10.52/0.62 | 9.89/1.71 | 10.06/0.10 |
| BSDS 500 | 20 | 19.91/0.21 | 20.04/0.09 | 20.24/0.44 | 19.31/1.32 | 20.09/0.35 | 20.54/0.73 | 19.80/0.32 | 20.01/0.09 |
| BSDS 500 | 30 | 29.82/0.26 | 30.05/0.21 | 30.15/0.40 | 29.18/1.38 | 29.82/0.50 | 30.67/0.54 | 29.50/0.28 | 30.04/0.21 |
| BSDS 500 | 40 | 39.70/0.27 | 40.09/0.29 | 40.17/0.52 | 39.09/1.39 | 39.47/0.69 | 40.71/0.75 | 39.22/0.26 | 40.07/0.25 |
| BSDS 500 | 50 | 49.53/0.32 | 50.27/0.32 | 49.87/0.60 | 49.06/1.43 | 49.14/0.83 | 50.49/0.69 | 48.74/0.29 | 49.92/0.28 |
| TID 2008 | 1 | 3.08/3.31 | 1.43/0.45 | 2.11/0.90 | 1.99/1.29 | 1.43/0.54 | 2.84/1.33 | 3.07/3.30 | 1.84/0.71 |
| TID 2008 | 3 | 4.43/2.92 | 3.27/0.27 | 3.71/0.70 | 3.53/1.68 | 3.31/0.35 | 3.98/1.01 | 4.42/2.92 | 3.26/0.41 |
| TID 2008 | 5 | 6.02/2.64 | 5.14/0.40 | 5.56/0.60 | 4.25/1.61 | 5.14/0.31 | 5.92/0.92 | 6.01/2.64 | 5.13/0.33 |
| TID 2008 | 10 | 10.49/2.03 | 10.04/0.11 | 10.38/0.46 | 9.09/2.12 | 10.12/0.10 | 10.73/0.83 | 10.49/2.03 | 10.03/0.09 |
| TID 2008 | 20 | 19.98/1.32 | 20.01/0.14 | 20.22/0.37 | 19.15/1.78 | 19.97/0.27 | 20.69/0.85 | 19.98/1.32 | 20.01/0.14 |
| TID 2008 | 30 | 29.56/0.97 | 30.08/0.15 | 30.07/0.36 | 28.82/1.14 | 29.78/0.59 | 30.64/0.79 | 29.56/0.97 | 30.06/0.14 |
| TID 2008 | 40 | 39.23/0.77 | 40.22/0.31 | 39.81/0.45 | 38.77/1.16 | 39.44/0.72 | 40.50/0.61 | 39.23/0.77 | 40.16/0.29 |
| TID 2008 | 50 | 48.85/0.67 | 50.39/0.47 | 49.70/0.52 | 48.72/1.14 | 49.20/0.77 | 50.44/0.55 | 48.81/0.67 | 50.33/0.32 |
Table 2. Average results of PSNR, SSIM, and execution time (seconds) obtained via guided BM3D denoising with different methods of estimating real-world noise images in the CC dataset [31] and DND dataset [32]. The best results are highlighted in bold.

| Dataset | Metric | Noisy Image | Liu [14] | Wu [19] | Yang [12] | Zoran [15] | Pyatykh [20] | Hou [2] | Gupta [17] | FNLE |
|---|---|---|---|---|---|---|---|---|---|---|
| CC dataset | PSNR | 33.41 | 33.58 | 35.68 | 33.56 | 33.61 | 33.45 | 33.82 | 34.75 | 35.76 |
| CC dataset | SSIM | 0.9079 | 0.9114 | 0.9474 | 0.9113 | 0.9120 | 0.9087 | 0.9161 | 0.9258 | 0.9491 |
| CC dataset | Time (s) | - | 5.767 | 2.669 | 2.435 | 12.628 | 19.783 | 3.027 | 314.012 | 1.277 |
| DND dataset | PSNR | 28.81 | 28.86 | 33.38 | 31.50 | 29.01 | 31.31 | 30.26 | 31.69 | 33.57 |
| DND dataset | SSIM | 0.7893 | 0.7917 | 0.9166 | 0.8743 | 0.7974 | 0.8711 | 0.8642 | 0.8892 | 0.9201 |
| DND dataset | Time (s) | - | 3.086 | 2.386 | 2.272 | 10.358 | 17.472 | 2.994 | 301.868 | 1.242 |