Remote Sensing
  • Article
  • Open Access

31 December 2025

Weighted Total Variation for Hyperspectral Image Denoising Based on Hyper-Laplacian Scale Mixture Distribution

1 College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
2 College of Engineering Computing and Cybernetics, Australian National University, Canberra 0200, Australia
* Author to whom correspondence should be addressed.
Remote Sens. 2026, 18(1), 135; https://doi.org/10.3390/rs18010135
This article belongs to the Special Issue Recent Advances in Hyperspectral Remote Sensing: Theories, Technologies and Applications

Highlights

What are the main findings?
  • Proposes a Hyper-Laplacian Scale Mixture prior for HSI gradient modeling.
  • Introduces a region-adaptive weighted total variation model for HSI denoising.
What are the implications of the main findings?
  • Improves denoising performance in both smooth and textured regions with better texture preservation.
  • Achieves state-of-the-art performance and can enhance other TV-based methods when integrated.

Abstract

Conventional total variation (TV) regularization methods based on Laplacian or fixed-scale Hyper-Laplacian priors impose uniform sparsity penalties on gradients. These uniform penalties fail to capture the heterogeneous sparsity characteristics across different regions and directions, often leading to the over-smoothing of edges and loss of fine details. To address this limitation, we propose a novel regularization, the Hyper-Laplacian Adaptive Weighted Total Variation (HLAWTV). The proposed regularization employs a proportional mixture of Hyper-Laplacian distributions to dynamically adapt the sparsity decay rate based on image structure. Simultaneously, the adaptive weights are adjusted based on local gradient statistics and exhibit strong robustness in texture preservation across different datasets and noise conditions. We then propose a hyperspectral image (HSI) denoising method based on the HLAWTV regularizer. Extensive experiments on both synthetic and real hyperspectral datasets demonstrate that our denoising method consistently outperforms state-of-the-art methods in terms of quantitative metrics and visual quality. Moreover, incorporating our adaptive weighting mechanism into existing TV-based models yields significant performance gains, confirming the generality and robustness of the proposed approach.

1. Introduction

Hyperspectral images acquire data across hundreds of contiguous spectral bands, enabling detailed characterization of real-world scenes through rich spectral and spatial information. This makes hyperspectral images indispensable in a variety of applications, including anomaly detection [1], environmental monitoring [2], military surveillance [3], precision agriculture [4], and object detection [5]. However, during the acquisition process, HSI is often corrupted by various noise types such as Gaussian noise, impulse noise, stripe noise, and deadlines. These mixed noises significantly deteriorate the image quality and adversely affect subsequent processing tasks [6,7,8,9]. Therefore, robust and efficient denoising techniques are essential to enhance the reliability of hyperspectral images in practical scenarios.
An intuitive approach is to treat each band of a hyperspectral image as an individual grayscale image and directly apply traditional grayscale denoising algorithms, such as non-local means (NLM) [10], block-matching 3D filtering (BM3D) [11], and weighted nuclear norm minimization (WNNM) [12]. However, these methods ignore the spectral correlations of HSI. To solve this problem, some algorithms process the entire hyperspectral data as a 3D cube, such as block-matching 4D filtering (BM4D) [13] and video block-matching 4D filtering (V-BM4D) [14]. However, without prior constraints on the inherent characteristics of HSI, it is difficult to achieve optimal denoising performance. Consequently, researchers have begun developing denoising methods that exploit spectral correlations and spatial structures. These methods are generally categorized into two classes as follows: (1) Deep learning-based methods directly learn mapping functions between noisy and clean HSI pairs from training datasets to avoid explicit modeling of prior knowledge. (2) Prior model-based methods formulate HSI denoising as a constrained optimization problem by incorporating hyperspectral priors such as low-rankness in the spectral dimension, local smoothness in the spatial domain, and self-similarity of pixel patches.
In recent years, deep learning–based hyperspectral image denoising has advanced from simple spatial–spectral convolutional neural networks (CNNs) to more sophisticated architectures that effectively balance local and global context. HSID-CNN [15] pioneered the use of deep residual convolution with multiscale and multilevel feature fusion to suppress noise while preserving spectral–spatial consistency. QRNN3D [16] introduced alternating-direction quasi-recurrent pooling alongside 3D convolutions to model both spatio-spectral correlations and global spectral dependencies simultaneously. More recently, SSRT-UNet [17] integrated non-local spatial self-similarity and global spectral correlation within a single recurrent transformer U-Net block, enabling unified exploitation of hyperspectral priors across shift-window layers. TRQ3DNet [18] combined 3D quasi-recurrent blocks with U-Transformer modules connected via a bidirectional integration bridge, capturing both local textures and long-range dependencies and delivering state-of-the-art denoising performance. ILRNet [19] integrates the strengths of model-driven and data-driven approaches by embedding a rank minimization module within a U-Net architecture. However, these deep learning models still face significant limitations. Their performance heavily relies on paired clean–noisy datasets and assumed noise priors, which hinders their adaptability in real-world scenarios where such datasets are not available, or the noise characteristics may differ significantly. This reliance on prior knowledge and training on specific datasets restricts the generalization capability of these models when exposed to complex or unseen noise types, highlighting a key challenge in applying them to diverse and unstructured real-world noise conditions.
Prior model-based approaches for hyperspectral image denoising have attracted significant attention due to their flexibility and strong interpretability. Rather than relying on large-scale annotated datasets, these methods explicitly incorporate various types of prior knowledge about hyperspectral images, such as low-rankness and local smoothness. By leveraging these priors, model-based methods can adapt more effectively to different noise types and complex real-world scenarios.
Among these priors, low-rankness has proven especially useful for modeling the strong spectral correlation inherent in hyperspectral data. Building upon this idea, Zhang et al. proposed the low-rank matrix recovery (LRMR) framework [20], which exploits spectral low-rank priors to suppress noise. However, methods based solely on low-rank tend to overlook spatial structure information, resulting in suboptimal performance when dealing with complex noise such as stripes or structured artifacts.
To address the above problem, spatial smoothness priors have been introduced. For example, He et al. incorporated TV regularization [21] into their low-rank matrix denoising (LRTV) method [22], applying both low-rankness and local smoothness to better preserve spatial details and enhance denoising performance. Nevertheless, this method requires unfolding hyperspectral data into matrices, which can destroy the intrinsic spatial structure. To overcome this limitation, recent methods have modeled hyperspectral images as 3D tensors. For example, LRTDTV [23] combines Tucker decomposition with spatial–spectral TV (SSTV) [24] to effectively remove complex noise. However, balancing the parameters between low-rank and smoothness regularizations remains challenging. Recognizing that low-rank regularization is often coupled with smoothness priors, Wang et al. [25] further proposed directly imposing low-rank constraints on gradient tensors to encode both low-rankness and local smoothness simultaneously within the tensor framework. This formulation improves the adaptivity and representation power of total variation regularization across different structural patterns in hyperspectral images. To improve the texture-preserving capability of TV regularization, some methods introduce adaptive weighting strategies for sparsity constraints across spatial and spectral directions. For example, Anisotropic Spectral–Spatial TV (ASSTV) [26] applies anisotropic total variation along three directions, exploiting the spectral continuity and spatial smoothness of HSIs while accounting for the directionality of textures. Graph Spatio-Spectral TV (GSSTV) [27] weights the spatial difference operator of SSTV using a graph that reflects the spatial structure of the HSI, allowing the regularizer to better respond to image structure. However, most of these weighting schemes apply the same sparsity constraint to every pixel, causing textured areas to be over-smoothed because local gradient variations are ignored. To solve this problem, Chen et al. proposed the TPTV regularizer [28], which weights the l1 norm with a gradient map; the gradient map is used to adaptively adjust the weights so as to reduce the sparsity constraints on pixels with large variations.
In recent years, Hyper-Laplacian TV (HLTV) methods have garnered significant attention because they can more accurately model the peaky nature and heavy-tailed characteristics of spatial–spectral gradients in hyperspectral images compared to traditional Laplacian TV (LTV) [29,30]. However, although both LTV and HLTV approaches strive to match the regularization with the true sparse distribution of gradients, current TV-based denoising methods still have the following problems in hyperspectral image denoising:
(1) Existing HLTV regularizers use a fixed l p -norm constraint for each gradient direction, implicitly assuming that gradients originating from smooth areas and richly textured regions follow the same heavy-tailed distribution. In reality, hyperspectral images exhibit markedly different gradient statistics across spatial versus spectral dimensions and between smooth versus textured regions, as shown in Figure 1. By ignoring this heterogeneity, HLTV regularization fails to capture the true prior distribution of the data. As a result, it often over-smooths details in textured areas, leading to suboptimal noise removal.
Figure 1. Hyper-Laplacian modeling of spatial–spectral gradients and region-wise (smooth vs. textured) gradient distributions.
(2) Although LTV models introduce heuristic weight functions to protect edges and textures, these weights are typically fixed mappings derived from simplified priors. They lack the flexibility to adapt dynamically to local gradient characteristics. Consequently, such models struggle to discriminate effectively between flat regions and highly textured ones: over-smoothing may be introduced in flat regions, while insufficient protection is provided in highly textured regions, resulting in the underutilization of the sparsity of HSI gradients. In addition, such heuristics make it difficult to adaptively set reasonable weights based on the statistical properties of HSIs, thus limiting the overall regularization performance.
Based on the above observations, we propose a novel HLAWTV regularization method, which is built upon a Hyper-Laplacian Scale Mixture (HLSM) distribution model. Our method assumes that gradients in hyperspectral images follow Hyper-Laplacian distributions with different scales in smooth and textured regions. We embed this assumption into a maximum a posteriori (MAP) estimation framework, which yields an adaptive weighted regularization term. Specifically, the regularization fits variable-scale Hyper-Laplacian distributions based on both the direction-specific gradient statistics and the magnitude of local gradient values, enabling the model to control the sparsity decay rate dynamically. This model improves the adaptivity of total variation regularization across different structural patterns in hyperspectral images.
The main contributions of this article can be summarized as follows.
(1) We propose a sparse prior modeling method based on the Hyper-Laplacian scale mixture distribution to more accurately characterize the statistical properties of hyperspectral image gradients. Specifically, considering that HSI gradients usually exhibit distribution patterns with varying degrees of peakedness and heavy-tailedness in different directions, we model them using a variable-scale Hyper-Laplacian distribution. To further introduce regional adaptation, we derive a probabilistic scale mixture model by exploiting the conjugacy of the Hyper-Laplacian distribution with the Gamma distribution.
(2) Based on the HLSM distribution, we further propose a new weighted total variation regularization model HLAWTV. To efficiently solve the resulting optimization problem, we develop a tailored algorithm based on the Alternating Direction Method of Multipliers (ADMM), which achieves effective and scalable denoising performance.
(3) Experimental results demonstrate that the proposed denoising method performs better than other advanced methods. Furthermore, transfer experiments show that the denoising performance can be significantly improved by embedding our weighting scheme into existing TV regularizers.
The rest of this article is organized as follows. Section 2 introduces the notation and preliminary knowledge. Section 3 presents the HLSM distribution model, the HLAWTV regularizer, and the denoising model. Section 4 reports the experimental results. Section 5 discusses transferability, ablation, and parameter analyses, and Section 6 concludes the article.

2. Preliminaries

2.1. Notations

We summarize notations used in this article in Table 1. Next, we introduce the definitions of mode-n unfolding, the mode-k tensor-matrix product, and the mode-k gradient map.
Table 1. Notation declarations.
The mode-$n$ unfolding of an $N$-order tensor can be divided into the horizontal unfolding $\mathbf{X}_{(n)} \in \mathbb{R}^{I_n \times (I_1 \cdots I_{n-1} I_{n+1} \cdots I_N)}$ and the vertical unfolding $\mathbf{X}^{(n)} \in \mathbb{R}^{(I_1 \cdots I_{n-1} I_{n+1} \cdots I_N) \times I_n}$.
The mode-$k$ tensor-matrix product of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_n}$ and a matrix $A \in \mathbb{R}^{J_1 \times I_k}$ is defined as follows:
$\mathcal{Y} = \mathcal{X} \times_k A \in \mathbb{R}^{I_1 \times \cdots \times I_{k-1} \times J_1 \times I_{k+1} \times \cdots \times I_n}, \quad k = 1, 2, 3.$
The gradient map of $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$ along mode $k$ can be denoted as $\mathcal{G}_k(\mathcal{X})$, which is defined as follows:
$\mathcal{G}_k(\mathcal{X}) = \nabla_k(\mathcal{X}) = D_k \mathcal{X}, \quad k = 1, 2, 3,$
where $D_k$ denotes the first-order finite difference operator along mode $k$.
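As a concrete illustration, the following minimal NumPy sketch computes a mode-k gradient map, assuming a first-order forward difference with circular boundary handling (the boundary convention is not fixed by the text):

```python
import numpy as np

def mode_k_gradient(X, k):
    """First-order forward difference of a 3-D tensor along mode k (0, 1, or 2).
    Circular (wrap-around) boundary handling is an assumption of this sketch."""
    return np.roll(X, -1, axis=k) - X
```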

2.2. The Laplacian Scale Mixture Distribution

A random variable $s_i$ is said to follow a Laplacian Scale Mixture (LSM) distribution if it can be expressed as follows:
$s_i = \lambda_i^{-1} u_i,$
where $u_i$ is a Laplacian random variable with unit scale,
$p(u_i) = \frac{1}{2} e^{-|u_i|},$
and $\lambda_i$ is a positive random variable with density $p(\lambda_i)$. We assume $u_i$ and $\lambda_i$ are independent. Conditioned on $\lambda_i$, the distribution of $s_i$ becomes a Laplacian with inverse scale $\lambda_i$ as follows:
$p(s_i \mid \lambda_i) = \frac{\lambda_i}{2} e^{-\lambda_i |s_i|}.$
Therefore, the marginal distribution of $s_i$ is an infinite mixture of Laplacians with different scales as follows:
$p(s_i) = \int_0^{\infty} p(s_i \mid \lambda_i)\, p(\lambda_i)\, d\lambda_i = \int_0^{\infty} \frac{\lambda_i}{2} e^{-\lambda_i |s_i|}\, p(\lambda_i)\, d\lambda_i.$
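To make the scale-mixture construction concrete, a minimal sketch that draws samples $s_i = \lambda_i^{-1} u_i$ is given below; the Gamma hyperprior on $\lambda_i$ is an illustrative choice (consistent with the hyperprior used later in Section 3), not a requirement of the LSM definition:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_lsm(n, a=2.0, b=1.0):
    """Draw n samples from a Laplacian Scale Mixture: s_i = u_i / lambda_i with
    u_i ~ Laplace(0, 1) and lambda_i ~ Gamma(a, b) (the Gamma choice is illustrative)."""
    lam = rng.gamma(shape=a, scale=1.0 / b, size=n)   # positive inverse scale lambda_i
    u = rng.laplace(loc=0.0, scale=1.0, size=n)       # unit-scale Laplacian variable u_i
    return u / lam                                    # s_i = lambda_i^{-1} u_i
```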

2.3. TV Regularizers

TV regularizer-based methods have emerged in hyperspectral image restoration, such as the early 3-D TV model [23], the subsequently improved Enhanced 3-D TV (E3DTV) [31], and the correlated TV model [32]. To facilitate the subsequent analysis, we abstract and generalize their structures and use a unified representation framework to express these different TV regularizations as follows:
$\|\mathcal{X}\|_{3DTV} = \|D_1 \mathcal{X}\|_1 + \|D_2 \mathcal{X}\|_1 + \|D_3 \mathcal{X}\|_1,$
where X is the mathematical representation of HSIs in different TV regularization models.
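For reference, a minimal sketch of this unified anisotropic 3-D TV norm, assuming first-order circular differences along the two spatial modes and the spectral mode:

```python
import numpy as np

def tv3d_norm(X):
    """Anisotropic 3-D TV of Eq. (8): sum of l1 norms of the first-order differences
    along the three modes of a tensor X (circular boundary assumed in this sketch)."""
    return sum(np.sum(np.abs(np.roll(X, -1, axis=k) - X)) for k in range(3))
```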

2.4. HSI Degradation Model

Since HSIs are often contaminated with various types of noise, the degradation model is typically represented as follows:
Y = X + E ,
where the tensor Y represents the observed hyperspectral image, the tensor X represents the clean HSI, and the tensor E is the noise term. The goal of HSI denoising is to estimate the clean image X using the corrupted image Y . The denoising model for HSIs can be represented as
$\arg\min_{\mathcal{X}, \mathcal{E}} \; \frac{1}{2} \|\mathcal{Y} - \mathcal{X} - \mathcal{E}\|_F^2 + \eta\, J_1(\mathcal{X}) + \lambda\, J_2(\mathcal{E}),$
where the first term is the data fidelity term, the second term is used to describe the prior information of the clean HSI [33,34,35], and the third term is used to describe the sparsity of the mixed noise.

3. Methodology

In this section, we propose the HLAWTV regularizer and give its technical details, then propose a denoising model based on the HLAWTV regularizer, and finally utilize the ADMM algorithm to solve the denoising model.

3.1. HLSM Model

Different TV regularizers allow flexibility in balancing denoising and texture preservation capabilities by selecting different forms of $\mathcal{X}$. In traditional regularization frameworks, the $l_1$-norm is widely used to constrain the sparsity of the gradient map, implicitly assuming that the gradient map follows a fixed-scale LSM distribution [36,37] with probability density function (PDF) $p(x; \lambda) \propto e^{-\lambda |x|}$. However, some recent research has found that both spatial and spectral gradients in natural images exhibit significantly more heavy-tailed distributions. Specifically, their heavy-tailed distributions are closer to the Hyper-Laplacian distribution with probability density function $p(x; \lambda) \propto e^{-\lambda |x|^{\alpha_k}}$, where $\alpha_k$ controls the strength of the sparsity in that direction. A smaller $\alpha_k$ represents a stronger sparse prior.
Based on the above theory, we propose a Hyper-Laplacian scale mixture distribution model. Current TV regularization based on the Hyper-Laplacian distribution assumes that the gradient values in the same difference direction independently and identically obey a Hyper-Laplacian distribution of the same scale. However, different regions of HSIs often have completely different structural characteristics. To better capture the nonstationary and sparse nature of gradient variations, we assume that the gradient values at each pixel follow independent HLSM distributions. As shown in Figure 2, the HLSM-based model closely matches the gradient distribution of $\mathcal{X}$. Specifically, for the pixel-wise gradient values $G_{i,j,b}$, we assume the following model:
$p(\mathcal{G}) = \prod_{i,j,b} p(G_{i,j,b}).$
Figure 2. Empirically fitted curves for the gradient of the PaviaC dataset.
Instead of assuming a LSM distribution, we define the following:
$p(G_{i,j,b}) = \int p(G_{i,j,b} \mid \Theta_{i,j})\, p(\Theta_{i,j})\, d\Theta_{i,j}, \qquad p(G_{i,j,b} \mid \Theta_{i,j}) = \frac{\alpha_k \Theta_{i,j}}{2}\, |G_{i,j,b}|^{\alpha_k - 1} \exp\!\left( -\Theta_{i,j} |G_{i,j,b}|^{\alpha_k} \right),$
where 0 < α k < 1 is a sparsity-controlling exponent, allowing stronger sparsity than the Laplacian distribution (which corresponds to α = 1 ).
Since the integral form of p ( G i , j , b ) does not yield a closed-form solution, we adopt an Expectation-Maximization (EM) strategy to optimize the marginal log-likelihood ln ( p ( G ) ) .
E step: We compute the evidence lower bound (ELBO) of the log-marginal probability, given the current estimate G ( t ) . The ELBO is denoted as follows:
$Q(\mathcal{G}, \mathcal{G}^{(t)}) = \int \prod_{i,j} p(\Theta_{i,j} \mid \mathcal{G}^{(t)}) \, \ln \prod_{i,j,b} p(G_{i,j,b} \mid \Theta_{i,j}) \, d\Theta.$
Using the form of the Hyper-Laplacian conditional distribution, the optimization problem can be reformulated as follows:
$Q(\mathcal{G}, \mathcal{G}^{(t)}) = -\sum_{i,j,b} \langle \Theta_{i,j} \rangle \cdot |G_{i,j,b}|^{\alpha_k} + \mathrm{const},$
where $\langle \Theta_{i,j} \rangle = W(i,j)$ is the expectation of $\Theta_{i,j}$ under the current posterior estimate. $W(i,j)$ is a function of $\mathcal{G}^{(t)}$, and $\alpha_k = f(\mu_k, \sigma_k)$. Both $W(i,j)$ and $f(\mu_k, \sigma_k)$ will be discussed later. In practice, we assume that the posterior $p(\Theta_{i,j} \mid \mathcal{G}^{(t)})$ follows a Gamma distribution, or use local statistics to approximate it.

3.2. HLAWTV Regularizer with HLSM

M step: We maximize the ELBO w.r.t. G , resulting in the following optimization:
$\mathcal{G}^{(t+1)} = \arg\min_{\mathcal{G}} \sum_{i,j,b} W(i,j) \cdot |G_{i,j,b}|^{\alpha_k}.$
Consequently, following the optimization structure induced by the EM algorithm, we define the HLAWTV regularization term as follows:
$\|\mathcal{Z}\|_{\mathrm{HLAWTV}} = \sum_{k=1}^{3} \|D_k \mathcal{X}\|_{\alpha_k, W} = \sum_{k=1}^{3} \sum_{i,j,b} W_k(i,j) \cdot |G_k(i,j,b)|^{\alpha_k}.$
In this way, incorporating HLAWTV as the regularization term is equivalent to maximizing the joint log-likelihood of the gradient priors along all directions, formulated as $\sum_{k=1}^{3} \ln(p(\mathcal{G}_k \mid \mathcal{X}))$. This demonstrates that the adoption of HLAWTV serves not only as a regularization strategy but also as a probabilistic modeling of gradients, grounded in a prior-driven MAP estimation framework.
In our denoising framework, the adaptive weight map $W$ plays a crucial role by acting as the scale parameter in the HLSM. Rather than assuming a fixed scale, we adopt a spatially adaptive formulation where $W$ reflects the local structural complexity of the HSI. Because this scale parameter is a latent variable in the HLSM, it is inherently unknown and its value is influenced by local spatial characteristics. To make $W$ more data-adaptive, we explicitly design $W(i,j)$ to approximate this latent scale, effectively linking the statistical sparsity prior with image-driven structural information. In particular, large weights correspond to stronger shrinkage, enforcing sparsity in smooth regions, while small weights preserve larger gradient magnitudes in texture or edge regions.
To robustly estimate W ( i , j ) , we use low-rank filtering to obtain denoising estimates for hyperspectral images, which reduces the interference of noise in weight calculation. HSI exhibits consistent texture patterns across all bands, whereas the noise in each band is random. In order to extract the texture region of HSI more accurately, we use mean filtering along the spectral dimension to obtain a gradient map with better texture detail quality. The calculation formula is as follows:
$\bar{G}_k = \frac{1}{B} \sum_{b=1}^{B} (\mathcal{G}_k)(:,:,b), \quad k = 1, 2, 3,$
where $\bar{G}_k(i,j)$ is a matrix that provides a more reliable characterization of texture structures along the three modes of HSIs. To enhance the adaptivity of the regularization strength across different image regions, we use the Gamma distribution as a hyperprior over the scale parameter $\Theta_{i,j}$ in the HLSM as follows:
$p(\Theta_i) = \frac{b^a}{\Gamma(a)}\, \Theta_i^{a-1} e^{-b \Theta_i}.$
This choice is justified on multiple grounds as follows: (1) The gamma distribution is conjugate to the exponential family, enabling tractable marginalization. (2) By adjusting the shape parameter a and inverse scale parameter b, the gamma distribution flexibly models diverse behavioral characteristics ranging from strong sparsity to weak sparsity. In our method, the weighting scheme originates from Bayesian inference under the gamma-exponential conjugate model. Specifically, when the prior distribution of Θ i , j is gamma and the conditional probability of G i , j , b follows a Hyper-Laplacian distribution with inverse scale Θ i , j , the posterior probability of Θ i , j given G i , j , b is also gamma. This conjugacy allows us to directly update the posterior distribution of latent variables via the E-step of the EM algorithm as shown in Equation (13). Therefore, the posterior distribution of Θ i , j is computed based on the current gradient G i , j , b and remains a Gamma distribution as follows:
$\Theta_k \mid \bar{G}_k \sim \mathrm{Gamma}\!\left( a + 1, \; b + |\bar{G}_k| \right).$
The expectation of this posterior yields the following adaptive weight function:
$W_k(i,j) = \mathbb{E}\!\left[ \Theta_{i,j} \mid \bar{G}_k(i,j) \right] = \frac{a+1}{b + |\bar{G}_k(i,j)|}.$
The proposed weighting function is monotonically decreasing in $|\bar{G}_k(i,j)|$, which directly supports the sparse-modeling prior: when $|\bar{G}_k(i,j)| \to 0$ in smooth regions, the weight becomes large to encourage smoothing; when $|\bar{G}_k(i,j)|$ is large at edges or textures, the weight decreases to preserve fine details. Additionally, this weighting scheme improves robustness against noise, mitigating instability caused by gradient fluctuations.
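A minimal sketch of this weight computation is given below; the spectral mean is taken over gradient magnitudes, and the values of a and b are illustrative defaults rather than values prescribed by the text:

```python
import numpy as np

def adaptive_weights(G, a=1.0, b=1.0):
    """Adaptive weights W_k(i, j) = (a + 1) / (b + |G_bar(i, j)|) for a mode-k gradient
    tensor G of shape (H, W, B); a and b are illustrative hyper-parameters."""
    G_bar = np.mean(np.abs(G), axis=2)   # spectral mean of the gradient magnitudes
    W = (a + 1.0) / (b + G_bar)          # posterior-mean weight: large in smooth regions
    return W / W.max()                   # normalization for numerical stability
```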

3.3. Proposed Model

To better characterize the spatial–spectral structures of hyperspectral images, we propose a novel denoising model that integrates spectral subspace representation with the HLAWTV regularizer. By projecting the HSI gradient tensor into the spectral subspace, we are able to utilize the inherent spectral redundancy to suppress noise and maintain structural consistency across bands. This subspace representation improves denoising robustness while maintaining computational efficiency and is a strong complement to our proposed HLAWTV. Therefore, the spectral gradient of HSIs can be expressed as follows:
D k ( X ) = U k × 3 V k T ,
where $\mathcal{U}_k \in \mathbb{R}^{M \times N \times r}$ contains the corresponding spatial coefficients, and $V_k \in \mathbb{R}^{B \times r}$ is the spectral basis. This transformation allows the spatial variation across bands to be compactly encoded in $\mathcal{U}_k$. We use the HLAWTV regularizer to constrain $\mathcal{U}_k$ to enhance both sparsity and adaptivity across different spatial regions and gradient directions. Based on the above discussion, our proposed HLAWTV denoising model can be expressed as follows:
$\arg\min_{\mathcal{X}, \mathcal{E}, \mathcal{U}_k, V_k} \; \frac{1}{2}\|\mathcal{Y} - \mathcal{X} - \mathcal{E}\|_F^2 + \lambda \sum_{k=1}^{3} \|\mathcal{U}_k\|_{\mathrm{HLAWTV}} + \|\mathcal{E}\|_1$
$\text{s.t.} \quad D_k(\mathcal{X}) = \mathcal{U}_k \times_3 V_k^T, \quad V_k^T V_k = I, \quad k = 1, 2, 3.$
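As a small illustration of the subspace factorization $D_k(\mathcal{X}) = \mathcal{U}_k \times_3 V_k^T$ used in this model, the following sketch reconstructs a tensor from spatial coefficients and a spectral basis; the shapes follow the definitions above, and the indexing convention is an assumption of the sketch:

```python
import numpy as np

def mode3_product(U, V):
    """Reconstruct an (M, N, B) tensor from spatial coefficients U of shape (M, N, r)
    and spectral basis V of shape (B, r): band b equals sum_r U[:, :, r] * V[b, r]."""
    M, N, r = U.shape
    return (U.reshape(M * N, r) @ V.T).reshape(M, N, V.shape[0])
```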

3.4. Algorithm

The proposed HSI denoising model is solved using ADMM. According to the ADMM method, the optimization problem described in Equation (21) can be solved by minimizing the following augmented Lagrangian function:
$\mathcal{L}(\mathcal{X}, \mathcal{E}, \mathcal{U}_k, V_k, \mathcal{M}_k) = \lambda \sum_{k=1}^{3} \|\mathcal{U}_k\|_{\alpha_k, W} + \|\mathcal{E}\|_1 + \sum_{k=1}^{3} \frac{\mu}{2} \left\| D_k(\mathcal{X}) - \mathcal{U}_k \times_3 V_k^T + \frac{\mathcal{M}_k}{\mu} \right\|_F^2 + \frac{\mu}{2} \left\| \mathcal{Y} - \mathcal{X} - \mathcal{E} + \frac{\mathcal{M}_4}{\mu} \right\|_F^2,$
where M k (k = 1, 2, 3) and M 4 are Lagrange multipliers. μ is the penalty parameter. Each variable can be updated alternatively, while the other variables remain constant.
(1) Update $\mathcal{U}_k$ and $V_k$: The subproblem with respect to $\mathcal{U}_k$ can be expressed as
$\min_{\mathcal{U}_k} \; \lambda \|\mathcal{U}_k\|_{\alpha_k, W} + \frac{\mu}{2} \left\| D_k(\mathcal{X}) + \frac{\mathcal{M}_k}{\mu} - \mathcal{U}_k \times_3 V_k^T \right\|_F^2.$
For each directional gradient tensor D k , we first unfold the 3D tensor into a 2D matrix to facilitate subspace modeling as follows:
$\mathbf{D}_k = \mathrm{Unfold}(D_k(\mathcal{X})) \in \mathbb{R}^{MN \times B}.$
Then, the update of the subspace coefficient matrix U k is obtained by solving the following:
$\min_{\mathbf{U}_k} \; \frac{\mu}{2} \left\| \mathbf{D}_k + \frac{\mathbf{M}_k}{\mu} - \mathbf{U}_k \mathbf{V}_k^T \right\|_F^2 + \lambda \|\mathbf{U}_k\|_{\alpha_k, W},$
where $\mathbf{D}_k$, $\mathbf{U}_k$ and $\mathbf{M}_k$ denote the mode-3 unfolding matrices of $D_k(\mathcal{X})$, $\mathcal{U}_k$ and $\mathcal{M}_k$, respectively. This problem is solved using the following two-step procedure:
In the first step, we update the matrices U k ( t e m p ) and V k ( t + 1 ) with Singular Value Decomposition (SVD) as follows:
$\{\mathbf{U}_k^{temp}, \mathbf{V}_k^{(t+1)}\} = \mathrm{SVD}\!\left( \mathbf{D}_k + \frac{1}{\mu}\mathbf{M}_k \right).$
In the second step, we perform weighted nonconvex shrinkage on U k ( t e m p ) to update U k ( t + 1 ) as follows:
$\mathbf{U}_k^{(t+1)} = \mathrm{Shrink}^{\,q}_{\frac{\lambda W_k}{\mu}}\!\left( \mathbf{U}_k^{temp} \right),$
where $\mathrm{Shrink}_q(\cdot)$ is a generalized shrinkage/thresholding operator [38]. To set parameters that better align with the sparsity characteristics of gradients in both spatial and spectral dimensions, we set the value of q to $\alpha_k$, as shown in Equation (35).
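For concreteness, a minimal sketch of a generalized shrinkage/thresholding step for the scalar proximal problem min_x 0.5(x − y)^2 + λ|x|^q with 0 < q ≤ 1 is given below; it follows the standard fixed-point GST scheme and is a sketch rather than the exact operator implementation of [38]:

```python
import numpy as np

def gst(y, lam, q, n_iter=10):
    """Generalized shrinkage/thresholding for min_x 0.5*(x - y)^2 + lam*|x|^q, 0 < q <= 1.
    Elements at or below the threshold tau are set to zero; the others are found by a
    fixed-point iteration started at |y| (threshold guarantees a positive fixed point)."""
    tau = (2.0 * lam * (1.0 - q)) ** (1.0 / (2.0 - q)) \
        + lam * q * (2.0 * lam * (1.0 - q)) ** ((q - 1.0) / (2.0 - q))
    x = np.zeros_like(y, dtype=float)
    mask = np.abs(y) > tau
    t = np.abs(y[mask])
    for _ in range(n_iter):
        t = np.abs(y[mask]) - lam * q * t ** (q - 1.0)
    x[mask] = np.sign(y[mask]) * t
    return x
```

For q = 1 the threshold reduces to λ and the update becomes ordinary soft-thresholding.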
(2) Update E : The subproblem with respect to E can be expressed as
$\min_{\mathcal{E}} \; \left\| \mathcal{Y} - \mathcal{X} + \frac{\mathcal{M}_4}{\mu} - \mathcal{E} \right\|_F^2 + \frac{1}{\mu}\|\mathcal{E}\|_1.$
The solution of E can be computed using the following soft-thresholding operator:
$\mathcal{E}^{(t+1)} = \mathrm{Soft}_{\frac{1}{\mu}}\!\left( \mathcal{Y} - \mathcal{X} + \frac{\mathcal{M}_4}{\mu} \right).$
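A one-line sketch of this element-wise soft-thresholding operator (the threshold 1/μ follows Equation (30)):

```python
import numpy as np

def soft_threshold(X, tau):
    """Element-wise soft-thresholding: sign(X) * max(|X| - tau, 0)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)
```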
(3) Update $\mathcal{X}$: The subproblem with respect to $\mathcal{X}$ can be expressed as
$\min_{\mathcal{X}} \; \left\| \mathcal{Y} - \mathcal{E} + \frac{\mathcal{M}_4}{\mu} - \mathcal{X} \right\|_F^2 + \sum_{k=1}^{3} \left\| \mathcal{U}_k \times_3 V_k^T - \frac{\mathcal{M}_k}{\mu} - D_k(\mathcal{X}) \right\|_F^2.$
Optimizing the above problem can be considered as solving the following linear system:
$\left( \mu \mathcal{I} + \mu \sum_{k=1}^{3} D_k^T D_k \right) \mathcal{X} = \mu(\mathcal{Y} - \mathcal{E}) + \mathcal{M}_4 + \mu \sum_{k=1}^{3} D_k^T \left( \mathcal{U}_k \times_3 V_k^T - \frac{\mathcal{M}_k}{\mu} \right),$
where D k T indicates the transpose of D k . This quadratic problem has an analytic solution on the Fourier domain as follows:
$\mathcal{H} = \mathcal{Y} - \mathcal{E} + \frac{\mathcal{M}_4}{\mu} + \sum_{k=1}^{3} D_k^T \left( \mathcal{U}_k \times_3 V_k^T - \frac{\mathcal{M}_k}{\mu} \right),$
$\mathcal{T} = |\mathcal{F}(D_1)|^2 + |\mathcal{F}(D_2)|^2 + |\mathcal{F}(D_3)|^2,$
$\mathcal{X}^{(t+1)} = \mathcal{F}^{-1}\!\left( \frac{\mathcal{F}(\mathcal{H})}{\mathbf{1} + \mathcal{T}} \right),$
where 1 represents the tensor with all elements as 1. F ( · ) denotes the Fourier transform operator, and | · | 2 represents the squared magnitude of the element.
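The following minimal sketch mirrors this FFT-domain closed form, assuming periodic boundary conditions for the difference operators (H is the right-hand-side tensor defined above):

```python
import numpy as np

def diff_otf(shape, axis):
    """Transfer function of a first-order circular difference filter along one axis."""
    d = np.zeros(shape)
    d[(0,) * len(shape)] = -1.0
    idx = [0] * len(shape)
    idx[axis] = 1
    d[tuple(idx)] = 1.0
    return np.fft.fftn(d)

def solve_x(H):
    """Closed-form X update: X = F^{-1}( F(H) / (1 + T) ), with T = sum_k |F(D_k)|^2."""
    T = sum(np.abs(diff_otf(H.shape, ax)) ** 2 for ax in range(H.ndim))
    return np.real(np.fft.ifftn(np.fft.fftn(H) / (1.0 + T)))
```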
(4) Update W k : To solve this subproblem, we calculate the parameter α k for each direction based on the mean and standard deviation of the current gradient. Specifically, for each direction, the calculation of α k is broken down into several steps as follows:
First, we normalize the rank of the current low-rank approximation matrix as follows:
$R = \frac{r_0 - r_{\min}}{r_{\max} - r_{\min}}.$
As the iterations proceed, $R$ represents the normalized rank of the current low-rank approximation matrix, computed from the current rank $r_0$ and the minimum and maximum ranks $r_{\min}$ and $r_{\max}$. This normalization ensures that the adjustment of the sparsity penalty is based on the current rank, dynamically adapting it during iterations. Next, we compute a penalty term based on the standard deviation $\sigma_k$ and mean $\mu_k$ of the gradient to fit the gradient distribution in different directions as follows:
$P_k = \exp\!\left( -\frac{\sigma_k}{\mu_k + \epsilon} \right).$
Combining the ratio R and the penalty term P k , we calculate α k using the following formula:
$\alpha_k = f(\mu_k, \sigma_k) = (1 - R \cdot P_k)\, \alpha_{\max} + R \cdot P_k \, \alpha_{\min}.$
Using the calculated values α k as the shape parameter a, we update the weight W k in each pixel ( i , j ) for each direction. The weight update is given by the following:
$W_k(i,j) = \frac{\alpha_k + 1}{b + |\bar{G}_k(i,j)|}.$
To stabilize the weights, we normalize them as follows:
$W_k(i,j) = \frac{W_k(i,j)}{\max_{i,j} W_k(i,j)}.$
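A compact sketch of this $\alpha_k$ and weight update, mirroring Equations (33)–(37); the bounds alpha_min and alpha_max and the offset b are illustrative defaults rather than values fixed by the text:

```python
import numpy as np

def update_alpha_and_weights(G_bar, r0, r_min, r_max,
                             alpha_min=0.3, alpha_max=0.9, b=1.0, eps=1e-8):
    """Direction-wise shape parameter alpha_k and pixel-wise weights W_k from the
    spectral-mean gradient map G_bar (2-D) and the current rank r0."""
    R = (r0 - r_min) / max(r_max - r_min, eps)              # normalized rank, Eq. (33)
    mu, sigma = np.mean(np.abs(G_bar)), np.std(np.abs(G_bar))
    P = np.exp(-sigma / (mu + eps))                         # penalty term, Eq. (34)
    alpha = (1.0 - R * P) * alpha_max + R * P * alpha_min   # shape parameter, Eq. (35)
    W = (alpha + 1.0) / (b + np.abs(G_bar))                 # pixel-wise weights, Eq. (36)
    return alpha, W / W.max()                               # normalized weights, Eq. (37)
```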
The multipliers are updated as follows:
$\mathcal{M}_k^{(t+1)} = \mathcal{M}_k^{(t)} + \mu\left( D_k(\mathcal{X}) - \mathcal{U}_k \times_3 V_k^T \right), \quad k = 1, 2, 3,$
$\mathcal{M}_4^{(t+1)} = \mathcal{M}_4^{(t)} + \mu\left( \mathcal{Y} - \mathcal{X} - \mathcal{E} \right).$
The specific process of the ADMM solver for HLAWTV is shown in Algorithm 1.
Algorithm 1 ADMM solver for HLAWTV
Require: Noisy HSI Y ; regularization λ ; rank bounds r min , r max ; shape-parameter bounds α min , α max
Ensure: Recovered HSI X
  1: Initialize X ← Y, E ← 0, M_k ← 0
  2: t ← 1
  3: while not converged do
  4:     Update U_k, V_k by Equations (23)–(27)
  5:     Update E by Equation (29)
  6:     Update X by Equation (32)
  7:     Update W_k by Equation (37)
  8:     Update M_k by Equation (38)
  9:     t ← t + 1
 10: end while

4. Results

To validate the superiority of the proposed method, we conduct extensive comparative experiments on both synthetic and real HSI datasets. Our method is compared with the following thirteen classical or state-of-the-art methods: the QRNN3D [16] using the recurrent network to explore the spatial–spectral and the global correlation along the spectrum, the ILRNet [19] embedding a rank minimization module within a U-Net architecture to integrate the strengths of model-driven and data-driven approaches, the LRTV [22] applying the traditional 2-D total variation regularizer to each band independently, the LRTDTV [23] integrating the SSTV regularizer into a low-rank tensor decomposition framework, TCTV [25] considering the spectral correlation and sparsity prior on gradient tensor, E3DTV [31] adopting edge-preserving strategies to enhance spatial detail retention, RCTV [32] imposing TV regularization on representative low-rank coefficients, and ETPTV [28] introducing a texture-preserving weighted TV regularizer. In addition to TV-based models, we also consider the following low-rank- and non-local-based methods: TNN [35] employing a tensor nuclear norm to exploit global low-rankness, LRTF-DFR [39] introducing a dual-factor low-rank tensor decomposition with spatial regularization, HNN [40] using a two-dimensional frontal Haar wavelet transform to disentangle low- and high-frequencies, FastHyDe [41] combining spectral subspace projection with non-local denoising for high efficiency, and NGmeet [42] jointly leveraging global low-rank spectral subspace and non-local self-similarity through block matching and iterative refinement. The relationships and differences between the proposed method and the compared methods are presented in Table 2. All deep learning models are implemented in PyTorch 2.5.1 and trained on a workstation equipped with an Intel Xeon Platinum 8480+ CPU and Nvidia H800 GPU, using a batch size of 4. All traditional methods are implemented in MATLAB R2018b and executed on a workstation equipped with an Intel Core i5-12600KF processor (2.50 GHz) and 16 GB of RAM.

4.1. Synthetic Data Experiments

To ensure a fair and comprehensive comparison, we performed simulated experiments on two hyperspectral datasets with distinct spatial and spectral characteristics: the Pavia City Centre (PaviaC) dataset with a size of 200 × 200 × 80 and the Washington DC Mall (WDC) dataset with a size of 256 × 256 × 144. These two datasets can be regarded as ground-truth hyperspectral data, as they exhibit no perceptible noise. The pixel intensities of each spectral band are normalized to the range [0, 1]. To simulate realistic acquisition scenarios, we synthetically generate six types of noise and add them to the clean data.
Case 1: Each pixel was corrupted by additive i.i.d. Gaussian noise with a standard deviation of σ = 0.1.
Case 2: non-i.i.d. Gaussian noise was simulated by adding i.i.d. Gaussian noise with different variances to each spectral band, where the SNR for each band was randomly selected from the range of [1, 15] dB.
Case 3: A mixture of Gaussian noise and impulse noise is added to each band. The Gaussian noise is added as in Case 1. The percentage of impulse noise is set to 0.2.
Case 4: The impulse noise is added as in Case 3. Furthermore, 20% (PaviaC) or 30% (WDC) of the bands were randomly selected to add random stripes, with the number of corrupted columns ranging from 10 to 30. A total of 10% of the bands were randomly selected to simulate deadline noise, in which 5 to 25 rows were randomly chosen and set to zero to emulate dead sensor lines.
Case 5: The stripe noise and deadlines are added as in Case 4. The percentage of impulse noise is set to 0.5 (PaviaC) and 0.3 (WDC).
Case 6: A mixture of Gaussian noise, impulse noise, stripe noise, and deadlines is added to each band. The Gaussian noise is added as in Case 2, and the stripe noise and deadlines are added as in Case 4. The percentage of impulse noise is set to 0.2.
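For reproducibility, a minimal sketch of such a mixed-noise synthesis is given below, combining the ingredients described above (Gaussian noise with a fixed σ as in Case 1, impulse noise as in Case 3, stripes and deadlines as in Case 4); details the text does not fix, such as the salt-and-pepper impulse model and the stripe intensity, are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_mixed_noise(X, sigma=0.1, impulse_ratio=0.2,
                    stripe_band_ratio=0.2, deadline_band_ratio=0.1):
    """Corrupt a clean HSI X in [0, 1] with shape (H, W, B) by Gaussian, impulse,
    stripe, and deadline noise (unspecified details are assumptions of this sketch)."""
    H, W, B = X.shape
    Y = X + rng.normal(0.0, sigma, X.shape)                        # i.i.d. Gaussian noise
    mask = rng.random(X.shape) < impulse_ratio                     # salt-and-pepper impulses
    Y[mask] = rng.integers(0, 2, size=int(mask.sum())).astype(float)
    for b in rng.choice(B, int(stripe_band_ratio * B), replace=False):
        cols = rng.choice(W, rng.integers(10, 31), replace=False)  # 10-30 striped columns
        Y[:, cols, b] += rng.uniform(0.2, 0.5)                     # stripe offset (assumed)
    for b in rng.choice(B, int(deadline_band_ratio * B), replace=False):
        rows = rng.choice(H, rng.integers(5, 26), replace=False)   # 5-25 dead lines
        Y[rows, :, b] = 0.0                                        # set to zero
    return Y
```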
To ensure the statistical reliability of the experimental results, each denoising experiment was repeated 20 times for each noise case. The denoising performance was quantitatively evaluated using three metrics: mean peak signal-to-noise ratio (MPSNR) [43], mean structural similarity index (MSSIM) [44], and spectral angle mapper (SAM), which together reflect spatial fidelity, structural preservation, and spectral accuracy. Furthermore, the average runtime of each method was recorded to assess computational efficiency.
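Two of these metrics can be sketched directly; the following assumes images scaled to [0, 1] with shape (H, W, B) and is a simple reference implementation, not the exact code used to produce the tables:

```python
import numpy as np

def mpsnr(clean, denoised):
    """Mean PSNR over spectral bands, for images scaled to [0, 1]."""
    psnr_per_band = []
    for b in range(clean.shape[2]):
        mse = np.mean((clean[:, :, b] - denoised[:, :, b]) ** 2)
        psnr_per_band.append(10.0 * np.log10(1.0 / max(mse, 1e-12)))
    return float(np.mean(psnr_per_band))

def msam(clean, denoised, eps=1e-12):
    """Mean spectral angle (radians) between per-pixel spectra."""
    c = clean.reshape(-1, clean.shape[2])
    d = denoised.reshape(-1, denoised.shape[2])
    cos = np.sum(c * d, axis=1) / (np.linalg.norm(c, axis=1) * np.linalg.norm(d, axis=1) + eps)
    return float(np.mean(np.arccos(np.clip(cos, -1.0, 1.0))))
```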
Quantitative Comparison: We present the MPSNR, MSSIM, MSAM, and mean time values of 20 repeated experiments obtained by all the compared methods in Table 3 and Table 4. It can be seen that deep-learning denoising methods perform less effectively than the traditional model-based approaches on both the PaviaC and WDC datasets, mainly because the training dataset ICVL is not similar to these datasets. In Case 2, FastHyDe achieves excellent performance in terms of MPSNR and MSSIM. This is because FastHyDe transforms non-i.i.d. Gaussian noise into i.i.d. noise through whitening, which significantly simplifies the noise model. However, it underperforms the proposed method in the other cases. As observed, the proposed HLAWTV achieves overall superior results over the compared methods under nearly all cases. On both the PaviaC and WDC datasets, our algorithm still achieves optimal results under i.i.d. and non-i.i.d. Gaussian noise compared to the algorithms using Gaussian noise constraint terms. In the case of impulse noise and structural noise (stripe noise and deadlines), our algorithm degrades less than other TV-regularization-based methods as the noise level increases. This is due to the ability of our HLAWTV regularizer to adaptively adjust the sparse penalty strength based on structural information in different regions and directions.
Visual Comparison: Figure 3 and Figure 4 illustrate the denoising performance of all competing algorithms on the 25th band of the PaviaC and WDC datasets. The noisy observations (b) are severely degraded by impulse noise, vertical stripes, and deadlines, causing most scene details to be overwhelmed. Deep-learning-based methods can remove noise effectively, but they suffer from heavy over-smoothing. Traditional low-rank models such as TNN, LRTF-DFR, and FastHyDe suppress part of the impulse noise yet leave conspicuous stripe residuals. NGmeet and LRTV further attenuate the stripes but suffer from heavy over-smoothing. The texture in PaviaC and the architectural details in WDC become blurred. LRTDTV, TCTV, E3DTV, and HNN successfully remove structured noise but introduce blocky artifacts. RCTV and ETPTV preserve geometric structures, though isolated impulse dots remain visible. The proposed HLAWTV almost completely eliminates impulse and stripe/deadlines corruption while retaining sharp boundaries. These observations substantiate the effectiveness and robustness of the HLSM prior combined with our adaptive weighting.
Figure 3. Denoising results of different methods on Band 25 of the PaviaC dataset under Case 6.
Figure 4. Denoising results of different methods on Band 25 of the WDC dataset under Case 6.
Table 2. Comparison of HSI denoising methods in terms of modeling properties and noise assumptions.
Method | Gradient Distribution
QRNN3D | -
ILRNet | -
TNN | -
LRTF-DFR | Laplacian
FastHyDe | -
NGmeet | -
LRTV | Laplacian
LRTDTV | Laplacian
TCTV | Laplacian
E3DTV | Laplacian
RCTV | Laplacian
ETPTV | Laplacian
HNN | -
HLAWTV (Ours) | Hyper-Laplacian

4.2. Real Data Experiments

To validate the proposed method under practical conditions, we executed experiments on two real HSI datasets. One of the real dataset selected is the Urban dataset whose size is 307 × 307 × 210, the other is the Indian Pines dataset with a size of 145 × 145 × 220. Both datasets are severely polluted, thereby furnishing a stringent benchmark for performance evaluation.
Figure 5 and Figure 6 show two representative bands of Urban. After denoising, methods such as QRNN3D, TNN, FastHyDe, LRTDTV, TCTV, RCTV, and HNN still exhibit obvious stripe noise. ILRNet, NGmeet, LRTV, and E3DTV produce noticeable over-smoothing. In contrast, HLAWTV simultaneously removes noise while preserving building edges and road textures.
Figure 5. Denoising comparison on the real scenario at band 207 of Urban.
Figure 6. Denoising comparison on the real scenario at band 108 of Urban.
Figure 7 and Figure 8 show two bands from the Indian Pines dataset. Low-rankness-based algorithms such as TNN, TCTV, and HNN mitigate Gaussian noise but are less effective against impulse noise. The conventional TV-based models perform better in removing impulse noise but blur the farmland boundaries. HLAWTV effectively removes noise while maintaining clear texture.
Figure 7. Denoising comparison on the real scenario at band 106 of Indian Pines.
Figure 8. Denoising comparison on the real scenario at band 220 of Indian Pines.
Figure 9 and Figure 10 plot the spectral signatures of the Urban pixel at (185, 132) and the Indian Pines pixel at (125, 20). After denoising by ILRNet, TNN, NGmeet, LRTDTV, TCTV, RCTV, and HNN, the strong spikes in the original spectra are only partially suppressed, and noticeable fluctuations remain in several noisy bands, implying residual noise interference. In contrast, the curves produced by the other methods are much smoother after denoising, indicating that the noise is well suppressed in all bands.
Figure 9. Spectral signatures at pixel (185, 132) in the Urban dataset before and after denoising by different methods.
Figure 10. Spectral signatures at pixel (125,20) in the Indian Pines dataset before and after denoising by different methods.
To evaluate the denoising performance of the various methods on real-noise datasets, we employ the Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) [45] as a no-reference metric to assess the results of different methods on the Urban and Indian Pines datasets. BRISQUE is a perceptual quality assessment metric that evaluates image quality based on natural scene statistics and requires no reference images. A lower BRISQUE score indicates higher image quality. The results are presented in Table 5.
Table 3. Quantitative indices of the PaviaC dataset with different methods under different cases.
Index | Noisy | QRNN3D | ILRNet | TNN | LRTF-DFR | FastHyDe | NGmeet | LRTV | LRTDTV | TCTV | E3DTV | RCTV | ETPTV | HNN | HLAWTV
Case 1: i.i.d. Gaussian Noise (σ = 0.1)
MPSNR | 20.592 | 31.085 | 34.474 | 25.381 | 34.345 | 34.532 | 24.513 | 30.383 | 32.881 | 30.046 | 33.700 | 34.161 | 34.411 | 32.322 | 34.892
MSSIM | 0.416 | 0.913 | 0.928 | 0.628 | 0.926 | 0.925 | 0.489 | 0.814 | 0.884 | 0.818 | 0.909 | 0.918 | 0.924 | 0.932 | 0.937
MSAM | 0.478 | 0.115 | 0.083 | 0.313 | 0.079 | 0.098 | 0.125 | 0.101 | 0.099 | 0.197 | 0.091 | 0.108 | 0.098 | 0.108 | 0.077
Time | 0.000 | 0.155 | 1.494 | 50.793 | 13.338 | 0.197 | 23.126 | 8.913 | 26.564 | 1345.642 | 9.541 | 2.850 | 46.544 | 28.024 | 40.991
Case 2: non-i.i.d. Gaussian Noise (SNR ∈ [1, 15] dB)
MPSNR | 27.494 | 33.246 | 39.197 | 31.966 | 38.270 | 42.050 | 25.067 | 34.101 | 36.090 | 36.089 | 36.176 | 38.644 | 39.787 | 36.932 | 39.997
MSSIM | 0.712 | 0.948 | 0.940 | 0.866 | 0.967 | 0.984 | 0.533 | 0.915 | 0.945 | 0.942 | 0.951 | 0.963 | 0.973 | 0.977 | 0.974
MSAM | 0.319 | 0.107 | 0.065 | 0.180 | 0.066 | 0.052 | 0.119 | 0.095 | 0.081 | 0.116 | 0.073 | 0.089 | 0.066 | 0.071 | 0.069
Time | 0.000 | 0.271 | 1.893 | 50.020 | 13.118 | 0.188 | 22.637 | 8.675 | 26.502 | 1435.799 | 14.595 | 4.676 | 236.784 | 29.841 | 160.112
Case 3: Mixture of Gaussian and Impulse Noise
MPSNR | 11.286 | 29.526 | 29.582 | 21.966 | 30.781 | 21.786 | 19.847 | 28.092 | 30.620 | 27.330 | 30.796 | 30.845 | 32.249 | 30.759 | 33.531
MSSIM | 0.093 | 0.891 | 0.852 | 0.446 | 0.856 | 0.700 | 0.300 | 0.715 | 0.817 | 0.720 | 0.846 | 0.821 | 0.887 | 0.805 | 0.890
MSAM | 0.788 | 0.113 | 0.130 | 0.406 | 0.106 | 0.151 | 0.163 | 0.133 | 0.120 | 0.239 | 0.107 | 0.177 | 0.113 | 0.117 | 0.103
Time | 0.000 | 0.393 | 1.971 | 52.623 | 14.164 | 0.197 | 23.290 | 25.309 | 70.988 | 1508.890 | 13.481 | 4.532 | 228.653 | 30.250 | 153.020
Case 4: Mixed Impulse Noise + Stripes + Deadlines (Impulse Ratio 20%)
MPSNR | 11.492 | 29.397 | 29.392 | 38.970 | 33.944 | 23.166 | 20.798 | 35.587 | 36.822 | 41.094 | 41.600 | 43.634 | 49.014 | 42.903 | 49.884
MSSIM | 0.113 | 0.887 | 0.846 | 0.926 | 0.948 | 0.751 | 0.360 | 0.951 | 0.958 | 0.937 | 0.988 | 0.985 | 0.996 | 0.996 | 0.996
MSAM | 0.799 | 0.117 | 0.134 | 0.206 | 0.078 | 0.144 | 0.156 | 0.092 | 0.083 | 0.200 | 0.045 | 0.051 | 0.034 | 0.043 | 0.033
Time | 0.000 | 0.317 | 1.531 | 85.985 | 23.641 | 0.390 | 60.302 | 27.270 | 158.261 | 2348.431 | 9.611 | 2.811 | 46.316 | 27.931 | 42.589
Case 5: Mixed Impulse Noise + Stripes + Deadlines (Impulse Ratio 50%)
MPSNR | 7.564 | 23.417 | 24.203 | 24.490 | 25.422 | 15.763 | 15.170 | 29.918 | 33.150 | 36.340 | 34.378 | 35.530 | 37.937 | 36.500 | 43.255
MSSIM | 0.027 | 0.709 | 0.593 | 0.603 | 0.811 | 0.401 | 0.122 | 0.854 | 0.921 | 0.933 | 0.951 | 0.932 | 0.976 | 0.985 | 0.986
MSAM | 0.859 | 0.158 | 0.170 | 0.342 | 0.113 | 0.203 | 0.204 | 0.147 | 0.110 | 0.172 | 0.074 | 0.109 | 0.054 | 0.065 | 0.054
Time | 0.000 | 0.378 | 1.569 | 52.503 | 13.587 | 0.185 | 23.167 | 8.553 | 26.311 | 1146.588 | 9.176 | 2.690 | 45.512 | 40.109 | 45.334
Case 6: Mixed Gaussian + Impulse + Stripes + Deadlines
MPSNR | 11.245 | 28.420 | 28.727 | 21.386 | 30.205 | 21.846 | 19.796 | 27.659 | 30.278 | 25.822 | 30.614 | 30.453 | 32.081 | 31.149 | 32.389
MSSIM | 0.091 | 0.871 | 0.825 | 0.422 | 0.847 | 0.692 | 0.282 | 0.700 | 0.810 | 0.669 | 0.841 | 0.810 | 0.884 | 0.879 | 0.888
MSAM | 0.789 | 0.122 | 0.142 | 0.431 | 0.119 | 0.158 | 0.168 | 0.157 | 0.133 | 0.292 | 0.111 | 0.203 | 0.117 | 0.129 | 0.109
Time | 0.000 | 0.281 | 1.559 | 49.781 | 13.102 | 0.190 | 23.073 | 8.782 | 25.923 | 1178.679 | 9.476 | 2.652 | 46.392 | 94.718 | 40.634
Table 4. Quantitative indices of the WDC dataset with different methods under different cases.
Index | Noisy | QRNN3D | ILRNet | TNN | LRTF-DFR | FastHyDe | NGmeet | LRTV | LRTDTV | TCTV | E3DTV | RCTV | ETPTV | HNN | HLAWTV
Case 1: i.i.d. Gaussian Noise (σ = 0.1)
MPSNR | 20.713 | 30.260 | 34.145 | 25.296 | 35.137 | 35.708 | 35.681 | 30.235 | 32.508 | 29.570 | 32.947 | 35.228 | 35.587 | 33.536 | 35.874
MSSIM | 0.407 | 0.877 | 0.933 | 0.615 | 0.953 | 0.956 | 0.955 | 0.837 | 0.904 | 0.807 | 0.922 | 0.950 | 0.956 | 0.933 | 0.957
MSAM | 0.448 | 0.152 | 0.097 | 0.289 | 0.084 | 0.081 | 0.078 | 0.093 | 0.085 | 0.176 | 0.097 | 0.075 | 0.079 | 0.089 | 0.070
Time | 0.000 | 0.727 | 3.275 | 92.092 | 78.912 | 0.701 | 105.312 | 117.485 | 1318.765 | 1303.256 | 53.931 | 13.624 | 1204.346 | 192.072 | 762.212
Case 2: non-i.i.d. Gaussian Noise (SNR ∈ [1, 15] dB)
MPSNR | 28.366 | 31.838 | 38.018 | 32.462 | 37.408 | 42.157 | 34.787 | 33.327 | 34.507 | 35.615 | 35.817 | 38.326 | 39.323 | 37.980 | 39.977
MSSIM | 0.739 | 0.907 | 0.968 | 0.888 | 0.973 | 0.988 | 0.946 | 0.915 | 0.940 | 0.949 | 0.958 | 0.976 | 0.982 | 0.979 | 0.984
MSAM | 0.305 | 0.131 | 0.073 | 0.140 | 0.062 | 0.047 | 0.085 | 0.081 | 0.070 | 0.093 | 0.075 | 0.057 | 0.048 | 0.057 | 0.050
Time | 0.000 | 0.837 | 3.741 | 207.009 | 50.741 | 0.367 | 43.734 | 54.412 | 254.309 | 1246.466 | 58.075 | 14.880 | 1267.650 | 214.052 | 829.846
Case 3: Mixture of Gaussian and Impulse Noise
MPSNR | 11.425 | 28.677 | 30.002 | 23.137 | 33.716 | 22.258 | 19.693 | 28.954 | 31.447 | 27.914 | 31.331 | 33.607 | 34.015 | 32.005 | 34.230
MSSIM | 0.113 | 0.860 | 0.856 | 0.508 | 0.937 | 0.779 | 0.462 | 0.796 | 0.880 | 0.746 | 0.891 | 0.928 | 0.939 | 0.906 | 0.940
MSAM | 0.792 | 0.175 | 0.132 | 0.350 | 0.099 | 0.220 | 0.273 | 0.109 | 0.098 | 0.207 | 0.116 | 0.097 | 0.093 | 0.110 | 0.087
Time | 0.000 | 0.926 | 3.843 | 87.397 | 54.884 | 0.371 | 39.023 | 47.733 | 100.479 | 645.359 | 58.416 | 12.720 | 365.138 | 91.146 | 258.205
Case 4: Mixed Impulse Noise + Stripes + Deadlines (Impulse Ratio 20%)
MPSNR | 11.385 | 28.521 | 29.795 | 34.978 | 35.197 | 22.935 | 20.019 | 34.553 | 34.717 | 36.545 | 38.742 | 42.170 | 43.950 | 43.840 | 44.913
MSSIM | 0.122 | 0.857 | 0.852 | 0.923 | 0.969 | 0.795 | 0.468 | 0.949 | 0.946 | 0.930 | 0.984 | 0.992 | 0.995 | 0.995 | 0.996
MSAM | 0.804 | 0.179 | 0.136 | 0.175 | 0.064 | 0.209 | 0.270 | 0.064 | 0.065 | 0.170 | 0.042 | 0.028 | 0.027 | 0.019 | 0.018
Time | 0.000 | 0.836 | 3.485 | 319.254 | 92.515 | 0.749 | 131.510 | 136.032 | 1168.649 | 951.130 | 65.150 | 16.020 | 870.936 | 85.302 | 274.423
Case 5: Mixed Impulse Noise + Stripes + Deadlines (Impulse Ratio 30%)
MPSNR | 9.659 | 27.126 | 28.538 | 33.323 | 32.859 | 19.720 | 17.881 | 32.936 | 34.343 | 35.744 | 37.420 | 40.843 | 43.460 | 43.710 | 44.770
MSSIM | 0.075 | 0.827 | 0.813 | 0.914 | 0.957 | 0.702 | 0.418 | 0.928 | 0.941 | 0.929 | 0.979 | 0.989 | 0.994 | 0.994 | 0.995
MSAM | 0.851 | 0.193 | 0.151 | 0.176 | 0.077 | 0.262 | 0.302 | 0.084 | 0.069 | 0.167 | 0.050 | 0.030 | 0.028 | 0.020 | 0.020
Time | 0.000 | 0.787 | 3.543 | 765.559 | 94.188 | 0.383 | 41.364 | 52.584 | 116.259 | 847.853 | 65.440 | 12.865 | 681.155 | 83.519 | 308.118
Case 6: Mixed Gaussian + Impulse + Stripes + Deadlines
MPSNR | 11.214 | 28.171 | 29.370 | 22.136 | 32.377 | 21.828 | 21.425 | 28.492 | 30.995 | 26.562 | 30.574 | 32.578 | 33.488 | 29.224 | 33.809
MSSIM | 0.104 | 0.851 | 0.842 | 0.464 | 0.921 | 0.761 | 0.650 | 0.781 | 0.870 | 0.704 | 0.871 | 0.911 | 0.935 | 0.837 | 0.937
MSAM | 0.800 | 0.184 | 0.145 | 0.389 | 0.119 | 0.231 | 0.254 | 0.134 | 0.110 | 0.249 | 0.128 | 0.115 | 0.102 | 0.162 | 0.097
Time | 0.000 | 0.844 | 3.548 | 636.327 | 168.988 | 0.367 | 39.662 | 48.769 | 102.941 | 933.142 | 55.822 | 14.374 | 533.271 | 89.439 | 238.935
Table 5. Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) comparison on the real data.
Index | Noisy | QRNN3D | ILRNet | TNN | LRTF-DFR | FastHyDe | NGmeet | LRTV | LRTDTV | TCTV | E3DTV | RCTV | ETPTV | HNN | HLAWTV
Urban | 72.50 | 39.42 | 36.95 | 47.97 | 31.79 | 34.98 | 38.43 | 35.81 | 55.24 | 48.63 | 36.71 | 36.85 | 36.97 | 50.46 | 28.51
Indian Pines | 93.57 | 31.72 | 25.12 | 27.75 | 34.74 | 45.96 | 61.85 | 27.71 | 28.96 | 64.02 | 24.76 | 35.88 | 35.39 | 57.97 | 23.49

5. Discussion

5.1. Transferring to Other TV Regularizers

In this section, we investigate the portability of the proposed HLAWTV framework by integrating it into several TV-based regularizers. Therefore, three representative baselines with distinct TV formulations are selected as follows: LRTV applied 2-D TV to each band individually. LRTDTV considered pixel-variation distributions along the three tensor modes and assigned mode-specific weights to the corresponding gradient maps. RCTV adopted 2-D TV on representative coefficients of HSIs. For each baseline, we replace its original TV term with the HLAWTV regularization to obtain HLAW-LRTV, HLAW-LRTDTV, and HLAW-RCTV. The solver only needs to make local modifications to the TV minimization subproblem, while the overall optimization framework remains unchanged.
We conducted comparison experiments on the WDC dataset with the same noise settings as in the synthetic data experiments, and the quantitative and visual results are shown in Table 6 and Figure 11. It can be seen that the denoising performance is greatly improved after applying our HLAWTV framework, which indicates that the weighting scheme can be readily ported to other TV regularizers and significantly improves the performance of the original models.
Table 6. Performance comparison of all competing methods on the WDC dataset.
Figure 11. Comparison between original models and HLAW versions on Band 27 of the WDC dataset under case 5.

5.2. Ablation Study

To evaluate the effectiveness of each component in our proposed HLAWTV model, we perform a comprehensive ablation study on the WDC dataset under Case 5. Starting from the baseline LTV model, we progressively incorporate the following two key modules: adaptive weighting (AW) and HLSM. The results are shown in Table 7.
The LTV+AW model enables adaptive adjustment of the sparsity constraint between smooth and textured regions, thereby better preserving textures and edges in the image. However, due to the absence of global gradient sparsity modeling, its performance in terms of structural similarity and spectral accuracy remains suboptimal. In contrast, the LTV+HLSM model enhances global gradient sparsity by fitting a heavy-tailed distribution to the gradients. Yet, it applies uniform penalties across the entire image without distinguishing between smooth and textured regions, leading to a lack of local adaptivity.
Although both AW and HLSM individually lead to slight decreases in SSIM and SAM compared to the baseline, their combination in the HLAWTV model results in the best overall performance across all three metrics (MPSNR, MSSIM, and MSAM). This clearly demonstrates the strong coupling between the following two components: adaptive weights enhance local sensitivity, while the Hyper-Laplacian prior ensures global denoising robustness.
Moreover, applying our proposed adaptive weighting design to the baseline LTV significantly reduces computation time. This confirms the effectiveness of leveraging the conjugacy between the Hyper-Laplacian and Gamma distributions to derive efficient, data-driven weight estimation. The resulting balance between computational efficiency and denoising quality highlights the practical value of our method in real-world applications.
Table 7. Ablation study results.
Method | MPSNR | MSSIM | MSAM | Time (s)
LTV | 43.460 | 0.994 | 0.028 | 681
LTV+AW | 43.901 | 0.983 | 0.032 | 232
LTV+HLSM | 43.923 | 0.985 | 0.026 | 319
HLAWTV | 44.770 | 0.995 | 0.020 | 308

5.3. Parameter Analysis

The robustness of the proposed HLAWTV model to its hyper-parameters λ and rank ( r 1 , r 2 , r 3 ) is evaluated on the simulated data of Case 4. In this case, the denoising quality is monitored with MPSNR and MSSIM while a single parameter is varied and all others are fixed. Each experiment is repeated ten times to obtain statistically reliable curves.
First, we investigated the effect of the regularization parameter λ on denoising performance. As shown in Figure 12a, when λ is small, the HLAWTV regularization plays a minor role in the model, resulting in inadequate noise suppression and thus low MPSNR values with visible residual artifacts remaining in the denoised images. As λ gradually increases, more noise is effectively eliminated, leading to a rapid improvement in MPSNR and optimal preservation of image structures at the peak value. However, further increasing λ causes the regularization to become excessively strong, which in turn leads to over-smoothing and a loss of fine spatial details, causing the MPSNR to gradually decrease. Nevertheless, it is noteworthy that within a certain range of λ , the MPSNR remains consistently high, indicating that the model exhibits acceptable sensitivity to this parameter in practical applications. In summary, λ controls the strength of HLAWTV regularization, and its optimal value can be chosen based on the smoothness of the image as follows: smoother images generally require a larger λ , while images with more texture or detail benefit from a smaller λ .
Figure 12. Sensitivity analysis to parameters λ and r for the PaviaC dataset in case 5. (a) Sensitivity of MPSNR to parameter λ . (b) Sensitivity of MPSNR to parameters ( r 1 , r 2 , r 3 ) .
The factor rank in HLAWTV is specified by a triplet ( r 1 , r 2 , r 3 ), which controls the dimensionality of the low-rank subspace along the two spatial modes and the spectral mode. As shown in Figure 12b, varying these three ranks exhibits a clear pattern: MPSNR increases rapidly as r 1 , r 2 , and r 3 grow from small values, reflecting the enhanced preservation of spatial and spectral structures. Once r 1 and r 2 exceed about 5, the MPSNR plateaus and remains stable over a broad range of higher rank values, indicating that the model is robust to the choice of these parameters within this interval and does not easily overfit.
Overall, these parameter studies demonstrate the robustness of the proposed method and provide clear practical guidelines for selecting parameter settings under various noise conditions.

6. Conclusions

In this article, we proposed the Hyper-Laplacian Adaptive Weighted Total Variation model for hyperspectral image denoising. By integrating a Hyper-Laplacian Scale Mixture prior and leveraging the conjugacy between Gamma and Hyper-Laplacian distributions, our method efficiently computes adaptive weights that reflect the local sparsity and heavy-tailed characteristics of spatial-spectral gradients. The adaptive weighting, shared across all spectral bands at each pixel but varying spatially, enables precise noise suppression and detail preservation. Extensive experiments on both synthetic and real datasets demonstrated that HLAWTV consistently outperforms existing state-of-the-art methods. Furthermore, embedding our adaptive weighting mechanism into conventional total variation frameworks significantly enhances their effectiveness, confirming the broad applicability and robustness of our approach.

Author Contributions

Conceptualization, X.Y. and J.Z.; Methodology, J.Z.; Software, X.Y.; Validation, J.Z., S.F., and T.Z.; Formal analysis, T.Z.; Investigation, L.L.; Resources, J.Z.; Data curation, X.Y.; Writing—original draft preparation, X.Y.; Writing—review and editing, J.Z., S.F., T.Z., L.L., and X.H.; Visualization, T.Z.; Supervision, J.Z.; Project administration, J.Z.; Funding acquisition, X.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China under Grant 62072288; the Natural Science Foundation of Shandong Province under Grant ZR2021MF104 and ZR2021MF113; and the Open Project of the National Key Laboratory of Large Scale Personalized Customization System and Technology under Grant H&C-MPC-2023-02-04.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Huo, Y.; Dong, Y.; Wang, C.; Zhang, M.; Wang, H. Multi-scale memory network with separation training for hyperspectral anomaly detection. Inf. Process. Manag. 2026, 63, 104494. [Google Scholar] [CrossRef]
  2. Bioucas-Dias, J.M.; Plaza, A.; Camps-Valls, G.; Scheunders, P.; Nasrabadi, N.; Chanussot, J. Hyperspectral Remote Sensing Data Analysis and Future Challenges. IEEE Geosci. Remote Sens. Mag. 2013, 1, 6–36. [Google Scholar] [CrossRef]
  3. Shimoni, M.; Haelterman, R.; Perneel, C. Hyperspectral Imaging for Military and Security Applications: Combining Myriad Processing and Sensing Techniques. IEEE Geosci. Remote Sens. Mag. 2019, 7, 101–117. [Google Scholar] [CrossRef]
  4. Ram, B.G.; Oduor, P.; Igathinathane, C.; Howatt, K.; Sun, X. A Systematic Review of Hyperspectral Imaging in Precision Agriculture: Analysis of Its Current State and Future Prospects. Comput. Electron. Agric. 2024, 222, 109037. [Google Scholar] [CrossRef]
  5. Huo, Y.; Qian, X.; Li, C.; Wang, W. Multiple Instance Complementary Detection and Difficulty Evaluation for Weakly Supervised Object Detection in Remote Sensing Images. IEEE Geosci. Remote Sens. Lett. 2023, 20, 6006505. [Google Scholar] [CrossRef]
  6. Xiao, J.-L.; Huang, T.; Deng, L.; Wu, Z.; Vivone, G. A New Context-Aware Details Injection Fidelity with Adaptive Coefficients Estimation for Variational Pansharpening. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5408015. [Google Scholar] [CrossRef]
  7. Dong, W.; Liu, S.; Xiao, S.; Qu, J.; Li, Y. ISPDiff: Interpretable Scale-Propelled Diffusion Model for Hyperspectral Image Super-Resolution. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5519614. [Google Scholar] [CrossRef]
  8. Qu, J.; Liu, X.; Dong, W.; Li, Y.; Meng, D. Progressive Multi-Iteration Registration-Fusion Co-Optimization Network for Unregistered Hyperspectral Image Super-Resolution. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5519814. [Google Scholar] [CrossRef]
  9. Huo, Y.; Cheng, X.; Lin, S.; Zhang, M.; Wang, H. Memory-Augmented Autoencoder With Adaptive Reconstruction and Sample Attribution Mining for Hyperspectral Anomaly Detection. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5518118. [Google Scholar] [CrossRef]
  10. Buades, A.; Coll, B.; Morel, J.-M. A Non-Local Algorithm for Image Denoising. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), Washington, DC, USA, 20–25 June 2005; Volume 2, pp. 60–65. [Google Scholar] [CrossRef]
  11. Dabov, K.; Foi, A.; Katkovnik, V.; Egiazarian, K. Image Denoising by Sparse 3-D Transform-Domain Collaborative Filtering. IEEE Trans. Image Process. 2007, 16, 2080–2095. [Google Scholar] [CrossRef]
  12. Gu, S.; Zhang, L.; Zuo, W.; Feng, X. Weighted Nuclear Norm Minimization with Application to Image Denoising. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA, 23–28 June 2014; pp. 2862–2869. [Google Scholar] [CrossRef]
  13. Maggioni, M.; Katkovnik, V.; Egiazarian, K.; Foi, A. Nonlocal Transform-Domain Filter for Volumetric Data Denoising and Reconstruction. IEEE Trans. Image Process. 2013, 22, 119–133. [Google Scholar] [CrossRef] [PubMed]
  14. Maggioni, M.; Boracchi, G.; Foi, A.; Egiazarian, K. Video Denoising, Deblocking, and Enhancement Through Separable 4-D Nonlocal Spatiotemporal Transforms. IEEE Trans. Image Process. 2012, 21, 3952–3966. [Google Scholar] [CrossRef] [PubMed]
  15. Yuan, Q.; Zhang, Q.; Li, J.; Shen, H.; Zhang, L. Hyperspectral Image Denoising Employing a Spatial–Spectral Deep Residual Convolutional Neural Network. IEEE Trans. Geosci. Remote Sens. 2019, 57, 1205–1218. [Google Scholar] [CrossRef]
  16. Wei, K.; Fu, Y.; Huang, H. 3-D Quasi-Recurrent Neural Network for Hyperspectral Image Denoising. IEEE Trans. Neural Netw. Learn. Syst. 2020, 31, 2613–2627. [Google Scholar] [CrossRef] [PubMed]
  17. Fu, G.; Xiong, F.; Lu, J.; Zhou, J.; Qian, Y. Spatial–Spectral Recurrent Transformer U-Net for Hyperspectral Image Denoising. arXiv 2023, arXiv:2401.03885. [Google Scholar]
  18. Pang, L.; Gu, W.; Cao, X. TRQ3DNet: A 3D Quasi-Recurrent and Transformer Based Network for Hyperspectral Image Denoising. Remote Sens. 2022, 14, 4598. [Google Scholar] [CrossRef]
  19. Ye, J.; Xiong, F.; Zhou, J.; Qian, Y. Iterative Low-Rank Network for Hyperspectral Image Denoising. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5528015. [Google Scholar] [CrossRef]
  20. Zhang, H.; He, W.; Zhang, L.; Shen, H.; Yuan, Q. Hyperspectral Image Restoration Using Low-Rank Matrix Recovery. IEEE Trans. Geosci. Remote Sens. 2014, 52, 4729–4743. [Google Scholar] [CrossRef]
  21. Rudin, L.I.; Osher, S.; Fatemi, E. Nonlinear Total Variation Based Noise Removal Algorithms. Phys. D Nonlinear Phenom. 1992, 60, 259–268. [Google Scholar] [CrossRef]
  22. He, W.; Zhang, H.; Zhang, L.; Shen, H. Total-Variation-Regularized Low-Rank Matrix Factorization for Hyperspectral Image Restoration. IEEE Trans. Geosci. Remote Sens. 2016, 54, 178–188. [Google Scholar] [CrossRef]
  23. Wang, Y.; Peng, J.; Zhao, Q.; Leung, Y.; Zhao, X.; Meng, D. Hyperspectral Image Restoration via Total Variation Regularized Low-Rank Tensor Decomposition. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 1227–1243. [Google Scholar] [CrossRef]
  24. Yang, F.; Chen, X.; Chai, L. Hyperspectral Image Destriping and Denoising Using Stripe and Spectral Low-Rank Matrix Recovery and Global Spatial–Spectral Total Variation. Remote Sens. 2021, 13, 827. [Google Scholar] [CrossRef]
  25. Wang, H.; Peng, J.; Qin, W.; Wang, J.; Meng, D. Guaranteed Tensor Recovery Fused Low-Rankness and Smoothness. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 10990–11007. [Google Scholar] [CrossRef] [PubMed]
  26. Chang, Y.; Yan, L.; Fang, H.; Luo, C. Anisotropic Spectral-Spatial Total Variation Model for Multispectral Remote Sensing Image Destriping. IEEE Trans. Image Process. 2015, 24, 1852–1866. [Google Scholar] [CrossRef] [PubMed]
  27. Cai, W.; Jiang, J.; Ouyang, S. Hyperspectral Image Denoising Using Adaptive Weight Graph Total Variation Regularization and Low-Rank Matrix Recovery. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5. [Google Scholar] [CrossRef]
  28. Chen, Y.; Cao, W.; Pang, L.; Peng, J.; Cao, X. Hyperspectral Image Denoising via Texture-Preserved Total Variation Regularizer. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5516114. [Google Scholar] [CrossRef]
  29. Li, D.; Chu, D.; Guan, X.; He, W.; Shen, H. Adaptive Hyper-Laplacian Regularized Low-Rank Tensor Decomposition for Hyperspectral Image Denoising and Destriping. arXiv 2024, arXiv:2401.05682. [Google Scholar]
  30. Xu, S.; Qiao, K.; Peng, J.; Zhao, Z. Hyperspectral Image Denoising by Low-Rank Models with Hyper-Laplacian Total Variation Prior. Signal Process. 2024, 201, 108733. [Google Scholar] [CrossRef]
  31. Peng, J.; Xie, Q.; Zhao, Q.; Wang, Y.; Leung, Y.; Meng, D. Enhanced 3DTV Regularization and Its Applications on HSI Denoising and Compressed Sensing. IEEE Trans. Image Process. 2020, 29, 7889–7903. [Google Scholar] [CrossRef]
  32. Peng, J.; Wang, H.; Cao, X.; Liu, X.; Rui, X.; Meng, D. Fast Noise Removal in Hyperspectral Images via Representative Coefficient Total Variation. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5546017. [Google Scholar] [CrossRef]
  33. Candès, E.J.; Li, X.; Ma, Y.; Wright, J. Robust Principal Component Analysis? J. ACM 2011, 58, 1–37. [Google Scholar] [CrossRef]
  34. Fan, H.; Chen, Y.; Guo, Y.; Zhang, H.; Kuang, G. Hyperspectral Image Restoration Using Low-Rank Tensor Recovery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 10, 4589–4604. [Google Scholar] [CrossRef]
  35. Lu, C.; Feng, J.; Chen, Y.; Liu, W.; Lin, Z.; Yan, S. Tensor Robust Principal Component Analysis with a New Tensor Nuclear Norm. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 42, 925–938. [Google Scholar] [CrossRef] [PubMed]
  36. Xue, J.; Zhao, Y.; Bu, Y.; Chan, J.C.; Kong, S.G. When Laplacian Scale Mixture Meets Three-Layer Transform: A Parametric Tensor Sparsity for Tensor Completion. IEEE Trans. Cybern. 2022, 52, 13887–13901. [Google Scholar] [CrossRef]
  37. Garrigues, P.J.; Olshausen, B.A. Group Sparse Coding with a Laplacian Scale Mixture Prior. Adv. Neural Inf. Process. Syst. 2010, 23, 676–684. [Google Scholar]
  38. Zuo, W.; Meng, D.; Zhang, L.; Feng, X.; Zhang, D. A Generalized Iterated Shrinkage Algorithm for Non-Convex Sparse Coding. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, 1–8 December 2013; pp. 217–224. [Google Scholar] [CrossRef]
  39. Zheng, Y.-B.; Huang, T.; Zhao, X.; Chen, Y.; He, W. Double-Factor-Regularized Low-Rank Tensor Factorization for Mixed Noise Removal in Hyperspectral Image. IEEE Trans. Geosci. Remote Sens. 2020, 58, 8450–8464. [Google Scholar] [CrossRef]
  40. Xu, S.; Yu, C.; Peng, J.; Chen, S.; Cao, X.; Meng, D. Haar Nuclear Norms With Applications to Remote Sensing Imagery Restoration. IEEE Trans. Image Process. 2025, 34, 6879–6894. [Google Scholar] [CrossRef]
  41. Zhuang, L.; Bioucas-Dias, J.M. Fast Hyperspectral Image Denoising and Inpainting Based on Low-Rank and Sparse Representations. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 730–742. [Google Scholar] [CrossRef]
  42. He, W.; Yao, Q.; Li, C.; Yokoya, N.; Zhao, Q. Non-local Meets Global: An Integrated Paradigm for Hyperspectral Denoising. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 6868–6877. [Google Scholar] [CrossRef]
  43. Huynh-Thu, Q.; Ghanbari, M. Scope of Validity of PSNR in Image/Video Quality Assessment. Electron. Lett. 2008, 44, 800. [Google Scholar] [CrossRef]
  44. Wang, Z.; Bovik, A.; Sheikh, H.; Simoncelli, E. Image Quality Assessment: From Error Visibility to Structural Similarity. IEEE Trans. Image Process. 2004, 13, 600–612. [Google Scholar] [CrossRef]
  45. Mittal, A.; Moorthy, A.K.; Bovik, A.C. Blind/Referenceless Image Spatial Quality Evaluator. In Proceedings of the 2011 Conference Record of the Forty Fifth Asilomar Conference on Signals, Systems and Computers (ASILOMAR), Pacific Grove, CA, USA, 6–9 November 2011; pp. 723–727. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
