A Multi-Scale Edge-Preserving Decomposition and Fusion Framework for Multi-Polarization Passive Millimeter-Wave Imaging

Chen, Xinpeng; Hu, Fei; Zhu, Dong; Su, Jinlong; Fang, Bo; Tao, Jingyu

doi:10.3390/s26113577

Open Access Article

by Xinpeng Chen ¹, Fei Hu ¹,²,*, Dong Zhu ¹,², Jinlong Su ¹,², Bo Fang ¹ and Jingyu Tao ¹

¹ School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan 430074, China

² National Key Laboratory of Science and Technology on Multispectral Information Processing, Huazhong University of Science and Technology, Wuhan 430074, China

* Author to whom correspondence should be addressed.

Sensors 2026, 26(11), 3577; https://doi.org/10.3390/s26113577

Submission received: 6 May 2026 / Revised: 1 June 2026 / Accepted: 2 June 2026 / Published: 4 June 2026

(This article belongs to the Special Issue Advanced Non-Invasive Sensors: Methods and Applications—2nd Edition)

Highlights

What are the main findings?

A hierarchical GWACF decomposition and layer-specific fusion strategy significantly enhance concealed object contrast in multi-polarization PMMW images while preserving edge integrity.
Experimental results confirm superior detection accuracy and robustness over mainstream methods, demonstrating reliable concealed object imaging on the human body.

What are the implications of the main findings?

The significant boost in concealed-object contrast and edge fidelity reduces both false alarms and missed detections, establishing multi-polarization PMMW imaging as a robust, privacy-safe solution for operational security screening.
The proposed hierarchical decomposition and fusion framework provides a robust, generalizable strategy for fully exploiting multi-polarization information, establishing a new reference for processing polarimetric passive imagery in security and remote sensing applications.

Abstract

Passive millimeter-wave (PMMW) imaging is a highly promising, privacy-preserving technology for human body security inspection. However, most existing methods rely on single-pixel and single-polarization processing, which often leads to discrete false-alarm pixels or missed detections in practice. Although multi-polarization information provides richer distinguishing features, current methods typically depend on a limited set of Stokes parameters or handcrafted polarization features and lack a systematic framework for fully exploiting multi-polarization information. In this paper, we propose a multi-scale edge-preserving decomposition model, termed Gaussian and weighted average curvature filtering (GWACF), that hierarchically decomposes a multi-polarization PMMW image into three structural layers: the base structural (BS) layer, the coarse structural (CS) layer, and the fine structural (FS) layer. We also propose a fusion strategy in which a gradient-domain pulse-coupled neural network (PCNN) fuses the texture-rich CS and FS layers, while an energy attribute fusion method is applied to the BS layer, where primary structure and background information dominate. The method leverages complementary polarimetric information without introducing artifacts or compromising edge sharpness. Experimental results demonstrate that the proposed method enhances the brightness temperature (BT) contrast of concealed objects and, compared with existing mainstream methods, offers notable advantages in both detection accuracy and robustness.

Keywords:

PMMW imaging; multi-polarization fusion; personnel security inspection; object detection

1. Introduction

Against the backdrop of an increasingly complex global security landscape and the public’s growing demand for safe travel, there is a pressing need for efficient, reliable, and privacy-respecting personal security screening technologies, which has become a key focus for both academia and industry [1,2,3]. Conventional metal detectors and handheld scanners are limited to identifying metallic objects, while being ineffective against non-metallic threats such as ceramic knives, plastic explosives, and liquid hazardous materials. Although active millimeter-wave imaging can penetrate clothing to reveal a human body outline [4,5], its use of transmitted electromagnetic waves has raised public concerns regarding potential health effects and personal privacy. Within this context, passive millimeter-wave (PMMW) imaging technology has gained prominence due to its distinct advantages. As a passive detection system, it only receives millimeter-wave signals radiated by the human body and concealed objects themselves, without the need for any form of electromagnetic wave emission [6,7]. Thus, it completely eliminates health risks and privacy violations at the physical level and is hailed as one of the ideal solutions for the next generation of human body security checks.

However, the practical application of PMMW imaging technology still faces a fundamental bottleneck: the low radiation brightness temperature (BT) contrast between the target and the background. Under indoor ambient temperature conditions, many non-metallic contraband objects exhibit radiation characteristics highly similar to those of human skin, resulting in extremely weak BT differences from the background in the final images, along with a low signal-to-noise ratio (SNR), which makes reliable detection and identification challenging [7,8]. Moreover, factors such as the material and thickness of clothing, fluctuations in environmental temperature, and the complex curvature of the human body further exacerbate image degradation [9,10], significantly limiting the detection performance and practical utility of traditional single-intensity imaging methods. To break through this bottleneck, researchers have turned their attention to another fundamental physical property of electromagnetic waves: polarization. Polarization information reveals the directional and structural characteristics of a target during the process of radiating or scattering electromagnetic waves, offering key insights that intensity images alone cannot provide [11]. Therefore, through polarimetric measurements, it becomes possible to extract material and structural “fingerprints” of concealed objects, enabling them to be more clearly distinguished from human skin amidst complex backgrounds.

Current techniques primarily rely on single-polarization images and single-pixel analysis modes. The information provided by a single polarization method is relatively limited, whereas multi-polarization techniques enable the acquisition of richer information [12,13]. Numerous studies have demonstrated the potential of multi-polarization sensing and processing across various applications, such as multi-polarization image enhancement [14,15], target segmentation [16,17], reflection interference suppression [18,19], material type identification [17,20], and wind direction/speed measurement [21,22]. Meanwhile, single-pixel processing methods often result in numerous discrete false positives or missed detections, thereby adversely affecting target detection and segmentation tasks. Consequently, many researchers advocate the adoption of multi-polarization imaging techniques to expand the dimensionality of BT data, enabling the extraction of richer scene information and enhanced detection performance. Previous studies have indicated that in-depth exploration and utilization of polarization information can significantly improve the detection capability of PMMW imaging [6,23]. To this end, parameters such as the degree of linear polarization (DoLP) [23], passive degree of polarization (PDoP) [11], linear polarization ratio (LPR) [16], and angle of polarization (AoP) [24] have been proposed as quantitative features for target detection in diverse scenarios, serving to measure the extent of target polarization. Such methods predominantly rely on polarization degree information for image segmentation to achieve target recognition. However, the polarization state is influenced not only by the target’s material composition but also by variations in target geometry and incidence angles, both of which can alter the observed polarimetric characteristics. 
These factors may lead to missed detections or false positives in polarization-based detection methods under specific conditions. Furthermore, the effective fusion of multi-dimensional polarization features with radiative intensity, spatial texture, and other relevant attributes is essential for constructing a robust detection system with low false alarm rates, representing a crucial step toward transitioning this technology from laboratory research to practical application.

To address the aforementioned challenges, this paper innovatively proposes a multi-polarization image fusion framework that combines multi-scale edge-preserving decomposition based on Gaussian and weighted average curvature filtering (GWACF) with a gradient-domain pulse-coupled neural network (PCNN). Unlike existing methods that mainly rely on handcrafted features such as Stokes parameters, simple polarization difference, or polarization ratio, our method, for the first time, cascades a weighted average curvature filter (WACF) with Gaussian filtering (GF) to hierarchically decompose an image into a base structural layer (BS), a coarse structural layer (CS), and a fine structural layer (FS). The uniqueness of this decomposition lies in the fact that WACF can simultaneously preserve edges and suppress noise, while the GF progressively strips away structural information at different scales, thus enabling each layer to carry complementary polarization features. Regarding the fusion strategy, we design distinct fusion rules for different layers: for the texture-rich CS and FS layers, a dual-channel PCNN modulated by multi-scale morphological gradient (MSMG) is adopted; for the BS layer that carries the dominant background, an energy-attribute weighted fusion scheme is used. This overall “decomposition–layering–adaptive fusion” pipeline has not been reported in the field of passive millimeter-wave (PMMW) multi-polarization image processing. In comparison, although the sub-region polarization fusion method [25] also exploits multi-polarization information, its fusion is primarily performed on sub-blocks of Stokes parameter images, lacking multi-scale separation of image structures. The Fisher vector polarization method [7] focuses on enhancing region classification through Fisher encoding, which essentially remains a feature-level post-processing rather than hierarchical fusion prior to image reconstruction. 
Deep learning-based methods [26] primarily rely on convolutional neural networks or Transformers for end-to-end object detection, and they have indeed achieved remarkable progress in performance. However, these methods typically require large amounts of annotated data and suffer from limited interpretability. In contrast, our method is a physics-driven image fusion method that does not rely on training data, thus offering better generalization and interpretability. Through edge-preserving decomposition and layer-wise differentiated fusion, it is physically more consistent with the millimeter-wave polarization scattering mechanism. The main contributions of this paper are:

(1): This paper designs a multi-scale edge-preserving decomposition model, which integrates Gaussian filtering with weighted average curvature filtering, termed GWACF. The model is designed to hierarchically represent multi-polarization PMMW images by decomposing them into three structural layers: the FS layer, the BS layer, and the CS layer. The FS layer and the CS layer jointly preserve the rich texture and specific structural information of the image, while the BS layer serves as the bottom-level expression of the image, capturing core elements such as primary structure and background content.
(2): Because the FS and CS layers preserve the texture details and partial structural features of the image, and the gradient-domain PCNN method performs well in fusing fine-textured images, it is employed for the fusion of these two layers. The BS layer contains the primary background information and overall structural contours of the image, so the fusion quality at this level directly determines the overall quality of the final image. Because the energy attribute (EA) fusion method performs well on images with asymmetric information distribution and is particularly suitable for coarse approximation layers, it is applied to the fusion of the BS layer.
(3): The proposed method was validated through multi-polarization PMMW imaging detection experiments. Experimental results demonstrate that the fusion method produces high-quality and robust imaging of concealed objects on the human body. Qualitative and quantitative comparisons with several competitive baseline methods further highlight its performance advantages.

The remainder of this paper is organized as follows. Section 2 presents the necessary preliminaries. The proposed fusion method is described in Section 3. Experimental results and performance analysis are provided in Section 4 to validate the effectiveness of our method. Finally, conclusions are given in Section 5.

2. Preliminaries

2.1. Analysis of the PMMW Imaging Model

The PMMW imaging security system detects concealed dangerous objects under a person’s clothing by measuring the effective radiation temperature distribution within the target scene. In practical experimental tests, the PMMW radiation temperature data received by the imaging system is the combined effect of the testing environment, the human body, and the target under examination.

As shown in Figure 1a, the radiation model for the case where no concealed objects are carried on the human body can be derived as follows:

T_{E c} = e_{c} T_{c} + r_{c} T_{a} + t_{c} T_{E b c}

(1)

T_{E b c} = e_{b} T_{b} + r_{b} T_{E c} + t_{b} T_{a}

(2)

where T_Ec denotes the effective radiation temperature in front of the clothing, T_c is the absolute temperature of the clothing, and T_a represents the absolute temperature of free space; e_c, r_c, and t_c represent the emissivity, reflectivity, and transmissivity of the clothing, respectively. T_Ebc denotes the effective radiation temperature between the skin and the clothing, while T_b is the absolute temperature of the human body; e_b, r_b, and t_b represent the emissivity, reflectivity, and transmissivity of the human body, respectively.

By substituting Equation (2) into Equation (1) and solving for T_Ec, the effective radiation temperature obtained by the PMMW imaging system for the front clothing area of the human body is:

T_{E c} = \frac{e_{c} T_{c} + r_{c} T_{a} + t_{c} (e_{b} T_{b} + t_{b} T_{a})}{1 - t_{c} r_{b}}

(3)

In the scenario of a concealed object carried on the human body, as illustrated in Figure 1b, the radiation model is revised to the following:

T_{E c} = e_{c} T_{c} + r_{c} T_{a} + t_{c} T_{E o b}

(4)

T_{E o b} = e_{o b} T_{o b} + r_{o b} T_{E c} + t_{o b} T_{E b c}

(5)

T_{E b c} = e_{b} T_{b} + r_{b} T_{E o b} + t_{b} T_{a}

(6)

where T_Eob is the effective radiation temperature in front of the concealed object; e_ob, r_ob, and t_ob represent the emissivity, reflectivity, and transmissivity of the concealed object, respectively; and T_ob denotes the absolute temperature of the concealed object.

Furthermore, substituting Equations (5) and (6) into Equation (4) and solving for T_Ec gives the counterpart of Equation (3) for the concealed-object case:

T_{E c} = \frac{(e_{c} T_{c} + r_{c} T_{a}) (1 - t_{o b} r_{b}) + t_{c} [e_{o b} T_{o b} + t_{o b} (e_{b} T_{b} + t_{b} T_{a})]}{1 - t_{c} r_{o b} - t_{o b} r_{b}}

(7)
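The two radiation models can be checked numerically. The sketch below evaluates Equation (3) directly and, for the concealed-object case, solves Equations (4) to (6) as a 3 × 3 linear system in (T_Ec, T_Eob, T_Ebc) rather than relying on a transcribed closed form. The function names and all material parameters are illustrative assumptions, not values from the experiments.

```python
import numpy as np

def t_ec_no_object(p):
    """Closed form of Equation (3): no concealed object."""
    num = (p["e_c"] * p["T_c"] + p["r_c"] * p["T_a"]
           + p["t_c"] * (p["e_b"] * p["T_b"] + p["t_b"] * p["T_a"]))
    return num / (1.0 - p["t_c"] * p["r_b"])

def t_ec_with_object(p, e_ob, r_ob, t_ob, T_ob):
    """Solve Equations (4)-(6) for (T_Ec, T_Eob, T_Ebc) and return T_Ec."""
    A = np.array([
        [1.0,   -p["t_c"], 0.0],    # Eq. (4): T_Ec - t_c*T_Eob = e_c*T_c + r_c*T_a
        [-r_ob, 1.0,       -t_ob],  # Eq. (5): -r_ob*T_Ec + T_Eob - t_ob*T_Ebc = e_ob*T_ob
        [0.0,   -p["r_b"], 1.0],    # Eq. (6): -r_b*T_Eob + T_Ebc = e_b*T_b + t_b*T_a
    ])
    rhs = np.array([
        p["e_c"] * p["T_c"] + p["r_c"] * p["T_a"],
        e_ob * T_ob,
        p["e_b"] * p["T_b"] + p["t_b"] * p["T_a"],
    ])
    return np.linalg.solve(A, rhs)[0]

# Illustrative values only (assumptions, not measured data).
# Each (emissivity, reflectivity, transmissivity) triple sums to 1; temperatures in kelvin.
params = dict(e_c=0.05, r_c=0.05, t_c=0.90,
              e_b=0.65, r_b=0.35, t_b=0.0,
              T_c=303.0, T_b=310.0, T_a=298.0)

T_skin = t_ec_no_object(params)
# Hypothetical metallic object: fully reflective, no emission or transmission.
T_metal = t_ec_with_object(params, e_ob=0.0, r_ob=1.0, t_ob=0.0, T_ob=305.0)
print(f"no object: {T_skin:.1f} K, metal object: {T_metal:.1f} K, "
      f"contrast: {T_skin - T_metal:.1f} K")
```

Sweeping `e_ob`, `r_ob`, and `t_ob` in this sketch gives a quick feel for how weak the contrast becomes for objects whose radiometric properties approach those of skin.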

2.2. Pulse-Coupled Neural Network

The pulse-coupled neural network (PCNN), as a third-generation artificial neural network, offers the key advantage of requiring no parameter training. Decades of development have demonstrated its significant potential in various fields, including image denoising [27,28], image segmentation [29,30], target detection [31,32], and image fusion [33,34].

A PCNN neuron (Figure 2) consists of three components: the receptive, modulation, and pulse generation fields, responsible for signal reception, internal state computation, and output, respectively. The receptive field is divided into two parts: the feeding input and the connection input. The neuron communicates with its neighboring neurons via synaptic weights M and W, while receiving external input stimuli I. Each neuron retains its previous state, which decays over time.

F_{i j} (n) = e^{- α_{F}} F_{i j} (n - 1) + I_{i j} + V_{F} \sum_{k l} M_{i j k l} Y_{k l} (n - 1)

(8)

L_{i j} (n) = e^{- α_{L}} L_{i j} (n - 1) + V_{L} \sum_{k l} W_{i j k l} Y_{k l} (n - 1)

(9)

where F_ij denotes the feeding input of the neuron at position (i, j) in the two-dimensional neural array, n is the iteration index, and Y_kl(n − 1) is the output of the neighboring neuron (k, l) at the previous iteration. L_ij refers to the corresponding connection input. The feeding input, the connection input, and the dynamic threshold all possess memory, with their state values decaying over time. The parameters α_F and α_L are the temporal decay constants of the feeding and connection inputs, respectively, governing their decay rates. The terms V_F and V_L denote the amplification coefficients of the connection weights for the feeding and connection inputs; they help to prevent saturation of the neuronal output and are usually normalized to constants.

By coupling F_ij and L_ij, the internal activity U_ij of the neuron is constituted, where β denotes the coupling strength. The mathematical expression for the internal activity is given by:

U_{i j} (n) = F_{i j} (n) (1 + β L_{i j} (n))

(10)

The neuron output Y_ij is generated by comparing the internal activity U_ij of the neuron with its dynamic threshold θ_ij.

Y_{i j} (n) = \begin{cases} 1, & U_{i j} (n) > θ_{i j} (n) \\ 0, & \text{otherwise} \end{cases}

(11)

The dynamic threshold θ_ij is variable: it increases sharply when the neuron fires (U_ij > θ_ij) and then gradually decays until the neuron fires again. This process is described by Equation (12):

θ_{i j} (n) = e^{- α_{θ}} θ_{i j} (n - 1) + V_{θ} Y_{i j} (n)

(12)

where α_θ is the time constant of decay for the dynamic threshold θ_ij, and V_θ denotes the amplification factor associated with it.
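Equations (8) to (12) translate almost line by line into array operations. The following is a minimal sketch of one PCNN iteration, assuming 3 × 3 reciprocal-distance kernels for M and W and arbitrary parameter values (none of these are specified in this section). Because Equations (11) and (12) both refer to iteration n, the sketch compares U(n) with the threshold carried over from the previous iteration and then applies Equation (12); that ordering is also an assumption.

```python
import numpy as np
from scipy.ndimage import convolve

# 3x3 synaptic weights: reciprocal distance to each neighbour, zero at the centre.
# The kernels M and W are not specified here; this common choice is an assumption.
_d = 1.0 / np.sqrt(2.0)
KERNEL = np.array([[_d, 1.0, _d], [1.0, 0.0, 1.0], [_d, 1.0, _d]])

def pcnn_step(I, F, L, theta, Y, alpha_F=0.1, alpha_L=1.0, alpha_theta=0.2,
              V_F=0.5, V_L=0.2, V_theta=20.0, beta=0.1, M=KERNEL, W=KERNEL):
    """Advance a PCNN by one iteration; all parameter defaults are illustrative."""
    F = np.exp(-alpha_F) * F + I + V_F * convolve(Y, M, mode="constant")  # Eq. (8)
    L = np.exp(-alpha_L) * L + V_L * convolve(Y, W, mode="constant")      # Eq. (9)
    U = F * (1.0 + beta * L)                                              # Eq. (10)
    Y_new = (U > theta).astype(float)                                     # Eq. (11)
    theta = np.exp(-alpha_theta) * theta + V_theta * Y_new                # Eq. (12)
    return F, L, theta, Y_new

def run_pcnn(I, n_iter=10):
    """Iterate the network from a zero state and count how often each neuron fires."""
    I = np.asarray(I, dtype=float)
    F = np.zeros_like(I)
    L = np.zeros_like(I)
    theta = np.zeros_like(I)
    Y = np.zeros_like(I)
    fire_count = np.zeros_like(I)
    for _ in range(n_iter):
        F, L, theta, Y = pcnn_step(I, F, L, theta, Y)
        fire_count += Y
    return fire_count
```

Calling `run_pcnn` on a normalized image returns a firing map in which brighter, better-connected pixels tend to fire more often; no training step is involved, which is the property the text highlights.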

3. Methodology

This section systematically presents the proposed multi-polarization PMMW image fusion method, whose overall workflow is depicted in Figure 3. The framework sequentially comprises the following five core modules: GWACF-based multi-scale decomposition, fusion of fine structural (FS) layer, fusion of coarse structural (CS) layer, fusion of base structural (BS) layer, and final image reconstruction.

The process begins by acquiring observational data at four linear polarization angles (0°, 45°, 90°, and 135°), which are recorded as T_B₀, T_B₄₅, T_B₉₀, and T_B₁₃₅, respectively. These polarization measurements are then linearly averaged to yield two components, T_A and T_B, defined as follows.

T_{A} = \frac{T_{B 0} + T_{B 90}}{2}

(13)

T_{B} = \frac{T_{B 45} + T_{B 135}}{2}

(14)

Subsequently, the GWACF decomposition algorithm is employed to decompose the T_A and T_B images into three distinct layers: the FS layer, CS layer, and BS layer. The corresponding BS layers are fused using an energy-attribute-based strategy, whereas the FS and CS layers are fused using a gradient-domain PCNN approach. The final fused image is reconstructed by combining the fused layers.

From the perspective of physical rationality, the pairing strategy adopted in this paper—(T_B₀ and T_B₉₀) as one group, and (T_B₄₅ and T_B₁₃₅) as the other—exhibits clear completeness in the polarization basis. Each group forms an orthogonal polarization basis, enabling comprehensive characterization of any linear polarization state. These two groups are sensitive to the structural and dielectric properties of the target along the horizontal/vertical and diagonal directions, respectively, thereby providing complementary polarization signatures. Moreover, this pairing effectively suppresses the strong dependence of a single polarization channel on target orientation, significantly enhancing the robustness of concealed target detection while reducing redundancy and noise interference under the premise of preserving complementary polarization information.

As for the direct fusion of all four polarization channels, although theoretically feasible, it suffers from notable limitations. First, there exists informational redundancy among the four polarization angles; directly fusing them introduces substantial repetitive information into the fusion network, which may lead to feature competition and unstable fusion decisions. Second, simultaneously feeding all four polarization channels into the fusion model greatly increases computational complexity and may introduce nonlinear inter-channel interference, resulting in artifacts or edge blurring in the fused image.

In contrast, the proposed method first constructs two complementary components, T_A and T_B, which physically mitigate signal fluctuations caused by target orientation while preserving polarization anisotropy information. This provides a more robust input for subsequent edge-preserving decomposition and hierarchical fusion. Therefore, direct fusion of all four channels is not considered in this paper.

3.1. GWACF Multi-Scale Decomposition

For a 2D image T, its weighted average curvature filtering (WACF) [35] can be expressed as:

\begin{aligned} G^{w} (T) &= k {‖\nabla T‖}_{2} G (T) \\ &= {‖\nabla T‖}_{2} \left(\nabla \cdot \frac{\nabla T}{{‖\nabla T‖}_{2}}\right) \end{aligned}

(15)

where ∇· and ∇ denote the divergence operator and the gradient operator, respectively, G(T) is the average curvature of T, and k is the dimension of the image domain.

In the case of k = 2, the expression of Equation (15) simplifies to:

G^{w} (T) = Δ T - \frac{T_{x}^{2} T_{x x} + 2 T_{x} T_{y} T_{x y} + T_{y}^{2} T_{y y}}{T_{x}^{2} + T_{y}^{2}}

(16)

where Δ is the isotropic Laplacian operator. T_x and T_y are the first-order partial derivatives along the x and y directions, respectively. T_xx, T_xy and T_yy represent the corresponding second-order partial derivatives, respectively.

Within a 3 × 3 window, eight normalized half-window directions are considered, each with a corresponding kernel:

\begin{array}{l} f_{1} = [\begin{matrix} 0 & 0 & 0 \\ 1 / 6 & - 1 & 1 / 6 \\ 1 / 6 & 1 / 3 & 1 / 6 \end{matrix}], f_{2} = [\begin{matrix} 0 & 1 / 6 & 1 / 6 \\ 0 & - 1 & 1 / 3 \\ 0 & 1 / 6 & 1 / 6 \end{matrix}], \\ f_{3} = [\begin{matrix} 1 / 6 & 1 / 3 & 1 / 6 \\ 1 / 6 & - 1 & 1 / 6 \\ 0 & 0 & 0 \end{matrix}], f_{4} = [\begin{matrix} 1 / 6 & 1 / 6 & 0 \\ 1 / 3 & - 1 & 0 \\ 1 / 6 & 1 / 6 & 0 \end{matrix}], \\ f_{5} = [\begin{matrix} 1 / 12 & 0 & 0 \\ 1 / 3 & - 1 & 0 \\ 1 / 6 & 1 / 3 & 1 / 12 \end{matrix}], f_{6} = [\begin{matrix} 0 & 0 & 1 / 12 \\ 0 & - 1 & 1 / 3 \\ 1 / 12 & 1 / 3 & 1 / 6 \end{matrix}], \\ f_{7} = [\begin{matrix} 1 / 12 & 1 / 3 & 1 / 6 \\ 0 & - 1 & 1 / 3 \\ 0 & 0 & 1 / 12 \end{matrix}], f_{8} = [\begin{matrix} 1 / 6 & 1 / 3 & 1 / 12 \\ 1 / 3 & - 1 & 0 \\ 1 / 12 & 0 & 0 \end{matrix}] . \end{array}

(17)

Furthermore, the eight distance values r_i can be computed using the eight kernels, respectively.

r_{i} = f_{i} * T, i = 1, 2, \dots, 8

(18)

where * denotes the convolution operation. The discrete form of Equation (16) can be expressed as:

G^{w} \approx r_{h}, h = argmin \{|r_{i}|; i = 1, 2, \dots, 8\}

(19)

For the purpose of analysis, the WACF procedure is represented as:

T_{o u t} = WACF (T_{i n})

(20)

where T_out and T_in represent the filtered output image and the original input image before filtering, respectively. WACF(·) denotes the WACF operation.
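The discrete scheme of Equations (17) to (19) can be sketched as follows. The kernels f_1 and f_5 are copied from Equation (17) and the remaining six are generated by 90° rotations, which reproduces f_2 to f_4 and f_6 to f_8. How G^w is turned into the filtered image is not spelled out in this section; the sketch assumes the additive update T ← T + r_h repeated for a hypothetical number of iterations `n_iter`.

```python
import numpy as np
from scipy.ndimage import convolve

def wacf_kernels():
    """Return the eight half-window kernels of Equation (17)."""
    f1 = np.array([[0, 0, 0], [1/6, -1, 1/6], [1/6, 1/3, 1/6]])
    f5 = np.array([[1/12, 0, 0], [1/3, -1, 0], [1/6, 1/3, 1/12]])
    # Counter-clockwise rotations of f1 give f2, f3, f4; rotations of f5 give f6, f7, f8.
    return [np.rot90(f1, k) for k in range(4)] + [np.rot90(f5, k) for k in range(4)]

def wacf(T, n_iter=5):
    """Weighted average curvature filtering, assuming the update T <- T + r_h."""
    T = np.asarray(T, dtype=float).copy()
    kernels = wacf_kernels()
    for _ in range(n_iter):
        # Eq. (18): one response per kernel, stacked to shape (8, H, W).
        # The kernel set is closed under 180-degree rotation, so convolution and
        # correlation yield the same set of responses.
        r = np.stack([convolve(T, f, mode="nearest") for f in kernels])
        # Eq. (19): keep the response with the smallest magnitude at each pixel.
        h = np.argmin(np.abs(r), axis=0)
        r_h = np.take_along_axis(r, h[None], axis=0)[0]
        T = T + r_h
    return T
```

Every kernel sums to zero, so flat regions are left untouched, and an ideal step edge is preserved because at least one half-window lies entirely on one side of the edge. This is the edge-preserving behaviour the decomposition relies on.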

Based on this, this paper proposes an image decomposition method that integrates Gaussian filtering with weighted average curvature filtering, namely the GWACF method, whose overall workflow is illustrated in Figure 4. T_in refers to the input image, $T_{GF}^{m}$ (m = 1, 2, 3) denotes the resultant image after the m-th Gaussian filtering operation, and $T_{W}^{m}$ is the result of the m-th WACF operation. The decomposed layers $T^{FSm}$, $T^{BS}$ and $T^{CSm}$ are then given by the following equations:

T^{F S m} = \begin{cases} T_{i n} - T_{W}^{m}, & m = 1 \\ T_{G F}^{m - 1} - T_{W}^{m}, & m = 2, 3 \end{cases}

(21)

T^{B S} = T_{G F}^{3}

(22)

T^{C S m} = T_{W}^{m} - T_{G F}^{m}

(23)

where $T_{GF}^{m}$ and $T_{W}^{m}$ are calculated as follows:

T_{G F}^{m} = \begin{cases} G F (T_{i n}), & m = 1 \\ G F (T_{G F}^{m - 1}), & m = 2, 3 \end{cases}

(24)

T_{W}^{m} = \begin{cases} W A C F (T_{i n}), & m = 1 \\ W A C F (T_{G F}^{m - 1}), & m = 2, 3 \end{cases}

(25)

where the symbol GF(·) denotes the Gaussian filtering operator.

After decomposition via the GWACF operator, T_in can be expressed as the sum of the three types of structural layers:

T_{in} = \sum_{m = 1}^{3} (T^{F S m} + T^{C S m}) + T^{B S}

(26)

Therefore, T_A and T_B in Figure 3 can be decomposed into $T_{A}^{FSm}$, $T_{B}^{FSm}$, $T_{A}^{BS}$, $T_{B}^{BS}$, $T_{A}^{CSm}$ and $T_{B}^{CSm}$ through Equations (21) to (23), respectively.
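The cascade in Equations (21) to (25) and the reconstruction identity in Equation (26) can be sketched as below. The Gaussian standard deviation `sigma` is a hypothetical value, since the filter parameters are not given in this section, and a 3 × 3 median filter is used only as a stand-in when no WACF implementation is passed in. The identity in Equation (26) holds for any choice of the two filters because the intermediate terms cancel pairwise.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, median_filter

def gwacf_decompose(T_in, wacf=None, sigma=2.0, levels=3):
    """Split an image into FS layers, CS layers, and one BS layer (Eqs. 21-25)."""
    if wacf is None:
        # Stand-in edge-preserving smoother (assumption); pass a real WACF in practice.
        wacf = lambda x: median_filter(x, size=3)
    prev = np.asarray(T_in, dtype=float)      # T_in for m = 1, T_GF^{m-1} afterwards
    fs_layers, cs_layers = [], []
    for _ in range(levels):
        T_w = wacf(prev)                            # Eq. (25)
        T_gf = gaussian_filter(prev, sigma=sigma)   # Eq. (24)
        fs_layers.append(prev - T_w)                # Eq. (21)
        cs_layers.append(T_w - T_gf)                # Eq. (23)
        prev = T_gf
    return fs_layers, cs_layers, prev               # prev is T_GF^3, Eq. (22)

def gwacf_reconstruct(fs_layers, cs_layers, bs):
    """Eq. (26): sum all FS and CS layers and add the BS layer."""
    return sum(fs_layers) + sum(cs_layers) + bs
```

A quick sanity check is to decompose a test image and confirm that `gwacf_reconstruct` returns it to within floating-point precision; the same function can rebuild the fused image once the three kinds of layers have been fused.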

3.2. Fusion Strategy for Fine and Coarse Structural Layers

The multi-scale morphological gradient (MSMG) is an edge extraction method that integrates multi-scale strategies with mathematical morphological operations. By performing morphological operations at different structural element scales and fusing the results, it effectively enhances and captures image contours and detailed features with improved accuracy. The specific steps are as follows:

(1): The multi-scale structural elements are denoted as:

M S_{v} = \underbrace{M S_{1} \oplus M S_{1} \oplus \dots \oplus M S_{1}}_{v}, v = 1, 2, \dots, N

(27)

where MS₁ denotes the basic structuring element, a 3 × 3 matrix of ones, i.e., MS₁ = [1, 1, 1; 1, 1, 1; 1, 1, 1]; v represents the scale factor, N is the number of scales, and ⨁ indicates the morphological dilation operation.

(2): For an image T, its gradient feature G_v is characterized by the morphological gradient operator as follows:

G_{v} (T) = (T \oplus M S_{v}) - (T \ominus M S_{v})

(28)

where ⊖ denotes the morphological erosion operation.

(3): The output value ρ of MSMG can be computed as the weighted sum of the gradients across different scales:

ρ = \sum_{v = 1}^{N} w_{v} \cdot G_{v} (T)

(29)

where w_v denotes the gradient weight at the v-th scale and can be expressed as:

w_{v} = \frac{1}{2 v + 1}

(30)
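Steps (1) to (3) can be sketched with flat grey-scale morphology. Dilating the 3 × 3 element of ones with itself v times yields a (2v + 1) × (2v + 1) square, so the sketch passes that window size directly instead of building MS_v explicitly. The number of scales `n_scales` is a hypothetical default, since no value is fixed in this section.

```python
import numpy as np
from scipy.ndimage import grey_dilation, grey_erosion

def msmg(T, n_scales=3):
    """Multi-scale morphological gradient following Equations (27)-(30)."""
    T = np.asarray(T, dtype=float)
    rho = np.zeros_like(T)
    for v in range(1, n_scales + 1):
        k = 2 * v + 1                                   # side length of MS_v
        G_v = grey_dilation(T, size=(k, k)) - grey_erosion(T, size=(k, k))  # Eq. (28)
        rho += G_v / (2 * v + 1)                        # Eqs. (29)-(30)
    return rho
```

On a unit step edge with three scales, the response next to the edge is 1/3 + 1/5 + 1/7 and falls to zero once the largest window no longer touches the edge. The weights therefore favour fine scales while wider windows still contribute context.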

Both the CS and FS layers carry the texture properties and partial structural characteristics of the polarized images. The gradient-domain PCNN fusion strategy is well suited to fusing images containing small-scale texture features and is therefore adopted in this paper. Specifically, the MSMG responses computed on the CS and FS layers of T_A and T_B are used as the connection strengths of the PCNN, yielding an MSMG-modulated dual-channel PCNN model, as shown in Figure 5. Its mathematical formulation is given below.

According to the introduction in reference [36], under the premise of preserving the biological characteristics of the original model, the receptive field of the PCNN model can be simplified as:

F_{i j}^{(1)} (n) = I_{i j}^{(1)} (n)

(31)

F_{i j}^{(2)} (n) = I_{i j}^{(2)} (n)

(32)

L_{i j} (n) = V_{L} \sum_{k l} W_{i j k l} Y_{k l} (n - 1)

(33)

where $I_{ij}^{(1)}$ and $I_{ij}^{(2)}$ denote the input stimuli received by channel 1 and channel 2, respectively, with their magnitudes corresponding to the pixel values at location (i, j) in the two input images. L_ij represents the connection input, and the connection weight amplification factor V_L is set to 1. W_ijkl denotes the connection weight matrix between neurons. With each neuron positioned at the center of a 3 × 3 (or 5 × 5) weight matrix, its adjacent pixels correspond to the neurons within this matrix. The connection weights between neurons are closely related to their spatial distance.

Therefore, we define the connection weight as the reciprocal of the Euclidean distance between connected neurons. Specifically, the connection weight between neuron ij and neuron kl is given by:

W_{ijkl} = \begin{cases} 0, & \text{if } i = k \text{ and } j = l \\ \frac{1}{\sqrt{{(i - k)}^{2} + {(j - l)}^{2}}}, & \text{otherwise} \end{cases}

(34)

The modulation field is defined as:

U_{ij} (n) = \max \{U_{i j}^{(1)} (n), U_{i j}^{(2)} (n)\}

(35)

where U_ij denotes the internal activity of the dual-channel output, while $U_{ij}^{(1)}$ and $U_{ij}^{(2)}$ correspond to the internal activities of channel 1 and channel 2, respectively, which can be expressed as:

U_{i j}^{(1)} (n) = F_{i j}^{(1)} (n) (1 + β_{1} L_{i j} (n))

(36)

U_{i j}^{(2)} (n) = F_{i j}^{(2)} (n) (1 + β_{2} L_{i j} (n))

(37)

Here, β₁ and β₂ represent the connection strengths of channel 1 and channel 2, respectively. As for the pulse generator field of this network, it can be represented by Equations (11) and (12).

Despite the notable advances PCNN has achieved in image fusion techniques, this method still exhibits a critical limitation: each pixel in its network architecture corresponds to an independent neuron. If PCNN is directly applied for fusion, pixels representing the same content within an image block may be activated asynchronously, leading to biased fusion decisions. Consequently, the final fused image may exhibit undesirable pixel-level mutations and block artifacts.

As mentioned above, the MSMG exhibits excellent capability in extracting image edge features. Therefore, it is employed as a pre-modulation processing unit for PCNN to effectively enhance spatial correlations across different layers. In this paper, the image results processed by MSMG are used as the connection strength input of the network, and the detailed mathematical representation is given as follows:

β_{1} = ρ_{1}

(38)

β_{2} = ρ_{2}

(39)

where ρ₁ and ρ₂ are defined as the outputs of the two input images under the MSMG operator, which can be derived from Equation (29).

Then, the fused FS and CS layers are denoted as:

T_{F}^{F S m} (i, j) = \begin{cases} T_{A}^{F S m} (i, j), & \text{if } U_{i j}^{(1)} \geq U_{i j}^{(2)} \\ T_{B}^{F S m} (i, j), & \text{otherwise} \end{cases}

(40)

T_{F}^{C S m} (i, j) = \begin{cases} T_{A}^{C S m} (i, j), & \text{if } U_{i j}^{(1)} \geq U_{i j}^{(2)} \\ T_{B}^{C S m} (i, j), & \text{otherwise} \end{cases}

(41)

where $U_{ij}^{(1)}$ and $U_{ij}^{(2)}$ can be obtained from Equations (36) and (37).
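The fusion rule for one pair of detail layers can be sketched as below. Several choices are assumptions rather than details given in this section: the stimuli are taken as absolute pixel values (detail layers can be negative), the threshold parameters `alpha_theta` and `V_theta` and the iteration count `n_iter` are arbitrary, a 3 × 3 window is used for W, and Equations (40) and (41) are evaluated with the internal activities of the final iteration.

```python
import numpy as np
from scipy.ndimage import convolve, grey_dilation, grey_erosion

def msmg(T, n_scales=3):
    """Multi-scale morphological gradient, Equations (27)-(30)."""
    T = np.asarray(T, dtype=float)
    rho = np.zeros_like(T)
    for v in range(1, n_scales + 1):
        k = 2 * v + 1
        rho += (grey_dilation(T, size=(k, k)) - grey_erosion(T, size=(k, k))) / k
    return rho

def fuse_detail_layers(A, B, n_iter=30, alpha_theta=0.2, V_theta=20.0, V_L=1.0):
    """Fuse two FS (or CS) layers with an MSMG-modulated dual-channel PCNN."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    d = 1.0 / np.sqrt(2.0)
    W = np.array([[d, 1.0, d], [1.0, 0.0, 1.0], [d, 1.0, d]])  # Eq. (34), 3x3 window
    F1, F2 = np.abs(A), np.abs(B)        # Eqs. (31)-(32); absolute value is an assumption
    beta1, beta2 = msmg(A), msmg(B)      # Eqs. (38)-(39)
    Y = np.zeros_like(A)
    theta = np.zeros_like(A)
    for _ in range(n_iter):
        L = V_L * convolve(Y, W, mode="constant")            # Eq. (33)
        U1 = F1 * (1.0 + beta1 * L)                          # Eq. (36)
        U2 = F2 * (1.0 + beta2 * L)                          # Eq. (37)
        U = np.maximum(U1, U2)                               # Eq. (35)
        Y = (U > theta).astype(float)                        # Eq. (11)
        theta = np.exp(-alpha_theta) * theta + V_theta * Y   # Eq. (12)
    # Eqs. (40)-(41): per pixel, keep the layer with the larger internal activity.
    return np.where(U1 >= U2, A, B)
```

The same function would be applied to each of the three FS pairs and three CS pairs. Because the rule selects rather than averages, every fused pixel is copied unchanged from one of the two inputs.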

3.3. Fusion Strategy for Base Structural Layer

The BS layer contains the primary information of the polarized image (such as the main texture and the background). Given the information asymmetry inherent in polarization images, an energy-attribute fusion strategy is employed at this level to balance disparities and achieve information complementarity and enhancement. The implementation of this strategy involves three main steps:

(1): Calculate the feature values H_A and H_B of the BS layers, which are respectively expressed as:

H_{A} = {\bar{X}}_{A} + {\tilde{X}}_{A}

(42)

H_{B} = {\bar{X}}_{B} + {\tilde{X}}_{B}

(43)

where H_A and H_B represent the feature values of the BS layers; $\bar{X}_{A}$ and $\bar{X}_{B}$ represent the mean values of $T_{A}^{BS}$ and $T_{B}^{BS}$, respectively, while $\tilde{X}_{A}$ and $\tilde{X}_{B}$ denote the median values of $T_{A}^{BS}$ and $T_{B}^{BS}$, respectively.

(2): Calculate the energy attribute functions E_A and E_B, which are respectively given by:

E_{A} (i, j) = e^{(α_{G} |T_{A}^{B S} (i, j) - H_{A}|)}

(44)

E_{B} (i, j) = e^{(α_{G} |T_{B}^{B S} (i, j) - H_{B}|)}

(45)

where E_A and E_B represent the energy attribute functions of the BS layer and α_G denotes the gain coefficient, which is set to α_G = 4 in this paper.

(3): Obtain the final fused result $T_{F}^{B S}$ for the BS layer by weighted averaging:

T_{F}^{B S} (i, j) = \frac{E_{A} (i, j) \cdot T_{A}^{B S} (i, j) + E_{B} (i, j) \cdot T_{B}^{B S} (i, j)}{E_{A} (i, j) + E_{B} (i, j)}

(46)
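The three steps of Equations (42)–(46) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes both BS layers have been normalized to [0, 1] (a scaling not stated in the text), represents images as nested lists, and uses the hypothetical names `feature_value` and `ea_fuse`. Subtracting the larger exponent before calling `exp` is an added numerical safeguard; it cancels in the ratio of Equation (46) and does not change the result.

```python
import math
from statistics import mean, median
from typing import List

Image = List[List[float]]


def feature_value(layer: Image) -> float:
    """H = mean + median of a BS layer, Eqs. (42)/(43)."""
    pixels = [p for row in layer for p in row]
    return mean(pixels) + median(pixels)


def ea_fuse(bs_a: Image, bs_b: Image, alpha_g: float = 4.0) -> Image:
    """Energy-attribute weighted average of two BS layers, Eqs. (44)-(46)."""
    h_a, h_b = feature_value(bs_a), feature_value(bs_b)
    fused = []
    for row_a, row_b in zip(bs_a, bs_b):
        out_row = []
        for a, b in zip(row_a, row_b):
            x_a = alpha_g * abs(a - h_a)   # exponent of E_A, Eq. (44)
            x_b = alpha_g * abs(b - h_b)   # exponent of E_B, Eq. (45)
            shift = max(x_a, x_b)          # common factor, cancels in Eq. (46)
            e_a = math.exp(x_a - shift)
            e_b = math.exp(x_b - shift)
            out_row.append((e_a * a + e_b * b) / (e_a + e_b))
        fused.append(out_row)
    return fused


# Toy layers, assumed to be normalized to [0, 1].
bs_a = [[0.2, 0.4], [0.6, 0.8]]
bs_b = [[0.5, 0.5], [0.5, 0.5]]
fused_bs = ea_fuse(bs_a, bs_b)
```

Because Equation (46) is a convex combination, every fused pixel lies between the two input pixels, and the input whose pixel deviates more from its own feature value receives the larger weight.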

4. Validation Experiments

Two volunteers were recruited for the experiment, as illustrated in Figure 6. One volunteer participated in Scenario 1, while the other volunteer performed Scenarios 2 and 3. In each scenario, five types of concealed objects were randomly placed at the volunteer’s chest, abdomen, and thigh pockets, with all objects positioned inside the clothing and close to the skin. Imaging was conducted using a W-band focal plane scanning system [37], operating within the 70–110 GHz frequency range (bandwidth: 40 GHz). The radiometer channels featured a noise figure better than 3.5 dB, with an integration time of 280 μs, achieving a radiometric sensitivity better than 0.5 K.
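As a rough plausibility check of these figures, the ideal total-power radiometer equation ΔT = T_sys / sqrt(Bτ) can be evaluated with the quoted bandwidth and integration time. The sketch below rests on assumptions that are not stated in the text: the full 40 GHz is taken as the pre-detection bandwidth, the noise figure is referred to 290 K, the scene contributes about 300 K, and gain fluctuations are ignored. The function names are illustrative.

```python
import math


def noise_temperature_from_nf(nf_db: float, t_ref: float = 290.0) -> float:
    """Receiver noise temperature (K) equivalent to a noise figure in dB."""
    return t_ref * (10 ** (nf_db / 10) - 1)


def ideal_sensitivity(t_sys: float, bandwidth_hz: float, tau_s: float) -> float:
    """Ideal total-power radiometer sensitivity in kelvin."""
    return t_sys / math.sqrt(bandwidth_hz * tau_s)


t_rec = noise_temperature_from_nf(3.5)   # roughly 359 K
t_scene = 300.0                          # assumed antenna temperature, K
delta_t = ideal_sensitivity(t_scene + t_rec, 40e9, 280e-6)
print(round(t_rec, 1), round(delta_t, 2))
```

Under these idealized assumptions the estimate is about 0.2 K, which is compatible with the stated sensitivity of better than 0.5 K once practical losses and gain variations are allowed for.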

A high-density polyethylene dielectric lens with a diameter of 460 mm was employed to focus the millimeter waves, and the observation distance was 2.5 m. Imaging under different linear polarizations was realized by rotating the radiometer and feed antenna. Polarization channel calibration was performed using the two-point calibration method: a blackbody absorbing material (emissivity: 0.999) was placed in front of the system, with its physical temperature varying between 25 °C and 60 °C, while a cold temperature reference was obtained by immersing the material in liquid nitrogen (approximately 77 K). Calibration was conducted prior to each polarization angle acquisition to compensate for system drift. To assess stability, each volunteer/scenario was independently measured three times. All experiments were carried out indoors at an ambient temperature of approximately 25 °C, and linear polarization images at 0°, 45°, 90°, and 135° were acquired using the W-band PMMW system.
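The two-point calibration described above amounts to fitting a straight line through the hot-load and cold-load readings. The following sketch assumes a linear detector response and treats the absorber as an ideal blackbody (the stated emissivity of 0.999 is rounded to 1). The voltage readings are hypothetical; only the two load temperatures are taken from the text.

```python
from typing import Tuple


def two_point_calibration(v_hot: float, t_hot: float,
                          v_cold: float, t_cold: float) -> Tuple[float, float]:
    """Return (gain, offset) of the linear map T = gain * V + offset."""
    gain = (t_hot - t_cold) / (v_hot - v_cold)   # kelvin per volt
    offset = t_cold - gain * v_cold              # kelvin
    return gain, offset


def brightness_temperature(voltage: float, gain: float, offset: float) -> float:
    """Convert a detector voltage to brightness temperature in kelvin."""
    return gain * voltage + offset


T_HOT = 25.0 + 273.15   # absorber at 25 degrees C, in kelvin
T_COLD = 77.0           # absorber immersed in liquid nitrogen, in kelvin
# Hypothetical detector voltages for the two loads and for one scene pixel.
gain, offset = two_point_calibration(v_hot=2.0, t_hot=T_HOT,
                                     v_cold=0.5, t_cold=T_COLD)
t_pixel = brightness_temperature(1.8, gain, offset)
```

Repeating this fit before each polarization angle, as the text describes, refreshes the gain and offset so that slow drift does not bias the comparison between polarization channels.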

4.1. Concealed Contraband Detection Imaging Results

In this paper, we adopt an MSMG-PCNN model to fuse the FS and CS layers. For the PCNN, the decay time constant α_θ is set to 3, and the amplification factors V_L and V_θ are 1 and 20, respectively. The number of iterations is fixed at 100, with the stopping rule defined as reaching the maximum iteration count. A 3 × 3 connection kernel is used as the neighborhood size. For the morphological gradient, the scale factor is v = 3, and the basic structural element MS₁ is a 3 × 3 all-ones structuring element. The GF is applied with σ = 20 and a kernel size of 5 × 5. In the WACF decomposition, the number of iterations per scale is set to 2. For the EA fusion, the gain coefficient is α_G = 4.

Figure 6 presents the results of three sets of concealed contraband detection imaging experiments, corresponding to the three scenarios from top to bottom. Figure 6a shows the detection imaging results for Scenario 1, where five concealed objects are annotated with colored boxes: metal pliers (#N1), a utility knife (#N2), an alcohol bottle (#N3), a mobile phone (#N4), and a charging case (#N5). Figure 6b displays the imaging results for Scenario 2, containing a water bottle (#N1), a ceramic knife (#N2), a handgun (#N3), an alcohol bottle (#N4), and a utility knife (#N5). Figure 6c shows the imaging results for Scenario 3, which includes a water bottle (#N1), an alcohol bottle (#N2), a handgun (#N3), glue (#N4), and a ceramic knife (#N5).

As can be observed from the imaging results, all four linear polarization modes, namely horizontal (T_B₀), 45° (T_B₄₅), vertical (T_B₉₀), and 135° (T_B₁₃₅), struggle to distinguish concealed contraband from the human body background, which significantly reduces the detection probability. This reveals an inherent limitation of linear polarization in PMMW security screening: its detection performance is highly dependent on the angle between the orientation of the target and the polarization direction. For instance, a knife exhibits the strongest signal when aligned parallel to the polarization direction, while the response weakens considerably under orthogonal orientation, easily leading to missed detections. Additionally, non-metallic or structurally complex concealed contraband is difficult to identify due to its weak polarimetric signature. Moreover, scattering of the polarized waves by the curved contours of the human body introduces further interference and makes the images harder to interpret.

To improve the detection capability for concealed objects, we further investigated multi-polarization fusion methods. Figure 6 presents the results of seven existing fusion methods, namely polarization summation average (PSA) [25], principal component analysis (PCA) [38], discrete cosine transform (DCT) [39], Laplacian pyramid fusion (LPF) [40], subregion fusion (SF) [25], and two advanced deep learning fusion models, FDFuse [41] and LSRNet [42], together with the result of the proposed fusion method. It can be seen that the proposed fusion strategy effectively suppresses image noise, enhances the intensity contrast between the concealed object and the human background, and presents a more complete object shape and clearer contour features.

4.2. Performance Analysis

As can be observed from the detection imaging results in Figure 6, the proposed method effectively reconstructs the shape and contour features of concealed contraband. To further quantitatively evaluate its reconstruction performance, we introduce entropy [43], blind/referenceless image spatial quality evaluator (BRISQUE) [44], natural image quality evaluator (NIQE) [45], perception-based image quality evaluator (PIQE) [46], and signal-to-noise ratio (SNR) [25] for comprehensive analysis.

In PMMW imaging, the acquired images inherently suffer from low SNR and pronounced noise interference. Under such conditions, high entropy values often originate not from meaningful target information but from background noise and random fluctuations. Pursuing high entropy alone may instead indicate insufficient noise suppression, which is detrimental to subsequent detection and recognition tasks. The core of the proposed method lies in enhancing the brightness temperature contrast of concealed targets and emphasizing target contours and structural information, rather than preserving all textural details including noise. Consequently, in the fused image, fluctuations in background regions are effectively suppressed, target information is strengthened, and the pixel distribution becomes more concentrated. The moderate reduction in entropy therefore reflects the effectiveness of the method, rather than signifying information loss. In fact, in PMMW tasks, lower entropy generally corresponds to a clearer background and more salient targets. Hence, entropy should not be interpreted in isolation as “information richness”. Instead, it must be jointly interpreted with task-relevant metrics such as SNR and detection accuracy.
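The argument can be illustrated with a toy calculation, assuming the entropy metric is the usual Shannon entropy of the gray-level histogram (the exact definition in [43] is not restated here). The two pixel lists below are synthetic and serve only to show that a fluctuating background raises entropy while a flat background with a small target lowers it.

```python
import math
from collections import Counter
from typing import Iterable


def shannon_entropy(pixels: Iterable[int]) -> float:
    """Entropy in bits of the empirical gray-level distribution."""
    counts = Counter(pixels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Synthetic 64-pixel examples (gray levels are arbitrary).
noisy_background = [0, 1, 2, 3, 4, 5, 6, 7] * 8   # eight equally likely levels
clean_background = [3] * 56 + [7] * 8             # flat background, small target

print(shannon_entropy(noisy_background))  # 3.0
print(shannon_entropy(clean_background) < shannon_entropy(noisy_background))  # True
```

The second image has the more salient target yet the lower entropy, which is why the metric is read together with SNR rather than on its own.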

On the other hand, although no-reference image quality assessment metrics such as BRISQUE, NIQE, and PIQE were originally designed for natural images, their core function is to measure structural integrity, perceptual quality, and the degree of distortion, and they are not strictly confined to natural scenes. These metrics are built upon spatial-domain statistical features, patch-based statistical models, or perceptual features, and they exhibit high sensitivity to blur, noise, block artifacts, and structural distortions—precisely the critical factors that need to be evaluated in PMMW image fusion. In PMMW imaging, quality assessment focuses on edge preservation, structural clarity, and noise suppression, which align closely with the statistical characteristics captured by the aforementioned no-reference metrics. Therefore, employing these metrics to evaluate the fusion performance of the proposed method is both reasonable and valid.

For the three experimental scenarios depicted in Figure 6, the computed results of each evaluation metric (entropy, BRISQUE, NIQE, PIQE) are summarized in Table 1. The proposed fusion strategy attains the lowest value of every metric in all three scenarios; lower values indicate better quality for BRISQUE, NIQE, and PIQE, and lower entropy is interpreted here as stronger noise suppression, as argued above. Furthermore, Table 2 shows that the proposed method achieves the highest SNR for every concealed object, which enhances the contrast between the concealed objects and the surrounding background and thereby improves target detectability.

4.3. Performance on ROC Curves

To comprehensively assess the detection performance, this paper adopts the ROC curve proposed in [47] as an analytical tool. This curve clearly illustrates the trade-off between detection sensitivity and specificity by plotting the relationship between the true positive rate (TPR) and false positive rate (FPR) across varying thresholds. The area under the curve (AUC) serves as a measure of the overall performance, where TPR reflects the detection sensitivity and FPR is related to its specificity. Their respective formulas are expressed as follows:

TPR = \frac{TP}{TP + FN}

(47)

FPR = \frac{FP}{FP + TN}

(48)

where TP (true positive) denotes pixels that truly belong to the target object and are correctly identified as such. TN (true negative) refers to pixels that truly belong to the background and are correctly classified as background. FP (false positive) represents pixels that truly belong to the background but are incorrectly identified as the target object. FN (false negative) indicates pixels that truly belong to the target object but are incorrectly identified as background.

Regarding the generation of ROC curves, we need to further clarify the following two points. First, we provide a detailed description of the threshold setting protocol: after normalizing the pixel values of the fused images to the range [0, 1], we traverse all possible thresholds at a step size of 0.01, and calculate the TPR and FPR at each step, thus generating a complete ROC curve. Second, regarding the generation of ground-truth masks, three researchers independently performed manual annotation based on the actual positions and contours of concealed objects in the original PMMW images. The final binary masks were determined by majority voting, while background regions were selected from typical areas free of interference around the targets. All methods under comparison (including the proposed method and the seven competing methods) strictly use the identical masks for pixel-wise evaluation, ensuring a consistent and fair basis for computing TP, FP, TN, and FN.
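The protocol above can be sketched in Python as follows. This is an illustrative reading, not the authors' code: pixels are flattened into one list, a pixel is declared "target" when its normalized value is at least the threshold (which assumes targets appear brighter after normalization), and the AUC is obtained by the trapezoidal rule. The annotations and scores are synthetic.

```python
from typing import List, Sequence, Tuple


def majority_vote(masks: Sequence[Sequence[int]]) -> List[int]:
    """Combine binary annotations pixel by pixel (1 = target)."""
    return [1 if 2 * sum(votes) > len(masks) else 0 for votes in zip(*masks)]


def roc_points(scores: Sequence[float], truth: Sequence[int],
               step: float = 0.01) -> List[Tuple[float, float]]:
    """(FPR, TPR) pairs for thresholds 0, step, ..., 1 on scores in [0, 1]."""
    positives = sum(truth)
    negatives = len(truth) - positives
    points = [(0.0, 0.0)]  # end point for a threshold above every score
    for k in range(round(1 / step) + 1):
        thr = k * step
        tp = sum(1 for s, t in zip(scores, truth) if t == 1 and s >= thr)
        fp = sum(1 for s, t in zip(scores, truth) if t == 0 and s >= thr)
        points.append((fp / negatives, tp / positives))  # Eqs. (48), (47)
    return sorted(points)


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Three synthetic annotations of six pixels, merged by majority voting.
annotations = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
]
truth = majority_vote(annotations)
scores = [0.905, 0.815, 0.405, 0.605, 0.205, 0.105]  # normalized fused pixels
area = auc(roc_points(scores, truth))
```

In this toy case one background pixel (0.605) scores above one target pixel (0.405), so eight of the nine target/background pairs are ranked correctly and the AUC is 8/9. Using one shared mask for all methods, as the text specifies, means any difference in AUC comes from the fused images alone.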

As illustrated in Figure 7, the ROC curves show the detection outcomes for each concealed object across the three experimental scenarios. The ROC curve of the proposed method is consistently closest to the upper-left corner and achieves the largest AUC for every concealed object, indicating superior discriminative ability and higher detection accuracy than the competing methods.

4.4. Ablation Experiments

To further validate the contribution of each key component in the proposed framework, we conducted a series of ablation experiments on the first experimental scenario. Specifically, we evaluated the following variants: (1) Decomposition architecture: replacing GWACF with Gaussian filtering alone or WACF alone; (2) Texture layer fusion: removing the MSMG modulation from PCNN; (3) Background layer fusion: replacing EA fusion with simple weighted averaging; (4) Decomposition layers: using one-layer (BS only) or two-layer (BS + FS) decomposition; and (5) Input construction: using the original four-channel polarization images instead of T_A/T_B. The quantitative results are shown in Tables 3–7. The complete model consistently outperformed all variants across all evaluation metrics, confirming the necessity and effectiveness of each proposed module. In particular, GWACF achieved the best edge-preserving decomposition, MSMG-PCNN enhanced texture fusion quality, EA fusion maintained structural integrity, and the three-level decomposition provided the optimal multi-scale representation.

4.5. Time Complexity Analysis

The proposed method mainly consists of three parts: GWACF multi-scale decomposition, MSMG-PCNN fusion (applied to the FS and CS layers), and energy attribute fusion (applied to the BS layer). Let the size of a single input image be H × W. The number of GWACF decomposition scales is set to S = 3, and during the WACF decomposition process, the number of iterations at each scale is set to L. The number of MSMG scales is denoted as R. The number of PCNN iterations is N, and the size of the neighborhood window is K × K.

(1): GWACF decomposition: Each layer consists of a GF operation (complexity O(HW)) and a WACF operation. WACF involves convolution with eight directional kernels, and performing L iterations of each convolution yields a complexity of O(K²LHW). Therefore, the complexity of a single layer is O(8K²LHW). Consequently, the total complexity of the three-layer GWACF decomposition is O(3·(HW + 8K²LHW)), which can be further expressed as O(HW + K²LHW).
(2): MSMG-PCNN fusion: This is used for the FS and CS layers (a total of 2S = 6 layers). For each scale v in MSMG, morphological dilation and erosion operations are performed on the image. The morphological operation for each pixel requires traversing all pixels within the window, so the complexity for a single scale is O(HWv²). Summing over all scales yields O(HWR³). In the PCNN, each iteration involves computing the neighborhood connection weights (complexity O(K²HW)) and updating the internal activity (complexity O(HW)). With a total of N iterations, the complexity for a single layer is O(NK²HW). Consequently, the overall complexity of MSMG-PCNN fusion is O(6·(HWR³ + NK²HW)), which can be further simplified to O(HWR³ + NK²HW).
(3): Energy attribute fusion (BS layer): This involves only pixel-wise mean and exponential operations, yielding a complexity of O(HW).

In summary, the overall complexity of the proposed method is:

O(HW·(R³ + NK² + K²L))

where R = K = 3, L = 2, and N = 100 in this paper. Because these parameters are fixed, the overall complexity scales linearly with the number of pixels HW.
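To give a feel for the magnitudes involved, the expression can be evaluated with the stated parameters and the 175 × 360 image size used in the timing experiment. The result is only an order-of-magnitude count of elementary operations, since the O-notation drops constant factors such as the eight directional kernels and the number of layers.

```latex
\begin{align*}
R^{3} + N K^{2} + K^{2} L &= 3^{3} + 100 \cdot 3^{2} + 3^{2} \cdot 2 = 27 + 900 + 18 = 945, \\
HW \cdot (R^{3} + N K^{2} + K^{2} L) &= (175 \times 360) \cdot 945 = 63\,000 \cdot 945 \approx 5.95 \times 10^{7}.
\end{align*}
```

The PCNN term NK² = 900 accounts for roughly 95% of the per-pixel count, which suggests that the number of PCNN iterations is the first parameter to examine when reducing the running time.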

To verify the practical efficiency, we measured the average running time of each algorithm on the same hardware platform (a computer equipped with an i7-12700H CPU and a GeForce RTX 3060 GPU). The experimental results are presented in Table 8. For images with a resolution of 175 × 360, the average processing time of the proposed method is approximately 0.69 s. Although this time is higher than those of the other fusion methods, considering the significant improvement in detection accuracy and edge preservation achieved by our method, this computational cost remains within an acceptable range for practical security inspection scenarios. It is anticipated that the use of higher-performance processors, combined with further exploration of acceleration algorithms, could potentially enable real-time imaging at video rates in the future.

5. Conclusions

This paper addresses the insufficient application of multi-polarization technology in PMMW security imaging by proposing a multi-scale edge-preserving image decomposition framework integrated with a gradient-domain PCNN fusion strategy. The method employs a GWACF model to decompose images into three types of layers (FS, CS, and BS), which are subsequently fused using the gradient-domain PCNN and energy-attribute methods. This enables effective complementarity of multi-polarization information while preserving essential texture and edge details. Experimental results demonstrate that the proposed fusion method outperforms existing mainstream methods across multiple evaluation metrics, significantly enhancing target contour and structural information while providing superior image quality for subsequent detection and segmentation tasks. This study verifies the potential of multi-polarization PMMW technology for detecting concealed objects on the human body, and the method is applicable to scenarios such as non-intrusive security checks at smart transportation hubs, covert security protection in public places, and industrial non-destructive flaw detection.

Future work will focus on two aspects: first, further optimization of the fusion algorithm to improve computational efficiency and achieve real-time imaging at video rate; and second, integration with deep learning-based fusion methods to enhance the extraction and characterization of multi-polarization target features.

Author Contributions

Conceptualization, X.C. and F.H.; data curation, X.C.; formal analysis, X.C. and D.Z.; funding acquisition, D.Z. and F.H.; methodology, X.C. and F.H.; project administration, X.C.; resources, X.C. and F.H.; supervision, F.H., J.S., B.F. and J.T.; validation, X.C. and F.H.; writing—original draft, X.C.; writing—review and editing, X.C., F.H., J.S., B.F. and J.T. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China under grant 62271219 and in part by the Fundamental Research Funds for the Central Universities (Huazhong University of Science and Technology).

Data Availability Statement

All data relevant to this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

List of Acronyms

The following abbreviations are used in this manuscript:

AoP	Angle of polarization
AUC	Area under the curve
BRISQUE	Blind/referenceless image spatial quality evaluator
BS	Base structural
BT	Brightness temperature
CS	Coarse structural
DCT	Discrete cosine transform
DoLP	Degree of linear polarization
EA	Energy attribute
FN	False negative
FP	False positive
FPR	False positive rate
FS	Fine structural
GWACF	Gaussian and weighted average curvature filtering
LPF	Laplacian pyramid fusion
LPR	Linear polarization ratio
MSMG	Multi-scale morphological gradient
NIQE	Natural image quality evaluator
PCA	Principal component analysis
PCNN	Pulse-coupled neural network
PDoP	Passive degree of polarization
PIQE	Perception based image quality evaluator
PMMW	Passive millimeter-wave
PSA	Polarization summation average
SF	Subregion fusion
SNR	Signal-to-noise ratio
TN	True negative
TP	True positive
TPR	True positive rate
WACF	Weighted average curvature filtering

References

1. Zhuravlev, A.; Razevig, V.; Rogozin, A.; Chizh, M. Microwave imaging of concealed objects with linear antenna array and optical tracking of the target for high-performance security screening systems. IEEE Trans. Microw. Theory Tech. 2023, 71, 1326–1336.
2. Ji, L.; Mou, Y. Research on the feasibility of application of millimeter-wave security screening equipment in civil aviation. In Proceedings of the 2021 IEEE 3rd International Conference on Civil Aviation Safety and Information Technology (ICCASIT), Changsha, China, 20–22 October 2021; pp. 71–75.
3. Wang, Y.; Cheng, Y.; Liu, B.; Xiong, H.; Zhang, L.; Wang, Y.; Qiu, J. Polarization-guided 3-D reconstruction for occluded faces using passive millimeter-wave single-direction imaging. IEEE Trans. Microw. Theory Tech. 2025, 73, 10744–10757.
4. Liu, C.; Yang, M.; Sun, X. Towards robust human millimeter wave imaging inspection system in real time with deep learning. Prog. Electromagn. Res. 2018, 161, 87–100.
5. Wang, X.; Guo, S.; Li, J.; Zhao, Y.; Liu, Z.; Jiao, C.; Mao, S. Self-paced feature attention fusion network for concealed object detection in millimeter-wave image. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 224–239.
6. Wang, Y.; Cheng, Y.; Tang, P.; Ma, H.; Zhang, L.; Wang, Y.; Qiu, J. Pathway detection using fusion polarization features in passive millimeter-wave imaging. IEEE Trans. Circuits Syst. Video Technol. 2025, 36, 2685–2696.
7. Cheng, Y.; Tian, X.; Zhu, D.; Wu, L.; Zhang, L.; Qi, J.; Qiu, J. Regional-based object detection using polarization and Fisher vectors in passive millimeter-wave imaging. IEEE Trans. Microw. Theory Tech. 2023, 71, 2702–2713.
8. Jiang, Z.; Cheng, Y.; Meng, Z.; Wang, N.; Qiu, J. Physics-based feature fusion of passive and active millimeter-wave imaging under noise illumination for handheld concealed object detection. IEEE Trans. Microw. Theory Tech. 2025, 73, 9402–9414.
9. Yang, H.; Yang, Z.; Hu, A.; Liu, C.; Cui, T.; Miao, J. Unifying convolution and transformer for efficient concealed object detection in passive millimeter-wave images. IEEE Trans. Circuits Syst. Video Technol. 2023, 33, 3872–3887.
10. Shen, X.; Dietlein, C.; Grossman, E.; Popovic, Z.; Meyer, F. Detection and segmentation of concealed objects in terahertz images. IEEE Trans. Image Process. 2008, 17, 2465–2475.
11. Su, J.; Tian, Y.; Hu, F.; Cheng, Y.; Hu, Y. Material clustering using passive millimeter-wave polarimetric imagery. IEEE Photon. J. 2019, 11, 5500109.
12. Cheng, Y.; Wu, H.; Ren, X.; Wang, N.; Qi, J.; Qiu, J. Object segmentation using polarization random feature in passive millimeter-wave imaging. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5302112.
13. Wang, Y.; Yu, A.; Cheng, Y.; Qi, J. Matrix diffractive deep neural networks merging polarization into meta-devices. Laser Photon. Rev. 2024, 18, 2300903.
14. Kim, W.; Moon, N.; Kim, H.; Kim, Y. Linear polarization sum imaging in passive millimeter-wave imaging system for target recognition. Prog. Electromagn. Res. 2013, 136, 175–193.
15. Cheng, Y.; Wang, Y.; Niu, Y.; Zhao, Z. Concealed object enhancement using multi-polarization information for passive millimeter and terahertz wave security screening. Opt. Exp. 2020, 28, 6350–6366.
16. Cheng, Y.; Wang, Y.; Niu, Y.; Rutt, H.; Zhao, Z. Physically based object contour edge display using adjustable linear polarization ratio for passive millimeter-wave security imaging. IEEE Trans. Geosci. Remote Sens. 2021, 59, 3177–3191.
17. Wu, L.; Zhu, J.; Peng, S.; Xiao, Z.; Wang, Y. Post-processing techniques for polarimetric passive millimeter wave imagery. Appl. Comput. Electromagn. Soc. J. 2018, 33, 512–518.
18. Li, N.; Zhao, Y.; Pan, Q.; Kong, S. Removal of reflections in LWIR image with polarization characteristics. Opt. Exp. 2018, 26, 16488–16504.
19. Wang, C.; Miyazaki, D. PolAttNet: A deep learning framework for polarized image reflection removal with multi-attention mechanisms. IEEE Access 2025, 13, 30762–30778.
20. Wilson, J.; Schuetz, C.A.; Dillon, T.E.; Eng, D.L.K.; Kozacik, S.; Prather, D.W. Display of polarization information for passive millimeter-wave imagery. Opt. Eng. 2012, 51, 091607.
21. Yueh, S.; Wilson, W.; Li, F.; Nghiem, S.; Ricketts, W. Polarimetric brightness temperatures of sea surfaces measured with aircraft K- and Ka-band radiometers. IEEE Trans. Geosci. Remote Sens. 1997, 35, 1177–1187.
22. Yueh, S. Directional signals in windsat observations of hurricane ocean winds. IEEE Trans. Geosci. Remote Sens. 2008, 46, 130–136.
23. Tang, F.; Gui, L.; Liu, J.; Chen, K.; Lang, L.; Cheng, Y. Metal target detection method using passive millimeter-wave polarimetric imagery. Opt. Exp. 2020, 28, 13336–13351.
24. Garcia, N.; Erausquin, I.; Edmiston, C.; Gruev, V. Surface normal reconstruction using circularly polarized light. Opt. Exp. 2015, 23, 14391–14406.
25. Cheng, Y.; Zhang, L.; Guo, D.; Wang, N.; Qi, J.; Qiu, J. Subregional polarization fusion via Stokes parameters in passive millimeter-wave imaging. IEEE Trans. Ind. Inform. 2024, 20, 8585–8595.
26. Yang, H.; Zhang, D.; Hu, A.; Liu, C.; Cui, T.J.; Miao, J. Transformer-based anchor-free detection of concealed objects in passive millimeter wave images. IEEE Trans. Instrum. Meas. 2022, 71, 5012216.
27. Shen, C.; Wang, D.; Tang, S.; Cao, H.; Liu, J. Hybrid image noise reduction algorithm based on genetic ant colony and PCNN. Vis. Comput. 2017, 33, 1373–1384.
28. Deng, X.; Ma, Y.; Dong, M. A new adaptive filtering method for removing salt and pepper noise based on multilayered PCNN. Pattern Recognit. Lett. 2016, 79, 8–17.
29. Helmy, A.; El-Taweel, G. Image segmentation scheme based on SOM–PCNN in frequency domain. Appl. Soft Comput. 2016, 40, 405–415.
30. Wei, S.; Hong, Q.; Hou, M. Automatic image segmentation based on PCNN with adaptive threshold time constant. Neurocomputing 2011, 74, 1485–1491.
31. Ranganath, H.; Kuntimad, G. Object detection using pulse coupled neural networks. IEEE Trans. Neural Netw. 1999, 10, 615–620.
32. Zhou, T.; Si, J.; Wang, L.; Xu, C.; Yu, X. Automatic detection of underwater small targets using forward-looking sonar images. IEEE Trans. Geosci. Remote Sens. 2022, 60, 4207912.
33. Nie, R.; Cao, J.; Zhou, D.; Qian, W. Multi-source information exchange encoding with PCNN for medical image fusion. IEEE Trans. Circuits Syst. Video Technol. 2021, 31, 986–1000.
34. Panigrahy, C.; Seal, A.; Mahato, N. MRI and SPECT image fusion using a weighted parameter adaptive dual channel PCNN. IEEE Signal Process. Lett. 2020, 27, 690–694.
35. Tan, W.; Thitøn, W.; Xiang, P.; Zhou, H. Multi-modal brain image fusion based on multi-level edge-preserving filtering. Biomed. Signal Process. Control 2021, 64, 102280.
36. Deng, X.; Yan, C.; Ma, Y. PCNN mechanism and its parameter settings. IEEE Trans. Neural Netw. Learn. Syst. 2019, 31, 488–501.
37. Cheng, Y.; Qiao, L.; Zhu, D.; Wang, Y.; Zhao, Z. Passive polarimetric imaging of millimeter and terahertz waves for personnel security screening. Opt. Lett. 2021, 46, 1233–1236.
38. Batur, E.; Maktav, D. Assessment of surface water quality by using satellite images fusion based on PCA method in the Lake Gala, Turkey. IEEE Trans. Geosci. Remote Sens. 2019, 57, 2983–2989.
39. Cao, L.; Jin, L.; Tao, H.; Li, G.; Zhuang, Z.; Zhang, Y. Multi-focus image fusion based on spatial frequency in discrete cosine transform domain. IEEE Signal Process. Lett. 2015, 22, 220–224.
40. Yao, J.; Zhao, Y.; Bu, Y.; Kong, S.; Chan, J. Laplacian pyramid fusion network with hierarchical guidance for infrared and visible image fusion. IEEE Trans. Circuits Syst. Video Technol. 2023, 33, 4630–4644.
41. Cheng, M.; Huang, H.; Liu, X.; Mo, H.; Wu, S.; Zhao, X. FDFuse: Infrared and visible image fusion based on feature decomposition. IEEE Trans. Instrum. Meas. 2025, 74, 5021413.
42. Yang, B.; Hu, Y.; Liu, L.; Liu, Y.; Li, J. LSRNet: A novel interpretable low-rank sparse representation guided fusion network for polarization and intensity images. IEEE Trans. Image Process. 2026, 35, 4961–4974.
43. Xu, Y.; Zhu, D.; Hu, F.; Fang, B.; Fu, P. Target imaging using compressed sampling in synthetic aperture interferometric radiometer. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5301515.
44. Mittal, A.; Moorthy, A.; Bovik, A. No-reference image quality assessment in the spatial domain. IEEE Trans. Image Process. 2012, 21, 4695–4708.
45. Zhang, Z.; Sun, W.; Zhou, Y.; Wu, H.; Li, C.; Min, X.; Liu, X.; Zhai, G.; Lin, W. Advancing zero-shot digital human quality assessment through text-prompted evaluation. IEEE Trans. Image Process. 2025, 34, 3503–3517.
46. Venkatanath, N.; Praneeth, D.; Sumohana, S.; Swarup, S. Blind image quality evaluation using perception based features. In Proceedings of the 2015 Twenty First National Conference on Communications (NCC), Mumbai, India, 27 February–1 March 2015; pp. 1–6.
47. Zhang, J.; Xu, C.; Gao, Z.; Rodrigues, J.; de Albuquerque, V. Industrial pervasive edge computing-based intelligence IoT for surveillance saliency detection. IEEE Trans. Ind. Inform. 2021, 17, 5012–5020.

Figure 1. Schematic of the human body under radiation measurement. (a) The situation in which there are no concealed objects; (b) The situation in which objects are hidden beneath clothing.

Figure 2. Schematic diagram of the PCNN neuron model.

Figure 3. The framework of the proposed fusion strategy.

Figure 4. Flowchart of the GWACF decomposition method.

Figure 5. Gradient domain PCNN model with MSMG modulation.

Figure 6. Imaging results for concealed object detection on the human body. (a) Detection imaging results for experimental scenario 1, with concealed objects including metal pliers (#N1), a utility knife (#N2), an alcohol bottle (#N3), a mobile phone (#N4), and a charging case (#N5). (b) Detection imaging results for experimental scenario 2, with concealed objects including a water bottle (#N1), a ceramic knife (#N2), a handgun (#N3), an alcohol bottle (#N4), and a utility knife (#N5). (c) Detection imaging results for experimental scenario 3, with concealed objects including a water bottle (#N1), an alcohol bottle (#N2), a handgun (#N3), glue (#N4), and a ceramic knife (#N5).

Figure 7. Comparison of the ROC curves of the different methods for each concealed object. (a–e) Concealed objects #N1–#N5 in experimental scenario 1. (f–j) Concealed objects #N1–#N5 in experimental scenario 2. (k–o) Concealed objects #N1–#N5 in experimental scenario 3.

Table 1. Computed results of evaluation metrics for various processing methods. The bold values indicate the best performance.

Scenario	Method	Entropy	BRISQUE	NIQE	PIQE
Scenario 1	T_B₀	6.91	40.58	12.87	54.41
	T_B₄₅	6.89	39.25	13.88	56.50
	T_B₉₀	6.86	38.43	15.40	55.47
	T_B₁₃₅	6.83	38.07	13.90	55.06
	T_PSA	6.75	33.94	11.53	42.80
	T_PCA	6.69	34.06	10.08	42.03
	T_DCT	6.54	35.73	10.22	42.66
	T_LPF	6.71	34.46	10.14	41.46
	T_SF	6.82	38.29	13.91	53.40
	T_FDFuse	6.21	34.58	10.31	42.65
	T_LSRNet	6.03	33.31	10.18	40.46
	T_Proposed	5.91	29.50	10.03	32.15
Scenario 2	T_B₀	6.86	41.09	12.92	55.19
	T_B₄₅	6.85	41.10	11.31	56.74
	T_B₉₀	6.74	40.05	12.29	54.38
	T_B₁₃₅	6.87	41.02	11.85	55.09
	T_PSA	6.60	35.91	10.45	45.45
	T_PCA	6.11	42.49	14.79	63.39
	T_DCT	6.49	36.17	10.40	44.13
	T_LPF	6.38	33.44	10.45	43.19
	T_SF	6.84	40.25	11.96	56.57
	T_FDFuse	5.46	33.48	8.87	43.29
	T_LSRNet	5.40	31.63	8.06	41.47
	T_Proposed	5.25	21.51	6.01	23.09
Scenario 3	T_B₀	6.79	40.96	12.43	55.79
	T_B₄₅	6.85	41.08	11.77	56.56
	T_B₉₀	6.83	41.36	12.67	57.74
	T_B₁₃₅	6.82	40.98	12.62	55.57
	T_PSA	6.61	36.48	10.82	45.40
	T_PCA	6.63	36.59	10.70	45.61
	T_DCT	6.50	40.63	10.56	44.66
	T_LPF	6.60	35.95	10.13	45.40
	T_SF	6.59	26.01	7.37	28.05
	T_FDFuse	6.21	35.66	5.13	28.43
	T_LSRNet	6.17	32.19	4.79	25.19
	T_Proposed	6.03	18.77	4.24	18.44
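
The Entropy column of Table 1 can be illustrated with a short sketch. It assumes the metric is the standard Shannon entropy of the 8-bit gray-level histogram, in bits; the reported values, all below 8, are consistent with that, but this section does not state the definition. The function name `shannon_entropy` is hypothetical.

```python
import numpy as np

def shannon_entropy(image, levels=256):
    """Shannon entropy (bits) of the gray-level histogram of an integer image.

    Assumes pixel values lie in [0, levels - 1], e.g. an 8-bit image.
    """
    image = np.asarray(image)
    # One histogram bin per gray level.
    hist, _ = np.histogram(image, bins=levels, range=(0, levels))
    p = hist / hist.sum()
    p = p[p > 0]  # empty bins contribute nothing (0 * log 0 is taken as 0)
    return float(-np.sum(p * np.log2(p)))

# Sanity checks on images whose entropy is known in closed form.
flat = np.full((32, 32), 128, dtype=np.uint8)             # one gray level
two_level = np.tile(np.array([0, 255], dtype=np.uint8), (32, 16))  # 50/50 split
all_levels = np.tile(np.arange(256, dtype=np.uint8), (4, 1))       # uniform

for name, img in [("flat", flat), ("two_level", two_level), ("all_levels", all_levels)]:
    print(name, shannon_entropy(img))
```

The three test images bracket the possible range for 8-bit data: a single gray level carries no information, two equally likely levels carry one bit, and a uniform histogram reaches the 8-bit maximum.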

Table 2. SNR (Unit: dB) for each concealed object in the experimental results. The bold values indicate the best performance.

Scenario	Number	Concealed Object	T_B₀	T_B₄₅	T_B₉₀	T_B₁₃₅	T_PSA	T_PCA	T_DCT	T_LPF	T_SF	T_FDFuse	T_LSRNet	T_Proposed
Scenario 1	#N1	Metal pliers	6.89	7.15	5.72	6.01	7.67	7.84	7.69	7.98	7.18	8.08	8.19	9.39
	#N2	Utility knife	10.01	10.71	8.45	10.43	11.79	11.78	11.66	12.16	11.51	12.15	12.33	13.56
	#N3	Alcohol bottle	11.40	11.35	10.19	10.81	11.76	12.08	12.06	12.27	11.54	12.71	12.94	13.61
	#N4	Mobile phone	9.13	9.46	7.16	10.05	10.42	10.49	10.46	10.65	10.28	10.86	11.02	11.35
	#N5	Charging case	10.72	10.42	8.72	9.58	11.39	12.46	12.41	12.55	11.67	12.04	12.26	13.03
Scenario 2	#N1	Water bottle	9.02	6.47	7.40	7.22	11.29	11.27	11.31	11.41	6.88	11.89	12.01	12.92
	#N2	Ceramic knife	6.29	1.71	2.79	3.98	7.51	7.55	7.52	7.53	6.47	8.06	8.37	9.29
	#N3	Handgun	9.66	6.29	8.31	9.23	10.91	10.88	10.92	11.16	7.95	11.37	11.53	11.72
	#N4	Alcohol bottle	7.08	3.35	1.84	7.57	6.15	6.22	6.16	6.17	6.72	7.29	7.51	9.48
	#N5	Utility knife	6.19	5.63	3.73	5.82	6.81	6.86	6.84	6.87	6.23	7.93	8.26	8.95
Scenario 3	#N1	Water bottle	10.21	6.54	1.08	11.96	12.41	12.39	12.45	12.61	3.66	13.62	13.90	14.77
	#N2	Alcohol bottle	9.75	11.24	8.71	10.57	12.13	12.15	12.18	12.77	12.55	13.15	13.49	15.73
	#N3	Handgun	10.17	10.49	7.02	10.03	11.29	11.31	11.05	11.34	10.99	11.92	12.17	13.24
	#N4	Glue	5.35	2.81	2.05	3.65	4.64	4.63	4.83	5.03	5.77	6.85	7.08	8.31
	#N5	Ceramic knife	2.15	1.32	0.86	1.73	2.68	2.71	2.76	3.54	3.89	4.57	4.65	6.87
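
To make the margins in Table 2 easier to read, the sketch below computes, for each Scenario 1 object, the SNR difference in dB between T_Proposed and the strongest competing method. The numbers are copied from the Scenario 1 rows of Table 2; the helper name `gain_over_best_baseline` and the ASCII method labels are introduced only for this example.

```python
# Column order of Table 2 (ASCII labels used in place of the subscripted names).
METHODS = ["T_B0", "T_B45", "T_B90", "T_B135", "T_PSA", "T_PCA",
           "T_DCT", "T_LPF", "T_SF", "T_FDFuse", "T_LSRNet", "T_Proposed"]

# SNR in dB, Scenario 1 rows of Table 2.
SCENARIO_1 = {
    "Metal pliers":   [6.89, 7.15, 5.72, 6.01, 7.67, 7.84, 7.69, 7.98, 7.18, 8.08, 8.19, 9.39],
    "Utility knife":  [10.01, 10.71, 8.45, 10.43, 11.79, 11.78, 11.66, 12.16, 11.51, 12.15, 12.33, 13.56],
    "Alcohol bottle": [11.40, 11.35, 10.19, 10.81, 11.76, 12.08, 12.06, 12.27, 11.54, 12.71, 12.94, 13.61],
    "Mobile phone":   [9.13, 9.46, 7.16, 10.05, 10.42, 10.49, 10.46, 10.65, 10.28, 10.86, 11.02, 11.35],
    "Charging case":  [10.72, 10.42, 8.72, 9.58, 11.39, 12.46, 12.41, 12.55, 11.67, 12.04, 12.26, 13.03],
}

def gain_over_best_baseline(row):
    """Return (best competing method, T_Proposed minus that method, in dB)."""
    *baselines, proposed = row
    best_idx = max(range(len(baselines)), key=baselines.__getitem__)
    return METHODS[best_idx], round(proposed - baselines[best_idx], 2)

gains = {obj: gain_over_best_baseline(row) for obj, row in SCENARIO_1.items()}
for obj, (method, gain_db) in gains.items():
    print(f"{obj:<15} best baseline: {method:<9} gain: {gain_db:+.2f} dB")

mean_gain = sum(g for _, g in gains.values()) / len(gains)
print(f"Mean gain over the best baseline: {mean_gain:.2f} dB")
```

On these rows the strongest competitor is T_LSRNet for four objects and T_LPF for the charging case, and the margin of T_Proposed ranges from 0.33 dB (mobile phone) to 1.23 dB (utility knife).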

Table 3. SNR (Unit: dB) for each concealed object in the decomposition architecture ablation experiment. The bold values indicate the best performance.

Decomposition	Metal Pliers	Utility Knife	Alcohol Bottle	Mobile Phone	Charging Case
GF only	6.86	10.98	9.45	8.12	10.62
WACF only	7.31	11.67	11.38	9.71	11.70
GWACF (complete)	9.39	13.56	13.61	11.35	13.03

Table 4. SNR (Unit: dB) for each concealed object in the texture layer fusion ablation experiment. The bold values indicate the best performance.

Texture Layer Fusion	Metal Pliers	Utility Knife	Alcohol Bottle	Mobile Phone	Charging Case
PCNN (without MSMG)	8.23	11.92	12.78	10.16	11.57
PCNN + MSMG (complete)	9.39	13.56	13.61	11.35	13.03

Table 5. SNR (Unit: dB) for each concealed object in the background layer fusion ablation experiment. The bold values indicate the best performance.

Background Layer Fusion	Metal Pliers	Utility Knife	Alcohol Bottle	Mobile Phone	Charging Case
Simple weighted average	8.77	12.81	12.93	10.82	12.35
EA fusion (complete)	9.39	13.56	13.61	11.35	13.03

Table 6. SNR (Unit: dB) for each concealed object in the decomposition layer ablation experiment. The bold values indicate the best performance.

Layers Used	Metal Pliers	Utility Knife	Alcohol Bottle	Mobile Phone	Charging Case
BS only	5.41	8.85	8.08	6.93	7.74
BS + FS	7.53	10.75	10.66	8.51	9.18
BS + FS + CS (complete)	9.39	13.56	13.61	11.35	13.03

Table 7. SNR (Unit: dB) for each concealed object in the input construction ablation experiment. The bold values indicate the best performance.

Input Construction	Metal Pliers	Utility Knife	Alcohol Bottle	Mobile Phone	Charging Case
T_B₀, T_B₄₅, T_B₉₀, T_B₁₃₅ direct fusion	8.88	12.49	12.73	10.17	11.36
T_A/T_B construction (complete)	9.39	13.56	13.61	11.35	13.03

Table 8. Average execution time of various methods (Unit: s).

Method	Average Execution Time
T_PSA	0.0085
T_PCA	0.2686
T_DCT	0.0153
T_LPF	0.3631
T_SF	0.3327
T_FDFuse	0.0968
T_LSRNet	0.1501
T_Proposed	0.6873
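
The timings in Table 8 are easier to compare as ratios. The sketch below divides the average execution time of T_Proposed by that of each other method; the values are copied from Table 8, while the helper name `slowdown_vs` and the reading of each figure as the time for one complete fusion are assumptions of this example.

```python
# Average execution time in seconds, copied from Table 8.
EXEC_TIME_S = {
    "T_PSA": 0.0085,
    "T_PCA": 0.2686,
    "T_DCT": 0.0153,
    "T_LPF": 0.3631,
    "T_SF": 0.3327,
    "T_FDFuse": 0.0968,
    "T_LSRNet": 0.1501,
    "T_Proposed": 0.6873,
}

def slowdown_vs(method, reference="T_Proposed"):
    """How many times longer `reference` runs than `method`."""
    return EXEC_TIME_S[reference] / EXEC_TIME_S[method]

for name in EXEC_TIME_S:
    if name != "T_Proposed":
        print(f"T_Proposed takes {slowdown_vs(name):5.1f}x the time of {name}")

# Throughput, assuming each timing covers one complete fused image.
images_per_second = 1.0 / EXEC_TIME_S["T_Proposed"]
print(f"T_Proposed: about {images_per_second:.2f} fused images per second")
```

With these figures, T_Proposed takes roughly 1.9 times as long as T_LPF, the next slowest method, and about 81 times as long as T_PSA, the fastest.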

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Chen, X.; Hu, F.; Zhu, D.; Su, J.; Fang, B.; Tao, J. A Multi-Scale Edge-Preserving Decomposition and Fusion Framework for Multi-Polarization Passive Millimeter-Wave Imaging. Sensors 2026, 26, 3577. https://doi.org/10.3390/s26113577

