Optical Remote Sensing Image Classification Based on Quantum Statistics

Li, Xiaoli; Zhao, Longlong; Li, Hongzhong; Chen, Pan; Sun, Luyi; Guo, Shanxin; Zhao, Xuemei; Chen, Jinsong

doi:10.3390/electronics15102075

Open AccessArticle

Optical Remote Sensing Image Classification Based on Quantum Statistics

by

Xiaoli Li

^1,2

,

Longlong Zhao

^1,2

,

Hongzhong Li

^1,2

,

Pan Chen

^1,2,

Luyi Sun

^1,2

,

Shanxin Guo

^1,2

,

Xuemei Zhao

³ and

Jinsong Chen

^1,2,*

¹

Center for Geospatial Information, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China

²

Shenzhen Engineering Laboratory of Ocean Environmental Big Data Analysis and Application, Shenzhen 518055, China

³

School of Electronic Engineering and Automation, Guilin University of Electronic Technology, Guilin 541000, China

^*

Author to whom correspondence should be addressed.

Electronics 2026, 15(10), 2075; https://doi.org/10.3390/electronics15102075

Submission received: 10 April 2026 / Revised: 9 May 2026 / Accepted: 11 May 2026 / Published: 13 May 2026

(This article belongs to the Special Issue Challenges and Prospects in Remote Sensing Data Intelligent Interpretation)

Download

Browse Figures

Versions Notes

Abstract

To address the difficulty of finely classifying complex optical remote sensing images, this paper innovatively proposes a new image classification method based on quantum statistics (QS) inspired by quantum physics. Each pixel in the image is regarded as a fermion, which is one of the fundamental particles in quantum systems. The energy of the energy level where fermions are located is described using the negative logarithm of the distribution that the spectrum of the pixel follows. The Fermi-Dirac distribution, a quantum statistics model used to describe the complex occupation pattern of energy levels by fermions, is employed to characterize the membership relationship between pixels and classes, instead of traditional distance measures and probability measures. Then, the cost function guiding the convergence of classification is defined based on free energy, which is used to describe whether a system is in a state of thermal equilibrium according to energy, temperature, and entropy. To minimize the free energy, the derivative method and the simulated annealing algorithm are adopted to estimate the optimal solution for model parameters. The proposed method can describe complex features more effectively, obtain fine classification results, and overcome the curse of dimensionality in high-dimensional image classification. Finally, the feasibility and effectiveness are verified through qualitative and quantitative analysis of multispectral and hyperspectral image classification experiments.

Keywords:

quantum statistics; Fermi-Dirac distribution; fermions; image classification

1. Introduction

Optical remote sensing (ORS) image classification has always been a key focus in image processing, playing an important role in land cover monitoring, ecological environment monitoring, and many other fields [1,2]. With the improvement of the spatial resolution of ORS images, the distribution of land features is becoming increasingly complex, which brings difficulties and challenges to high-accuracy classification. Although deep learning is the most widely researched method in ORS image classification, the dependence on the quantity and quality of the sample severely restricts its generalization ability [3,4]. In practical applications, there are many cases where the quantity and quality of samples are insufficient. Therefore, unsupervised image classification still holds significant research value [5,6].

In unsupervised image classification, the key lies in how to accurately characterize the relationship between pixels and classes [7]. The most commonly used models primarily include distance measurement and probability measurement. Distance measurement achieves classification by describing the degree of dissimilarity between pixels and classes. Fuzzy C-means (FCM), defined by Euclidean distance, is one of the most representative algorithms [8]. However, Euclidean distance can only describe data with spherical distribution, which is ineffective in describing complex distribution feature data [9]. Considering the significant spatial interaction between pixels in an image, some studies introduce neighborhood pixel constraints to improve image classification accuracy [10,11], such as FCM_S, EnFCM, FLICM et al. To explore the profound influence of neighborhood effects, Wu and Wu [12] proposed the master-slave hierarchy local information driven FCM (MSHLICM), extending the first-order neighborhood to the second order. The noise resistance and robustness have been improved. The anti-noise performance of pixel-based algorithms is limited. Therefore, object-based methods have been vigorously developed [13]. It groups similar pixels into objects and then performs classification by taking the superpixel as the basic unit. Lei et al. [14] proposed the superpixel-based fast FCM (SF_FCM). It defines a multiscale morphological gradient reconstruction operation to obtain superpixels. Then, it performs FCM based on the histogram of superpixels. SF_FCM improves noise resistance but relies too much on the quality of the superpixel segmentation results. To increase the flexibility of superpixels, Li et al. [15] proposed a fully fuzzy Voronoi tessellation FCM algorithm (FVT-FCM). It extends superpixels to fuzzy superpixels and optimizes the fuzzy superpixel segmentation results according to the objective function. FVT-FCM can maintain high noise resistance and detail preservation. However, the distance measurements used above are sensitive to noise, so some probability measures used to describe the statistical characteristics of classes have been studied [16]. Based on the scalability of FCM, Chatzis and Varvarigou [17] proposed HMRF-FCM. It assumes that the spectrum of pixels follows a Gaussian distribution. Then, the probability density function is used to describe the degree to which pixels belong to a class. To adapt to different complex distribution situations, the finite mixture model (FMM), which models the statistical distribution of data using multiple linear combinations from the same distribution, is developed in image classification [18], such as the Gaussian mixture model (GMM) [19], Gamma mixture model (GaMM) [20], and Student’s-t mixture model (SMM) [21]. Shi and Li [22] proposed a hierarchical mixed model image segmentation method (HMM). The components of the mixed model can be selected based on the statistical distribution characteristics of the image. Zhao et al. [23] proved that GMM can approximate any complex distribution. Therefore, GMM can be used to model the general HMM, known as HGMM. Finally, it obtains segmentation results by optimizing the parameters of the hybrid model within the Bayesian framework. HGMM provides an effective approach for modeling complex image features, but the parameter optimization is inefficient.

To overcome the dependence of deep learning on samples, a series of unsupervised deep learning methods have been studied, such as Deepcluster [3], SWAV [4], DINO [24], PiCIE [5], Diffseg [25] et al. Deepcluster first combined clustering with deep learning. It performs K-means clustering on the features produced by the convnet and updates its weight by predicting the cluster assignments as pseudo-labels in a discriminative loss. To avoid the computational complexity caused by feature comparison, Caron et al. [4] proposed SWAV. It proposed a new paradigm that compares cluster assignments, allowing for contrast between different image views while not relying on explicit pairwise feature comparisons. Furthermore, DINO utilized the advantage of the Vision Transformer’s global perception to further improve classification accuracy. However, the above three algorithms are all image-level classification and are not suitable for pixel-level classification of remote sensing images. Although PiCIE extends unsupervised deep learning to the pixel-level, it still requires a large number of unlabeled samples for training. In order to completely get rid of the sample, Tian et al. [25] proposed Diffseg. It introduces a simple yet effective iterative merging process based on measuring the KL divergence among attention maps to merge them into valid segmentation masks. Diffseg does not require any training or language dependency to extract quality pixel-level classification for any images.

The above methods have certain limitations in the fine classification of spatial levels. To overcome this issue and describe complex ORS image features more effectively, an ORS image classification algorithm based on quantum statistics (QS) is proposed in this paper. The classification process of complex images is described by referencing quantum physics theory. The fundamental particles of quantum systems include bosons and fermions. Bosons are not exclusive and can easily lead to system collapse, resulting in class loss when applied in image classification. Following the Pauli exclusion principle, fermions can only occupy one quantum state, which is consistent with image classification, where a pixel can only belong to one class. Therefore, fermions and their theories will serve as the modeling foundation for this article. First, each pixel in the image is regarded as a fermion. The negative logarithm of the probability distribution, followed by pixel spectra, is used to describe the energy of the energy level where fermions are located. The membership relationship between pixels and classes is defined based on the Fermi-Dirac statistical distribution that is used to depict the complex physical processes of which energy level is occupied by fermions. This modeling approach, based on quantum statistics, can better describe complex situations than distance measurement and probability measurement, and it does not rely on neighborhood effects. Meanwhile, it also avoids the curse of dimensionality problem, i.e., the higher the data dimension, the more severe the invalidity of distance measurement, and the lower the accuracy. Then, the cost function for classification is modeled by free energy, which is a physical quantity that describes whether a system is in a state of thermal equilibrium according to energy, temperature, and entropy. Finally, the model parameters are optimized using the simulated annealing algorithm, with the criterion of minimizing free energy.

The main contributions of this paper are summarized as follows:

(1): This paper systematically proposes a classification method based on quantum statistics, combining quantum physics theory and image processing theory. It establishes a one-to-one correspondence between quantum systems and image classification processes.
(2): It can better describe complex classification processes and obtain fine classification results.
(3): It can overcome the curse of dimensionality.

This paper is organized as follows: Section 2 reviews the Fermi-Dirac distribution theory. Section 3 introduces the proposed algorithm. Section 4 discusses the feasibility and effectiveness of the proposed algorithm through multispectral and hyperspectral image classification experiments. Section 5 discusses the impact of model parameters, spectral dimensional robustness, time complexity and remaining challenges. Section 6 is the conclusion.

2. Fermi-Dirac Distribution

The Fermi-Dirac distribution describes the occupation probability of fermions at energy levels in a state of thermal equilibrium [26].

〈n_{l}〉 = \frac{1}{e^{(ε_{l} - μ) / k T} + 1},

(1)

where

〈n_{i}〉

is the average number of particles occupying energy level l, ε_l is the energy of energy level l, μ is the chemical potential, k is the Boltzmann constant, k = 1.380649 × 10⁻²³ J/K, and T is the absolute temperature. The Fermi-Dirac distribution is derived based on the Pauli exclusion principle, which stipulates that each quantum state can be occupied by at most one fermion. Therefore, n_l = 0 or n_l = 1, 0 ≤

〈n_{i}〉

≤ 1.

Given a system of mutually independent particles that can be in different energy levels, the total energy E and the total number of particles N are

E = \sum_{l} ε_{l} n_{l}, N = \sum_{l} n_{l},

(2)

When the particle is a fermion, the total energy of the system can be expressed as,

E = \sum_{l} ε_{l} 〈n_{l}〉 = \sum_{l} \frac{ε_{l}}{e^{(ε_{l} - μ) / k T + 1}} .

(3)

The entropy of the fermion system is,

S = - k \sum_{l} \{〈n_{l}〉 \log 〈n_{l}〉 + (1 - 〈n_{l}〉) \log (1 - 〈n_{l}〉)\} .

(4)

Free energy is an important physical quantity for judging whether a system is in an equilibrium state. The smaller the free energy, the closer the system is to an equilibrium state. It can be defined in different ways depending on the physical conditions. When temperature, volume, and particle number remain constant, it is defined as the Helmholtz free energy,

F = E - T S,

(5)

When temperature, pressure, and particle number remain constant, it is defined as the Gibbs free energy,

G = F - P V,

(6)

where P is the pressure, and V is the volume. When the number of particles is variable, it is defined as the grand potential,

Φ = F - μ N .

(7)

3. Methods

Given an image I = {I_i, i = 1, …, n}, where I_i = (I_i₁, I_i₂, …, I_id) is a spectral vector, i is the index of pixels, n is the total number of pixels, and d is the dimension of the image. Image classification is the process of assigning a class label to each pixel, i.e., L = {L_i, i = 1, …, n}, where L_i is the class label of pixel i, L_i ∈ {1, …, l, …, m}, l is the index of classes, and m is the number of classes.

3.1. Classification Model

Assuming that the spectral vectors of pixels within the same class follow an independent and identical multivariate Gaussian distribution, as the Gaussian distribution is the most commonly used distribution, and parameter optimization is convenient.

p (I_{i} | L_{i} = l; θ_{l}) = \frac{1}{{(2 π)}^{d / 2} {|Σ_{l}|}^{1 / 2}} \exp \{- \frac{1}{2} {(I_{i} - μ_{l})}^{T} Σ_{l}^{- 1} (I_{i} - μ_{l})\},

(8)

where θ_l= {μ_l, Σ_l}, μ_l and Σ_l are the mean and covariance of the Gaussian distribution for class l.

If each pixel is regarded as a fermion, then the image constitutes a multiparticle system. To convert image features into energy in quantum systems, the negative logarithm of the Gaussian distribution is modeled. Therefore, the energy occupied by fermions in energy levels can be expressed as

ε_{i l} (θ_{l}) = - \log p (I_{i} | L_{i} = l; θ_{l}) .

(9)

According to Section 2, fermion i occupying energy level l follows a Fermi-Dirac distribution. Pixel i belonging to class l is equivalent to fermion i occupying energy level l. To strictly meet the condition

\sum_{l = 1}^{m} p (L_{i} = l) = 1

, the Fermi-Dirac distribution is normalized, then,

p (L_{i} = l; α_{i}, θ_{l})

is modeled according to Equation (1),

p (L_{i} = l; α_{i}, θ_{l}) = \frac{1 / (\exp β (ε_{i l} - α_{i}) + 1)}{\sum_{l^{'} = 1}^{m} 1 / (\exp β (ε_{i l^{'}} - α_{i}) + 1)},

(10)

where β = 1/kT, α_i is the chemical potential of pixel i.

Therefore, the total energy of the image classification process is modeled according to Equation (3),

E (α, θ) = \sum_{i = 1}^{n} \sum_{l = 1}^{m} p (L_{i} = l; α_{i}, θ_{l}) ε_{i l} .

(11)

The entropy is modeled according to Equation (4),

\begin{array}{l} S (α, θ) = - k \sum_{i = 1}^{n} \sum_{l = 1}^{m} \{p (L_{i} = l; α_{i}, θ_{l}) \log p (L_{i} = l; α_{i}, θ_{l}) + \\ [1 - p (L_{i} = l; α_{i}, θ_{l})] \log [1 - p (L_{i} = l; α_{i}, θ_{l})]\} \end{array} .

(12)

Since the total number of pixels in an image remains unchanged, this paper selects the Helmholtz free energy (Equation (5)) to define the cost function for image classification,

J (α, θ) = E (α, θ) - T \cdot S (α, θ) .

(13)

The parameter solution corresponding to the minimum cost function is

\hat{α}

and

\hat{θ}

,

\{{\hat{α}}_{i}, {\hat{θ}}_{l} : i = 1, \dots, n, l = 1, \dots, k\} = \arg \min J (α, θ) .

(14)

Then, the image classification result is,

{\hat{L}}_{i} = \arg \max_{l} p (L_{i} = l; {\hat{α}}_{i}, {\hat{θ}}_{l}), \forall i .

(15)

3.2. Parameter Solution

For Gaussian distribution parameters θ_l= {μ_l, Σ_l}, the solution can be obtained by the derivative method, i.e.,

\frac{\partial J (α, θ)}{\partial μ_{l}} = 0

, and

\frac{\partial J (α, θ)}{\partial Σ_{l}} = 0

.

μ_{l}^{(t + 1)} = \frac{\sum_{i = 1}^{n} p (L_{i} = l; α_{i}^{(t)}, θ_{l}^{(t)}) I_{i}}{\sum_{i = 1}^{n} p (L_{i} = l; α_{i}^{(t)}, θ_{l}^{(t)})},

(16)

Σ_{l}^{(t + 1)} = \frac{\sum_{i = 1}^{n} p (L_{i} = l; α_{i}^{(t)}, θ_{l}^{(t)}) (I_{i} - μ_{l}^{(t + 1)}) {(I_{i} - μ_{l}^{(t + 1)})}^{T}}{\sum_{i = 1}^{n} p (L_{i} = l; α_{i}^{(t)}, θ_{l}^{(t)})},

(17)

where t is the index of iteration.

For parameter α, the simulated annealing algorithm is used to estimate the optimal solution, assuming that the state of α in the t-th iteration is α^(t) = {α_i^(t), i = 1, …, n}. After being disturbed, its state is α^* = α^(t) + ∆α, where ∆α follows a normal distribution with a mean of 0 and a variance of σ², i.e., ∆α~N(0, σ²). Then, the Helmholtz free energy of the entire system changes from J(α^(t), θ^(t)) to J(α^*, θ^(t)). According to the Metropolis criterion, the acceptance probability of α^(t) becoming α^* is

p (α^{(t)} \to α^{*}) = \{\begin{cases} 1 & if J (α^{*}, θ^{(t)}) < J (α^{(t)}, θ^{(t)}) \\ \exp [- \frac{J (α^{*}, θ^{(t)}) - J (α^{(t)}, θ^{(t)})}{T^{(t)}}] & otherwise \end{cases},

(18)

If accepted, α^(t+1) = α^*, otherwise, α^(t+1) = α^(t).

For the temperature parameter T, the simulated annealing cooling method is used to update the parameters,

T^{(t)} = \frac{T^{(0)}}{\log (1 + t)},

(19)

where T⁽⁰⁾ is the initial value.

3.3. Parameter Initialization

There are seven parameters that need to be initialized, i.e., (1) The number of classes m, (2) Normal distribution parameters σ², (3) Boltzmann constant k, (4) The initial mean of Gaussian distribution μ_l⁽⁰⁾, (5) The initial covariance of Gaussian distribution Σ_l⁽⁰⁾, (6) The initial temperature T⁽⁰⁾, (7) The initial chemical potential α⁽⁰⁾.

For m, it is manually set based on the image. For σ², it is set based on human experience. For k, due to the difference between the image system and the physical system, the given physical parameter values cannot be directly used; based on experience, 0 < k < 2 in this paper. For the mean and covariance of the Gaussian distribution, they are calculated from the initial random classification result, L⁽⁰⁾ = {L_i⁽⁰⁾, i = 1, …, n}.

μ_{l}^{(0)} = \frac{\sum_{L_{i}^{(0)} = l} I_{i}}{n_{l}^{(0)}},

(20)

Σ_{l}^{(0)} = \frac{\sum_{L_{i}^{(0)} = l} (I_{i} - μ_{l}^{(0)}) {(I_{i} - μ_{l}^{(0)})}^{T}}{n_{l}^{(0)}},

(21)

where n_l⁽⁰⁾ is the total number of pixels within class l.

For parameters α⁽⁰⁾ and T⁽⁰⁾, they satisfy the condition shown in Equation (22),

\{\begin{cases} α_{i}^{(0)} = - β^{(0)} ε_{i}^{(0)} \\ \exp \{- β^{(0)} [J (α_{i^{'}}^{(0)}, θ^{(0)}) - J (α_{i^{'}}^{*}, θ^{(0)})]\} = 0.5 \end{cases},

(22)

where ε_i⁽⁰⁾ is calculated according to Equation (9),

ε_{i}^{(0)} = \frac{1}{m} \sum_{l = 1}^{m} ε_{i l}^{(0)},

(23)

3.4. Summary of the Proposed QS Model

The correspondence between image classification and quantum systems is shown in Table 1. The process of the proposed QS algorithm can be summarized as follows.

S1 Parameter initialization, including m, σ², k, μ_l⁽⁰⁾, Σ_l⁽⁰⁾, T⁽⁰⁾, α⁽⁰⁾;

S2 Calculating Gaussian distribution parameter θ_l^(t) = {μ_l^(t), Σ_l^(t)} by Equations (16) and (17);

S3 Calculating the total energy of the image classification system E(α^(t), θ^(t)) by Equation (11);

S4 Calculating the entropy of an image classification system S(α^(t), θ^(t)) by Equation (12);

S5 Calculating the cost function J(α^(t), θ^(t)) by Equation (13);

S6 Updating parameter α^(t+1) by Equation (18);

S7 Updating parameter T^(t+1) by Equation (19);

S8 Repeating S2-S7, until the objective function converges, i.e., |J(α^(t+1), θ^(t+1)) − J(α^(t), θ^(t))| < ξ, or t reaches the threshold, where ξ is the smallest positive number. Then, the classification result is obtained by Equation (15).

4. Results

Extensive experiments are conducted on multispectral and hyperspectral images using FCM [8], MSHLICM [12], SF_FCM [14], FVT-FCM [15], HGMM [22], Diffseg [25], and the proposed QS algorithm. Their characteristics are listed in Table 2. Then, the overall accuracy (OA) and the Kappa coefficient (Ka) are used to quantitatively evaluate the effectiveness of the proposed QS algorithm, where OA is the proportion of correctly classified samples, and Ka is a statistical measure that accounts for the agreement occurring by chance between the classifier’s predictions and the ground truth.

(1) Multi-spectral images: Figure 1(a1) is clipped from the GF1 satellite remote sensing image. The spectral bands include R, G, B, and NIR. The spatial resolution is resampled to 2 m. Figure 1(a2) is clipped from the IKONOS satellite image. The spectral bands include R, G, and B. The spatial resolution is 1 m. Figure 1(b1,b2) is the corresponding ground truth.

(2) Hyperspectral images: Figure 1(a3) is clipped from the Houston 2018 dataset. Its hyperspectral data was captured using an ITRES CASI 1500 in 48 bands with a spectral range of 380–1050 nm at a 1 m ground sampling distance. Figure 1(a4) is clipped from the Pavia University dataset. It was captured using 103 bands with a spectral range of 430–860 nm at 1.3 m spatial resolution. Figure 1(b3,b4) is the corresponding ground truth.

4.1. Multi-Spectral Image Classification

Figure 2a–g are the representative classification results of GF1 with FCM, MSHLICM, SF_FCM, FVT-FCM, HGMM, Diffseg, and the proposed QS algorithm. The enlarged image of the representative area is shown in Figure 3, where Figure 3(a1,b1) is the ground truth, Figure 3(a2–a8) represent the classification results of various algorithms in area 1, and Figure 3(b2–b8) represent area 2. It can be seen that the classification results of FCM are relatively fine, but there are many errors, such as artificial surfaces being misclassified as oceans. MSHLICM has effectively classified oceans, land under construction, and forests, benefiting from their smooth or regular textures. Meanwhile, the master-slave hierarchy of local information blurs the boundaries of land features, making it difficult to distinguish low vegetation and artificial surfaces in complex situations. SF_FCM uses superpixels as the basic unit and has strong spatial constraint capability. It can only classify large areas and cannot achieve fine classification of small area targets. FVT-FCM uses fuzzy superpixels as the basic unit, which is relatively more flexible than SF_FCM in describing the boundaries of ground objects. It can recognize slender artificial surfaces such as roads, but the fixed initialization parameters cannot adapt to the global scope. HGMM uses probability measures to model the similarity between pixels and classes, which greatly improves the classification performance compared to the previous distance measures. However, there are still many misclassifications, such as oceans and artificial surfaces. Although Diffseg has introduced deep learning, it still cannot recognize low vegetation. The proposed QS algorithm is modeled based on quantum theory. It can effortlessly describe complex scenes and achieve fine classification of complex images.

Figure 4a–g are the representative classification results of IKONOS with FCM, MSHLICM, SF_FCM, FVT-FCM, HGMM, Diffseg, and the proposed QS algorithm. The enlarged image of the representative area is shown in Figure 5, where Figure 5(a1,b1) is the ground truth, Figure 5(a2–a8) represents the classification results of various algorithms in area 1, and Figure 5(b2–b8) represents area 2. It can be seen that FCM confuses buildings and roads. Due to the strong neighborhood effect, the classification results of MSHLICM and SF_FCM deviate significantly from the ground truth. FVT-FCM is good, but the classification boundary is too rough. HGMM is also slightly affected by neighborhood constraints, making it difficult to distinguish fine boundaries. Diffseg is similar to MSHLICM; the road is covered by surrounding classes. By contrast, the proposed QS algorithm can obtain high-quality classification results.

To further evaluate the effectiveness of the proposed method from a quantitative perspective, the median and the median absolute deviation of OA and Ka are listed in Table 3. The box plot is shown in Figure 6. It shows that FCM is generally stable at around 60%. Although MSHLICM has higher accuracy than FCM, it has lower Ka, which can be as low as 0.2, as shown in Figure 6(b2). SF_FCM has a small accuracy deviation. The accuracy of FVT-FCM and HGMM has significantly improved, but their accuracy deviation is relatively large. The accuracy of Diffseg varies significantly across different images. For example, IKONOS has an accuracy of 73.55%, but GF1 has an accuracy of 54.73%. The proposed QS algorithm can stabilize above 75 with minimal deviation.

4.2. Hyperspectral Image Classification

To verify the effectiveness of the proposed method for high-dimensional data, hyperspectral images are also experimentally tested. Figure 7a–e are the classification results of the Houston 2018 hyperspectral image with FCM, MSHLICM, SF_FCM, FVT-FCM, HGMM, Diffseg, and the proposed QS algorithm, where the background does not participate in classification. The enlarged image of the representative area is shown in Figure 8, where Figure 8(a1,b1) is the ground truth, Figure 8(a2–a8) represents the classification results of various algorithms in area 1, and Figure 8(b2–b8) represents area 2. It can be seen that FCM, MSHLICM, SF_FCM, and FVT-FCM algorithms modeled based on Euclidean distance all rely to varying degrees on the proximity of the image space to classify, due to interference from high-dimensional features. SF_FCM is the most severe. HGMM modeled by the Gaussian mixture distribution has overcome this problem to some extent, but it still misclassifies the non-residential buildings and roads. Diffseg automatically learns features by deep learning and overcomes the curse of high dimensionality to some extent. The proposed QS algorithms, modeled based on the Fermi-Dirac distribution, can effectively overcome the disasters caused by high-dimensional data and obtain better classification results, as shown in Figure 7g and Figure 8(a8,b8).

Figure 9a–e are the classification results of the Pavia University hyperspectral image with FCM, MSHLICM, SF_FCM, FVT-FCM, HGMM, Diffseg, and the proposed QS algorithm, where the background does not participate in classification. The enlarged image of the representative area is shown in Figure 10, where Figure 10(a1,b1) is the ground truth, Figure 10(a2–a8) represents the classification results of various algorithms in area 1, and Figure 10(b2–b8) represents area 2. It can be seen that FCM, MSHLICM, SF_FCM, and FVT-FCM all exhibit varying degrees of classification based on spatial distance because of the ineffective distance measurement in high-dimensional space. A large number of self-blocking bricks are classified as meadows. The results of HGMM have improved, but it also cannot distinguish self-blocking bricks and meadows. Diffseg achieves outstanding results by benefiting from deep learning. The same goes for proposing methods, which can obtain the result closest to the ground truth.

Table 4 is the quantitative evaluation of hyperspectral image classification. Figure 11 is the corresponding box plot. Although MSHLICM and SF_FCM have higher accuracy than FCM and FVT-FCM, their class numbers are often missing. In this situation, accuracy is severely affected by categories with larger areas, resulting in the illusion of false height. The accuracy of HGMM remains around 70%. Diffseg completely overcame the curse of dimensionality, with an accuracy of up to 95.87%. The proposed QS obtains similar results to Diffseg in the Houston 2018 image. The OA is 81.88, the Ka is 0.73. Although the accuracy of the proposed QS is not as good as Diffseg in the Pavia University image, it is still higher than other algorithms.

5. Discussion

5.1. Parameter Influence Analysis

Among numerous initialization parameters, there are four parameters that need to be manually given, which are m, σ², k, and T⁽⁰⁾. m depends on the image. σ² is constrained by the loss function and has a relatively small impact on the classification results. Therefore, the impact of k and T⁽⁰⁾ on classification accuracy is mainly discussed in this paper by taking the GF1 image as an example, as shown in Figure 12 and Figure 13. It can be seen that when both k and T⁽⁰⁾ are large, the accuracy briefly increases and then rapidly decreases. The main reason is that larger values of k make the algorithm more tolerant of differential features, making it difficult to distinguish classes of similar features. Larger values of T⁽⁰⁾ require more iterations to cool down, which affects the classification performance. When the values of the two are in the middle range, the accuracy changes relatively smoothly. When the values of the two are small, it is easy to achieve higher accuracy values.

5.2. Spectral Dimensional Robustness Analysis

To demonstrate the robustness of the proposed QS algorithm to high-dimensional images, the accuracy variation curve with spectral dimension is plotted, as shown in Figure 14. It shows that the accuracy of the three algorithms based on Euclidean distance modeling, FCM, MSHLICM, and FVT-FCM, shows a trend of first increasing and then slowly decreasing with the increase in dimensions, as shown in Figure 14a. The accuracy of SF_FCM fluctuates greatly. The main reason is the false high accuracy caused by class loss. For HGMM, Diffseg, and the proposed QS, the change in accuracy is relatively stable in the later stage of dimension increase, which shows that they can overcome the curse of dimensionality. Diffseg and the proposed QS maintain a leading position.

5.3. Time Complexity Analysis

To demonstrate the computational efficiency of the proposed method, the time complexity of each algorithm is listed in Table 5, where n is the number of pixels, m is the number of classes, d is the dimension of the spectrum, t is the number of iterations, w is the size of the neighborhood window, v is the number of objects in object-oriented methods, and c is the number of components in the mixed model. It can be seen that FCM has the smallest time complexity; it involves fewer parameters and belongs to linear time complexity. In MSHLICM, the hierarchical master-slave neighborhood models increase time complexity. The time complexity of SF_FCM is closer to that of FCM. FVT-FCM is an object-based classification algorithm, which is also affected by the number of objects. HGMM is severely affected by the components of the Gaussian mixture model, and it requires a large number of iterations to optimize the model parameters, such as 5000–10,000. Unlike other algorithms, Diffseg is based on a neural network and has a fixed input size, 512 × 512, so its time complexity is fixed. Normally, its running time is around 140 s. Although the time complexity of the proposed QS algorithm is affected by dimensionality, it only requires a small number of iterations to achieve stable classification results. Therefore, although its time complexity is not as low as FCM, SF_FCM, and Diffseg, it is still far superior to MSHLICM and HGMM.

5.4. Remaining Challenges

In Section 4, by comparing with the MSHLICM, SF_FCM, FVT-FCM, and HGMM algorithms, modeled with neighborhood constraints, it demonstrates that the proposed QS algorithm can achieve finer classification results when classifying complex features with large spectral variance and peculiar shapes. Compared with the FCM algorithm, the pixel-level algorithm without spatial constraints proves that the proposed QS algorithm can not only obtain fine classification results but also avoid confusing land cover types. By comparing with the Diffseg algorithm, modeled based on deep learning, it further proves the feasibility and effectiveness of the proposed QS method. QS provides a new modeling approach for describing complex images. However, as a new paradigm for image classification, it still faces many challenges, such as the problem of noise resistance to outliers. It can be seen that the proposed method can overcome the influence of most noise, but salt and pepper phenomena may occur in the face of outliers. We will improve this issue from two aspects: (1) Introducing image space neighborhood constraints without disrupting the fermion subsystem using Markov Random Field; (2) Abandoning the fixed invariance of parameter k and exploring a modeling scheme for adaptive k values according to the complexity of the image.

6. Conclusions

In this paper, an image classification based on quantum statistics is proposed. It establishes a one-to-one correspondence between quantum systems and image classification processes and utilizes the complexity of quantum systems to solve the problem of fine classification of complex images. The Fermi-Dirac distribution replaces distance and probability measurement to model the degree of membership of pixels to classes, effectively solving the curse of dimensionality in high-dimensional data. The cost function modeled based on the free energy of the quantum system effectively guides the convergence of classification results. The classification experiments of multispectral and hyperspectral images effectively verified the feasibility and effectiveness of the proposed method. In the future, we will focus on the remaining challenges, addressing the impact of outliers on classification results, and further improving the fine classification accuracy of complex images.

Author Contributions

This research was mainly performed by X.L. and J.C. X.L. contributed with ideas, conceived, and designed the study; J.C. supervised the study, and his comments were considered throughout the paper. L.Z., H.L., and P.C. organized the experimental results. L.S., S.G., and X.Z. aided in proposing the idea, giving suggestions, and revising the rough draft. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 42401403, 42261061, and 42001286, the Guangdong Basic and Applied Basic Research Foundation, grant number 2024A1515030014, the Fundamental Research Foundation of Shenzhen Technology and Innovation Council, grant number KCXST20221021111611029, JCYJ20220818101617037, the Shenzhen Science and Technology Program, grant number KCXFZ20240903093800002, and the Scientific Research Project of Ecology Environment Bureau of Shenzhen Municipality: “Research on the Application of Multi-source Data in the Investigation and Evaluation of Ecosystem Status in Shenzhen”, grant number SZDL2023001387.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Chen, Y.; Li, Y.; Mao, H.; Chai, X.; Jiao, L. A Novel Deep Nearest Neighbor Neural Network for Few-Shot Remote Sensing Image Scene Classification. Remote Sens. 2023, 15, 666. [Google Scholar] [CrossRef]
Xiong, S.; Wu, L.; Zhang, B.; Zhang, D.; Tao, Y.; Tang, Y. HL-SAM-Seg: Complementary High- and Low-Resolution Features Based on SAM for Remote Sensing Image Semantic Segmentation. IEEE Trans. Geosci. Remote Sens. 2026, 64, 4402214. [Google Scholar] [CrossRef]
Caron, M.; Bojanowski, P.; Joulin, A.; Douze, M. Deep Clustering for Unsupervised Learning of Visual Features. In Proceedings of the 15th European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; part XIV, pp. 139–156. [Google Scholar]
Caron, M.; Misra, I.; Mairal, J.; Goyal, P.; Bojanowski, P.; Joulin, A. Unsupervised Learning of Visual Features by Contrasting Cluster Assignments. In Proceedings of the 34th annual Neural Information Processing Systems Conference (NeurIPS 2020), Vancouver, BC, Canada, 6–12 December 2020. [Google Scholar]
Cho, J.; Mall, U.; Bala, K.; Hariharan, B. PiCIE: Unsupervised Semantic Segmentation using Invariance and Equivariance in Clustering. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 16789–16799. [Google Scholar]
Xia, G.S.; Hu, J.; Hu, F.; Shi, B.; Bai, X.; Zhong, Y.; Liu, D.; Zhang, L. Remote Sensing Image Scene Classification: Benchmark and State of the Art. Proc. IEEE 2017, 105, 1865–1883. [Google Scholar] [CrossRef]
Tang, H.; Zhang, C.; Tang, D.; Lin, X.; Yang, X.; Xie, W. Few-Shot Hyperspectral Image Classification with Deep Fuzzy Metric Learning. IEEE Geosci. Remote Sens. Lett. 2025, 22, 5502205. [Google Scholar] [CrossRef]
Wu, C.; Zhang, J. One-Step Joint Learning of Self-Supervised Spectral Clustering with Anchor Graph and Fuzzy Clustering for Land Cover Classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 11178–11193. [Google Scholar] [CrossRef]
Yu, H.; Xie, S.; Fan, J.; Lan, R.; Lei, B. Mahalanobis-Kernel Distance-Based Suppressed Possibilistic C-Means Clustering Algorithm for Imbalanced Image Segmentation. IEEE Trans. Fuzzy Syst. 2024, 32, 4595–4609. [Google Scholar] [CrossRef]
Gong, M.; Zhou, Z.; Ma, J. Change Detection in Synthetic Aperture Radar Images Based On Image Fusion and Fuzzy Clustering. IEEE Trans. Image Process. 2012, 21, 2141–2151. [Google Scholar] [CrossRef]
Memon, K.H.; Lee, D. Generalised Kernel Weighted Fuzzy C-Means Clustering Algorithm with Local Information. Fuzzy Set. Syst. 2018, 340, 91–108. [Google Scholar] [CrossRef]
Wu, C.; Wu, W. Master-Slave Hierarchy Local Information Driven Fuzzy C-Means Clustering For Noisy Image Segmentation. Vis. Comput. 2024, 40, 865–897. [Google Scholar] [CrossRef]
Yang, X.; Chen, Z.; Zhang, B.; Li, B.; Bai, Y.; Chen, P. A Block Shuffle Network with Superpixel Optimization for Landsat Image Semantic Segmentation. Remote Sens. 2022, 14, 1432. [Google Scholar] [CrossRef]
Lei, T.; Jia, X.; Zhang, Y.; Liu, S.; Meng, H.; Nandi, A.K. Superpixel-Based Fast Fuzzy C-Means Clustering For Color Image Segmentation. IEEE Trans. Fuzzy Syst. 2019, 27, 1753–1766. [Google Scholar] [CrossRef]
Li, X.; Zhao, L.; Li, H.; Sun, L.; Chen, P.; Jiang, R.; Chen, J. Unsupervised Image Classification Based on Fully Fuzzy Voronoi Tessellation. Appl. Sci. 2024, 14, 11227. [Google Scholar] [CrossRef]
Tang, L.; Zhang, H.; Xu, Y.; Ren, Y.; Li, C. A Hybrid Variational Level Set Model Based On Double Gaussian Distribution Fitting Energy For Image Segmentation And Bias Correction. IET Image Proc. 2024, 18, 4461–4491. [Google Scholar] [CrossRef]
Chatzis, S.P.; Varvarigou, T. A Fuzzy Clustering Approach Toward Hidden Markov Random Field Models For Enhanced Spatially Constrained Image Segmentation. IEEE Trans. Fuzzy Syst. 2008, 16, 1351–1361. [Google Scholar] [CrossRef]
Tamandi, M. Finite Mixtures of Skew Distributions and Their Application in Image Segmentation. J. Stat. Theory Pract. 2025, 19, 60. [Google Scholar] [CrossRef]
Yin, S.; Wang, L.; Wang, Q.; Yang, J.; Jiang, M. Remote Sensing Image Segmentation Based on a Novel Gaussian Mixture Model and SURF Algorithm. Int. J. Swarm Intell. Res. 2023, 14, 1–15. [Google Scholar] [CrossRef]
Pappas, O.A.; Anantrasirichai, N.; Achim, A.M.; Adams, B.A. River Planform Extraction From High-Resolution SAR Images via Generalized Gamma Distribution Superpixel Classification. IEEE Trans. Geosci. Remote Sens. 2021, 59, 3942–3955. [Google Scholar] [CrossRef]
Shi, X.; Wang, Y.; Li, Y.; Dou, S. Remote Sensing Image Segmentation Based on Hierarchical Student’s-t Mixture Model and Spatial Constrains with Adaptive Smoothing. Remote Sens. 2023, 15, 828. [Google Scholar] [CrossRef]
Shi, X.; Li, Y. A Hierarchical Gamma Mixture Model Toward Hidden Markov Random Field for High-Resolution SAR Image Segmentation. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5218114. [Google Scholar] [CrossRef]
Zhao, C.; Dong, G.; Basu, A. Learning Distributions via Monte-Carlo Marginalization. arXiv 2023, arXiv:2308.06352. [Google Scholar] [CrossRef]
Caron, M.; Touvron, H.; Misra, I.; Jegou, H.; Mairal, J.; Bojanowski, P.; Joulin, A. Emerging Properties in Self-Supervised Vision Transformers. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021. [Google Scholar]
Tian, J.; Aggarwal, L.; Colaco, A. Diffuse, Attend, And Segment: Unsupervised Zero-Shot Segmentation Using. In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023. [Google Scholar]
Tai, Y.L.; Huang, S.J.; Chen, C.C.; Lu, H.H.S. Computational Complexity Reduction of Neural Networks of Brain Tumor Image Segmentation by Introducing Fermi–Dirac Correction Functions. Entropy 2021, 23, 223. [Google Scholar] [CrossRef] [PubMed]

Figure 1. Experimental data. (a1) GF1; (a2) IKONOS; (a3) Houston 2018; (a4) Pavia University. (b1–b4) Ground truth of (a1–a4).

Figure 2. Classification results of GF1. (a) FCM; (b) MSHLICM; (c) SF_FCM; (d) FVT-FCM; (e) HGMM; (f) Diffseg; (g) Proposed QS.

Figure 3. Enlarged view of areas 1 and 2 in GF1 image classification results. (a1) and (b1) Ground truth; (a2) and (b2) FCM; (a3) and (b3) MSHLICM; (a4) and (b4) SF_FCM; (a5) and (b5) FVT-FCM; (a6) and (b6) HGMM; (a7) and (b7) Diffseg; (a8) and (b8) Proposed QS.

Figure 4. Classification results of IKONOS. (a) FCM; (b) MSHLICM; (c) SF_FCM; (d) FVT-FCM; (e) HGMM; (f) Diffseg; (g) Proposed QS.

Figure 5. Enlarged view of areas 1 and 2 in IKONOS image classification results. (a1,b1) Ground truth; (a2,b2) FCM; (a3,b3) MSHLICM; (a4,b4) SF_FCM; (a5,b5) FVT-FCM; (a6,b6) HGMM; (a7,b7) Diffseg; (a8,b8) Proposed QS.

Figure 6. Box plot of multi-spectral image classification results. (a1,a2) OA and Ka of GF1; (b1,b2) OA and Ka of IKONOS.

Figure 7. Classification results of Houston 2018. (a) FCM; (b) MSHLICM; (c) SF_FCM; (d) FVT-FCM; (e) HGMM; (f) Diffseg; (g) Proposed QS.

Figure 8. Enlarged view of areas 1 and 2 in the Houston 2018 image classification results. (a1,b1) Ground truth; (a2,b2) FCM; (a3,b3) MSHLICM; (a4,b4) SF_FCM; (a5,b5) FVT-FCM; (a6,b6) HGMM; (a7,b7) Diffseg; (a8,b8) Proposed QS.

Figure 9. Classification results of Pavia University. (a) FCM; (b) MSHLICM; (c) SF_FCM; (d) FVT-FCM; (e) HGMM; (f) Diffseg; (g) Proposed QS.

Figure 10. Enlarged view of areas 1 and 2 in Pavia University image classification results. (a1,b1) Ground truth; (a2,b2) FCM; (a3,b3) MSHLICM; (a4,b4) SF_FCM; (a5,b5) FVT-FCM; (a6,b6) HGMM; (a7,b7) Diffseg; (a8,b8) Proposed QS.

Figure 11. Box plot of hyperspectral image classification results. (a1,a2) OA and Ka of Houston 2018; (b1,b2) OA and Ka of Pavia University.

Figure 12. Accuracy curve with T⁽⁰⁾ variation at different k values.

Figure 13. Accuracy curve with k variation at different T⁽⁰⁾ values.

Figure 14. Accuracy variation curve with spectral dimension. (a) Houston 2018; (b) Pavia University.

Table 1. The correspondence between the image classification system and the quantum system.

Image Classification System	Quantum System
Image	Multi-particle system
Pixel	Fermion
Class	Energy level
The negative logarithm of the distribution that the spectrum of the pixel follows.	The energy of the energy level where the fermion is located
The membership relationship between pixels and classes	The probability of occupying energy levels by fermions, modeled by the Fermi-Dirac distribution
Cost function for image classification	Free energy

Table 2. The characteristics of comparing algorithms and the proposed QS algorithm.

Algorithms	Basic Unit	Measurement Model	Spatial Constraint
FCM	Pixels	Distance measurement	No
MSHLICM	Pixels	Distance measurement	Yes
SF_FCM	Superpixels	Distance measurement	Yes
FVT-FCM	Fuzzy superpixels	Distance measurement	Yes
HGMM	Pixels	Probability measurement	Yes
Diffseg	Pixels	Deep learning	Yes
Proposed QS	Pixels	Quantum statistical measurement	No

Table 3. Classification accuracy of multi-spectral images.

Images	Accuracy	FCM	MSHLICM	SF_FCM	FVT-FCM	HGMM	Diffseg	Proposed QS
GF1	OA(%)	66.00 ± 0.99	68.67 ± 0.80	61.97 ± 1.50	67.95 ± 1.60	66.86 ± 5.52	54.73 ± 0.15	75.27 ± 1.68
GF1	Ka	0.58 ± 0.01	0.60 ± 0.01	0.53 ± 0.02	0.58 ± 0.01	0.55 ± 0.05	0.42 ± 0.00	0.67 ± 0.02
IKONOS	OA(%)	61.70 ± 0.25	51.92 ± 5.86	47.03 ± 1.41	75.97 ± 3.45	77.73 ± 6.47	73.55 ± 0.03	80.73 ± 1.10
IKONOS	Ka	0.42 ± 0.00	0.19 ± 0.08	0.15 ± 0.02	0.60 ± 0.07	0.38 ± 0.16	0.56 ± 0.00	0.71 ± 0.01

Table 4. Classification accuracy of hyperspectral images.

Images	Accuracy	FCM	MSHLICM	SF_FCM	FVT-FCM	HGMM	Diffseg	Proposed QS
Houston 2018	OA	51.00 ± 4.98	72.16 ± 0.01	53.61 ± 1.70	63.89 ± 2.36	67.08 ± 6.01	82.06 ± 0.97	81.88 ± 1.21
Houston 2018	Ka	0.30 ± 0.12	0.54 ± 0.01	0.30 ± 0.02	0.47 ± 0.02	0.48 ± 0.07	0.71 ± 0.02	0.73 ± 0.02
Pavia University	OA	76.24 ± 0.83	50.35 ± 2.38	80.85 ± 1.46	77.07 ± 2.18	70.33 ± 5.75	95.87 ± 3.99	80.55 ± 4.22
Pavia University	Ka	0.67 ± 0.01	0.43 ± 0.04	0.73 ± 0.01	0.66 ± 0.05	0.68 ± 0.12	0.94 ± 0.05	0.74 ± 0.02

Table 5. Time complexity.

Algorithms	Time Complexity
FCM	O(nmdt)
MSHLICM	O(nmdwt+nmw²t)
SF_FCM	O(nmdt+nwt)
FVT-FCM	O(nmdvt)
HGMM	O(nmd²tc+nmwt)
Diffseg	O(1)
Proposed QS	O(nmd²t)

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Li, X.; Zhao, L.; Li, H.; Chen, P.; Sun, L.; Guo, S.; Zhao, X.; Chen, J. Optical Remote Sensing Image Classification Based on Quantum Statistics. Electronics 2026, 15, 2075. https://doi.org/10.3390/electronics15102075

AMA Style

Li X, Zhao L, Li H, Chen P, Sun L, Guo S, Zhao X, Chen J. Optical Remote Sensing Image Classification Based on Quantum Statistics. Electronics. 2026; 15(10):2075. https://doi.org/10.3390/electronics15102075

Chicago/Turabian Style

Li, Xiaoli, Longlong Zhao, Hongzhong Li, Pan Chen, Luyi Sun, Shanxin Guo, Xuemei Zhao, and Jinsong Chen. 2026. "Optical Remote Sensing Image Classification Based on Quantum Statistics" Electronics 15, no. 10: 2075. https://doi.org/10.3390/electronics15102075

APA Style

Li, X., Zhao, L., Li, H., Chen, P., Sun, L., Guo, S., Zhao, X., & Chen, J. (2026). Optical Remote Sensing Image Classification Based on Quantum Statistics. Electronics, 15(10), 2075. https://doi.org/10.3390/electronics15102075

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Optical Remote Sensing Image Classification Based on Quantum Statistics

Abstract

1. Introduction

2. Fermi-Dirac Distribution

3. Methods

3.1. Classification Model

3.2. Parameter Solution

3.3. Parameter Initialization

3.4. Summary of the Proposed QS Model

4. Results

4.1. Multi-Spectral Image Classification

4.2. Hyperspectral Image Classification

5. Discussion

5.1. Parameter Influence Analysis

5.2. Spectral Dimensional Robustness Analysis

5.3. Time Complexity Analysis

5.4. Remaining Challenges

6. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI