CV-CPKAN: Complex-Valued Convolutional Kolmogorov–Arnold Framework for PolSAR Image Classification

Kuang, Zuzheng; Liu, Shuxin; Bi, Haixia; He, Lijun; Li, Fan

doi:10.3390/rs18020330


by Zuzheng Kuang ¹, Shuxin Liu ², Haixia Bi ¹,*, Lijun He ¹ and Fan Li ¹

¹ School of Information and Communications Engineering, Xi’an Jiaotong University, Xi’an 710049, China
² Southwestern Technical Institute of Physics, China Academy of Engineering Physics, Chengdu 610041, China
* Author to whom correspondence should be addressed.

Remote Sens. 2026, 18(2), 330; https://doi.org/10.3390/rs18020330

Submission received: 18 December 2025 / Revised: 5 January 2026 / Accepted: 12 January 2026 / Published: 19 January 2026

(This article belongs to the Section AI Remote Sensing)


Highlights

What are the main findings?

A novel complex-valued convolutional Kolmogorov–Arnold framework (CV-CPKAN) is proposed, which integrates complex-valued KAN convolution layers and a multi-branch block (MBComplexKConv) to effectively extract both amplitude and phase features from PolSAR data.
CV-CPKAN achieves state-of-the-art classification performance on three benchmark PolSAR datasets while maintaining low computational cost and strong generalization ability.

What are the implications of the main findings?

The study demonstrates that combining KAN-based nonlinear mappings with convolutional operations in the complex domain enhances feature representation and classification accuracy for PolSAR data, offering a new architectural direction beyond CNNs and Transformers.
CV-CPKAN provides a parameter-efficient and robust framework for PolSAR data analysis, with potential applicability to other complex-valued remote sensing tasks requiring joint modeling of amplitude and phase information.

Abstract

Deep learning has significantly advanced PolSAR image processing, with a growing trend of integrating mathematical theories into deep neural networks to enhance their capabilities with regard to complex data. Kolmogorov–Arnold networks (KANs), which leverage nonlinear mappings derived from the Kolmogorov–Arnold theorem for automatic feature extraction, present a promising yet underexplored direction for PolSAR image classification. However, existing real-valued KAN-based layers fall short in effectively exploiting the complex-valued characteristics of PolSAR data, overlooking the important phase information. In this paper, we propose a complex-valued convolutional Kolmogorov–Arnold framework for PolSAR image classification (CV-CPKAN). The framework introduces complex KAN convolution layers, which are further employed to construct a multi-branch complex KAN convolution (MBComplexKConv) block, effectively extracting multi-scale features from both the amplitude and phase components of PolSAR data. Additionally, a complex-valued variant of PolyLoss (CV-PolyLoss) is proposed as our classification loss function. Through extensive evaluations on three benchmark PolSAR datasets, CV-CPKAN consistently surpasses state-of-the-art models based on CNN, Transformer and Mamba, achieving overall accuracies of 99.86%, 99.80% and 99.74% on Flevoland, San Francisco and Oberpfaffenhofen datasets, respectively. These results underscore the effectiveness of integrating convolutions with KAN-based nonlinear mapping, providing a new avenue for further research in PolSAR image classification.

Keywords:

PolSAR image classification; complex-valued deep neural network (CV-DNN); Kolmogorov–Arnold convolution layers; convolutional Kolmogorov–Arnold network (convolutional KAN)


1. Introduction

As an advanced microwave imaging system, polarimetric synthetic aperture radar (PolSAR) enables comprehensive Earth surface observations independent of solar illumination or atmospheric constraints. PolSAR systems enrich scattering information by transmitting horizontally and vertically polarized microwaves and by receiving echoes in four polarization modes [1]. PolSAR image classification involves assigning each pixel to a specific land cover class. This task exhibits significant potential for various applications such as environmental monitoring and agricultural yield prediction [2,3]. Initial PolSAR image classification approaches mainly relied on target decomposition models [4] or statistical scattering models [5].

Traditional PolSAR image classification methods are largely model-driven, relying on manually designed features derived from the original data or its simple transformations. Their feature extraction process generally involves two main categories, i.e., polarimetric feature extraction and image feature extraction [6]. Polarimetric target decomposition is a fundamental technique for polarimetric feature extraction, aiming to recover physically meaningful parameters from the observed data. These methods can be broadly categorized into coherent target decompositions, such as the Pauli [4] and Krogager [7] decompositions, and incoherent decompositions. Incoherent decomposition encompasses approaches based on eigenvalue analysis, such as Cloude–Pottier decomposition [4] and Touzi decomposition [8], and those based on physical scattering models, such as Freeman–Durden decomposition [9] and Yamaguchi decomposition [10]. Recognizing that PolSAR data adhere to specific statistical distributions such as the Wishart distribution [11] and the K-distribution [12], researchers have also integrated statistical models within target decomposition theory. In addition to polarimetric features, various image characteristics such as texture [13] and spatial attributes [14] have been utilized for classification.

With the rapid advancement of computer science, machine learning-based methods for PolSAR image classification have become mainstream over the past decade. A variety of machine learning methods such as support vector machines [15], k-nearest neighbors [16] and random forests [17] gained prominence in this field. Nevertheless, the performance of these approaches still relies strongly on manually designed features. The manual design process not only demands extensive domain knowledge but also tends to produce features with limited generalization capability [18]. Owing to their superior data-driven modeling capacities, deep neural networks (DNNs) have been thoroughly studied in the PolSAR domain [19,20,21,22]. Notably, convolutional neural networks (CNNs) contribute significantly to this task by leveraging their translation-equivariant inductive bias through shared kernel parameters [18,23,24,25]. Subsequently, Vision Transformers (ViTs) have attracted growing research interest and have shown promising results in several PolSAR image classification studies [26,27,28]. Nevertheless, the fixed computational nature of both CNNs and ViTs fundamentally constrains their efficiency and potential in adapting to dynamic data characteristics and intricate polarimetric relationships [29,30].

Recent years have witnessed a growing trend of integrating advanced mathematical theories into deep learning frameworks to enhance the capacity of DNNs for modeling complex data structures [31,32,33]. A notable example is Kolmogorov–Arnold Networks (KANs), which provide a novel and promising architecture rooted in the Kolmogorov–Arnold theorem and have been demonstrated as a powerful alternative to multi-layer perceptrons (MLPs) [33]. Unlike the fixed linear transformations in MLPs, KANs employ learnable nonlinear functions in place of weight matrices. By shifting the learning paradigm from adjusting scalar weights to directly learning nonlinear mappings, KANs achieve more parameter-efficient feature extraction and better generalization compared to MLPs [33]. Building upon the advancements in KANs, researchers have integrated the Kolmogorov–Arnold theorem into convolutional layers [30,34]. The authors of [34] suggested that traditional CNNs, which rely on scalar weights and linear transformations, can be improved by incorporating nonlinear parameterizations, which offer greater adaptability. Therefore, KAN convolution layers were proposed as a viable substitute for conventional convolutional layers. In addition, the study in [35] demonstrated the superior ability of KANs in modeling complex mathematical relationships. This makes KAN convolution layers particularly suited for PolSAR data interpretation, where intricate scattering mechanisms dominate.

While KAN convolutions show promise in computer vision, their potential in remote sensing remains largely untapped. Furthermore, existing real-valued KAN convolution layers cannot fully leverage the unique complex-valued nature of PolSAR data, neglecting the scattering information embedded in the phase component. Motivated by these observations, we propose a complex-valued convolutional Kolmogorov–Arnold framework for PolSAR image classification termed CV-CPKAN. The primary contributions of this paper are summarized as follows:

(1): We present CV-CPKAN for PolSAR image classification, advancing the application of KAN convolution layer-based architectures in the PolSAR domain. The proposed model incorporates specially designed complex KAN convolutional layers and an enhanced CV-PolyLoss function, enabling complete polarimetric feature learning from both amplitude and phase components of PolSAR data.
(2): We design a new multi-branch block, MBComplexKConv. It incorporates multi-scale complex KAN convolution layers and complex batch normalization layers, enriching polarimetric representations from different scales.
(3): We evaluate CV-CPKAN together with multiple classical and state-of-the-art models on three datasets. Experimental results validate the superiority of our approach.

The overall structure of this paper is outlined as follows. Section 2.1 and Section 2.2 introduce the related works on CV-CPKAN. Section 2.3, Section 2.4, Section 2.5, Section 2.6 and Section 2.7 introduce the theoretical foundations and then detail our methods. Subsequently, Section 3 and Section 4 delineate experiments on benchmark datasets and offer detailed analyses of different configurations. Finally, Section 5 provides a summary of this study with limitations and future directions.

2. Materials and Methods

2.1. Deep Learning in PolSAR Image Classification

The rapid advancement of deep learning has prompted significant interest in its application to PolSAR image classification [36]. Early work primarily utilized CNNs known for their ability to capture local features [37]. By enabling automatic feature learning from real-valued vectors obtained from coherence matrices, researchers began to apply CNNs to PolSAR data for classification [23]. Zhang et al. [38] initially applied complex-valued CNNs (CV-CNNs) to PolSAR image classification, which outperformed traditional CNNs. Zhang et al. [39] constructed a CNN framework that utilized polarimetric channel powers as input descriptors.

Distinct from CNNs which emphasize local feature extraction, ViTs [40] represent another prevalent deep framework, demonstrating a strong capacity for capturing global dependencies through their attention mechanism [41]. Dong et al. [26] devised a model focusing on global polarimetric representation learning, which was the first to apply ViT to PolSAR data for classification. To further enhance computational efficiency and local sensitivity, Jamali et al. [27] put forward a ViT incorporating local-window attention for PolSAR image classification, which improved feature generalization within spatially localized areas.

Recent advances have seen the emergence of hybrid architectures combining the strengths of CNNs and transformers [29]. To combine local feature learning with global context reasoning, the authors of [28] devised a multi-granularity model based on CNN and ViT that integrated external tokens and cross-attention mechanisms. Following the recent success of state space models in sequential data modeling [42], several Mamba-based approaches have also gained traction in computer vision [43], providing a promising paradigm for remote sensing image analysis [44,45,46].

2.2. KANs and Their Applications in Remote Sensing

Recently, KANs [33] have arisen as a viable substitute for conventional MLPs, reshaping neural network architecture by placing spline-parameterized, learnable activation functions on the connections (weights) rather than fixed activation functions on the neurons. Several KAN variants have been proposed to enhance efficiency and generalization. Li et al. [47] introduced Fast KAN, which substituted B-splines with radial basis functions to reduce computational overhead while maintaining approximation accuracy. Sidharth et al. [48] put forward Chebyshev KAN, which further refined KANs by employing Chebyshev polynomials to enhance numerical stability and adaptability to discretized data such as images. Wolff et al. [49] introduced complex-valued KANs (CVKANs) for complex-valued machine learning tasks, which employ complex-valued radial basis functions and batch normalization.

In remote sensing, several studies have also showcased the considerable potential of KANs. Jamali et al. [50] introduced hybrid KANs for hyperspectral images, utilizing a multi-dimensional architecture combining 1D, 2D and 3D KAN modules to improve classification performance. Integrating KANs with pre-trained CNNs, the authors of [51] accomplished remote sensing scene classification with fewer parameters, attaining high accuracy and faster convergence on the EuroSAT dataset [52]. Li and Ye [53] proposed KANet for hyperspectral image classification, which integrates a 3D-KAN with an adaptive grid update mechanism to capture complex spectral–spatial nonlinear relationships while reducing parameter redundancy. Zhao et al. [54] proposed a boundary-aware network for hyperspectral image classification, combining deformable convolutions with KANs to capture irregular features and mitigate gradient-driven adversarial perturbations.

In summary, our CV-CPKAN differentiates itself from existing PolSAR image classification methods in three key aspects.

(1): We are the first to integrate KAN convolution layers into PolSAR image classification, leveraging the ability of KANs to model complex polarimetric features.
(2): We introduce a novel complex-valued KAN convolution framework, a fundamental architectural hybrid that combines KAN principles with convolutional operations in the complex domain. This sets our approach apart from CVKANs, which are merely the complex extensions of standard KANs.
(3): We devise a plug-and-play, multi-scale KAN-based block, which can enhance feature extraction for complex-valued deep learning tasks.

In the following subsections, we first concisely review the definition of KANs and KAN convolutions in Section 2.3. Then, our devised complex-valued layers are detailed in Section 2.4. Section 2.5 presents the extraction and preprocessing of the raw PolSAR data. Finally, Section 2.6 and Section 2.7 introduce the proposed MBComplexKConv blocks, CV-PolyLoss and optimization strategy.

2.3. KAN Layer to KAN Convolution Layer

Grounded in the Kolmogorov–Arnold theorem, KANs advance network architecture by substituting linear transformations of MLPs with learnable spline functions. Specifically, in a KAN, the connection between the $i$-th neuron in the $l$-th layer and the $j$-th neuron in the $(l+1)$-th layer is represented by a unique activation function $\phi_{l,j,i}$. The network contains a total of $n_l \times n_{l+1}$ activation functions between layer $l$ and layer $l+1$, where $n_l$ and $n_{l+1}$ denote the number of neurons in the $l$-th and $(l+1)$-th layers, respectively. The output $x_{l+1,j}$ of the $j$-th neuron in layer $l+1$ is expressed as

$$x_{l+1,j} = \sum_{i=1}^{n_l} \phi_{l,j,i}(x_{l,i}), \quad j = 1, 2, \dots, n_{l+1}. \tag{1}$$

Here, the nonlinear function $\phi(\cdot)$ is

$$\begin{aligned} \phi(x) &= w_1 \cdot \mathrm{SiLU}(x) + w_2 \cdot \mathrm{Spline}(x), \\ \mathrm{SiLU}(x) &= x / (1 + \exp(-x)), \\ \mathrm{Spline}(x) &= \sum_i s_i B_i(x), \end{aligned} \tag{2}$$

where $x$ stands for the input, $w_1$ and $w_2$ are trainable parameters, $\mathrm{SiLU}(\cdot)$ serves as a basis function and $\mathrm{Spline}(\cdot)$ is defined as a learnable linear combination of B-splines $B_i$ with coefficients $s_i$.
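To make Equations (1) and (2) concrete, the following is a minimal NumPy sketch rather than the authors' implementation. The knot layout (`KNOTS`), the spline degree (`DEGREE`) and the function names `bspline_basis`, `phi` and `kan_layer` are assumptions introduced for illustration; the source specifies only the general form of $\phi$.

```python
import numpy as np

def silu(x):
    """SiLU(x) = x / (1 + exp(-x)), the basis function in Equation (2)."""
    return x / (1.0 + np.exp(-x))

def bspline_basis(x, knots, degree):
    """Evaluate every B-spline basis function B_i(x) with the Cox-de Boor recursion."""
    x = np.asarray(x, dtype=float)[..., None]
    # Degree 0: indicator of each knot interval [t_i, t_{i+1}).
    basis = ((x >= knots[:-1]) & (x < knots[1:])).astype(float)
    for d in range(1, degree + 1):
        left = (x - knots[:-(d + 1)]) / (knots[d:-1] - knots[:-(d + 1)])
        right = (knots[d + 1:] - x) / (knots[d + 1:] - knots[1:-d])
        basis = left * basis[..., :-1] + right * basis[..., 1:]
    return basis  # shape (..., len(knots) - degree - 1)

# Assumption: cubic B-splines on 9 uniform knots, which yields 5 coefficients s_i.
DEGREE = 3
KNOTS = np.linspace(-2.0, 2.0, 9)

def phi(x, w1, w2, s):
    """Equation (2): phi(x) = w1 * SiLU(x) + w2 * sum_i s_i B_i(x)."""
    return w1 * silu(x) + w2 * (bspline_basis(x, KNOTS, DEGREE) @ s)

def kan_layer(x, W1, W2, S):
    """Equation (1): x is (n_in,), W1 and W2 are (n_out, n_in), S is (n_out, n_in, n_basis)."""
    basis = bspline_basis(x, KNOTS, DEGREE)            # (n_in, n_basis)
    spline = np.einsum("jib,ib->ji", S, basis)         # Spline_{j,i}(x_i)
    return (W1 * silu(x) + W2 * spline).sum(axis=1)    # sum over input neurons i
```

Each input/output pair `(i, j)` owns its own `w1`, `w2` and coefficient vector `s`, which is what distinguishes this layer from an MLP layer with one scalar weight per connection.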

Unlike standard convolutions, where each kernel entry is a scalar weight, the kernel of a KAN convolution is composed of an ensemble of univariate nonlinear functions. Given input features $z \in \mathbb{R}^{c \times w \times h}$, with $c$, $w$ and $h$ denoting the channel number, width and height, respectively, the output $z'$ of a KAN convolution layer with kernel size $k_1 \times k_2$ is calculated as follows:

$$\begin{aligned} z'_{ij} &= \sum_{d=1}^{c} \sum_{a=0}^{k_1-1} \sum_{b=0}^{k_2-1} \phi_{d,a,b}(z_{d,i+a,j+b}), \\ i &= 1, 2, \dots, w - k_1 + 1, \quad j = 1, 2, \dots, h - k_2 + 1. \end{aligned} \tag{3}$$

Here, each $\phi_{d,a,b}(\cdot)$ follows the definition in Equation (2).
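The triple sum in Equation (3) can be sketched directly in NumPy. This is an illustrative, unoptimized version for a single output channel without padding; the callable interface `phi(d, a, b, values)` is a hypothetical convenience, not something defined in the source.

```python
import numpy as np

def kan_conv2d(z, phi, k1, k2):
    """Single-output-channel KAN convolution following Equation (3).

    z   : real array of shape (c, w, h)
    phi : callable phi(d, a, b, values) -> array; one univariate function per
          (channel, kernel row, kernel column) position (hypothetical interface)
    """
    c, w, h = z.shape
    out = np.zeros((w - k1 + 1, h - k2 + 1), dtype=float)
    for d in range(c):
        for a in range(k1):
            for b in range(k2):
                # Shifted view holding z[d, i + a, j + b] for every output (i, j).
                window = z[d, a:a + w - k1 + 1, b:b + h - k2 + 1]
                out += phi(d, a, b, window)
    return out
```

When every `phi(d, a, b, v)` is the linear map `K[d, a, b] * v`, the function reduces to an ordinary cross-correlation with kernel `K`, which shows that the standard convolution is the special case in which the univariate functions are linear.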

2.4. Complex KAN Layer to Complex KAN Convolution Layer

This subsection introduces the complex-valued extension from the KAN layer to the KAN convolution layer. Following Equation (2), with $w_1$ weighting the SiLU term and $w_2$ weighting the spline term, we define the complex transformation function $\Phi(x)$ for the complex-valued input $x$ as

$$\begin{aligned} \Phi(x; w_1, w_2) &= \Phi(x^r + j x^i;\; w_1^r + j w_1^i,\; w_2^r + j w_2^i) \\ &= (w_1^r + j w_1^i) \cdot [\mathrm{SiLU}(x^r) + j \cdot \mathrm{SiLU}(x^i)] + (w_2^r + j w_2^i) \cdot [\mathrm{Spline}(x^r) + j \cdot \mathrm{Spline}(x^i)] \\ &= [w_1^r \mathrm{SiLU}(x^r) - w_1^i \mathrm{SiLU}(x^i) + w_2^r \mathrm{Spline}(x^r) - w_2^i \mathrm{Spline}(x^i)] \\ &\quad + j \cdot [w_1^r \mathrm{SiLU}(x^i) + w_1^i \mathrm{SiLU}(x^r) + w_2^r \mathrm{Spline}(x^i) + w_2^i \mathrm{Spline}(x^r)] \\ &= [\phi(x^r; w_1^r, w_2^r) - \phi(x^i; w_1^i, w_2^i)] + j \cdot [\phi(x^i; w_1^r, w_2^r) + \phi(x^r; w_1^i, w_2^i)]. \end{aligned} \tag{4}$$

Here, $x^r$ and $x^i$, $w_1^r$ and $w_1^i$, and $w_2^r$ and $w_2^i$ are the real and imaginary parts of $x$, $w_1$ and $w_2$, respectively, and $\phi$ denotes the real-valued nonlinear function of Equation (2). Further defining $\phi(\cdot; w_1^r, w_2^r)$ as the real-part transformation $\phi^r$ and $\phi(\cdot; w_1^i, w_2^i)$ as the imaginary-part transformation $\phi^i$, we finally get

$$y^r + j y^i = [\phi^r(x^r) - \phi^i(x^i)] + j \cdot [\phi^r(x^i) + \phi^i(x^r)], \tag{5}$$

in which $y^r$ and $y^i$ are the real and imaginary parts of the output $y$. The workflow of $\Phi(x)$ is depicted in Figure 1.
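The expanded real and imaginary parts in Equation (4) follow the usual complex product rule $(a + jb)(c + jd) = (ac - bd) + j(ad + bc)$, so the imaginary part is the sum of the two cross terms. A minimal sketch, assuming the two real-valued functions are supplied as Python callables (the name `complex_transform` is illustrative):

```python
import numpy as np

def complex_transform(x, phi_r, phi_i):
    """Complex KAN transformation built from two real-valued functions.

    phi_r uses the real parts of the weights, phi_i the imaginary parts.
    """
    x = np.asarray(x, dtype=complex)
    y_real = phi_r(x.real) - phi_i(x.imag)   # real part: difference of the direct terms
    y_imag = phi_r(x.imag) + phi_i(x.real)   # imaginary part: sum of the cross terms
    return y_real + 1j * y_imag
```

A quick sanity check is the linear case: with `phi_r(t) = a * t` and `phi_i(t) = b * t`, the function returns `(a + 1j * b) * x`, i.e., an ordinary complex multiplication.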

Based on Equations (1) and (3), the complex KAN layer is defined as follows. The output $x_{l+1,j}$ of the $j$-th neuron in the $(l+1)$-th layer is

$$x_{l+1,j} = \sum_{i=1}^{n_l} \Phi_{l,j,i}(x_{l,i}), \tag{6}$$

in which $j = 1, 2, \dots, n_{l+1}$. Therefore, in the complex KAN convolution layer, the output $z'$ is computed as

$$\begin{aligned} z'_{ij} &= \sum_{d=1}^{c} \sum_{a=0}^{k_1-1} \sum_{b=0}^{k_2-1} \Phi_{d,a,b}(z_{d,i+a,j+b}), \\ i &= 1, 2, \dots, w - k_1 + 1, \\ j &= 1, 2, \dots, h - k_2 + 1. \end{aligned} \tag{7}$$

Through the above operations, all weights in the complex KAN convolution layer preserve both amplitude and phase information inherent in polarimetric features.

According to [33,34], the parameter counts for our complex-valued layers are as follows. Both $\phi^r$ and $\phi^i$ require $\mathit{gridsize} + 2$ parameters each, where $\mathit{gridsize}$ represents the grid dimension of the B-splines and is set to 5 in our study. Therefore, each complex KAN layer contains $2(\mathit{gridsize} + 2)$ parameters, and each complex KAN convolution layer with kernel size $k_1 \times k_2$ requires $2 k_1 k_2 (\mathit{gridsize} + 2)$ parameters.
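As a worked example of these formulas, the short sketch below evaluates the counts for the kernel sizes used later in the paper with $\mathit{gridsize} = 5$. The function names are illustrative, and the counts are per kernel exactly as stated in the text, without any multiplication by input or output channel numbers.

```python
def phi_params(grid_size):
    """Parameters of one real-valued function phi^r or phi^i."""
    return grid_size + 2

def complex_kan_params(grid_size):
    """Parameters of one complex function Phi (phi^r plus phi^i)."""
    return 2 * phi_params(grid_size)

def complex_kan_conv_params(k1, k2, grid_size):
    """Parameters of one k1 x k2 complex KAN convolution kernel."""
    return k1 * k2 * complex_kan_params(grid_size)

GRID_SIZE = 5
counts = {
    "3x3": complex_kan_conv_params(3, 3, GRID_SIZE),  # 126
    "1x1": complex_kan_conv_params(1, 1, GRID_SIZE),  # 14
    "3x1": complex_kan_conv_params(3, 1, GRID_SIZE),  # 42
    "1x3": complex_kan_conv_params(1, 3, GRID_SIZE),  # 42
}
```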

2.5. Raw PolSAR Data Extraction and Preprocessing

Based on Pauli decomposition [4], every pixel in a PolSAR image can be described by a coherency matrix $T$:

$$T = \langle k_P \cdot k_P^{H} \rangle = \begin{bmatrix} \langle T_{11} \rangle & \langle T_{12} \rangle & \langle T_{13} \rangle \\ \langle T_{12}^{*} \rangle & \langle T_{22} \rangle & \langle T_{23} \rangle \\ \langle T_{13}^{*} \rangle & \langle T_{23}^{*} \rangle & \langle T_{33} \rangle \end{bmatrix}, \tag{8}$$

where $k_P$ denotes the Pauli-basis scattering vector, $\langle \cdot \rangle$ indicates ensemble averaging over multiple looks, $^{H}$ is the Hermitian transpose and $^{*}$ represents the complex conjugate. To preserve the completeness and compactness of information, the upper-triangular elements of $T$, namely $[T_{11}, T_{22}, T_{33}, T_{12}, T_{13}, T_{23}]$, are extracted as the raw data, resulting in a total of six input channels.
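The construction of the six input channels can be sketched as follows. The source does not spell out $k_P$; the code assumes the standard monostatic Pauli vector $k_P = \frac{1}{\sqrt{2}}[S_{hh} + S_{vv},\, S_{hh} - S_{vv},\, 2 S_{hv}]^T$, and the function names are hypothetical.

```python
import numpy as np

def pauli_vector(s_hh, s_hv, s_vv):
    """Pauli scattering vector (assumed monostatic, reciprocal case S_hv = S_vh)."""
    return np.stack([s_hh + s_vv, s_hh - s_vv, 2.0 * s_hv], axis=-1) / np.sqrt(2.0)

def coherency_matrix(k):
    """T = <k k^H>: average the outer products over the first (look) axis.

    k has shape (looks, 3); the result is a 3 x 3 Hermitian matrix.
    """
    return np.einsum("li,lj->ij", k, k.conj()) / k.shape[0]

def upper_triangular_channels(T):
    """Six raw channels [T11, T22, T33, T12, T13, T23] as a complex vector."""
    return np.array([T[0, 0], T[1, 1], T[2, 2], T[0, 1], T[0, 2], T[1, 2]])
```

Because $T$ is Hermitian, the lower triangle is the conjugate of the upper one, so these six entries carry the full information of the matrix.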

Before training, all input vectors are preprocessed to facilitate model convergence. For each channel $T_{mn}$, its mean $T_{mn}^{avg}$ and standard deviation $T_{mn}^{std}$ are precomputed and applied for Z-score standardization. For elements $T_{mn}$ $(m, n \in \{1, 2, 3\})$, the corresponding mean and standard deviation over the $v$ samples indexed by $i$ are obtained by

$$\begin{aligned} T_{mn}^{avg} &= \frac{\sum_{i=1}^{v} T_{mn}(i)}{v}, \\ T_{mn}^{std} &= \sqrt{\frac{\sum_{i=1}^{v} \left(T_{mn}(i) - T_{mn}^{avg}\right) \left(T_{mn}(i) - T_{mn}^{avg}\right)^{*}}{v}}. \end{aligned} \tag{9}$$
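Equation (9) amounts to a complex mean and a real-valued standard deviation built from squared magnitudes. A minimal sketch of the per-channel standardization (the function name `standardize_channel` is illustrative):

```python
import numpy as np

def standardize_channel(t):
    """Z-score standardization of one complex channel following Equation (9)."""
    t = np.asarray(t, dtype=complex)
    avg = t.mean()                                            # complex mean
    centered = t - avg
    std = np.sqrt(np.mean(centered * centered.conj()).real)   # real, non-negative
    return centered / std, avg, std
```

For a real-valued diagonal channel stored with zero imaginary part, the mean is real and the standardized values stay real, so the same routine covers both channel types.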

Considering the complex-valued nature of PolSAR data, all parameters within CV-CPKAN are complex-valued. The diagonal channels, which are real-valued, are also treated as complex channels by setting imaginary components to zero.

2.6. CV-CPKAN Network Architecture

As depicted in Figure 2, the classification framework comprises two multi-branch complex KAN convolution (MBComplexKConv) blocks, each followed by a complex PReLU activation function and a complex average pooling layer. It should be noted that both the PReLU activation and the average pooling operation are applied separately to the real and imaginary parts of the complex-valued polarimetric features. Finally, features are mapped to a class probability distribution $\hat{z}$ via a complex linear layer.

Within each MBComplexKConv block, given input feature $x$, the output $y = \mathrm{MBCKConv}^{c_1, c_2}(x)$ is computed as

$$\begin{aligned} x_0 &= \mathrm{CKConv}_{3 \times 3}^{c_1, r c_1}(x), \quad x_1 = \mathrm{CKConv}_{1 \times 1}^{c_1, r c_1}(x), \\ x_h &= \mathrm{CKConv}_{3 \times 1}^{c_1, r c_1}(x), \quad x_v = \mathrm{CKConv}_{1 \times 3}^{c_1, r c_1}(x), \\ x' &= \mathrm{Concat}(x_0, x_1, x_h, x_v, \mathrm{CBN}(x_0), \mathrm{CBN}(x_1), \mathrm{CBN}(x_h), \mathrm{CBN}(x_v)), \\ y &= \mathrm{CKConv}_{1 \times 1}^{8 r c_1, c_2}(x'). \end{aligned} \tag{10}$$

Here, $\mathrm{CKConv}_{k_1 \times k_2}^{c_{in}, c_{out}}(\cdot)$ represents a complex KAN convolution layer with a kernel size of $k_1 \times k_2$, where $c_{in}$ and $c_{out}$ stand for the numbers of input and output channels, respectively. The channel repetition factor $r$ is empirically set to 4, which provides a balance between enriching multi-scale polarimetric feature representations and maintaining computational efficiency. The $\mathrm{Concat}(\cdot)$ operation concatenates data along the channel dimension, and $\mathrm{CBN}(\cdot)$ represents the complex batch normalization layer [18].

The parallel kernel design in the MBComplexKConv block is motivated by the anisotropic and direction-dependent scattering mechanisms commonly observed in PolSAR images. The $3 \times 3$ complex KAN convolution captures local spatial context and neighborhood interactions, while the $1 \times 1$ kernel focuses on channel-wise nonlinear polarimetric feature fusion. Moreover, $3 \times 1$ and $1 \times 3$ kernels are specifically introduced to model elongated and directional scattering patterns, such as buildings and crop rows, enabling the network to better exploit orientation-sensitive polarimetric information.

Following the layer configurations in [18,38], we construct CV-CPKAN with two MBComplexKConv blocks. The channel configurations are $C_1 = 5$ and $C_2 = 10$, which denote the output channel dimensions of the first and second MBComplexKConv blocks, respectively. To preserve the spatial resolution of features, all complex KAN convolution layers employ a dilation rate of 1 and a padding of 1. Following each MBComplexKConv block, a $2 \times 2$ complex average pooling operation is applied. As a result, the network generates a one-dimensional feature vector of size $16 C_2 = 160$.
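The channel and spatial bookkeeping behind the $16 C_2 = 160$ figure can be traced with a small sketch. It assumes the $6 \times 16 \times 16$ input patch described in Section 3.1 and treats every convolution as size-preserving, as the text intends; the function names are illustrative.

```python
def mb_block_channels(c_in, c_out, r=4):
    """Channel counts inside one MBComplexKConv block (Equation (10))."""
    branch = r * c_in      # channels produced by each of the four parallel branches
    concat = 8 * branch    # four branch outputs plus their four CBN copies
    return branch, concat, c_out

def cv_cpkan_shapes(in_channels=6, patch=16, c1=5, c2=10, r=4):
    """Trace feature shapes through two blocks, each followed by 2 x 2 pooling."""
    shapes = []
    channels, size = in_channels, patch
    for c_out in (c1, c2):
        branch, concat, channels = mb_block_channels(channels, c_out, r)
        size //= 2  # 2 x 2 average pooling halves each spatial dimension
        shapes.append({"branch": branch, "concat": concat, "out": (channels, size, size)})
    return shapes, channels * size * size

shapes, flat_size = cv_cpkan_shapes()
# shapes[0]: branch 24, concat 192, out (5, 8, 8)
# shapes[1]: branch 20, concat 160, out (10, 4, 4)
# flat_size: 10 * 4 * 4 = 160
```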

2.7. CV-PolyLoss and Network Optimization

PolyLoss [55] is a polynomially weighted extension of the cross-entropy loss, specifically designed for classification tasks, which exhibits higher robustness and generalizability. Considering the complex-valued nature of PolSAR data, we propose a complex-valued PolyLoss, i.e., CV-PolyLoss, which is formulated as follows:

$$\begin{aligned} L &= \frac{1}{2K} \sum_{k=1}^{K} \Big[ (1 + \epsilon)(1 - \hat{z}_k^r) + \sum_{i=1}^{\infty} \frac{1}{i+1} (1 - \hat{z}_k^r)^{i+1} \\ &\qquad + (1 + \epsilon)(1 - \hat{z}_k^i) + \sum_{i=1}^{\infty} \frac{1}{i+1} (1 - \hat{z}_k^i)^{i+1} \Big] \\ &= -\frac{1}{2K} \sum_{k=1}^{K} \left[ z_k^r \log(\hat{z}_k^r) + z_k^i \log(\hat{z}_k^i) + \epsilon (\hat{z}_k^r + \hat{z}_k^i - 2) \right]. \end{aligned} \tag{11}$$

Here, $K$ is the number of categories. $\hat{z}_k^r$ and $\hat{z}_k^i$ are the real and imaginary parts of the $k$-th element of the prediction $\hat{z}$. $z_k^r$ and $z_k^i$ are the real and imaginary parts of the $k$-th element of the label encoding $z$. In vector $z$, the element corresponding to the annotation is $1 + 1j$, while all other elements are 0. The polynomial coefficient $\epsilon$ is set to 1. We optimize our model by minimizing the CV-PolyLoss with the AdamW optimizer [56]. Finally, the predicted class is determined by the index with the maximum value of $(\hat{z}_k^r + \hat{z}_k^i)/2$ in vector $\hat{z}$.
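The closed form in the last line of Equation (11) and the decision rule can be evaluated literally as in the sketch below. It is an illustration under stated assumptions, not the authors' code: the real and imaginary parts of $\hat{z}$ are assumed to already lie in $[0, 1]$ (the source does not say how they are normalized), and the small constant `tiny` is added only to keep the logarithm finite.

```python
import numpy as np

def cv_polyloss(z_hat, target, epsilon=1.0, tiny=1e-12):
    """Closed form of Equation (11) for one sample.

    z_hat  : complex vector of length K with real and imaginary parts in [0, 1]
    target : integer index of the annotated class
    """
    z_hat = np.asarray(z_hat, dtype=complex)
    K = z_hat.shape[0]
    z = np.zeros(K, dtype=complex)
    z[target] = 1.0 + 1.0j                       # label encoding: 1 + 1j at the annotation
    log_r = np.log(np.clip(z_hat.real, tiny, None))
    log_i = np.log(np.clip(z_hat.imag, tiny, None))
    terms = z.real * log_r + z.imag * log_i + epsilon * (z_hat.real + z_hat.imag - 2.0)
    return -terms.sum() / (2.0 * K)

def predict_class(z_hat):
    """Index maximizing (real + imaginary) / 2, the decision rule stated in the text."""
    z_hat = np.asarray(z_hat, dtype=complex)
    return int(np.argmax((z_hat.real + z_hat.imag) / 2.0))
```

The real and imaginary parts contribute symmetrically to both the loss and the decision rule, so a class is favored only when both parts of its prediction are large.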

3. Results

3.1. Experimental Dataset and Configurations

The performance of CV-CPKAN is evaluated on three benchmark PolSAR datasets, i.e., the Flevoland, San Francisco and Oberpfaffenhofen datasets. The Flevoland dataset captures the agricultural region of Flevoland, Netherlands in 1989, containing $750 \times 1024$ pixels categorized into 15 land cover classes. The San Francisco dataset depicts the urban-coastal area of San Francisco, USA in 1989, consisting of $900 \times 1024$ pixels classified into five categories. The Oberpfaffenhofen dataset consists of a $1300 \times 1200$ pixel image from the Oberpfaffenhofen area in Germany in 2002, which is categorized into three classes.

During training, each input sample is represented as a $6 \times 16 \times 16$ image patch with the target pixel for classification positioned at the local center $(8, 8)$. For all datasets, a random 10% of labeled pixels is used for training, while the remaining 90% is reserved for testing. The model employs an initial learning rate of $1 \times 10^{-3}$ across 50 training epochs. The learning rate is adjusted using the CosineAnnealingLR scheduler [57]. The classification results are measured in terms of overall accuracy (OA), average accuracy (AA) and the Kappa coefficient ($\kappa$).
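For reference, the three metrics can be computed from a confusion matrix as in the following sketch, which uses the standard definitions. The convention that rows are true classes and columns are predicted classes is an assumption, as is the function name.

```python
import numpy as np

def classification_metrics(conf):
    """OA, AA and Cohen's kappa from a confusion matrix (rows: true, columns: predicted)."""
    conf = np.asarray(conf, dtype=float)
    total = conf.sum()
    oa = np.trace(conf) / total                       # fraction of correctly labeled pixels
    per_class = np.diag(conf) / conf.sum(axis=1)      # per-class recall
    aa = per_class.mean()                             # unweighted mean over classes
    expected = (conf.sum(axis=0) * conf.sum(axis=1)).sum() / total ** 2
    kappa = (oa - expected) / (1.0 - expected)        # agreement beyond chance
    return oa, aa, kappa

# Two-class toy example: 50 of 50 and 40 of 50 pixels correct.
oa, aa, kappa = classification_metrics([[50, 0], [10, 40]])
# oa = 0.9, aa = 0.9, kappa = 0.8
```

AA weights every class equally regardless of its pixel count, which is why it can differ noticeably from OA on datasets with imbalanced classes.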

3.2. Comparative Study

To demonstrate the superiority of the proposed model, CV-CPKAN is compared with nine existing approaches using the aforementioned PolSAR datasets. The competitors are listed as follows:

(a): SVM [15]: A support vector machine method for PolSAR image classification through optimized polarimetric indicators to outperform the standard Wishart approach.
(b): Haar-CNN [24]: A CNN-based PolSAR image classification method using features extracted through Haar wavelet transformation to improve classification accuracy and suppress speckle noise.
(c): CV-CNN-SE [38]: A PolSAR image classification work incorporating attention-based multi-scale CV-CNNs and squeeze–excitation (SE) blocks to enhance channel interactions.
(d): SDF2Net [25]: A PolSAR image classification approach built upon CV-CNN-SEs, designed as a three-branch feature fusion framework that employs 3D CV-CNNs with SE attention and fuses the multi-level features from all branches at the final stage.
(e): ViT [26]: A ViT-based supervised approach for PolSAR image classification, improving classification performance by capturing global features through self-attention mechanisms.
(f): HybridCVNet [58]: A hybrid CV-CNN and complex-valued ViT-based PolSAR image classification method designed to improve accuracy through complementary feature fusion and global dependency modeling.
(g): CV-MsAtViT [59]: A complex-valued multi-scale attention ViT tailored for PolSAR image classification, capable of jointly modeling spatial structures and polarimetric characteristics to achieve superior accuracy.
(h): VMamba [32]: An advanced vision backbone for natural image analysis, introducing state space sequence modeling to enable efficient global dependency learning and enhanced visual representation.
(i): CFAT [60]: A PolSAR image classification approach built on a hybrid CNN-Transformer architecture with a Fieldy attention mechanism, designed to obtain local and global dependencies for enhanced generalization.

The comparison of OA, AA and $\kappa$ across different models is presented in Table 1, Table 2 and Table 3, and the Pauli RGB images, ground truth and prediction results are displayed in Figure 3, Figure 4 and Figure 5, respectively. Best results are shown in bold. Each dataset is tested in five independent runs, and the mean performance along with its corresponding standard deviation (STD) is reported.

Figure 3 displays the classification outcomes from all approaches on the Flevoland dataset, and Table 1 presents their quantitative comparisons. From Table 1, it is evident that CFAT achieves the highest OA, AA and $\kappa$ among the competing methods, demonstrating the effectiveness of combining different architectures in model design. We also observe that VMamba outperforms Haar-CNN, CV-CNN-SE, SDF2Net, HybridCVNet and CV-MsAtViT in terms of OA, AA and $\kappa$, highlighting the potential of integrating advanced mathematical theories into deep learning frameworks. Comparatively, our proposed CV-CPKAN achieves the best performance, surpassing the second-best CFAT by 1.30% in OA, 3.52% in AA and 1.48% in $\kappa$, while exhibiting lower standard deviations. These findings highlight the effectiveness of our complex KAN convolution layers and MBComplexKConv blocks for PolSAR image classification. In Figure 3, Figure 3l demonstrates superior contextual consistency with fewer misclassified blocks compared to other results. For instance, the white-boxed areas in Figure 3c–k illustrate significant confusion regarding the rapeseed class. In contrast, Figure 3l is more similar to the ground truth illustrated in Figure 3b.

Table 2 and Figure 4 present numerical and visual classification results on the San Francisco dataset. Table 2 indicates that CV-CPKAN again achieves the highest accuracies, particularly on the vegetation class. In contrast, as advanced architectures, ViT and VMamba yield lower accuracies because they require a larger amount of training data compared to CNNs. Particularly, CV-CPKAN surpasses the second-ranked CFAT by 1.30% in OA, 3.52% in AA and 1.48% in $\kappa$, demonstrating the efficiency of our framework combining CNNs and KANs. It is worth noting that VMamba underperforms CV-CPKAN by 3.90% in OA, 8.50% in AA and 4.33% in $\kappa$. One possible explanation is that VMamba was originally developed for real-valued data, which limits its ability to mine the fundamental phase information within PolSAR data.

Analysis of Figure 4 further reveals the following key observations. It can be discovered from the blue-boxed areas of Figure 4c–k that some isolated pixels belonging to the mountain class are incorrectly classified into other categories like vegetation. Although CV-MsAtViT effectively alleviates most of these misclassification problems, the boundary between mountain and vegetation regions is still unclear, as depicted in Figure 4i. By contrast, Figure 4l exhibits better spatial consistency and clearer classification boundaries, which again demonstrates the excellent generalization capability of CV-CPKAN.

Figure 5 illustrates the classification outputs of all evaluated approaches on the Oberpfaffenhofen dataset, with the corresponding quantitative metrics provided in Table 3. From Figure 5, it can be observed that large regions of woodland are misclassified as built-up areas in the upper-left part of Figure 5c–g,j. Moreover, as highlighted by the white rectangle in Figure 5c–k, several built-up area pixels are incorrectly recognized as woodland by all competitors. Although CFAT already shows competitive performance, it still exhibits noticeable confusion between built-up and woodland regions. In contrast, CV-CPKAN generates results that are more consistent with the ground truth and exhibit better spatial continuity than those of the other methods.

Upon analyzing Table 3, we further observe that on the Oberpfaffenhofen dataset, the accuracy for the built-up areas is lower than for the other categories, which can be attributed to the complex scattering patterns in the heterogeneous regions. Given this challenging situation, our method consistently delivers the best classification results, achieving OA, AA and $κ$ values of 99.74%, 99.77% and 99.66% with competitive standard deviations, further highlighting the superiority and robustness of CV-CPKAN.
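
For reference, OA, AA and $κ$ can all be derived from a confusion matrix. The sketch below uses the standard definitions, which are assumed here to match those used in the experiments; the function name `classification_metrics` and the toy matrix are illustrative only.

```python
def classification_metrics(confusion):
    """Compute (OA, AA, kappa) from a square confusion matrix.

    confusion[i][j] is the number of class-i samples predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))

    # Overall accuracy: fraction of all samples that are correctly labelled.
    oa = correct / total

    # Average accuracy: mean of the per-class recalls.
    per_class = [confusion[i][i] / sum(confusion[i]) for i in range(n_classes)]
    aa = sum(per_class) / n_classes

    # Cohen's kappa: agreement corrected for chance agreement p_e.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes)) for j in range(n_classes)]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (oa - p_e) / (1 - p_e)
    return oa, aa, kappa


# Toy two-class example (not data from the paper).
print(classification_metrics([[50, 0], [10, 40]]))
```

Because AA weights every class equally, it is more sensitive than OA to errors on small classes, which is why the AA gaps between methods in Tables 1 to 3 tend to be larger than the OA gaps.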

4. Discussion

In the following subsections, we first analyze the model resources and efficiency of all compared approaches in Section 4.1. Then, an ablation study is detailed in Section 4.2 to verify the effectiveness of each component in CV-CPKAN. Section 4.3 examines how varying the sampling rate of annotated training data affects classification performance. Finally, Section 4.4 and Section 4.5 discuss the selection of loss functions and the network hyper-parameter settings of CV-CPKAN.

4.1. Model Resource and Efficiency Analysis

As shown in Table 4, CV-CPKAN demonstrates superior computational efficiency over several state-of-the-art approaches. With just 0.134 M parameters, 0.080 G floating point operations (FLOPs) and 0.040 G multiply–accumulate operations (MACs), it achieves a substantially lower computational burden than most deep learning models. For instance, both ViT and HybridCVNet demand considerably more resources, requiring 0.258 M parameters and 0.515 G FLOPs for ViT, and 0.310 M parameters and 0.465 G FLOPs for HybridCVNet. On the other hand, CV-CPKAN demonstrates superior classification performance compared to other lightweight models, such as CV-CNN-SE with 0.132 M parameters and 0.198 G FLOPs, and SDF2Net with 0.094 M parameters and 0.142 G FLOPs. This favorable balance between accuracy and computational demand highlights the efficiency and effectiveness of CV-CPKAN. Note that the parameters of traditional machine learning methods such as SVM are not directly comparable to those of deep learning models; the corresponding entry is therefore marked as N/A.
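
To make the relationship between the rows of Table 4 concrete, the following sketch counts parameters, MACs and FLOPs for a single standard real-valued convolution layer, using the common convention that one MAC equals two FLOPs (consistent with the 0.040 G MACs and 0.080 G FLOPs reported for CV-CPKAN). It is not the counting tool used for Table 4; the helper name `conv2d_cost` and the layer sizes are hypothetical, and complex-valued or KAN layers incur extra cost per connection that this simple count does not model.

```python
def conv2d_cost(c_in, c_out, kernel, h_out, w_out, bias=True):
    """Return (params, MACs, FLOPs) for one standard 2D convolution layer."""
    weights = c_in * c_out * kernel * kernel
    params = weights + (c_out if bias else 0)
    # Every output position applies the full weight tensor once.
    macs = h_out * w_out * weights
    # Convention: one multiply-accumulate counts as two floating point operations.
    flops = 2 * macs
    return params, macs, flops


# Hypothetical layer: 9 input channels, 16 output channels, 3x3 kernel, 16x16 output.
params, macs, flops = conv2d_cost(c_in=9, c_out=16, kernel=3, h_out=16, w_out=16)
print(f"params={params}, MACs={macs}, FLOPs={flops}")
```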

Although CV-CPKAN achieves excellent classification performance with low theoretical computational complexity, the current implementation of KAN convolution layers relies on sequential for-loop operations rather than fully parallelized matrix computations. This limited utilization of hardware parallelism may result in longer actual runtime compared to conventional convolutional networks, despite the reduced FLOPs and MACs. Nevertheless, this limitation is implementation-related rather than architectural, and future work will focus on developing parallelized and hardware-efficient KAN convolution operators to further improve practical deployment efficiency. In summary, CV-CPKAN maintains competitive classification performance while requiring relatively fewer trainable parameters and lower computational overhead. This favorable balance renders it a theoretically promising candidate for remote sensing applications.
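
The difference between the two implementation styles can be illustrated on a toy single-channel, real-valued case. In the sketch below, each kernel position carries its own learnable univariate function, here a weighted sum of Gaussian bumps used as a stand-in for the spline-based functions of an actual KAN layer (an assumption made for brevity). `kan_conv_loop` visits pixels and kernel positions one by one, whereas `kan_conv_unfolded` first gathers all patches into a matrix (im2col), which is the layout a tensor library could process in one batched operation. All names are hypothetical and this is not the authors' implementation.

```python
import math


def phi(x, coeffs, centers, width=1.0):
    """Learnable univariate function: weighted sum of Gaussian bumps (stand-in for splines)."""
    return sum(c * math.exp(-((x - m) / width) ** 2) for c, m in zip(coeffs, centers))


def kan_conv_loop(image, kernel_coeffs, centers):
    """Direct form: nested loops over output pixels and kernel positions."""
    k = len(kernel_coeffs)
    h, w = len(image), len(image[0])
    out = [[0.0] * (w - k + 1) for _ in range(h - k + 1)]
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            for a in range(k):
                for b in range(k):
                    out[i][j] += phi(image[i + a][j + b], kernel_coeffs[a][b], centers)
    return out


def unfold(image, k):
    """im2col: one row per output pixel, one column per kernel position."""
    h, w = len(image), len(image[0])
    return [[image[i + a][j + b] for a in range(k) for b in range(k)]
            for i in range(h - k + 1) for j in range(w - k + 1)]


def kan_conv_unfolded(image, kernel_coeffs, centers):
    """Same result, organised as one pass over an unfolded patch matrix."""
    k = len(kernel_coeffs)
    flat_coeffs = [kernel_coeffs[a][b] for a in range(k) for b in range(k)]
    out_w = len(image[0]) - k + 1
    flat = [sum(phi(v, c, centers) for v, c in zip(row, flat_coeffs))
            for row in unfold(image, k)]
    return [flat[r * out_w:(r + 1) * out_w] for r in range(len(flat) // out_w)]
```

Pure Python does not parallelize either version; the point is only that the unfolded form exposes the computation as independent rows of a patch matrix, which is what a hardware-efficient operator would exploit.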

4.2. Ablation Study

To assess the effectiveness of each component in CV-CPKAN, we carried out an ablation study. Since the results on the Flevoland dataset are approaching saturation, we performed the ablation study solely on the San Francisco dataset for clearer comparison; the results are reported in Table 5. For the real-valued baseline in Table 5, each complex element of $T$ is decomposed into two real channels (yielding nine channels in total) while discarding phase components. This setup isolates the specific contribution of complex-valued processing.
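
A common way to obtain nine real channels from a 3×3 Hermitian matrix such as $T$ is to keep the three real diagonal entries and split each of the three upper-triangular complex entries into its real and imaginary parts. The sketch below assumes this layout; the exact channel ordering and the way the baseline treats phase are defined elsewhere in the paper and are not reproduced here, and the function name `coherency_to_real_channels` is hypothetical.

```python
def coherency_to_real_channels(T):
    """Flatten a 3x3 Hermitian matrix into nine real values.

    Assumed ordering: T11, T22, T33, Re/Im(T12), Re/Im(T13), Re/Im(T23).
    """
    # The diagonal of a Hermitian matrix is real, so only the real part is kept.
    diagonal = [T[i][i].real for i in range(3)]
    # The lower triangle is the conjugate of the upper one and carries no new information.
    upper = [T[0][1], T[0][2], T[1][2]]
    off_diagonal = [part for z in upper for part in (z.real, z.imag)]
    return diagonal + off_diagonal


# Toy Hermitian matrix for one pixel (not data from the paper).
T = [
    [2.0 + 0j, 0.5 + 0.1j, 0.2 - 0.3j],
    [0.5 - 0.1j, 1.0 + 0j, 0.1 + 0.4j],
    [0.2 + 0.3j, 0.1 - 0.4j, 0.5 + 0j],
]
print(coherency_to_real_channels(T))
```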

Based on Table 5, we can conclude that, compared to a traditional CNN, KAN convolution layers offer improvements of 2.90% in OA, 8.36% in AA and 10.45% in $κ$ with lower standard deviations. Unlike traditional CNNs or attention-based models with fixed linear kernels, KAN convolution layers utilize learnable nonlinear functions to adaptively model the complex amplitude–phase interactions in PolSAR data. This design provides superior nonlinear fitting and robustness, effectively capturing intricate polarimetric relationships that static architectures often miss. Furthermore, the complex-valued design brings additional, smaller improvements of 0.16% in OA, 0.45% in AA and 0.57% in $κ$, suggesting that complex-valued representations can capture phase features often neglected by real-valued networks. Finally, by incorporating our newly designed MBComplexKConv block, CV-CPKAN achieves the highest classification accuracies, with OA, AA and $κ$ values of 99.80%, 99.24% and 99.05%, respectively. The ablation results indicate that the proposed MBComplexKConv block aggregates multi-scale and direction-aware features, leading to the best overall performance and improved robustness.

4.3. Sampling Rate Analysis

In this section, CV-CPKAN and the baseline CNN are evaluated under different sampling rates (SRs) on the San Francisco dataset to analyze their sensitivity to the amount of labeled data. Table 6 and Figure 6 present the OA, AA and Kappa values and curves of CV-CPKAN and the baseline CNN. From these results, several observations can be made. Both the baseline CNN and CV-CPKAN improve as SR increases, particularly under small-sample conditions, with the most notable gain occurring when the SR rises from 1% to 5%.

However, when SR exceeds 10%, the classification performance tends to saturate, making our improvement less observable. This plateau is likely due to the nature of the benchmark datasets, where dominant scattering mechanisms and class distributions are already sufficiently represented at moderate annotation densities. Consequently, both CV-CPKAN and the baseline CNN enter a stable performance regime, suggesting that further gains are constrained by data sufficiency rather than model capacity. Therefore, following the SR configuration in [60], we set our default SR to 10%.
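
As an illustration of what a given SR means in practice, the sketch below draws a fixed fraction of labeled pixels from each class for training and leaves the rest for testing. Per-class (stratified) random sampling is an assumption made here; the actual sampling protocol is the one described in the experimental configuration, and the function name `stratified_sample` is hypothetical.

```python
import random


def stratified_sample(labels, rate, seed=0):
    """Split sample indices into train/test, drawing `rate` of each class for training."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train = []
    for indices in by_class.values():
        # Keep at least one training sample per class, even for tiny classes.
        n_train = max(1, round(rate * len(indices)))
        train.extend(rng.sample(indices, n_train))

    train_set = set(train)
    test = [i for i in range(len(labels)) if i not in train_set]
    return sorted(train), test


# Toy label vector with an imbalanced class distribution (not data from the paper).
labels = [0] * 100 + [1] * 50
train_idx, test_idx = stratified_sample(labels, rate=0.10)
print(len(train_idx), len(test_idx))
```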

4.4. Loss Function Analysis

Originating from the classical cross-entropy loss, PolyLoss is designed to mitigate network overfitting and enhance model generalization, particularly under class-imbalanced conditions [55]. To verify the efficacy of CV-PolyLoss, we perform a comparative experiment on the inter-class imbalanced Flevoland dataset, evaluating CV-PolyLoss against the complex-valued mean squared error loss (CV-MSELoss) and complex-valued cross-entropy loss (CV-CELoss), which are widely utilized in classification tasks [61]. Results are presented in Table 7: CV-PolyLoss attains the highest OA, AA and $κ$ (99.86%, 99.72% and 99.70%), with the gains concentrated in AA and $κ$, which indicates its contribution to model robustness and classification performance.
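
For intuition, the real-valued Poly-1 form from [55] adds a single polynomial term to cross-entropy, $L = -\log p_t + \epsilon (1 - p_t)$, where $p_t$ is the predicted probability of the true class. The sketch below implements only this real-valued form; the complex-valued variant CV-PolyLoss is defined in Section 2.7 and is not reproduced here, and the function names and the value of `epsilon` are illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of real-valued logits."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def poly1_loss(logits, target, epsilon=1.0):
    """Real-valued Poly-1 loss: cross-entropy plus epsilon * (1 - p_t)."""
    p_t = softmax(logits)[target]
    cross_entropy = -math.log(p_t)
    return cross_entropy + epsilon * (1.0 - p_t)


# With epsilon = 0 the loss reduces to plain cross-entropy.
print(poly1_loss([2.0, 0.5, -1.0], target=0, epsilon=0.0))
print(poly1_loss([2.0, 0.5, -1.0], target=0, epsilon=1.0))
```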

4.5. Network Hyper-Parameter Analysis

We finally perform a parameter sensitivity analysis on the San Francisco dataset to explore the impact of key network hyper-parameters. Table 8 summarizes the classification performance under various network configurations. In Table 8, ‘-’ denotes that the corresponding parameter is set to the default configuration. We can draw the following conclusions from Table 8:

(a): The highest OA is achieved when the input patch size, kernel size and number of MBComplexKConv blocks are set to 16, 3 and 2, respectively. These configurations are, therefore, empirically adopted as the default settings in our experiments.
(b): Increasing the grid size generally improves the classification performance of CV-CPKAN, but this comes with a significant increase in trainable parameters, which in turn leads to higher memory consumption and an increased risk of overfitting [30]. For instance, the parameter count rises from 0.095 M at a grid size of 3 to 0.168 M at a grid size of 7. OA remains essentially unchanged (99.80% versus 99.79%), with only slight AA and $κ$ gains as the grid size increases from 5 to 7. Therefore, we recommend a grid size of 5 to balance model complexity and classification accuracy.
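
The dependence of the parameter count on the grid size in item (b) can be sketched as follows. The count assumes that every (input channel, output channel, kernel position) triple owns one learnable B-spline function with `grid_size + spline_order` coefficients plus one base weight, as in common KAN implementations. The actual layer layout of CV-CPKAN may differ, so this sketch does not attempt to reproduce the 0.095 M and 0.168 M totals; `kan_conv_params` and the layer sizes are hypothetical.

```python
def kan_conv_params(c_in, c_out, kernel, grid_size, spline_order=3):
    """Trainable parameters of one KAN convolution layer under the stated assumptions."""
    # Spline coefficients plus one base weight per learnable univariate function.
    per_function = (grid_size + spline_order) + 1
    return c_in * c_out * kernel * kernel * per_function


# Hypothetical layer: 9 input channels, 16 output channels, 3x3 kernel.
for grid_size in (3, 5, 7):
    print(grid_size, kan_conv_params(c_in=9, c_out=16, kernel=3, grid_size=grid_size))
```

Under these assumptions the count grows linearly with the grid size, which is consistent with the trade-off described above: a finer grid buys more flexible univariate functions at a proportional cost in parameters.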

5. Conclusions

This paper introduces a complex-valued convolutional Kolmogorov–Arnold framework for PolSAR image classification (CV-CPKAN), which consists of two multi-branch blocks based on complex KAN convolution layers and an improved complex-valued PolyLoss function. CV-CPKAN extracts robust and complete polarimetric features from both the amplitude and phase components of PolSAR data, with its MBComplexKConv blocks enriching polarimetric representations from different scales. Experimental results on three benchmark datasets demonstrate that the designed CV-CPKAN contributes to improved classification accuracy with lower standard deviation and enhanced contextual consistency.

While CV-CPKAN constitutes an exploratory study of KAN-based convolutional layers, its comparative study was conducted with adequate labeled samples. Such supervised methods will encounter substantial challenges on datasets with extreme annotation scarcity and inter-class imbalance. Therefore, our future work will concentrate on developing efficient, parallelized KAN convolution layers and applying them to self-supervised polarimetric representation learning.

Author Contributions

Conceptualization, Z.K.; Methodology, Z.K.; Software, Z.K.; Validation, Z.K.; Formal analysis, Z.K.; Investigation, Z.K.; Resources, H.B.; Data curation, Z.K., S.L. and H.B.; Writing—original draft, Z.K. and H.B.; Writing—review & editing, Z.K., S.L., H.B., L.H. and F.L.; Visualization, Z.K., S.L., H.B., L.H. and F.L.; Supervision, S.L., H.B., L.H. and F.L.; Project administration, S.L., H.B., L.H. and F.L.; Funding acquisition, S.L., H.B., L.H. and F.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key Research and Development Program of China, grant number 2022YFA1003800; the National Natural Science Foundation of China, grant numbers 42201394 and 42571393; and the Key Research and Development Program of Shaanxi Province, grant number 2025CY-YBXM-040. The APC was funded by the National Key Research and Development Program of China, 2022YFA1003800.

Data Availability Statement

All raw data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Chen, S.W.; Wang, X.S.; Xiao, S.P.; Sato, M. Target Scattering Mechanism in Polarimetric Synthetic Aperture Radar; Springer: Berlin/Heidelberg, Germany, 2018.
2. Pipia, L.; Fabregas, X.; Aguasca, A.; López-Martínez, C. Polarimetric temporal analysis of urban environments with a ground-based SAR. IEEE Trans. Geosci. Remote Sens. 2012, 51, 2343–2360.
3. Bi, H.; Xu, F.; Wei, Z.; Han, Y.; Cui, Y.; Xue, Y.; Xu, Z. Unsupervised PolSAR image factorization with deep convolutional networks. In Proceedings of the IGARSS 2019-2019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, 28 July–2 August 2019; pp. 1061–1064.
4. Cloude, S.R.; Pottier, E. A review of target decomposition theorems in radar polarimetry. IEEE Trans. Geosci. Remote Sens. 2002, 34, 498–518.
5. Bi, H.; Sun, J.; Xu, Z. Unsupervised PolSAR image classification using discriminative clustering. IEEE Trans. Geosci. Remote Sens. 2017, 55, 3531–3544.
6. Wang, N.; Jin, W.; Bi, H.; Xu, C.; Gao, J. A survey on deep learning for few-shot PolSAR image classification. Remote Sens. 2024, 16, 4632.
7. Krogager, E. New decomposition of the radar target scattering matrix. Electron. Lett. 1990, 26, 1525–1527.
8. Touzi, R. Target scattering decomposition in terms of roll-invariant target parameters. IEEE Trans. Geosci. Remote Sens. 2006, 45, 73–84.
9. Freeman, A.; Durden, S.L. A three-component scattering model for polarimetric SAR data. IEEE Trans. Geosci. Remote Sens. 2002, 36, 963–973.
10. Yamaguchi, Y.; Moriyama, T.; Ishido, M.; Yamada, H. Four-component scattering model for polarimetric SAR image decomposition. IEEE Trans. Geosci. Remote Sens. 2005, 43, 1699–1706.
11. Cloude, S. An entropy based classification scheme for polarimetric SAR data. In Proceedings of the 1995 International Geoscience and Remote Sensing Symposium, IGARSS’95. Quantitative Remote Sensing for Science and Applications, Firenze, Italy, 10–14 July 1995; IEEE: Piscataway, NJ, USA, 1995; Volume 3, pp. 2000–2002.
12. Doulgeris, A.P.; Anfinsen, S.N.; Eltoft, T. Classification with a non-Gaussian model for PolSAR data. IEEE Trans. Geosci. Remote Sens. 2008, 46, 2999–3009.
13. Uhlmann, S.; Kiranyaz, S. Integrating color features in polarimetric SAR image classification. IEEE Trans. Geosci. Remote Sens. 2013, 52, 2197–2216.
14. Xue, X.; Di, L.; Guo, L.; Lin, L. An efficient classification method of fully polarimetric SAR image based on polarimetric features and spatial features. In Proceedings of the 2015 Fourth International Conference on Agro-Geoinformatics (Agro-Geoinformatics), Istanbul, Turkey, 20–24 July 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 327–331.
15. Lardeux, C.; Frison, P.L.; Tison, C.; Souyris, J.C.; Stoll, B.; Fruneau, B.; Rudant, J.P. Support vector machine for multifrequency SAR polarimetric data classification. IEEE Trans. Geosci. Remote Sens. 2009, 47, 4143–4152.
16. Tao, M.; Zhou, F.; Liu, Y.; Zhang, Z. Tensorial independent component analysis-based feature extraction for polarimetric SAR data classification. IEEE Trans. Geosci. Remote Sens. 2014, 53, 2481–2495.
17. Bi, H.; Yao, J.; Wei, Z.; Hong, D.; Chanussot, J. PolSAR image classification based on robust low-rank feature extraction and Markov random field. IEEE Geosci. Remote Sens. Lett. 2020, 19, 4005205.
18. Kuang, Z.; Bi, H.; Li, F.; Xu, C.; Sun, J. Polarimetry-inspired Contrastive Learning for Class-imbalanced PolSAR Image Classification. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5212819.
19. Zhuang, D.; Zhang, L.; Zou, B. Model-Based Polarimetric SAR Target Decomposition: A Scheme to Introduce Repeat-Pass PolInSAR Coherence. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5212516.
20. Zhang, S.; Zhang, L.; Zou, B.; Zhuang, D.; Jiang, Y. SGMFNet: A Semantic-Guided Multi-Frequency PolSAR Data Fusion Framework Based on Scattering Mechanisms. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5222416.
21. Wang, P.; Chen, Y.; Huang, B.; Zhu, D.; Lu, T.; Dalla Mura, M.; Chanussot, J. MT_GAN: A SAR-to-optical image translation method for cloud removal. ISPRS J. Photogramm. Remote Sens. 2025, 225, 180–195.
22. He, Z.; Wang, P.; Huang, B.; Zhu, D.; Lee, H.F.; Leung, H. Dadigan: A Dual Attention Blocks-Based Disentangled Iterative Generative Adversarial Network for Cloud and Shadow Removal on Sar and Optical Images. Inf. Fusion 2026, 125, 103487.
23. Zhou, Y.; Wang, H.; Xu, F.; Jin, Y.Q. Polarimetric SAR image classification using deep convolutional neural networks. IEEE Geosci. Remote Sens. Lett. 2016, 13, 1935–1939.
24. Jamali, A.; Mahdianpari, M.; Mohammadimanesh, F.; Bhattacharya, A.; Homayouni, S. PolSAR image classification based on deep convolutional neural networks using wavelet transformation. IEEE Geosci. Remote Sens. Lett. 2022, 19, 4510105.
25. Alkhatib, M.Q.; Zitouni, M.S.; Al-Saad, M.; Aburaed, N.; Al-Ahmad, H. SDF2Net: Shallow to Deep Feature Fusion Network for PolSAR Image Classification. arXiv 2024, arXiv:2402.17672.
26. Dong, H.; Zhang, L.; Zou, B. Exploring vision transformers for polarimetric SAR image classification. IEEE Trans. Geosci. Remote Sens. 2021, 60, 5219715.
27. Jamali, A.; Roy, S.K.; Bhattacharya, A.; Ghamisi, P. Local window attention transformer for polarimetric SAR image classification. IEEE Geosci. Remote Sens. Lett. 2023, 20, 4004205.
28. Wang, W.; Wang, J.; Quan, D.; Yang, M.; Sun, J.; Lu, B. PolSAR Image Classification via a Multi-Granularity Hybrid CNN-ViT Model with External Tokens and Cross-Attention. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 8003–8019.
29. d’Ascoli, S.; Touvron, H.; Leavitt, M.L.; Morcos, A.S.; Biroli, G.; Sagun, L. Convit: Improving vision transformers with soft convolutional inductive biases. In Proceedings of the International Conference on Machine Learning, PMLR, Virtual, 18–24 July 2021; pp. 2286–2296.
30. Drokin, I. Kolmogorov-arnold convolutions: Design principles and empirical studies. arXiv 2024, arXiv:2407.01092.
31. Ho, J.; Jain, A.; Abbeel, P. Denoising diffusion probabilistic models. Adv. Neural Inf. Process. Syst. 2020, 33, 6840–6851.
32. Zhu, L.; Liao, B.; Zhang, Q.; Wang, X.; Liu, W.; Wang, X. Vision mamba: Efficient visual representation learning with bidirectional state space model. arXiv 2024, arXiv:2401.09417.
33. Liu, Z.; Wang, Y.; Vaidya, S.; Ruehle, F.; Halverson, J.; Soljačić, M.; Hou, T.Y.; Tegmark, M. Kan: Kolmogorov-arnold networks. arXiv 2024, arXiv:2404.19756.
34. Bodner, A.D.; Tepsich, A.S.; Spolski, J.N.; Pourteau, S. Convolutional Kolmogorov-Arnold Networks. arXiv 2024, arXiv:2406.13155.
35. Somvanshi, S.; Javed, S.A.; Islam, M.M.; Pandit, D.; Das, S. A survey on kolmogorov-arnold network. ACM Comput. Surv. 2024, 58, 55.
36. Bi, H.; Kuang, Z.; Li, F.; Gao, J.; Chen, X. Overview of deep learning algorithms for PolSAR image classification. Chin. Sci. Bull. 2024, 69, 5108–5128.
37. Song, J.; Gao, S.; Zhu, Y.; Ma, C. A survey of remote sensing image classification based on CNNs. Big Earth Data 2019, 3, 232–254.
38. Zhang, Z.; Wang, H.; Xu, F.; Jin, Y.Q. Complex-valued convolutional neural network and its application in polarimetric SAR image classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 7177–7188.
39. Zhang, S.; Cui, L.; Dong, Z.; An, W. A Deep Learning Classification Scheme for PolSAR Image Based on Polarimetric Features. Remote Sens. 2024, 16, 1676.
40. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
41. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 6000–6010.
42. Gu, A.; Dao, T. Mamba: Linear-time sequence modeling with selective state spaces. arXiv 2023, arXiv:2312.00752.
43. Xu, R.; Yang, S.; Wang, Y.; Cai, Y.; Du, B.; Chen, H. Visual Mamba: A Survey and New Outlooks. arXiv 2024, arXiv:2404.18861v3.
44. Peng, S.; Zhu, X.; Deng, H.; Deng, L.J.; Lei, Z. Fusionmamba: Efficient remote sensing image fusion with state space model. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5410216.
45. Yao, J.; Hong, D.; Li, C.; Chanussot, J. Spectralmamba: Efficient mamba for hyperspectral image classification. arXiv 2024, arXiv:2404.08489.
46. Kuang, Z.; Bi, H.; Li, F.; Xu, C. ECP-Mamba: An Efficient Multi-scale Self-supervised Contrastive Learning Method with State Space Model for PolSAR Image Classification. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5218718.
47. Li, Z. Kolmogorov-Arnold Networks are Radial Basis Function Networks. arXiv 2024, arXiv:2405.06721.
48. Sidharth, S.S.; Keerthana, A.R.; Gokul, R.; Anas, K.P. Chebyshev polynomial-based kolmogorov-arnold networks: An efficient architecture for nonlinear function approximation. arXiv 2024, arXiv:2405.07200.
49. Wolff, M.; Eilers, F.; Jiang, X. CVKAN: Complex-Valued Kolmogorov-Arnold Networks. arXiv 2025, arXiv:2502.02417.
50. Jamali, A.; Roy, S.K.; Hong, D.; Lu, B.; Ghamisi, P. How to learn more? Exploring Kolmogorov–Arnold networks for hyperspectral image classification. Remote Sens. 2024, 16, 4015.
51. Cheon, M. Kolmogorov-arnold network for satellite image classification in remote sensing. arXiv 2024, arXiv:2406.00600.
52. Helber, P.; Bischke, B.; Dengel, A.; Borth, D. Eurosat: A novel dataset and deep learning benchmark for land use and land cover classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 12, 2217–2226.
53. Li, G.; Ye, M. Dynamic 3D KAN Convolution with Adaptive Grid Optimization for Hyperspectral Image Classification. Arab. J. Sci. Eng. 2025, 1–14.
54. Zhao, L.; Zhu, T.; Zhou, C.; Luo, T.; Li, W.; Zhang, G. BAN: A Boundary-Aware Network Based on KAN for Robust Hyperspectral Image Classification Against Adversarial Attacks. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5522413.
55. Leng, Z.; Tan, M.; Liu, C.; Cubuk, E.D.; Shi, X.; Cheng, S.; Anguelov, D. Polyloss: A polynomial expansion perspective of classification loss functions. arXiv 2022, arXiv:2204.12511.
56. Loshchilov, I. Decoupled weight decay regularization. arXiv 2017, arXiv:1711.05101.
57. Loshchilov, I.; Hutter, F. SGDR: Stochastic Gradient Descent with Restarts. arXiv 2016, arXiv:1608.03983.
58. Alkhatib, M.Q. Polsar image classification using a hybrid complex-valued network (hybridcvnet). IEEE Geosci. Remote Sens. Lett. 2024, 21, 4017705.
59. Alkhatib, M.Q. PolSAR image classification using complex-valued multiscale attention vision transformer (CV-MsAtViT). Int. J. Appl. Earth Obs. Geoinf. 2025, 137, 104412.
60. Cui, X. CFAT: Convolutional Fieldy Attention Transformer for Polarimetric SAR Image Classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2025, 18, 26432–26445.
61. Lee, C.; Hasegawa, H.; Gao, S. Complex-valued neural networks: A comprehensive survey. IEEE/CAA J. Autom. Sin. 2022, 9, 1406–1426.

Figure 1. The workflow of $Φ(x)$, where the real-part transformation and the imaginary-part transformation are constructed by $ϕ(x)$.

Figure 2. The architecture of the complex-valued convolutional Kolmogorov–Arnold network for PolSAR image classification (CV-CPKAN). It consists of two multi-branch complex KAN convolution (MBComplexKConv) blocks, each integrating parallel complex KAN convolution layers with diverse kernels and complex batch normalization layers to extract multi-scale features from amplitude and phase components. MBComplexKConv blocks are then followed by complex PReLU activation and average pooling operations, ending in a complex linear layer for pixel-wise classification. We design a novel complex-valued PolyLoss (CV-PolyLoss) to optimize our model.

Figure 3. Classification maps of different methods over the Flevoland dataset. (a) Pauli RGB image. (b) Ground truth. (c) SVM. (d) Haar-CNN. (e) CV-CNN. (f) SDF2Net. (g) ViT. (h) HybridCVNet. (i) CV-MsAtViT. (j) VMamba. (k) CFAT. (l) CV-CPKAN.

Figure 4. Classification maps of different methods over the San Francisco dataset. (a) Pauli RGB image. (b) Ground truth. (c) SVM. (d) Haar-CNN. (e) CV-CNN. (f) SDF2Net. (g) ViT. (h) HybridCVNet. (i) CV-MsAtViT. (j) VMamba. (k) CFAT. (l) CV-CPKAN.

Figure 5. Classification maps of different methods over the Oberpfaffenhofen dataset. (a) Pauli RGB image. (b) Ground truth. (c) SVM. (d) Haar-CNN. (e) CV-CNN. (f) SDF2Net. (g) ViT. (h) HybridCVNet. (i) CV-MsAtViT. (j) VMamba. (k) CFAT. (l) CV-CPKAN.

Figure 6. OA, AA and Kappa curves of CV-CPKAN and baseline CNN on the San Francisco dataset.

Table 1. Classification performance of different methods on the Flevoland dataset.

Class	SVM [15]	Haar-CNN [24]	CV-CNN-SE [38]	SDF2Net [25]	ViT [26]	HybridCVNet [58]	CV-MsAtViT [59]	VMamba [32]	CFAT [60]	Ours
1: Water	81.91	99.09	99.32	99.81	100	99.94	99.92	99.79	99.92	100
2: Forest	71.71	85.39	98.80	99.23	98.90	99.29	99.40	99.11	99.46	100
3: Lucerne	82.04	98.29	96.32	97.46	97.84	99.32	99.21	99.57	99.55	99.94
4: Grass	0.24	83.90	86.19	85.10	99.88	96.55	95.53	97.14	98.76	99.64
5: Rapeseed	68.99	88.25	93.87	94.05	99.32	96.78	99.15	98.29	99.37	99.99
6: Beet	68.10	74.78	77.65	91.59	98.76	97.49	96.88	97.80	97.99	99.61
7: Potatoes	79.40	95.93	95.39	91.30	98.95	96.54	97.24	96.46	97.96	99.55
8: Peas	68.33	99.19	97.65	95.80	97.52	98.12	98.78	98.89	99.42	99.93
9: Stem	73.01	91.45	95.90	99.00	99.95	97.75	98.34	99.08	99.46	99.86
10: Bare	0.00	95.06	94.08	94.49	98.01	99.56	98.33	99.36	99.93	99.96
11: Wheat 3	73.97	96.41	98.78	98.16	99.66	98.24	97.25	99.11	99.79	99.92
12: Wheat 2	0.05	72.03	81.86	97.34	100	94.33	98.85	98.76	99.36	100
13: Wheat	83.86	97.53	98.62	98.86	98.66	99.46	99.77	99.49	99.90	100
14: Barley	0.00	96.51	96.76	98.51	99.17	99.20	98.06	99.22	99.40	100
15: Buildings	1.04	84.60	86.33	86.85	99.40	98.44	99.57	95.41	98.45	98.46
OA (%)	63.22	91.73	94.78	96.01	98.89	98.13	98.51	98.69	99.28	99.86
AA (%)	50.18	90.56	93.17	95.17	99.07	98.07	98.42	98.50	99.25	99.72
$κ$ (%)	59.18	90.96	93.92	95.64	98.79	97.95	98.37	98.59	99.24	99.70
STD of OA (%)	8.66	4.15	1.42	0.40	0.31	0.27	0.21	0.23	0.18	0.02
STD of AA (%)	0.99	5.30	2.12	0.62	0.29	0.35	0.19	0.26	0.21	0.14
STD of $κ$ (%)	1.67	5.43	1.61	0.44	0.34	0.24	0.27	0.21	0.23	0.15

Table 2. Classification performance of different methods on the San Francisco dataset.

Class	SVM [15]	Haar-CNN [24]	CV-CNN-SE [38]	SDF2Net [25]	ViT [26]	HybridCVNet [58]	CV-MsAtViT [59]	VMamba [32]	CFAT [60]	Ours
1: Bare Soil	0.04	78.97	57.81	79.98	96.47	81.23	86.07	80.30	90.40	97.75
2: Mountain	40.61	94.62	94.82	94.49	99.52	98.02	95.14	91.39	98.02	99.80
3: Water	98.37	99.26	99.11	98.70	93.47	99.54	99.01	97.59	99.50	99.86
4: Urban	95.65	96.21	98.47	98.94	94.38	98.24	98.78	97.21	98.99	99.93
5: Vegetation	64.21	60.25	77.71	87.42	96.02	93.09	93.05	87.21	91.69	99.31
OA (%)	88.73	94.65	96.37	97.13	95.86	98.11	98.03	95.90	98.50	99.80
AA (%)	59.77	85.86	85.58	91.31	97.01	94.03	94.41	90.74	95.72	99.24
$κ$ (%)	81.75	91.58	94.28	95.50	95.94	97.03	96.90	94.72	97.57	99.05
STD of OA (%)	0.12	2.00	0.22	0.20	0.89	0.34	0.33	0.41	0.22	0.01
STD of AA (%)	0.86	4.28	1.78	1.54	0.56	0.26	0.29	0.36	0.42	0.08
STD of $κ$ (%)	0.21	3.08	0.35	0.31	0.35	0.29	0.31	0.30	0.33	0.09

Table 3. Classification performance of different methods on the Oberpfaffenhofen dataset.

Class	SVM [15]	Haar-CNN [24]	CV-CNN-SE [38]	SDF2Net [25]	ViT [26]	HybridCVNet [58]	CV-MsAtViT [59]	VMamba [32]	CFAT [60]	Ours
1: Built-Up Areas	56.33	93.09	90.49	91.01	92.25	94.62	95.74	93.87	95.77	99.76
2: Woodland	57.21	90.73	96.44	96.80	97.13	97.70	97.94	96.09	97.24	99.96
3: Open Areas	95.98	94.10	96.14	96.71	97.77	97.54	96.96	97.12	97.47	99.89
OA (%)	80.46	94.21	94.86	95.30	96.27	96.81	96.85	96.12	97.02	99.74
AA (%)	70.84	93.97	94.49	94.84	95.72	96.62	96.88	95.69	96.83	99.77
$κ$ (%)	64.20	90.26	91.25	91.99	93.38	94.62	94.60	94.16	94.90	99.66
STD of OA (%)	1.29	0.49	0.24	0.08	0.42	0.37	0.29	0.39	0.21	0.17
STD of AA (%)	2.10	1.47	0.31	0.08	0.41	0.34	0.28	0.38	0.17	0.09
STD of $κ$ (%)	2.71	1.61	0.30	0.13	0.67	0.41	0.21	0.40	0.69	0.13

Table 4. Comparison of model parameters, FLOPs and MACs.

Metric	SVM [15]	Haar-CNN [24]	CV-CNN-SE [38]	SDF2Net [25]	ViT [26]	HybridCVNet [58]	CV-MsAtViT [59]	VMamba [32]	CFAT [60]	Ours
Params (M)	N/A	0.126	0.132	0.094	0.258	0.310	0.308	0.252	0.228	0.134
FLOPs (G)	0.136	0.189	0.198	0.142	0.515	0.465	0.461	0.378	0.232	0.080
MACs (G)	0.064	0.092	0.096	0.068	0.248	0.223	0.221	0.182	0.114	0.040

Table 5. Ablation study of CV-CPKAN on the San Francisco dataset.

CNN Baseline	KAN Convolution Layers	Complex Network	MBComplexKConv Blocks	OA (%)	AA (%)	$κ$ (%)	STD of OA (%)	STD of AA (%)	STD of $κ$ (%)
✔	✘	✘	✘	96.32	88.73	85.91	0.06	0.18	0.23
✘	✔	✘	✘	99.22	97.09	96.36	0.02	0.07	0.09
✘	✔	✔	✘	99.38	97.54	96.93	0.02	0.04	0.05
✘	✔	✔	✔	99.80	99.24	99.05	0.01	0.08	0.09

Table 6. Sampling rate (SR) analysis on the San Francisco dataset.

Acc (%) / SR (%)	1	5	10	15	20
CV-CPKAN
OA	98.77	99.63	99.80	99.84	99.90
AA	96.58	98.82	99.24	99.52	99.78
$κ$	95.72	98.53	99.05	99.40	99.72
Baseline CNN
OA	95.48	95.97	96.29	96.34	96.36
AA	83.36	87.29	88.60	88.84	89.10
$κ$	79.19	84.11	85.75	85.92	86.26

Table 7. Loss function analysis of CV-CPKAN on the Flevoland dataset.

Acc & STD	CV-MSELoss	CV-CELoss	CV-PolyLoss
OA (%)	99.84	99.85	99.86
AA (%)	99.03	99.56	99.72
$κ$ (%)	98.95	99.43	99.70
STD of OA (%)	0.03	0.03	0.02
STD of AA (%)	0.40	0.15	0.14
STD of $κ$ (%)	0.43	0.14	0.15

Table 8. Network hyper-parameter analysis of CV-CPKAN on the San Francisco dataset.

Setting Index	Input Patch Size	Kernel Size of KAN Convolution Layers	Grid Size of KAN Convolution Layers	Number of MBComplexKConv Blocks	OA (%)	AA (%)	$κ$ (%)
0	16	3	5	2	99.80	99.24	99.05
1	14	-	-	-	99.75	99.08	98.99
2	18	-	-	-	99.79	99.13	98.92
3	-	1	-	-	99.70	99.20	99.00
4	-	5	-	-	99.76	99.16	99.02
5	-	-	3	-	99.73	99.09	98.96
6	-	-	7	-	99.79	99.37	99.21
7	-	-	-	1	99.65	98.28	97.85
8	-	-	-	3	99.77	99.25	99.06
