Underwater Image Classification Based on LBP-KPCA Combined with SSA-SVM Approach

Li, Han; Li, Songsong; Zhou, Qiaozhen; Ma, Zhongsong; Chen, Xiaoming

doi:10.3390/info17030229

Open AccessArticle

Underwater Image Classification Based on LBP-KPCA Combined with SSA-SVM Approach

by

Han Li

¹,

Songsong Li

^1,*,

Qiaozhen Zhou

¹,

Zhongsong Ma

¹ and

Xiaoming Chen

²

¹

School of Information Engineering, Dalian Ocean University, Dalian 116023, China

²

School of Optoelectronic Engineering and Instrumentation Science of DUT, Dalian University of Technology, Dalian 116024, China

^*

Author to whom correspondence should be addressed.

Information 2026, 17(3), 229; https://doi.org/10.3390/info17030229

Submission received: 20 January 2026 / Revised: 16 February 2026 / Accepted: 25 February 2026 / Published: 28 February 2026

(This article belongs to the Section Artificial Intelligence)

Download

Browse Figures

Versions Notes

Abstract

China possesses abundant marine fishery resources, which play a vital role in the national economy. Achieving rapid and high-precision classification of underwater targets in complex aquatic environments is of significant importance for enhancing aquaculture intelligence and operational efficiency. To address the challenges of insufficient feature extraction and inefficient classifier parameter optimization in underwater image classification, this study proposes a classification method integrating local binary patterns (LBP), kernel principal component analysis (KPCA), and an improved sparrow search algorithm (SSA). The method first extracts image texture features using LBP and then applies KPCA for nonlinear dimensionality reduction. Subsequently, three optimization strategies—dynamic weighting, boundary contraction, and adaptive mutation—are introduced to enhance SSA, which is then employed to optimize the core parameters of the Support Vector Machine (SVM). Experiments were conducted on an underwater image dataset containing four types of targets: sea urchins, fish, rocks, and scallops. The results demonstrate that, compared with the traditional KPCA-SVM method, the integration of LBP features and the improved SSA increases classification accuracy from 55% to 94.37%, validating the effectiveness of the proposed approach in extracting underwater image features and optimizing classifier parameters. This provides technical support for improving the feasibility of automatic underwater target recognition in aquaculture applications.

Keywords:

underwater images; machine learning; classification accuracy

1. Introduction

China’s coastline stretches a full 18,000 km, and its expansive maritime territory makes it a true maritime power. However, China’s seas possess unique characteristics: most waters are nearshore and relatively shallow, with areas within 60 nautical miles and depths not exceeding 100 m accounting for 98.5% of the country’s total maritime area. As China’s comprehensive national strength continues to grow, its exploration and understanding of the ocean have deepened significantly. The rapid expansion of the marine economy has come to play an increasingly important role in the nation’s overall economic growth. The vast ocean harbors valuable mineral resources, abundant biological resources, and marine chemical resources. By continuously advancing marine resource exploitation technologies, these latent potentials are being transformed into new drivers of economic growth, powerfully propelling rapid economic development. This process is vital for addressing major societal challenges and advancing the great rejuvenation of the Chinese nation. At the same time, the expansive ocean provides significant military-strategic advantages but also imposes higher demands on the security, defense, and resource protection of maritime territories. Economic activities such as seabed mineral exploration, oil drilling platform monitoring, and fisheries resource tracking rely heavily on precise underwater target classification technology. Therefore, whether to promote the development of the marine economy or to safeguard national sovereignty and territorial integrity, researchers are driven to continuously improve the accuracy and efficiency of underwater target classification.

As one of the core technologies for marine resource exploration, ecological monitoring, and autonomous navigation of underwater robots, the research value of underwater image classification becomes more and more prominent with the deepening of marine development in recent years. However, the complexity and specificity of underwater environments—such as light attenuation, scattering effects, suspended particle interference, and low contrast—make traditional image processing methods face serious challenges in feature extraction and classification accuracy.

Traditional underwater image classification research mainly uses manually extracted features to describe the image content, such as Scale Invariant Feature Transform (SIFT) [1], Histogram Orientation Gradient (HOG) [2], and then combines them with the Nearest Neighbor Classifier, Support Vector Machine (SVM) [3], Random Forest [4] and other classifiers for recognition classification. Although these methods are simple and effective, they are usually limited by manually extracted features and suffer from weak generalization ability and sensitivity to environmental changes.

In recent years, learning methods represented by deep learning, especially convolutional neural networks (CNNs), have significantly driven the performance of underwater image classification. CNNs play an important role as a powerful tool in image and video classification [5], and CNN methods, with their powerful nonlinear feature representation, enable the integration of feature extraction and classification decision-making processes in classification tasks [6,7]. Aiming at the problem of uneven illumination and color shift specific to underwater images, researchers have proposed enhanced CNN structures dedicated to underwater image classification. For example, Li et al. [8] proposed combining an image enhancement module with a CNN structure to significantly improve the classification accuracy. Fu et al. proposed a detection method for small underwater targets: candidate regions were first extracted based on Markov random field segmentation, followed by the computation of Hu moment features for potential target areas and their fusion with a convolutional neural network, thereby enhancing the characterization and recognition capabilities for small targets [9]. Wang et al. improved upon the CenterNet framework by integrating semantic and spatial information within the network, enhancing the multi-scale consistency of feature representation and subsequently improving the detection performance for multi-scale underwater targets [10]. Cai et al. employed the YOLOv3 object detection algorithm to achieve precise identification and counting of Takifugu rubripes, effectively improving the recognition accuracy and practical applicability for specific aquatic species [11]. Wang et al. proposed an improved algorithm named YOLOv7-PSS based on YOLOv7. Through structural optimization, the model’s parameter count was reduced while training and inference efficiency were enhanced. Additionally, the SIoU loss function was introduced to accelerate network convergence, thereby optimizing the overall training process and detection performance [12]. Yuan et al. developed a fish target detection method based on Faster R-CNN and implemented a secondary transfer learning strategy for network training, aiming to strengthen the model’s adaptability and generalization ability across various underwater scenarios [13]. Chen et al. introduced a fine-grained feature-aware detection method for underwater salps, enhancing the expression of key cues through cross-dimensional feature interaction and fusion, thereby reducing missed detections caused by underwater biological occlusion and improving detection reliability [14]. While deep learning models can provide superior recognition accuracy and performance, they have a huge demand for computational resources and often face operational challenges on resource-constrained devices. Comparatively speaking, machine learning technology is known for its fast training process, small model size, and convenient deployment characteristics, and can operate without relying on high-end hardware, so the classification models based on traditional machine learning methods still have strong application value and research significance in resource-constrained underwater image classification scenarios.

Machine learning algorithms, as useful decision-making tools, are widely used in society [15]. With the continuous development of machine learning methods, traditional machine learning models are widely used in the field of underwater image classification. Early research mainly relied on the combination of feature extraction and underlying classification models, where classification models include the K-Nearest Neighbors (KNN) algorithm [16], Decision Tree (DT) [17]. These methods usually extract image features by hand and then classify them using the classification models mentioned above. KNN is commonly used to solve classification and regression problems [18]. The KNN method is based on Euclidean distance for category discrimination, which is easy to implement but sensitive to noise and sample distribution; the DT model has strong interpretability and adaptability, but is prone to overfitting problems and has limited performance in the face of complex nonlinear boundaries. To address the shortcomings of the above methods, the Support Vector Machine (SVM) has gradually become a mainstream model in underwater image classification. The SVM uses the data points closest to the hyperplane, known as support vectors, to guide hyperplane placement [19]. SVM maps the input features to a high-dimensional feature space by introducing a kernel function to construct an optimal hyperplane with maximum spacing, which has good robustness and is especially suitable for high-dimensional, small-sample, and nonlinear classification problems. In the face of underwater images with uneven illumination, color distortion, and background noise interference, SVM shows strong discriminative ability and is considered one of the classifiers with outstanding performance in traditional machine learning. However, the direct use of high-dimensional image features often suffers from data redundancy, degradation of classification performance, and increased computational complexity due to the fact that underwater images are generally affected by light attenuation, color distortion, and background complexity [20]. To alleviate the above problems, researchers have widely introduced feature dimensionality reduction techniques to improve the performance of models by reducing the data dimensions. Among them, the commonly used dimensionality reduction methods include Linear Discriminant Analysis (LDA), flow learning methods such as Local Linear Embedding LLE, Isometric Mapping, and so on. Linear Discriminant Analysis (LDA) implements supervised feature dimensionality reduction by maximizing inter-category differences while minimizing intra-category differences. However, the LDA method relies on strong distributional assumptions in the modeling process, i.e., the samples of each category obey a Gaussian distribution and have a similar covariance structure, which is usually difficult to establish in complex and variable underwater environments, thus limiting its classification performance [21]. Stream shape learning methods perform nonlinear dimensionality reduction by maintaining the local geometric structure of the data. Although they can effectively capture the nonlinear structure, they are sensitive to noise and have high computational complexity, which can lead to poor image classification performance in underwater environments with severe noise interference [22].

Compared with the above methods, principal component analysis (PCA) has received extensive attention in underwater image classification in recent years due to its advantages of simple algorithms, efficient computation, and robustness to noise [23]. PCA was initially used to extract latent features and minimize redundancy [24]. PCA projects the original data into a low-dimensional space by orthogonal transformation to maximize the variance information of the original data and obtain the features containing the most important information about the data [25]. In other words, PCA reduces high-dimensional data to a lower number of dimensions while retaining the important information that explains the original data [26], which is able to effectively remove redundant information and reduce the feature dimensions, thus improving the performance of the classifier [27]. Kent et al. [28]. achieved a high classification accuracy of approximately 90% for occupants’ overall spatial satisfaction by combining PCA and SVM. However, due to the fact that underwater image data usually has obvious nonlinear characteristics, the performance of traditional linear PCA is limited in processing nonlinear streaming data. Therefore, nonlinear dimensionality reduction methods such as kernel principal component analysis (KPCA) are gradually introduced into the field of underwater image classification [29]. Although KPCA partially solves the problem of nonlinearity in the data, its ability to characterize local details is still limited by the way the original features are represented. In order to enhance the ability of describing the local structure of the image, local binary pattern (LBP) is introduced in this paper for texture feature extraction [30]. LBP is a texture description operator based on pixel gray-scale differences, which has the advantages of computational simplicity, robustness to illumination changes, and the ability to effectively characterize local microstructures. By combining LBP with KPCA, the ability to characterize image details can be enhanced while retaining the nonlinear mapping capability, further improving the performance of subsequent classifiers.

Meanwhile, as a widely used classifier for underwater image classification tasks, the performance of the Support Vector Machine (SVM) is highly dependent on the optimization of hyperparameters. Traditional parameter optimization methods, such as grid search and cross-validation, although simple, have high computational complexity and are prone to falling into local optimums, which are difficult to meet the demand for real-time and robustness of underwater image classification tasks. To address this problem, this paper introduces the sparrow search algorithm (SSA) to optimize the hyperparameters of SVM. SSA is a new type of population intelligence optimization algorithm proposed in recent years, which is constructed inspired by the sparrow’s foraging behavior, and possesses the advantages of strong global search capability, fast convergence speed, and simple implementation [31]. In order to further improve the convergence stability and local exploration ability of SSA in complex search spaces, this paper introduces three optimization strategies, namely, dynamic weighting, boundary contraction, and adaptive mutation, on the basis of the original SSA framework, which enhances the performance of SSA in high-dimensional nonlinear optimization problems. Combining the improved SSA with SVM can significantly improve the feature extraction ability and classification performance of the model for complex images, especially for underwater image scenes with complex feature distribution.

Based on the above background, this paper proposes an underwater image classification framework that integrates LBP extraction of texture features, KPCA nonlinear dimensionality reduction, and SSA optimization, in order to solve the dual bottlenecks in feature extraction and parameter optimization of existing methods. Firstly, the local texture histogram of the image is extracted by the LBP algorithm, and the high-dimensional features are nonlinearly downscaled by using KPCA to eliminate the redundant information; secondly, an improved SSA is designed, which optimizes the process of finding the optimal parameters of the SVM by incorporating the optimization strategies, such as dynamic weights, boundary contraction, and so on, and further improves the classification accuracy of the algorithm. The experimental results show that compared with other classification models, the method effectively improves the accuracy of underwater image classification, proves the feasibility and effectiveness of the method, and provides a feasible solution for underwater image classification.

2. Fundamental Principles

2.1. Kernel Principal Component Analysis

The core idea of KPCA is to use a kernel function to nonlinearly map input data into a high-dimensional feature space and perform linear PCA in that space in order to capture the nonlinear structural features of the data.

Specifically, assume that the input data is as follows:

X = [x_{1}, x_{2}, \dots, x_{n}], x_{i} \in R^{d}

(1)

First, the nonlinear mapping function

ϕ (\cdot)

is defined and the kernel matrix

K

is computed:

K_{i j} = (ϕ (x_{i}), ϕ (x_{j})) = k (x_{i} {, x}_{j})

(2)

Subsequent centering of the nuclear matrix:

K^{'} = K - 1_{n} K - K 1_{n} + 1_{n} K 1_{n}

(3)

Then perform eigenvalue decomposition on the centralized kernel matrix:

K^{'} α^{(k)} = λ_{k} n α^{(k)}

(4)

For any input data

x

, its projection on the

k

th principal component is as follows:

y_{k} (x) = \sum_{i = 1}^{n} {α_{i}}^{(k)} k (x_{i}, x)

(5)

This results in a low-dimensional feature representation:

y (x) = {[y_{1} (x), y_{2} (x) \dots, y_{m} (x)]}^{T}

(6)

2.2. Support Vector Machine for Classification

Support Vector Machines (SVMs) are a class of margin-based learning methods that can be applied to both classification and regression. In this study, we focus on SVM for classification, specifically a kernel SVM classifier (C-SVC), to discriminate underwater targets into multiple categories. The core idea of SVM is to construct the optimal separating hyperplane in the feature space to improve the generalization ability of the classifier by maximizing the classification interval [32]. This process can be formalized as the following quadratic programming optimization problem with constraints:

\begin{array}{r} \underset{α}{m i n} & \frac{1}{2} \sum_{i = 1}^{n} \sum_{j = 1}^{n} α_{i} α_{j} y_{i} y_{j} K (x_{i}, x_{j}) - \sum_{i = 1}^{n} α_{i} \\ s . t . & \sum_{i = 1}^{n} α_{i} y_{i} = 0 \\ 0 \leq α_{i} \leq C, i = 1, 2, \dots, n \end{array}

(7)

where

K (x_{i}, x_{j})

is the kernel function,

C

is the penalty factor, and

α_{i}

is the Lagrange multiplier. The classification decision function can be obtained by solving this optimization problem:

f (x) = s i g n (\sum_{i = 1}^{n} α_{i} y_{i} K (x_{i}, x) + b)

(8)

Considering the nonlinear characteristics of underwater image feature space, the radial basis kernel function (RBF) is chosen as the basis of feature mapping in this paper, and its expression is as follows:

K (x_{i}, x_{j}) = e x p (- γ {∥ x_{i} - x_{j} ∥}^{2})

(9)

2.3. Sparrow Search Algorithm

SSA is a swarm intelligence optimization algorithm that simulates the foraging behavior of sparrows, which was proposed by Xue et al. in 2020 [31], inspired by sparrow foraging and vigilance behavior. SSA divides individuals in the population into producers, scroungers, and watchers, and individuals update their positions according to their roles at different stages to search for optimal solutions.

Suppose the population size is N, the problem dimension is D, and the location of each sparrow is as follows:

X = [x_{i, j}]_{N \times D}

(10)

Forager position updates (producers): Foragers are responsible for global searching. The update formula is as follows:

x_{i, j}^{t + 1} = {\begin{array}{l} x_{i, j}^{t} \cdot e x p (- \frac{i}{α \cdot T}), & R_{2} < S T \\ x_{i, j}^{t} + Q \cdot L, & R_{2} \geq S T \end{array}

(11)

Follower position updates (scroungers): Followers perform a localized search based on the position of the forager:

x_{i, j}^{t + 1} = {\begin{matrix} Q \cdot e x p (\frac{x_{w o r s t, j}^{t} - x_{i, j}^{t}}{i^{2}}), & i > N / 2 \\ x_{i, j}^{t} + | x_{i, j}^{t} - x_{b e s t, j}^{t} | \cdot A, & i \leq N / 2 \end{matrix}

(12)

Watcher position updates (watchers): A small number of sparrows randomly take on the role of watchers, monitoring the environment and guiding the population to jump out of the local optimum:

x_{i, j}^{t + 1} = x_{b e s t, j}^{t} + β \cdot | x_{i, j}^{t} - x_{b e s t, j}^{t} |

(13)

After each iteration, all individual positions are evaluated based on the fitness function f (x), and the current optimal individual position is updated to enter the next round of position update until the termination condition is satisfied.

3. Materials and Methods

The designed algorithm flow of underwater image classification based on LBP-KPCA combined with the SSA-SVM method is shown in Figure 1, where image preprocessing is performed on the underwater dataset, after which the LBP features of the image are extracted and incorporated into the KPCA to reduce the dimensionality, and finally, the classification is performed by using the SSA-optimized SVM.

3.1. Feature Extraction Method by Fusing LBP-KPCA

In the underwater image classification task, due to the complexity of the underwater environment, resulting in serious distortion and degradation of the image, traditional feature extraction methods often find it difficult to accurately capture the essential features of the image. Local binary pattern (LBP), as a widely used texture feature description operator in computer vision, the core idea is to encode local texture information by comparing the magnitude relationship between the center pixel and its neighboring pixel values, which not only has the advantages of rotational invariance and grayscale invariance, but also performs well in terms of computational efficiency. Therefore, this paper designs a feature extraction framework based on LBP-KPCA, which realizes the efficient extraction of underwater image features by combining the texture description capability of LBP with the nonlinear dimension reduction advantage of KPCA. For the center pixel point located at (

x_{c}, y_{c}

) in the image, the mathematical expression of the LBP feature value is as follows:

L B P_{P, R} (x_{c}, y_{c}) = \sum_{p = 0}^{P - 1} s (g_{p} - g_{c}) 2^{p}

(14)

where P denotes the number of neighborhood sampling points, R denotes the sampling radius,

g_{c}

is the grayscale value of the center pixel,

g_{p}

is the grayscale value of the neighboring pixel points, and s(x) is the threshold function:

s (x) = {\begin{array}{l} 1, & x \geq 0 \\ 0, & x < 0 \end{array}

(15)

In practical applications, considering that the texture features presented in underwater images in different scales and directions have significant local similarity and directional consistency, this paper adopts the uniform mode LBP operator, which filters the main structural modes by counting the number of “0/1” and “1/0” jumps in the LBP coding. When the number of jumps does not exceed 2, the pattern is considered UNIFORM [33], which not only significantly reduces the feature dimensions but also preserves more than 90% of the texture information in the image. The texture feature descriptors with strong discriminative ability are finally obtained by histogram statistics and normalization of LBP feature maps. The U-value is calculated by the following formula:

U (L B P_{P, R}) = | s (g_{P - 1} - g_{c}) - s (g_{0} - g_{c}) | + \sum_{p = 1}^{P - 1} | s (g_{p} - g_{c}) - s (g_{p - 1} - g_{c}) |

(16)

After extracting the texture features by LBP, considering that there may be complex nonlinear relationships between underwater image features, this paper introduces the kernel principal component analysis KPCA, which maps the data in the input space to the high-dimensional feature space through the kernel function to capture the nonlinear structural features of the data. Given

N

d

-dimensional sample vectors

{x_{i}}_{i = 1}^{N}

, the samples are mapped to the high-dimensional feature space by the kernel function

K (\cdot, \cdot)

, in which the radial basis kernel function (RBF) chosen in this paper has good nonlinear mapping ability and mathematical properties:

K (x_{i}, x_{j}) = \exp (- γ {∥ x_{i} - x_{j} ∥}^{2})

(17)

where

γ

is the bandwidth parameter of the RBF kernel function used to control the radial range of action of the Gaussian kernel. By solving the eigenvalue equation,

K α_{i} = λ_{i} α_{i}

(18)

The eigenvector

α_{i}

and the corresponding eigenvalue

α_{i}

can be obtained. In this paper, KPCA is not only used to reduce the feature dimension, but more importantly, it enhances the discriminative ability of LBP features through nonlinear mapping, so that a set of orthogonal feature vectors are obtained, which constitute a set of bases in the feature space, and the original features can be projected into the k-dimensional space by selecting the feature vectors corresponding to the largest k feature values:

y = Φ^{T} v = \sum_{i = 1}^{N} α_{i} K (x, x_{i})

(19)

In KPCA, we employ the RBF kernel

k (x, y) = e x p (- γ_{kpca} ∥ x - y ∥_{2}^{2})

and set

γ_{kpca} = 0.1

as a fixed parameter. This parameter is not included in the SSA optimization, because our hyperparameter search is primarily devoted to the downstream RBF-SVM (

C

and

γ_{svm}

), which directly governs the classification margin and typically exhibits higher sensitivity to performance. Jointly optimizing

γ_{kpca}

would significantly enlarge the search space and increase training cost. Moreover, the KPCA input in our pipeline is a normalized LBP histogram feature. For L1-normalized histograms, the squared Euclidean distance is bounded by

∥ x - y ∥_{2}^{2} \leq 2

, and with

γ_{kpca} = 0.1

, the RBF kernel value remains in a non-saturated range

k \in [e x p (- 0.2), 1] \approx [0.819,1]

, avoiding degenerate similarity structures (nearly constant or overly localized kernels) and enabling KPCA to extract meaningful nonlinear components before SVM classification.

The specific underwater image feature extraction process is shown in Figure 2. The underwater image feature extraction process designed in this paper adopts a multi-layer processing architecture, which constructs a complete feature extraction and optimization system from raw data input to feature dimension compression. Firstly, in the data input layer, the system performs format validation and quality screening on the raw image data from four categories (sea urchin, fish, rock, and scallop) to ensure the normality and usability of the input data in the subsequent processing process; secondly, the preprocessing layer implements the conversion from BGR format to grayscale map and the size standardization (256 × 256), and at the same time introduces data enhancement techniques such as noise suppression and contrast enhancement. Pixel values are mapped to the [0, 1] interval via min–max normalization. Subsequently, the feature extraction layer carries out texture feature extraction by integrating the uniform mode LBP operator, and the generated feature histogram is statistically generated by using n_points+3 intervals, and the stability of the features is ensured by the normalization process. Finally, in the KPCA feature dimensionality reduction layer, the nonlinear mapping is performed by using the RBF kernel function (γ = 0.1), and the feature is reduced to 8 dimensions by principal component analysis. The final output is a feature matrix of shape (N,8), and the whole process ensures the reliability and validity of feature extraction through a quality feedback loop. The whole process ensures the reliability and effectiveness of feature extraction through a quality feedback loop.

3.2. SVM Parameter Optimization Method Based on Improved SSA

As a classical statistical learning method, the classification performance of SVM relies heavily on the selection of kernel function parameters and penalty factors, while traditional parameter tuning methods often suffer from computational inefficiency, local optimization, etc. Therefore, this section discusses in depth the optimization mechanism of SVM parameters based on the sparrow search algorithm SSA, and realizes the classifier by constructing an adaptable and convergent optimization framework, with a performance that is significantly improved, while the convergence and stability of the algorithm are deeply analyzed at both theoretical and practical levels.

In order to fully leverage the classification performance of SVM, this paper designs a parameter optimization framework based on SSA. First, within the discoverer update, we design a staged dynamic position update mechanism, and employ the nonlinear decay and periodic modulation of

α (t)

,

ω (t)

, and

Q (t)

to realize an adaptive regulation strategy characterized by strong exploration in the early phase and strong convergence in the later phase. This mechanism enhances the coverage of global search while improving the stability of late-stage convergence. Second, we incorporate boundary contraction and gradient-guided local refinement, enabling more directional exploitation within the neighborhood of promising optima and thereby improving the granularity of the resulting parameter solutions. Third, Lévy flight mutation is introduced to provide long-range jumps that strengthen the ability to escape local optima, preventing the search from being trapped in spurious basins under noisy perturbations. Finally, a population-density-based adaptive control is adopted to dynamically perceive and adjust diversity, such that the balance between exploration and exploitation no longer relies solely on the iteration index, but can be adaptively tuned according to the current search state.

In the optimization process of SSA, by simulating the foraging behavior and vigilance mechanism of the sparrow population in nature, this paper constructs an efficient parameter finding strategy framework. In this framework, the finder’s update strategy introduces a staged dynamic position update mechanism, which is formulated as follows:

X_{i}^{t + 1} = {\begin{array}{l} X_{i}^{t} \cdot e x p (- \frac{i}{α (t) T}) + ω (t) \cdot V_{i}^{t} + c_{1} \cdot r_{1} \cdot (X_{p b e s t} - X_{i}^{t}), & R < S T \\ X_{i}^{t} + Q (t) \cdot L \cdot (X_{g b e s t} - X_{i}^{t}) + c_{2} \cdot r_{2} \cdot (X_{a v g} - X_{i}^{t}), & o t h e r w i s e \end{array}

(20)

This updating strategy reflects a hierarchical pattern of behavior in sparrow populations: in the early stages, it relies mainly on the exponential decay factor with the velocity term for large-scale search, i.e., global exploration, while in the later stages, it tends to converge towards the individual optimum and the group average position for local refinement. This position update mechanism not only ensures the convergence of the algorithm but also effectively avoids falling into local optimality.

In order to further enhance the dynamic regulation of the algorithm, a time-dependent adaptive parameter adjustment mechanism is introduced in the following form:

\begin{array}{l} α (t) & = α_{m a x} - (α_{m a x} - α_{m i n}) \cdot {(t / T)}^{2} \\ ω (t) & = ω_{m a x} - (ω_{m a x} - ω_{m i n}) \cdot {(t / T)}^{0.5} \\ Q (t) & = Q_{0} \cdot (1 + β \cdot s i n (π t / T)) \end{array}

(21)

These adjustment functions embody the optimization idea of pre-exploration and post-convergence, in which the parameters decay nonlinearly with the number of iterations, or fluctuate periodically, which helps to adjust the search intensity and scope at different stages of the algorithm, thus realizing a balanced adjustment of the search process.

In terms of strengthening the local search capability, this paper designs a fine search strategy combined with gradient guidance, which is formulated as follows:

X_{l o c a l}^{t + 1} = X_{b e s t}^{t} + λ (t) \cdot \nabla F (X_{b e s t}^{t}) + η \cdot N (0, σ^{2} (t))

(22)

where

λ (t)

is the adaptive step factor,

\nabla F (X)

is the gradient estimate of the fitness function, and

σ^{2} (t)

is the dynamically adjusted Gaussian perturbation variance. Through this multi-level optimization mechanism, the algorithm is able to achieve efficient and refined exploration within the neighborhood of the optimal solution while maintaining a strong global search capability.

In order to be able to provide a wider range of parameter space exploration capabilities while maintaining population diversity, the algorithm further introduces a Lévy flight-based mechanism, which enhances the ability of the algorithm to jump out of the local optimum, as formulated below:

X_{m u t a n t}^{t + 1} = X_{b e s t}^{t} + α \cdot L é v y (β) \otimes (X_{b e s t}^{t} - X_{w o r s t}^{t})

(23)

where

L é v y (β)

denotes the Lévy flight step and

\otimes

denotes the element-by-element product. This mechanism generates long jumps through a non-Gaussian distribution, which helps to break the local convergence state.

In order to dynamically regulate the global exploration and local utilization balance of the algorithm, this paper also introduces an adaptive control mechanism based on population density with the following formula:

D (t) = \frac{1}{N} \sum_{i = 1}^{N} ∥ X_{i}^{t} - X_{m e a n}^{t} ∥

(24)

τ (t) = τ_{0} \cdot \exp (- k \cdot D (t))

(25)

where

D (t)

denotes the population density and

τ (t)

is an adaptive control parameter. Through the dynamic perception of population diversity, the search intensity and search direction are adjusted so as to more finely regulate the balance between exploration and utilization of the algorithm.

The SSA optimization process proposed in this paper mainly consists of four core aspects: parameter space definition, discoverer update, follower update, and sentinel update, where the parameter space covers three key optimization variables, namely, the penalty factor C, the kernel function parameter γ, and the type of the kernel function, and the search ranges of these parameters are set to be C∈[2⁻⁵,2¹⁰] and γ∈[2⁻¹⁵,2⁻⁵].

As shown in Figure 3, in the discoverer update phase, the algorithm explores the optimal solution through a combination of global and local search strategies, where the global search ensures a broad coverage of the search space, while the local search focuses on the fine-grained exploration of the discovered regions of high-quality solutions. The follower update mechanism is mainly responsible for tracking the optimal position and avoiding the worst position, through which the overall distribution of the population is continuously optimized by this dual mechanism. Sentinel updating, on the other hand, ensures stability and effectiveness of the search process through safety checks and boundary control.

3.3. Experimental Dataset

The dataset used in this study is a new dataset constructed based on the laboratory underwater scallop dataset and the datasets provided by Dalian University of Technology for the UCCS and UIQS competitions. It includes scallops, sea urchins, fish, and stones, with 521 images of sea urchins, 532 images of fish, 208 images of stones, and 600 images of scallops, totaling 1861 images. As shown in Figure 4, an overview of part of the dataset is provided. As shown in Figure 5, the underwater scallop data in the dataset were collected in a pool measuring 2 m wide, 1.5 m long, and 1 m deep. The equipment used for capturing images was a GoPro Hero 5 Session action camera, which is waterproof and supports wireless control, commonly used for various underwater photography tasks. To replicate the camera positioning in actual marine operations, the camera was set at a height of 27 cm above the bottom of the pool, tilted at an angle of 30 degrees, and the horizontal distance between the scallops and the camera was maintained at 1 m during data collection.

The experimental dataset contains a total of 1861 color underwater target images, in which the training and test sets are randomly divided according to the ratio of 8:2, and a stratified sampling strategy is used to ensure that the divided dataset maintains the consistency of the category distribution.

In the image preprocessing stage, a series of standardization and enhancement processing steps are used, taking into account the special characteristics of underwater images. Firstly, all the images were uniformly resized to a standard size of 256 × 256 pixels, followed by histogram equalization to improve the image contrast, and finally median filtering was applied to suppress the scattering noise introduced by the underwater environment. To enhance the robustness of the model, data enhancement techniques were also introduced, including random rotation (±15°), translation (±10%), and mirror flipping.

3.4. Experimental Setup

In terms of algorithmic parameter configuration, a circular neighborhood template with a radius of 2 is used for LBP feature extraction, and the number of sampling points is set to 16 in order to achieve the best balance between texture feature expressiveness and computational efficiency. In the KPCA dimensionality reduction process, the number of principal components is determined by the cumulative contribution rate, and the feature dimension that can retain 95% of the information content is finally selected. In the sparrow search algorithm, the population size is set to 50, the maximum number of iterations is 1000, and the initial value of the adaptive weights is set to 0.9. The termination condition is reached if the optimal solution improvement is less than

10^{- 6}

for 50 consecutive generations or the maximum number of iterations is reached.

To ensure the statistical significance of the experimental results, this paper adopts a ten-fold cross-validation method, in which 30 independent repetitions of each set of experiments were conducted to assess the performance stability of the method by calculating the mean and standard deviation. Meanwhile, the differences in the performance of the different methods were statistically analyzed using the paired

t

-test (significance level

α

= 0.05) to verify whether the advantages of the proposed method were statistically significant.

3.5. Evaluation Indicators

In the construction of the evaluation index system, this paper measures the performance of the classification method through multi-dimensional indices such as accuracy, precision, recall, and F1-score.

(1): Accuracy: Accuracy is defined as the proportion of all correctly classified samples to the total number of samples:

A c c u r a c y = \frac{T P + F N}{T P + F P + T N + F N}

(26)

(2): Precision: Precision measures the proportion of samples predicted to be in the positive category that are actually positive:

P r e c i s i o n = \frac{T P}{T P + F P}

(27)

(3): Recall: Recall measures the proportion of all positive class samples that are correctly recognized:

R e c a l l = \frac{T P}{T P + F N}

(28)

(4): F1-score: The F1-Score is a reconciled average of precision and recall:

F_{1} = 2 * \frac{P * R}{P + R}

(29)

4. Results

4.1. Experimental Results and Analysis

4.1.1. LBP Parameter Experiment

In order to explore in depth the mechanism of action of each core component of the model and the extent of its influence on the final performance, the following experiments will illustrate the contribution of each module to the model’s effectiveness. A systematic study was first conducted on the key parameters of the LBP feature extraction module. As shown in Table 1, for a fixed number of samples P = 8, the model reaches optimal performance when the domain radius R takes the value of 2. Too large or too small a domain radius results in performance degradation, a phenomenon that can be explained by the fact that a smaller domain radius may not be able to capture enough texture information, whereas too large a domain radius introduces too much noise interference.

Similarly, as can be seen in Table 2, the model performs best with the number of sampling points P = 16 in the configuration of the optimal domain radius R = 2, and continuing to increase the number of sampling points leads to feature redundancy and affects computational efficiency.

A systematic analysis of the experiments reveals that the performance of the LBP feature extraction module is closely related to the selection of its key parameters, and, in particular, the model exhibits optimal feature extraction with the configuration of domain radius R = 2 and the number of sampling points P = 16. At the same time, experimental data show that either too large or too small a feature extraction range can lead to performance degradation, a phenomenon that can be attributed to the trade-off between the amount of information expressed by the features and the noise interference.

4.1.2. LBP-KPCA Feature Downscaling Experiment

After completing the preliminary LBP parameter experiments and determining the optimal parameter configurations, this paper conducts KPCA-based nonlinear dimensionality reduction experiments using the extracted LBP texture features as inputs, which are used to evaluate the impact of dimensionality reduction on the model performance. Since the LBP operator used is in uniform mode with parameters set to P = 16 and R = 2, the LBP histogram dimension of each image generated is P + 2 = 18. However, the overall feature dimensions can be extended to 128 dimensions or higher after image region segmentation and histogram splicing, so the KPCA dimensionality reduction experiments are conducted with 128 dimensions as the initial dimensionality starting point. The experimental results in terms of KPCA dimensionality reduction are shown in Table 3, where the increase in the dimensionality of the retained features increases, the performance of the model shows a trend of rapid increase and then leveling off. When the dimension is gradually increased from 2 to 8, all the indexes are significantly improved, and the optimal effect is reached at 8 dimensions, and the classification accuracy has broken through the threshold of 90%. This indicates that the 8-dimensional space basically contains the core information of the original features and possesses high differentiation ability. And when the KPCA dimensions continue to rise to 16, 32, and 64, the indicators show different degrees of decline. This indicates that more redundant or noisy information is included in the high-dimensional features, which leads to a decrease in classifier performance.

Under the adopted cumulative-variance criterion, the final KPCA dimensionality can be relatively low (e.g., 8 dimensions). This does not necessarily indicate insufficient dataset diversity; rather, it suggests that the upstream LBP histogram representation is highly structured and contains substantial redundancy, such that a few dominant nonlinear components can capture most of the informative characteristics. In our implementation, the number of KPCA components is selected to retain 95% of the cumulative explained information; hence, the resulting dimensionality is dependent on the data distribution and the chosen representation. For underwater imagery, factors such as illumination fluctuations, scattering-induced noise, and local contrast variations can introduce variance that is weakly discriminative. Removing these components via KPCA can therefore act as an implicit denoising and compactness constraint, which may improve robustness and reduce overfitting. The stable classification performance—including high accuracy/F1 and competitive results compared with deep learning baselines—indicates that, under the current evaluation setting, the low-dimensional embedding still preserves sufficient discriminative information for the considered task.

In the exploration of dimensionality reduction in KPCA, it is experimentally found that when the feature dimension reaches 8, the classification performance tends to stabilize, and the gain brought by further increasing the dimensionality shows a trend of marginal diminution, which provides an important theoretical basis for the selection of dimensionality in the design of the model.

From the perspective of feature representation, this paper provides an in-depth analysis of the impact of LBP feature extraction on sample separability through KPCA dimensionality reduction visualization. As shown in Figure 6, it is found that when only KPCA is used, the samples of different categories show significant overlap in the feature space, which reflects that the raw pixel features struggle to effectively portray the essential characteristics of underwater targets. After the introduction of LBP feature extraction, the samples show a more ideal clustering effect in the reduced dimensional space, which is manifested by a significant increase in the interclass distance and a more compact distribution of the samples within the class. This enhancement of the feature expression ability essentially explains the intrinsic mechanism of the improvement of the classification performance, and also verifies the superiority of the LBP features in the characterization of underwater images.

4.1.3. SSA Optimization Experiment Based on LBP-KPCA Features

After completing the LBP parameter experiments and determining the optimal configuration, this paper takes the extracted LBP texture features as the basis and performs nonlinear dimensionality reduction by KPCA to compress the redundant information and enhance the feature expression ability. Based on this, the research on the optimization strategy of the parameters of the Support Vector Machine is carried out. In order to enhance the performance of the classifier, this paper introduces the sparrow search algorithm (SSA) and conducts multi-strategy improvement and optimization experiments on it in order to enhance its global search capability and convergence effect.

In the in-depth study of optimization strategies, as shown in Table 4, under the framework of the basic sparrow search algorithm, the model achieves an improvement in optimization efficiency and final performance by incorporating a dynamic weight adjustment mechanism, a boundary contraction strategy, and an adaptive mutation operation. In particular, the dynamic weighting strategy improves the accuracy to 91% with a simultaneous increase in precision and recall. The dynamic weighting strategy improves classification accuracy for positive classes and reduces the missed detection rate by expanding the search span early and refining the pace later. This approach enables the algorithm to find more effective parameters. The boundary shrinkage strategy also improves performance with an accuracy of 91%, which is slightly lower than the dynamic weighting strategy, but shows that the boundary shrinkage mechanism also obtains better model parameters by enhancing the local optimization. However, the reason the boundary shrinkage strategy is slightly less effective than dynamic weighting is that premature shrinkage of the search space slightly limits the algorithm’s ability to find better solutions in distant domains.

The adaptive mutation strategy showed the best improvement with an accuracy of 92%, demonstrating an overall trend of outperforming the aforementioned single strategy. The mutation mechanism endows the algorithm with the ability to jump out of the local optimum, enabling it to explore more appropriate parameter combinations. The integrated optimization strategy finally adopted achieves a classification accuracy of 94% through the synergy of various improvement means, reflecting the complementary effect of the advantages of the synergy of various improvement strategies. Dynamic weighting ensures that the algorithm explores no dead ends, boundary contraction ensures convergence accuracy, adaptive mutation provides a guarantee to jump out of local traps, and the combination of the three enables the algorithm to find the parameter that optimizes the four classes of image classification equilibrium in a more efficient way.

To determine whether the improvements introduced into the SSA yield statistically significant advantages over the standard SSA, a rigorous evaluation should be conducted under paired and repeated-run experimental settings. Specifically, both the standard SSA and the improved SSA should be executed using the same data splits and the same number of independent runs. The results from each run can then be compared using paired significance tests, such as the paired t-test.

In this study, we report the means and standard deviations from repeated runs. The results show that the proposed model consistently achieves higher performance with limited variation across runs. The observed magnitude of improvement exceeds the reported variability, indicating that the performance gain is stable and not merely an occasional outcome due to random initialization. Therefore, under the current experimental protocol, the results provide sufficient empirical evidence that the improved SSA enhances both optimization stability and classification performance.

4.2. Ablation Experiments of Different SVM Models

As shown in Table 5, the benchmark model only uses the traditional combination of KPCA and SVM, and although it has some feature extraction and classification ability, its 55% classification accuracy rate indicates that its performance is more limited in complex underwater environments; the KPCA+SSA+SVM model after the introduction of the improved sparrow search algorithm significantly improves the classification performance through adaptive parameter optimization, with an accuracy rate of 67.3%. Further, when the LBP texture feature extraction mechanism is introduced, the classification accuracy of the model shows a leapfrog improvement to 92.1%. The final complete model is optimized by combining LBP feature extraction and sparrow search algorithm, and achieves optimal performance in all evaluation indices, such as accuracy, precision, recall, and F1-score, in which the classification accuracy is as high as 94.37%, which is nearly 40 percentage points higher compared to the benchmark model.

Through the in-depth analysis of different model experiments, the underwater image classification method based on LBP-KPCA with the sparrow search algorithm proposed in this paper shows significant performance advantages in multiple dimensions. The experimental results verify the synergistic effect of feature fusion and parameter optimization in improving the classification performance, in which the introduction of LBP feature extraction significantly improves the model accuracy from 55% to 92.13% in the baseline, while the further optimization of the sparrow search algorithm increases the accuracy to 94.37%.

To ensure the reproducibility and fairness of runtime evaluation, this study employs a repeated statistical approach for timing experiments. The specific settings are as follows: the number of running rounds is set to 10 to obtain stable average training and inference time statistics, and one warm-up round is included to avoid slower performance in the initial round due to initialization. Under this configuration, the final model, LBP+KPCA+SSA+SVM, achieves an average training time of 806.45 s and a total inference time on the test set of 0.0836 s, with an average inference time per image of 0.2241 ms/image. In comparison, the baseline KPCA+SVM model exhibits training and inference times of 4.098 s and 3.269 s (8.765 ms/image), respectively. After introducing LBP, the training time reduces to 0.647 s, and the inference time significantly decreases to 0.039 s (0.106 ms/image). This indicates that the texture histogram representation provided by LBP results in more compact and discriminative input features, thereby reducing the computational burden in the subsequent KPCA mapping and SVM classification stages. When SSA is further introduced for hyperparameter optimization, the inference time remains in the sub-millisecond range at 0.0836 s (0.2241 ms/image), demonstrating that SSA operates solely during the offline training phase and does not introduce additional inference complexity. However, the average training time increases significantly to 806.45 s. This is because SSA requires repeated training and evaluation of SVM across multiple generations and population assessments, leading to a training cost that roughly accumulates with population size multiplied by the number of iterations. Therefore, this timing experiment clearly illustrates that feature representation optimization contributes to inference acceleration, while swarm intelligence-based search increases training costs without affecting the time structure of deployment inference.

4.3. Comparative Experiment

To evaluate the performance of the proposed LBP-KPCA-SSA-SVM model, this experiment compares it with different deep learning models. For a fair comparison, the same dataset used in the proposed method is also adopted in the comparative experiments. The models used for comparison in the experiment include the deep residual network ResNet50, commonly used as a standard CNN baseline; the efficient convolutional neural network EfficientNet, often employed as a lightweight and efficient deep CNN baseline; and the Vision Transformer network. The experimental results are shown in Table 6.

From the comparative results in Table 6, it can be observed that under the same dataset and experimental settings, the proposed LBP-KPCA-SSA-SVM model achieves the best performance across all four evaluation metrics. EfficientNet attains an accuracy of 79.5%. Its overall lower performance primarily reflects that, while its compound scaling strategy is efficient, its relatively compact representational capacity may be insufficient to stably separate class boundaries in the presence of common underwater imaging challenges such as low contrast and scattering blur. Consequently, it shows limitations in both recall and precision. ResNet50 achieves an improved accuracy of 90.1%. Its residual structure facilitates learning deeper local texture and edge features while mitigating gradient degradation, thereby enhancing the model’s precision. However, its recall remains lower compared to the Transformer model, indicating that relying solely on the local receptive fields of convolutions may still be inadequate for covering some difficult samples when faced with cross-scale morphological variations or background interference. The Transformer model further reaches an accuracy of 91.2%, with recall and F1-score surpassing those of ResNet50. This suggests that its self-attention mechanism can more effectively model global dependencies, thereby improving recognition completeness for easily confusable classes and weakly salient targets. Nonetheless, its accuracy is still lower than that of the proposed method, implying that the benefits of global modeling in this task contribute more to reducing missed detections rather than false detections.

In contrast, the proposed LBP+KPCA+SSA+SVM model achieves optimal results across all four metrics, with scores of 0.95, 0.92, 0.93, and 0.943, respectively. It outperforms the Transformer model by 3% in precision and 3.1% in accuracy while maintaining a high recall, demonstrating superior comprehensive performance. This can be explained by the synergistic mechanism of representation, compression, classification, and optimization. Firstly, LBP encodes the local texture structure of underwater targets into statistical histogram features, which are robust to illumination variations and local contrast fluctuations. This helps suppress misjudgments caused by background scattering noise. Secondly, KPCA compresses key discriminative components in a nonlinear manner, preserving essential information while reducing redundant and noisy dimensions, thereby improving inter-class separability and enhancing generalization. Thirdly, the maximum-margin discrimination of SVM provides greater stability for small-sample and high-noise scenarios. Furthermore, the improved SSA’s global search for parameters C and γ effectively avoids underfitting issues arising from empirical parameter settings, enabling the decision boundary to achieve a balance between high precision and high recall.

Thus, although the proposed method belongs to the machine learning paradigm, it combines interpretable texture features with nonlinear feature learning and incorporates intelligent optimization for adaptive parameter tuning, ultimately achieving classification performance comparable to or better than mainstream deep learning models. This validates the effectiveness of the proposed model for underwater complex imaging conditions and scenarios with limited sample sizes.

4.4. Limitations and Future Work

(1): A limitation of this study is that the experimental validation is conducted on the current class set, and the scalability of the proposed framework under a larger or more fine-grained category taxonomy has not yet been systematically investigated. As the number of classes increases and inter-class similarity becomes stronger, the classification problem typically requires more discriminative representations and more stable hyperparameter configurations. In future work, we will expand the dataset by incorporating additional target categories and finer-grained labels, and perform comprehensive scalability analyses, including performance evolution with increasing class cardinality, robustness under fine-grained ambiguities, and computational cost scaling.
(2): Another limitation of this study is that the robustness of the proposed framework in unseen underwater environments has not yet been systematically evaluated. Underwater images may undergo substantial domain shifts due to variations in illumination, turbidity, color attenuation, and background clutter, which can adversely affect the model’s generalization in practical applications. In future work, we will conduct comprehensive cross-environment evaluations under domain-shift experimental protocols by testing on one or more standard public benchmark datasets as well as on independently collected datasets acquired under distinct underwater conditions. We will also explore robustness enhancement techniques, such as environment-invariant feature modeling and domain adaptation strategies, to improve the reliability of the proposed model in real-world underwater scenarios and to strengthen the reproducibility and comparability of our results.
(3): Another limitation of this study is that, although we report consistent performance improvements on the considered dataset and provide comparisons with several deep learning baselines, the achieved performance has not yet been systematically compared with recent state-of-the-art (SOTA) results for underwater image classification under fully matched benchmark protocols. In future work, we will prioritize evaluating our framework on one or more standard public underwater image benchmarks and conducting protocol-consistent comparisons with the latest SOTA methods, including modern deep CNN and hybrid architectures. This will enable a more rigorous assessment of the competitiveness, generalization capability, and statistical relevance of the observed accuracy improvements beyond the current experimental setting.

5. Conclusions

In this paper, an underwater image classification method based on the optimization of LBP-KPCA feature fusion and sparrow search algorithm is proposed, and through systematic theoretical analysis and experimental validation, it achieves significant performance improvement in the underwater target classification task. It is shown that the LBP-KPCA feature fusion framework plays a key role in enhancing the texture features of underwater images, in which the local binary pattern operator significantly improves the discriminative ability of the features by effectively capturing the local texture structure of the target, while the kernel principal component analysis further strengthens the robustness of the feature expression through the nonlinear mapping. Such a multilevel feature extraction strategy allows the model to maintain stable classification performance even in complex underwater environments.

In terms of parameter optimization, the sparrow search algorithm shows excellent global search capability and fast convergence characteristics, and realizes the adaptive optimization of support vector machine parameters by simulating the foraging and vigilance behaviors of sparrow groups, in which the introduced dynamic weight adjustment and boundary contraction mechanism further enhance the search efficiency of the algorithm, so that the model can obtain the optimal parameter configurations within a smaller number of iterations. The experimental results show that compared with the traditional optimization method, the improved sparrow search algorithm not only converges faster, but also can effectively avoid falling into the local optimum, which provides an efficient and reliable solution for solving the parameter optimization problem in underwater image classification.

The main contributions of this study to the field of underwater image classification are reflected in the following aspects. Firstly, a novel feature fusion framework is proposed, which effectively solves the key problems in underwater image feature extraction through the organic combination of LBP and KPCA. Secondly, an adaptive parameter optimization strategy based on the sparrow search algorithm is designed, which significantly improves the performance of the model. Lastly, a complete experimental evaluation system is constructed, which provides reliable experimental benchmarks and optimization directions for the subsequent research. Experimental validation shows that the proposed method improves the classification accuracy from 55% in the benchmark to 94.37%, which is a significant improvement in all evaluation metrics.

From the practical application point of view, the method proposed in this paper has a broad application prospect, which can effectively support important tasks such as autonomous navigation of underwater robots, marine resource exploration, ecological environment monitoring, etc. Its excellent classification performance and low computational complexity make it particularly suitable for deployment in practical underwater operation systems.

Author Contributions

H.L.: conceptualization, data curation, formal analysis, methodology, writing—original draft preparation; S.L.: conceptualization, methodology, resources, writing—review and editing, supervision, funding acquisition; X.C.: data curation, formal analysis, software; Q.Z.: methodology, validation, software; Z.M.: data curation, validation. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Major Research Plan of the National Natural Science Foundation of China (51778104) and the Department of Education of Liaoning Province (DL202005).

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author due to privacy concerns.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Lowe, D.G. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 2004, 60, 91–110. [Google Scholar] [CrossRef]
Dalal, N.; Triggs, B. Histograms of oriented gradients for human detection. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–25 June 2005; IEEE: New York, NY, USA, 2005; Volume 1, pp. 886–893. [Google Scholar] [CrossRef]
Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297. [Google Scholar] [CrossRef]
Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
Akhtarshenas, A.; Toosi, R. An open-set framework for underwater image classification using autoencoders. SN Appl. Sci. 2022, 4, 229. [Google Scholar] [CrossRef]
Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105. [Google Scholar] [CrossRef]
Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; IEEE: New York, NY, USA, 2017; pp. 2261–2269. [Google Scholar] [CrossRef]
Li, J.; Skinner, K.A.; Eustice, R.M.; Johnson-Roberson, M. WaterGAN: Unsupervised generative network to enable real-time color correction of monocular underwater images. IEEE Robot. Autom. Lett. 2018, 3, 387–394. [Google Scholar] [CrossRef]
Fu, S.; Xu, F.; Liu, J. Detection of Small Underwater Targets by Combining Region Extraction with Improved Convolutional Neural Networks. J. Appl. Acoust. 2023, 42, 1280–1288. [Google Scholar]
Wang, R.; Jiang, Z. Underwater Target Detection Algorithm Based on Improved CenterNet. Laser Optoelectron. Prog. 2023, 60, 0215001. [Google Scholar]
Cai, K.; Miao, X.; Wang, W.; Pang, H.; Liu, Y.; Song, J. A Modified YOLOv3 Model for Fish Detection Based on MobileNetv1 as Backbone. Aquac. Eng. 2020, 91, 102117. [Google Scholar] [CrossRef]
Wang, X.; Sun, M.; Liu, X. A Survey on Track-Before-Detect Algorithms for Underwater Target Detection. J. Ordnance Equip. Eng. 2024, 45, 99–108. [Google Scholar]
Yuan, H.; Zhang, S. Underwater Fish Target Detection Method Based on Faster R-CNN and Image Enhancement. J. Dalian Ocean. Univ. 2020, 35, 612–619. [Google Scholar]
Chen, X.; Yang, Q.; Yao, H. Underwater Target Detection Algorithm Based on Fine-Grained Feature Perception. J. Shaanxi Univ. Sci. Technol. 2024, 42, 177–183. [Google Scholar]
Zhang, T.; Zhu, T.; Han, M.; Chen, F.; Li, J.; Zhou, W.; Yu, P.S. Fairness in graph-based semi-supervised learning. Knowl. Inf. Syst. 2023, 65, 543–570. [Google Scholar] [CrossRef]
Raj, M.V.; Murugan, S.S. Underwater image classification using machine learning technique. In Proceedings of the 2019 International Symposium on Ocean Technology (SYMPOL), Ernakulam, India, 11–13 December 2019; IEEE: New York, NY, USA, 2019; pp. 166–173. [Google Scholar] [CrossRef]
Muthusamy, M.; Sujatha, M.; Kongara, G.; Penumudi, H.G.; Palle, M.R. Under water image detection using hybrid K-tree algorithm. AIP Conf. Proc. 2022, 2452, 090007. [Google Scholar] [CrossRef]
Demir, K.; Yaman, O. Projector deep feature extraction-based garbage image classification model using underwater images. Multimed. Tools Appl. 2024, 83, 79437–79451. [Google Scholar] [CrossRef]
Game, C.A.; Thompson, M.B.; Finlayson, G.D. Machine learning for non-experts: A more accessible and simpler approach to automatic benthic habitat classification. Ecol. Inform. 2024, 81, 102619. [Google Scholar] [CrossRef]
Mittal, S.; Srivastava, S.; Jayanth, J.P. A survey of deep learning techniques for underwater image classification. IEEE Trans. Neural Netw. Learn. Syst. 2023, 34, 6968–6982. [Google Scholar] [CrossRef] [PubMed]
Mika, S.; Ratsch, G.; Weston, J.; Scholkopf, B.; Mullers, K.R. Fisher discriminant analysis with kernels. In Proceedings of the Neural Networks for Signal Processing IX: Proceedings of the 1999 IEEE Signal Processing Society Workshop, Madison, WI, USA, 25 August 1999; IEEE: New York, NY, USA, 1999; pp. 41–48. [Google Scholar] [CrossRef]
Roweis, S.; Saul, L. Nonlinear dimensionality reduction by locally linear embedding. Science 2000, 290, 2323–2326. [Google Scholar] [CrossRef] [PubMed]
Lever, J.; Krzywinski, M.; Altman, N. Principal component analysis. Nat. Methods 2017, 14, 641–642. [Google Scholar] [CrossRef]
Başar, R.; Ocak, Ö.; Erturk, A.; de la Roche, M. Leveraging Machine Learning Techniques to Predict Cardiovascular Heart Disease. Information 2025, 16, 639. [Google Scholar] [CrossRef]
Bi, P.; Xu, J.; Du, X.; Li, J.; Chen, G. l2,p-norm sequential bilateral 2DPCA: A novel robust technology for underwater image classification and representation. Neural Comput. Appl. 2020, 32, 17027–17041. [Google Scholar] [CrossRef]
Usman, A.M.; Versfeld, D.J.J. Principal components-based hidden Markov model for automatic detection of whale vocalisations. J. Mar. Syst. 2024, 242, 103941. [Google Scholar] [CrossRef]
Jiang, J.; Ma, J.; Chen, C.; Wang, Z.; Cai, Z.; Wang, L. SuperPCA: A superpixelwise PCA approach for unsupervised feature extraction of hyperspectral imagery. IEEE Trans. Geosci. Remote Sens. 2018, 56, 4581–4593. [Google Scholar] [CrossRef]
Kent, M.; Parkinson, T.; Kim, J.; Schiavon, S. A data-driven analysis of occupant workspace dissatisfaction. Build. Environ. 2021, 205, 108270. [Google Scholar] [CrossRef]
Padmavathi, G.; Muthukumar, M.; Thakur, S.K. Kernel principal component analysis feature detection for underwater images. Adv. Mater. Res. 2010, 129–131, 953–958. [Google Scholar]
Ojala, T.; Pietikäinen, M.; Harwood, D. A comparative study of texture measures with classification based on featured distributions. Pattern Recognit. 1996, 29, 51–59. [Google Scholar] [CrossRef]
Xue, J.; Shen, B. A novel swarm intelligence optimization approach: Sparrow search algorithm. Syst. Sci. Control. Eng. 2020, 8, 22–34. [Google Scholar] [CrossRef]
Vapnik, V.N. Statistical Learning Theory; Wiley: New York, NY, USA, 1998. [Google Scholar]
Ojala, T.; Pietikäinen, M.; Mäenpää, T. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 971–987. [Google Scholar] [CrossRef]

Figure 1. Flowchart of the underwater image classification algorithm.

Figure 2. LBP-KPCA feature extraction pipeline for underwater image classification.

Figure 3. Sparrow search algorithm optimization process.

Figure 4. Example underwater images used for testing the optimization process.

Figure 5. Scallop dataset collection diagram.

Figure 6. Comparison of KPCA downscaling visualization.

Table 1. Effect of domain radius R on model performance.

R	Precision	Recall	F1-Score	Accuracy
1	0.85	0.81	0.83	0.84
2	0.86	0.84	0.85	0.86
3	0.85	0.82	0.84	0.85

Table 2. Effect of the number of sampling points P on model performance.

P	Precision	Recall	F1-Score	Accuracy
8	0.86	0.84	0.85	0.86
16	0.89	0.87	0.88	0.88
24	0.88	0.86	0.87	0.87

Table 3. Effect of KPCA dimensions on model performance.

Dimension	Precision	Recall	F1-Score	Accuracy
128	0.89	0.86	0.87	0.88
64	0.90	0.87	0.87	0.89
32	0.90	0.87	0.88	0.89
16	0.90	0.88	0.89	0.90
8	0.92	0.90	0.90	0.92
4	0.86	0.82	0.84	0.85
2	0.76	0.73	0.74	0.76

Table 4. Contribution of SSA optimization strategies based on LBP-KPCA features.

Strategy	Precision	Recall	F1-Score	Accuracy
Basic SSA	0.90	0.87	0.88	0.90
Dynamic weighting	0.92	0.88	0.89	0.91
Boundary contraction	0.91	0.88	0.89	0.91
Adaptive variation	0.93	0.90	0.91	0.92
Integrated optimization	0.95	0.92	0.93	0.94

Table 5. Performance comparison of different model configurations.

Model	Precision	Recall	F1-Score	Accuracy
KPCA+SVM	0.55	0.52	0.53	0.557
KPCA+SSA+SVM	0.68	0.64	0.66	0.673
LBP+KPCA+SVM	0.92	0.90	0.90	0.921
LBP+KPCA+SSA+SVM	0.95	0.92	0.93	0.943

Table 6. Comparative results of classification models.

Model	Precision	Recall	F1-Score	Accuracy
EfficientNet	0.76	0.73	0.74	0.795
RESNet50	0.92	0.84	0.86	0.901
Transformer	0.92	0.90	0.91	0.912
LBP+KPCA+SSA+SVM	0.95	0.92	0.93	0.943

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Li, H.; Li, S.; Zhou, Q.; Ma, Z.; Chen, X. Underwater Image Classification Based on LBP-KPCA Combined with SSA-SVM Approach. Information 2026, 17, 229. https://doi.org/10.3390/info17030229

AMA Style

Li H, Li S, Zhou Q, Ma Z, Chen X. Underwater Image Classification Based on LBP-KPCA Combined with SSA-SVM Approach. Information. 2026; 17(3):229. https://doi.org/10.3390/info17030229

Chicago/Turabian Style

Li, Han, Songsong Li, Qiaozhen Zhou, Zhongsong Ma, and Xiaoming Chen. 2026. "Underwater Image Classification Based on LBP-KPCA Combined with SSA-SVM Approach" Information 17, no. 3: 229. https://doi.org/10.3390/info17030229

APA Style

Li, H., Li, S., Zhou, Q., Ma, Z., & Chen, X. (2026). Underwater Image Classification Based on LBP-KPCA Combined with SSA-SVM Approach. Information, 17(3), 229. https://doi.org/10.3390/info17030229

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Underwater Image Classification Based on LBP-KPCA Combined with SSA-SVM Approach

Abstract

1. Introduction

2. Fundamental Principles

2.1. Kernel Principal Component Analysis

2.2. Support Vector Machine for Classification

2.3. Sparrow Search Algorithm

3. Materials and Methods

3.1. Feature Extraction Method by Fusing LBP-KPCA

3.2. SVM Parameter Optimization Method Based on Improved SSA

3.3. Experimental Dataset

3.4. Experimental Setup

3.5. Evaluation Indicators

4. Results

4.1. Experimental Results and Analysis

4.1.1. LBP Parameter Experiment

4.1.2. LBP-KPCA Feature Downscaling Experiment

4.1.3. SSA Optimization Experiment Based on LBP-KPCA Features

4.2. Ablation Experiments of Different SVM Models

4.3. Comparative Experiment

4.4. Limitations and Future Work

5. Conclusions

Author Contributions

Funding

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI