Article

An Improved ORB-KNN-Ratio Test Algorithm for Robust Underwater Image Stitching on Low-Cost Robotic Platforms

School of Information and Communication Engineering, Dalian University of Technology, Dalian 116024, China
* Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2026, 14(2), 218; https://doi.org/10.3390/jmse14020218
Submission received: 16 December 2025 / Revised: 9 January 2026 / Accepted: 15 January 2026 / Published: 21 January 2026
(This article belongs to the Section Ocean Engineering)

Abstract

Underwater optical images often exhibit severe color distortion, weak texture, and uneven illumination due to light absorption and scattering in water. These issues result in unstable feature detection and inaccurate image registration. To address these challenges, this paper proposes an underwater image stitching method that integrates ORB (Oriented FAST and Rotated BRIEF) feature extraction with a fixed-ratio constraint matching strategy. First, lightweight color and contrast enhancement techniques are employed to restore color balance and improve local texture visibility. Then, ORB descriptors are extracted and matched via a KNN (K-Nearest Neighbors) nearest-neighbor search, and Lowe’s ratio test is applied to eliminate false matches caused by weak texture similarity. Finally, the geometric transformation between image frames is estimated by incorporating robust optimization, ensuring stable homography computation. Experimental results on real underwater datasets show that the proposed method significantly improves stitching continuity and structural consistency, achieving 40–120% improvements in SSIM (Structural Similarity Index) and PSNR (peak signal-to-noise ratio) over conventional Harris–ORB + KNN, SIFT (scale-invariant feature transform) + BF (brute force), SIFT + KNN, and AKAZE (accelerated KAZE) + BF methods while maintaining processing times within one second. These results indicate that the proposed method is well-suited for real-time underwater environment perception and panoramic mapping on low-cost, micro-sized underwater robotic platforms.

1. Introduction

Underwater imaging is a fundamental technology in marine exploration and resource development [1]. Its primary value lies in acquiring high-resolution and visually interpretable underwater information, which provides an essential foundation for tasks such as underwater target recognition [2], environmental perception, ecological monitoring, and resource investigation. However, the quality of underwater optical images is inherently constrained by the physical characteristics of light propagation in water. Wavelength-dependent absorption causes rapid attenuation of red light, while forward and backward scattering blur image details and reduce contrast, as illustrated in Figure 1 [3]. With the advancement of marine sensing and imaging technologies, underwater image acquisition techniques have undergone substantial improvements [4]. In particular, underwater robotic platforms, especially low-cost micro-sized underwater vehicles [4], have become crucial carriers for image acquisition and environmental interaction. These platforms help mitigate challenges traditionally associated with underwater imaging, including operational complexity, high risk, and elevated deployment costs [5]. They have been successfully deployed in diverse applications such as offshore oil and gas exploration, marine scientific research, underwater archaeology, and aquaculture monitoring [1,6].
Integrating underwater robotic platforms with image stitching algorithms enables the real-time fusion of narrow field-of-view image sequences into large-scale panoramic representations, without the need for additional high-cost optical equipment [7,8]. This approach substantially increases the amount of visual and spatial information available compared to individual image frames [9]. Consequently, this technology is becoming a key enabler for underwater surveying, deep-sea resource exploration, and marine ecological conservation, while also serving as an essential perceptual basis for autonomous decision-making and path planning in underwater robotic systems [8,10].
Currently, feature point-based image stitching is the most widely adopted approach, which primarily consists of feature extraction and feature matching. As the front-end stage of image stitching, feature extraction critically influences subsequent matching accuracy, robustness, and computational efficiency by transforming raw pixel data into geometrically meaningful and repeatable structures [11]. Typically, corner detection is used to identify stable feature points, and feature descriptors are then employed to encode their local appearance into compact vector representations [12]. Feature matching is subsequently performed to establish correspondences between features originating from the same physical 3D point across different images, generally using descriptor distance metrics followed by geometric consistency verification to eliminate unreliable matches [13]. The integration of feature extraction and matching constitutes the core computational basis of visual perception systems. It represents the fundamental and critical first stage of image stitching, and serves as a key technical foundation for downstream tasks such as SLAM (Simultaneous Localization and Mapping), 3D reconstruction, medical image fusion, and industrial inspection [14].
Despite the extensive development of image stitching algorithms based on handcrafted features, these methods continue to face substantial challenges in practical underwater applications. For example, Liao et al. [15] proposed an image stitching method based on image segmentation and multi-anisotropic transformation, which achieves visually satisfactory results but suffers from high computational complexity and substantial resource consumption, limiting its practical applicability. Altuntas et al. [16] proposed a feature point-based dense image matching algorithm for terrestrial applications, demonstrating strong performance in ground-based environments. However, its performance degrades significantly under harsh visual conditions, where the number of reliable feature correspondences is drastically reduced, making the algorithm unsuitable for direct application in underwater settings. Similarly, Zhang et al. [17] introduced a high-precision panoramic stitching framework using robust feature matching to generate wide field-of-view hyperspectral mosaics. Although this method performs effectively on hyperspectral datasets, its adaptability to low-light, low-texture underwater environments remains limited.
To address these challenges, this study first conducts a document analysis to evaluate the research significance and development trends of underwater image stitching. Furthermore, to expand the perceptual field of view of underwater robots and enhance their environmental awareness, the study focuses on the practical scenario of acquiring and stitching optical images using low-cost micro-sized underwater robots. Inspired by the classical underwater image stitching framework proposed by Gracias et al. [18], an adapted stitching pipeline suitable for underwater robotic platforms is implemented. Building upon this foundation, an ORB-KNN–Ratio Test-based image stitching method is introduced. The proposed method serves as a lightweight solution tailored to the characteristics of low-texture underwater environments. It enhances the detectability of local texture features through Gray-World white balance and CLAHE (contrast-limited adaptive histogram equalization)-based contrast enhancement. When combined with ORB feature descriptors and a fixed-ratio correspondence filtering strategy, the method effectively suppresses homography estimation errors caused by mismatches in weakly textured regions, thereby improving the stability and accuracy of the stitching process. Although KNN-based matching strategies combined with ratio constraints have been explored in research areas such as target recognition [19], their effectiveness in homography estimation for underwater image stitching under weak-texture and color-degraded conditions has not been sufficiently validated. Moreover, correspondences that may be tolerable in recognition tasks can lead to catastrophic distortions in homography estimation during image stitching, where strict geometric consistency is required. Existing ORB-based studies have largely overlooked this issue. In this study, the proposed matching strategy is specifically designed to suppress geometrically inconsistent correspondences, thereby enabling robust homography estimation in complex underwater environments and preventing severe distortions of global alignment that may otherwise lead to stitching failure. Moreover, under constrained computational resources, the method achieves real-time performance while maintaining high stitching accuracy and reliability.
The main contributions of this paper are as follows:
(1)
This paper presents a complete, lightweight, and robust underwater image stitching framework, specifically designed for underwater robotic platforms with stringent real-time constraints and limited computational resources.
(2)
An enhanced ORB-based feature matching strategy is proposed. A lightweight color and contrast enhancement scheme is first applied to improve feature detectability, followed by KNN-based matching with a ratio-test constraint to suppress false correspondences. Compared with the conventional ORB approach, the proposed strategy significantly increases the number of reliable feature points while improving robustness and matching accuracy.
(3)
A practical and reproducible underwater evaluation protocol is established and validated using real-world data collected from an underwater robotic platform. PSNR and SSIM are computed exclusively within overlapping regions, and a detailed runtime analysis is provided, demonstrating the effectiveness, real-time performance, and applicability of the proposed method in real underwater environments.

2. Related Works

Underwater images serve as an essential medium for underwater information transmission and constitute a fundamental basis for environmental perception by underwater robotic systems [20]. Due to the unique properties of the aquatic medium, light undergoes significant absorption and scattering during underwater propagation, resulting in color distortion, image blurring, geometric ghosting, and sparse or even missing texture details. These degradations lead to a reduced number of extractable features, weakened feature discriminability, and an increased mismatch rate, ultimately causing poor geometric consistency and insufficient robustness in stitched images [21].
At present, feature point-based stitching methods remain the dominant paradigm. This pipeline typically consists of feature extraction, feature matching, and homography estimation. Commonly used feature extraction methods include SIFT, SURF, ORB, and AKAZE. Among them, ORB offers a favorable balance between fast detection and matching speed and sufficient feature extraction capability under challenging conditions, while maintaining robustness. These characteristics make ORB particularly suitable for resource-constrained embedded platforms [22].
KNN-based matching combined with ratio constraints has been widely applied in target recognition and image retrieval tasks. For example, Xie et al. [19] integrated ORB feature extraction with ratio-constrained KNN matching for target recognition under degraded illumination conditions. However, these studies differ fundamentally from the present work in terms of objectives and evaluation criteria. Although the method of Xie et al. [19] effectively suppresses ambiguous descriptor-level matches and enhances descriptor discriminability, its effectiveness for homography estimation in underwater image stitching under weak-texture and color-degraded conditions has not been sufficiently validated. Moreover, target recognition and image retrieval tasks can tolerate a certain level of mismatches and generally do not require strict geometric or global consistency. In contrast, image stitching is highly sensitive to geometric inconsistency, where even a small number of incorrect correspondences may severely distort homography estimation and lead to stitching failure. Therefore, the evaluation of image stitching focuses on geometric consistency rather than recognition accuracy, a requirement that has not been addressed in existing ORB-based recognition studies.
To address these challenges, this paper proposes a lightweight and robust algorithm specifically designed for underwater image stitching. The proposed method integrates ORB features with fixed-ratio KNN matching and tightly couples them with RANSAC-based homography estimation, aiming to reduce mismatches, suppress geometrically inconsistent correspondences, and achieve stable, real-time, and lightweight image stitching in complex underwater environments.

Problem Statement

The objective of this study is to design a lightweight and robust image stitching algorithm capable of achieving visually coherent results in underwater environments. In addition, a practical and reproducible evaluation protocol is established to quantitatively assess the stitching performance.
To achieve this objective while ensuring geometric consistency and stability throughout the stitching process, several challenges must be addressed. Due to the unique characteristics of underwater environments, input images often suffer from low contrast, uneven illumination, weak texture details, and severe color distortion. Furthermore, the limited computational resources of underwater robotic platforms, together with stringent real-time requirements, impose additional constraints on image stitching algorithms. Under such conditions, conventional approaches are prone to insufficient feature extraction, inaccurate feature matching, high mismatching rates, unreliable homography estimation, and unstable feature point detection.
Therefore, the central problem addressed in this work is how to achieve real-time, stable, and robust image stitching under degraded underwater imaging conditions and limited computational resources, while ensuring that the stitching results exhibit good geometric consistency and satisfactory visual quality for human perception.

3. Algorithm Introduction

3.1. Incremental Image Splicing Framework

Due to the inherent properties of the underwater environment, optical imagery captured underwater is frequently affected by light attenuation, turbidity, and insufficient texture information. Moreover, underwater robotic platforms often experience unreliable and imprecise localization, making it difficult to infer spatial positioning and environmental structure from single-frame images alone. This study draws inspiration from the underwater mosaic construction framework introduced by Gracias and Santos-Victor [18]. Their work is regarded as a foundational benchmark in this field, as it introduced a mosaic construction workflow suitable for underwater imaging conditions characterized by low texture, uneven illumination, and severe light scattering.

3.2. ORB Algorithm Principle

For feature extraction and matching, the ORB (Oriented FAST and Rotated BRIEF) algorithm, proposed by Rublee et al. [23] in 2011, is employed. ORB integrates FAST corner detection with BRIEF descriptors while incorporating orientation correction and multi-scale features, enhancing its adaptability to rotation and scale variations.
Its key advantages—fast computation and high robustness—make ORB a fundamental tool for underwater robotic platforms. The algorithm comprises two main components: o-FAST corner detection and r-BRIEF feature description, which will be detailed in the following sections.

3.2.1. o-FAST Corner Detection

During feature point detection, ORB employs the FAST algorithm for corner extraction [20]. The core principle is to construct a circular neighborhood centered on the target pixel p. If n consecutive pixels within this neighborhood satisfy the brightness condition in Equation (1), the center pixel p is classified as a corner candidate:
$$I(x) - I(p) > t \quad \text{or} \quad I(x) - I(p) < -t \tag{1}$$
where I(x) is the grayscale value of a pixel in the neighborhood, I(p) is the grayscale value of the center pixel, t is the brightness difference threshold, x is a pixel within the neighborhood, and p is the pixel to be detected.
By iteratively applying this calculation across the image, all corner points within the region can be identified [24].
To improve the stability and selection accuracy of FAST corners, the Harris corner response function is employed to rank and filter the detected points. The Harris response R is defined as follows (Equation (2)):
$$R = \det(M) - k \cdot \mathrm{trace}(M)^{2} \tag{2}$$
where R is the Harris response score, det(M) is the determinant of matrix M, trace(M) is the trace of matrix M, k is an empirical constant, and M is the second-order (2 × 2) autocorrelation matrix defined by Equation (3):
$$M = \begin{bmatrix} I_x^{2} & I_x I_y \\ I_x I_y & I_y^{2} \end{bmatrix} \tag{3}$$
where Ix is the gradient component of the image in the x direction, and Iy is the gradient component of the image in the y direction.
The Harris response values are used to rank the feature points extracted by the FAST algorithm. The top N corners with the highest responses are selected as the final feature points. This procedure mitigates the influence of edge points and enhances the accuracy of subsequent feature description and matching [24].
To achieve rotation invariance, the principal orientation of each feature point is estimated using the gray-level centroid method. The core idea is to compute the centroid of the intensity values within the feature point’s neighborhood and define the direction of the line connecting this centroid to the feature point as its principal orientation. The orientation angle θ is calculated as follows (Equation (4)):
$$\theta = \arctan\!\left( \frac{\sum_{(x, y) \in S} y \, I(x, y)}{\sum_{(x, y) \in S} x \, I(x, y)} \right) \tag{4}$$
where S denotes the feature point’s neighborhood, θ represents the principal direction angle of the feature point, and I(x, y) is the grayscale value of pixel (x, y).
This method ensures that the feature points extracted by the ORB algorithm possess rotation-invariant orientations, thereby improving the robustness of feature matching.
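For illustration, the following minimal sketch shows how the o-FAST stage described above can be reproduced with OpenCV's ORB implementation; the image path, feature budget, and FAST threshold are assumptions for the example, not values used in this study.

```python
import cv2

# Hypothetical input frame; any grayscale underwater image works here.
img = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)

# ORB with scoreType=ORB_HARRIS_SCORE ranks FAST corners by the Harris
# response of Equation (2) and keeps only the strongest nfeatures points.
orb = cv2.ORB_create(nfeatures=500, scoreType=cv2.ORB_HARRIS_SCORE, fastThreshold=20)
keypoints = orb.detect(img, None)

for kp in keypoints[:5]:
    # kp.angle stores the gray-centroid principal orientation of Equation (4).
    print(f"pt={kp.pt}, response={kp.response:.4f}, angle={kp.angle:.1f} deg")
```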

3.2.2. r-BRIEF Feature Description

During descriptor construction, the ORB algorithm employs the r-BRIEF (Rotated BRIEF) method to encode features. While the standard BRIEF computes descriptors from fixed image patches without considering orientation, r-BRIEF steers the sampling pattern according to the orientation of each keypoint, enhancing robustness against rotation and scale variations. Each keypoint is represented by a 256-bit binary descriptor, enabling compact and efficient encoding of local image features for subsequent fast matching.
The core principle of r-BRIEF involves comparing the grayscale values of pixel pairs within the neighborhood of a feature point, generating a binary string according to Equation (5):
$$\tau(p; x, y) = \begin{cases} 1, & I(p + x) < I(p + y) \\ 0, & \text{otherwise} \end{cases} \tag{5}$$
where p represents the center coordinates of the feature point, x and y denote the offsets relative to p, I(p + x) denotes the pixel brightness obtained by translating the center p by x, I(p + y) denotes the pixel brightness obtained by translating the center p by y, and τ(p; x, y) represents the binary comparison result.
Feature similarity between keypoints is then evaluated by computing the Hamming distance between their binary descriptors, enabling fast and efficient feature matching.
r-BRIEF incorporates rotation matrices into the standard BRIEF algorithm. For an n-bit test pattern, the pixel-pair locations are arranged in a 2 × n matrix S. Using the previously computed orientation angle θ, the corresponding rotation matrix Rθ is obtained, which transforms the test point set into an orientation-specific set Sθ = RθS. This transformation ensures that the r-BRIEF descriptors are rotation-invariant.
In summary, the ORB algorithm comprises two key components: the combination of FAST corner detection with the Harris corner response function, and the integration of the BRIEF algorithm with a rotation matrix. The former is responsible for extracting corner features, while the latter performs binary encoding of the extracted features and enhances their robustness against rotation.
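As a brief illustration of the r-BRIEF stage, the sketch below computes 256-bit ORB descriptors with OpenCV and measures the Hamming distance between two of them; the input image path and feature count are placeholder assumptions.

```python
import cv2
import numpy as np

img = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input frame
orb = cv2.ORB_create(nfeatures=500)
keypoints, descriptors = orb.detectAndCompute(img, None)  # descriptors: N x 32 uint8

# Hamming distance = number of differing bits between two 256-bit descriptors.
d0, d1 = descriptors[0], descriptors[1]
hamming = int(np.unpackbits(np.bitwise_xor(d0, d1)).sum())
print(f"Hamming distance between descriptors 0 and 1: {hamming} (max 256)")
```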

3.3. KNN Algorithm Principle

The K-Nearest Neighbors (KNN) algorithm is a fundamental and widely used method for recognition tasks [25]. Its core principle is that the value or label of a sample is determined by the values of its K most similar neighbors. When combined with ORB, KNN performs secondary filtering on the initial matches generated by ORB, eliminating mismatches and thereby enhancing overall matching accuracy and robustness [23].
The basic operation of the KNN algorithm is as follows (Figure 2): For a new input sample—specifically, a feature descriptor extracted by the ORB algorithm—a suitable distance metric, such as the Hamming distance, is used to measure similarity against all reference samples. The algorithm then identifies the K samples with the smallest distances to the input sample, and these K distances are stored for subsequent analysis or filtering.
The blue circle represents a new sample awaiting classification. Assuming K = 2, the KNN algorithm computes the distances between this sample and all other samples in the feature space. These distances are then sorted, and the two nearest neighbors—indicated by the samples within the green solid-line circle in the figure—are identified. The class of the new sample is subsequently determined based on the labels of these two nearest neighbors. As illustrated, the choice of K and the selection of an appropriate distance metric are critical for the algorithm’s performance [23,25].
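The following sketch illustrates the K-nearest-neighbor search of Figure 2 on binary descriptors, using the Hamming distance and K = 2; the random reference descriptors are synthetic placeholders used only to demonstrate the principle.

```python
import numpy as np

def knn_hamming(query, references, k=2):
    """Return indices and Hamming distances of the k nearest reference descriptors."""
    # XOR the query against every reference, then count the differing bits per row.
    dists = np.unpackbits(np.bitwise_xor(references, query), axis=1).sum(axis=1)
    order = np.argsort(dists)[:k]
    return order, dists[order]

rng = np.random.default_rng(0)
refs = rng.integers(0, 256, size=(100, 32), dtype=np.uint8)  # 100 synthetic 256-bit descriptors
query = refs[7].copy()
query[0] ^= 0b00000001                                       # near-duplicate of reference 7

idx, dist = knn_hamming(query, refs, k=2)
print(f"nearest={idx[0]} (d={dist[0]}), second nearest={idx[1]} (d={dist[1]})")
```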

3.4. Ratio Test Principle

Lowe’s ratio test is a classical method for evaluating the quality of feature matches, first proposed by David G. Lowe [26] in his SIFT (scale-invariant feature transform) algorithm to eliminate ambiguous or incorrect correspondences. The core principle of the test is to identify the two nearest neighbors of a given feature descriptor—the nearest neighbor and the second-nearest neighbor—based on their matching distances. The Lowe ratio is defined as the ratio of the distance to the nearest neighbor over the distance to the second-nearest neighbor. The formula is given as follows:
$$\mathrm{ratio} = d_1 / d_2 \tag{6}$$
where d1 is the distance between the feature point and its nearest neighbor, and d2 is the distance between the feature point and its second-nearest neighbor.
The calculated Lowe ratio is typically compared against a predefined threshold. If the ratio is below the threshold, it indicates a significant difference between the nearest and second-nearest neighbors relative to the sample descriptor. This implies a high similarity between the nearest neighbor and the sample, a low similarity with the second-nearest neighbor, and thus a clear and reliable correspondence. Conversely, if the ratio exceeds the threshold, the nearest and second-nearest neighbors are similarly close to the sample, creating ambiguity and preventing reliable identification of the correct match. In such cases, the sample descriptor is discarded. This criterion effectively reduces false matches, thereby enhancing the robustness and accuracy of feature matching.
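A minimal sketch of the ratio test applied to ORB matches is given below, assuming OpenCV's brute-force matcher with knnMatch; the ratio threshold of 0.75 is a commonly used value and an assumption here, not a parameter reported in this paper.

```python
import cv2

def ratio_test(desc1, desc2, ratio=0.75):
    """Keep matches whose nearest/second-nearest distance ratio is below the threshold."""
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)        # Hamming metric for binary ORB descriptors
    pairs = matcher.knnMatch(desc1, desc2, k=2)      # two nearest neighbors per query descriptor
    good = []
    for pair in pairs:
        if len(pair) < 2:                            # no second neighbor available: skip
            continue
        m, n = pair
        if m.distance < ratio * n.distance:          # Equation (6): d1 / d2 below the threshold
            good.append(m)
    return good
```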

3.5. Algorithm Flow

3.5.1. Incremental Image Registration Algorithm Based on Feature Point Extraction

This algorithm builds upon and refines the framework proposed by Gracias [18], which comprises three key stages: correlation block registration, robust motion estimation, and incremental stitching construction. At the time of the original publication, feature description techniques were still in their infancy, and the registration process relied on correlation block matching, which was limited in handling texture variations and local occlusions. With the development of local invariant features and robust matching models, feature-point-based registration methods now offer superior speed and stability. Accordingly, while retaining the core framework, this study replaces the original image registration module with a combination of feature point extraction and matching, followed by RANSAC (random sample consensus)-based robust homography estimation to obtain more reliable correspondences. Furthermore, homographic transformations are employed for image reprojection and fusion, enabling the construction of continuous underwater mosaics. The specific algorithmic workflow is described below.
For image registration, this algorithm employs a combined Harris + ORB approach for feature point extraction, replacing the traditional correlation block matching and pyramid strategies. The method identifies matching pairs that maintain geometric consistency between consecutive frames.
First, the Harris corner detection operator is applied to extract corners from the images. The Harris algorithm computes a response function based on the degree of grayscale variation within local image regions. Compared to ordinary pixels, these corners exhibit higher repeatability and distinctiveness, maintaining consistent positioning across multiple frames. After feature point extraction, the ORB descriptor is employed to encode local texture information within each corner neighborhood [23]. The ORB descriptor represents a sparse local feature with scale and rotation invariance. Its BRIEF-based binary vector encoding strategy significantly reduces dimensionality and computational cost, while directional correction mechanisms enhance stability under viewpoint changes and small-scale rotations. Unlike traditional cross-correlation block matching, which relies on pixel intensity consistency, ORB descriptors are robust to variations in brightness, local contrast, and uneven underwater illumination, making them well-suited for real-world underwater imaging conditions.
Subsequently, to perform feature point matching across different images, the algorithm employs a brute-force matcher using the Hamming distance as a similarity metric. Since ORB descriptors are binary vectors, the Hamming distance allows for rapid evaluation of differences between feature descriptors, enabling efficient identification of candidate matching points. Moreover, by sorting the matches according to distance and discarding correspondences with low similarity, the accuracy of the matching pairs is significantly improved, thereby reducing the impact of mismatches on subsequent transformation estimation [23,24].
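The baseline matching step can be sketched as follows, assuming OpenCV's brute-force Hamming matcher; sorting by distance and keeping only the best correspondences reflects the filtering described above, and the number of retained matches is an illustrative assumption.

```python
import cv2

def baseline_bf_match(desc1, desc2, keep=100):
    """Brute-force Hamming matching, sorted by distance; keep only the closest pairs."""
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    matches = matcher.match(desc1, desc2)                 # one nearest neighbor per descriptor
    matches = sorted(matches, key=lambda m: m.distance)   # smaller distance = higher similarity
    return matches[:keep]                                 # discard low-similarity correspondences
```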
For robust motion estimation, the RANSAC algorithm is combined with iterative least squares and M-estimation to determine the homography matrix, effectively eliminating erroneous matches. RANSAC is first employed to estimate the initial model, with the corresponding formulations provided in Equations (7)–(11):
$$\theta = f(S) \tag{7}$$
where θ represents the model parameters to be estimated, S denotes the extracted minimal sample set, and f is the model-fitting function that computes the parameters from this sample.
$$r_i = d(x_i, \theta) \tag{8}$$
where xi represents the i-th data point, d(xi, θ) denotes the error metric value, and ri indicates the residual of xi under the current model θ.
$$I(\theta) = \{\, x_i \in D \mid r_i < \tau \,\} \tag{9}$$
where D denotes the entire dataset, τ represents the inlier threshold, and I(θ) is the subset of points classified as inliers under the current model θ.
$$\theta^{*} = \arg\max_{\theta} \, |I(\theta)| \tag{10}$$
where θ* denotes the model parameters that maximize the number of inliers, and |I(θ)| represents the number of points in the inlier set.
$$\theta^{*} = f\big( I(\theta^{*}) \big) \tag{11}$$
The model is then recomputed from all identified inliers to achieve greater precision.
The model is further refined using iterative least squares, optimizing the homography transformation matrix by minimizing the reprojection error. Subsequently, robust optimization is performed via the M-estimator, which incorporates a weighting function to suppress the influence of residual outliers [27]. Building upon the initial RANSAC estimate, the M-estimation method achieves robust quadratic optimization of the homography matrix by progressively reducing the impact of high residual errors. This strategy effectively enhances geometric consistency in image registration for weakly textured underwater scenes, improving both the stability and visual naturalness of the resulting mosaics. Consequently, it elevates the alignment accuracy and overall consistency of image stitching.
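A minimal sketch of this robust estimation stage is shown below, assuming OpenCV: cv2.findHomography with the RANSAC flag provides the initial model and inlier mask of Equations (7)–(11), and a plain least-squares re-fit on the inliers stands in for the iterative least-squares and M-estimation refinement described above.

```python
import cv2
import numpy as np

def estimate_homography(kp1, kp2, matches, ransac_thresh=3.0):
    """RANSAC homography followed by a least-squares re-fit on the inliers."""
    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

    # RANSAC: maximize the inlier set under the reprojection threshold tau (Equations (9)-(10)).
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, ransac_thresh)
    if H is None:
        return None, None

    # Re-estimate from all inliers (Equation (11)); method=0 is a plain least-squares fit
    # and stands in for the iterative least-squares / M-estimation refinement.
    inliers = mask.ravel().astype(bool)
    H_refined, _ = cv2.findHomography(src[inliers], dst[inliers], 0)
    return H_refined, inliers
```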
For image stitching, this algorithm adopts the incremental stitching strategy proposed by Gracias [18], but simplifies it by reducing multi-frame stitching to a two-frame approach. Only a single relative transformation using the homography matrix is required, eliminating the need to estimate and accumulate transformations into a global coordinate system. Consequently, error accumulation control becomes unnecessary, as the algorithm focuses solely on solving a single alignment problem. This simplification results in a more straightforward implementation, reduced computational requirements, and faster execution.
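The simplified two-frame stitching step can be sketched as follows, assuming OpenCV; the first frame is reprojected with the single estimated homography onto a canvas large enough to hold both images, and the second frame is copied in with a simple overwrite blend.

```python
import cv2
import numpy as np

def stitch_pair(img1, img2, H):
    """Warp img1 into img2's frame with a single homography and overlay both on one canvas."""
    h1, w1 = img1.shape[:2]
    h2, w2 = img2.shape[:2]

    # Project img1's corners to find the bounding canvas in img2's coordinate system.
    corners1 = np.float32([[0, 0], [w1, 0], [w1, h1], [0, h1]]).reshape(-1, 1, 2)
    warped_corners = cv2.perspectiveTransform(corners1, H).reshape(-1, 2)
    all_pts = np.vstack([warped_corners, [[0, 0], [w2, 0], [w2, h2], [0, h2]]])
    x_min, y_min = np.floor(all_pts.min(axis=0)).astype(int)
    x_max, y_max = np.ceil(all_pts.max(axis=0)).astype(int)

    # Translate so every projected pixel lands at non-negative canvas coordinates.
    shift = np.array([[1, 0, -x_min], [0, 1, -y_min], [0, 0, 1]], dtype=np.float64)
    canvas = cv2.warpPerspective(img1, shift @ H, (x_max - x_min, y_max - y_min))

    # Simple overwrite blend: place the untransformed second frame on the canvas.
    canvas[-y_min:h2 - y_min, -x_min:w2 - x_min] = img2
    return canvas
```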
In summary, this algorithm extends the framework proposed by Gracias by incorporating feature extraction techniques to improve the image registration stage. Specifically, it replaces correlation block matching with Harris + ORB feature point matching and employs the RANSAC algorithm to estimate the homography matrix, thereby ensuring geometric reliability of correspondences. The overall algorithmic workflow is illustrated in Figure 3.
Overall, the proposed algorithm retains the theoretical foundation of classical underwater image stitching methods while effectively enhancing feature robustness and computational efficiency. It offers a scalable technical solution for applications such as environmental perception, seafloor scene mapping, and marine scientific research.
The actual matching results (Figure 4) indicate that the feature-point-based incremental image stitching algorithm exhibits numerous misalignments, including typical ghosting and registration errors. Image edges remain unaligned, with discontinuities present on both sides of the stitching seam. The overall visual quality is blurred, exhibiting severe ghosting and failing to meet visual consistency requirements. These issues primarily arise from misaligned points being incorrectly classified as inliers during homography estimation, resulting in erroneous local transformation matrices. Furthermore, insufficient feature point extraction exacerbates the misalignment, collectively contributing to the observed artifacts [28]. It is worth noting that the original image sequences used in this experiment were captured by a ROV (remotely operated vehicle) platform in real underwater environments, which further increases the complexity of image degradation and scene variability.

3.5.2. ORB-KNN-Ratio Test Image Splicing Algorithm

Underwater images often suffer from severe color casts, weak local textures, and uneven illumination. These conditions cause traditional ORB + BF matching to produce numerous mismatches, compromising the accuracy of homography estimation. To address these limitations, the original approach has been optimized and improved in this study. Specifically, color compensation and contrast enhancement are applied prior to feature extraction to increase the number of detectable features. During the matching phase, a fixed-ratio filtering mechanism is employed to enhance the distinctiveness and consistency of matching pairs, thereby improving the overall stability of image stitching.
To address the insufficient feature points extracted by the ORB algorithm, which can lead to image matching failures and erroneous correspondences, the input underwater images undergo preprocessing, specifically color compensation and enhancement. Due to the non-uniform absorption of light in water—stronger absorption of red light and weaker absorption of blue-green light—underwater images typically exhibit a blue-green cast, weak textures, and low contrast. Under such low-contrast, low-texture conditions, feature point extraction becomes significantly more challenging, necessitating image preprocessing. Prior to feature extraction, color and brightness correction are applied to restore the intrinsic texture structure of the image. In this algorithm, underwater images are first processed using Gray-World white balance and CLAHE (contrast-limited adaptive histogram equalization) for local contrast enhancement. Gray-World white balance assumes that the average of the three color channels (R, G, B) in a natural image should approximate neutral gray [29]. Channel gains for the three channels are calculated using Equations (12) and (13), followed by color correction of the white-balanced image via Equation (14).
$$k = \frac{\bar{R} + \bar{G} + \bar{B}}{3} \tag{12}$$
$$\alpha_R = \frac{k}{\bar{R}}, \quad \alpha_G = \frac{k}{\bar{G}}, \quad \alpha_B = \frac{k}{\bar{B}} \tag{13}$$
Here, $\bar{R}$, $\bar{G}$, and $\bar{B}$ represent the average brightness of the red, green, and blue channels, respectively; $\alpha_R$, $\alpha_G$, and $\alpha_B$ denote the gain coefficients for the three channels; and k is the grayscale target of the image.
$$R' = \alpha_R R, \quad G' = \alpha_G G, \quad B' = \alpha_B B \tag{14}$$
where R′, G′, and B′ represent the white-balanced values of the red, green, and blue channels, respectively.
The blue-green oversaturation is mitigated, bringing the overall color of underwater images closer to natural lighting conditions. CLAHE-based local contrast enhancement is subsequently applied to the white-balanced image using Equation (15), which preserves edge structures while enhancing weak texture contrast [30]. This process improves local texture visibility and contrast, thereby increasing the detectability of feature points.
$$I_{\mathrm{CLAHE}}(x, y) = \frac{\mathrm{CDF}\big(I(x, y)\big) - I_{\min}}{I_{\max} - I_{\min}} \cdot (L - 1) \tag{15}$$
where I(x, y) is the grayscale value at pixel position (x, y), CDF is the cumulative distribution function, Imin and Imax are the minimum and maximum grayscale values within the cumulative window, L is the number of grayscale levels, and ICLAHE(x, y) is the enhanced output pixel value.
The combination of these two preprocessing methods substantially improves both the robustness and the number of feature points detected in subsequent ORB feature extraction. This enhancement demonstrates stable performance, particularly in murky, low-texture underwater environments with severe light attenuation. The specific effects are illustrated in Figure 5; the original image is from a public dataset—TankImage-I.
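A minimal sketch of this preprocessing stage is given below, assuming OpenCV; applying CLAHE to the luminance channel of the LAB color space, as well as the clip limit and tile size, are illustrative assumptions rather than settings reported in this paper.

```python
import cv2
import numpy as np

def preprocess_underwater(img_bgr):
    """Gray-World white balance (Equations (12)-(14)) followed by CLAHE contrast enhancement."""
    # Gray-World: scale each channel so its mean moves toward the common gray target k.
    b, g, r = cv2.split(img_bgr.astype(np.float32))
    k = (b.mean() + g.mean() + r.mean()) / 3.0
    balanced = cv2.merge([b * (k / b.mean()), g * (k / g.mean()), r * (k / r.mean())])
    balanced = np.clip(balanced, 0, 255).astype(np.uint8)

    # CLAHE on the luminance (L) channel enhances weak local textures while preserving color.
    lab = cv2.cvtColor(balanced, cv2.COLOR_BGR2LAB)
    l_ch, a_ch, b_ch = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = cv2.merge([clahe.apply(l_ch), a_ch, b_ch])
    return cv2.cvtColor(enhanced, cv2.COLOR_LAB2BGR)
```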
To eliminate false matches and select reliable matching points, a combined KNN and Lowe’s ratio test strategy is employed, replacing the original method to enhance matching reliability and reduce false match rates. Traditional approaches typically rely on direct nearest-neighbor matching, identifying the closest descriptor pair for each feature. However, in underwater environments characterized by weak textures, uneven illumination, local blurring, and strong noise interference, descriptor similarity differences are often diminished. Direct matching under such conditions is prone to false correspondences, leading to unstable single-response model solutions and ultimately causing misalignment or distortion in the reconstructed structure.
To improve matching reliability, the proposed method first applies the KNN algorithm to the ORB-extracted feature descriptors, identifying the K nearest neighbors for each sample and storing the corresponding distances. Subsequently, the Lowe ratio test is applied: a match is accepted only if the ratio of the nearest distance to the second-nearest distance falls below a predefined threshold; otherwise, the candidate feature is discarded. This filtering process removes ambiguous matches, producing a “purer” set of correspondences. The fixed-ratio filtering strategy effectively suppresses false-similarity matches common in low-texture underwater scenes, ensuring stable matching quality. Moreover, it provides higher-quality matching points for subsequent algorithms, facilitating more robust model estimation and faster convergence.
In summary, the proposed ORB-KNN-Ratio Test algorithm consists of five stages: image preprocessing, feature extraction, feature matching, geometric transformation estimation, and image fusion. First, to address common underwater image issues such as color bias and low contrast, Gray-World white balance and CLAHE-based local contrast enhancement are applied during preprocessing, restoring color fidelity and enhancing texture details. Next, the ORB algorithm extracts rotation- and scale-invariant feature points along with their binary descriptors. To improve matching reliability, candidate matching pairs are identified using KNN search and subsequently refined via the Lowe ratio test, which eliminates spurious matches with insufficient similarity and establishes stable feature correspondences. Geometric transformation is then estimated using the RANSAC algorithm to remove remaining outliers and ensure accurate homography. Finally, the reference image is projected onto the target image using the homography transformation and seamlessly blended within the overlapping region, producing a spatially continuous underwater panoramic mosaic [31]. The overall workflow of the algorithm, integrating ORB-based feature extraction and KNN-guided matching, is illustrated in Figure 6. The process focuses on ensuring geometric consistency and stable feature distribution, rather than merely distinguishing descriptors.
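Putting the five stages together, the following sketch outlines the full pipeline, assuming OpenCV and the helper functions sketched in the preceding sections (preprocess_underwater, ratio_test, estimate_homography, stitch_pair); the feature budget and ratio threshold are illustrative assumptions.

```python
import cv2

def orb_knn_ratio_stitch(img1_bgr, img2_bgr, nfeatures=2000, ratio=0.75):
    """End-to-end sketch: preprocess, extract ORB features, KNN + ratio test, RANSAC, fuse."""
    # 1. Preprocessing: Gray-World white balance + CLAHE contrast enhancement.
    pre1, pre2 = preprocess_underwater(img1_bgr), preprocess_underwater(img2_bgr)

    # 2. ORB feature extraction on the enhanced images.
    orb = cv2.ORB_create(nfeatures=nfeatures)
    kp1, desc1 = orb.detectAndCompute(cv2.cvtColor(pre1, cv2.COLOR_BGR2GRAY), None)
    kp2, desc2 = orb.detectAndCompute(cv2.cvtColor(pre2, cv2.COLOR_BGR2GRAY), None)

    # 3. KNN matching followed by Lowe's ratio test to discard ambiguous correspondences.
    good = ratio_test(desc1, desc2, ratio=ratio)

    # 4. RANSAC-based homography estimation on the surviving matches.
    H, _ = estimate_homography(kp1, kp2, good)

    # 5. Reproject the first frame and fuse the pair into a single mosaic.
    return stitch_pair(img1_bgr, img2_bgr, H)
```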

4. Experimental Results Analysis

The experiments in this study were conducted using color image sequences captured by an underwater robotic platform (ROV) in real underwater environments. From these sequences, adjacent image pairs with sufficient overlap and representative visual features were selected as experimental data and stitched to generate the final composite images. Due to the high cost of real underwater data acquisition, as well as constraints imposed by operational environments and mission conditions, large-scale real underwater datasets are difficult to obtain.
Although large-scale real underwater datasets are difficult to obtain, the nine selected image groups cover a wide range of imaging quality and degradation levels, including varying degrees of illumination attenuation, color distortion, and sparse texture details. These datasets are representative of practical underwater operational conditions and constitute challenging real underwater scenarios rather than large-scale synthetic datasets. In addition, the selected images were not manually curated as optimal cases, but typical samples chosen to avoid result bias and to ensure a fair and controlled evaluation process. Each stitching unit in this experiment consists of two consecutive overlapping frames. Using multi-frame mosaicking would introduce accumulated geometric drift and would not allow an isolated assessment of the core components examined in this study—namely, feature extraction and homography estimation. Therefore, employing two-frame image pairs as the basic evaluation unit constitutes a standard and appropriate approach for assessing underwater stitching performance. Representative stitching results are shown in Figure 7.
The algorithm’s performance was evaluated using three objective metrics: peak signal-to-noise ratio (PSNR), Structural Similarity Index (SSIM), and computation time [32]. It should be emphasized that this study is not intended to establish statistically averaged performance metrics, but rather to conduct underwater engineering validation under representative and realistic conditions. In addition, large-scale statistical evaluation remains challenging for real-world underwater experiments.
To ensure that the comparison is fair and convincing, the same set of images is used as input to the Harris–ORB + KNN, SIFT + BF, SIFT + KNN, AKAZE + BF, and ORB-KNN-Ratio Test algorithms.
The detailed computation procedure of the above evaluation metrics is described as follows: Due to the image stitching operation, the stitched result differs from the original frames in both content and spatial dimensions, making direct pixel-wise comparison over the full image infeasible. To ensure the validity, reliability, and fairness of the evaluation, all metrics are computed exclusively within the effective overlapping region, which is also the prevailing practice in the image stitching literature.
Specifically, the first image is projected onto the coordinate system of the second image using the estimated homography transformation, so that both images are aligned in a common spatial reference frame. A minimal bounding rectangle that covers all projected pixels is then constructed and used as the canvas, onto which both images are mapped. For each mapped image, pixels outside its valid projected region are uniformly set to zero, and a binary mask is employed to explicitly identify the effective overlapping area.
Finally, the SSIM and PSNR are computed only within the overlapping region to avoid interference from non-overlapping areas. For the SSIM, following the standard definition in Equation (16), local mean intensity, variance, and covariance are computed within the overlap to obtain the Structural Similarity Index. For the PSNR, the mean squared error (MSE) is first calculated on grayscale images within the overlapping region according to Equation (17), and the corresponding PSNR value is then derived using Equation (18). In this evaluation protocol, since ground-truth panoramic images are unavailable in real underwater scenarios, the second input image (i.e., the target image without geometric transformation) is treated as the reference image, and the stitched result is compared against it within the same coordinate system and overlapping region, thereby ensuring reproducibility, validity, and robustness of the quantitative evaluation.
SSIM quantifies the similarity between two images across three components, namely luminance, contrast, and structure, as defined in Equation (16). It reflects the degree to which structural information perceived by the human visual system is preserved. Values closer to 1 indicate higher structural fidelity, greater similarity, and superior visual quality. PSNR [33], derived from the mean squared error (MSE) and defined in Equations (17) and (18), measures the pixel-level fidelity between the stitched result and the reference image. Higher PSNR values correspond to better color reproduction, reduced distortion, and improved overall image quality. Computation time represents the duration required to execute the algorithm on a computer, serving as a critical metric for evaluating efficiency and real-time performance. Shorter computation times indicate higher algorithmic efficiency and better suitability for real-time applications [34,35].
$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^{2} + \mu_y^{2} + C_1)(\sigma_x^{2} + \sigma_y^{2} + C_2)} \tag{16}$$
where μx and μy represent the mean luminance of images x and y, respectively. σx and σy denote the local standard deviation of the images. σxy is the local covariance of the images. C1 and C2 are the stabilization constants.
$$\mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \big[ I(i, j) - K(i, j) \big]^{2} \tag{17}$$
where I(i, j) is the grayscale value of the original image at point (i, j), K(i, j) is the grayscale value of the stitched image at point (i, j), M and N denote the number of columns and rows in the image, respectively, and MSE represents the mean squared error, reflecting the error energy between the reconstruction result and the original image.
$$\mathrm{PSNR} = 10 \log_{10}\!\left( \frac{L^{2}}{\mathrm{MSE}} \right) \tag{18}$$
where L is the maximum possible gray value of the image, log10 is the logarithm base 10, and PSNR is the peak signal-to-noise ratio, reflecting image fidelity.
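For reference, the sketch below illustrates how the overlap-only evaluation protocol described above can be implemented, assuming OpenCV, NumPy, and scikit-image's structural_similarity; both frames are assumed to have already been mapped onto the common canvas with invalid pixels set to zero, and restricting SSIM to the bounding box of the overlap is a simplifying assumption.

```python
import cv2
import numpy as np
from skimage.metrics import structural_similarity

def evaluate_overlap(warped1_bgr, mapped2_bgr):
    """Compute SSIM and PSNR only inside the effective overlapping region of the canvas."""
    g1 = cv2.cvtColor(warped1_bgr, cv2.COLOR_BGR2GRAY)
    g2 = cv2.cvtColor(mapped2_bgr, cv2.COLOR_BGR2GRAY)

    # Binary mask of the overlap: pixels that are valid in both mapped images.
    overlap = (g1 > 0) & (g2 > 0)

    # PSNR from the MSE over overlapping pixels only (Equations (17)-(18), with L = 255).
    diff = g1[overlap].astype(np.float64) - g2[overlap].astype(np.float64)
    psnr = 10 * np.log10((255.0 ** 2) / np.mean(diff ** 2))

    # SSIM over the bounding box of the overlap, with non-overlapping pixels zeroed out.
    ys, xs = np.where(overlap)
    y0, y1, x0, x1 = ys.min(), ys.max() + 1, xs.min(), xs.max() + 1
    m = overlap[y0:y1, x0:x1].astype(np.uint8)
    ssim = structural_similarity(g1[y0:y1, x0:x1] * m, g2[y0:y1, x0:x1] * m)
    return ssim, psnr
```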
As shown in Table 1 and Figure 8, Figure 9 and Figure 10, the improved stitching method establishes stable feature correspondences even in weakly textured regions. This is achieved by enhancing texture visibility through color augmentation and suppressing similarity mismatches via fixed-ratio filtering, thereby significantly improving the inlier rate and structural consistency within overlapping areas.
In terms of overall stitching performance, the ORB-KNN algorithm consistently outperforms the Harris–ORB + KNN, SIFT + BF, SIFT + KNN, and AKAZE + BF methods in two key quality metrics: Structural Similarity Index (SSIM) and peak signal-to-noise ratio (PSNR). Specifically, ORB-KNN achieved an average SSIM of 0.879 and a stable PSNR of approximately 20.6 dB, with virtually no visible stitching artifacts. In contrast, Harris–ORB + KNN, SIFT + BF, SIFT + KNN, and AKAZE + BF consistently yielded low reconstruction quality, with SSIM values around 0.61 and PSNR values of approximately 9.5 dB, indicating severe visual distortion in the stitched results.
Regarding computational efficiency, Harris–ORB + BF initially averaged 0.41 s for the six image groups, which is 2.5 times faster than ORB-KNN (1.02 s), indicating potential suitability for real-time applications under relatively mild conditions. However, processing the third image group required 22.9 s, increasing the average computation time to 7.9 s, which is slower than ORB-KNN. This substantial increase is primarily attributed to the feature matching stage. Specifically, the Harris detector is highly sensitive to uneven illumination and background noise, leading to an excessive number of detected corner points in severely degraded underwater scenes. When combined with a brute-force matching strategy, which exhaustively compares all feature pairs, the matching complexity increases dramatically, resulting in a sharp rise in computation time. This behavior indicates potential stability limitations when handling complex textures or high-resolution imagery.
For AKAZE + BF, the average processing time for the first and third image groups was approximately 0.64 s, which is 1.6 times faster than ORB-KNN (1.02 s), also demonstrating certain real-time application potential. However, stitching failed for the second image group due to an insufficient number of detected feature points. This limitation arises because, under low-contrast and strong scattering conditions, the nonlinear scale space of AKAZE tends to extract fewer stable keypoints, leading to inadequate feature correspondences for reliable homography estimation.
In contrast, both SIFT + BF and SIFT + KNN exhibited computation times comparable to ORB-KNN, with an average runtime of approximately 0.82 s, reflecting their relatively stable performance.
From the perspective of the final stitched images, the ORB-KNN-Ratio Test algorithm demonstrates excellent performance in underwater image stitching and environmental perception. It effectively and accurately establishes feature correspondences while mitigating common challenges in underwater imaging, such as illumination variations, scattering, and reflections. Comparison of quantitative metrics—including SSIM, PSNR, and algorithm runtime—between the proposed method and traditional stitching approaches indicates that the ORB-KNN-Ratio Test algorithm achieves superior performance. Specifically, it provides stable and relatively short computation times, better preserves image structures, reduces distortion, and delivers higher overall image quality [36,37,38].

5. Summary and Limitations

This paper addresses the limited field-of-view problem encountered by low-cost micro-submersible robots operating in low-light, low-texture underwater environments by proposing an image stitching method based on ORB features and fixed-ratio filtering. The primary innovation of this approach is the integration of image enhancement with fixed-ratio matching, which improves the stability of feature correspondences and enhances stitching quality, specifically tailored to the challenges posed by low-texture underwater imagery.
Gray-World white balance combined with CLAHE enhancement restores color fidelity and local texture details, substantially improving the detectability of feature points. During feature matching, KNN search integrated with a fixed-threshold ratio test effectively eliminates false matches arising from regions with weak or indistinct textures. This process is further coupled with RANSAC-based robust estimation to compute homography transformations, ultimately enabling reliable stitching of wide-angle underwater scenes.
The proposed method achieves stable matching performance in weakly textured underwater scenes and is particularly suitable for underwater robotic platforms with high real-time requirements and limited computational resources. Although underwater and aerial environments differ in imaging conditions, many core challenges in image stitching are similar across these scenarios. As a result, the lightweight and robust stitching framework proposed in this study has the potential to be extended to aerial platforms under comparable conditions, such as UAV-based construction site monitoring, building facade analysis, and post-construction inspection and maintenance [39,40].
Although the algorithm developed in this study has demonstrated its feasibility and stability in real underwater environments, its effectiveness remains dependent on environmental conditions. Due to the inherent properties of water as an optical medium, light undergoes scattering and refraction within the water column, with a portion also absorbed by the water itself [41]. Consequently, illumination intensity has a significant impact on underwater image quality. Under conditions of extremely low light, high concentrations of suspended particles, or pronounced motion blur, image textures may become too weak or unstable to support reliable feature extraction, which reduces feature stability, limits stitching robustness, and can introduce misalignments that compromise matching stability [42]. In addition, the current stitching workflow relies on relative frame-to-frame registration and does not address cumulative drift issues in multi-frame sequences.

Author Contributions

Conceptualization, D.Y. and G.Y.; Resource support, D.Y.; Literature review, D.Y. and Y.C.; Participation in data preprocessing, G.Y.; Experimental design and implementation, G.Y.; Project and experimental supervision, D.Y.; Figure and visualization preparation, Y.C.; Drafting of the original manuscript, G.Y.; Data organization, T.Z.; Manuscript review and revision, T.Z. and D.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data used in this study were acquired using a real underwater robotic platform and are subject to operational constraints. Therefore, the complete dataset and source code are not publicly available. However, the methodological details, parameter settings, and evaluation protocols are fully described in the manuscript to support the reproducibility of the reported results. The authors are willing to share additional information or representative samples upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Li, C.; Guo, J.; Cong, R.; Pang, Y.; Wang, J. Underwater image enhancement by dehazing with minimum information loss and histogram distribution prior. IEEE Trans. Image Process. 2016, 25, 5664–5677. [Google Scholar] [CrossRef]
  2. Yuan, X.; Guo, L.; Luo, C.; Zhou, X.; Yu, C. A survey of target detection and recognition methods in underwater turbid areas. Appl. Sci. 2022, 12, 4898. [Google Scholar] [CrossRef]
  3. McGlamery, B.L. A computer model for underwater camera systems. Proc. SPIE 1980, 208, 221–231. [Google Scholar]
  4. Mercado, J.; Sekimori, Y.; Toriyama, A.; Ohashi, M.; Chun, S.; Maki, T. Photogrammetry-based photic seafloor surveying and analysis with low-cost autonomous underwater and surface vehicles. J. Robot. Mechatron. 2024, 36, 245–258. [Google Scholar] [CrossRef]
  5. Xu, G.; Zhou, D.; Yuan, L.; Guo, W.; Huang, Z.; Zhang, Y. Vision-based underwater target real-time detection for autonomous underwater vehicle subsea exploration. Front. Mar. Sci. 2023, 10, 1087345. [Google Scholar] [CrossRef]
  6. Matheus, M.; Guilherme, C.; Paulo, J.; Paulo, L.; Silvia, S. Underwater robots localization using multi-domain images: A survey. J. Intell. Robot. Syst. 2025, 111, 52. [Google Scholar]
  7. Zhang, H.; Zheng, R.; Zhang, W.; Shao, J.; Miao, J. An improved SIFT underwater image stitching method. Appl. Sci. 2023, 13, 12251. [Google Scholar] [CrossRef]
  8. Zhang, Z.; Wu, R.; Li, D.; Lin, M.; Xiao, S.; Lin, R. Image stitching and target perception for autonomous underwater vehicle-collected side-scan sonar images. Front. Mar. Sci. 2024, 11, 1418113. [Google Scholar] [CrossRef]
  9. Liu, Y.; Wang, X.; Sun, L.; Chen, J.; He, J.; Zhou, Y. Shallow marine high-resolution optical mosaics based on underwater scooter-borne camera. Sensors 2023, 23, 8028. [Google Scholar] [CrossRef] [PubMed]
  10. Gu, Z.; Liu, X.; Hu, Z.; Wang, G.; Zheng, B.; Watson, J.; Zheng, H. Underwater computational imaging: A survey. Intell. Mar. Technol. 2023, 1, 2.
  11. Sharma, S.K.; Jain, K.; Shukla, A.K. A comparative analysis of feature detectors and descriptors for image stitching. Appl. Sci. 2023, 13, 6015.
  12. Zheng, J. A comparison of feature extraction methods in image stitching. Appl. Comput. Eng. 2023, 15, 160–166.
  13. Fan, Y.; Mao, S.; Li, M.; Kang, J. LMFD: Lightweight multi-feature descriptors for image stitching. J. Imaging 2023, 9, 78.
  14. Zhao, N. Image stitching based on feature detection and extraction: An analysis. In Proceedings of the 2024 2nd International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2024), Singapore, 9–11 August 2024.
  15. Liao, T.; Wang, C.; Li, L.; Liu, G.; Li, N. Parallax-tolerant image stitching via segmentation-guided multi-homography warping. Signal Process. 2025, 230, 109245.
  16. Altuntas, C. Feature point-based dense image matching algorithm for 3-D capture in terrestrial applications. J. Appl. Sci. 2022, 22, 329–341.
  17. Zhang, Y.; Mei, X.; Ma, Y.; Jiang, X.; Peng, Z.; Huang, J. Hyperspectral panoramic image stitching using robust matching and adaptive bundle adjustment. Remote Sens. 2022, 14, 4038.
  18. Gracias, N.; Santos-Victor, J. Underwater mosaicing and trajectory reconstruction using global alignment. In Proceedings of the MTS/IEEE OCEANS 2001: An Ocean Odyssey, Honolulu, HI, USA, 5–8 November 2001; pp. 2557–2563.
  19. Xie, Y.; Wang, Q.; Chang, Y.; Zhang, X. Fast Target Recognition Based on Improved ORB Feature. Appl. Sci. 2022, 12, 786.
  20. Zhao, H. Research on multi-sensor data fusion technology for underwater robots for deep-sea exploration. Appl. Math. Nonlinear Sci. 2024, 9.
  21. Chen, T.; Yang, X.; Li, N.; Wang, T.; Ji, G. Underwater image quality assessment method based on color space multi-feature fusion. Sci. Rep. 2023, 13, 16838.
  22. Qiu, X. Comparison and Application of Implementing Image Feature Matching Using Ratio Test and RANSAC; Atlantis Press: Dordrecht, The Netherlands, 2025.
  23. Rublee, E.; Rabaud, V.; Konolige, K.; Bradski, G. ORB: An efficient alternative to SIFT or SURF. In Proceedings of the 2011 IEEE International Conference on Computer Vision (ICCV), Barcelona, Spain, 6–13 November 2011; pp. 2564–2571.
  24. Harris, C.; Stephens, M. A combined corner and edge detector. In Proceedings of the 4th Alvey Vision Conference, Manchester, UK, 31 August–2 September 1988; pp. 147–151.
  25. Cover, T.; Hart, P. Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 1967, 13, 21–27.
  26. Lowe, D.G. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 2004, 60, 91–110.
  27. Fischler, M.A.; Bolles, R.C. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 1981, 24, 381–395.
  28. Wang, X.Y.; Hu, X.; Xiang, D.; Xiao, J.; Cheng, H. Point set registration-based image stitching in unmanned aerial vehicle transmission line inspection. EURASIP J. Adv. Signal Process. 2025, 2025, 32.
  29. Zhao, Z.; Zhou, Z.; Lai, Y.; Wang, T.; Zou, S.; Cai, H.; Xie, H. Single underwater image enhancement based on adaptive correction of channel differential and fusion. Front. Mar. Sci. 2023, 9, 1058019.
  30. Alhajlah, M. Underwater image enhancement using customized CLAHE and adaptive color correction method. J. Imaging 2022, 8, 311.
  31. Murat, I. Comprehensive evaluation of feature extractors in challenging environments: The rationale behind the ratio test. PeerJ Comput. Sci. 2024, 10, e2415.
  32. Liu, B.; Yang, Y.; Zhao, M.; Hu, M. A novel lightweight model for underwater image enhancement—Rep-UWnet. Sensors 2024, 24, 3070.
  33. Lai, Y.; Zhou, Z.; Su, B.; Zhe, X.; Tang, J.; Yan, J.; Liang, W.; Chen, J. Single underwater image enhancement based on differential attenuation compensation. Front. Mar. Sci. 2022, 9, 1047053.
  34. Sara, U.; Akter, M.; Uddin, M. Image quality assessment through FSIM, SSIM, MSE and PSNR—A comparative study. J. Comput. Commun. 2019, 7, 8–18.
  35. Zheng, S.; Wang, R.; Chen, G.; Huang, Z.; Teng, Y.; Wang, L.; Liu, Z. Underwater image enhancement using Divide-and-Conquer network (DC-Net). PLoS ONE 2024, 19, e0294609.
  36. Zhao, Y.; Gao, F.; Yu, J.; Yu, X.; Yang, Z. Underwater Image Mosaic Algorithm Based on Improved Image Registration. Appl. Sci. 2021, 11, 5986.
  37. Wang, Z.; Zhang, X.; Li, J. Fast calibration and stitching algorithm for underwater camera systems. Multimed. Tools Appl. 2023, 82, 18629–18644.
  38. Zhong, J.; Li, M.; Gruen, A.; Gong, J.; Li, D.; Li, M.; Qin, J. Application of Photogrammetric Computer Vision and Deep Learning in High-Resolution Underwater Mapping: A Case Study of Shallow-Water Coral Reefs. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2024, 10, 247–254.
  39. Wang, S. Effectiveness of Traditional Augmentation Methods for Rebar Counting Using UAV Imagery with Faster R-CNN and YOLOv10-Based Transformer Architectures. Sci. Rep. 2025, 15, 33702.
  40. Wang, S. Development of an Approach to an Automated Acquisition of Static Street View Images Using Transformer Architecture for Analysis of Building Characteristics. Sci. Rep. 2025, 15, 29062.
  41. Megha, V.; Rajkumar, K.K. Seamless panoramic image stitching based on invariant feature detector and image blending. Int. J. Image Graph. Signal Process. 2024, 16, 30–41.
  42. Spadaro, A.; Chiabrando, F.; Lingua, A.; Maschio, P. Photogrammetry and Traditional Bathymetry for High-Resolution Underwater Mapping in Shallow Waters. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2025, 48, 279–286.
Figure 1. Light propagation and degradation mechanisms in underwater imaging.
Figure 2. Schematic diagram of the KNN algorithm.
Figure 3. Flowchart of the incremental image stitching algorithm.
Figure 4. Actual stitching results from the incremental image stitching algorithm.
Figure 5. Comparison of image preprocessing effects.
Figure 6. ORB-KNN-Ratio Test algorithm flowchart.
Figure 7. Actual stitching results of the ORB-KNN-Ratio Test algorithm.
Figure 8. Algorithm SSIM metric comparison results.
Figure 9. Algorithm PSNR metric comparison results.
Figure 10. Comparison of time metrics among algorithms.
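For readers who wish to reproduce the matching stage summarized in Figure 6, the following is a minimal Python/OpenCV sketch of the ORB-KNN-Ratio Test pipeline: ORB feature extraction, KNN matching with k = 2, Lowe's ratio test, and RANSAC-based homography estimation. The parameter values (feature budget, 0.75 ratio threshold, 5-pixel RANSAC reprojection error) are illustrative assumptions, not the exact settings used in the experiments.

```python
import cv2
import numpy as np

def orb_knn_ratio_homography(img1, img2, ratio=0.75):
    """ORB + KNN (k=2) + Lowe's ratio test + RANSAC homography (illustrative sketch)."""
    # 1. Detect ORB keypoints and compute binary descriptors (feature budget is an assumption).
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)
    if des1 is None or des2 is None:
        return None  # too little texture to match

    # 2. KNN matching with Hamming distance, which suits ORB's binary descriptors.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    knn_matches = matcher.knnMatch(des1, des2, k=2)

    # 3. Lowe's ratio test: keep a match only if the best candidate is clearly
    #    better than the second-best one.
    good = []
    for pair in knn_matches:
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    if len(good) < 4:
        return None  # a homography needs at least four correspondences

    # 4. Robust homography estimation with RANSAC (5-pixel reprojection threshold assumed).
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H
```

Because ORB descriptors are binary, Hamming distance is the natural matching metric; the ratio threshold shown here is only illustrative, and weak-texture underwater scenes may call for a different value.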
Table 1. Algorithm evaluation metrics (SSIM; PSNR in dB; processing time in seconds).

Group Number   Algorithm              SSIM      PSNR (dB)   Time (s)
Group 1        Harris + ORB + BF      0.6257     7.9399     0.35
               ORB-KNN-Ratio test     0.8670    21.1819     0.89
               SIFT + BF              0.6270     7.9534     0.96
               SIFT + KNN             0.6293     7.9820     0.81
               AKAZE + BF             0.6321     8.0164     0.61
Group 2        Harris + ORB + BF      0.6450     8.6618     0.45
               ORB-KNN-Ratio test     0.8522    20.3570     0.94
               SIFT + BF              0.5672     8.1574     0.87
               SIFT + KNN             0.5652     8.1360     0.92
               AKAZE + BF             Insufficient feature points for matching
Group 3        Harris + ORB + BF      0.5416     9.6894     22.863
               ORB-KNN-Ratio test     0.8887    20.3179     0.93
               SIFT + BF              0.5422     9.6935     0.84
               SIFT + KNN             0.5420     9.6923     0.80
               AKAZE + BF             0.5464     9.7374     0.63
Group 4        Harris + ORB + BF      0.5784    10.0027     4.06
               ORB-KNN-Ratio test     0.9243    21.4677     1.10
               SIFT + BF              0.5790    10.0085     0.82
               SIFT + KNN             0.5793    10.0090     0.82
               AKAZE + BF             0.5788    10.0043     0.70
Group 5        Harris + ORB + BF      0.5759     8.8036     0.46
               ORB-KNN-Ratio test     0.8935    20.3282     1.11
               SIFT + BF              0.5759     8.8023     0.79
               SIFT + KNN             0.5767     8.8106     0.77
               AKAZE + BF             0.5764     8.8058     0.63
Group 6        Harris + ORB + BF      0.5959     7.0840     0.18
               ORB-KNN-Ratio test     0.8561    19.0325     1.09
               SIFT + BF              0.5949     7.0728     0.74
               SIFT + KNN             0.5981     7.1030     0.75
               AKAZE + BF             0.5979     7.1041     0.61
Group 7        Harris + ORB + BF      0.5856    10.8036     0.50
               ORB-KNN-Ratio test     0.9297    20.8550     1.10
               SIFT + BF              0.5854    10.8012     0.82
               SIFT + KNN             0.5860    10.8076     0.85
               AKAZE + BF             0.5895    10.8409     0.64
Group 8        Harris + ORB + BF      0.7381    11.6438     3.58
               ORB-KNN-Ratio test     0.8782    20.0549     1.17
               SIFT + BF              0.7387    11.6502     0.89
               SIFT + KNN             0.7388    11.6496     0.81
               AKAZE + BF             0.7390    11.6512     0.68
Group 9        Harris + ORB + BF      0.6090    10.1543     0.54
               ORB-KNN-Ratio test     0.9009    21.4420     0.98
               SIFT + BF              0.6038    10.9076     0.89
               SIFT + KNN             0.6046    10.1059     0.86
               AKAZE + BF             0.6016    10.0739     0.66
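Table 1 reports SSIM, PSNR, and processing time for each algorithm and image group. As a hedged illustration of how such metrics are commonly computed, the sketch below uses OpenCV and scikit-image; the file names, the resizing of the stitched result to the reference resolution, and the grayscale SSIM computation are assumptions made for the example and may differ from the exact protocol used to produce Table 1.

```python
import time
import cv2
from skimage.metrics import structural_similarity, peak_signal_noise_ratio

def evaluate_stitch(reference_path, stitched_path):
    """Compute SSIM and PSNR between a reference image and a stitched result (illustrative)."""
    ref = cv2.imread(reference_path)
    out = cv2.imread(stitched_path)
    # Resize the stitched result to the reference resolution so both metrics are
    # computed on pixel-aligned arrays (how cropping/overlap is handled is an assumption).
    out = cv2.resize(out, (ref.shape[1], ref.shape[0]))

    # SSIM on grayscale images, PSNR on the full 8-bit color images.
    ref_gray = cv2.cvtColor(ref, cv2.COLOR_BGR2GRAY)
    out_gray = cv2.cvtColor(out, cv2.COLOR_BGR2GRAY)
    ssim_val = structural_similarity(ref_gray, out_gray, data_range=255)
    psnr_val = peak_signal_noise_ratio(ref, out, data_range=255)
    return ssim_val, psnr_val

if __name__ == "__main__":
    # Hypothetical file names; the time column in Table 1 would be measured around
    # the stitching call itself, e.g., with time.perf_counter().
    t0 = time.perf_counter()
    ssim_val, psnr_val = evaluate_stitch("reference.png", "stitched_result.png")
    print(f"SSIM = {ssim_val:.4f}, PSNR = {psnr_val:.4f} dB, eval time = {time.perf_counter() - t0:.2f} s")
```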
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
