Article

Optimized Recognition Algorithm for Remotely Sensed Sea Ice in Polar Ship Path Planning

by Li Zhou 1,*, Runxin Xu 2, Jiayi Bian 3, Shifeng Ding 2, Sen Han 3 and Roger Skjetne 4
1 State Key Laboratory of Ocean Engineering, Shanghai Jiao Tong University, Shanghai 200030, China
2 School of Ocean and Civil Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
3 School of Naval Architecture and Ocean Engineering, Jiangsu University of Science and Technology, Zhenjiang 212100, China
4 Department of Marine Technology, Faculty of Engineering, Norwegian University of Science and Technology, N-7491 Trondheim, Norway
* Author to whom correspondence should be addressed.
Remote Sens. 2025, 17(19), 3359; https://doi.org/10.3390/rs17193359 (registering DOI)
Submission received: 28 July 2025 / Revised: 29 September 2025 / Accepted: 30 September 2025 / Published: 4 October 2025

Highlights

What are the main findings?
  • An optimized model for enhanced sea ice detection in satellite imagery, especially for small sea ice floes.
  • A real-time path planning algorithm for polar navigation that enables safe passage through small ice floes while avoiding large hazardous ones.
What are the implications of the main findings?
  • A novel methodology integrating ice detection and path planning for polar navigation, utilizing satellite imagery to enable autonomous route optimization.
  • An integrated real-time model combining ice detection and path planning enhances navigational safety in polar regions.

Abstract

Collisions between ships and sea ice pose a significant threat to maritime safety, making it essential to detect sea ice and perform safety-oriented path planning for polar navigation. This paper utilizes an optimized You Only Look Once version 5 (YOLOv5) model, designated as YOLOv5-ICE, for the detection of sea ice in satellite imagery, with the resultant detection data being employed to input obstacle coordinates into a ship path planning system. The enhancements include the Squeeze-and-Excitation (SE) attention mechanism, improved spatial pyramid pooling, and the Flexible ReLU (FReLU) activation function. The improved YOLOv5-ICE shows enhanced performance, with its mAP increasing by 3.5% compared to the baseline YOLOv5 and also by 1.3% compared to YOLOv8. YOLOv5-ICE demonstrates robust performance in detecting small sea ice targets within large-scale satellite images and excels in high ice concentration regions. For path planning, the Any-Angle Path Planning on Grids algorithm is applied to simulate routes based on detected sea ice floes. The objective function incorporates the path length, number of ship turns, and sea ice risk value, enabling path planning under varying ice concentrations. By integrating detection and path planning, this work proposes a novel method to enhance navigational safety in polar regions.

1. Introduction

Arctic sea ice has been shrinking since the 1980s, opening up new maritime opportunities such as the Northeast, Northwest, and Central Passages [1,2]. However, navigating these extreme environments is challenging due to dynamic sea ice, which poses a considerable threat to vessel safety [3]. Therefore, accurate sea ice observation and reliable path planning are essential for safe navigation and are becoming major focuses of polar research.
Acquiring sea ice information predominantly relies on three methodologies: field measurements, ship-based observational surveys, and satellite remote sensing. In this context, Lu et al. [4] developed a two-stream radiative transfer model for ponded sea ice, while Weissling et al. [5] used video to capture ice dynamics. For large-scale assessments, Worby et al. [6] analyzed over 20,000 samples from ship transects. Recently, computational approaches have significantly enhanced data processing, with researchers like Zhou et al. [7] applying convolutional neural networks for instance segmentation of sea ice, and Ressel et al. [8] using artificial neural networks for classification. Chen et al. [9] pioneered a 3D reconstruction methodology for sea ice fields by integrating YOLO-based object detection. However, some of the aforementioned methods lack optimization for small-target sea ice, which may lead to difficulties in achieving complete identification of small sea ice floes in practical applications, and missed detection of sea ice could adversely affect obstacle avoidance decision-making for safe navigation. Although other approaches have proposed improvement strategies specifically for small-target sea ice, their model training in practice relies on large-scale datasets of ship-based sea ice imagery. Such datasets must be obtained through field observations by polar research vessels, making their acquisition significantly more challenging than that of open-source satellite remote sensing images.
The accurate extraction of sea ice feature parameters, particularly for image recognition, has become a crucial research focus. Current methodologies primarily use threshold segmentation, target recognition, and instance segmentation techniques [10]. Among target recognition algorithms, YOLO, SSD, and Mask R-CNN are predominant [11,12]. For instance, Lu et al. [13] introduced a fusion-based segmentation method for rocks, Zhang et al. [14] refined YOLOv5 for occlusion challenges, and Wu et al. [15] proposed SPE-YOLO using SE attention for small target detection. Other studies include Cai et al. [16], who used CNNs for sea ice instance segmentation, and Dong et al. [17], who developed a two-stage ice channel identification approach.
While these advancements in sea ice identification have improved situational awareness, they often do not directly provide a safe navigation plan. This has led to parallel research in polar path planning, where factors like sea ice concentration and thickness are considered. For instance, Shu et al. [18] employed an optimal control-based method, integrated with macro-scale sea ice concentration and thickness grid data, to conduct path planning for ship fleets in the Northern Sea Route, while distinguishing between breakable and unbreakable ice to optimize navigation costs. Zhang et al. [19] utilized a three-dimensional ant colony algorithm (3D-ACA), representing the ice field with average concentration and thickness of discrete grids, and took ship speed as a control variable to carry out multi-objective path planning for Arctic ships that balances fuel consumption and navigation risk optimization. Liu et al. [20] constructed a polar path planning background using sea ice concentration data from NSIDC and sea ice thickness data from PIOMAS, and proposed the D*-NSGA-III dynamic multi-objective path planning algorithm to conduct research on ship path planning in the Arctic region. Lehtola et al. [21] adopted an improved A*-based algorithm, combined with sea ice concentration, thickness data provided by the HELMI ice model and a ship-ice interaction model, to conduct safe and efficient route planning in ice-covered waters. Xu et al. [22] used an improved D* Lite algorithm, modified local update and path extraction rules based on gridded sea ice concentration and thickness data, to conduct dynamic path planning for ships in Arctic waters.
However, a significant limitation of these existing approaches is their reliance on simplified or generalized ice models, often based on satellite data of sea ice concentration or thickness. These methods lack attention to the actual distribution positions of sea ice, which are crucial for fine-grained real-ship path planning. Our research addresses this gap by proposing a novel framework that, for the first time, utilizes a vision-based approach to construct a high-fidelity, realistic ice field model, enabling more precise and safer navigation planning. This study presents a comprehensive solution by integrating advanced sea ice detection with intelligent path planning to address the critical safety challenge posed by ship-ice collisions in polar navigation. Our approach leverages YOLOv5-ICE, an incremental YOLOv5 improvement that incorporates three key modules: Squeeze-and-Excitation attention mechanisms, improved spatial pyramid pooling, and Flexible ReLU activation. This model achieves superior performance in sea ice target identification, especially in terms of detecting small floes in high-concentration ice regions. The detected ice parameters then inform our Any-Angle Path Planning algorithm that optimizes routes based on multiple safety factors including path length, maneuver complexity, which is measured by turns, and ice collision risk. This integrated detection-planning framework represents a strong engineering contribution in polar navigation safety, providing ships with reliable, ice-adaptive routing solutions.

2. Sea Ice Image Detection Algorithm

2.1. Model of YOLOv5

YOLO is an end-to-end, deep-learning-based target detection algorithm whose fast detection speed can meet real-time requirements when ships sail in polar regions [23]. The model structure is simple, efficient, and highly scalable, and its detection accuracy for small targets is relatively high, which suits the identification of small sea ice targets in remote sensing images of complex polar environments. The YOLOv5 network mainly consists of the Input, Backbone, Neck, and Head, as shown in Figure 1. The Input layer takes images of 640 × 640 pixels; the Backbone adopts the CSPDarknet53 structure and gradually extracts sea ice features through multiple sets of convolution, residual connection, and pooling operations; the Neck combines the Feature Pyramid Network (FPN) and Path Aggregation Network (PAN) to fuse features at different levels, addressing the feature matching problem caused by scale differences among sea ice targets; and the Head outputs the category probability, confidence, and bounding box coordinates of sea ice targets, completing their localization and recognition in remote sensing images.
The input side pre-processes the input image, including resizing and normalization, to ensure the consistency of the input data. The input images are scaled uniformly to the standard input size required by the network. To avoid distortion or information loss during scaling, the preprocessing maintains the aspect ratio of the image: the aspect ratio is calculated first, the image is scaled to the standard size according to this ratio, and the remaining blank area is filled with gray bars.
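The aspect-preserving resize described above can be sketched as a small geometry calculation. This is a minimal sketch, not the YOLOv5 implementation: the function name `letterbox_params`, the even split of the padding between both sides, and the 1200 × 800 example image are assumptions made for illustration; only the 640 × 640 target size comes from the text.

```python
def letterbox_params(width, height, target=640):
    """Scale and padding for an aspect-preserving resize to target x target.

    Hypothetical helper: it only computes the geometry, it does not touch pixels.
    """
    # One scale factor for both axes keeps the aspect ratio unchanged.
    scale = min(target / width, target / height)
    new_w, new_h = round(width * scale), round(height * scale)
    # Whatever is left of the square canvas is filled with gray bars.
    pad_w, pad_h = target - new_w, target - new_h
    left, top = pad_w // 2, pad_h // 2  # assumption: padding split evenly
    return {
        "scale": scale,
        "size": (new_w, new_h),
        "pad": (left, top, pad_w - left, pad_h - top),  # left, top, right, bottom
    }


params = letterbox_params(1200, 800)
print(params["size"], params["pad"])  # (640, 427) (0, 106, 0, 107)
```

Bounding box labels would need the same scale and offset applied so that they stay aligned with the padded image.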
The backbone network contains the CBS, C3, and SPPF (Spatial Pyramid Pooling Fast) structures, which together form an efficient feature extraction network with low computational cost. The CBS consists of three components: Convolution, Batch Normalization, and the SiLU (Sigmoid-Weighted Linear Unit) activation function. The C3 structure consists of three standard convolution layers and Bottleneck modules; with it, features can be extracted and fused efficiently while maintaining low computational complexity. SPPF applies max pooling at different effective sizes to enlarge the receptive field. The pooled feature maps at different scales are used to construct the feature pyramid and obtain a multi-scale feature representation.
The Neck network of YOLOv5 is located between the backbone network and the output, and is responsible for fusing feature maps of different scales to enhance the feature expression capability of the network. The Neck module is composed of a Feature Pyramid Network (FPN) and a Path Aggregation Network (PAN). The FPN structure transmits high-level semantic features from top to bottom, and the PAN structure transmits low-level spatial features from bottom to top, so that the feature maps of each size contain both semantic and spatial information of the target.
The output side is the last part of YOLOv5. It outputs the target category and corrects the position of each candidate box according to the predicted position offset to obtain more accurate detection results. On the output side, a convolutional layer transforms the feature map into the final prediction results, mapping each grid cell directly to predictions that contain the bounding box position, its dimensions, and the corresponding class probability distribution. To enhance prediction accuracy, the system uses adaptively sized anchor boxes as dimensional priors; these anchor dimensions are statistically derived from the distribution of ground truth boxes in the training dataset through a clustering optimization process. The output stage also covers the loss function and Non-Maximum Suppression (NMS), which work together to improve the stability and reliability of the network.
NMS is a commonly used technique in computer vision for handling overlapping bounding boxes in object detection. Its goal is to retain the best candidate box by suppressing non-maximum ones. First, each candidate box is assigned a score, typically the detection confidence. The Intersection over Union (IoU), the ratio of the intersection area to the union area of two boxes, measures how much they overlap. Next, all candidate boxes B are sorted in descending order of score, and the box with the highest score is selected and retained. The IoU of each remaining box with the retained box is then calculated. If the IoU exceeds a predefined threshold M, the box is discarded; otherwise, it is kept. The procedure repeats on the kept boxes until none remain, so that the final set contains only the highest-scoring boxes, with pairwise overlap no greater than M.
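The NMS procedure can be illustrated with a short greedy implementation. This is a minimal sketch under stated assumptions: boxes are given as (x1, y1, x2, y2) corner coordinates, candidates are ranked by a confidence score, and the names `iou` and `nms` as well as the three example boxes are hypothetical.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)  # highest-scoring remaining box is retained
        keep.append(best)
        # Discard every remaining box that overlaps the retained one too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # kept == [0, 2]: box 1 overlaps box 0 with IoU 81/119
```

The list-based loop is quadratic in the number of boxes, which is acceptable for illustration; production detectors typically rely on a vectorized routine.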
For the sliding window approach, adjacent image segments have a 15% overlap, as shown in Figure 2. This overlap ensures that every region of the image is fully covered for detection. Although this approach introduces some redundant detections, they can be filtered out using NMS. Here, the NMS threshold is set to 0.5, so that boxes whose IoU with a higher-scoring box exceeds this threshold are filtered out. After processing the entire image, the detection results from each cropped segment are merged to produce the final detection outcome. The NMS algorithm is utilized to filter the best result from multiple prediction frames and eliminate redundant detection [24].
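The tiling step can be sketched as follows. This is an illustrative sketch only: the function names, the choice to place the last tile flush with the image edge, and the 1000 × 1000 example image are assumptions; the 15% overlap comes from the text and the 416-pixel tile size from the example given in Section 2.3.1.

```python
def tile_origins(length, tile, overlap=0.15):
    """Start offsets of tiles along one axis with a fractional overlap."""
    if length <= tile:
        return [0]
    stride = max(1, int(tile * (1 - overlap)))
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # assumption: last tile sits flush with the edge
    return starts


def tiles(width, height, tile=416, overlap=0.15):
    """All tile windows (x1, y1, x2, y2) covering a width x height image."""
    return [(x, y, x + tile, y + tile)
            for y in tile_origins(height, tile, overlap)
            for x in tile_origins(width, tile, overlap)]


def to_global(box, origin):
    """Shift a box detected inside a tile back to full-image coordinates."""
    ox, oy = origin
    return (box[0] + ox, box[1] + oy, box[2] + ox, box[3] + oy)


print(tile_origins(1000, 416))   # [0, 353, 584]
print(len(tiles(1000, 1000)))    # 9
```

After every tile has been processed, the boxes mapped with `to_global` can be merged and passed through a single NMS pass to remove the duplicates created in the overlap zones.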

2.2. The Optimization of the YOLOv5

The main difference between remote sensing satellite image recognition and general image recognition is that its image size is huge, while the target size in it is small and usually clustered together, resulting in recognition difficulties, as shown in Figure 3 [25]. The sea ice targets to be recognized are very small and difficult to recognize relative to the large size of the remote sensing image, and in areas of dense sea ice, multiple pieces of sea ice may also be close to each other or partially obscured.
To address the aforementioned challenges in remote sensing sea ice image recognition, we implemented three targeted modifications to the YOLOv5 algorithm to enhance its performance in detecting sea ice targets of varying sizes.
First, we integrated a Squeeze-and-Excitation Network (SE) attention mechanism into the backbone network. In complex remote sensing scenarios, different feature channels exhibit varying degrees of importance for sea ice detection. The SE mechanism explicitly models inter-channel dependencies, dynamically weighting each feature channel to selectively enhance ice-relevant characteristics (e.g., texture and boundary features) while suppressing background interference. This architectural modification enables the network to significantly improve its representational capacity and generalization performance.
Second, we enhanced the original Spatial Pyramid Pooling Fast (SPPF) module by developing the SPPCSPC-F structure. The substantial size variation in sea ice targets, ranging from small floes to extensive ice fields, presents significant multi-scale detection challenges. Our improved SPPCSPC-F architecture facilitates more effective fusion of multi-scale features, thereby strengthening the model’s capability to represent sea ice characteristics while maintaining detection accuracy across different target sizes.
Finally, we substituted the original SiLU activation function with a Flexible Rectified Linear Unit (FReLU). Given the importance of subtle pixel-level information in sea ice imagery, FReLU’s ability to preserve negative value information proves particularly advantageous compared to SiLU. This modification reduces information loss during feature extraction, enabling the model to learn more discriminative representations, which is a critical improvement for accurate ice detection in low-contrast or obscured regions.

2.2.1. Squeeze-And-Excitation Networks (SE) Attention Mechanism

Attention mechanisms mainly contain three types: spatial, channel, and hybrid domains. The SE model is a typical representative of the channel domain attention mechanism, focusing on the adaptive weight assignment of feature channels. In the working mechanism of this model, the feature map first undergoes compression in the spatial dimension, and then different weight values are applied to each feature channel, which characterizes the relative importance of the information carried by each channel [26]. The initial feature map is recalibrated according to the obtained weights to realize the enhancement of critical feature channels and the suppression of non-critical channels.
Because the remote sensing image is large and the target sea ice is small, key information (such as the textures and edge contours of sea ice) is easily lost during identification, and small sea ice targets are missed. For this reason, the SE attention mechanism is added in layer 9 of the YOLOv5 backbone network. The core of this module is the two steps of Squeeze and Excitation, which adaptively adjust the importance of each channel by learning weights, so that the neural network can better capture the features of the sea ice image and improve its performance without adding too much computational burden. The structure of the SE attention mechanism is shown in Figure 4.
In step 1, features are extracted from the image by convolution. Step 2 is the squeeze phase (Fsq), which applies global average pooling over the H and W dimensions of the feature maps. Step 3 is the excitation phase (Fex), in which a series of fully connected layers act on the output of the squeeze phase and generate a channel attention vector. The final step is the rescaling operation (Fsc), which multiplies each channel of the feature map by its weight. Through Fsq and Fex, the SE attention mechanism adaptively learns the importance of each channel. When a weight increases, the values of the corresponding feature map increase and its influence on the output grows; conversely, when the weight decreases, the associated values decrease. This computational process improves the network’s ability to express and distinguish the features of the target regions and enhances the performance of the algorithm in detecting small targets.
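The three operations Fsq, Fex, and Fsc can be written out in a few lines of NumPy. This is a minimal sketch of a generic SE block, not the trained layer of YOLOv5-ICE: the two-layer excitation with ReLU and sigmoid follows the standard SE design, biases are omitted, and the channel count, reduction ratio, and random weights are placeholders chosen for illustration.

```python
import numpy as np


def se_block(x, w1, w2):
    """Squeeze-and-Excitation on one feature map.

    x:  (C, H, W) feature map
    w1: (C // r, C) weights of the first fully connected layer
    w2: (C, C // r) weights of the second fully connected layer
    Returns the recalibrated map and the per-channel weights.
    """
    z = x.mean(axis=(1, 2))                  # Fsq: global average pool -> (C,)
    h = np.maximum(w1 @ z, 0.0)              # Fex, part 1: FC + ReLU
    s = 1.0 / (1.0 + np.exp(-(w2 @ h)))      # Fex, part 2: FC + sigmoid, in (0, 1)
    return x * s[:, None, None], s           # Fsc: scale each channel by its weight


rng = np.random.default_rng(0)
channels, reduction = 8, 4                   # hypothetical sizes
x = rng.standard_normal((channels, 16, 16))
w1 = rng.standard_normal((channels // reduction, channels))
w2 = rng.standard_normal((channels, channels // reduction))
out, weights = se_block(x, w1, w2)
```

Because the weights come from a sigmoid, each channel is scaled by a factor between 0 and 1; channels with larger weights are attenuated less and therefore contribute more to the following layers.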

2.2.2. SPPCSPC-F Spatial Pyramid Pooling

By introducing multi-scale feature pooling methods, YOLOv5 has significantly improved its ability to detect objects of different sizes. The SPPF is an improvement in the structure of SPP. The SPPF further optimizes computational efficiency by utilizing a single shared pooling layer for multi-scale aggregation, reducing computational overhead while maintaining feature richness. Compared to traditional methods, these improvements enhance the ability to detect both small and large objects with lower computational costs.
The convolutional block CBL in the internal structure of SPP consists of Convolution, Batch Normalization, and Leaky ReLU. The structure of SPPF is superior to that of SPP in that its max pooling layers are connected in series rather than in parallel, which speeds up the computation. The structure of SPPF is shown in Figure 5. Its convolutional block CBS consists of Convolution, Batch Normalization, and the SiLU activation function, and it contains three consecutive max pooling layers of size 5 × 5, which yield receptive fields at three different scales.
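Why three consecutive 5 × 5 pooling layers yield three scales can be checked on a one-dimensional signal: with stride 1, chaining two pools of size 5 covers the same window as one pool of size 9, and chaining three covers the same window as one pool of size 13. The sketch below is illustrative only; the helper name `max_pool_1d`, the border handling (windows clipped at the edges), and the sample values are assumptions.

```python
def max_pool_1d(values, k):
    """Stride-1 max pooling; the window is clipped at the borders."""
    r = k // 2
    n = len(values)
    return [max(values[max(0, i - r): min(n, i + r + 1)]) for i in range(n)]


signal = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3]

# Serial pooling, as in SPPF: each stage reuses the previous result.
p1 = max_pool_1d(signal, 5)
p2 = max_pool_1d(p1, 5)
p3 = max_pool_1d(p2, 5)

# Parallel pooling with larger kernels gives the same outputs.
same_as_9 = p2 == max_pool_1d(signal, 9)
same_as_13 = p3 == max_pool_1d(signal, 13)
print(same_as_9, same_as_13)  # True True
```

The serial form reaches the larger windows while only ever evaluating small 5-wide windows, which is where the speed advantage comes from.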
Although the max pooling operation can expand the receptive field to obtain rich contextual information, this nonlinear downsampling reduces the spatial resolution of the feature maps and may lose some of the discriminative information in the original feature maps, making the detection of small targets less effective. This pooling structure can also lead to overfitting, which needs to be avoided by using more training data or regularization.
In summary, the dual-branch architecture of this module effectively reconciles the competing demands of multiscale feature extraction and spatial detail preservation. The parallel processing pathways enable concurrent capture of both macroscopic ice field distribution patterns and microscopic floe characteristics. Through optimized pooling operations and feature fusion strategies, the module maintains computational efficiency while achieving these objectives [27].
To solve the above problems of SPPF, the advantages of the SPPCSPC structure from the YOLOv7 network model are integrated to improve SPPF, yielding the SPPCSPC-F module. In this module, the SPP structure inside SPPCSPC is replaced with the SPPF structure because SPPF offers better accuracy and speed. The module is placed in layer 10 of the YOLOv5 backbone network, and its structure is shown in Figure 6.
The SPPCSPC-F module first splits the input feature map into two branches by channel. One branch undergoes max pooling at three different scales to obtain multi-scale sea ice information. The other branch passes directly through a 1 × 1 convolution to maintain the original resolution. The features of the two branches are then concatenated along the channel dimension, so that multi-scale features are obtained while the original detail information is retained, and finally two convolutions further fuse the features. The SPPCSPC-F module modifies the order of the max pooling operations, which retains details while keeping the receptive field unchanged and strengthens feature expression and fusion. The introduction of the SPPCSPC-F module enhances the network’s capability to integrate multi-scale sea ice features while preserving crucial edge detail information. This module optimizes the feature extraction process, enabling more precise identification of diverse sea ice targets in satellite imagery, ranging from fragmented ice floes to continuous ice fields.

2.2.3. FReLU Activation Function

The original YOLOv5 algorithm uses the SiLU activation function. This activation function has some limitations: when the input value is far below 0, the derivative of SiLU tends to 0, which leads to the vanishing gradient problem. Vanishing gradients hinder effective backpropagation, which can make it difficult for the network to converge or cause instability in training, reducing training efficiency and even leading to loss of information.
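The saturation behavior of SiLU can be checked numerically. The sketch below uses the standard definition SiLU(x) = x · σ(x) and its analytic derivative σ(x)(1 + x(1 − σ(x))); the function names and the sample inputs are chosen for illustration. It shows that the gradient vanishes on the strongly negative side, whereas for large positive inputs it approaches 1.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def silu(x):
    """SiLU(x) = x * sigmoid(x)."""
    return x * sigmoid(x)


def silu_grad(x):
    """Analytic derivative: sigmoid(x) * (1 + x * (1 - sigmoid(x)))."""
    s = sigmoid(x)
    return s * (1.0 + x * (1.0 - s))


for value in (-10.0, -2.0, 0.0, 2.0, 10.0):
    print(f"x = {value:6.1f}   SiLU = {silu(value):9.5f}   dSiLU/dx = {silu_grad(value):9.5f}")
```

At x = 0 the derivative is exactly 0.5, and at x = −10 its magnitude is already below 0.001, so features that land deep in the negative range pass almost no gradient back.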
In this paper, we replace the SiLU activation function with the FReLU activation function, which is more suitable for the target recognition task. FReLU is a funnel-type function derived from ReLU: it extends ReLU by adding a spatial condition, which expands the activation to two dimensions. It is relatively simple to implement and adds only a small computational overhead [28]. The structure of the two activation functions is shown in Figure 7.
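The spatial condition can be made concrete with a small NumPy sketch of the funnel form y = max(x, T(x)), where T(x) is a per-channel (depthwise) 3 × 3 convolution. This is a sketch under assumptions rather than the layer used in YOLOv5-ICE: the 3 × 3 window, zero padding, and the omission of batch normalization after the convolution are simplifications, and the function names are hypothetical.

```python
import numpy as np


def depthwise_conv3x3(x, kernels):
    """Per-channel 3x3 cross-correlation with zero padding and stride 1.

    x: (C, H, W) feature map, kernels: (C, 3, 3), one kernel per channel.
    """
    _, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x, dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernels[:, dy, dx][:, None, None] * padded[:, dy:dy + h, dx:dx + w]
    return out


def frelu(x, kernels):
    """Funnel activation: elementwise max of x and its spatial condition T(x)."""
    return np.maximum(x, depthwise_conv3x3(x, kernels))


rng = np.random.default_rng(0)
x = rng.standard_normal((2, 5, 5))
zero_k = np.zeros((2, 3, 3))
identity_k = zero_k.copy()
identity_k[:, 1, 1] = 1.0

relu_like = frelu(x, zero_k)        # T(x) = 0, so this reduces to ReLU
unchanged = frelu(x, identity_k)    # T(x) = x, so the input passes through
```

The two special cases show how the learnable kernels interpolate between a plain ReLU and a pass-through: with trained kernels, a negative activation can survive when its spatial neighborhood supports it.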
This replacement of FReLU brings several specific benefits, including significantly faster training speed, enhanced ability to handle small objects or complex data, and reduced resource consumption, making it more suitable for deployment in resource-constrained environments. The FReLU activation function incorporates learnable parameters that allow the network to adaptively adjust the shape of the function through learning. This flexibility enhances the learning ability of the model and better adapts to the characteristics of sea ice images, and the advantages of the FReLU activation function in nonlinear transformation and feature enhancement can improve the performance of the target recognition model.
In this paper, combining the SE attention mechanism, the SPPCSPC-F spatial pyramid pooling module, and the FReLU activation function improves the performance of the model on specific tasks. The model can learn more important feature representations, focus on key object regions, mitigate the overfitting problem, and generalize better when recognizing small sea ice in remote sensing images. The synergistic effect of the three enables the model to obtain better recognition performance under limited data conditions.

2.2.4. Optimizing the Overall YOLOv5 Framework

The SE attention mechanism is added to the 9th layer of the backbone network, and the spatial pyramid pooling structure is improved in the 10th layer, while the SiLU activation function is replaced with the FReLU activation function, and correspondingly, the CBS layer of the network is changed to the CBF layer to optimize the structure. The YOLOv5 model is shown in Figure 8.
The optimization of YOLOv5 increases the depth and size of the network, which adds some computational overhead, but as the depth of the network increases, the model can learn more complex and abstract representations of sea ice features, improving the accuracy of the algorithm. Table 1 shows the parameters of the network before and after YOLOv5 optimization.

2.3. Experimental Results

2.3.1. Construction of the Data Set

This study utilizes multi-source satellite imagery to ensure robust sea ice detection under varying real-world conditions. Three primary datasets were selected to incorporate variations in spatial resolution, weather conditions, lighting environments, and geographical locations. The first dataset comprises 256 × 256 pixel RGB images from the NWPU dataset published by Northwestern Polytechnical University, with specific technical parameters detailed in reference [29]. The second dataset includes Arctic sea ice imagery acquired from Google Earth (http://earthengine.google.com/) with a spatial resolution of 0.5 m in the RGB spectral bands, formatted as 600 × 600 pixel images. The third dataset consists of multi-sensor remote sensing images provided by the Norwegian University of Science and Technology (NTNU), featuring data from three different satellite sensors across four polarization modes. Representative samples from these sea ice remote sensing datasets are illustrated in Figure 9.
Because the datasets originate from diverse geographical regions and spectral sources, they differ inherently in resolution and lighting conditions. To address these challenges and simultaneously increase the number of training samples, we employed several data augmentation methods. These included the Mosaic operation, perspective transformation, left-right flipping, and rotation, as illustrated in Figure 10.
The Mosaic operation was a critical part of this process. Its main idea is to enrich the dataset by randomly cropping and scaling multiple images before stitching them together into a single, new image. This approach forces the model to learn and adapt to a wide variety of scales and contexts, improving its robustness and generalization capabilities in real-world scenarios.
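A simplified version of the stitching step is sketched below in NumPy. It is not the augmentation code used for training: it only rescales four images into the four quadrants around a random split point, omits random cropping and the remapping of bounding box labels, and uses nearest-neighbor resampling. The function name, the gray fill value of 114, and the constant-valued test images are assumptions for illustration.

```python
import numpy as np


def mosaic(images, size=640, fill=114, rng=None):
    """Stitch four H x W x 3 uint8 images into one size x size mosaic."""
    if len(images) != 4:
        raise ValueError("mosaic expects exactly four images")
    if rng is None:
        rng = np.random.default_rng()
    # Random split point, kept away from the borders so no quadrant is empty.
    cx = int(rng.integers(size // 4, 3 * size // 4))
    cy = int(rng.integers(size // 4, 3 * size // 4))
    canvas = np.full((size, size, 3), fill, dtype=np.uint8)
    regions = [(0, 0, cx, cy), (cx, 0, size, cy),
               (0, cy, cx, size), (cx, cy, size, size)]
    for img, (x1, y1, x2, y2) in zip(images, regions):
        h, w = y2 - y1, x2 - x1
        # Nearest-neighbor rescale of the whole image into its quadrant.
        rows = np.arange(h) * img.shape[0] // h
        cols = np.arange(w) * img.shape[1] // w
        canvas[y1:y2, x1:x2] = img[rows][:, cols]
    return canvas


sources = [np.full((600, 600, 3), v, dtype=np.uint8) for v in (10, 20, 30, 40)]
result = mosaic(sources, rng=np.random.default_rng(0))
```

Because the split point changes on every call, each source image appears at a different scale and position, which is the property that exposes the model to varied object sizes and contexts.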
The remote sensing sea ice dataset was constructed through de-duplication, manual labeling, and auditing. The label name is “ice”, the number of images is 600, and the number of labels is 15,948, covering all ice within the dataset, so that more diverse samples participate in model training. Annotations were created using the LabelImg v.1.8.6 tool to support the precision of the 15,948 annotations.
The number of labels is large relative to the number of images because a large remote sensing image contains many densely distributed sea ice targets. Therefore, a sliding window was used to cut images of a specified size (such as 416 × 416 px) as the input. YOLOv5 ensures no data leakage during dataset splitting by setting random seeds, strictly dividing training, validation, and test data paths, checking for duplicate file paths, and using hash verification mechanisms. Additionally, it allows manual specification of data paths to strictly control data distribution, ensuring that the training, validation, and test sets do not overlap. In this way, the dataset is split into training, validation, and test sets in a 7:2:1 ratio, providing a broad feature set for the detection model and contributing to higher accuracy.
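A seeded, non-overlapping 7:2:1 split of the 600 images can be sketched as follows. This is an illustration, not the splitting code used in the study: the function name, the file names, and the seed value are hypothetical, and integer ratios are used so that the subset sizes do not depend on floating-point rounding.

```python
import random


def split_dataset(items, ratios=(7, 2, 1), seed=0):
    """Shuffle deterministically and split into train, validation, and test sets."""
    items = sorted(set(items))           # drop duplicate paths, fix the order
    random.Random(seed).shuffle(items)   # same seed gives the same split
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]       # remainder goes to the test set
    return train, val, test


paths = [f"img_{i:04d}.png" for i in range(600)]  # hypothetical file names
train, val, test = split_dataset(paths)
print(len(train), len(val), len(test))  # 420 120 60
```

Since the three subsets are slices of one shuffled list of unique paths, no image can appear in more than one of them.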

2.3.2. Parameterization of the YOLOv5

The training parameters are set as follows: the number of iterations is 300, the initial learning rate is 0.001, the momentum parameter is 0.9, the weight decay parameter is 0.0005, and the non-maximum suppression (NMS) threshold is 0.5. Evaluation is carried out every 30 rounds of training. The F1 value is used as a comprehensive index that balances precision P and recall R to evaluate the quality of the optimized model; a larger F1 value indicates a higher-quality model. The detailed hardware and software parameters are shown in Table 2. Specific training parameters are given in Table 3.
Average Precision (AP) and mean Average Precision (mAP) are generally used in the field of target recognition to evaluate the quality of algorithms. Precision (P) and Recall (R) are used to plot the Precision-Recall (PR) curve, and mAP is calculated by integrating it; P quantifies the effectiveness of sample classification and R quantifies the ability to detect positive samples [30].
Precision P is the proportion of correctly identified samples among all samples predicted as positive. Recall R is the proportion of correctly identified samples among all actual positive samples. The average precision AP is a more comprehensive measure of the model: with recall R as the horizontal coordinate and precision P as the vertical coordinate, the PR curve is plotted, and the area enclosed by the PR curve and the horizontal axis is the AP. The mAP is generally calculated in two steps: first, the AP of each category in the dataset is calculated; second, the APs of all categories are summed and averaged. Overall, a good target recognition model should have both high precision P and high recall R and, in turn, a high mAP value. The relevant equations are shown in Equations (1)–(5) [31].
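$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{1}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{2}$$
$$F1 = \frac{2PR}{P + R} \tag{3}$$
$$AP = \int_{0}^{1} P(R)\,dR \tag{4}$$
$$mAP = \frac{\sum_{i=1}^{k} AP_i}{k} \tag{5}$$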
where True Positives (TP) is the number of samples correctly categorized as positive, False Positives (FP) is the number of samples incorrectly categorized as positive, True Negatives (TN) is the number of samples correctly categorized as negative, and False Negatives (FN) is the number of samples incorrectly categorized as negative.
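As an illustration of Equations (1)–(5), the following Python sketch computes the metrics from detection counts and from a sampled PR curve. The function names and the example numbers are hypothetical, and the area under the PR curve is approximated here with the trapezoidal rule, which is an assumption; detection toolkits often use interpolated variants instead.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from detection counts (Equations (1)-(3))."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


def average_precision(recalls, precisions):
    """Area under the PR curve by the trapezoidal rule (Equation (4)).

    `recalls` must be sorted in increasing order; the trapezoidal rule is an
    assumption of this sketch, not necessarily the rule used in the paper.
    """
    area = 0.0
    for i in range(1, len(recalls)):
        width = recalls[i] - recalls[i - 1]
        area += width * (precisions[i] + precisions[i - 1]) / 2
    return area


def mean_average_precision(ap_per_class):
    """Mean of the per-class AP values (Equation (5))."""
    return sum(ap_per_class) / len(ap_per_class)


# Hypothetical counts and PR samples, chosen only to illustrate the calls.
p, r, f1 = precision_recall_f1(tp=70, fp=30, fn=30)
ap = average_precision([0.0, 0.5, 1.0], [1.0, 0.8, 0.4])
m_ap = mean_average_precision([ap])
```

With a single sea ice class, as in the last line, the mAP reduces to the AP of that class.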

2.3.3. Ablation Experiments

To validate the model, ablation experiments were set up. The results of the ablation experiments are shown in Table 4 and Figure 11.
The ablation results show that the original YOLOv5 has an mAP of 0.719, the lowest of all the evaluated models. Adding the SE attention mechanism improves the mAP by 1.9 percentage points to 0.738. To compare alternative optimizations of the multi-scale spatial pyramid pooling, the Simplified SPPF (SimSPPF) was also tested; it improves the mAP by only 0.6 percentage points to 0.725. Although SimSPPF gives a small improvement over YOLOv5, it is not as good as the SPPCSPC-F selected in this paper: adding SPPCSPC-F spatial pyramid pooling improves the mAP by 2.4 percentage points to 0.743, although its R value is relatively low at 0.688. Replacing the activation function with FReLU improves the mAP by 2.8 percentage points to 0.747. When the three optimizations are used simultaneously, the mAP improves by 3.5 percentage points to 0.754, the best result in the ablation experiments. Similarly, the P, R, and F1 values of the original YOLOv5 are 0.719, 0.684, and 0.701, respectively. After optimization, the P, R, and F1 values are 0.753, 0.703, and 0.727, improvements of 3.4, 1.9, and 2.6 percentage points, respectively.
As shown in Figure 11, the yellow curve represents the mAP with only the SE module added, which has the lowest peak compared with the other three curves. The red curve represents the mAP of the fully optimized model, which performs best. The results show that the improved YOLOv5 has higher accuracy in recognizing remote sensing sea ice images.

2.3.4. Comparison Experiments

To further validate the effectiveness of the optimized YOLOv5, a comparison experiment is set up. The optimized algorithm is compared with the original YOLOv5 and other target detection models, namely Faster-RCNN, YOLOv3, YOLOv4, and the newer YOLOv8. The Loss and mAP values of each model are calculated, and the comparison results are shown in Figure 12.
From the trend of the Loss value, it can be seen that in the first 40 epochs the Loss value of each model decreases rapidly, which indicates that the network is rapidly learning the features of the sea ice and the training has not yet reached the stable stage. After 200 epochs, the training gradually stabilizes; the optimized model has a lower Loss value than the other algorithms, which indicates that the optimized YOLOv5 converges quickly. All algorithms converge by 250 epochs. In the stable phase, the optimized model has a lower Loss value and a higher mAP value, which indicates better generalization ability and detection performance of YOLOv5-ICE compared with the other models.
The individual trained models are evaluated, and the results are compared in Table 5. Compared with Faster-RCNN, YOLOv3, YOLOv4, YOLOv5, and YOLOv8, the mAP of the YOLOv5-ICE model is higher by 15, 10.6, 9.9, 3.5, and 1.3 percentage points, respectively. Among them, YOLOv3 has the lowest mAP of 0.604, and the optimized algorithm has the highest mAP of 0.754. YOLOv3, YOLOv4, and YOLOv8 have higher precision P but lower recall R. Their F1 values of 0.552, 0.636, and 0.617 indicate that these three models miss some ice targets (under-detection), whereas YOLOv5-ICE has the highest F1 value of 0.727. The comparison experiments show that YOLOv5-ICE can better detect sea ice targets in remote sensing images.
The large-size sea ice remote sensing images are recognized using YOLOv5-ICE. Since the sea ice targets are very dense, the confidence labels are hidden in the resulting figure to better show the recognition effect. The recognition results of the algorithm before and after the optimization, together with the local zoomed-in area, are shown in Figure 13. In the same areas, the original YOLOv5 recognizes 14 and 55 ice floes, whereas YOLOv5-ICE recognizes 53 and 88. The number of recognized floes thus increases by 39 and 33, respectively, and most of the additional detections are small targets that are difficult to detect.
The results of remote sensing sea ice image recognition can provide environmental information for ship path planning, and the detailed sea ice information provides reliable input data for the ship path planning algorithm, which helps to generate the optimal path plan that is more in line with the actual ice conditions.

3. Polar Ship Route Planning Based on YOLOv5-ICE

3.1. Grid Mapping for Polar Ship Path Planning Using YOLOv5-ICE

The output of target recognition is used as the input for map construction: the sea ice distribution results are extracted, and raster maps are constructed. After the optimized YOLOv5 algorithm recognizes the sea ice targets in the remote sensing image, the prediction boxes of the sea ice are extracted from the recognition result. These boxes carry the distribution information of the sea ice, including its positional coordinates, and also characterize the size of each ice floe.
Due to the focus on cost and timeliness in path planning research, bounding boxes are the preferred choice. Bounding box annotations are generally faster than pixel-level annotations, and bounding box methods consume fewer resources than semantic segmentation methods, with faster convergence. This approach strikes a balance between annotation speed, model processing time, and resource consumption. After obtaining the sea ice distribution results, the sea ice information is used as an input for the construction of the path planning map, and the process is shown in Figure 14.
It is shown in Figure 14 that the original remote sensing image consists of sea ice and seawater. A satellite image of sea ice is converted into a 7 × 7 grid. The sea ice identified by the rectangular box is represented by black cells, while white cells represent seawater.
The output of the optimized YOLOv5 recognition of remote sensing sea ice is used as the input for the construction of path planning maps, and raster maps corresponding to the actual environment are built for the ice area in which the ship navigates. The prediction bounding box from target detection corresponds to the minimum enclosing rectangle of a sea ice formation. Boundary inflation processing is then applied around the sea ice obstacles, which effectively fills discontinuous grid regions. The sea ice boundary inflation methodology is illustrated in Figure 15.
Boundary inflation ensures that a sufficient safety distance is left between the ship and the sea ice, so that the ship can move without collision, and it reduces the potential collision risk caused by the edges of obstacles [32]. The black grid in the figure indicates the location and size of the sea ice, and the white grid indicates the open water area through which the ship can pass.
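The boundary inflation step can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function name, the encoding (1 for sea ice, 0 for open water), and the one-cell safety margin are assumptions made for the example.

```python
def inflate_obstacles(grid, margin=1):
    """Return a copy of `grid` with every ice cell grown by `margin` cells.

    grid: list of rows, 1 = sea ice, 0 = open water (assumed encoding).
    """
    rows, cols = len(grid), len(grid[0])
    inflated = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1:
                continue
            # Mark the square neighbourhood around each original ice cell.
            for rr in range(max(0, r - margin), min(rows, r + margin + 1)):
                for cc in range(max(0, c - margin), min(cols, c + margin + 1)):
                    inflated[rr][cc] = 1
    return inflated


# A 7 x 7 map, the grid size described for Figure 14, with one ice cell.
grid = [[0] * 7 for _ in range(7)]
grid[3][3] = 1
safe_grid = inflate_obstacles(grid, margin=1)
```

Reading from the original grid while writing to a copy keeps the inflation from cascading, so each floe grows by exactly the chosen margin.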

3.2. Path Planning Algorithms and Objective Functions

Theta* is an improved algorithm based on the A* algorithm; both are path-planning algorithms that operate on raster maps. Although the A* algorithm can plan the shortest path, this path is only the shortest relative to the raster map and does not fully match the shortest path in the actual physical environment [33]. Because the A* algorithm is constrained by the raster map, the resulting path has only 4 angles of movement and follows the edges of the grid, so it deviates considerably from the actual shortest path. In ship path planning, to meet the requirements of high efficiency and maneuverability, the ship should make smooth turns as far as possible. The Theta* algorithm is not constrained by grid boundaries and can therefore better solve this problem [34].
The main difference between the Theta* algorithm and the A* algorithm is that, when expanding a node, Theta* performs a Line Of Sight (LOS) check before determining the next waypoint, so that the direction of the path is not constrained by the raster when generating waypoints. The LOS detection algorithm is shown in Figure 16.
The red path in the figure is the path planned by the A* algorithm, and the blue path is the path planned by the Theta* algorithm. It can be seen that the path generated by the Theta* algorithm is smoother, with fewer turns, which is more suitable for the navigation scenarios of the ship in the polar region. The Theta* algorithm ensures optimality, while the speed of the path searching and the environment adaptability are both improved based on the A* algorithm.
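To make the role of the LOS check concrete, the following Python sketch implements a basic Theta* search on a small occupancy grid. It is an illustrative sketch under stated assumptions, not the implementation used in this paper: cells are treated as nodes, moves are 8-connected, the LOS test traces Bresenham's line between cell centers, and diagonal corner cutting is not checked. All function names are hypothetical.

```python
import heapq
import math


def line_of_sight(grid, a, b):
    """True if Bresenham's line between cells a and b touches no ice cell (1)."""
    (r0, c0), (r1, c1) = a, b
    dr, dc = abs(r1 - r0), abs(c1 - c0)
    sr = 1 if r1 >= r0 else -1
    sc = 1 if c1 >= c0 else -1
    err = dr - dc
    r, c = r0, c0
    while True:
        if grid[r][c] == 1:
            return False
        if (r, c) == (r1, c1):
            return True
        e2 = 2 * err
        if e2 > -dc:
            err -= dc
            r += sr
        if e2 < dr:
            err += dr
            c += sc


def theta_star(grid, start, goal):
    """Basic Theta* on an 8-connected grid; returns waypoints or None."""
    rows, cols = len(grid), len(grid[0])
    g = {start: 0.0}
    parent = {start: start}
    open_heap = [(math.dist(start, goal), start)]
    closed = set()
    while open_heap:
        _, s = heapq.heappop(open_heap)
        if s in closed:
            continue
        if s == goal:
            # Walk the parent links back to the start.
            path = [s]
            while parent[s] != s:
                s = parent[s]
                path.append(s)
            return path[::-1]
        closed.add(s)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                n = (s[0] + dr, s[1] + dc)
                if n == s or not (0 <= n[0] < rows and 0 <= n[1] < cols):
                    continue
                if grid[n[0]][n[1]] == 1 or n in closed:
                    continue
                # Key difference from A*: if the parent of s can see n,
                # connect n to that parent directly and skip s.
                p = parent[s]
                base = p if line_of_sight(grid, p, n) else s
                new_g = g[base] + math.dist(base, n)
                if new_g < g.get(n, math.inf):
                    g[n] = new_g
                    parent[n] = base
                    heapq.heappush(open_heap, (new_g + math.dist(n, goal), n))
    return None


# Column 2 is ice except for a gap in the last row.
grid = [[1 if (c == 2 and r < 4) else 0 for c in range(5)] for r in range(5)]
path = theta_star(grid, (0, 0), (0, 4))
```

On an empty grid every cell is visible from the start, so the returned path contains only the start and the goal; with obstacles, waypoints appear only where the line of sight is broken, which is what reduces the number of turns relative to A*.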
Polar ship path planning should take into account the path distance, operation complexity, and sea ice avoidance while ensuring navigation safety. Therefore, considering the ship navigation conditions and ship performance, the objective function is established from three aspects: path distance, the number of ship turns, and the sea ice risk index. The path length is defined as shown in Equation (6) and the sea ice risk value is defined as shown in Equations (7) and (8).
$$D(L) = \sum_{n=1}^{N-1} \mathrm{dist}(f_n, f_{n+1}) \tag{6}$$
where $D(L)$ is the path length, $f_n$ is the $n$-th waypoint on the navigation path, and $\mathrm{dist}(f_n, f_{n+1})$ is the distance from point $f_n$ to point $f_{n+1}$.
$$k(i) = \sum_{i=1}^{n} f_n k \tag{7}$$
$$\sum_{j=1}^{n} \mathrm{risk}(f_n, f_{n+1}) = \frac{k(i)}{D(L)} \times 100\% \tag{8}$$
where $k(i)$ is the number of ice floes avoided and $k$ is a binary indicator: $k = 1$ when $f_n$ is a sea ice avoidance waypoint, and $k = 0$ otherwise. $\sum_{j=1}^{n} \mathrm{risk}(f_n, f_{n+1})$ is the sea ice risk value, which represents the percentage of the sea ice avoidance path length relative to the total path length.
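The three components of the objective function can be illustrated with a short Python sketch. The function names, the waypoint coordinates, and the avoidance flags below are hypothetical. The risk value is computed literally as the avoidance count divided by the path length, following Equations (7) and (8) as printed, and a turn is counted wherever two consecutive segments are not collinear; both choices are assumptions of this sketch.

```python
import math


def path_length(waypoints):
    """Sum of Euclidean distances between consecutive waypoints (Equation (6))."""
    return sum(math.dist(a, b) for a, b in zip(waypoints, waypoints[1:]))


def count_turns(waypoints):
    """Number of interior waypoints at which the heading changes."""
    turns = 0
    for a, b, c in zip(waypoints, waypoints[1:], waypoints[2:]):
        # The cross product of consecutive segments is zero when collinear.
        cross = (b[0] - a[0]) * (c[1] - b[1]) - (b[1] - a[1]) * (c[0] - b[0])
        if abs(cross) > 1e-9:
            turns += 1
    return turns


def ice_risk_value(avoidance_flags, waypoints):
    """k(i) / D(L) x 100, a literal reading of Equations (7) and (8)."""
    k_total = sum(1 for flag in avoidance_flags if flag)
    return k_total / path_length(waypoints) * 100


# Hypothetical path: two collinear legs followed by one turn at (6, 8).
waypoints = [(0, 0), (3, 4), (6, 8), (6, 10)]
flags = [False, False, True, False]  # True marks a sea ice avoidance waypoint
length = path_length(waypoints)
turns = count_turns(waypoints)
risk = ice_risk_value(flags, waypoints)
```

Keeping the three terms as separate functions makes it straightforward to report them individually, as in the result tables, or to combine them with weights if a single scalar objective is needed.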

4. Path Planning Results and Discussion

4.1. Path Planning Under Low Sea Ice Concentration

Remote sensing sea ice images covering a fixed area of 10 km × 10 km are selected and input into the YOLOv5-ICE algorithm for recognition, and the recognition results are extracted. The corresponding navigation scenarios are divided into 100 × 100 environmental grids, a total of 10,000 cells, so that each cell corresponds to 0.1 km. To display the planned path more conveniently, 1 large grid in the figure contains 10 small grids.
K-means clustering is used to obtain sea conditions with different sea ice densities, and different navigation scenarios of ships in the polar regions are simulated. The process of K-means begins by inputting a remote sensing image. Firstly, the number of clusters, denoted as k, is determined. Then, k cluster centers are initialized. The next step involves calculating the difference between the RGB values of each image pixel and the cluster centers. This difference is compared against a predefined threshold. If the difference exceeds the threshold, the process returns to the previous step to recalculate. If the difference is within the threshold, the process continues by outputting the proportion of each color in the k clusters. Subsequently, the sea ice concentration is calculated. Finally, a new remote sensing image can be inputted to repeat the entire process.
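A minimal Python sketch of this procedure is given below. It uses only the standard library and makes several assumptions that are not stated in the text: the stopping rule is that no cluster center moves by more than a threshold `tol`, the brightest cluster is taken to be sea ice, and the function names and pixel values are hypothetical.

```python
import math
import random


def assign(pixels, centers):
    """Index of the nearest center (Euclidean distance in RGB) for each pixel."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in pixels]


def kmeans_rgb(pixels, k, tol=1.0, max_iter=50, seed=0):
    """Cluster RGB pixels; stop once every center moves by at most `tol`."""
    rng = random.Random(seed)
    centers = rng.sample(pixels, k)
    for _ in range(max_iter):
        labels = assign(pixels, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(ch) / len(members) for ch in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers, assign(pixels, centers)


def ice_concentration(pixels, k=2):
    """Fraction of pixels in the brightest cluster (assumed to be sea ice)."""
    centers, labels = kmeans_rgb(pixels, k)
    ice = max(range(k), key=lambda j: sum(centers[j]))
    return labels.count(ice) / len(pixels)


# Hypothetical image: 30 bright (ice) pixels and 70 dark (water) pixels.
pixels = [(250, 250, 250)] * 30 + [(10, 30, 60)] * 70
concentration = ice_concentration(pixels)
```

For a real image, the pixel list would be read from the remote sensing scene, and k would be chosen as described above rather than fixed at 2.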
After calculating the k value through the sea ice dataset, six different navigation scenarios with varying sea ice concentrations were obtained. The sea ice concentrations of the scenarios are calculated at 5.0%, 8.3%, 15.7%, 18.4%, 25.1%, and 40.9%, respectively [30]. Based on the recognition results of remote sensing sea ice images, the corresponding raster maps are constructed, and the Theta* algorithm is used for path planning. Setting the first grid of the map as the starting point and the last grid as the endpoint, the path planning results are shown in Figure 17.
The results of the planned path for sailing scenarios under different sea ice concentrations are shown in Table 6, which contains the path length, the number of ship turns, the number of sea ice avoidance, and the sea ice risk value. The parameter comparison is visualized in Figure 18.
From the results, it can be seen that the sea ice concentration parameter significantly affects the ship’s path selection and the safety of navigation. Specifically, when the ship navigates under different sea conditions, with the increase in sea ice concentration, its path shows a more complex direction, the number of turns and the number of sea ice obstacle avoidance show an increasing trend, and the value of the sea ice risk increases. The numerical analysis results show that when the sea ice concentration increases from 5.0% to 40.9%, the length of the ship’s path increases by 3.81 km, the number of turning adjustments to avoid sea ice obstacles increases by 23 times, and the overall sea ice risk value increases by 11.6%. Changes in sea ice concentration can have a significant impact on ship navigation safety and path optimization.
Target recognition provides information about the distribution and morphology of sea ice, and path planning avoids dense sea ice areas according to this information. This paper utilizes the algorithms of target recognition and path planning, combines perception and decision-making, and finds a shorter, smoother, and safer path in the complex sea conditions in the polar region, which provides technical support for the navigation of polar ships [35,36,37].

4.2. Path Planning Under High Sea Ice Concentration

Section 4.1 analyzes path planning for sea ice concentrations up to 40.9%, the highest concentration considered there. However, boundary inflation processing presents difficulties when applied to high-concentration sea ice detection results. While this method proves effective for individual ice floes when using uniform buffers, it becomes unsuitable for high-density scenarios where the ice map consists of densely packed rectangular grids (1000 × 1000). Applying spatial expansion to these grids would classify nearly the entire map as obstructed space, represented by solid black areas, thus preventing feasible path planning.
To address this challenge, we employ a binarization process to convert the high-density ice map into a grid-based representation. Traditional grid-based navigation methods categorize each cell as either blocked or unblocked, with black cells indicating sea ice and white cells representing open water. The high-density sea-ice maps before (10 km × 10 km) and after conversion are shown in Figure 19.
In Figure 19, the image is converted to the HSV color space to detect the red borders first. By identifying the contours of the red borders, the positions of the detection boxes are obtained. Subsequently, a 1000 × 1000 gridded environment map is created. All detection boxes are iterated through, and the grids within each detection box are marked as sea ice. In the code setup, only two colors are defined: 0 for black, representing water, and 1 for white, representing sea ice. The pixel positions of the detection boxes are obtained and converted into grid indices. The row and column indices of the detection boxes in the grid map are calculated. Finally, based on the markings, the gridded sea ice distribution map is displayed, which corresponds to the binarized grid map composed of black and white colors on the right side of Figure 19.
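The conversion from detection boxes to grid indices can be sketched in Python as follows. The sketch starts from box coordinates that are already available in pixels, so the HSV border detection step is omitted; the function name, the box format `(x_min, y_min, x_max, y_max)`, and the 2000 × 2000 pixel image size are assumptions made for illustration.

```python
def boxes_to_grid(boxes, image_size, grid_size):
    """Rasterize pixel-space boxes (x_min, y_min, x_max, y_max) into a grid.

    Returns grid_size x grid_size rows with 1 = sea ice and 0 = water,
    the two-value encoding described in the text.
    """
    width, height = image_size
    grid = [[0] * grid_size for _ in range(grid_size)]
    for x_min, y_min, x_max, y_max in boxes:
        # Convert pixel coordinates to column/row indices, clamped to the map.
        c0 = max(0, int(x_min * grid_size / width))
        r0 = max(0, int(y_min * grid_size / height))
        c1 = min(grid_size - 1, int(x_max * grid_size / width))
        r1 = min(grid_size - 1, int(y_max * grid_size / height))
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                grid[r][c] = 1
    return grid


# Hypothetical 2000 x 2000 pixel image mapped onto a 1000 x 1000 grid.
ice_grid = boxes_to_grid([(100, 200, 299, 399)],
                         image_size=(2000, 2000), grid_size=1000)
```

Clamping the indices keeps boxes that touch the image border inside the map, and overlapping boxes simply mark the same cells more than once.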
Basic Theta* may fail in areas of high ice density because it models ice as impassable regions. We therefore extend the basic Theta* algorithm to non-uniform grid maps (Non-uniform Theta*), representing map traversability with non-uniform cell costs, where a layer of risk cells with diminishing grayscale values is generated.
The risk function R(c) is defined as follows to account for the spatial relationship to a threshold distance of 20 grids. For distances greater than or equal to 20 grids, the risk value remains constant at 1, indicating a stable and minimal level of risk. For distances within 20 grids, the risk increases nonlinearly as r decreases, which ensures a steeper growth of risk closer to the origin. The addition of 1 ensures a minimum risk baseline across all distances. This formulation captures the concept of escalating risk in proximity to the critical threshold while stabilizing risk at greater distances.
The traversal cost function edge represents the traversal cost associated with moving from the parent node to the child node. This formulation is particularly useful for path planning or graph traversal problems where both distance and risk must be considered.
The risk function and traversal cost function used are shown in Formulas (9)–(11) [38].
$$R(c) = \begin{cases} 1, & r > 20\ \text{grids} \\ 20\,(20 - r)^{1.2} + 1, & r \le 20\ \text{grids} \end{cases} \tag{9}$$
where r represents the distance (in grids) to a reference location.
$$\mathrm{edge}(c_p, c_{child}) = \bar{m} \cdot \mathrm{dist}(c_p, c_{child}) \tag{10}$$
$$\bar{m} = \frac{1}{N} \sum_{c_i \in L} R(c_i) \tag{11}$$
where $c_p$ represents a parent node, $c_{child}$ represents a child node, $\mathrm{dist}(c_p, c_{child})$ denotes the distance between the parent node and the child node, and $\bar{m}$ represents the average risk factor, which scales the distance to account for associated risks in the environment.
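The traversal cost can be illustrated with the Python sketch below. It takes a precomputed risk map as input and assumes that the cells averaged in the risk factor are those visited by Bresenham's line between the parent and child cells; this reading of the summation set, the function names, and the risk values are assumptions made for the example.

```python
import math


def cells_on_segment(a, b):
    """Grid cells visited by Bresenham's line from cell a to cell b (inclusive)."""
    (r0, c0), (r1, c1) = a, b
    dr, dc = abs(r1 - r0), abs(c1 - c0)
    sr = 1 if r1 >= r0 else -1
    sc = 1 if c1 >= c0 else -1
    err = dr - dc
    r, c = r0, c0
    cells = []
    while True:
        cells.append((r, c))
        if (r, c) == (r1, c1):
            return cells
        e2 = 2 * err
        if e2 > -dc:
            err -= dc
            r += sr
        if e2 < dr:
            err += dr
            c += sc


def edge_cost(risk, parent, child):
    """Mean risk over the traversed cells times the Euclidean distance."""
    cells = cells_on_segment(parent, child)
    mean_risk = sum(risk[r][c] for r, c in cells) / len(cells)
    return mean_risk * math.dist(parent, child)


# Hypothetical 5 x 5 risk map: baseline risk 1 with one riskier cell.
risk = [[1.0] * 5 for _ in range(5)]
risk[0][2] = 6.0
cost_through_risk = edge_cost(risk, (0, 0), (0, 4))  # crosses the risky cell
cost_open_water = edge_cost(risk, (1, 0), (1, 4))    # baseline risk only
```

Because the baseline risk is 1, the cost in open water equals the plain distance, so the non-uniform variant reduces to basic Theta* wherever no risk cells are crossed.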
By using the function above, Figure 17 can be converted into a raster plot with a risk index in Figure 20a. The results of the path planning are shown in Figure 20b,c.
Figure 20 shows how the image with a risk index is converted. The risk function is defined by comparing the distance from the center of each sea ice block to its edge with the radius of the sea ice. In Formula (9), a dynamic risk classification is applied to each sea ice block. The risk map is rendered as a color gradient: the closer a cell is to the center of the sea ice, the darker it appears, and the farther it is from the center, the lighter it becomes. Due to the dynamic risk index, the risk distribution varies for sea ice blocks of different sizes.
Compared with the previous images, it can be seen that if the rectangular-box grid conversion method is used for high-density sea ice images, the converted image will be all white. With a raster transformation method based on a risk function, however, one can visually judge that a path can arise between the sea ice floes. Based on Figure 20a, a polar path-planning algorithm is employed. Figure 20b,c illustrate that the route planned by the non-uniform Theta* algorithm can pass through some small ice floes but avoids larger ones. The red solid line in the image represents the collision avoidance path obtained using the algorithm. The green in the image marks the turns of the ship. As can be seen from the figure, the two paths, which have different starting points, take 39 and 40 turns, respectively.

5. Conclusions

In this paper, remote sensing sea ice identification is performed using computer vision methods. Based on the identification results, polar ship route planning is conducted to assist in decision-making for navigation. The following conclusions can be drawn.
(1)
Targeted optimization of YOLOv5 is carried out according to the characteristics of remote sensing images. This optimization includes three improvements: adding the SE attention mechanism, improving the spatial pyramid pooling structure, and replacing the activation function with FReLU, which is more suitable for the target identification task.
(2)
Ablation experiments are conducted to compare the effects of different improvement methods. When the optimization methods are applied simultaneously, the mAP improves by 3.5%. Comparisons are conducted to verify the effectiveness of the optimization algorithm by trying Faster-RCNN, YOLOv3, YOLOv4, YOLOv5, and YOLOv8. The YOLOv5-ICE achieves an mAP of 75.4%, which is 1.3% higher than that of YOLOv8, making it the best-performing model. The number of detected sea ice instances increased by 39 and 33.
(3)
Path planning for polar ships under low concentrations is based on the setting of the risk assessment grid plot. Images with high sea ice density are successfully converted by setting the risk function, so the result is no longer a plain, pure-white raster produced by rectangular box conversion. Using the Theta* algorithm, the path can be successfully simulated. It can avoid the otherwise impassable sea ice, taking 39 and 40 turns, respectively.
Although the study has successfully detected sea ice in remote sensing images and performed path planning based on it, there are still some limitations. This study focuses on the innovative integration of an improved object detection algorithm with path planning. For ship movement, an idealized approach was adopted: in high ice concentration areas, the vessel is simplified into a point-mass model, and its own factors are considered idealistically in the path planning process. In future work, we will employ image segmentation techniques to accurately extract ice floe geometries, while incorporating real-world vessel maneuvering capabilities into the framework.

Author Contributions

Conceptualization, L.Z., R.X. and J.B.; methodology, R.X., J.B. and S.H.; software, J.B. and S.H.; validation, R.X., S.H. and S.D.; formal analysis, L.Z. and S.H.; investigation, S.D.; resources, R.S.; data curation, J.B.; writing—original draft preparation, L.Z., J.B.; writing—review and editing, R.X., S.H.; visualization, J.B., S.H.; supervision, L.Z., S.D.; project administration, L.Z., S.D.; funding acquisition, L.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key Research and Development Program grant number 2024YFC2816301 and 2022YFE0107000, General Projects of National Natural Science Foundation of China grant number 52171259, High-tech ship research project of Ministry of Industry and Information Technology grant number [2021]342, Young Scientists Fund of National Natural Science Foundation of China grant number 52301331.

Data Availability Statement

Restrictions apply to the availability of these data. Data were obtained from Norwegian University of Science and Technology and are available from the authors with the permission of Norwegian University of Science and Technology.

Acknowledgments

The authors would like to express their gratitude to the anonymous reviewers and editorial team members for their valuable feedback and recommendations.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Chuah, L.F.; Mokhtar, K.; Ruslan, S.M.M.; Abu Bakar, A.; Abdullah, M.A.; Osman, N.H.; Bokhari, A.; Mubashir, M.; Show, P.L. Implementation of the energy efficiency existing ship index and carbon intensity indicator on domestic ship for marine environmental protection. Environ. Res. 2023, 222, 115348.
  2. Zuo, Q.; Qian, L.; Xu, X.; Yan, J.; Cheng, L.; Zhang, Z. Navigation strategy and economic research of the northeast passage in the Arctic. Chin. J. Polar Res. 2015, 27, 203.
  3. Lin, B.; Zheng, M.; Chu, X.; Mao, W.; Zhang, D.; Zhang, M. An overview of scholarly literature on navigation hazards in Arctic shipping routes. Environ. Sci. Pollut. Res. Int. 2024, 31, 40419–40435.
  4. Lu, P.; Leppäranta, M.; Cheng, B.; Li, Z.; Istomina, L.; Heygster, G. The color of melt ponds on Arctic sea ice. Cryosphere 2018, 12, 1331–1345.
  5. Weissling, B.; Ackley, S.; Wagner, P.; Xie, H. EISCAM—Digital image acquisition and processing for sea ice parameters from ships. Cold Reg. Sci. Technol. 2009, 57, 49–60.
  6. Worby, A.; Comiso, J. Studies of the Antarctic sea ice edge and ice extent from satellite and ship observations. Remote Sens. Environ. 2004, 92, 98–111.
  7. Zhou, L.; Cai, J.; Ding, S. The Identification of Ice Floes and Calculation of Sea Ice Concentration Based on a Deep Learning Method. Remote Sens. 2023, 15, 2663.
  8. Ressel, R.; Frost, A.; Lehner, S. A Neural Network-Based Classification for Sea Ice Types on X-Band SAR Images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2015, 8, 3672–3680.
  9. Chen, J.; Lu, W.; Xu, R.; Ding, S.; Zhou, L. High-Resolution Simulation of Sea Ice Dynamics Using a Nested Modeling Approach. In Proceedings of the 28th International Conference on Port and Ocean Engineering Under Arctic Conditions, St. John’s, NL, Canada, 13–17 July 2025.
  10. Anderson, S. Remote Sensing of the Polar Ice Zones with HF Radar. Remote Sens. 2021, 13, 4398.
  11. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149.
  12. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016.
  13. Lu, B.; Zhou, J.; Wang, Q.; Zou, G.; Yang, J. Fusion-based color and depth image segmentation method for rocks on conveyor belt. Miner. Eng. 2023, 199, 108107.
  14. Zhang, G.; Wang, C.; Xiao, D. A novel daily behavior recognition model for cage-reared ducks by improving SPPF and C3 of YOLOv5s. Comput. Electron. Agric. 2024, 227, 109580.
  15. Wu, H.; Xu, Q.; He, X.; Xu, H.; Wang, Y.; Guo, L. SPE-YOLO: A deep learning model focusing on small pulmonary embolism detection. Comput. Biol. Med. 2025, 184, 109402.
  16. Cai, J.; Ding, S.; Zhang, Q.; Liu, R.; Zeng, D.; Zhou, L. Broken ice circumferential crack estimation via image techniques. Ocean Eng. 2022, 259, 111735.
  17. Dong, W.; Zhou, L.; Ding, S.; Ma, Q.; Li, F. Fast and Intelligent Ice Channel Recognition Based on Row Selection. J. Mar. Sci. Eng. 2023, 11, 1652.
  18. Shu, Y.; Zhu, Y.; Xu, F.; Gan, L.; Lee, P.T.-W.; Yin, J.; Chen, J. Path planning for ships assisted by the icebreaker in ice-covered waters in the Northern Sea Route based on optimal control. Ocean Eng. 2023, 267, 113182.
  19. Zhang, C.; Zhang, D.; Zhang, M.; Zhang, J.; Mao, W. A three-dimensional ant colony algorithm for multi-objective ice routing of a ship in the Arctic area. Ocean Eng. 2022, 266, 113241.
  20. Liu, Q.; Wang, Y.; Zhang, R.; Yan, H.; Xu, J.; Guo, Y. Arctic weather routing: A review of ship performance models and ice routing algorithms. Front. Mar. Sci. 2023, 10, 1190164.
  21. Lehtola, V.; Montewka, J.; Goerlandt, F.; Guinness, R.; Lensu, M. Finding safe and efficient shipping routes in ice-covered waters: A framework and a model. Cold Reg. Sci. Technol. 2019, 165, 102795.
  22. Xu, T.; Yang, H.; Ma, J.; Xiong, K.; Hu, Q. An Improved D* Lite-Based Dynamic Route Planning Algorithm for Ships in Arctic Waters. J. Mar. Sci. Eng. 2024, 12, 2323.
  23. Yang, Z.; Yin, Y.; Jing, Q.; Shao, Z. A High-Precision Detection Model of Small Objects in Maritime UAV Perspective Based on Improved YOLOv5. J. Mar. Sci. Eng. 2023, 11, 1680.
  24. Qiao, W.; Guo, H.; Huang, E.; Su, X.; Li, W.; Chen, H. Real-Time Detection of Slug Flow in Subsea Pipelines by Embedding a Yolo Object Detection Algorithm into Jetson Nano. J. Mar. Sci. Eng. 2023, 11, 1658.
  25. Ophoff, T.; Puttemans, S.; Kalogirou, V.; Robin, J.-P.; Goedemé, T. Vehicle and Vessel Detection on Satellite Imagery: A Comparative Study on Single-Shot Detectors. Remote Sens. 2020, 12, 1217.
  26. Zhao, M.; Zhou, H.; Li, X. YOLOv7-SN: Underwater Target Detection Algorithm Based on Improved YOLOv7. Symmetry 2024, 16, 514.
  27. Dong, L.; Ye, X.; Wang, X. Small Target Road Detection Method for Autonomous Driving Based on YOLOv5. In Proceedings of the 2025 10th International Conference on Electronic Technology and Information Science (ICETIS), Hangzhou, China, 26–28 June 2025; pp. 314–318.
  28. Bao, Z.; Guo, Y.; Wang, J.; Zhu, L.; Huang, J.; Yan, S. Underwater Target Detection Based on Parallel High-Resolution Networks. Sensors 2023, 23, 7337.
  29. Cheng, G.; Han, J.; Lu, X. Remote sensing image scene classification: Benchmark and state of the art. Proc. IEEE 2017, 105, 1865–1883.
  30. Redmon, J.; Farhadi, A. YOLO9000: Better, Faster, Stronger. In Proceedings of the IEEE Conference on Computer Vision & Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 6517–6525.
  31. Mo, H.; Wu, J.; Xia, H.; Yu, X.; Zhao, A.E. A lightweight, efficient, adaptive design of YOLOv5 for enhanced SAR ship detection. Remote Sens. Lett. 2025, 16, 549–559.
  32. Cao, S.; Fan, P.; Yan, T.; Xie, C.; Deng, J.; Xu, F.; Shu, Y. Inland Waterway Ship Path Planning Based on Improved RRT Algorithm. J. Mar. Sci. Eng. 2022, 10, 1460.
  33. Hu, S.; Tian, S.; Zhao, J.; Shen, R. Path Planning of an Unmanned Surface Vessel Based on the Improved A-Star and Dynamic Window Method. J. Mar. Sci. Eng. 2023, 11, 1060.
  34. Han, S.; Sun, J.; Ding, S.; Zhou, L. A Potential Field-Based Model Predictive Target Following Controller for Underactuated Unmanned Surface Vehicles. IEEE Trans. Veh. Technol. 2024, 73, 14510–14524.
  35. Frackiewicz, M.; Mandrella, A.; Palus, H. Fast Color Quantization by K-Means Clustering Combined with Image Sampling. Symmetry 2019, 11, 963.
  36. Zhou, L.; Chuang, Z.; Bai, X. Ice forces acting on towed ship in level ice with straight drift. Part II: Numerical simulation. Int. J. Nav. Archit. Ocean Eng. 2018, 10, 119–128.
  37. Xie, C.; Zhou, L.; Ding, S.; Liu, R.; Zheng, S. Experimental and numerical investigation on self-propulsion performance of polar merchant ship in brash ice channel. Ocean Eng. 2023, 269, 113424.
  38. Han, S.; Wang, L.; Wang, Y.; He, H. A dynamically hybrid path planning for unmanned surface vehicles based on non-uniform Theta* and improved dynamic windows approach. Ocean Eng. 2022, 257, 111655.
Figure 1. Structure of the YOLOv5 model.
Figure 2. Sliding-window operation on remote sensing images in the testing phase.
Figure 3. Recognition task for remote sensing scenarios.
Figure 4. Structure of the SE attention mechanism.
Figure 5. Structure of the SPPF module.
Figure 6. Structure of the SPPCSPC-F module.
Figure 7. Activation functions of ReLU and FReLU. (a) ReLU activation function conditioned on 0; (b) FReLU activation function with visualization conditions.
Figure 8. Structure of the improved YOLOv5 model.
Figure 9. Remote sensing sea ice dataset.
Figure 10. Data augmentation. (a) Mosaic; (b) Perspective, Flip left–right, and Rotation processing.
Figure 11. AP values for each model in the ablation experiment.
Figure 12. Loss and mAP values of each model in the comparison experiment.
Figure 13. Comparison of the detection results.
Figure 14. Raster map construction based on recognition results.
Figure 15. Inflating boundaries of ice obstacles.
Figure 16. Schematic diagram of the LOS algorithm.
Figure 17. Path planning results for scenarios under different sea ice concentrations.
Figure 18. Results of each parameter of path planning.
Figure 19. High-density sea-ice maps before and after conversion.
Figure 20. The results of the path planning.
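Table 1. Original and YOLOv5-ICE network parameters.
| Layer | Original Module | f | n | Original Params | Optimized Module | f_new | n_new | Optimized Params |
|---|---|---|---|---|---|---|---|---|
| 0 | Conv | −1 | 1 | 3520 | Conv | −1 | 1 | 3872 |
| 1 | Conv | −1 | 1 | 18,560 | Conv | −1 | 1 | 19,264 |
| 2 | C3 | −1 | 1 | 18,816 | C3 | −1 | 1 | 20,928 |
| 3 | Conv | −1 | 1 | 73,984 | Conv | −1 | 1 | 75,392 |
| 4 | C3 | −1 | 2 | 115,712 | C3 | −1 | 2 | 121,344 |
| 5 | Conv | −1 | 1 | 295,424 | Conv | −1 | 1 | 298,240 |
| 6 | C3 | −1 | 3 | 625,152 | C3 | −1 | 3 | 639,232 |
| 7 | Conv | −1 | 1 | 1,180,672 | Conv | −1 | 1 | 1,186,304 |
| 8 | C3 | −1 | 1 | 1,182,720 | C3 | −1 | 1 | 1,199,616 |
| 9 | SPPF | −1 | 1 | 656,896 | SE | −1 | 1 | 32,768 |
| 10 | Conv | −1 | 1 | 131,584 | SPPCSPC-F | −1 | 1 | 7,124,480 |
| 11 | Upsample | −1 | 1 | 0 | Conv | −1 | 1 | 134,400 |
| 12 | Concat | [−1,6] | 1 | 0 | Upsample | −1 | 1 | 0 |
| 13 | C3_F | −1 | 1 | 361,984 | Concat | [−1,6] | 1 | 0 |
| 14 | Conv | −1 | 1 | 33,024 | C3_F | −1 | 1 | 370,432 |
| 15 | Upsample | −1 | 1 | 0 | Conv | −1 | 1 | 34,432 |
| 16 | Concat | [−1,4] | 1 | 0 | Upsample | −1 | 1 | 0 |
| 17 | C3_F | −1 | 1 | 90,880 | Concat | [−1,4] | 1 | 0 |
| 18 | Conv | −1 | 1 | 147,712 | C3_F | −1 | 1 | 95,104 |
| 19 | Concat | [−1,14] | 1 | 0 | Conv | −1 | 1 | 149,120 |
| 20 | C3_F | −1 | 1 | 296,448 | Concat | [−1,15] | 1 | 0 |
| 21 | Conv | −1 | 1 | 590,336 | C3_F | −1 | 1 | 304,896 |
| 22 | Concat | [−1,10] | 1 | 0 | Conv | −1 | 1 | 593,152 |
| 23 | C3_F | −1 | 1 | 1,182,720 | Concat | [−1,11] | 1 | 0 |
| 24 | - | | | | C3_F | −1 | 1 | 1,199,616 |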
where f and f_new denote the original and optimized input layer, −1 represents the input from the previous layer, n and n_new denote the original and optimized connection of the layer to the input layer, and original and optimized Params denote the number of parameters of the network.
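Several entries in Table 1 can be reproduced with simple arithmetic. The following is a minimal sketch, not the authors' code: the channel widths (3→32, 32→64, 512), the kernel sizes (6 × 6 and 3 × 3) and the SE reduction ratio of 16 are assumptions taken from the standard YOLOv5s configuration, and FReLU is assumed to add one depthwise 3 × 3 convolution plus a BatchNorm layer per Conv module. The helper names are hypothetical. Under these assumptions the counts match the table for layers 0, 1 and 9.

```python
def conv_bn_params(c_in, c_out, k):
    """Weights of a bias-free k x k convolution plus BatchNorm scale and shift."""
    return c_in * c_out * k * k + 2 * c_out


def frelu_params(c, k=3):
    """Depthwise k x k convolution (one filter per channel) plus BatchNorm."""
    return c * k * k + 2 * c


def se_params(c, reduction=16):
    """Two bias-free fully connected layers: c -> c/reduction -> c."""
    return 2 * c * (c // reduction)


# Layer 0: assumed 3 -> 32 channels with a 6 x 6 kernel (YOLOv5s default).
layer0_original = conv_bn_params(3, 32, 6)
layer0_optimized = layer0_original + frelu_params(32)

# Layer 1: assumed 32 -> 64 channels with a 3 x 3 kernel.
layer1_original = conv_bn_params(32, 64, 3)
layer1_optimized = layer1_original + frelu_params(64)

# Layer 9 of the optimized network: SE block, assumed 512 channels.
layer9_se = se_params(512)

print(layer0_original, layer0_optimized)  # 3520 3872
print(layer1_original, layer1_optimized)  # 18560 19264
print(layer9_se)                          # 32768
```

Read this way, the small per-layer increase in the Conv rows is consistent with swapping the activation for FReLU, while most of the overall growth comes from the SPPCSPC-F module at layer 10.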
Table 2. Hardware and software configurations and versions.
| Hardware and Software Configuration | Models and Versions |
|---|---|
| Operating system | Windows 10 |
| CPU (Central Processing Unit) | Intel Xeon W-2255 |
| GPU (graphics card) | NVIDIA Quadro P620 |
| Deep learning platform | PyTorch |
| PyTorch version | 1.10.2 |
| CUDA version | 11.3 |
| cuDNN version | 8.2.1 |
| Python version | 3.9 |
Table 3. Training parameters.
| Parameters | Values |
|---|---|
| num_classes | 4 |
| learning_rate_base | 0.002 |
| batch_size | 4 |
| momentum | 0.937 |
| num_workers | 4 |
| epoch | 1000 |
| weight_decay | 0.0005 |
Table 4. Ablation study results.
| Name | P | R | F1 | mAP |
|---|---|---|---|---|
| YOLOv5 | 0.719 | 0.684 | 0.701 | 0.719 |
| YOLOv5 + SimSPPF | 0.722 | 0.686 | 0.705 | 0.725 |
| YOLOv5 + SE | 0.731 | 0.701 | 0.716 | 0.738 |
| YOLOv5 + SPPCSPC-F | 0.737 | 0.688 | 0.712 | 0.743 |
| YOLOv5 + FReLU | 0.723 | 0.706 | 0.714 | 0.747 |
| YOLOv5 + SE + SPPCSPC-F + FReLU | 0.753 | 0.703 | 0.727 | 0.754 |
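The F1 column in Table 4 can be cross-checked against the P and R columns. The sketch below assumes the standard definition F1 = 2PR / (P + R); the function and variable names are illustrative, and only the numbers come from the table.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) copied from Table 4.
ablation = {
    "YOLOv5": (0.719, 0.684, 0.701),
    "YOLOv5 + SimSPPF": (0.722, 0.686, 0.705),
    "YOLOv5 + SE": (0.731, 0.701, 0.716),
    "YOLOv5 + SPPCSPC-F": (0.737, 0.688, 0.712),
    "YOLOv5 + FReLU": (0.723, 0.706, 0.714),
    "YOLOv5 + SE + SPPCSPC-F + FReLU": (0.753, 0.703, 0.727),
}

# Absolute gap between the recomputed and the reported F1 for each row.
differences = {
    name: abs(f1_score(p, r) - reported)
    for name, (p, r, reported) in ablation.items()
}

for name, (p, r, reported) in ablation.items():
    print(f"{name}: computed {f1_score(p, r):.4f}, reported {reported}")
```

Recomputed from the rounded P and R values, every row agrees with the reported F1 to within 0.002.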
Table 5. Comparison study results.
| Model | P | R | F1 | mAP |
|---|---|---|---|---|
| Faster-RCNN | 0.641 | 0.632 | 0.636 | 0.655 |
| YOLOv3 | 0.858 | 0.407 | 0.552 | 0.604 |
| YOLOv4 | 0.757 | 0.548 | 0.636 | 0.648 |
| YOLOv5 | 0.719 | 0.684 | 0.701 | 0.719 |
| YOLOv8 | 0.839 | 0.488 | 0.617 | 0.741 |
| Ours | 0.753 | 0.703 | 0.727 | 0.754 |
Table 6. Path planning results for different sea ice concentrations.
| Ice Concentration | Path Length | Number of Ship Turns | Number of Ice Avoidances | Sea Ice Risk Value |
|---|---|---|---|---|
| 5.0% | 14.23 km | 1 | 8 | 5.6% |
| 8.3% | 14.44 km | 3 | 6 | 4.2% |
| 15.7% | 14.59 km | 5 | 12 | 8.3% |
| 18.4% | 14.67 km | 5 | 16 | 10.9% |
| 25.1% | 15.23 km | 7 | 21 | 13.8% |
| 40.9% | 18.04 km | 7 | 31 | 17.2% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
