1. Introduction
With the rapid development of science and technology and the deepening of human exploration of the ocean, maritime activities have become increasingly frequent. Techniques for resolving maritime target elements have emerged in response and have become core technologies for ensuring navigation safety. In the field of maritime safety, accurate and timely interpretation of maritime target elements is an essential prerequisite for avoiding ship collisions and ensuring safe navigation.
Traditional methods for extracting maritime target elements are usually built on the ideal assumptions of complete time-series data, uniform sampling, and low noise. In practical applications, however, limited by the hardware performance of observation equipment, the complexity of the marine environment, and the variability of remote communication links, the acquired maritime target time-series data often suffer from serious quality problems, which pose serious challenges to the accuracy and robustness of resolution methods. At the device level, limits on sensor sampling frequency and inherent deviations in positioning accuracy produce non-uniformly sampled data, with sampling intervals fluctuating between seconds and minutes, severely disrupting the temporal continuity of the data [1,2]. Under the influence of environmental factors, electromagnetic interference caused by adverse weather at sea (such as typhoons and rainstorms), together with the absorption and attenuation of signals by seawater, can cause large amounts of data to go missing and even open gaps in some monitoring periods, resulting in the loss of key feature information [3]. These “data empty windows” in the monitoring record make it difficult to accurately reconstruct the motion state and behavioral characteristics of the target. At the same time, owing to ionospheric interference, network congestion, and other factors in the marine environment, noise such as impulse noise and white noise is easily introduced during remote data transmission [4,5]. The interweaving and superposition of these problems make it difficult for traditional resolution methods based on the uniform-sampling assumption to effectively extract target motion elements, posing serious challenges to maritime target monitoring and situation analysis.
With the rapid development of artificial intelligence technology, deep learning has gradually been applied to the field of marine target element analysis. Its powerful feature-learning and pattern-recognition capabilities have brought new solutions for marine target element analysis. However, existing deep learning models still face challenges in marine target element resolution tasks, such as insufficient generalization ability [6] and poor adaptability to complex backgrounds [7].
As an important breakthrough in the field of artificial intelligence, large language models have learned rich language knowledge and common sense through self-supervised learning on large-scale unlabeled text data and have powerful language understanding and generation capabilities. The application of large language models in the field of maritime object detection can provide new ideas for solving the limitations of traditional methods and existing deep learning models. By fine-tuning the large language model, it can better adapt to the needs of maritime object detection tasks and improve the accuracy and efficiency of detection.
Large language models have evolved rapidly in recent years. From GPT-2 to GPT-4 [8], major players such as Google, Anthropic, and Meta have launched models including Gemini [9], Claude Opus [10], and Llama-3 [11]. Built on the Transformer architecture and pre-trained on vast corpora, these LLMs master language knowledge and semantic representation, excelling in traditional NLP tasks (text classification [12], sentiment analysis [13], translation [14]) and in complex applications (text generation [14], code writing [15]).
Recent breakthroughs have further expanded the capabilities of large language models (LLMs). OpenAI’s GPT-5, released in August 2025, features enhanced multimodal integration, enabling seamless processing of text, images, and sensor data (e.g., maritime radar) while delivering improved cross-modal reasoning [16]. Google’s Gemini 2.5 family, including the lightweight Flash-Lite variant, offers a 1-million-token context window and achieves 40% faster inference [17]. Meta’s Llama-3.1 series enhances domain adaptability through modular pre-training, allowing efficient fine-tuning on specialized datasets such as maritime safety records [18]. These advancements mark LLMs’ transition from general-purpose tools to domain-specific solutions, laying a critical foundation for their application in maritime scenarios.
However, even pre-trained LLMs with strong generalizability often underperform in specialized fields. This gap arises because their pre-trained general knowledge fails to address the unique, task-specific needs of particular domains—including the maritime sector.
In maritime applications, pre-trained LLMs lack essential domain adaptation: they have limited understanding of nautical terminology (e.g., “dead reckoning,” “CPA/TCPA”) and struggle to model the laws of maritime target movement (e.g., ship maneuvering under wind and current). This deficiency directly leads to poor performance in maritime target element resolution tasks. Fine-tuning thus becomes a necessary step to align LLMs with maritime professional knowledge, bridging the gap between general model capabilities and domain-specific requirements.
To enable efficient LLM application in this task, three core challenges (domain adaptation gaps, high computational cost, data scarcity) must be addressed. This paper proposes a fine-tuned LLM adaptive optimization method, centered on an expert-prior-guided prompt learning framework. It uses a “small measured data + large simulation data” hybrid training set and API-based parameter fine-tuning.
In the experiments, a multimodal fusion paradigm converts navigational mathematical models into interpretable symbols for prompt templates, enabling dynamic adjustment of feature extraction (e.g., increasing the weight of time-series features in target-maneuvering scenes). Unlike traditional prompt tuning, the method introduces a hierarchical prompt system: a bottom layer (general navigation knowledge), a middle layer (scenario-algorithm rules), and a top layer (task-specific parameters), realizing end-to-end reasoning from semantic understanding to numerical resolution. The model requires two core capabilities: parsing navigation logs and target-movement text to select the optimal scheme, and analyzing the observation ship’s multi-dimensional data to determine the optimal strategy.
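As a rough illustration of how such a three-layer prompt might be assembled, consider the Python sketch below. All template wording, scenario names, and field names here are hypothetical assumptions for exposition, not the actual prompts used in this work.

```python
# Hypothetical sketch of the three-layer prompt assembly described above.
# All template wording, scenario names, and field names are illustrative
# assumptions, not the actual prompts used in this work.

BOTTOM_LAYER = (
    "You are a maritime navigation assistant. Apply general nautical "
    "knowledge: dead reckoning, CPA/TCPA, collision-avoidance rules."
)

MIDDLE_LAYER = {  # scenario -> algorithm-selection rule
    "maneuvering": "Increase the weight of time-series features; prefer "
                   "short-interval samples around heading changes.",
    "cruising": "Assume near-constant course and speed; sparse samples "
                "are acceptable.",
}

def build_prompt(scenario, task_params):
    """Stack bottom (general), middle (scenario), top (task) layers."""
    top = "; ".join(f"{k}={v}" for k, v in task_params.items())
    return "\n".join([
        BOTTOM_LAYER,
        f"Scenario rule: {MIDDLE_LAYER[scenario]}",
        f"Task parameters: {top}",
        "Resolve the target's motion elements (course, speed) from the "
        "observations below.",
    ])

prompt = build_prompt("maneuvering",
                      {"target": "Cargo ship A", "sampling": "non-uniform"})
```

The layering keeps general knowledge stable while only the top layer changes per task, which is what enables the end-to-end reasoning path sketched above.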
The remainder of this paper is structured as follows:
Section 2 reviews related work, including LLM fine-tuning techniques, optimal selection methods based on LLMs, non-uniform sampling theory, and maritime target element resolution fundamentals;
Section 3 designs the LLM selection, fine-tuning strategy, and non-uniform sampling innovation for maritime scenarios;
Section 4 constructs the maritime target element resolution model with a hierarchical prompt system;
Section 5 verifies the method’s effectiveness through experiments; and
Section 6 summarizes conclusions and future work.
3. Non-Uniform Sampling Strategy Design Based on Large Language Model Fine-Tuning
Existing large language models face issues such as insufficient domain adaptation, weak handling of non-uniform sampling, and high fine-tuning costs in maritime target element resolution. To address these issues, this section integrates “large language model fine-tuning” with “non-uniform sampling strategy” and proposes the following solutions: select a maritime-adapted base model, design an efficient fine-tuning strategy, and innovate the sampling approach, thereby providing technical support for the subsequent model construction.
Post-training, the model is expected to parse text (e.g., navigation logs, target trends), choose optimal resolution strategies, and process multi-dimensional observational data, achieving the implementation effects shown in Figure 1 and Figure 2.
3.1. Selection and Adaptation of Large Language Models
3.1.1. Basis for Model Selection
Maritime target element resolution tasks demand large language models to meet strict criteria: domain adaptation, deployment flexibility, and cost controllability. Given the task’s diverse scenarios, including marine monitoring and navigation management, models must handle professional knowledge effectively. They also need to operate stably across various hardware and network conditions, while keeping development, training, and maintenance costs reasonable.
This study selects ByteDance’s Doubao-Seed-1.6 for its superior adaptation. Its pre-training covers multi-domain knowledge and excels at long-context modeling, enabling fast adaptation to nautical scenes, accurate parsing of sailing logs, and extraction of key elements. In deployment, it supports lightweight local deployment, runs stably on shipboard edge devices, and offers flexible interfaces for rule injection and parameter fine-tuning. Regarding cost, its parameter scale fits medium computing power, eliminating the need for large GPU clusters and reducing resource occupation, thereby balancing accuracy and hardware load in shipboard embedded systems.
3.1.2. Preprocessing for Adapting the Model to the Maritime Target Element Resolution Task
To accurately adapt Doubao-Seed-1.6 to the task of maritime target element resolution, a standardized preprocessing pipeline of data cleaning, annotation, and format conversion was applied to the relevant data. The specific operations are as follows:
Data cleaning: filtered outliers caused by sensor failures (e.g., radar jump values, invalid text fields); filled missing time-series values (e.g., random sampling breakpoints) via interpolation or mean-based methods, combined with context and domain rules; corrected text errors (e.g., standardizing “dead reckoning error” to “dead reckoning deviation”) to improve text standardization.
Annotation: adopted a “manual + semi-automatic” scheme. Core elements (target name, location, navigation status, sampling features; e.g., “Cargo ship A”, “30° N, 120° E”) were labeled manually, while a semi-automatic tool based on navigation keywords (e.g., “collision avoidance”, “turning”) identified hidden scene features (e.g., an abrupt 15° heading change → “avoidance scene”) for efficient, consistent annotation.
Format conversion: converted sensor data (radar, sonar) into structured text (e.g., “azimuth 355°, sampling time 10:05:20”) by extracting core features, and processed text data to map nautical terms (e.g., “turning point”, “dead reckoning”) into model-recognizable sequences (retaining contextual coding) for semantic parsing.
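Two of the preprocessing steps above, interpolation-based gap filling and sensor-to-text conversion, can be sketched in Python. The field names, record format, and the choice of linear interpolation are illustrative assumptions rather than the exact pipeline used in this work.

```python
# Minimal sketch of two preprocessing steps described above; field names,
# record formats, and the linear-interpolation choice are assumptions.

def fill_missing(times, values):
    """Linearly interpolate None entries in a non-uniform time series."""
    filled = list(values)
    known = [(t, v) for t, v in zip(times, values) if v is not None]
    for i, (t, v) in enumerate(zip(times, values)):
        if v is None:
            # nearest known neighbors on each side of the gap
            left = max((p for p in known if p[0] < t), default=None)
            right = min((p for p in known if p[0] > t), default=None)
            if left and right:
                w = (t - left[0]) / (right[0] - left[0])
                filled[i] = left[1] + w * (right[1] - left[1])
            else:  # edge gap: fall back to the nearest known value
                filled[i] = (left or right)[1]
    return filled

def record_to_text(rec):
    """Convert a structured sensor record into model-readable text."""
    return (f"azimuth {rec['azimuth']}°, range {rec['range_m']} m, "
            f"sampling time {rec['time']}")

# Fill a two-point gap in a speed track, then render a radar record.
speeds = fill_missing([0, 10, 25, 40], [12.0, None, None, 15.0])
line = record_to_text({"azimuth": 355, "range_m": 1852, "time": "10:05:20"})
```

Interpolating against actual timestamps (rather than sample indices) matters here, since the series is non-uniformly sampled.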
This preprocessing enhanced data integrity, standardization, and model compatibility, laying the foundation for Doubao-Seed-1.6’s efficient application in maritime target element resolution.
3.2. Fine-Tuning Strategy of Large Language Model for Maritime Target Element Resolution
To address the high computational cost of traditional full fine-tuning and the poor adaptability of existing PEFT methods in non-uniform sampling scenarios, this study proposes a “Prefix Tuning + LoRA” hybrid fine-tuning strategy (Figure 3), with key details below:
3.2.1. Objective and Dataset
The objective is to improve the model’s ability to analyze non-uniformly sampled data. The hybrid training set consists of small measured data (logs with random sampling features such as sudden interval changes and missing data) and large-scale simulation data (simulating “dense → sparse → missing” scenarios). Key elements, such as target maneuver labels and sampling density, were manually labeled to enhance non-uniform pattern recognition.
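A “dense → sparse → missing” timestamp pattern of the kind simulated above can be generated with a short sketch. The 2–5 s and 60–180 s interval ranges reuse the figures given in Section 3.3; the phase lengths and gap size are assumptions.

```python
import random

# Illustrative generator for the "dense -> sparse -> missing" simulated
# sampling patterns mentioned above. Phase lengths and the 600 s gap are
# assumptions; the interval ranges reuse figures from Section 3.3.

def simulate_timestamps(seed=0):
    rng = random.Random(seed)
    t, ts = 0.0, []
    for interval, n in [((2, 5), 20),      # dense phase: 2-5 s intervals
                        ((60, 180), 10)]:  # sparse phase: 60-180 s
        for _ in range(n):
            t += rng.uniform(*interval)
            ts.append(round(t, 1))
    # missing phase: a long data gap, then a single resumed sample
    t += 600
    ts.append(round(t, 1))
    return ts

stamps = simulate_timestamps()
```

Seeding the generator keeps simulated samples reproducible, which helps when pairing them with manually labeled maneuver annotations.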
3.2.2. Hybrid Fine-Tuning
Prefix Tuning prepends a 64-dimensional trainable vector to the input to encode navigation rules (e.g., “turn priority sampling”, “high weight for collision-avoidance scenes”). With LoRA, 95% of the base model parameters are frozen and only the low-rank matrices injected into the Transformer layers are fine-tuned (trainable parameters < 0.5%). Training uses the Adam optimizer with a learning rate of 10⁻⁵, a batch size of 16, and three epochs. A cosine annealing scheduler balances convergence and overfitting, while a dynamic learning rate addresses the “data sparsity” of random sampling.
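To make the parameter savings of the LoRA component concrete, the toy calculation below (pure Python, with illustrative dimensions that are not the actual model's) compares a full weight update with the low-rank update ΔW = s·BA that LoRA trains instead:

```python
# Toy illustration of the LoRA idea above: instead of updating a full
# d x d weight matrix, train two small matrices B (d x r) and A (r x d)
# and add a scaled product. Dimensions are illustrative assumptions.

d, r = 1024, 2                         # hidden size and LoRA rank (assumed)

full_params = d * d                    # parameters in one full weight matrix
lora_params = d * r + r * d            # parameters in B and A combined
fraction = lora_params / full_params   # stays below the < 0.5% budget

def lora_delta(B, A, scale):
    """Compute scale * (B @ A) for small lists-of-lists."""
    k_dim, out_dim = len(A), len(A[0])
    return [[scale * sum(B[i][k] * A[k][j] for k in range(k_dim))
             for j in range(out_dim)] for i in range(len(B))]

# Minimal rank-1 check: a 2x1 B times a 1x2 A, scaled by 2
delta = lora_delta([[1.0], [2.0]], [[3.0, 4.0]], 2.0)
```

Only B and A receive gradients; the frozen base weights are added to the scaled product at inference, which is what keeps the trainable fraction so small.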
3.3. Innovation of Non-Uniform Sampling Strategy Combined with Large Language Model
To address “critical period information loss” and “multi-source data spatiotemporal misalignment” in random sampling, two LLM-integrated strategies are proposed:
Non-uniform sampling point selection based on semantic understanding: Leveraging LLMs’ semantic parsing for nautical scenes, sampling priorities are dynamically adjusted. By analyzing log texts (e.g., “target ship 15° angle mutation”) or sensor data patterns (e.g., speed surges), key events like “turning” and “accelerating” are identified. Sampling density increases during critical events (interval reduced to 2–5 s), while cruising phases adopt lower frequencies (60–180 s), minimizing redundant data and resolving sampling imbalance.
Multi-source data fusion for non-uniform sampling: For radar-AIS spatiotemporal mismatches, LLMs enable cross-sensor semantic alignment. By translating radar echoes (intensity 50 dB, azimuth 350°) and AIS data (“heading 355°, speed 12 knots”) into unified text, spatiotemporal correlations are analyzed (e.g., “5° azimuth-heading deviation is within measurement tolerance”). The sampling strategy then adapts accordingly: density is reduced in consistent regions and increased in conflict zones (heading deviation > 10°) to verify data integrity, enhancing the utility of multi-source data.
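The decision rules behind the two strategies above can be summarized in a short sketch. The event keywords, the 2–5 s and 60–180 s intervals, and the 10° threshold come from the text; the function interfaces are assumptions.

```python
# Hedged sketch of the two sampling strategies above. The event keywords,
# the 2-5 s and 60-180 s intervals, and the 10 degree threshold come from
# the text; the function interfaces are assumptions.

CRITICAL_EVENTS = ("turning", "accelerating", "collision avoidance")

def sampling_interval(event):
    """Return the (min, max) sampling interval in seconds for a scene."""
    if event in CRITICAL_EVENTS:
        return (2.0, 5.0)        # densify around key maneuvers
    return (60.0, 180.0)         # relax during steady cruising

def needs_dense_sampling(radar_azimuth_deg, ais_heading_deg):
    """Flag radar/AIS conflict zones (deviation > 10 degrees)."""
    dev = abs(radar_azimuth_deg - ais_heading_deg) % 360
    dev = min(dev, 360 - dev)    # wrap-around angular difference
    return dev > 10.0
```

The wrap-around handling matters for bearings near 0°/360°, where a naive difference would falsely flag consistent radar and AIS tracks as conflicting.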
This “model selection + fine-tuning + sampling innovation” triad forms the technological core of the subsequent maritime target element resolution model. Model adaptation enables domain-specific reasoning, fine-tuning optimizes performance under shipboard computing constraints, and the sampling innovation supplies high-quality data, jointly enabling end-to-end target element resolution from non-uniformly sampled data.