Intelligent Task Allocation and Planning for Unmanned Surface Vehicle (USV) Using Self-Attention Mechanism and Locking Sweeping Method

Abstract: The development of intelligent task allocation and path planning algorithms for unmanned surface vehicles (USVs) is gaining significant interest, particularly in supporting complex ocean operations. This paper proposes an intelligent hybrid algorithm that combines task allocation and path planning to improve mission efficiency. The algorithm introduces a novel approach based on a self-attention mechanism (SAM) for intelligent task allocation. The key contribution lies in the integration of an adaptive distance field, created using the locking sweeping method (LSM), into the SAM. This integration enables the algorithm to determine the minimum practical sailing distance in obstacle-filled environments. The algorithm efficiently generates task execution sequences in cluttered maritime environments with numerous obstacles. By incorporating a safety parameter, the enhanced SAM algorithm adapts the dimensional influence of obstacles and generates paths that ensure the safety of the USV. The algorithms have been thoroughly evaluated and validated through extensive computer-based simulations, demonstrating their effectiveness in both simulated and practical maritime environments. The results of the simulations verify the algorithm’s capability to optimize task allocation and path planning, leading to improved performance in complex and obstacle-laden scenarios.


Introduction
The research and development of unmanned surface vehicles (USVs) have garnered increasing attention in recent years. These vehicles are being actively developed and deployed across various practical applications [1-3], predominantly in the military domain. By employing USVs for hazardous tasks such as maritime patrol and coast guarding in dangerous environments, human risk can be greatly minimized as operator involvement is limited. It is important to note that USVs also hold promise for civilian applications. One such application is environmental monitoring, where USVs can be utilized to efficiently collect water sampling data in heavily polluted lakes, eliminating the need for human exposure to harmful elements [4,5]. Another potential application is search and rescue missions in post-disaster scenarios. Research has shown that the effective deployment of USVs can significantly enhance the success rate of rescue missions by reducing response times and improving overall effectiveness [6].
The task allocation problem can be mathematically represented as the traveling salesman problem (TSP). The TSP involves a given set of cities with distances specified between each pair of cities. The goal is to determine the shortest route that visits each city exactly once and returns to the initial city. In the context of task allocation, the goal is to determine the optimal visiting order for multiple task points, ensuring that each task point is covered once. Solving the TSP allocates the tasks and plans the trajectory so that the overall distance traveled is minimized while all task points are visited. The most commonly used task allocation algorithms in recent work include genetic algorithms [7], ant colony algorithms [8], and neural network algorithms [9].
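To make the TSP formulation above concrete, the following minimal sketch enumerates every closed tour over a handful of task points and keeps the shortest one. It is only an illustration of the problem definition, not one of the algorithms discussed in this paper; the function names and the four sample points are hypothetical, and exhaustive search is feasible only for very small instances.

```python
import math
from itertools import permutations


def tour_length(points, order):
    """Length of the closed tour that visits `points` in the given order."""
    return sum(
        math.dist(points[order[k]], points[order[(k + 1) % len(order)]])
        for k in range(len(order))
    )


def brute_force_tsp(points):
    """Try every tour that starts and ends at point 0 and keep the shortest."""
    candidates = ((0,) + p for p in permutations(range(1, len(points))))
    best = min(candidates, key=lambda order: tour_length(points, order))
    return list(best), tour_length(points, best)


# Hypothetical task points: the corners of a 3 x 4 rectangle, listed out of order.
tasks = [(0, 0), (3, 4), (3, 0), (0, 4)]
order, length = brute_force_tsp(tasks)
print(order, length)
```

The number of candidate tours grows factorially with the number of task points, which is why heuristic and learning-based solvers are used in practice.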
A et al. [10] used a genetic algorithm (GA) to solve the defined optimization problem model and developed an unmanned boat task allocation system. Wang et al. [11] discussed a dual-chromosome encoding genetic algorithm based on alignment, introducing alignment-based learning and multiple mutation operators to improve the genetic algorithm and obtain better allocation results. Jia et al. [12] proposed a metaheuristic algorithm based on an improved genetic algorithm to solve the task pre-allocation problem with multiple constraints. Zhai et al. [13] used a particle swarm optimization algorithm and the entropy weight method to propose a cooperative task allocation method for multi-heterogeneous aircraft under strong spatiotemporal constraints. Chen et al. [14] proposed a heuristic algorithm based on an ant colony system to minimize the task completion time; oriented by the time consumption of the task, it seeks an approximately optimal solution in the collaborative search task. Inspired by the immune endocrine short feedback system, Huang et al. [15] proposed an artificial immune algorithm that can produce an asymptotically optimal allocation strategy and achieve rapid convergence on a large number of variables. Zhu et al. [16] were the first to use the Self-Organizing Map (SOM) algorithm in the task allocation problem, where it effectively handled the collaboration between robots. Subsequently, Zhu et al. [17] proposed an improved task distribution algorithm based on SOM, which adds a path tracking controller to make the robot's planned path smoother.
Many existing methods for task allocation in USVs rely on the Euclidean distance as the cost metric between task points, which often overlooks the influence of obstacles present in the environment. Liu et al. [18,19] attempted to address this limitation by introducing a repulsive field within the iterative process of the SOM algorithm. However, the results obtained from this approach only achieved a relatively optimal outcome. As a result, these allocation algorithms need to be combined with path planning algorithms to optimize the generation of safe paths, but this can lead to poor real-time performance in complex environments. Another significant issue with the current state of USV task allocation is the lack of algorithm validation in simulation environments that accurately represent practical scenarios. For instance, numerous simulation experiments [20] have been conducted in simplistic, artificially constructed environments rather than real-world oceanic environments. This gap between simulation and reality hampers the assessment of algorithm performance and its applicability to practical situations.
To address the abovementioned issues, this paper proposes an improved algorithm based on the self-attention mechanism (SAM). Environmental factors are incorporated into the input model of the SAM through the locking sweeping method [21]. The highlights of this paper can be summarized as follows: (1) The proposed algorithm fully considers environmental obstacle factors in the task allocation process, and the output allocation results can minimize the actual voyage of the USV. (2) The proposed algorithm can optimize dangerous paths in the task allocation results to ensure the safe navigation of the USV, without requiring an additional path planning algorithm. (3) The proposed algorithm incorporates a shoreline extension approach to ensure the safety of the USV by maintaining a user-configurable clearance distance from the shoreline.
The rest of the paper is organized as follows: Section 2 introduces the adaptive distance field construction process. Section 3 describes the detailed algorithm structure for intelligent allocation and path planning, which includes the SAM algorithm and the improved SAM algorithm. The proposed algorithms are verified by simulations in Section 4. Section 5 concludes the paper and discusses future work.

Adaptive Distance Cost
In the majority of algorithms designed to address task assignment problems, the Euclidean distance is commonly employed as the basis for estimating the distance cost between two task points. This approach disregards the influence of obstacles within the environment, thereby overlooking crucial factors. The task area of a USV often encompasses diverse obstacles, and ensuring safety is of paramount importance in autonomous navigation. When obstacles are present between two task points, it is essential for the planned task route to consistently avoid entering the safety range of the obstacle. As a result, the actual distance cost between task points exceeds the Euclidean distance, as shown in Figure 1. To overcome this, a new distance cost construction approach based on the locking sweeping method (LSM) has been proposed in this paper.

Locking Sweeping Method (LSM)
The locking sweeping method (LSM) is an improved version of the fast sweeping method (FSM), an iterative method used to calculate the time-of-arrival field by solving the Eikonal equation. The method iteratively sweeps (traverses) the entire grid in a specific order, and each sweep is responsible for solving the time-of-arrival field propagation in a particular direction. The Eikonal equation is:

$$\left(\frac{\partial t}{\partial x}\right)^2 + \left(\frac{\partial t}{\partial y}\right)^2 = \frac{1}{v^2(x, y)}, \quad (x, y) \in \Omega \tag{1}$$

where t(x, y) is the time of arrival at the node (x, y), v(x, y) is the propagation velocity, and Ω is the model space. Discretizing Equation (1) on a grid with spatial step h (assumed equal to 1) gives Equation (2), whose solution expresses the time of arrival of each node in terms of its adjacent nodes.

The pseudocode of the LSM is shown in Algorithm 1. A locking structure is maintained by each node in the model space. During the sweep process, if a node's value remains unchanged, the node is locked, which allows it to be skipped in subsequent sweeps. The LSM-based calculation process is illustrated in Figure 2a. To generate a potential field, interface sweep processes are simulated from the four corners of the entire model space. By recursively tracing back to the initial node, the potential value at each node can be determined, representing the local interface time of arrival. Assuming a constant propagation velocity, the time of arrival of the interface is directly proportional to the distance between the current node and the initial node, so the arrival time increases as the distance grows. Nodes within obstacles have a value of 1, while other nodes have values ranging from 0 to 1. Figure 2b illustrates the node locking process, where nodes with converged values are locked during each sweep. For further details on the LSM, refer to [21].
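For orientation, the discretization used by the fast sweeping family is commonly written as the first-order Godunov upwind scheme below. This is the standard textbook form, given here under the assumption that the paper follows it; the symbols $t_{x\min}$ and $t_{y\min}$ are introduced only for this illustration, and the paper's own numbered equations may be arranged differently.

```latex
% Upwind discretization of |\nabla t| = 1/v at node (i, j), with grid spacing h
\left[(t_{i,j} - t_{x\min})^{+}\right]^{2} + \left[(t_{i,j} - t_{y\min})^{+}\right]^{2}
  = \frac{h^{2}}{v_{i,j}^{2}},
\qquad
t_{x\min} = \min(t_{i-1,j},\, t_{i+1,j}),\quad
t_{y\min} = \min(t_{i,j-1},\, t_{i,j+1}),\quad
(x)^{+} = \max(x, 0)

% Closed-form solution for the updated node value
t_{i,j} =
\begin{cases}
\min(t_{x\min}, t_{y\min}) + \dfrac{h}{v_{i,j}}, & |t_{x\min} - t_{y\min}| \ge \dfrac{h}{v_{i,j}} \\[2ex]
\dfrac{1}{2}\left( t_{x\min} + t_{y\min}
  + \sqrt{\dfrac{2h^{2}}{v_{i,j}^{2}} - (t_{x\min} - t_{y\min})^{2}} \right), & \text{otherwise}
\end{cases}
```

The first case applies when only one axis has a neighbor early enough to influence the node; the second combines both axes.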

Algorithm 1 Locking Sweeping Method algorithm.
Input: grid map (M), initial node (P_initial), locking map (L), time-of-arrival field (T)
Initialization:
4.  if neighbor nodes of p_i have valid value and L(p_i) = 0 then
5.      using Equation (6) calculate the time-of-arrival T(p_i) of p_i
6.  else
7.      skip the calculation process of node p_i
9.  if T(p_i) remains unchanged during two adjacent sweep processes then
10.     L(p_i) =
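The sweeping and locking idea can be sketched in Python as follows. This is an illustrative reading rather than the authors' implementation: it assumes the standard first-order upwind update, a uniform velocity, and a locking rule in which a node is re-opened only when one of its neighbors improves. The names `local_solve` and `locking_sweep` are hypothetical.

```python
import numpy as np

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def local_solve(a, b, f):
    """First-order upwind update from the smaller neighbour value on each axis."""
    if abs(a - b) >= f:  # only one axis can influence the node
        return min(a, b) + f
    return 0.5 * (a + b + np.sqrt(2.0 * f * f - (a - b) ** 2))


def locking_sweep(obstacle, sources, h=1.0, v=1.0, max_rounds=100):
    """Time-of-arrival field on a grid; obstacle cells keep the value inf."""
    ny, nx = obstacle.shape
    inside = lambda i, j: 0 <= i < ny and 0 <= j < nx and not obstacle[i, j]
    T = np.full((ny, nx), np.inf)
    locked = np.ones((ny, nx), dtype=bool)  # True means the node is skipped
    for i, j in sources:
        T[i, j] = 0.0
        for di, dj in NEIGHBOURS:  # only neighbours of the sources start unlocked
            if inside(i + di, j + dj):
                locked[i + di, j + dj] = False
    f = h / v
    fwd_r, bwd_r = range(ny), range(ny - 1, -1, -1)
    fwd_c, bwd_c = range(nx), range(nx - 1, -1, -1)
    sweeps = ((fwd_r, fwd_c), (fwd_r, bwd_c), (bwd_r, fwd_c), (bwd_r, bwd_c))
    for _ in range(max_rounds):
        changed = False
        for rows, cols in sweeps:  # four sweep directions per round
            for i in rows:
                for j in cols:
                    if locked[i, j]:
                        continue
                    locked[i, j] = True  # stays locked until a neighbour improves
                    a = min(T[i - 1, j] if i > 0 else np.inf,
                            T[i + 1, j] if i < ny - 1 else np.inf)
                    b = min(T[i, j - 1] if j > 0 else np.inf,
                            T[i, j + 1] if j < nx - 1 else np.inf)
                    if np.isinf(a) and np.isinf(b):
                        continue
                    t_new = local_solve(a, b, f)
                    if t_new < T[i, j]:
                        T[i, j] = t_new
                        changed = True
                        for di, dj in NEIGHBOURS:  # re-open neighbours that may improve
                            p, q = i + di, j + dj
                            if inside(p, q) and T[p, q] > t_new:
                                locked[p, q] = False
        if not changed:
            break
    return T


grid = np.zeros((5, 5), dtype=bool)
T_open = locking_sweep(grid, [(0, 0)])
grid[0:4, 2] = True  # a wall with a gap in the bottom row
T_wall = locking_sweep(grid, [(0, 0)])
```

In the open grid the arrival time along the first row equals the straight-line distance, while the wall forces the wavefront through the gap and raises the arrival time at the far corner of that row.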

The Adaptive Distance Field Construction Process
The distance field example given in Figure 2a has a single sweep origin node. The potential value assigned to each node corresponds to its distance weight in relation to the position of the origin node. As depicted in Figure 3, the potential value of the blue dot represents the length of the collision-free path. Considering the complex and changeable nature of the USV operating area, along with the potential inaccuracies in the map data, it is not advisable to directly employ the raw obstacle boundary data. To ensure USV safety, it is recommended to extend each obstacle boundary by a configurable distance. The specific value of the extension distance can be adjusted according to the user's requirements.

By setting multiple initial points from the obstacles, the field generated by the LSM can provide distance information. This information can be utilized to create a new field that indicates how close a local point is to obstacles. Using this new field, a distance field related to the extended obstacles can be derived.
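The idea of seeding a sweep from every obstacle node and then extending the obstacle boundary by a configurable distance can be illustrated with a much simpler stand-in. The sketch below replaces the LSM with a multi-source breadth-first search that counts 4-connected grid steps, so the distances are in cells rather than the LSM's continuous values; `obstacle_distance`, `extend_obstacles`, and the `clearance` parameter are hypothetical names used only for this example.

```python
from collections import deque

import numpy as np


def obstacle_distance(obstacle):
    """4-connected grid distance (in cells) from every cell to the nearest obstacle."""
    ny, nx = obstacle.shape
    dist = np.full((ny, nx), -1, dtype=int)  # -1 marks cells not reached yet
    queue = deque()
    for i, j in zip(*np.nonzero(obstacle)):  # every obstacle cell is a source
        dist[i, j] = 0
        queue.append((i, j))
    while queue:
        i, j = queue.popleft()
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            p, q = i + di, j + dj
            if 0 <= p < ny and 0 <= q < nx and dist[p, q] < 0:
                dist[p, q] = dist[i, j] + 1
                queue.append((p, q))
    return dist


def extend_obstacles(obstacle, clearance):
    """Mark every cell within `clearance` cells of an obstacle as blocked."""
    if not obstacle.any():
        return np.zeros_like(obstacle, dtype=bool)
    return obstacle_distance(obstacle) <= clearance


grid = np.zeros((7, 7), dtype=bool)
grid[3, 3] = True  # a single obstacle cell in the middle of the map
extended_1 = extend_obstacles(grid, clearance=1)
extended_2 = extend_obstacles(grid, clearance=2)
```

A larger clearance blocks more of the map, which mirrors how the extension distance trades navigable area for safety margin.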
The process of creating the extended obstacle boundary consists of two steps: (1) using the LSM to generate a field implicitly reflecting the risk of obstacles and (2) configuring the extended weight α to adjust the influence range of obstacles. In the first step, the node values at the obstacle boundary are set to 0 and the LSM is run to sweep the entire model space. The resulting output is denoted as D_o_init, where the value of D_o_init represents the distance between the corresponding position and the obstacle: the farther the distance, the larger the value. D_o_init can be calculated using the expression:

$$D_{o\_init} = \mathrm{softmax}(\mathrm{LSM}(P_{obs})) \tag{7}$$

where P_obs is the obstacle's node location set. Figure 4 represents the process of generating the obstacle field.
A simulated environment representing a typical maritime environment with multiple small islands is displayed in Figure 4a.

After generating the obstacle field, the next step is to visualize and configure the actual influence range of the obstacles. The new obstacle field D_o is obtained from D_o_init using α, a sweeping scale limitation parameter that controls the influence range of obstacles according to the user's requirements. Figure 4 illustrates the field obtained by restricting the expansion range of the obstacle boundary. By limiting how far the obstacles extend, the resulting field provides a clearer depiction of the navigable areas. The D_o generated with α = 0.4 is represented in Figure 5a. In Figure 5b-d, as the value of α gradually increases, the impact range of obstacles diminishes progressively. This highlights that the dimension of D_o can be controlled through the adjustment of α. Note that α should be determined adaptively, taking into account factors such as safety considerations and overall distance requirements. These adaptive calculations for α are beyond the scope of this paper and will not be discussed in detail.
After generating D_o, the model space distance field T_i for task i is computed from D_o and P_i, where P_i is the position of task i. The obtained T_i is shown in Figure 6, which shows the distance weight of each node relative to task i.

Self-Attention Mechanism
The self-attention mechanism is a network model that addresses the challenge of handling input vectors with uncertain sizes and lengths [22]. Its primary objective is to enable the machine to capture correlations among different components of the input vector. Information transfer is facilitated through weighted processing, wherein the relevance and importance of different elements are determined by learned weights. The self-attention mechanism modifies the representation of the target element's position through comparison and learning processes, resulting in an enhanced expression of the information encoded within the input vector. The core of the algorithm is query, key, and value.
The structure of the self-attention mechanism is shown in Figure 7. For each element in the sequence, the self-attention mechanism calculates the similarity between that element and the other elements, and subsequently normalizes these similarities to obtain attention weights. The output of the self-attention mechanism is then obtained by performing a weighted sum of each element and its respective attention weight. The calculation process is as follows.

First, an embedding operation with parameter matrix W is applied to each input element, producing a_i. a_i serves as the input data of the attention mechanism, from which the query q_i, key k_i, and value v_i are computed. Then, the attention weight α_{i,j} between task i and resource j is calculated from q_i and k_j, scaled by d, the dimension of q and k; in the self-attention mechanism, their dimensions are the same. After the normalization operation, the attention weight is obtained and denoted by α̂_{i,j}. The final output relevance weight of task i is then calculated as the weighted sum of the values using the normalized attention weights.
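The calculation described above matches the widely used scaled dot-product form of self-attention, which can be sketched in NumPy as follows. The sketch assumes that form (dot-product scores divided by the square root of d, followed by a softmax); the matrix names `W_embed`, `W_q`, `W_k`, `W_v` and the random inputs are hypothetical, and no training is performed.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(X, W_embed, W_q, W_k, W_v):
    """Single-head self-attention over the rows of X (one row per element)."""
    A = X @ W_embed                      # embeddings a_i
    Q, K, V = A @ W_q, A @ W_k, A @ W_v  # queries, keys, values
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)        # similarity between element i and element j
    weights = softmax(scores, axis=-1)   # normalized attention weights
    return weights @ V, weights          # weighted sum of the values


rng = np.random.default_rng(0)
m, d_in, d_model, d = 5, 2, 8, 4         # 5 elements with 2 raw features each
X = rng.normal(size=(m, d_in))
W_embed = rng.normal(size=(d_in, d_model))
W_q, W_k, W_v = (rng.normal(size=(d_model, d)) for _ in range(3))
output, weights = self_attention(X, W_embed, W_q, W_k, W_v)
```

Each row of `weights` sums to one, so every output row is a convex combination of the value vectors.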

Improved Multi-Head Self-Attention Mechanism for TSP
When applying the SAM to the TSP, the input elements are the information of the tasks in the task-allocation problem. Each task includes its location, a distance matrix to other task points, and factors related to environmental obstacles. In order to capture various correlations and local features, we employ a multi-head attention mechanism. This mechanism allows for the exploration and integration of different relationships, enabling the SAM to generate a more flexible and comprehensive solution for the TSP.
In the multi-head self-attention mechanism, each task corresponds to multiple sets of q, k, and v; task 1, for example, has n such sets, where n is the number of heads. Task i is represented by P_i and T_i, the position and distance field of task i, respectively. The distance matrix κ between the tasks is then constructed, where m is the number of task points; the i-th column of the matrix represents the distance from the other tasks to task i. The linear layer input is formed with a parameter matrix W and a bias vector τ. Substituting Equations (19) and (20) into Equations (11)-(16) then gives the final output relevance weight of task i.
The pseudo-code of the improved multi-head self-attention mechanism for TSP is shown in Algorithm 2. All task points are fed into a multi-head layer to obtain the corresponding attention vectors. The correlation weights of all task points are then calculated through a single attention layer, and the task point with the largest correlation weight is taken as the output of the current iteration. The resulting model is obtained after all iterations.
Algorithm 2 Improved multi-head self-attention mechanism algorithm for TSP.
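The iteration described above (a multi-head layer followed by a single attention layer that selects the task with the largest weight) can be sketched as a greedy decoder. This is only one plausible reading of that description: the weights below are random and untrained, so the resulting order is arbitrary; Euclidean distances stand in for the obstacle-aware distance matrix κ; and all function and variable names are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_attention(A, heads):
    """Run several independent attention heads and concatenate their outputs."""
    outputs = []
    for W_q, W_k, W_v in heads:
        Q, K, V = A @ W_q, A @ W_k, A @ W_v
        outputs.append(softmax(Q @ K.T / np.sqrt(Q.shape[-1])) @ V)
    return np.concatenate(outputs, axis=-1)


def greedy_sequence(features, heads, W_q, W_k, start=0):
    """Pick, at every step, the unvisited task with the largest attention score."""
    H = multi_head_attention(features, heads)  # one attention vector per task
    m = H.shape[0]
    K = H @ W_k
    visited = np.zeros(m, dtype=bool)
    visited[start] = True
    order = [start]
    while len(order) < m:
        q = H[order[-1]] @ W_q                 # query from the current task
        scores = K @ q / np.sqrt(q.shape[0])
        scores[visited] = -np.inf              # never revisit a task
        nxt = int(np.argmax(scores))
        visited[nxt] = True
        order.append(nxt)
    return order


rng = np.random.default_rng(0)
P = rng.uniform(0, 100, size=(6, 2))           # hypothetical task positions
kappa = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)  # stand-in distances
features = np.hstack([P, kappa]) / 100.0       # position plus distance row per task
n_heads, d_head = 4, 8
heads = [tuple(rng.normal(size=(features.shape[1], d_head)) for _ in range(3))
         for _ in range(n_heads)]
W_q = rng.normal(size=(n_heads * d_head, d_head))
W_k = rng.normal(size=(n_heads * d_head, d_head))
order = greedy_sequence(features, heads, W_q, W_k)
```

Masking visited tasks guarantees that the output is a valid visiting order; the quality of that order depends entirely on how the weights are trained.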

Multi-Goal Path Planning for USV
An accurate determination of the optimized task execution sequence can be achieved using the improved SAM. Subsequently, the path planning algorithm is invoked to generate trajectories that visit each task point. In this study, the trajectory calculation prioritizes achieving the minimum distance cost. It is intuitive to consider the straight line connecting two task points as the optimal path when no obstacles exist between them. While the SAM provides initial solutions for path planning, additional enhancements are necessary to address a crucial concern, namely the possibility of encountering obstacles along the connection between two tasks.
While the evaluation function in the iterative process of the improved SAM (ISAM) algorithm considers obstacle factors, its output does not directly indicate whether obstacles lie between two adjacent tasks in the sequence. However, because the input distance matrix of the ISAM algorithm contains the distance information between all task points, the existence of obstacles between adjacent nodes can be determined through a Boolean variable B, where d(i, i + 1) is the Euclidean distance between nodes i and i + 1 and B indicates the obstruction between adjacent nodes. A value of 0 means there is no obstruction between the nodes. If B(i, i + 1) = 1, it is necessary to search for a collision-free path between the nodes. Because of the characteristics of the node distance field, the collision-free path can be calculated directly from the distance field using Equation (23).

The simulation results are shown in Figure 8b-d, which show the calculation results of the different algorithms. The total distance cost of the allocation results for the three algorithms is shown in Table 1. The simulation results show that all three algorithms generate a circular task sequence and path set with the position of the USV as the No. 1 node. When the influence of obstacles in the environment is not considered, the SAM algorithm performs best, with a total distance of 1498.6. In an attempt to minimize distance costs, both the SOM and SAM algorithms generated paths that intersected with the island in the upper left corner. Consequently, it becomes necessary to incorporate a global path search algorithm to optimize their results. In contrast, by leveraging the characteristics of the distance field, the improved SAM algorithm can use Equation (23) to optimize certain hazardous paths without the need for an additional path search algorithm. Notably, in an obstacle-filled environment, the improved SAM algorithm exhibits superior task allocation effectiveness and achieves the lowest overall distance cost, amounting to 1535.5. In conclusion, the improved SAM algorithm attains the best performance and possesses practical value.
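The obstruction indicator B and the extraction of a collision-free path from a distance field, as described in this section, can be sketched as follows. The exact definitions are not reproduced here, so the sketch makes explicit assumptions: B is set when the field distance exceeds the Euclidean distance by more than a tolerance, the path is recovered by walking back along decreasing field values, and an 8-connected Dijkstra search stands in for the LSM field. All names and the `rel_tol` parameter are hypothetical.

```python
import heapq
import math

import numpy as np

MOVES = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1) if (di, dj) != (0, 0)]


def distance_field(obstacle, source):
    """Dijkstra stand-in for the obstacle-aware distance field of one task."""
    ny, nx = obstacle.shape
    T = np.full((ny, nx), np.inf)
    T[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        t, (i, j) = heapq.heappop(heap)
        if t > T[i, j]:
            continue  # stale heap entry
        for di, dj in MOVES:
            p, q = i + di, j + dj
            if 0 <= p < ny and 0 <= q < nx and not obstacle[p, q]:
                cand = t + math.hypot(di, dj)
                if cand < T[p, q]:
                    T[p, q] = cand
                    heapq.heappush(heap, (cand, (p, q)))
    return T


def is_obstructed(T, source, goal, rel_tol=0.1):
    """B = 1 when the field distance clearly exceeds the straight-line distance."""
    return bool(T[goal] > (1.0 + rel_tol) * math.dist(source, goal))


def extract_path(T, goal):
    """Walk from the goal back to the field's source along decreasing values."""
    if not np.isfinite(T[goal]):
        raise ValueError("goal is not reachable from the source")
    ny, nx = T.shape
    path = [goal]
    while T[path[-1]] > 0.0:
        i, j = path[-1]
        nbrs = [(i + di, j + dj) for di, dj in MOVES
                if 0 <= i + di < ny and 0 <= j + dj < nx]
        # the predecessor minimizes (field value + step length)
        path.append(min(nbrs, key=lambda c: T[c] + math.dist(c, (i, j))))
    return path[::-1]


source, goal = (2, 1), (2, 7)
free = np.zeros((7, 9), dtype=bool)
walled = free.copy()
walled[0:5, 4] = True  # a wall between the two task points
T_free = distance_field(free, source)
T_wall = distance_field(walled, source)
path = extract_path(T_wall, goal)
```

The tolerance is needed because grid distances slightly overestimate straight-line distances even in open water; a field computed by the LSM would call for its own threshold. The 8-connected stand-in also allows diagonal steps past obstacle corners, which a safety-aware planner would rule out through the extended obstacle boundary.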

Simulations of Intelligent Task Allocation and Path Planning
In this section, simulations have been conducted to verify the effectiveness of the proposed algorithm in intelligent task allocation and path planning. The scenario involves an unmanned surface vehicle (USV) performing an environmental monitoring mission, where the vehicle is tasked with collecting water sampling data from multiple water monitoring stations. The selected area of interest is located near the Songhua River, China. Figure 9a displays the electronic map representing the designated area, which is then converted into a binary map of size 400 pixels × 600 pixels, as shown in Figure 9b.

In addition to fulfilling the overall objective of the environmental monitoring mission, which involves the USV visiting all water monitoring stations while minimizing distance costs, there is another crucial factor that must be considered: the impact of high and low tides. The varying tide levels have a significant effect on the dimensions of obstacles, rendering certain water sampling stations inaccessible to the USV during low tide periods. Therefore, the algorithm is required to identify the stations (or tasks) that can be executed based on the specific situation. Subsequently, it needs to allocate the tasks and plan the trajectory accordingly. This ensures the optimal execution of tasks within the constraints imposed by tide conditions.
The simulation results considering the tide effect are depicted in Figure 10. In Figure 10a, the area contains 20 water sampling stations, represented by orange stars. Among them, five stations (marked by pink dashed circles) are situated in riparian areas. During low tide periods, the USV is unable to visit these five stations due to the low water level. In this study, this issue is addressed by adjusting the sweeping scale limit (α) to accommodate the tide height effect. During the rising tide period, the reduced dimensions of the obstacle area result in a larger free space. Thus, a higher value of α is selected to enable the algorithm to explore more tasks. The results for α = 0.9 are presented in Figure 10b, where a closed tour, depicted by the light blue line, successfully covers all 20 stations (tasks). Conversely, Figure 10c illustrates the outcome with α = 0.8. Tasks situated in shallow water areas are excluded from consideration because of the extended obstacle boundary. Nevertheless, the remaining tasks can still be visited by following the closed tour depicted by the light blue line.
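The tide-dependent task filtering described above can be illustrated with a minimal sketch. It is not the LSM-based procedure of this paper: it assumes a small binary grid (1 = obstacle, 0 = water), a hypothetical linear mapping from α to an obstacle inflation margin (`max_margin` is an invented parameter), and a plain breadth-first flood fill to decide which stations remain reachable. The function names `inflate` and `reachable_stations` are likewise illustrative only.

```python
from collections import deque


def inflate(grid, margin):
    """Block every cell within `margin` cells (Chebyshev distance) of an obstacle."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1:
                for rr in range(max(0, r - margin), min(rows, r + margin + 1)):
                    for cc in range(max(0, c - margin), min(cols, c + margin + 1)):
                        out[rr][cc] = 1
    return out


def reachable_stations(grid, start, stations, alpha, max_margin=10):
    """Return the stations the vehicle can still reach for a given alpha.

    Assumption (not from the paper): a smaller alpha extends the obstacle
    boundary by round((1 - alpha) * max_margin) cells.
    """
    margin = round((1.0 - alpha) * max_margin)
    blocked = inflate(grid, margin)
    rows, cols = len(grid), len(grid[0])
    if blocked[start[0]][start[1]] == 1:
        return []
    # Flood fill over free cells, starting from the vehicle position.
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and blocked[nr][nc] == 0 and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return [s for s in stations if s in seen]


# Toy map: the top row is shoreline; one station lies close to it.
grid = [[1] * 9] + [[0] * 9 for _ in range(6)]
start = (6, 0)
stations = [(2, 4), (5, 4)]  # (2, 4) plays the role of a riparian station

high_tide = reachable_stations(grid, start, stations, alpha=0.9)
low_tide = reachable_stations(grid, start, stations, alpha=0.8)
print(high_tide)  # [(2, 4), (5, 4)]
print(low_tide)   # [(5, 4)]
```

In this toy setting the near-shore station is kept for α = 0.9 and dropped for α = 0.8, mirroring the qualitative behavior reported for Figure 10b,c; only the stations that survive the filter would then be passed to task allocation.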

Conclusions and Future Work
This paper presents a novel algorithm, based on the self-attention mechanism (SAM), to address intelligent task allocation for USVs. The operating environment of a USV usually contains various obstacles, and safety is the foremost concern. Allocating multiple tasks reasonably and effectively at the minimum distance cost while ensuring navigation safety is one of the main bottlenecks in deploying complex marine missions. To address this, the algorithm incorporates environmental obstacle factors into the input model of the SAM algorithm, using the actual navigation distance as the distance cost between nodes. Additionally, the algorithm includes a path planning module, enabling efficient optimization of hazardous paths. By employing this algorithm, tasks can be assigned so as to minimize the actual sailing distance of the USV. It also generates feasible, collision-free paths for the USV, meeting the most critical requirement in maritime navigation.

In future research, the first enhancement is to further consider the kinematic characteristics of USVs [23], as they can potentially affect the task allocation outcome. Additionally, to enhance the practical performance of the proposed approaches, they will be validated using real boats in a realistic environment [24]. The final enhancement involves extending the current algorithm to accommodate larger-scale unmanned vehicle systems. This expansion aims to enable multiple USV formations to undertake more complex missions that require extended mission durations [19].

Figure 2. The potential field constructed by LSM in a space with obstacles. (a) The sweep processes; (b) the locked state of the nodes in the configuration space. (The red dot represents the sweep initial point and the gray area represents the set of locked nodes.)

Figure 3. Example of path planning. (The red line represents the collision-free path from the blue dot to the red dot.)

the distance between the corresponding position and the obstacle. The farther the distance, the larger the value. $D_{o\_init}$ can be calculated using the expression: . The generated obstacle field is shown in Figure 4b. The brown areas in the visualization represent obstacle areas, which the USV should avoid. Conversely, lighter-colored nodes indicate safer areas, and the USV should aim to sail in these regions to ensure safer navigation.
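The idea of an obstacle field whose value grows with the distance to the nearest obstacle can be sketched as follows. This is only an illustration under stated assumptions: it uses a multi-source breadth-first search on a 4-connected grid (so values are step counts) rather than the LSM, and the function name `obstacle_distance_field` and the toy map are hypothetical.

```python
from collections import deque


def obstacle_distance_field(grid):
    """Return, for each cell, the 4-connected step count to the nearest obstacle.

    grid: list of rows, 1 = obstacle, 0 = free water.
    Cells stay None if the map contains no obstacle at all.
    """
    rows, cols = len(grid), len(grid[0])
    field = [[None] * cols for _ in range(rows)]
    queue = deque()
    # Seed the search with every obstacle cell at distance 0.
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1:
                field[r][c] = 0
                queue.append((r, c))
    # Expand outward one ring at a time; first visit gives the shortest count.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and field[nr][nc] is None:
                field[nr][nc] = field[r][c] + 1
                queue.append((nr, nc))
    return field


demo = [
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
field = obstacle_distance_field(demo)
for row in field:
    print(row)
# [1, 2, 3, 4, 5]
# [0, 1, 2, 3, 4]
# [1, 2, 3, 4, 5]
```

Larger values correspond to the lighter, safer regions of Figure 4b, while zeros mark the obstacle cells themselves.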

Figure 4. Example of the process of generating the obstacle field. (a) The simulated environment including multiple small islands; (b) the generated obstacle field.

Figure 5. Example of the process of generating constrained obstacle field. (a)

Figure 6. Example of the process of generating distance field. (a)

Figure 7. The algorithm structure of the self-attention mechanism.

Figure 8. Simulation results of intelligent task allocation and path planning when α = 0.9. (a) The simulation environment; (b) SOM + path planning algorithm; (c) SAM + path planning algorithm; (d) improved SAM algorithm. (The green line indicates the straight line without considering obstacles, and the light blue dashed line represents collision-free paths.)

Figure 9. A practical simulation environment. (a) The electronic map near Songhua River, China; (b) the converted binary map of the electronic map.

Table 1. The total distance cost of allocation results based on three algorithms.
