Ship-Collision Avoidance Decision-Making Learning of Unmanned Surface Vehicles with Automatic Identification System Data Based on Encoder-Decoder Automatic-Response Neural Networks

Abstract: Intelligent unmanned surface vehicle (USV) collision avoidance is a complex inference problem based on the current navigation status. It requires simultaneous processing of the input sequences and generation of the response sequences. The automatic identification system (AIS) encounter data mainly comprise the time-series data of two AIS sets, which exhibit a one-to-one mapping relation. Herein, an encoder-decoder automatic-response neural network is designed and implemented based on the sequence-to-sequence (Seq2Seq) structure to simultaneously process the two AIS encounter trajectory sequences. Furthermore, this model is combined with bidirectional long short-term memory recurrent neural networks (Bi-LSTM RNN) to obtain a network framework that processes the time-series data and derives ship-collision avoidance decisions from big data. The encoder-decoder neural networks were trained on the AIS data obtained in 2018 from Zhoushan Port to achieve ship collision avoidance decision-making learning. The results indicated that the encoder-decoder neural networks can be used to effectively formulate the sequence of collision avoidance decisions of the USV. Thus, this study significantly contributes to the increased efficiency and safety of maritime transportation. The proposed method can potentially be applied to USV technology and intelligent collision-avoidance systems.


Introduction
With the increasing popularity of automatic identification system (AIS) equipment and the development of shore-based and spaceborne AIS equipment, AIS data have become a popular data source for big data analysis and machine learning in the marine industry, and several studies have investigated them. The maturity of machine learning and big data mining algorithms has produced novel solutions to long-standing problems. Currently, researchers can easily access and analyze AIS data to improve the intelligence of maritime autonomous surface ships (MASSs). Extracting seafarers' collision-avoidance experience from the massive AIS data is particularly important; therefore, big-data technologies urgently need methodologies that meet this requirement. The development of intelligent/unmanned surface vehicles is becoming the trend in the shipbuilding industry. Therefore, intelligent collision avoidance technologies require repeated training to activate the deep network units. The process includes obtaining more data samples, separately obtaining the best part of each sample, continuously integrating and optimizing decision-making, and eventually surpassing the quality of the decisions in the samples. However, this process is always based on the anthropomorphic characteristics of the training samples, so the resulting behavior at sea does not conflict with the actual navigation rules. This study uses the AIS trajectory data obtained in 2018 from Zhoushan Port (Figure 1) as the source for identifying the encounter data and extracting the handling-behavior data of ships. The flexible multi-sequence mapping samples of the ship encounters are obtained as Seq2Seq training data for the encoder-decoder machine learning.
The extraction of successful ship collision-avoidance cases from the AIS big data can be divided into two steps: (a) ship AIS-encounter-trajectory-data identification and (b) ship key feature-point (KFP) extraction from the ship-encounter trajectories.

Trajectory Data Identification of Ship Encounters
Currently, there are two quantification sources based on which the ship-encounter azimuthal map can be obtained.
(1) The radar-based collision avoidance steering diagram is mainly used to provide guidance to the ship's pilot when a target ship is detected solely by the radar and the pilot cannot see the target ship [20,21]. The sector-angle division is as follows: 030°-90°-150°-210°-292.5°-330° (Figure 2).
(2) The International Regulations for Preventing Collisions at Sea (COLREGS) define the azimuth of the target ship (005°-112.5°-247.5°-355°; Figure 3). Rule 13 states: A vessel shall be deemed to be overtaking when coming up with another vessel from a direction more than 22.5 degrees abaft her beam. Rule 14 states: When two power-driven vessels are meeting on reciprocal or nearly reciprocal courses so as to involve risk of collision each shall alter her course to starboard so that each shall pass on the port side of the other.
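The COLREGS azimuth division quoted in item (2) can be illustrated with a short Python sketch. This is a minimal sketch rather than the authors' implementation: the function names, the sector labels, and the convention that positions are (east, north) pairs with headings measured clockwise from north are assumptions made here, and the study's 12 encounter patterns additionally depend on heading and speed differences that are not modelled.

```python
import math

def relative_bearing(own_pos, own_heading_deg, target_pos):
    """Bearing of the target measured clockwise from the own ship's bow, in [0, 360)."""
    dx = target_pos[0] - own_pos[0]  # east offset
    dy = target_pos[1] - own_pos[1]  # north offset
    true_bearing = math.degrees(math.atan2(dx, dy)) % 360.0
    return (true_bearing - own_heading_deg) % 360.0

def encounter_sector(rel_bearing_deg):
    """Map a relative bearing to one of the four sectors 005-112.5-247.5-355.

    The labels are hypothetical names chosen for this sketch.
    """
    b = rel_bearing_deg % 360.0
    if b >= 355.0 or b < 5.0:
        return "ahead (head-on sector)"
    if b < 112.5:
        return "starboard (crossing sector)"
    if b <= 247.5:
        return "astern (overtaking sector)"
    return "port (crossing sector)"

if __name__ == "__main__":
    own = (0.0, 0.0)
    for target in [(0.0, 1.0), (1.0, 0.0), (0.0, -1.0), (-1.0, 0.0)]:
        rb = relative_bearing(own, 0.0, target)
        print(target, round(rb, 1), encounter_sector(rb))
```

The boundaries 112.5° and 247.5° correspond to 22.5° abaft the starboard and port beams, which matches the wording of Rule 13.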

The ship collision-avoidance behavioral modes are established from the encounter azimuthal maps of two ships approaching collision, together with the distance to the closest point of approach (DCPA), the time to the closest point of approach (TCPA), the distance between the ships, and the differences in heading and speed (Table 1). Subsequently, the appropriate training data are extracted based on the ship-encounter pattern (12 types).

The DCPA and TCPA are calculated from the relative motion of the two ships. Here, $V_{xO}$ and $V_{yO}$ are the components of the own ship's speed vector on the x- and y-axes, respectively, $V_{xT}$ and $V_{yT}$ are the components of the target ship's speed vector on the x- and y-axes, respectively, $V_R$ is the relative speed, $\phi_R$ is the relative course, and $\alpha$ is the angle compensation coefficient.
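The role of these quantities can be illustrated with the standard vector formulation of the closest point of approach. The sketch below is not the paper's own set of equations: it assumes a flat local frame with x pointing east and y pointing north, courses measured clockwise from north, and it uses `atan2` to resolve the quadrant of the relative course, which is presumably what the angle compensation coefficient does in the original formulas. The function and variable names are chosen here for illustration.

```python
import math

def velocity_components(speed, course_deg):
    """Split a speed into x (east) and y (north) components."""
    course = math.radians(course_deg)
    return speed * math.sin(course), speed * math.cos(course)

def relative_motion(own_speed, own_course_deg, tgt_speed, tgt_course_deg):
    """Return relative speed V_R and relative course phi_R (degrees) of the target."""
    v_xo, v_yo = velocity_components(own_speed, own_course_deg)
    v_xt, v_yt = velocity_components(tgt_speed, tgt_course_deg)
    v_xr, v_yr = v_xt - v_xo, v_yt - v_yo
    v_r = math.hypot(v_xr, v_yr)
    phi_r = math.degrees(math.atan2(v_xr, v_yr)) % 360.0
    return v_r, phi_r

def dcpa_tcpa(own_pos, own_speed, own_course_deg, tgt_pos, tgt_speed, tgt_course_deg):
    """Return (DCPA, TCPA); TCPA is negative if the closest approach is already past."""
    v_xo, v_yo = velocity_components(own_speed, own_course_deg)
    v_xt, v_yt = velocity_components(tgt_speed, tgt_course_deg)
    v_xr, v_yr = v_xt - v_xo, v_yt - v_yo
    r_x, r_y = tgt_pos[0] - own_pos[0], tgt_pos[1] - own_pos[1]
    v_r_sq = v_xr ** 2 + v_yr ** 2
    if v_r_sq < 1e-12:
        # No relative motion: the range never changes.
        return math.hypot(r_x, r_y), 0.0
    tcpa = -(r_x * v_xr + r_y * v_yr) / v_r_sq
    dcpa = math.hypot(r_x + v_xr * tcpa, r_y + v_yr * tcpa)
    return dcpa, tcpa

if __name__ == "__main__":
    # Positions in nautical miles and speeds in knots give TCPA in hours.
    print(dcpa_tcpa((0.0, 0.0), 10.0, 0.0, (1.0, 10.0), 10.0, 180.0))
```

With consistent units (for example nautical miles and knots), the DCPA is returned in the distance unit and the TCPA in hours.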
The ship-encounter situation data were extracted based on the AIS trajectories of the ship, as shown in Figure 5.

Figure 5. The AIS trajectory data in ship-encounter situations (panels: head-on situation, crossing situation, overtaking situation).

Key Feature-Point Extraction from the Ship-Encounter Trajectories
In a meeting situation, the ship-handling behaviors are hidden in the trajectory, and the points that reflect them account for only 15% of the overall trajectory points [22]. Therefore, the overall ship trajectory contains a large amount of redundant data. In this study, the KFPs are extracted from the AIS encounter trajectory to recognize the ship-handling behavior and to serve as training data for the encoder-decoder neural networks.
We improved the multiple-scale sliding window algorithm by taking the ship drift angle into account when performing KFP extraction. The sliding window algorithm can be used to filter the sub-elements of an array sequence. It converts the original nested-loop problem into a single-pass loop, which considerably reduces the processing time for big data. The receptive field of the sliding window in the AIS trajectory KFP extraction algorithm constructed in this study is a rectangle that always contains exactly three trajectory points, and at each step the algorithm only judges whether the second point in the receptive field is retained as a feature point of the trajectory. We combined the principle of the compression algorithm, the ship drift angle deviation, the position deviation, and the AIS spatiotemporal characteristics to improve the performance of the online sliding window KFP extraction algorithm.
The KFP extraction process is illustrated in Figures 6 and 7. For more details, please refer to our previous study [23].
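The three-point window described above can be sketched in Python as follows. This is a simplified illustration, not the algorithm of [23]: the thresholds `max_dc` and `max_s`, the `AisPoint` record, and the exact definitions of the course difference and of the spatiotemporal distance (taken here as the distance between the middle point and its time-interpolated position on the segment joining the outer points) are assumptions made for this example.

```python
import math
from typing import NamedTuple

class AisPoint(NamedTuple):
    t: float    # timestamp in seconds
    x: float    # easting in metres
    y: float    # northing in metres
    cog: float  # course over ground in degrees

def angle_diff(a, b):
    """Smallest absolute difference between two bearings, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)

def bearing(p, q):
    """Bearing from p to q, clockwise from north, in [0, 360)."""
    return math.degrees(math.atan2(q.x - p.x, q.y - p.y)) % 360.0

def spatiotemporal_distance(p1, p2, p3):
    """Distance between P2 and P2', the position interpolated on P1->P3 at P2's time."""
    span = p3.t - p1.t
    ratio = (p2.t - p1.t) / span if span > 0 else 0.0
    x = p1.x + ratio * (p3.x - p1.x)
    y = p1.y + ratio * (p3.y - p1.y)
    return math.hypot(p2.x - x, p2.y - y)

def extract_kfps(track, max_dc=5.0, max_s=50.0):
    """Single pass over the track; each window is (last kept point, candidate, next point)."""
    if len(track) < 3:
        return list(track)
    kept = [track[0]]
    anchor = track[0]
    for mid, nxt in zip(track[1:], track[2:]):
        dc = angle_diff(bearing(anchor, mid), anchor.cog)
        s = spatiotemporal_distance(anchor, mid, nxt)
        if dc > max_dc or s > max_s:
            kept.append(mid)   # the middle point is a key feature point
            anchor = mid       # the window restarts from the retained point
    kept.append(track[-1])     # always keep the end of the trajectory
    return kept

if __name__ == "__main__":
    track = [AisPoint(0, 0, 0, 0), AisPoint(10, 0, 100, 0), AisPoint(20, 0, 200, 0),
             AisPoint(30, 0, 300, 90), AisPoint(40, 100, 300, 90), AisPoint(50, 200, 300, 90)]
    for p in extract_kfps(track):
        print(p)
```

In this example the ship sails north and then turns east; only the start point, the turning point, and the end point are retained, which is the kind of compression the section describes.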


KFP extraction flowchart (Figures 6 and 7), recoverable steps: AIS data flow; input three points {P1, P2, P3}; calculate the difference ΔC between the relative azimuth angle C1 and the course C0 of P1; calculate the spatiotemporal distance S between P2 and P2'.
The sliding window algorithm was utilized to extract KFPs from the ship-encounter situation data presented in Section 2.1. Figures 8 and 9 show a single ship trajectory before and after extraction. The ship-handling-behavior data were extracted, as shown in Figure 10.

Encoder-Decoder Automatic-Response Neural Networks
In traditional ship-collision avoidance research, a ship collision-avoidance behavior pattern library has to be established in advance; this library must contain most of the encounter patterns at sea and is stored in a database. Subsequently, the computer matches the current navigation status with the historical data and generates the corresponding ship collision-avoidance decision from the candidate set. This method has four shortcomings: (a) it requires a large database capacity, (b) the current encounter situation must exist in the historical data, (c) it needs strong database search capabilities, and (d) it cannot distinguish the quality of the data source. The optimality of the collision-avoidance decision is considerably limited by the quality of the data source; the quality of the decision cannot exceed that of the data source itself. The computer only uses its storage and indexing functions during the collision-avoidance decision-making process and does not fully utilize the advantages of artificial intelligence. Therefore, this approach is better described as automatic, rather than intelligent, collision avoidance. An intelligent collision-avoidance decision-making system should be able to learn, analyze the current state, and draw on the collision-avoidance mechanism learned from the ship-encounter data when a situation not recorded in the database is encountered. In addition, it must make reasonable decisions similar to those made by the officers on board.


Sequence-to-Sequence (Seq2Seq) Model
The Seq2Seq model is a translation-like response model that maps one sequence to another sequence using two recurrent neural networks (RNNs). The structure of the Seq2Seq model is presented in Figure 11.

The Seq2Seq model implements step-by-step encoding and decoding of the input and output sequences through a semantic vector state C, which can completely retain all the information from the input sequence to the output sequence.
(1) Encoder stage: The encoder converts a variable-length input sequence into a fixed-length vector. It obtains the output of each hidden layer, summarizes these outputs, and generates a state vector $C$ after a series of non-linear transformations. Here, $t$ is the number of KFPs in the input sequence, and $E_1, \dots, E_t$ denote the input data of the target ship trajectory. The output $h_t$ of the last hidden layer can also be used as the semantic vector $C$.
(2) Decoder stage: In the decoder, the output sequence $Y_2, \dots, Y_{t-1}$ is the own-ship response sequence for the target ship, and the fixed semantic vector $C$ is used to predict the next output $Y_t$.
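For orientation, the two stages can be written in the generic form commonly used for RNN encoder-decoder models. This is the textbook formulation, given as an assumption about the general shape of the model rather than as the paper's own equations; the functions $f$, $q$, $g$, $o$ and the decoder state $s_k$ are notation introduced here.

```latex
\begin{align}
h_k &= f\left(E_k,\, h_{k-1}\right), \qquad k = 1, \dots, t, \\
C   &= q\left(h_1, \dots, h_t\right) \quad \text{(in the simplest case } C = h_t\text{)}, \\
s_k &= g\left(Y_{k-1},\, s_{k-1},\, C\right), \\
Y_k &= o\left(s_k,\, Y_{k-1},\, C\right).
\end{align}
```

The first two lines correspond to the encoder stage, in which the hidden states of the target-ship sequence are summarized into $C$. The last two lines correspond to the decoder stage, in which each own-ship output depends on the previous output, the decoder state, and the fixed vector $C$.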
With the continuous evolution of neural networks, their structures have become increasingly complex, so that they can accommodate complex data structures, simplify the preprocessing of the input data, and keep the input information more complete. In this study, the overall target-ship KFP sequence is considered the input and the overall own-ship KFP sequence is considered the output. The encoder-decoder network encodes the target ship sequence to obtain a state vector and then decodes it to generate the corresponding collision avoidance decision sequence (Figure 12).
Figure 12. Conversion of the Seq2Seq model.

Of the two RNNs, the first takes the time-series data sequence as input and generates a state vector, whereas the second generates a time-series data sequence from that state vector. The two networks cooperate with each other to learn ship collision-avoidance decisions.

Bidirectional Long Short-Term Memory Recurrent Neural Networks (Bi-LSTM RNN) Structure
The AIS data are standard time-series data, and the KFP sequence extracted from a trajectory is likewise treated as time-series data. RNNs are well suited to processing time-series data in neural networks. Therefore, the encoder and the decoder in this study are each an independent Bi-LSTM RNN. The forget gate in the LSTM unit [24-26] (Figure 13) controls how much historical information is stored, so the network remembers the common behavioral patterns and forgets unique behaviors, which improves the generality of the model and makes it more stable and effective as data are input. Building on the RNN, this study adds LSTM units to solve the vanishing gradient problem and introduces a bidirectional structure to enhance the contextual correlation [27].
Figure 13. Long short-term memory (LSTM) unit structure.
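The gate mechanism and the bidirectional pass can be made concrete with a deliberately small, pure-Python sketch. It uses a scalar input and a scalar hidden state with hand-picked weights, so it only illustrates the standard LSTM update and how a bidirectional layer pairs a forward and a backward pass; it is not the TensorFlow network trained in this study, and every name and weight value in it is an assumption made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar LSTM cell; w maps a gate name to (w_x, w_h, bias)."""
    def gate(name, activation):
        w_x, w_h, b = w[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)   # how much of the old cell state is kept
    i = gate("input", sigmoid)    # how much of the candidate is written
    o = gate("output", sigmoid)   # how much of the cell state is exposed
    g = gate("cand", math.tanh)   # candidate cell content
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c

def run_lstm(seq, w):
    """Run the cell over a sequence from zero initial state; return all hidden states."""
    h, c, outputs = 0.0, 0.0, []
    for x in seq:
        h, c = lstm_step(x, h, c, w)
        outputs.append(h)
    return outputs

def run_bilstm(seq, w_fwd, w_bwd):
    """Pair the forward pass with a backward pass re-aligned to the input order."""
    forward = run_lstm(seq, w_fwd)
    backward = run_lstm(seq[::-1], w_bwd)[::-1]
    return list(zip(forward, backward))

if __name__ == "__main__":
    # Hypothetical weights, chosen only so that the example runs.
    weights = {"forget": (0.5, 0.1, 1.0), "input": (0.8, 0.2, 0.0),
               "output": (0.3, 0.4, 0.0), "cand": (1.0, 0.5, 0.0)}
    for pair in run_bilstm([0.1, 0.4, -0.2, 0.4, 0.1], weights, weights):
        print(pair)
```

Each output pair combines what the forward pass has seen up to that point with what the backward pass has seen after it, which is the sense in which the bidirectional structure adds context.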
Finally, the Seq2Seq model and Bi-LSTM RNN are combined to construct the encoder-decoder, as shown in Figure 14.

Results
The encoder-decoder was implemented in Python within the machine-learning framework of Google's TensorFlow. The overall network training structure, which was automatically generated via TensorBoard, is depicted in Figure 15.

Figure 16 shows the overall process of identifying the encounter data from the AIS data and extracting the KFPs. As shown in Figure 16, the ship encounter data obtained by screening are disassembled into the trajectory of the target ship and the collision avoidance decision corresponding to this ship, which are used as the input values of the encoder-decoder for training. Finally, the data are converted into training data for the encoder-decoder.

From 20 sets of training data (Figure 17), it was found that the encoder-decoder-based automatic-response system has great potential for reasoning about collision avoidance decisions. Figure 18 (1) shows a crossing encounter situation whose trajectory data were extracted from the AIS data obtained from Zhoushan Port, and Figure 18 (2) shows the KFPs extracted using the sliding window algorithm. Figure 18 (3) shows the normalized training data, and Figure 18 (4) shows the training data after sequence decomposition and translation-rotation transformation. Figure 18 shows the training and learning processes.
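The normalization and translation-rotation steps mentioned for Figure 18 can be illustrated with a small Python sketch. The paper does not specify the reference frame or the scaling, so the choices below are assumptions: both tracks are shifted so that the own ship's first point is the origin, rotated so that the own ship's initial direction of travel points along +y, and then divided by the largest absolute coordinate. The function names are hypothetical.

```python
import math

def to_own_ship_frame(own_track, target_track):
    """Translate and rotate both tracks into a frame attached to the own ship's start.

    Tracks are lists of (x, y) points with x east and y north; the own track
    needs at least two distinct points to define the initial heading.
    """
    ox, oy = own_track[0]
    dx, dy = own_track[1][0] - ox, own_track[1][1] - oy
    theta = math.atan2(dx, dy)  # initial heading, clockwise from north
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    def transform(track):
        out = []
        for x, y in track:
            rx, ry = x - ox, y - oy
            # Counter-clockwise rotation by theta maps the heading onto +y.
            out.append((rx * cos_t - ry * sin_t, rx * sin_t + ry * cos_t))
        return out

    return transform(own_track), transform(target_track)

def normalize(*tracks):
    """Scale all tracks by the same factor so every coordinate lies in [-1, 1]."""
    scale = max(abs(v) for track in tracks for point in track for v in point) or 1.0
    return [[(x / scale, y / scale) for x, y in track] for track in tracks]

if __name__ == "__main__":
    own = [(10.0, 10.0), (11.0, 10.0), (12.0, 10.0)]   # own ship heading east
    target = [(10.0, 12.0)]                            # target to the north, i.e., to port
    own_f, target_f = to_own_ship_frame(own, target)
    print(normalize(own_f, target_f))
```

After the transformation, a negative x coordinate means the target lies on the own ship's port side, regardless of the original heading, which makes encounters with different geographic orientations comparable.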
As shown in Figure 19, the reasoning task (task 1) in the single meeting situation shows a certain potential. The reasoning is grounded in the positional relationship between the ships, and the generative automatic-response model proposed here performs well on it. This shows that the overall structure based on the Seq2Seq model can be directly applied to structured learning, and that the Bi-LSTM RNN can learn relevant information from the ship trajectory sequence, such as position and relationship information, so that collision avoidance decisions are triggered correctly when they are generated. Figure 19 also shows that the decision learning for ship collision avoidance converges very quickly, and the effect of collision avoidance is quickly achieved. Figure 20 presents the results of each type of decision-making to avoid target ships.

Conclusions
In this study, we propose using encoder-decoder neural networks built on the Seq2Seq model to learn how to generate appropriate collision-avoidance decisions from historical successful collision-avoidance cases. The Seq2Seq model can be conveniently used to develop training data and facilitate collision-avoidance actions. Thus, the proposed method could considerably improve the optimization efficiency and make the generated trajectories more realistic. Moreover, when a system encounters an unknown situation, conventional machine learning tends to generate an averaged value without considering the authenticity of the ship navigation. The encoder-decoder relies on real data structures to generate an anthropomorphic decision, which may not be the optimal collision-avoidance solution; however, it can ensure the effectiveness and authenticity of the collision-avoidance strategy.
The proposed method is trained on big data. Furthermore, it has various applications and exhibits high versatility, achieving timely strategy generation, timely response, and enhanced collision-avoidance security. Therefore, it can be used to generate intelligent ship-collision avoidance decision-making strategies and to perform dynamic route planning and other relevant tasks.
