In this paper, we compare the performance of each model, and finally four randomly selected trajectories are used to measure the generalization ability of the model. The experimental results validate the effectiveness of the TTAG model.
Experimental Design and Analysis
- 1.
Comparison between the TTAG model and the baseline models
The processed AIS data were used in comparative experiments on the following seven models. The first 20 min of each historical track was fed in to predict the trajectory over the next 1 min, 5 min, 10 min, and 15 min, respectively. We set the batch size to 2000, the maximum number of training epochs to 500, and the learning rate to 0.001. The experimental results are shown in
Table 3 and
Table 4. In addition to the MSE of the trajectory points, we calculated the separate MSE values for latitude and longitude. The experimental conclusions are as follows.
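The per-coordinate error described above can be sketched as follows (a minimal illustration with hypothetical function and variable names; the paper's exact averaging convention over points may differ):

```python
def mse(pred, real):
    """Mean squared error over two equal-length lists of scalars."""
    return sum((p - r) ** 2 for p, r in zip(pred, real)) / len(pred)

def trajectory_mse(pred_pts, real_pts):
    """Return (overall, latitude, longitude) MSE for trajectories given
    as lists of (lat, lon) pairs; 'overall' averages the two coordinates."""
    lat_mse = mse([p[0] for p in pred_pts], [r[0] for r in real_pts])
    lon_mse = mse([p[1] for p in pred_pts], [r[1] for r in real_pts])
    return (lat_mse + lon_mse) / 2, lat_mse, lon_mse
```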
(1) For the LSTM and GRU models, the table shows that their prediction accuracy is similar when predicting the trajectory of the next 1 min, but both are lower than that of the other models. In particular, the performance of the LSTM model degrades rapidly as the number of predicted points increases. A possible reason is that the LSTM model is not well suited to the data in the scenario presented in this paper.
(2) Both BiGRU and BiLSTM are bidirectional networks. Unlike a unidirectional network, which can only access past information, a bidirectional network can access past and future information simultaneously, which makes its predictions more accurate. According to our experimental results, when the BiGRU and BiLSTM models predict the trajectory of the next 1 min, their loss values are somewhat lower than those of the GRU and LSTM models, and as the number of predicted points increases, the loss of the bidirectional networks rises more gently.
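The bidirectional idea can be illustrated with a toy recurrence (this is our sketch, not the paper's implementation: an exponential moving average stands in for a GRU cell, and the two passes are concatenated per step):

```python
def scan(seq, alpha=0.5):
    """Toy recurrent pass: exponential moving average as a stand-in
    for a GRU cell. Returns the hidden state at every step."""
    h, out = 0.0, []
    for x in seq:
        h = alpha * h + (1 - alpha) * x
        out.append(h)
    return out

def bidirectional(seq, alpha=0.5):
    """Run the toy cell forward and backward over the sequence;
    each step then carries both past (fwd) and future (bwd) context."""
    fwd = scan(seq, alpha)
    bwd = list(reversed(scan(list(reversed(seq)), alpha)))
    return list(zip(fwd, bwd))
```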
(3) Compared with the recurrent neural networks, the accuracy of the TCN model in predicting the trajectory of the next 1 min is greatly improved, because the TCN has a strong temporal memory. However, as the number of predicted points increases, its performance declines rapidly. We repeated the experiment many times and obtained essentially the same results. A possible reason for this rapid deterioration is that the TCN model in this experiment still extracts insufficient data features, so the TCN network may be better suited to very short-term prediction.
(4) The TTCN model is the improved TCN model proposed in this paper, and the data in the table show that its performance exceeds that of the traditional TCN model. The accuracy of the traditional TCN model decreases rapidly as the prediction time grows; the accuracy of the TTCN model also decreases, but its MSE remains smaller than that of the bidirectional BiGRU and BiLSTM networks when predicting the trajectory of the next 5 min, and the TTCN model outperforms both the TCN and GRU models when predicting the trajectory of the next 15 min. These results show that our improvement on the TCN is successful.
(5) The TTAG model is the model built in this paper around the TTCN network. On top of the TTCN model, an attention mechanism and a GRU network are added to further extract key data features and thereby improve the prediction accuracy. The experimental results show that the TTAG model achieves the lowest loss of all models, whether predicting the trajectory of the next 1 min or the next 15 min.
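The attention step between the TTCN features and the GRU can be sketched as simple dot-product attention (a minimal sketch assuming the final timestep acts as the query; the paper's scoring function and dimensions may differ):

```python
import numpy as np

def attention_pool(features):
    """Minimal dot-product attention over TTCN-style outputs.
    features: (T, D) array of per-timestep feature vectors.
    Scores each timestep against the last one, softmaxes the scores,
    and returns the attention-weighted context vector of shape (D,)."""
    query = features[-1]                  # final step as the query
    scores = features @ query             # (T,) similarity scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()              # softmax normalization
    return weights @ features             # weighted context vector
```

In the full model, a context vector like this would be combined with the GRU input at each decoding step, letting the recurrent part focus on the most informative extracted features.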
Figure 9 shows the loss value of each model as a function of the number of predicted points. Because the LSTM model performs markedly worse than the others in this section, we omitted its loss curve so that the comparison among the remaining six models is easier to see. The figure shows intuitively that the loss of the TTAG model is smaller than that of the other models; although its loss still rises as the number of predicted points increases, the curve rises more gently than those of the baseline models. The experimental results in this section verify the validity of the TTAG model.
- 2.
Comparison experiment of TTCN-Attention-X models with different architectures
As can be seen from the results of
Table 3 and
Table 4, the TTAG model exhibits clear advantages over the traditional recurrent neural networks. We now change the internal structure of the TTAG model and construct TTCN-Attention-X models with different architectures for comparison, where X is another recurrent network, such as LSTM. Through these comparative experiments, we want to know whether all models with the TTA-X architecture perform well on trajectory sequence prediction. Because training bidirectional networks is time-consuming, only three models are compared in this experiment: TTCN-Attention-GRU (TTAG), TTCN-Attention-LSTM (TTAL), and TTCN-Attention-RNN (TTAR). Experimental results are shown in
Table 5 and
Table 6.
As can be seen in the table, the loss of the TTAL model when predicting the trajectory of the next 1 min differs little from that of the TTAG model. As the prediction time increases, the performance of the TTAL model decreases significantly, while the accuracy of the TTAR model is lower still. The results show that the performance of the TTA-X architecture depends to some extent on the performance of the X network: the RNN has poor long-term memory, so the TTAR model performs poorly, and the TTAL model is weaker than the TTAG model, possibly because the GRU network is better suited to the scenario in this paper.
- 3.
Effect of expansion coefficient and convolution kernel size on TTAG model
Unlike recurrent neural networks, convolutional neural networks have a receptive field. A larger receptive field means the network can use more context information, but larger is not always better: enlarging the receptive field tends to increase training time, and the data-fitting effect does not necessarily improve as the model becomes more complex. In this section, different receptive fields are obtained by setting different expansion coefficients (dilation factors), and each group is compared experimentally.
Table 7 shows the experimental results. Owing to the limited experimental environment, after fixing the two dilated causal convolution layers with the smaller expansion coefficients in the TTCN model, we experimented on the dilated causal convolution layer with the largest expansion coefficient, and measured only the MSE for the next 1 min. As the table shows, as the list of expansion coefficients grows from [1, 2, 4, 8] to [1, 2, 4, 8, 16, 32], the prediction accuracy of the model keeps rising, but beyond that point the performance begins to decline. The table also shows that the average training time of the model increases significantly as the receptive field grows. With the expansion coefficients fixed, we also carried out comparative experiments on the convolution kernel size. It can be seen from
Table 8 that the model achieves its best performance with a kernel size of 3; as the kernel grows, the model becomes more complex but the accuracy decreases.
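The trade-off between expansion coefficients and receptive field can be made concrete: for a stack of dilated causal convolutions with stride 1, each layer extends the receptive field by (kernel_size − 1) × dilation. A small sketch (note that TCN residual blocks that apply two convolutions per level would roughly double the per-level term):

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in input timesteps) of stacked dilated causal
    convolutions with stride 1 and one convolution per level: each
    level adds (kernel_size - 1) * dilation extra input steps."""
    return 1 + (kernel_size - 1) * sum(dilations)
```

For a kernel size of 3, the list [1, 2, 4, 8] yields a receptive field of 31 steps, while [1, 2, 4, 8, 16, 32] yields 127, which illustrates why training time grows quickly with the coefficient list.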
- 4.
Ablation experiment
In the previous experiments, we verified the validity of the TTAG model proposed in this paper. To observe the role of each module in the TTAG model, an ablation experiment is carried out in this section: different modules are removed in turn, yielding the TTCN-Attention, TTCN-GRU, and Attention-GRU networks.
Table 9 shows the experimental results; as above, we measured only the trajectory MSE for the next 1 min. As the table shows, removing any module reduces the model's accuracy. When the TTCN module is excluded, the model performs only on par with the baseline models, indicating that the TTCN network greatly improves the performance of the TTAG model. When the attention module or the GRU module is excluded, performance also drops significantly. According to the table, the contribution of each module to overall performance is ordered as: TTCN module > attention module > GRU module. The ablation experiments show that every module in the TTAG model is indispensable and genuinely improves the accuracy of the overall model. Building a more complex model is not necessarily better; only by letting each module play its full role and attending to the feature information that strongly influences the predicted value can a better prediction effect be achieved.
- 5.
Model generalization
To verify the generalization ability of the TTAG model, four different trajectories were randomly selected for prediction. Except for the first trajectory, the other three include curved segments; the four segments contain 100, 100, 274, and 588 track points, respectively.
Because a ship's trajectory during low-speed sailing is prone to clutter, the four selected segments are all from medium- and high-speed sailing stages. We define an indicator, hit, and an error threshold: if the average distance between the predicted and real track points is less than the threshold, the prediction is considered successful and hit is 1; otherwise, hit is 0. The threshold is set to 250 m.
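The hit indicator can be sketched as follows, assuming great-circle (haversine) distance between predicted and real points; the paper does not state which distance metric it uses, so this is an illustrative choice:

```python
import math

def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points
    given in degrees, using a mean Earth radius of 6,371,000 m."""
    R = 6371000.0
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * R * math.asin(math.sqrt(a))

def hit(pred_pts, real_pts, threshold_m=250.0):
    """Return 1 if the average predicted-vs-real point distance is
    below the threshold (250 m in this paper), else 0."""
    avg = sum(haversine_m(p, r) for p, r in zip(pred_pts, real_pts)) / len(pred_pts)
    return 1 if avg < threshold_m else 0
```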
According to
Table 10, it can be seen that the
hit value of all four trajectories is 1, indicating that the deviations fall within a reasonable error range, and that the MSE value of trajectory 1 is the smallest.
Figure 10 shows scatter plots of the predicted and real values of the four trajectories. It can be intuitively seen from the figure that the predicted values coincide closely with the real values, for both the two shorter trajectories and the two longer ones.