# Incorporation of Deep Kernel Convolution into Density Clustering for Shipping AIS Data Denoising and Reconstruction

## Abstract


## 1. Introduction

- Question 1: How can noisy, redundant, and abnormal data in big AIS data be handled accurately in both large and small water areas?
- Question 2: How can the trajectory of each ship be reconstructed after data denoising?

- (1) Development of a systematic framework that enables rational AIS data denoising, trajectory extraction, and reconstruction.
- (2) Incorporation of deep kernel convolution and density clustering into the process of AIS data denoising.
- (3) Application of the piecewise cubic spline interpolation method to trajectory reconstruction, in which the position and speed of ships are taken into account in the interpolation process.
- (4) Implementation of experiments to verify the effectiveness of the proposed methodology in both large and small waterways.

## 2. Literature Review

#### 2.1. Research on Denoising Based on AIS Data Features

#### 2.2. Research on Denoising Based on Clustering

#### 2.3. Research on Denoising Based on Deep Learning

## 3. Methodology

#### 3.1. The Proposed Framework

#### 3.2. A New DBSCANDKC Method

**Definition 1.**

**Definition 2.**

**Definition 3.**

#### 3.3. The Proposed Methodology

Algorithm 1: DBSCANDKC | |
---|---|
Input | Raw AIS trajectory dataset $data$ and density threshold $MinPts$ |
Output | The reconstructed trajectory dataset $Trdataset$ |
Step 1 | Get the ship AIS dataset $data1\leftarrow data(MMSI,Timestamp)$ |
Step 2 | Delete obviously abnormal data points and obtain the dataset $data2\leftarrow data1\cap kinematicfeatures$: for ${P}_{i}$ in $data1$: if ${P}_{i}.lon\in [-{180}^{\circ},{180}^{\circ}]\wedge {P}_{i}.lat\in [-{90}^{\circ},{90}^{\circ}]\wedge {P}_{i}.speed\in [1,40]\wedge {P}_{i}.course\in [{0}^{\circ},{360}^{\circ}]\wedge d\in [0.05,1]$ then $data2.reserve({P}_{i})$ else $data2.noise({P}_{i})$ end if end for |
Step 3 | Grid meshing and generation of the density matrix $DM{1}_{N\times M}$: for ${P}_{i}$ in $data2$: $DM1(i,j)\leftarrow {P}_{i}.lon\cap {P}_{i}.lat$ end for |
Step 4 | Calculate the new density matrix $DM{2}_{N\times M}\leftarrow Deepconvolution(DM{1}_{N\times M})$ |
Step 5 | $data3\leftarrow \{DM2(i,j),MinPts\}$: for $DM2(i,j)$ in $DM{2}_{N\times M}$: if $DM2(i,j)<MinPts$ then $data3.noise(DM2(i,j))$ else $data3.reserve(DM2(i,j))$ end if end for |
Step 6 | Ship trajectories $data4\leftarrow \{data3,MMSI,timestamp\}$ |
Step 7 | Reconstruct the trajectory data $Trdataset\leftarrow data4(cubicspline)$: for ${P}_{i},{P}_{i+1}$ in $data4$: if $\lvert {P}_{i+1}.time-{P}_{i}.time\rvert >10\ \mathrm{s}$ then $Trdataset_{i}\leftarrow cubicspline({P}_{i},{P}_{i+1})$ end if end for |
Step 8 | Return the reconstructed trajectory dataset $Trdataset$ |
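Steps 1 and 6 of Algorithm 1 both organise points by MMSI and timestamp. The following is a minimal sketch of that grouping, not the authors' implementation. It assumes each AIS record is a dictionary with the hypothetical keys `mmsi` and `timestamp` (Unix seconds, UTC), and that one trajectory corresponds to one ship on one calendar day, as suggested by the phrase "the same MMSI on different days" in Algorithm 2.

```python
from collections import defaultdict
from datetime import datetime, timezone


def split_trajectories(records):
    """Group AIS records by (MMSI, UTC calendar day) and sort each group by time.

    `records` is an iterable of dicts with the hypothetical keys
    'mmsi' and 'timestamp' (Unix seconds); these field names are assumptions.
    """
    groups = defaultdict(list)
    for rec in records:
        day = datetime.fromtimestamp(rec["timestamp"], tz=timezone.utc).date()
        groups[(rec["mmsi"], day)].append(rec)
    # Order every trajectory chronologically so that consecutive points are neighbours in time.
    for points in groups.values():
        points.sort(key=lambda r: r["timestamp"])
    return dict(groups)


records = [
    {"mmsi": 218832000, "timestamp": 1514764860},  # 2018-01-01 00:01:00 UTC
    {"mmsi": 218832000, "timestamp": 1514764800},  # 2018-01-01 00:00:00 UTC
    {"mmsi": 218832000, "timestamp": 1514851200},  # 2018-01-02 00:00:00 UTC
    {"mmsi": 244554000, "timestamp": 1514764800},  # 2018-01-01 00:00:00 UTC
]
trajectories = split_trajectories(records)
```

Keying on the pair (MMSI, day) keeps the later per-trajectory steps independent of each other, so each group can be cleaned and reconstructed on its own.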

#### 3.3.1. Trajectory Preprocessing

- Ship trajectory division;

- Abnormal Data Cleaning.

Algorithm 2: Trajectory preprocessing | |
---|---|
Input: Raw AIS data $data$ | |
Output: Preprocessed ship data $data2$ | |
for ${P}_{i}\in data$: $data1\leftarrow data(MMSI,Timestamp)$ (split the raw ship AIS data) end for | |
for ${P}_{j}\in data1$: | |
if ${P}_{j}.lon\notin [-{180}^{\circ},{180}^{\circ}]$ or ${P}_{j}.lat\notin [-{90}^{\circ},{90}^{\circ}]$ | |
or ${P}_{j}.speed\notin [1,40]$ | |
or ${P}_{j}.course\notin [{0}^{\circ},{360}^{\circ}]$ | |
or $d\notin [0.05,1]$: | |
continue (discard ${P}_{j}$) | |
else | |
$data2.reserve({P}_{j})$ | |
end if | |
end for | |
return $data2$ (trajectories of the same MMSI on different days) | |
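The range checks of Algorithm 2 can be sketched as a simple predicate. This is an illustrative sketch only: the dictionary keys `lon`, `lat`, `speed`, `course`, and `d` are hypothetical field names, and `d` is taken as an already computed value because its definition and unit are not stated in this section.

```python
def is_valid(point):
    """Return True if an AIS point passes the range checks of Algorithm 2.

    The keys 'lon', 'lat', 'speed', 'course' and 'd' are hypothetical field names;
    'd' is assumed to be precomputed, since its definition is not given here.
    """
    return (
        -180.0 <= point["lon"] <= 180.0
        and -90.0 <= point["lat"] <= 90.0
        and 1.0 <= point["speed"] <= 40.0
        and 0.0 <= point["course"] <= 360.0
        and 0.05 <= point["d"] <= 1.0
    )


def preprocess(points):
    """Split points into (reserved, noise) lists, mirroring data2.reserve and data2.noise."""
    reserved, noise = [], []
    for p in points:
        (reserved if is_valid(p) else noise).append(p)
    return reserved, noise


sample = [
    {"lon": 1.5, "lat": 51.0, "speed": 12.0, "course": 45.0, "d": 0.2},
    {"lon": 1.5, "lat": 95.0, "speed": 12.0, "course": 45.0, "d": 0.2},  # latitude out of range
    {"lon": 1.5, "lat": 51.0, "speed": 0.0, "course": 45.0, "d": 0.2},   # below the speed range
]
data2, rejected = preprocess(sample)
```

Returning the rejected points alongside the reserved ones makes it easy to inspect what the filter removed, which is useful when tuning the thresholds for a different water area.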

#### 3.3.2. Data Cleaning Based on Data Features and Deep Convolution

- Mesh Division;

- Convolution kernel operation;

- Potential data cleaning.
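The mesh division and convolution kernel operation listed above can be illustrated with a small pure-Python sketch. It is a sketch under stated assumptions rather than the paper's implementation: the grid bounds, the grid size, the 3 × 3 kernel coefficients, and the zero padding at the borders are all assumptions, and only a single convolution pass is shown. The three kernels correspond to the kernel types named in the captions of Figures 6 and 7.

```python
# Commonly used 3 x 3 kernels; the exact coefficients are assumptions, not taken from the paper.
GAUSSIAN = [[1 / 16, 2 / 16, 1 / 16], [2 / 16, 4 / 16, 2 / 16], [1 / 16, 2 / 16, 1 / 16]]
MEAN = [[1 / 9] * 3 for _ in range(3)]
SHARPEN = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]


def density_matrix(points, lon_min, lon_max, lat_min, lat_max, n_rows, n_cols):
    """Count points per grid cell (rows index latitude, columns index longitude)."""
    dm = [[0] * n_cols for _ in range(n_rows)]
    for lon, lat in points:
        if not (lon_min <= lon <= lon_max and lat_min <= lat <= lat_max):
            continue  # ignore points outside the study area
        # min(...) keeps points on the upper boundary inside the last cell.
        i = min(int((lat - lat_min) / (lat_max - lat_min) * n_rows), n_rows - 1)
        j = min(int((lon - lon_min) / (lon_max - lon_min) * n_cols), n_cols - 1)
        dm[i][j] += 1
    return dm


def convolve(dm, kernel):
    """Same-size 2D filtering with zero padding for an odd-sized square kernel.

    The kernel is applied without flipping, which equals a true convolution
    for the symmetric kernels defined above.
    """
    n_rows, n_cols = len(dm), len(dm[0])
    r = len(kernel) // 2
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            acc = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < n_rows and 0 <= jj < n_cols:
                        acc += dm[ii][jj] * kernel[di + r][dj + r]
            out[i][j] = acc
    return out


# Four points in one cell and one isolated point on a 5 x 5 grid over [0, 5] x [0, 5].
points = [(2.5, 2.5)] * 4 + [(0.5, 0.5)]
dm1 = density_matrix(points, 0.0, 5.0, 0.0, 5.0, n_rows=5, n_cols=5)
dm2 = convolve(dm1, GAUSSIAN)
```

In this toy case the dense cell keeps a convolved density of 1.0 while the isolated cell drops to 0.25, which is the separation that a density threshold can then exploit.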

Algorithm 3: Potential data cleaning | |
---|---|
Input: Density matrices $DM{1}_{N\times M}$ and $DM{2}_{N\times M}$, and density threshold $MinPts$ | |
Output: Core points $data3$ | |
for $DM{2}_{ij}$ in $DM{2}_{N\times M}$: | |
if $DM{2}_{ij}\le DM{1}_{ij}$ | |
$data3.noise(DM{2}_{ij})$ | |
else | |
$data3.reserve(DM{2}_{ij})$ | |
end if | |
end for | |
return $data3$ | |
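Once a convolved density matrix is available, the cleaning step has to map the cell-level decision back to individual points. The sketch below is one possible way to do this and is not taken from the paper. It applies the threshold test of step 5 in Algorithm 1 (a cell is noise when its convolved density is below $MinPts$); the grid bounds and the example matrix are hypothetical.

```python
def clean_by_density(points, dm2, min_pts, lon_min, lon_max, lat_min, lat_max):
    """Split points into (core, noise) by the convolved density of their grid cell.

    A point is kept when the value of `dm2` at its cell is at least `min_pts`,
    following the test in step 5 of Algorithm 1. Points are assumed to lie
    inside the given bounds.
    """
    n_rows, n_cols = len(dm2), len(dm2[0])
    core, noise = [], []
    for lon, lat in points:
        i = min(int((lat - lat_min) / (lat_max - lat_min) * n_rows), n_rows - 1)
        j = min(int((lon - lon_min) / (lon_max - lon_min) * n_cols), n_cols - 1)
        (core if dm2[i][j] >= min_pts else noise).append((lon, lat))
    return core, noise


# Hypothetical convolved density matrix on a 3 x 3 grid over [0, 3] x [0, 3].
dm2 = [
    [0.25, 0.50, 0.00],
    [0.50, 3.00, 0.50],
    [0.00, 0.50, 0.25],
]
points = [(1.2, 1.4), (1.8, 1.6), (0.3, 0.2), (2.9, 2.9)]
core, noise = clean_by_density(
    points, dm2, min_pts=1.0, lon_min=0.0, lon_max=3.0, lat_min=0.0, lat_max=3.0
)
```

Because the decision is made per cell, all points that fall in the same cell are kept or removed together, so the grid resolution directly controls how fine-grained the cleaning is.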

#### 3.3.3. Trajectory Reconstruction

- Ship trajectory division;

- Determine the interpolation interval;

- Trajectory interpolation.

Algorithm 4: Trajectory reconstruction | |
---|---|
Input: Denoised AIS data $data3$ | |
Output: Reconstructed trajectory data $Trdataset$ | |
Split $data3$: $data4\leftarrow \{data3,MMSI,timestamp\}$ | |
for ${P}_{j}$ in $data4$: | |
if $\Delta t>10\ \mathrm{s}$: | |
$Trdataset_{j}\leftarrow data4(cubicspline)$ (reconstruct the trajectory data) | |
end if | |
end for | |
return $Trdataset$ | |
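The gap-filling idea of Algorithm 4 can be sketched as follows. Note the substitution: instead of the paper's cubic spline routine, whose exact formulation is not given in this section, the sketch uses a cubic Hermite segment between consecutive points, which is one form of piecewise cubic interpolation that uses both position and speed at the end points. The tuple layout `(t, x, y, vx, vy)`, the use of projected coordinates in metres, and the 10 s output step are assumptions.

```python
def hermite(p0, v0, p1, v1, dt, s):
    """Cubic Hermite value at normalised time s in [0, 1] on a segment of duration dt."""
    h00 = 2 * s**3 - 3 * s**2 + 1
    h10 = s**3 - 2 * s**2 + s
    h01 = -2 * s**3 + 3 * s**2
    h11 = s**3 - s**2
    return h00 * p0 + h10 * dt * v0 + h01 * p1 + h11 * dt * v1


def reconstruct(track, step=10.0):
    """Insert interpolated points where consecutive timestamps differ by more than `step` seconds.

    `track` is a time-sorted list of (t, x, y, vx, vy) tuples: time in seconds,
    projected position in metres, velocity components in metres per second
    (an assumed layout). Returns a list of (t, x, y) tuples.
    """
    out = []
    for a, b in zip(track, track[1:]):
        t0, x0, y0, vx0, vy0 = a
        t1, x1, y1, vx1, vy1 = b
        out.append((t0, x0, y0))
        dt = t1 - t0
        if dt > step:
            t = t0 + step
            while t < t1:
                s = (t - t0) / dt
                out.append((t, hermite(x0, vx0, x1, vx1, dt, s),
                            hermite(y0, vy0, y1, vy1, dt, s)))
                t += step
    if track:
        out.append(track[-1][:3])  # the last observed point closes the trajectory
    return out


# A ship moving along x at a constant 5 m/s, with one 30 s gap and one 5 s gap.
track = [(0, 0.0, 0.0, 5.0, 0.0), (30, 150.0, 0.0, 5.0, 0.0), (35, 175.0, 0.0, 5.0, 0.0)]
trajectory = reconstruct(track)
```

For constant-velocity motion the Hermite segment reduces to a straight line, so the two inserted points land at x = 50 m and x = 100 m. The 5 s gap is left untouched because it does not exceed the 10 s threshold.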

## 4. Experimental Results and Analysis

#### 4.1. Data Set and Experimental Design

#### 4.2. Visualisation Results of Different Kernel Functions

#### 4.3. Visualisation and Analysis of Trajectory Denoising Results in Two Research Areas

#### 4.4. Trajectory Reconstruction and Comparative Analysis of the Arctic Ocean

#### 4.5. Trajectory Reconstruction and Comparative Analysis of the Strait of Dover Waters

#### 4.6. Discussion

## 5. Conclusions

**Figure 6.** Visualisation of different kernel functions in the Arctic Ocean. (**a**) The original data; (**b**) the results with the Gaussian convolution kernel; (**c**) the results with the mean convolution kernel; (**d**) the results with the sharpening convolution kernel.

**Figure 7.** Visualisation of different kernel functions in the Strait of Dover waters. (**a**) The original data; (**b**) the results after the Gaussian convolution kernel; (**c**) the results after the mean convolution kernel; (**d**) the results after the sharpening convolution kernel.

**Figure 8.** Comparison of denoising and reconstruction results in the Arctic Ocean. (**a**) The results after simple data preprocessing; (**b**) the results after the deep convolution operation; (**c**) the reconstructed trajectories.

**Figure 9.** Comparison of denoising and reconstruction effects in the Strait of Dover waters. (**a**) Data preprocessing; (**b**) data cleaning; (**c**) trajectory reconstruction.

**Figure 10.** Comparison of denoising effects of MMSI 218832000. (**a**) Data preprocessing; (**b**) data cleaning; (**c**) trajectory reconstruction.

**Figure 12.** Comparison of denoising effects of MMSI 316025029. (**a**) The raw AIS trajectory; (**b**) the results after simple data preprocessing; (**c**) the point data before reconstruction; (**d**) the reconstructed trajectories (different colours represent the ship trajectories on 23 days).

**Figure 14.** Comparison of denoising effects of MMSI 220002000. (**a**) Data preprocessing; (**b**) data cleaning; (**c**) trajectory reconstruction.

**Figure 16.** Comparison of denoising effects of MMSI 244554000. (**a**) Data preprocessing; (**b**) data cleaning; (**c**) trajectory reconstruction.

Water Areas | Time Span | Number of Trajectories | Number of Points | Longitude | Latitude |
---|---|---|---|---|---|
Arctic Ocean | 1 September 2018–31 September 2018 | 108,588 | 53,267,239 | 170° W–180° E | 66.089° N–90° N |
Strait of Dover | 1 January 2018–31 January 2018 | 3043 | 50,610 | 1.057° E–3.042° E | 50.622° N–51.952° N |

Arctic Ocean | Raw Data Set | Dataset after Preprocessing | Dataset after Convolution | Dataset after Reconstruction |
---|---|---|---|---|
Trajectories | 108,588 | 3046 | 2982 | 2982 |
Points | 53,267,239 | 2,146,651 | 1,972,471 | 2,433,576 |

Strait of Dover | Raw Data Set | Dataset after Preprocessing | Dataset after Convolution | Dataset after Reconstruction |
---|---|---|---|---|
Trajectories | 3043 | 1057 | 1052 | 1504 |
Points | 50,610 | 30,689 | 29,793 | 99,828 |

MMSI | Raw Data Set | Dataset after Preprocessing | Dataset after Convolution | Dataset after Reconstruction |
---|---|---|---|---|
218832000 | 69,815 | 3815 | 819 | 3983 |
316025029 | 5215 | 3579 | 2142 | 4980 |
220002000 | 38 | 32 | 29 | 31 |
244554000 | 107 | 94 | 87 | 116 |

