Real-Time Management of Vessel Carbon Dioxide Emissions Based on Automatic Identification System Database Using Deep Learning

Abstract: In this study, we propose an effective deep-learning method to strengthen real-time vessel carbon dioxide emission management. The method predicts the real-time carbon dioxide emissions of a vessel in three steps: (1) convert the trajectory data of the fixed time interval into a spatial-temporal sequence, (2) apply a long short-term memory (LSTM) model to predict the future trajectory and vessel status data of the vessel, and (3) predict the carbon dioxide emissions. The automatic identification system (AIS) data of a liquefied natural gas (LNG) vessel were selected as the sample, and we reconstructed the trajectory data with a fixed time interval using cubic spline interpolation. Applying the interpolated AIS data, the carbon dioxide emissions of the vessel were calculated based on the International Towing Tank Conference (ITTC) recommended procedures. The experimental results are twofold. First, they reveal that vessel emissions are currently underestimated: the actual carbon dioxide emissions are higher than those reported. The finding offers insight into how to measure the emissions of vessels accurately, and hence, better execute a greenhouse gas (GHG) reduction strategy. Second, the LSTM model has a better trajectory prediction performance than the recurrent neural network (RNN) model. The errors of the trajectory endpoint and carbon dioxide emissions were small, which shows that the LSTM model is suitable for spatial-temporal data prediction. Therefore, this study offers insights to strengthen the real-time management and control of vessel greenhouse gas emissions and to handle them more efficiently.


Greenhouse Gases (GHGs) of Vessels
With global warming progressing worldwide, governments are managing greenhouse gas (GHG) emissions ever more closely. GHG emissions from vessels are receiving increasing attention, especially in port cities, since vessels have become a major source of polluting emissions. Most governments have introduced emission tax rates to reduce the GHG emissions of vessels, especially emissions of carbon dioxide (CO2), which is one of the most important greenhouse gases. Since automatic identification system (AIS) databases store the detailed real-time trajectory data of vessels, we can use longitude, latitude, speed over ground (SOG) and course over ground (COG) to estimate the carbon dioxide emissions accurately and in real time while the vessel is sailing. Many studies have been carried out to estimate vessel emissions.

J. Mar. Sci. Eng. 2021, 9, x FOR PEER REVIEW
... using the convolutional neural network alone, but the prediction error was not overwhelming compared to the LSTM model. There are also studies that examine the accuracy of machine learning methodologies on real vessel datasets [14,15]. Existing studies suggest that the LSTM model is a suitable algorithm for vessel trajectory prediction applications. However, studies that predict emissions by applying reconstructed trajectory data in deep-learning models have been scarce. This study aims to fill that research gap.

The Motivation of the Study
The motivation of this study is to strengthen the real-time management of vessel carbon dioxide emissions and to control vessel GHG emissions in a more efficient way. To achieve that, we propose a method that can accurately grasp the current emissions of the ship and predict the future emissions status. We consider that the incompleteness of the AIS trajectory data will affect the accuracy of the vessel carbon dioxide emissions estimation. As such, we propose using the cubic spline interpolation method to enrich the vessel trajectory. With this method, we can estimate the vessel carbon dioxide emissions every 1 s to improve the estimation accuracy. Thereafter, we use the LSTM model to learn the historical SOG characteristics and trajectory characteristics of vessels, to predict the spatial-temporal distribution of vessels' future carbon dioxide emissions.
Deep learning can learn the basic characteristics of the data from a small number of samples through a deep non-linear network structure, which makes it suitable for rapid analysis of vessel data in the marine environment; we therefore chose deep learning. To improve the validity of the prediction data, we emphasized real-time, short-term prediction. Figure 1 outlines the flow of this study.
The rest of the paper is structured as follows. Section 2 describes the formation of the cubic spline interpolation model for reconstructing trajectory data, the LSTM model for prediction and the CO2 emission estimation model. Section 3 discusses the experiments and results. Section 4 summarizes the work and the prospects for future studies.

Cubic Spline Interpolation Model
Incorrect operation of the AIS system by shore and vessel personnel, information transmission failures between the AIS and the shore base, random failures of the AIS system itself and improper maintenance can all leave the trajectory data incomplete. We therefore need an interpolation method to repair the trajectory and obtain more accurate and complete vessel trajectory information. Compared with the linear interpolation model, the non-linear interpolation model is more in line with the actual sailing state of the ship; especially near the coast, the ship cannot always sail in a straight line, but adjusts its course according to different sea conditions. Therefore, we select cubic spline interpolation to smooth the vessel trajectory.
We suppose that the trajectory data span the interval $[a, b]$, which is divided into $n$ subintervals $[x_0, x_1], [x_1, x_2], \cdots, [x_{n-1}, x_n]$ with $x_0 = a$ and $x_n = b$, and that the function expression over the interval is $S(x)$. Cubic spline means that the curve on each subinterval is a cubic polynomial $S_i(x)$ that meets the interpolation conditions $S(x_i) = y_i$ and the smoothness condition that $S_i(x)$, $S_i'(x)$ and $S_i''(x)$ are continuous functions. In the solved equations (Bartels et al. [16]), $S_i(x)$ is the cubic spline model expression, and $a_i, b_i, c_i, d_i$ are the parameters to be solved. Accordingly, $S_i(x)$ must meet the interpolation conditions $S(x_i) = y_i$ and Equations (1)-(3), and further equations follow from the continuity conditions. Inputting Equations (4), (10) and (11) into Equation (6) gives Equation (12), and inputting Equations (4) and (10)-(12) into Equation (9) gives Equation (13). We then build linear equations, Equation (14), with $m$ as the unknown ($m_0 = 0$, $m_n = 0$). We can calculate $m_0, m_1, \cdots, m_n$ from Equation (14), use them to calculate $a_i, b_i, c_i, d_i$, and thus obtain the function expression for each subinterval to enrich the vessel trajectory data.
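The procedure described above (solve a linear system for the second derivatives $m_i$ with $m_0 = m_n = 0$, then recover $a_i, b_i, c_i, d_i$ on each subinterval) can be sketched in a few lines of Python. This is a minimal illustration of the textbook natural cubic spline, written under the assumption that each piece has the form $S_i(x) = a_i + b_i(x - x_i) + c_i(x - x_i)^2 + d_i(x - x_i)^3$ with $m_i = S''(x_i)$; it is not the authors' code. The function names and the longitude values are hypothetical, while the time offsets follow the irregular intervals quoted in this paper (1 s, 7 s, 11 s, 12 s, 7 s).

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Build the natural cubic spline S(x) through the points (xs[i], ys[i]).

    xs must be strictly increasing. m[i] stands for S''(xs[i]); the natural
    boundary condition sets m[0] = m[n] = 0.
    """
    n = len(xs) - 1
    if n < 1 or len(ys) != len(xs):
        raise ValueError("need at least two points and equal-length inputs")
    h = [xs[i + 1] - xs[i] for i in range(n)]
    if any(step <= 0 for step in h):
        raise ValueError("xs must be strictly increasing")

    # Tridiagonal system for the interior unknowns m[1..n-1]:
    # h[i-1]*m[i-1] + 2*(h[i-1] + h[i])*m[i] + h[i]*m[i+1] = rhs
    m = [0.0] * (n + 1)
    if n > 1:
        diag = [2.0 * (h[k] + h[k + 1]) for k in range(n - 1)]
        rhs = [6.0 * ((ys[k + 2] - ys[k + 1]) / h[k + 1]
                      - (ys[k + 1] - ys[k]) / h[k]) for k in range(n - 1)]
        # Thomas algorithm: forward elimination, then back substitution.
        for k in range(1, n - 1):
            w = h[k] / diag[k - 1]
            diag[k] -= w * h[k]
            rhs[k] -= w * rhs[k - 1]
        m[n - 1] = rhs[n - 2] / diag[n - 2]
        for k in range(n - 3, -1, -1):
            m[k + 1] = (rhs[k] - h[k + 1] * m[k + 2]) / diag[k]

    # Recover the polynomial coefficients of every piece from m.
    coeffs = []
    for i in range(n):
        a = ys[i]
        c = m[i] / 2.0
        d = (m[i + 1] - m[i]) / (6.0 * h[i])
        b = ((ys[i + 1] - ys[i]) / h[i]
             - h[i] * m[i] / 2.0 - h[i] * (m[i + 1] - m[i]) / 6.0)
        coeffs.append((a, b, c, d))

    def spline(x):
        # Locate the piece containing x (clamped to the first/last piece).
        i = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        dx = x - xs[i]
        a, b, c, d = coeffs[i]
        return a + dx * (b + dx * (c + dx * d))

    return spline


def resample(times_s, values, step_s=1):
    """Evaluate the spline on a regular grid of integer seconds."""
    spline = natural_cubic_spline(times_s, values)
    grid = list(range(times_s[0], times_s[-1] + 1, step_s))
    return grid, [spline(t) for t in grid]


# Seconds after 00:35:10, using the irregular gaps 1, 7, 11, 12 and 7 s.
times = [0, 1, 8, 19, 31, 38]
# Hypothetical longitudes, for illustration only.
longitudes = [118.6500, 118.6501, 118.6508, 118.6520, 118.6533, 118.6540]
grid, lon_1s = resample(times, longitudes)
```

Latitude and SOG could be resampled the same way by passing them as `values`. A library routine such as a natural-boundary cubic spline from a numerical package would serve equally well; the hand-written solver is used here only to make the role of $m$ visible.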

Long Short-Term Memory Model
The LSTM model (Hochreiter et al. [17]) is a variant of the recurrent neural network (RNN). The RNN struggles to learn from long histories, because its gradient declines or even vanishes at distant time steps. To solve this problem, the LSTM model builds on the RNN by introducing storage units and unit states that control information transfer.
There are four gates (forget gate, input gate, update gate and output gate) in the storage unit of the LSTM model. The input gate controls the addition of new information. The forget gate discards information that is no longer needed and retains the useful information of the past. The update gate updates the data. The output gate causes the storage unit to output only information related to the current time step. These four gate structures perform matrix multiplication and non-linear summing in the memory cells so that the memory does not decay over repeated iterations.
As shown in Figure 2, the structure of the LSTM neural network consists of three layers: input layer, hidden layer (LSTM_Layer_1 and Other Layer) and output layer. Figure 3a shows the structure of the RNN. Its cell contains only one activation function, and the cells are only linked in order. Figure 3b shows the structure of the LSTM. Its cell is more complex than that of the RNN and needs four gate calculations before it outputs to the next cell. Figure 3c is an enlarged view of Figure 3b.

Forget Gate: For time $t$, the state $h_{t-1}$ at the previous time and the current training data $x_t$ pass through the forget gate to give $f_t$; the formula is as follows:

$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$$

Input Gate: $\tilde{C}_t$ decides the amount that can be added to the cell state in the tanh network layer, and $i_t$ outputs a number between 0 and 1 through the sigmoid network layer to decide which status values to update; the formula is as follows:

$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$$

$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$$

Update Gate: Updates the old state $C_{t-1}$ to the new state $C_t$. This gate retains long-term and short-term memory in different proportions of the cell; the formula is as follows:

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$

Output Gate: The third sigmoid network layer determines which parts of the cell state are output, combined with Equation (20), to obtain the output value of the cell; the formula is as follows:

$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$$

$$h_t = o_t \odot \tanh(C_t)$$

where $f_t$: forget gate, $\tilde{C}_t$: input cell, $i_t$: input gate, $C_t$: update gate, $o_t$: output gate, $\sigma$: sigmoid activation function, $W$: weights for different gates, $x_t$: input value at time $t$, $h_{t-1}$: output value at the previous moment, and $b$: bias term for different gates.
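To make the gate descriptions concrete, the following sketch performs a single LSTM time step in plain Python. It assumes the widely used formulation in which every gate acts on the concatenation $[h_{t-1}, x_t]$, which may differ in detail from the implementation used in this study. The parameter names (`W_f`, `b_f` and so on) and the helper functions are hypothetical and chosen only to mirror the symbols defined above.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, vector):
    """Return W . vector + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; returns the new hidden state and cell state."""
    z = h_prev + x_t  # list concatenation: [h_{t-1}, x_t]
    # Forget gate: how much of the old cell state to keep.
    f_t = [sigmoid(v) for v in affine(params["W_f"], params["b_f"], z)]
    # Input gate and candidate cell state: what new information to add.
    i_t = [sigmoid(v) for v in affine(params["W_i"], params["b_i"], z)]
    c_tilde = [math.tanh(v) for v in affine(params["W_c"], params["b_c"], z)]
    # Update: blend old state and candidate state element by element.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    # Output gate: which parts of the cell state are exposed as h_t.
    o_t = [sigmoid(v) for v in affine(params["W_o"], params["b_o"], z)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Tiny example: one hidden unit, two input features, all-zero parameters.
zero_params = {name: [[0.0, 0.0, 0.0]] for name in ("W_f", "W_i", "W_c", "W_o")}
zero_params.update({name: [0.0] for name in ("b_f", "b_i", "b_c", "b_o")})
h_t, c_t = lstm_step([0.3, -0.2], [0.0], [2.0], zero_params)
```

With all-zero parameters every gate evaluates to 0.5 and the candidate state to 0, so the cell state is simply halved; this makes the blending role of the forget and input gates easy to verify by hand.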

CO2 Emission Estimation Model
In this study, we adopted the ITTC recommended procedure [19,20] to estimate vessel resistance. It is necessary to derive the total resistance first. Total resistance can be denoted as:

$$R_T = \frac{1}{2} C_T \rho S V^2$$

where $R_T$: total resistance, $C_T$: total resistance coefficient, $\rho$: density of water, $S$: wetted surface of the hull, $V$: SOG.
$C_T$, the total resistance coefficient, can be denoted as:

$$C_T = C_F + C_A + C_{AA} + C_R$$

where $C_F$: frictional resistance coefficient, $C_A$: incremental resistance coefficient, $C_{AA}$: air resistance coefficient, $C_R$: residual resistance coefficient. Based on the calculated total resistance of the vessel, the power required for the vessel to sail at speed $V$ in a calm sea condition can be estimated by considering the components of the propulsion efficiencies. Installed power is the power required to tow the vessel with speed $V$ in a calm sea. Service power is derived from the installed power $P_I$, the transmission efficiency $\eta_T$, the quasi-propulsive coefficient $\eta_D$ and the sea margin $m$. Fuel oil consumption is calculated by using the specific fuel oil consumption (SFOC), as shown in Table 1 from Smith et al. [21]. Table 1 shows the value of the SFOC for each diesel engine type by engine age. As the vessel engine becomes older, its efficiency decreases, and advances in technology make newer engines more efficient.
Table 1. Specific fuel oil consumption (SFOC) and distribution of engine age of vessels included in automatic identification system (AIS) data.

Engine Age Slow-Speed Diesel (SSD) Engine Medium-Speed Diesel (MSD) Engine High-Speed Diesel (HSD) Engine
Before

The marine liquefied natural gas (LNG) CO2 emissions factor is 2.75. The CO2 emission estimation model therefore sums the carbon dioxide emissions at each trajectory point to give the total CO2 emissions $E_i$, where $T$ is the time interval of each track point.
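The chain from resistance to emissions can be illustrated with a short Python sketch. It rests on several assumptions that are not confirmed by the text: the total resistance takes the usual form $R_T = \frac{1}{2} C_T \rho S V^2$ with $C_T$ as the sum of the four listed coefficients, the emission factor 2.75 is read as mass of CO2 per unit mass of fuel, and the service power at each track point is supplied by the caller, because the service-power formula is not reproduced here. All function names, units and input values are hypothetical.

```python
KNOT_IN_M_PER_S = 1852.0 / 3600.0  # AIS reports SOG in knots


def total_resistance_n(c_f, c_a, c_aa, c_r, rho_kg_m3, wetted_surface_m2, sog_knots):
    """Total resistance in newtons from the four resistance coefficients."""
    c_t = c_f + c_a + c_aa + c_r
    speed_m_s = sog_knots * KNOT_IN_M_PER_S
    return 0.5 * c_t * rho_kg_m3 * wetted_surface_m2 * speed_m_s ** 2


def co2_emissions_kg(power_kw, sfoc_g_per_kwh, interval_s, emission_factor=2.75):
    """Sum CO2 over track points.

    power_kw: service power at each track point (assumed to be given).
    interval_s: time interval T represented by each track point.
    """
    fuel_g = sum(p * sfoc_g_per_kwh * interval_s / 3600.0 for p in power_kw)
    return fuel_g * emission_factor / 1000.0


# Hypothetical inputs, for illustration only.
resistance = total_resistance_n(0.0015, 0.0002, 0.0001, 0.0007,
                                1025.0, 10000.0, 15.0)
emissions = co2_emissions_kg([3600.0] * 10, sfoc_g_per_kwh=200.0, interval_s=1)
```

Because the sum runs over track points, halving the interval `interval_s` while doubling the number of points leaves the total unchanged when power is constant; the benefit of 1 s resampling comes from capturing speed changes between the original AIS records.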

Automatic Identification System (AIS) Dataset Analysis
This study used AIS data of an LNG vessel provided by exactEarth, as shown in Figure 4. The time period was from 1 January 2016 to 30 June 2016, and the number of messages was about 50 million. The data included vessel names, callsigns, maritime mobile service identities (MMSIs), vessel types, vessel cargos, vessel classes, lengths, widths, flag countries, destinations, estimated times of arrival (ETAs), draughts, longitudes, latitudes, SOGs, COGs, rates of turn (ROTs), headings, navigation (nav) statuses, sources, times, main vessel types and sub vessel types.
The AIS data used in this research were in comma-separated values (CSV) form and were divided by day based on Greenwich Mean Time (GMT). A total of 182 days were included, so the complete data of the vessel were divided into 182 small datasets. This was inconvenient, so we used MySQL to build a trajectory database combining the 182 small datasets into one large dataset.
This trajectory dataset mainly included: MMSI, longitude, latitude, SOG, COG and time. In the original dataset, we found that the average time interval between AIS records in Table 2 was 520.52 s. Furthermore, 25% of the data had an interval time within 6 s, 50% were within 17 s, and 75% were within 42 s. According to Figure 5, we also found that more than 90% of the data had a time interval of more than 30 min. Only about 8% of the data had a time interval of 2 s. This may be because AIS data collected through satellites show longer data collecting intervals when the vessel is sailing in areas with high traffic compared to areas with less traffic.
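Interval statistics of this kind (mean and quartiles of the gaps between consecutive AIS records) can be computed with the Python standard library, as in the sketch below. The timestamp format and the function name are assumptions; the six sample timestamps are derived from the irregular gaps reported later in this paper (1 s, 7 s, 11 s, 12 s, 7 s starting at 00:35:10 on 5 January 2016), not taken from the raw dataset.

```python
from datetime import datetime
from statistics import mean, quantiles


def interval_summary(timestamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Mean and quartiles (in seconds) of gaps between consecutive records."""
    times = sorted(datetime.strptime(t, fmt) for t in timestamps)
    gaps = [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]
    q25, q50, q75 = quantiles(gaps, n=4, method="inclusive")
    return {"mean": mean(gaps), "q25": q25, "q50": q50, "q75": q75}


sample = [
    "2016-01-05 00:35:10", "2016-01-05 00:35:11", "2016-01-05 00:35:18",
    "2016-01-05 00:35:29", "2016-01-05 00:35:41", "2016-01-05 00:35:48",
]
summary = interval_summary(sample)
```

A mean far above the median, as in the 520.52 s versus 17 s reported here, signals a heavy tail of long gaps, which is the situation where interpolation matters most.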

Interpolation Calculation
After the database analysis, taking data integrity and continuity as the principle, we extracted the LNG vessel trajectory data of MMSI 310028000 from the trajectory database as a sample to illustrate the feasibility of the model. The vessel trajectory over six months is shown in Figure 6. The vessel, IMO number "8913174" and built in 1992, travels mainly between Japan and Australia.
The focus of this research was to grasp the carbon dioxide emissions of vessels in real time. We chose to use vessel data from 5 January 2016 over 00:35:10-00:59:40 for analysis. The reason for the selection was that the AIS data contain relatively many data points during the period, which minimizes the workload of trajectory data reconstruction. In addition, the trajectory trend is a straight line instead of a curve, which is more appropriate for simplicity. The total time interval was 26 min, and there were 23 trajectory data points, as shown in Table 3. For example, we found that the data intervals were different: 1 s, 7 s, 11 s, 12 s, 7 s from 00:35:10 to 00:35:48. Therefore, this study proposes using the cubic spline interpolation method to resample the data at one-second intervals. We used Equations (1)-(14) to calculate the trajectory interpolation. Taking longitude as an example, $x_i = \mathrm{longitude}_i$. The partial calculation results are shown in Figure 7. We found that the line connecting the original data points was no longer a simple straight line (orange color), but had become a curve (blue color), which was more in line with the actual situation. In terms of the actual geographic location, the restored trajectory after the interpolation calculation can be seen in Figure 8, where the yellow point is the trajectory point of the vessel at 00:59:40, the red point is the trajectory point on 5 January, the orange point is on other days in January and the green points are interpolation points.

Vessel Trajectory Prediction
The recurrent neural network is a typical framework for deep learning; it can be used to deal with spatial-temporal sequence problems. Its characteristic is that the output at the current moment depends on the calculation result of the previous moment, so it is strongly tied to the order of the data in time. The LSTM model is an improvement on the recurrent neural network: by introducing gating functions, it also analyzes much older historical calculation results, automatically removes invalid ones and remembers useful ones. Its timeliness is stronger than that of the recurrent neural network.
As described in Section 1.2, for a vessel, its trajectory characteristics at time $t$ could be expressed as $Y(t) = \{(p_1, a_1, t_1), (p_2, a_2, t_2), (p_3, a_3, t_3), \ldots, (p_n, a_n, t_n)\}$; we could use this as an input value of the LSTM model. The number of data points after the interpolation calculation changed from 23 to 1472. The experimental environment for this study was a DELL OptiPlex 7050 desktop computer with an Intel(R) Core(TM) i7-7700 CPU @ 3.60 GHz, 16.0 GB of memory and Windows 10 Pro; the program development environment was PyCharm (JetBrains, s.r.o., Prague, Czech Republic) with Python 3.7, and we used an LSTM model provided by Keras (Chollet, F., & others, Retrieved from https://github.com/fchollet/keras (accessed on 20 July 2021)). After experiments and manual adjustment of parameters, the optimal parameters of the LSTM model in this study were finally determined. As shown in Table 4, the time required to run the model once was 195 s. The loss function is shown in Figure 9. We used 938 pieces of data to train the LSTM model, and finally, 235 pieces of data were used to verify the prediction effect. At an equal time (00:59:40), the predicted trajectory point of the LSTM model differed from the actual trajectory point by 0.593 nautical miles, which was in line with expectations. The prediction results are shown in Figure 10, and we further plot these in 2D (Figure 11) and 3D (Figure 12) graphs for visualization. As Figure 10 shows, the trajectories predicted by the two models roughly coincided with the actual trajectories.
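Turning a resampled trajectory into training and verification samples for a sequence model is commonly done with a sliding window followed by a chronological split. The sketch below illustrates that idea in plain Python; the window length, the function names and the way samples are formed are assumptions, since the text reports only the sample counts (938 for training, 235 for verification, roughly an 80/20 split). The Keras model itself is not reproduced because its parameters (Table 4) are not available here.

```python
def make_windows(series, window):
    """Pair each run of `window` consecutive records with the record after it."""
    return [(series[i:i + window], series[i + window])
            for i in range(len(series) - window)]


def chronological_split(samples, n_train):
    """Split without shuffling so that verification data lie in the future."""
    return samples[:n_train], samples[n_train:]


# Hypothetical records (longitude, latitude, SOG) at 1 s spacing.
records = [(118.0 + 0.0001 * t, -17.0 + 0.0001 * t, 15.0) for t in range(20)]
samples = make_windows(records, window=5)
train, verify = chronological_split(samples, n_train=12)
```

Keeping the split chronological matters for trajectory data: a random shuffle would let the model see points that lie after the ones it is asked to predict.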
In Figures 11 and 12, the green, red and blue lines distinguish the actual trajectory and the two predicted trajectories. In Figure 11, we intercepted the prediction result from the original trajectory to visualize the differences. The three trajectories had the same start point (the first orange point at the lower left in Figure 10), but the predicted endpoints were different. The LSTM model had a smaller endpoint error (0.593 nautical miles or 1.097 km) than the RNN model (0.928 nautical miles or 1.718 km). This can be observed in Figure 12, where the trajectory predicted by the RNN was more concentrated at the end point, causing the trajectories to overlap, while the LSTM model did not have this defect. The time step was the time difference from 00:54:49 (in seconds).
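An endpoint error of this kind is the great-circle distance between the predicted and the actual position. The sketch below uses the haversine formula, which is one common choice; the text does not state which distance formula was used, so this is an assumption, as are the function names and the mean Earth radius constant.

```python
import math

EARTH_RADIUS_KM = 6371.0088   # mean Earth radius (assumed constant)
KM_PER_NAUTICAL_MILE = 1.852  # exact by definition


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def endpoint_error_nm(predicted, actual):
    """Endpoint error in nautical miles; points are (latitude, longitude)."""
    return haversine_km(*predicted, *actual) / KM_PER_NAUTICAL_MILE


# One minute of latitude is close to one nautical mile.
one_minute = endpoint_error_nm((0.0, 0.0), (1.0 / 60.0, 0.0))
```

Over distances of about one kilometre the choice between haversine and a more precise ellipsoidal formula changes the result only marginally, so the ranking of the two models would not be affected.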

Carbon Dioxide Estimation
We first calculated the carbon dioxide emissions of the vessel without interpolation using the ITTC procedure; using Equation (24), we determined the emissions of the vessel over 00:35:10-00:59:40 to be 47,169 kg. The carbon dioxide emissions after interpolation totaled 74,926 kg for the same period. This was 27,757 kg more carbon dioxide than determined by traditional estimation methods. Our proposed model was able to interpolate more time intervals, and hence, improve the accuracy of the vessel carbon dioxide emissions. Observing the vessel carbon dioxide emissions from a spatial-temporal perspective, they go down first and then up (Figures 13 and 14).

Next, we used the trajectories predicted by the LSTM model to estimate the carbon dioxide emissions, to obtain the spatial-temporal distribution of future carbon dioxide emissions from vessels. With the vessel emissions per second obtained, we calculated that the cumulative carbon dioxide emissions from vessels during 00:54:49-00:59:41 were 13,315 kg. We predicted that the highest value would be reached in this area (118.7439, −16.9425).
The predicted carbon dioxide emissions (13,315 kg) and the true emissions (13,616 kg) differed by 301 kg. Figure 15 shows a heat map of the spatial-temporal distribution of the vessel's carbon dioxide emissions at one-second resolution for the section from 00:54:49 to 00:59:40. From a time perspective, both the actual and predicted values increased during this period, although the predicted value grew more slowly than the true value. At 00:59:41, the error was 2.36 kg (51.46 kg versus 49.10 kg).
Moreover, it was found that the vessel emissions would continue to increase in the future. This upward trend indicated that the vessel would accelerate to the northeast. Because we had the carbon dioxide emissions of the vessel at one-second resolution, we could also re-establish any time interval to reduce the number of vessel trajectory points if needed. This practice is useful when a larger number of vessel carbon emission trajectories must be calculated, for example, by calculating the carbon dioxide emissions of the vessel at 3 s, 6 s or 10 s intervals (Figure 16). Figure 16 shows that the larger the time step, the faster the carbon dioxide emissions increase, which also indicates that the SOG of the vessel was increasing. At the same time, the number of trajectory points decreased as the time interval increased, and the carbon dioxide emissions represented by each track point gradually increased, but the overall trajectory trend remained the same.
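The re-aggregation described above can be illustrated with a minimal Python sketch. It assumes that the per-second emissions are simply summed within each coarser interval, which the text implies but does not state; the function name `aggregate_emissions` and the emission values are hypothetical and are not taken from the study.

```python
def aggregate_emissions(per_second_kg, interval_s):
    """Sum per-second emissions (kg) into consecutive bins of interval_s seconds.

    Assumption: each coarser trajectory point carries the total emissions of
    the seconds it covers. The last bin may be shorter than interval_s.
    """
    if interval_s < 1:
        raise ValueError("interval_s must be a positive integer")
    return [
        sum(per_second_kg[i:i + interval_s])
        for i in range(0, len(per_second_kg), interval_s)
    ]


# Hypothetical per-second emissions (kg) that rise steadily over one minute.
per_second = [40.0 + 0.5 * t for t in range(60)]

for interval in (1, 3, 6, 10):
    binned = aggregate_emissions(per_second, interval)
    # Fewer points at larger intervals, more emissions per point, same total.
    print(interval, len(binned), binned[0], sum(binned))
```

With these illustrative values, the number of points falls from 60 to 20, 10 and 6 as the interval grows from 1 s to 3 s, 6 s and 10 s, while the cumulative total is unchanged.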

Conclusions
In this study, we proposed a method to interpolate AIS data and estimate vessel emissions based on predicted vessel trajectories using deep learning models. The results suggest that the carbon dioxide emissions are underestimated. The method can be applied to monitor a vessel's carbon dioxide emissions and trajectory in real time, and to provide vessels with early warning services relating to carbon dioxide emissions monitoring.
The contributions of the study are fourfold. Firstly, it is novel in interpolating AIS data points with a non-linear model. AIS data sometimes have time intervals that are too long owing to equipment issues and/or human factors, which limits the accuracy of vessel carbon dioxide emission estimates based on AIS data. To overcome this problem, this study proposes a cubic spline interpolation model to reconstruct missing AIS data points.
The repaired trajectories of the vessel (Figure 8) follow the same trend as the original trajectory in smooth curves, which represents the actual situation better than the traditional linear models used in previous studies. Through the interpolation, we successfully obtained AIS data with a time interval of one second.
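As an illustration of resampling irregular AIS fixes to a one-second grid, the following is a minimal pure-Python sketch of cubic spline interpolation. It assumes natural boundary conditions (zero second derivative at both ends), which the study does not specify; the function names and the sample SOG values are hypothetical. In practice the same resampling would presumably be applied to each AIS channel (longitude, latitude, SOG, COG) separately.

```python
import math
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return a function that evaluates the natural cubic spline through (xs, ys)."""
    n = len(xs) - 1
    if n < 1 or len(ys) != len(xs):
        raise ValueError("need at least two knots and matching lengths")
    h = [xs[i + 1] - xs[i] for i in range(n)]
    if any(step <= 0 for step in h):
        raise ValueError("timestamps must be strictly increasing")

    # Second derivatives at the knots; natural boundary: m[0] = m[n] = 0.
    m = [0.0] * (n + 1)
    if n > 1:
        # Tridiagonal system for the interior knots (Thomas algorithm).
        diag = [2.0 * (h[j] + h[j + 1]) for j in range(n - 1)]
        rhs = [
            6.0 * ((ys[j + 2] - ys[j + 1]) / h[j + 1] - (ys[j + 1] - ys[j]) / h[j])
            for j in range(n - 1)
        ]
        for k in range(1, n - 1):  # forward elimination
            w = h[k] / diag[k - 1]
            diag[k] -= w * h[k]
            rhs[k] -= w * rhs[k - 1]
        m[n - 1] = rhs[-1] / diag[-1]  # back substitution
        for k in range(n - 3, -1, -1):
            m[k + 1] = (rhs[k] - h[k + 1] * m[k + 2]) / diag[k]

    def evaluate(x):
        # Locate the segment containing x, clamped to the first/last segment.
        i = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        a = xs[i + 1] - x
        b = x - xs[i]
        return (
            (m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h[i])
            + (ys[i] / h[i] - m[i] * h[i] / 6.0) * a
            + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * b
        )

    return evaluate


def resample_every_second(times, values):
    """Evaluate the spline on a one-second grid spanning the observed period."""
    spline = natural_cubic_spline(times, values)
    grid = list(range(math.ceil(times[0]), math.floor(times[-1]) + 1))
    return grid, [spline(t) for t in grid]


# Hypothetical AIS fixes: seconds since the first report and SOG in knots.
ais_times = [0, 65, 140, 190, 260]
ais_sog = [12.1, 12.4, 11.8, 12.9, 13.5]

grid, sog_1s = resample_every_second(ais_times, ais_sog)
print(len(grid))  # 261 one-second samples reconstructed from 5 fixes
```

The spline passes through every original fix and varies smoothly between them, whereas a linear model would join the fixes with straight segments.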
Secondly, it reveals that vessel emissions are currently underestimated. In the study, by applying the repaired trajectory data of the vessel (1472 pieces of data), we estimated the carbon dioxide emissions of the vessel to be 74,926 kg. Compared to the emissions calculated from the original trajectory data (23 pieces of data), a 27,757 kg increase of carbon dioxide emissions was identified. This clearly indicates that the actual carbon dioxide emissions are higher than those reported. The finding offers insight into how to accurately measure the emissions of vessels, and hence, better execute a GHG reduction strategy.
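As a worked check of these figures, the difference and its relative size can be written out as follows. The symbols $E_{\text{orig}}$ and $E_{\text{interp}}$ are notation introduced here for the emissions estimated from the original and the interpolated data, and the two ratios are derived from the reported totals rather than reported in the study itself.

```latex
\begin{align*}
\Delta E &= E_{\text{interp}} - E_{\text{orig}} = 74{,}926 - 47{,}169 = 27{,}757\ \text{kg},\\
\frac{\Delta E}{E_{\text{orig}}} &= \frac{27{,}757}{47{,}169} \approx 0.588,\\
\frac{\Delta E}{E_{\text{interp}}} &= \frac{27{,}757}{74{,}926} \approx 0.370.
\end{align*}
```

That is, the interpolated estimate is about 58.8% higher than the estimate from the original data, and about 37.0% of the interpolated total is missed when only the original 23 data points are used.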
Thirdly, it validates the performance of two deep-learning algorithms, LSTM and RNN, in predicting vessel trajectories. The results confirm that the LSTM model performs better. We used the predicted SOG to predict the future carbon dioxide emissions of the vessel. The error was as small as 301 kg, which shows that the LSTM model is suitable for spatial-temporal data prediction and performs well.
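To put the 301 kg error in proportion, it can be related to the true cumulative emissions for the prediction window. The symbols $E_{\text{pred}}$ and $E_{\text{true}}$ are notation introduced here, and the relative error is derived from the reported totals rather than reported in the study.

```latex
\begin{align*}
\left| E_{\text{pred}} - E_{\text{true}} \right| &= \left| 13{,}315 - 13{,}616 \right| = 301\ \text{kg},\\
\frac{\left| E_{\text{pred}} - E_{\text{true}} \right|}{E_{\text{true}}} &= \frac{301}{13{,}616} \approx 0.022.
\end{align*}
```

The predicted cumulative emissions are therefore about 2.2% below the true value for this window.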
Fourthly, it offers an alternative solution to predict vessel trajectories when AIS data are not retrievable. The model can also be used to estimate vessel emissions when AIS data are unavailable, for example when AIS is shut down for human reasons (actively or passively) or when the collection of AIS data fails in certain time periods.
At present, the method has limitations, since it has only been tested in cases where the vessel trajectory is not comprehensive. The interpolation and trajectory-prediction performance on curved trajectories remains unclear and needs further research on several cases, taking computational time into account. Future studies may also verify other artificial neural network (ANN) models, such as the feedforward neural network (FFNN) or the 1D convolutional neural network (1D-CNN), which may work well on time-series data [14].