Trajectory Data Compression Algorithm Based on Ship Navigation State and Acceleration Variation

Abstract: An active area of study under the dual carbon target is the emission inventory of pollutants from ships, which is compiled from automatic identification system (AIS) data. Data compression is required because the volume of data has grown so large that it is difficult to transmit, process, and store. This paper develops a trajectory simplification method that considers the ship's sailing state and the acceleration rate of change, in order to assure the validity of the compressed data used in emission inventory analysis. By carefully examining the integral relationship between acceleration and pollutant emissions, the algorithm constructs an acceleration rate of change function for data compression and categorizes AIS data by ship navigation status. By dynamically altering the permitted amount of acceleration change, the function stabilizes the pollutant emission calculation error and adaptively calculates the threshold value. The experimental results show that the emission calculation error of the proposed algorithm is only 0.185% when the compression rate is 90.28%.


Introduction
Under the two key objectives of "carbon peaking" and "carbon neutrality", the emission inventory of pollutants from ships is currently a hot research area. "Carbon peaking" refers to the point at which carbon dioxide emissions reach their peak and then start to decline gradually. "Carbon neutrality" refers to the offsetting of carbon dioxide or greenhouse gas emissions through energy conservation and emission reduction strategies. The process of compiling the inventory is based on the automatic identification system (AIS). AIS is a navigation aid used to improve marine safety and communication between ships and shore, as well as among ships. It automatically communicates crucial data, including the ship's position, speed, heading, and name. Emission inventories have been produced at numerous ports [1][2][3][4]. A great deal of AIS data is produced due to the rapidly growing volume of maritime activity, which presents significant challenges for data transmission and processing [3]. The data collected by researchers are probably already compressed to aid transmission. Additionally, the time and space cost of computing pollutant emissions increases with data volume. To increase the effectiveness of emission inventory investigations, large amounts of data must be compressed before analysis. Compressed data free up storage space and make it easier to store and transmit trajectory information [5]. More importantly, simplification allows the ship's trajectory data to be analyzed thoroughly, because it retains pertinent information and eliminates superfluous material.
Trajectory simplification and compression techniques have improved greatly with the rapid development of many disciplines and the widespread application of these techniques in a variety of sectors. Early methods for simplifying trajectories generally took into account information such as position, velocity, and time [6][7][8][9][10]. Douglas proposed the Douglas-Peucker (DP) algorithm in 1973, which is one of the most classical trajectory compression algorithms [6]. Meratnia et al. proposed the velocity-based top-down algorithm and the top-down time ratio (TD-TR) algorithm [7]. Many researchers have improved the DP algorithm by considering the characteristics of AIS data [11][12][13][14][15][16]. Li et al. proposed selecting a suitable threshold interval from the experimental comparison of different DP thresholds, according to the quality of AIS trajectory visualization [11]. Han et al. proposed converting trajectories into spatial paths and time series to compress both spatial and temporal data [12]. Liangbin Zhao and Guoyou Shi proposed a method based on an improved DP algorithm that considers the shape of the ship's trajectory derived from the heading information of the trajectory points [16]. Wonhee Lee and Sung-Won Cho (2022) proposed a simplification algorithm for AIS trajectories that considers terrain information [17]. A polygon map random (PMR) quadtree was used to represent topographic information on the coast, and the intersections between the topographic information and the simplified trajectories were computed efficiently using the PMR quadtree. These algorithms consider additional characteristics of the ship, but they do not apply to emission inventory studies, because producing an emission inventory requires the ship's engine information and the deeper relationships between different characteristics of the ship; considering only these shallow characteristics is not enough [18]. The lack of targeted studies makes it impossible to guarantee the reliability of the data.
The processing efficiency of massive data is equally important, so threshold selection is also a focus of current research, and an adaptive threshold can optimize the compression method to a great extent [19][20][21][22][23][24][25]. Zhaokun Wei et al. designed a new algorithm considering trajectory space and motion features, which compresses AIS trajectories based on ship behavior features and applies statistical theory to help determine the threshold of motion features in the sliding window algorithm [20]. Chunhua Tang et al. proposed an adaptive threshold AIS trajectory data compression method based on the DP algorithm, which improves computational efficiency by taking advantage of matrix operations and reducing the number of points [21]. Ran Yan et al. proposed two trajectory compression algorithms: a static mode with a preset compression threshold and a dynamic mode that considers the distance between the trajectory point and the coastline in real time [22]. To address the difficulty of selecting appropriate thresholds, adaptive thresholds are also part of this paper's design goal.
Despite the high computing performance of these techniques, they are not appropriate for the analysis of emission inventories. This is because the bottom-up emission inventory method requires different parameter values to be substituted depending on the type and sailing state of the ship [26][27][28][29]. One of the crucial metrics, the main engine load, must be calculated using both real-time speed and rated speed. As a result, in addition to position and speed information, the complex relationship between the motion characteristics of the ship and its pollutant emissions must be considered when compressing such data. When used for emission estimates, the compressed data output by current trajectory simplification methods results in significant error. Therefore, a trajectory simplification technique that can be used for ship pollutant emissions is required. Based on the characteristics of AIS data and emission inventories, this study proposes an adaptive threshold simplification algorithm suitable for emission inventories. This study offers three contributions. First, the data are categorized and then simplified, which retains the sailing state differentiation points as important features and speeds up the compression process. Second, an acceleration rate of change function whose threshold can be determined adaptively is constructed; this function combines the main engine load and the rated speed to assess the overall relationship with pollutant emissions. Third, the suggested algorithm is compared with other algorithms in terms of running time, compression ratio, and pollutant emission calculation error.

Ship Trajectory Simplification Algorithm
This research puts forward a simplification algorithm that can guarantee a high compression rate while maintaining the accuracy of the emission calculation and critical feature information, including latitude and longitude, real-time speed, and the acceleration of the ship. Figure 1 depicts the flow of the simplification algorithm, and Appendix A contains the pseudocode. The algorithm is split into two parts. In the first part, the data are categorized according to the sailing state, while the characteristics of the sailing state are retained. The main engine load and speed determine the sailing state, and the ship's trajectory exhibits noticeably different features depending on the sailing state. For instance, the ship is almost completely stationary when it is moored, whereas it is primarily moving across the water when it is cruising. The crucial trajectory information is thus contained in the points that differentiate navigation states. In the second part, the data from the various navigation states are handled independently, and the trajectories are simplified by adaptive thresholding, which can significantly improve the compression technique.

Figure 1. Framework diagram of the simplified algorithm. First, the AIS data are classified according to the navigation status, and only the first and last trajectory points are retained for the part of the data where the main engine is not running (see data compression branch 2). Then, for the part of the data where the main engine is running, the maximum value $\sigma_{max}$ of the acceleration rate of change function over all intermediate trajectory points $P_i$ is calculated, and the result is compared with the set threshold value to determine whether to retain or delete the point (see data compression branch 1).

Classification of Data According to Navigation Status
Data must first be categorized according to the sailing status before being compressed. To determine the ship's sailing status, the IMO's speed and main engine load factor recommendations from the Fourth GHG Study are combined. The methodology is illustrated in Table 1 [26]. The distinguishing speed of each sailing state can be determined using the main engine load factor formula. Additionally, because not all differentiating points of the navigation status are recorded by the automatic identification system, interpolation must be used to determine some of these time points. When two neighboring trajectory points are in different sailing states, it is necessary to interpolate between them to separate the states. For trajectory points $P_i$ and $P_{i+1}$ in different navigational states, the real-time speed of $P_i$ is $V_i$ at time $t_i$, and the real-time speed of $P_{i+1}$ is $V_{i+1}$ at time $t_{i+1}$. The distinguishing speed between the two sailing states is $V$. When the speed changes at a stable rate between $P_i$ and $P_{i+1}$, the time point $t$ of the interpolated point $P$ can be calculated from the speed ratio $r$.
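The interpolation step can be sketched as follows. The explicit formulas for the ratio and the time point are not recoverable from this text, so the sketch assumes the usual linear form, r = (V - V_i) / (V_{i+1} - V_i) and t = t_i + r (t_{i+1} - t_i), which follows from a constant rate of speed change. The names `AisPoint` and `interpolate_state_change` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AisPoint:
    t: float  # timestamp, e.g. in seconds
    v: float  # real-time speed, e.g. in knots


def interpolate_state_change(p_i: AisPoint, p_next: AisPoint, v_boundary: float) -> AisPoint:
    """Return the point at which the speed crosses v_boundary.

    Assumption: the speed changes at a constant rate between p_i and p_next.
    """
    if p_next.v == p_i.v:
        raise ValueError("speeds are equal, so there is no crossing to interpolate")
    # Speed ratio r: fraction of the speed change needed to reach the boundary.
    r = (v_boundary - p_i.v) / (p_next.v - p_i.v)
    if not 0.0 <= r <= 1.0:
        raise ValueError("boundary speed does not lie between the two point speeds")
    # Under constant acceleration the same fraction applies to elapsed time.
    t = p_i.t + r * (p_next.t - p_i.t)
    return AisPoint(t=t, v=v_boundary)


crossing = interpolate_state_change(AisPoint(0.0, 2.0), AisPoint(100.0, 6.0), 3.0)
print(crossing)
```

Position could be interpolated with the same ratio if latitude and longitude are added to the point type; that extension is left out here to keep the sketch short.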
Table 1 divides the ship's sailing state into five categories. In the moored state, the ship is almost stationary and the main engine is not running [26,27]. In the other four states, the main engine runs and the ship's trajectory changes significantly; this part of the data is the focus of trajectory simplification. Therefore, in this paper, the sailing states are grouped into two parts according to whether the main engine is running. When the main engine is not running, only the first and last trajectory points of that part of the data need to be retained. When the main engine is running, the trajectory data are simplified using the adaptive threshold designed in this paper.
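A minimal sketch of this two-way grouping is given below. Table 1 is not reproduced in this text, so the `engine_running` predicate and its 1.0 knot cut-off are purely hypothetical placeholders for the real state classification; `thin_idle_runs` is likewise an illustrative name.

```python
from itertools import groupby


def thin_idle_runs(points, engine_running):
    """Keep only the first and last point of each run where the engine is off.

    Runs where the engine is on are passed through unchanged, so that they
    can be simplified later by the adaptive threshold.
    """
    kept = []
    for running, run in groupby(points, key=engine_running):
        run = list(run)
        if running or len(run) <= 2:
            kept.extend(run)
        else:
            kept.extend([run[0], run[-1]])
    return kept


# Points are (time, speed) tuples; the cut-off below is an assumption.
def engine_running(point):
    return point[1] >= 1.0


track = [(0, 0.1), (1, 0.0), (2, 0.2), (3, 5.0), (4, 6.0), (5, 0.1), (6, 0.0)]
print(thin_idle_runs(track, engine_running))
```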

Adaptive Thresholds
Whether compressed data can be used for further analysis depends on how well the threshold value in the compression algorithm suits the data source [23,24]. Most current data compression algorithms require manually defined compression thresholds. A great deal of testing is required to find a suitable threshold value, because this is a blind, trial-and-error operation. Adaptive thresholding can make compression more effective. The adaptive thresholding concept presented in this study has three aspects. First, the key variable part of the main engine emission equation is extracted for in-depth analysis, and the integral relational equation between pollutant emission and acceleration is developed. Second, an acceleration rate of change function is built using the integral relationship equation. This function not only reflects the accuracy of the emission computation under the integral relationship, but also allows for dynamic adjustment of the acceleration change at various speed intervals. Finally, the function is used to establish a threshold value for trajectory simplification; this threshold is an adjustable parameter that the user presets as the accuracy of the emission calculation.
Many trajectory simplification techniques cannot guarantee the trustworthiness of the compressed data in subsequent research. They can only promise that the retained information has a high trajectory similarity. The adaptive threshold suggested in this paper ensures that the quality of the compressed data is no longer unknown. Users can adjust the threshold value to balance the compression rate against data quality according to the required precision, without performing multiple experiments. This lowers the cost of compression and ensures the dependability of the compressed data.

Integral Relationship Equation
Even though a ship's track is continuous, AIS data are collected and stored discretely [30][31][32]. To calculate pollutant emissions, the continuous variation of each of the ship's characteristics must be estimated. If the mean value approach is used to estimate the velocity variation between two trajectory points, the error grows with the observed velocity difference between the two points. If the trajectory is instead interpolated at high density during this period, the computational cost rises. If the velocity information for this period is calculated using integration, the result is not only more accurate than the mean value approach, but also more efficient than high-density trajectory interpolation.
When calculating emissions using discrete AIS data, it is critical to determine the continuous variation of each parameter of a ship. If the deep relationship between parameter variation and emission calculation can be found, the ship trajectory data can be simplified to the maximum extent while ensuring the accuracy of the emission calculation. The main engine emission estimation model used in emission inventory production is shown in Equation (1) [26]. In the equation, $E_i$ stands for the main engine emissions of pollutant $i$, $P$ stands for rated engine power, $LF$ stands for main engine load factor, $Act$ stands for operation time, $EF$ stands for pollutant emission factor, $FCF$ stands for fuel correction factor, $LLA$ stands for low load adjustment factor, and $s$ stands for the sailing state of the ship. Among these, the main engine load factor must be calculated separately, and it is an important factor affecting the accuracy of the emission calculation. The classical formula for the main engine load factor is shown in Equation (2), where $V_a$ is the real-time speed and $V_m$ is the maximum design speed.
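As a rough illustration of how these quantities combine, the sketch below assumes that Equation (1) is the product of the listed factors and that Equation (2) is the commonly used cubic propeller law, LF = (V_a / V_m)^3. Both forms are assumptions made here, since the equations themselves do not appear in this text; the function names, units, and sample numbers are hypothetical.

```python
def load_factor(v_actual: float, v_max: float) -> float:
    """Main engine load factor under the assumed cubic propeller law."""
    return (v_actual / v_max) ** 3


def main_engine_emission(power_kw: float, v_actual: float, v_max: float,
                         hours: float, ef: float,
                         fcf: float = 1.0, lla: float = 1.0) -> float:
    """Emission for one period at constant speed (assumed product form).

    power_kw: rated engine power P
    hours:    operation time Act
    ef:       pollutant emission factor EF, e.g. in g/kWh
    fcf, lla: fuel correction and low load adjustment factors
    """
    return power_kw * load_factor(v_actual, v_max) * hours * ef * fcf * lla


# Illustrative values only: a 10,000 kW engine at half of its design speed.
print(load_factor(10.0, 20.0))
print(main_engine_emission(10_000.0, 10.0, 20.0, hours=2.0, ef=10.0))
```

Because the load factor is cubic in speed, halving the speed cuts the load factor to one eighth, which is why speed changes between retained points matter so much for the emission estimate.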
Let the velocity of trajectory point $P_1$ be $V_1$, the velocity of $P_2$ be $V_2$, the time difference be $Act$, and the rate of change of velocity during this period be $a$. After the ship's main engine power and sailing state are determined, the variable part of the main engine emission equation, $LF \times Act$, is extracted, and the integral transformation is carried out. The formula for calculating the main engine emission is shown in Equation (1), and the integral relationship is shown in Equations (5) and (6).
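The integral transformation can be worked through as follows. This derivation is a sketch rather than a reproduction of Equations (5) and (6): it assumes the cubic load factor $LF = (V/V_m)^3$ and a constant acceleration $a$ between the two points, so that $V(t) = V_1 + at$ and $V_2 = V_1 + a \cdot Act$.

```latex
\begin{aligned}
\int_{0}^{Act} LF \, dt
  &= \frac{1}{V_m^{3}} \int_{0}^{Act} \left(V_1 + a t\right)^{3} dt \\
  &= \frac{\left(V_1 + a \cdot Act\right)^{4} - V_1^{4}}{4 a V_m^{3}}
   = \frac{V_2^{4} - V_1^{4}}{4 a V_m^{3}}, \qquad a \neq 0 \\
  &= \frac{Act \left(V_1 + V_2\right)\left(V_1^{2} + V_2^{2}\right)}{4 V_m^{3}}
\end{aligned}
```

The last line uses $V_2 - V_1 = a \cdot Act$ and the factorization $V_2^4 - V_1^4 = (V_2 - V_1)(V_2 + V_1)(V_2^2 + V_1^2)$, and it remains valid when $a = 0$, where it reduces to $(V_1/V_m)^3 \cdot Act$.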
Equation (6) converts the formula based on real-time velocity and time difference into an integral relationship based on acceleration. If the acceleration over a segment of the trajectory is stable, the intermediate trajectory points can be discarded without losing critical information and without affecting the emission calculation.
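The difference between the mean value approach and integration can be checked numerically. The sketch below assumes a cubic load factor and a linear speed change between two points; under those assumptions the integral of the load factor over the period has the closed form Act (V1 + V2)(V1^2 + V2^2) / (4 V_m^3). All function names and sample values are hypothetical.

```python
def lf_time_integral(v1, v2, v_max, duration):
    """Closed-form integral of (V(t)/v_max)**3 for a linear speed change."""
    return duration * (v1 + v2) * (v1 ** 2 + v2 ** 2) / (4 * v_max ** 3)


def lf_time_mean_speed(v1, v2, v_max, duration):
    """Mean value approach: evaluate the load factor at the average speed."""
    return duration * ((v1 + v2) / 2 / v_max) ** 3


def lf_time_numeric(v1, v2, v_max, duration, steps=10_000):
    """Dense interpolation: midpoint Riemann sum over many small steps."""
    dt = duration / steps
    total = 0.0
    for k in range(steps):
        v = v1 + (v2 - v1) * (k + 0.5) / steps
        total += (v / v_max) ** 3 * dt
    return total


exact = lf_time_integral(5.0, 15.0, 20.0, 1.0)
mean = lf_time_mean_speed(5.0, 15.0, 20.0, 1.0)
dense = lf_time_numeric(5.0, 15.0, 20.0, 1.0)
print(exact, mean, dense)
```

In this example the mean value approach underestimates the integral, while the closed form matches the dense interpolation with a single evaluation instead of thousands.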

Threshold Function of Acceleration Rate of Change
Designing the threshold function to compress the data directly from the emission calculation formula is cumbersome, so this study simplifies the process with the help of Equations (5) and (6). For the starting point $P_s$, the endpoint $P_e$, and an intermediate trajectory point $P_i$, the emissions can be calculated for three periods, $E_{s,i}$, $E_{i,e}$, and $E_{s,e}$, and the errors can be analyzed. Here, $\sigma'_E$ is the error measured with $E_{s,e}$ as the standard. Calculating the error directly from the emission formula is more complicated, because all the parameters in Equation (3) must be substituted. When the accelerations of the three periods are determined, $C$ is a constant that is not affected by the magnitude of the real-time velocity. When the acceleration change is constant, $E_{s,e}$ and $\sigma'_E$ in Equation (8) show opposite trends, while $E_{s,e}$ and the real-time velocity show the same trend. Therefore, the main influence on the emission error is the variation of the real-time velocity over a certain period, which can be expressed in terms of acceleration. Because using a constant acceleration change to set the threshold leads to different simplification effects for high-velocity and low-velocity data, the threshold function must also adjust the acceleration change adaptively at different velocity intervals. In this study, Riemann integral relations for the three accelerations are established with the help of Equations (5) and (6).
Equation (9) is the Riemann integral relation, and Equation (10) is its expansion, where $Act$ denotes the period, $a_{s,i}$ denotes the acceleration from $P_s$ to $P_i$, $a_{i,e}$ denotes the acceleration from $P_i$ to $P_e$, $a_{s,e}$ denotes the acceleration from $P_s$ to $P_e$, and $V$ is the velocity. Setting $S$ and $S'$ as the expressions in Equations (11) and (12), we obtain the acceleration rate of change function $\sigma$, which is shown in Equation (13).
$\sigma$ reflects the fluctuation of the acceleration; the smaller it is, the more stable the acceleration. In addition, $\sigma$ adaptively changes the permitted amount of acceleration change at different velocity intervals. A smaller value also reflects a smaller emission calculation error. When $\sigma$ equals 0, the acceleration is constant, and the emission calculation error between the compressed data and the original data is also 0. The value of $\sigma$ approximates the emission calculation error. Therefore, this paper uses $\sigma$ to set an adaptive threshold for trajectory simplification, which is equivalent to presetting the emission calculation error to ensure the quality of the compressed data, and avoids the need to determine an appropriate threshold through extensive experiments.
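The behavior described here can be illustrated with a stand-in function. Equations (9) to (13) are not recoverable from this text, so the sketch below is not the paper's $\sigma$: it is a hypothetical relative difference between the load factor integral taken through the intermediate point and the integral taken directly from start to end, assuming a cubic load factor and constant acceleration within each period. It reproduces the two stated properties, namely that the value is 0 for constant acceleration and that it tracks the relative emission error.

```python
def speed_cubed_integral(v_a, v_b, duration):
    """Integral of V(t)**3 over a period with a linear speed change.

    The design speed is omitted because it cancels in the ratio below.
    """
    return duration * (v_a + v_b) * (v_a ** 2 + v_b ** 2) / 4


def sigma_stand_in(p_s, p_i, p_e):
    """Hypothetical stand-in for the acceleration rate of change function.

    Points are (time, speed) tuples. Returns the relative gap between the
    piecewise integral through p_i and the direct integral from p_s to p_e.
    """
    direct = speed_cubed_integral(p_s[1], p_e[1], p_e[0] - p_s[0])
    piecewise = (speed_cubed_integral(p_s[1], p_i[1], p_i[0] - p_s[0])
                 + speed_cubed_integral(p_i[1], p_e[1], p_e[0] - p_i[0]))
    if piecewise == 0:
        return 0.0  # the ship never moved, so nothing is lost
    return abs(piecewise - direct) / piecewise


# Constant acceleration: dropping the middle point loses nothing.
print(sigma_stand_in((0, 4.0), (10, 6.0), (20, 8.0)))
# Acceleration changes at the middle point: the value becomes positive.
print(sigma_stand_in((0, 4.0), (10, 8.0), (20, 8.0)))
```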

Trajectory Simplification
According to the data classification in Section 2.1, the trajectory simplification process for AIS data in the four navigation states in which the main engine is running is described below. The AIS trajectory is represented as the set of points $D = \{P_1, P_2, \ldots, P_i\}$. For each point $P_i$ on the trajectory between its starting point $P_s$ and its ending point $P_e$, the acceleration rate of change function is evaluated, and its maximum value $\sigma_{max}$ is found. If $\sigma_{max}$ exceeds the threshold, the corresponding point $P_{max}$ is retained, and the trajectory is split at that position. The algorithm is then applied recursively to both sub-trajectories. If $\sigma_{max}$ is below the threshold, only the points $P_s$ and $P_e$ of that sub-trajectory are retained. A schematic of the trajectory simplification process is shown in Figure 2.
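The recursive procedure described above can be sketched as follows. The scoring function is passed in as a parameter; the demonstration uses a deliberately simple, hypothetical score (relative deviation of the middle point's speed from linear interpolation) in place of the paper's $\sigma$, whose exact form is not available here. Treating a score equal to the threshold as "discard" is also an assumption.

```python
def simplify(points, score, threshold, start=0, end=None):
    """Return the sorted indices of the points retained between start and end."""
    if not points:
        return []
    if end is None:
        end = len(points) - 1
    if end - start < 2:
        # No intermediate points to test.
        return [start, end] if end > start else [start]
    # Find the intermediate point with the largest score.
    i_max = max(range(start + 1, end),
                key=lambda i: score(points[start], points[i], points[end]))
    if score(points[start], points[i_max], points[end]) <= threshold:
        return [start, end]
    # Keep the maximum point and recurse on both sub-trajectories.
    left = simplify(points, score, threshold, start, i_max)
    right = simplify(points, score, threshold, i_max, end)
    return left + right[1:]  # i_max appears in both halves


def speed_deviation(p_s, p_i, p_e):
    """Hypothetical placeholder score on (time, speed) tuples."""
    frac = (p_i[0] - p_s[0]) / (p_e[0] - p_s[0])
    expected = p_s[1] + (p_e[1] - p_s[1]) * frac
    return abs(p_i[1] - expected) / max(expected, 1e-9)


track = [(0, 4.0), (10, 6.0), (20, 8.0), (30, 8.0), (40, 8.0)]
print(simplify(track, speed_deviation, threshold=0.05))
```

In this toy track the ship accelerates steadily and then holds its speed, so only the point where the acceleration changes is retained between the two endpoints.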

Figure 2. Schematic of the trajectory simplification process. In the first step, we keep the first node $P_1$ and the last node $P_{13}$, and find the maximum point $P_6$. In the second step, if $\sigma_{max}$ exceeds the threshold, keep $P_6$ and split the trajectory. In the third step, recursively judge the two trajectories, find the maximum point, and judge $\sigma_{max}$; if it does not exceed the threshold, then discard all trajectory points except for the first and last nodes. Recursively repeat the judgment, and finally, obtain the simplified trajectory containing only four trajectory points.

Compression Evaluation
In this paper, considering the needs of practical applications, the proposed algorithm pays more attention to the computational error of emissions from compressed data. The compression performance is evaluated in three aspects: compression ratio, emission calculation error, and runtime complexity. The compression ratio is derived by dividing the number of discarded trajectory points by the number of original trajectory points. The emission calculation error represents the standard error between the emissions calculated from the compressed data and those calculated from the uncompressed data.
$$CR = \frac{N - N_s}{N} \times 100\% \qquad (14)$$
In the above equation, $CR$ is the compression rate, $N$ is the number of trajectory points on the original trajectory, and $N_s$ is the number of trajectory points on the simplified trajectory. $\sigma_E$ denotes the error in the emission calculation, $E_o$ is the emission calculated from the original uncompressed data, and $E_s$ is the emission calculated from the compressed data.
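Both metrics are straightforward to compute. In the sketch below, the compression rate follows Equation (14); the emission error is written as a relative error with respect to the uncompressed result, which is an assumption, since the defining equation for $\sigma_E$ does not appear in this text. The function names and sample values are hypothetical.

```python
def compression_rate(n_original: int, n_simplified: int) -> float:
    """Share of discarded trajectory points, in percent (Equation (14))."""
    return (n_original - n_simplified) / n_original * 100.0


def emission_error(e_original: float, e_simplified: float) -> float:
    """Assumed relative error of the compressed-data emission, in percent."""
    return abs(e_original - e_simplified) / e_original * 100.0


print(compression_rate(1000, 100))
print(emission_error(200.0, 199.0))
```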

Data Sources
The proposed approach is implemented and compared with other algorithms using one month of AIS data from the sea region of the Shandong emission control area and three months of AIS data from the Ningbo port, to further assess the algorithm's effectiveness. The static ship database comes from Clarkson's database and Lloyd's database and mainly includes parameters such as ship length, ship breadth, ship depth, main engine power, auxiliary engine power, boiler power, rated speed, and ship tonnage. The dynamic AIS data include parameters such as the ship's MMSI code, longitude, latitude, heading, course over ground, real-time speed, and position accuracy. The AIS data and static database are checked after the outlier identification criteria are decided [26,27]. A random forest model was used to fill in missing values and replace outliers in the design parameters [33]. Cubic spline interpolation was used to fill in missing and anomalous values in the AIS data [34]. NOx is the pollutant used in the emission calculations. The experiments use Python 3.9 as the programming language and PyCharm 11.0.10 as the development environment. The comparison experiments use an identical hardware setup and software environment. Because the method relies on recursion, the interpreter's recursion limit is set to 30,000 according to the actual quantity of data.
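For readers reproducing the setup, the recursion limit can be raised with the standard `sys.setrecursionlimit` call. The snippet below is a small sketch: the value 30,000 comes from the text, while the `depth` helper is a hypothetical function used only to show that a call chain deeper than the default limit now succeeds.

```python
import sys


def depth(n: int) -> int:
    """Recurse n levels deep and return the depth reached."""
    return 0 if n == 0 else 1 + depth(n - 1)


# Python's default limit is typically 1000 frames, which is too shallow
# for a recursive simplification over long trajectories.
sys.setrecursionlimit(30_000)
print(sys.getrecursionlimit())
print(depth(5_000))
```

Raising the limit only lifts the interpreter's own guard; very deep recursion can still exhaust the underlying stack on some platforms, so an explicit stack is a common alternative when trajectories are extremely long.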

Experiments and Analysis
The compression rate can generally be improved by raising the threshold, but more information is lost in the process. Different thresholds were chosen for the suggested algorithm so that they could be compared; as shown in Tables 2 and 3, the compression rate rises when a higher threshold factor is applied. For each of the seven compression rates, the discrepancy between the pollutant emissions calculated from the compressed data and from the original data is minimal. This demonstrates that the suggested algorithm's data compression method can be used successfully in emission inventory studies. A threshold scheme that uses the emission calculation formula directly is also compared in this paper with the threshold based on the acceleration rate of change. As shown in Figure 3, the emission calculation errors of the two at the same compression rate are very close. This also supports the reasonableness of the threshold design of the proposed algorithm.

Table 2. Partial performance comparison of each compression algorithm at different thresholds (Shandong Province data). Seven different thresholds are set for each algorithm, corresponding to the computational error of emissions at seven compression ratios. To facilitate the cross-sectional comparison, the thresholds are chosen so that the data compression ratios under different algorithms are as close as possible.
The suggested compression technique is compared with three existing compression algorithms to further assess its performance. These are the top-down time ratio (TD-TR) method [7], the Douglas-Peucker (DP) algorithm [6], and the compression algorithm considering the behavior of the ship (CSB) [20]. These three algorithms mainly consider basic characteristics of the ship (latitude and longitude, time stamp, real-time speed, heading, etc.). They are explained in detail in Appendix B. In this study, seven sets of thresholds were chosen for each compression technique, and two datasets, from Shandong Province and the Ningbo Port, were used for the experiments. Tables 2 and 3 show the computational errors of the main engine emissions, as well as the compression rates of the four compression algorithms at various thresholds. To ensure that the four algorithms can be fully compared side by side, the thresholds were set after several experiments. As shown in Figure 4, the emission calculation errors of each compression algorithm were compared at compression rates of 90%, 94%, and 98%. Because it is harder to hit a precise compression rate when choosing thresholds for the other algorithms, the compression rate was allowed to deviate by up to 2%.
Tables 2 and 3 further illustrate how higher thresholds result in higher compression rates and greater information loss. Each algorithm performs differently, as illustrated in Figure 4. The emission calculation error of the DP algorithm is particularly high when the compression rate is 98%, reaching 59.321% and 36.983% in the two datasets, respectively. This is because the DP algorithm compresses the data using position information alone and ignores substantial speed fluctuations. As a result, the DP algorithm performs the poorest at the various compression rates. The approach suggested in this research performs an order of magnitude better than the existing algorithms and shows the minimum error in emission computation at the various compression rates. Even when the compression rate is raised to 98%, the emission calculation error of the suggested approach still fluctuates only slightly, at 2.221% and 1.890% in the two datasets. Other algorithms at the same compression rate have emission calculation errors of more than 20%. Because it jointly considers position, speed, and direction information, the compression method that takes ship behavior into account has an emission calculation error of 23.14% at the compression rate of 98%, which is much better than that of the DP and TD-TR algorithms. However, there is still a sizable gap between this algorithm and the algorithm suggested in this study, because it disregards the precise velocity fluctuation between the compressed trajectory points.
The emission calculation error of the proposed technique decreases significantly as the compression rate drops, which is in line with the theory of the algorithm put forth in this study. The other compression techniques follow the same pattern, but when the compression rate is too high, their emission calculation error becomes so large that the compressed data are difficult to use in emission inventory investigations. This is because the other compression algorithms do not carefully consider how the motion characteristics of the ship and the pollution emissions relate to one another.

Table 4 shows the running time complexity of the four compression algorithms [35]. The algorithm proposed in this paper is divided into two parts. The first part traverses all the data to mark and divide the navigation states, and its time complexity is O(n). The second part processes the divided sailing state data, mainly the data of the sailing states in which the ship's main engine is operating; its time complexity is O(m log m), where m denotes the amount of data for this sailing state. The running time complexity of the DP and TD-TR algorithms depends on the algorithm design: it is O(n^2) if the dynamic sliding window approach is used and O(n log n) if the iterative approach is used. The compression algorithm considering the ship's behavior is divided into two parallel parts. The first part compresses the position data using the DP algorithm, so its time complexity is O(n^2) or O(n log n). The second part compresses the speed and heading data using a fixed sliding window approach, so its time complexity is O(n). Although the running time complexity of the second part of the proposed algorithm is not optimal among all algorithms, the division performed in the first part reduces the amount of data processed each time. Therefore, it is possible to maintain a short running time while ensuring the superiority of the proposed algorithm. The results show that the algorithm proposed in this paper can guarantee computational accuracy at a high compression rate and is suitable for the study of emission inventories.
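The first, linear-time part described above can be sketched as a single pass that cuts the record sequence wherever the navigation state changes. This is a minimal sketch rather than the authors' implementation: the `state` key, the `"stopped"` label for segments in which the main engine is not running, and all function names are hypothetical, and the rule that assigns a state to each record (Table 1) is assumed to have been applied beforehand.

```python
from itertools import groupby
from typing import Dict, List, Tuple

Record = Dict[str, object]  # one AIS record; "state" is a hypothetical key


def split_by_state(records: List[Record]) -> List[Tuple[object, List[Record]]]:
    """Cut the sequence at every change of navigation state in one O(n) pass."""
    return [(state, list(group))
            for state, group in groupby(records, key=lambda r: r["state"])]


def compress_stopped(segment: List[Record]) -> List[Record]:
    """Keep only the first and last record of a main-engine-stopped segment."""
    return segment if len(segment) <= 2 else [segment[0], segment[-1]]


def first_pass(records: List[Record], stopped_state: object = "stopped"):
    """Return (kept, pending): `kept` holds the reduced stopped segments,
    `pending` holds the segments left for the second part of the algorithm."""
    kept, pending = [], []
    for state, segment in split_by_state(records):
        if state == stopped_state:
            kept.append(compress_stopped(segment))
        else:
            pending.append(segment)
    return kept, pending
```

Because each pending segment holds only m of the n records, the second part runs on shorter inputs, which is the effect the complexity discussion relies on.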

Conclusions
In this paper, we propose a trajectory data compression algorithm based on the ship's navigation state and acceleration variation. The proposed algorithm exhibits three novelties. First, the data are classified by navigational state, and the points at which the navigational state changes are retained as key features. Second, the simplification algorithm combines the main engine load and rated speed to investigate the underlying relationship with pollutant emissions, which makes it applicable to the study of emission inventories. Third, the simplification algorithm adaptively determines the threshold value using the acceleration rate of change function. Numerical experiments were used to test the performance of the proposed algorithm. The results show that the proposed algorithm maintains very low emission calculation errors at high compression rates and can achieve almost the same results as the original data in the study of emission inventories. The other algorithms show high errors in emission calculations, and their compressed data are not applicable to the study of emission inventories. Compared with the other algorithms, the proposed algorithm can guarantee the quality of the compressed data by controlling the variation of acceleration with preset emission calculation error values, which avoids the need to determine an appropriate threshold value through extensive experiments. In addition, data classification reduces the depth of data processing iterations and improves the operational efficiency. Therefore, the proposed algorithm exhibits good comprehensive performance. Future studies may employ a distributed approach to reduce the running time and may also consider an adaptive threshold in terms of compression rate [36].

for each T in T(NS_0) do
    Add T[0] into Simplified_trajectory
    Add T[n − 1] into Simplified_trajectory
for each T in T(NS

Figure 1. Framework diagram of the simplification algorithm. First, the AIS data are classified according to the navigation status, and only the first and last trajectory points are retained for the part of the data where the main engine is not running (see data compression branch 2). Then, for the part of the data where the main engine is running, the maximum value σ_max of the acceleration rate of change function over all intermediate trajectory points P_i is calculated and compared with the set threshold value to determine whether to retain or delete the point (see data compression branch 1).

Figure 2. Schematic diagram of the trajectory simplification process. There are 13 trajectory points. In the first step, the first node P_1 and the last node P_13 are kept, and the maximum point P_6 is found. In the second step, if σ_max exceeds the threshold, P_6 is kept and the trajectory is split. In the third step, the two sub-trajectories are judged recursively: the maximum point is found and σ_max is checked; if it does not exceed the threshold, all trajectory points except the first and last nodes are discarded. The judgment is repeated recursively until the simplified trajectory, containing only four trajectory points, is obtained.
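The recursion described in the Figure 2 caption can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the acceleration rate of change function is not reproduced here, so `score` is a hypothetical stand-in supplied by the caller, and `simplify`, `speed_deviation`, and the (time, speed) point layout are illustrative choices.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]  # (time in s, speed in m/s), an assumed layout


def simplify(points: Sequence[Point],
             score: Callable[[Point, Point, Point], float],
             threshold: float) -> List[Point]:
    """Top-down split: keep both endpoints, find the intermediate point with
    the largest score, keep it and recurse on both halves if the score
    exceeds the threshold, otherwise drop every intermediate point."""
    if len(points) <= 2:
        return list(points)
    first, last = points[0], points[-1]
    best_i = max(range(1, len(points) - 1),
                 key=lambda i: score(first, points[i], last))
    if score(first, points[best_i], last) <= threshold:
        return [first, last]
    left = simplify(points[:best_i + 1], score, threshold)
    right = simplify(points[best_i:], score, threshold)
    return left[:-1] + right  # the split point is shared, keep it once


def speed_deviation(first: Point, p: Point, last: Point) -> float:
    """Placeholder score, NOT the paper's function: gap between the observed
    speed and the speed interpolated linearly between the two endpoints."""
    (t0, v0), (t, v), (t1, v1) = first, p, last
    if t1 == t0:
        return abs(v - v0)
    return abs(v - (v0 + (v1 - v0) * (t - t0) / (t1 - t0)))
```

Any score with the same signature can be passed in; the recursion itself only mirrors the keep, split, and discard steps of the caption.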

Figure 3. Emission calculation errors obtained with the two thresholds σ and σ'_E at the same compression rate. Method 1 uses the threshold considering the acceleration rate of change, and Method 2 uses the threshold considering the standard error of emissions calculation. The emission calculation errors of the two threshold setting methods at the same compression rate are close, which supports the reasonableness of the threshold design of the proposed algorithm.

Figure 4. Comparison of emission calculation errors of each compression algorithm at different compression rates. The compression rates are divided into three grades (90%, 94%, and 98%), and the four algorithms are compared horizontally. The left graph shows the comparative analysis of data from the Shandong Province, and the right graph shows the comparative analysis of data from the Ningbo Port. At a compression rate of about 98%, the other algorithms have a large error in emission calculation because they lose a large amount of information concerning the ship and the emission calculation, while the proposed algorithm still maintains a small error.

Table 1. Basis for determining the vessel navigation status.

Table 3. Partial performance comparison of each compression algorithm under different thresholds (Ningbo Port data).

Table 4. Running time complexity of each compression algorithm. The time complexity is a function that evaluates the time consumed to execute the program and allows for the estimation of program processor use. It is usually expressed in big O notation, which excludes the lower order terms and the leading coefficient of this function, and it is evaluated as the amount of input data tends to infinity. The running time complexity of the proposed algorithm is not optimal among all algorithms, but the data classification reduces the amount of data processed per iteration. Therefore, the overall efficiency is still high.
= 1 to n − 1 do
    if O[i] is dividing point of navigational states or i = n − 1
        part = O[temp : i + 1]
        if the navigational state of part is the navigational state of main engine stop operation
            Add part into T(NS_0)