An AIS Data-Driven Approach to Analyze the Pattern of Ship Trajectories in Ports Using the DBSCAN Algorithm

Abstract: As the maritime industry enters the era of maritime autonomous surface ships, research into artificial intelligence based on maritime data is being actively conducted, and the advantages of profitability and the prevention of human error are being emphasized. However, although many studies have been conducted relating to oceanic operations by ships, few have addressed maneuvering in ports. Therefore, in an effort to resolve this issue, this study explores ship trajectories derived from automatic identification system data collected from ships arriving in and departing from the Busan New Port in South Korea. The collected data were analyzed by dividing them into port arrival and departure categories. To analyze ship trajectory patterns, the density-based spatial clustering of applications with noise (DBSCAN) algorithm, a machine learning clustering method, was employed. As a result, in the case of arrival, seven clusters, including the leg and turning section, were derived, and departure was classified into six clusters. The clusters were then divided into four phases and a pattern analysis was conducted for speed over ground, course over ground, and ship position. The results of this study could be used to develop new port maneuvering guidelines for ships and represent a significant contribution to the maneuvering practices of autonomous ships in port.


Introduction
With the fourth Industrial Revolution facing ongoing technological changes throughout multiple industries, various technologies are being developed within the global maritime industry to realize the unmanned autonomous navigation of ocean-going vessels [1]. Such vessels are defined by the International Maritime Organization (IMO) as maritime autonomous surface ships (MASSs) [2]. At the core of the technology required to deploy a MASS is the use of artificial intelligence based on big data generated by ships, including the fields of advanced smart sensors, wireless networks, and cyber security [3][4][5]. In addition to the economic advantages of reducing total operating costs by 10-20% while investing minimal manpower, the development of MASSs also offers great advantages in the prevention of human errors, which account for most maritime accidents [6].
In particular, data are being collected from the automatic identification systems (AISs) and electronic chart display and information systems utilized by ships, and the development of optimal routes as well as of future prediction and decision-making technologies related to vessel operation, such as navigational route pattern analysis and research, is being actively undertaken [7][8][9]. In Lee et al.'s study, for a quantitative stability verification method relating to ship route design, a new sea route was created based on AIS data and compared with existing ones [10]. This study was significant in that the route through which the ship actually passed was extracted on the basis of big AIS data. In addition, Rong et al.'s study applied the density-based spatial clustering of applications with noise (DBSCAN) algorithm, a machine learning clustering method based on AIS data, to classify the characteristics of a route and detect anomalies in vessels that exhibited unusual movement [11]. In a study by Wen et al., similar to Rong et al.'s study, the DBSCAN algorithm was applied to extract the turning section of a ship's trajectory, and a study was conducted to automatically design the route [12]. Pallotta et al. also analyzed the pattern of a ship trajectory based on AIS data as well as anomaly detection and route prediction [13]. In addition, in the study of Ciaccio et al., AIS data have been elaborated together with safety criteria to ensure safe navigation, even in difficult visibility conditions [14].
All of the preceding studies were based on data from the oceanic voyages of ships, i.e., pilot-station-to-pilot-station passages. However, relatively little research has been conducted on maneuvering technology in port settings compared to other MASS technologies relating to the patterns of routes. In the case of ship berthing technology, Lee et al. employed machine learning based on berthing velocity data in their analysis and derived predictions from the data [15]. In addition, research to classify berthing velocity by types of pilots is also being conducted [16]. However, few studies have addressed technological development in the context of the patterns of port entry and departure by ships and how they approach piers. In unmanned ship development projects such as the Advanced Autonomous Waterborne Applications Initiative (AAWA), led by Rolls-Royce, and Maritime Unmanned Navigation through Intelligence in Networks (MUNIN), promoted by the EU from 2012 to 2015, ship maneuvering in ports and berthing-related components of ship operations were excluded, and studies were conducted on ship automation [17,18]. In addition, a few cases were applied to actual large ships [1].
The stages of a ship's entry to a port and the berthing process largely comprise the following steps. First, the ship enters the port. Second, the ship approaches its designated docking pier at a safe approaching speed. Third, when the ship is suitably close to the pier, it berths in a parallel position [19]. The ship's berthing is affected by the port's characteristics, natural environmental elements such as the wind and tide, and other characteristics according to the type and size of ship, as well as human factors [20]. At the present time, the human factor corresponds to pilots who are experts at navigating the port environment, and major ports around the world safely dock ships by bringing pilots on board when ships arrive and depart. However, pilot negligence, i.e., human error, sometimes leads to shipping accidents [16]. In addition, no port in Korea has an official guideline for ship maneuvering in port, and ships are fully dependent on the experience and knowledge of the pilots. For instance, the 2014 oil spill accident in Yeosu of the WU YI SAN, a tanker with a gross tonnage of 160,000 tons, resulted from a collision with a terminal because the pilot did not sufficiently control the ship's approach speed [21]. In addition, an accident in which a container ship with a gross tonnage of about 150,000 tons collided with a gantry crane in 2020 at the Busan New Port was also caused by a pilot's failure to control the ship's berthing velocity [22].
In order to solve these problems, it is necessary to analyze the predominant patterns of ship trajectory. The analysis was based on AIS data for ship arrival and departure in a port. Through the results of this study, a basic study on the MASS technology of ship maneuvering in ports can be proposed, and human error can be prevented. In addition, in the short term, it is possible to propose ship maneuvering guidelines for ports. Accordingly, a flowchart of this investigation is shown in Figure 1. In this study, AIS data on vessels arriving at and departing from the Busan New Port in Korea were collected and analyzed by dividing them into port arrival and departure data. The algorithm that was then used to analyze the patterns of ship trajectories was the same DBSCAN algorithm as that used in previous navigational route studies.

Target Port
The target port to be analyzed in this study was the Busan New Port in Korea. Busan New Port is the world's sixth-largest port in terms of gross container volume handling, and the target port has large-scale container ship berths and sees the frequent entry and departure of large container ships [23]. In addition, as an accident involving a gantry crane collision recently occurred, it was determined that this location would be suitable in terms of developing a new port maneuvering guideline. The geographical location of Busan New Port is shown in Figure 2. Figure 3 shows the topographic characteristics of Busan New Port at the time of the accident involving the gantry crane. Figure 3a displays a British Admiralty Chart of Busan New Port. According to the chart, when a ship arrives at Busan New Port, it first enters a waterway called Gadeog Sudo. After passing through the waterway, the ship passes a small island called Todo to berth at the pier, as shown in Figure 3b.

Automatic Identification System Data
AIS is an automatic identification system that exchanges information at 150 MHz very high frequency (VHF) with nearby ships, stations, and satellites in order to identify and locate ships or other maritime traffic [25].AIS information includes the date and time and the ship's name, call sign, position (latitude and longitude), speed over ground (SOG, in knots), course over ground (COG), gyro heading, and rate of turn, etc.
In this study, time and date, position, SOG, and COG details were used as variables to be analyzed for ship trajectory patterning. AIS data on ships berthing at the target pier were collected for the four months from January 2020 to April 2020, when the gantry crane collision occurred. The ship type was a container ship and the target vessels featured a gross tonnage of 100,000 tons or more. The specifics of the collected AIS data are shown in Table 2. The peculiarity of the AIS data analyzed is that the data reception interval differed depending on the ship's speed and changing course [26]. For instance, dynamic information on the Class-A AIS used by Safety of Life at Sea ships was received at a cycle of 10 s for ships sailing at less than 14 knots and at 3.3 s when the ship's course changed. In the case of the Class-B AIS, it was received in 30 s in the same situation. Therefore, in this study, as the times of entry and departure of ships differed, unifying the time unit was essential for analyzing the time series.

Basic Statistics
Figure 4a shows the results of plotting the AIS raw data collected from January to April 2020 in the area corresponding to Busan New Port, according to the information listed in Table 2, into a geographic information system (GIS). The GIS software used for the analysis was ArcGIS Pro. Figure 4b shows the result of organizing only the ships docked at the target pier in the raw data. According to the AIS data, the number of target ships corresponding to the arrival and departure was 42 each, and as there were ships that berthed two or three times during the period, the total number of trajectories was 50 each. Table 3 shows the results of the basic frequency analysis. Both the arrivals and departures were divided into cases of passing Todo to the left or passing to the right. Table 4 shows the results of counting which way each pier-specific ship passed Todo. In the case of the arriving ships, the number of ships passing Todo on the left and right was similar at 23 and 27 times, respectively. However, in the case of departure, the number of times ships passed to the left was 31, and to the right, 19.

Data Pre-Processing
In terms of data mining, data pre-processing is an essential step for improving the performance of analytical results [15,27]. In particular, the AIS data to be used in this study had to be pre-processed into data suitable for analysis because they contain standard and reception errors due to non-input information from ships [8,28]. In addition, because of the characteristics of the AIS data interval mentioned above, it was necessary to utilize a pre-processing technique for unit scaling. Therefore, in this study, the data cleaning and scaling methods were applied among the pre-processing methods.
Data cleaning consists of processing missing values and noisy data to make them suitable for analysis [27]. The AIS error data, which can be called noisy data, were pre-processed through the list-wise deletion method [29]. In addition, the entire dataset was divided into arrival and departure categories for analysis. The time when the pilot boarded and disembarked was the basis for the ship's arrival and departure, respectively. Outside the port in particular, the ships moved irregularly, for example by drifting and anchoring. Therefore, to analyze the pattern of maneuvering in the port, the dataset was pre-processed based on the location of the pilot station, latitude 34.93 N. The result is shown in Figure 5.
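The cleaning step described above can be illustrated with a minimal sketch. The record layout (a list of dictionaries with the keys `time`, `lat`, `lon`, `sog`, and `cog`) and the function name `clean_ais` are hypothetical, and keeping the points at or north of 34.93 N is an assumption based on the port lying north of the pilot station; the source does not state how the cutoff was applied.

```python
# Hypothetical field names; the source does not describe the data schema.
REQUIRED_FIELDS = ("time", "lat", "lon", "sog", "cog")
PILOT_STATION_LAT = 34.93  # degrees north, as given in the text


def clean_ais(records):
    """Apply list-wise deletion, then keep points at or north of the pilot station."""
    cleaned = []
    for rec in records:
        # List-wise deletion: drop the whole record if any required field is missing.
        if any(rec.get(field) is None for field in REQUIRED_FIELDS):
            continue
        # Assumed direction of the cutoff: discard positions south of the pilot station.
        if rec["lat"] < PILOT_STATION_LAT:
            continue
        cleaned.append(rec)
    return cleaned


sample = [
    {"time": 0, "lat": 34.95, "lon": 128.80, "sog": 9.1, "cog": 331.0},
    {"time": 10, "lat": 34.96, "lon": 128.80, "sog": None, "cog": 332.0},  # missing SOG
    {"time": 20, "lat": 34.90, "lon": 128.81, "sog": 8.0, "cog": 330.0},   # outside the port
]
kept = clean_ais(sample)
print(len(kept))
```

Only the first sample record survives: the second is removed by list-wise deletion and the third by the latitude cutoff.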
Data scaling is the process by which data units are standardized [20]. When analyzing data, they should be standardized so that no error is transmitted to the analysis result values due to the differences in units [27]. The data in this study all featured different arrival and departure times, and there was a difference in the intervals due to the ship's speed; therefore, the unit was unified for the time series analysis through scaling. In this study, among the data scaling methods, min-max normalization was used. When the time according to position i of each ship was S(t_i), the time corresponding to the start point of the section was defined as S(t_min), and the final point was defined as S(t_max). In this case, the equation used in this study for the data scaling was as follows:
$$\text{MinMax time normalization} = \frac{S(t_i) - S(t_{\min})}{S(t_{\max}) - S(t_{\min})} \qquad (1)$$
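As an illustration of Equation (1), the following sketch scales the timestamps of one trajectory section to the range from 0 to 1. The function name `minmax_time` and the sample timestamps (in seconds) are hypothetical.

```python
def minmax_time(times):
    """Scale timestamps of one section to [0, 1] following Equation (1)."""
    t_min, t_max = min(times), max(times)
    span = t_max - t_min
    if span == 0:
        raise ValueError("a section needs at least two distinct timestamps")
    return [(t - t_min) / span for t in times]


# Irregular reception intervals, as is typical for AIS data.
timestamps = [0.0, 10.0, 20.0, 23.3, 30.0]
scaled = minmax_time(timestamps)
print(scaled)
```

After scaling, trajectories of different durations share a common time axis, so the first point of every section maps to 0 and the last to 1.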

Definition of the DBSCAN Algorithm
Density-based spatial clustering of applications with noise (DBSCAN) is a density model-based clustering algorithm [30]. As one of the unsupervised learning techniques in machine learning, it uses data location information. K-means, which is often used for clustering, clusters on the basis of the distance between clusters, and the hierarchical and fuzzy clustering methods are also based on the distance between points [30,31]. The grid-based method turns the object space into a finite number of spaces composing a grid structure, and all clustering processes are performed in the grid structure. The model-based method hypothesizes a model for each cluster and finds the best fit of the data to that model. It is based on the assumption that the data are produced by a mixture of probability distributions.
However, DBSCAN is a density-based clustering method that groups densely located points into clusters [32]. Furthermore, clusters with arbitrary geometric shapes can also be detected. This method does not determine the number of clusters in advance, and it is possible to analyze very large databases. In this study, the dataset distributed according to the vessel AIS reception interval was analyzed over a time series. AIS data will be received at a high density when the vessel decreases speed and changes course. Furthermore, if the ship is moving at a high speed, the data will be distributed at a lower density. Therefore, the DBSCAN algorithm, one of the density-based methods, is appropriate for the clustering method of the dataset used in this study.
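The relation between speed, reception interval, and point density can be made concrete with a short worked calculation. The sketch below uses the Class-A intervals quoted earlier in this paper (10 s, and 3.3 s while changing course); the function name `report_spacing_m` and the sample speeds are illustrative assumptions.

```python
KNOT_IN_M_PER_S = 1852.0 / 3600.0  # one knot is one nautical mile (1852 m) per hour


def report_spacing_m(sog_knots, interval_s):
    """Distance travelled between two consecutive AIS reports, in metres."""
    return sog_knots * KNOT_IN_M_PER_S * interval_s


print(round(report_spacing_m(10.0, 10.0), 1))  # steady course at 10 knots
print(round(report_spacing_m(8.0, 3.3), 1))    # changing course at 8 knots
print(round(report_spacing_m(1.0, 10.0), 1))   # slow approach at 1 knot
```

Under these assumptions, consecutive reports are roughly 51 m apart at 10 knots on a steady course, but only about 14 m apart while turning at 8 knots and about 5 m apart at 1 knot, which is why turning sections and berthing appear as dense regions.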
In order to utilize DBSCAN, two parameters must be set: epsilon (ε) and Minimum Points (minPts) [33]. ε is the maximum distance (neighborhood radius) at which two data points are considered neighbors, and minPts is the minimum number of data points required within ε for a cluster to be recognized [34]. As is shown in Figure 6, if there are at least minPts points within ε, DBSCAN creates a cluster and expands it by performing the same check around neighboring data. When defining a dataset as D and a point in it as p, the ε neighborhood (N) can be defined in terms of the following equation:
$$N_{\varepsilon}(p) = \{\, q \in D \mid \mathrm{dist}(p, q) \le \varepsilon \,\}$$
When p satisfies the following equation, it constitutes the core point:
$$|N_{\varepsilon}(p)| \ge minPts$$
The border points are those near the core point, but with fewer than minPts within the ε neighborhood, and noise is defined as the points other than the core and border points.
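The distinction between core, border, and noise points can be sketched in a few lines of plain Python. This is an illustration of the definitions only, not the implementation used in this study; the function names and the toy coordinates are hypothetical, and a point is assumed to count as a member of its own ε neighborhood.

```python
import math


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def classify_points(points, eps, min_pts):
    """Label every point as 'core', 'border', or 'noise'."""
    neighbors = [region_query(points, i, eps) for i in range(len(points))]
    core = {i for i, nb in enumerate(neighbors) if len(nb) >= min_pts}
    labels = []
    for i, nb in enumerate(neighbors):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nb):
            labels.append("border")  # not dense itself, but within eps of a core point
        else:
            labels.append("noise")
    return labels


toy = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.25, 0.1), (2.0, 2.0)]
labels = classify_points(toy, eps=0.2, min_pts=4)
print(labels)
```

In this toy example the four points of the small square are core points, the fifth point is a border point because it has only three neighbors but lies within ε of a core point, and the isolated sixth point is noise.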

Decision of the Epsilon and Minimum Sample
There is no general way to determine ε and minPts, which are important parameters for applying the DBSCAN algorithm [11]. However, ε can be calculated through the reachability distance derived by the Ordering Points To Identify the Clustering Structure (OPTICS) algorithm [35]. The parameter ε calculated at the reachability distance is almost identical to that of the DBSCAN algorithm [36]. In OPTICS, the core distance connecting the core point is defined as the following equation:
$$\text{core-dist}_{\varepsilon,\,minPts}(p) = \begin{cases} \text{UNDEFINED}, & |N_{\varepsilon}(p)| < minPts \\ minPts\text{-dist}(p), & \text{otherwise} \end{cases}$$
where minPts-dist(p) is the distance from p to its minPts-th nearest neighbor. In the case of ε_max, the UNDEFINED value can be omitted and the reachability of point p from point o can be defined as the following equation:
$$\text{reachability-dist}_{minPts}(p, o) = \max\big(\text{core-dist}_{minPts}(o),\ \mathrm{dist}(o, p)\big)$$
In accordance with this equation, the reachability plot applied to the arrival and departure data used in this study is shown in Figure 7. Therefore, the value of ε that can adequately calculate the number of clusters is 0.2.
In order to set minPts, prior knowledge of the dataset to be used in the research area is required.In this study, as a result of inputting various minPts values to classify sub-clusters based on ε = 0.2, derived earlier, the most valid result was derived when minPts = 50.
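The core distance and reachability distance used in OPTICS can be illustrated with a minimal sketch. It follows the standard OPTICS definitions rather than any code from this study; the function names and the one-dimensional toy data are hypothetical, a point is assumed to count as its own first neighbor, and `None` stands for the UNDEFINED value.

```python
import math


def core_distance(points, p, min_pts, eps=math.inf):
    """Distance from points[p] to its min_pts-th nearest neighbor, or None if UNDEFINED."""
    dists = sorted(math.dist(points[p], q) for q in points)  # dists[0] is 0.0 (p itself)
    if len(dists) < min_pts or dists[min_pts - 1] > eps:
        return None
    return dists[min_pts - 1]


def reachability_distance(points, p, o, min_pts, eps=math.inf):
    """Reachability of points[p] from points[o]: max(core-dist(o), dist(o, p))."""
    cd = core_distance(points, o, min_pts, eps)
    if cd is None:
        return None
    return max(cd, math.dist(points[o], points[p]))


toy = [(0.0,), (1.0,), (2.0,), (4.0,)]
print(core_distance(toy, 0, min_pts=3))             # 2.0
print(reachability_distance(toy, 1, 0, min_pts=3))  # 2.0, limited by the core distance
print(reachability_distance(toy, 3, 0, min_pts=3))  # 4.0, limited by the actual distance
print(core_distance(toy, 0, min_pts=3, eps=1.5))    # None (UNDEFINED)
```

In a reachability plot, these distances are drawn in the order in which OPTICS visits the points; valleys correspond to dense regions, and a horizontal cut through the plot at a given height plays the role of ε.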

Application of the DBSCAN Algorithm
The ship trajectory based on AIS data includes ship latitude, longitude, SOG, and COG according to a time series. Therefore, the information contained in the AIS data was entered as a variable and applied to the DBSCAN algorithm. The DBSCAN algorithm was applied using Scikit-learn [37] in Python. The result is shown in Figure 8.
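A minimal sketch of such an application with Scikit-learn is given below, using ε = 0.2 and minPts = 50 as derived above. Everything else is an assumption for illustration: the data are synthetic stand-ins with made-up coordinates, the feature set is limited to latitude, longitude, SOG, and COG, the features are min-max scaled before clustering, and the wraparound of COG at 360 degrees is ignored. The source does not specify these details.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)

# Synthetic rows: [latitude, longitude, SOG (knots), COG (degrees)].
leg = np.column_stack([
    rng.normal(34.95, 0.002, 120),
    rng.normal(128.80, 0.001, 120),
    rng.normal(9.0, 0.3, 120),
    rng.normal(335.0, 2.0, 120),
])
berth = np.column_stack([
    rng.normal(35.07, 0.001, 120),
    rng.normal(128.83, 0.001, 120),
    np.abs(rng.normal(0.3, 0.1, 120)),
    rng.normal(70.0, 5.0, 120),
])
X = np.vstack([leg, berth])

# Assumed pre-processing: scale every feature to [0, 1] so that eps is unit-free.
X_scaled = MinMaxScaler().fit_transform(X)

# eps and min_samples follow the values reported in this paper.
labels = DBSCAN(eps=0.2, min_samples=50).fit_predict(X_scaled)

n_clusters = len(set(labels) - {-1})  # the label -1 marks noise
print(n_clusters)
```

Because ε is expressed in the units of the input features, the choice of scaling directly affects which points count as neighbors, so any reuse of ε = 0.2 should be checked against the scaling actually applied.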
According to Figure 8a, by applying the algorithm based on the arriving ship's trajectory, the turning section where the ship changed course was derived into three clusters and the berthing step was analyzed in the last step. At this time, the part listed between each cluster was designated as the leg. According to Figure 8b, in applying the departure data, the unberthing stage and turning section corresponding to purple and blue, respectively, were the same as the analysis result pertaining to arrival. However, the area corresponding to the leg of the arrival was analyzed as an orange cluster. Therefore, the arriving ship's trajectory was classified into seven stages and divided into three turning sections, three legs, and one berthing stage. Additionally, in the case of departure, it was classified into six stages, divided into one unberthing stage, two turning sections, and three legs. The classified stages were re-designated as four phases with the same properties, and the pattern was analyzed [38]. Figure 9 shows the framework for the pattern analysis of a ship's trajectory, with detailed items having been analyzed for the SOG, COG, and the ship's position.

Pattern of Arriving Ship's Trajectory
In order to analyze the pattern of the ship's arrival trajectory, the phases were classified as shown in Figure 10. The stage from the point when the ship is boarded by the pilot and before entering Gadeog Sudo was designated as the "port entry phase"; the stage through the Gadeog Sudo was designated as the "waterway phase"; the stage in which the ship passes through the breakwater for berthing was designated as the "breakwater phase"; and the stage of passing through Todo island and berthing to the pier was designated as the "berthing phase". The time series analysis for ship maneuvering in each of these phases is expressed in 11 stages, from 0 to 10, by multiplying the result scaled in Equation (1) by 10. The result of plotting the arriving ship's trajectory data on the British Admiralty Chart is shown in Figure 11a-c, which show the case of berthing by passing Todo to the left and to the right. Therefore, the pattern of the berthing phase was analyzed by dividing it into the left and right sides of Todo.
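The mapping from scaled time to the 11 stages can be sketched as follows. The source states only that the scaled value is multiplied by 10; rounding to the nearest integer (half up) is an assumption made here so that exactly the stages 0 to 10 result, and the function name `stage_index` is hypothetical.

```python
import math


def stage_index(normalized_time):
    """Map a min-max scaled time in [0, 1] to a stage from 0 to 10."""
    if not 0.0 <= normalized_time <= 1.0:
        raise ValueError("normalized_time must lie in [0, 1]")
    # Assumed rule: multiply by 10 and round half up to the nearest integer.
    return math.floor(normalized_time * 10 + 0.5)


print([stage_index(t) for t in (0.0, 0.12, 0.52, 0.96, 1.0)])
```

With this rule, the first and last stages each cover half the time span of the interior stages, which is one reason the exact binning rule matters when comparing box plots across stages.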

Port Entry Phase
The port entry phase consisted of the process prior to entering the Gadeog Sudo waterway after piloting outside the port. According to Figure 11a, Leg 1 is the step in which ships approach to enter the waterway from various trajectories. Turning Section 1 is the stage just before entering, where the course of the vessel is aligned with the direction of the waterway. The average SOG in this phase is 8.23 knots, and the results of the frequency counting are shown in Figure 12a. As a result of the analysis, 8~9 knots was the most common, with 171 counts, followed by 7~8 knots at 147 counts. The average COG was 328.84 degrees, and the results of the frequency counting are shown in Figure 12b. The course range of 330~335 degrees was the most frequent, with 187 counts.
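The frequency counting used throughout the pattern analysis amounts to binning the SOG or COG values into fixed-width ranges. A minimal sketch is shown below; the function name `frequency_counts` and the sample SOG values are hypothetical and are not taken from the dataset.

```python
import math
from collections import Counter


def frequency_counts(values, bin_width):
    """Count values per bin; each key is the lower edge of a bin of width bin_width."""
    counts = Counter(math.floor(v / bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


sog_samples = [7.2, 7.9, 8.1, 8.4, 8.8, 9.3]  # knots, illustrative only
counts = frequency_counts(sog_samples, bin_width=1.0)
mean_sog = sum(sog_samples) / len(sog_samples)
print(counts)
```

Here the 8~9 knots bin holds three of the six samples, so it would be reported as the most common range.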
The time series analysis of this phase was performed by adding the boxplot of the turning section (TS1) to that of Leg 1, divided into 11 stages, from 0 to 10, with the results shown in Figure 13. According to Figure 13a, ships moving at an average of 9.23 knots at the starting point gradually decrease their SOG and finally decrease to an average of 7.32 knots at the turning section. The change in the time series of the COG is shown in Figure 13b, and the COG, which initially varied widely, ultimately converged to an average of 337.25 degrees, similar to the direction of the waterway, in the turning section. The summarized descriptive statistics are shown in Table 5, below.
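The per-stage statistics behind such time series boxplots can be sketched with the Python standard library. The function name `summarize_by_stage`, the pairing of each observation with a stage number, and the sample values are assumptions for illustration; the quartile convention used for the figures in this paper is not stated, so the "inclusive" method is chosen arbitrarily.

```python
from collections import defaultdict
from statistics import mean, quantiles


def summarize_by_stage(samples):
    """samples: iterable of (stage, value); returns box-plot statistics per stage."""
    groups = defaultdict(list)
    for stage, value in samples:
        groups[stage].append(value)
    summary = {}
    for stage in sorted(groups):
        vals = groups[stage]  # each stage needs at least two values for quartiles
        q1, q2, q3 = quantiles(vals, n=4, method="inclusive")
        summary[stage] = {
            "mean": mean(vals), "min": min(vals), "q1": q1,
            "median": q2, "q3": q3, "max": max(vals),
        }
    return summary


# Illustrative SOG values (knots) for the first and last stage of a phase.
sog_by_stage = [(0, 9.0), (0, 9.4), (0, 9.2), (10, 7.1), (10, 7.5), (10, 7.3)]
summary = summarize_by_stage(sog_by_stage)
print(summary[0]["mean"], summary[10]["mean"])
```

Comparing the per-stage means in this way reproduces the kind of statement made above, namely that the average SOG falls between the starting point and the turning section.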

Waterway Phase
The waterway phase is the process in which ships proceed toward the berth through the waterway, shown as Leg 2 in Figure 11a. As the SOG changes gradually, the average is 9.47 knots, but it features various distributions as displayed in Figure 14a. Among these, 10~11 knots is the most frequent, with 144 counts. According to Figure 14b, the COG is the most common at 335 to 340 degrees, with 580 counts, similar to the direction of the waterway, with an overall average of 335.11 degrees. The time series analysis of the waterway phase was divided into 11 stages, from 0 to 10, and is represented as a boxplot, which is shown in Figure 15. According to Figure 15a, ships entering at the average of 7.78 knots at the starting point gradually increase the SOG to a maximum average of 10.88 knots, and then, at the final point, the average again decreases to 10.58 knots as the ship enters the next turning section. The COG is shown in Figure 15b and is kept constant between about 330 and 340 degrees overall, but at the last point, it is determined at an average of 331.49 degrees. The summarized descriptive statistics are shown in Table 6.

Breakwater Phase
The breakwater phase corresponds to Turning Section 2 and is the process in which the waterway is exited and the breakwater passed through, as shown in Figure 11a. At this time, the average SOG is 9.44 knots and the distribution of the frequency count is as shown in Figure 16a. In general, a high proportion of the values was distributed between 9 and 10.5 knots. The COG is shown in Figure 16b, and the ship that initially moved at about 336 to 338 degrees (32 counts) changed course to 358 to 000 degrees (57 counts). As is shown in Figure 17, the time series analysis result of this phase shows the change in course occurring between boxplots 5 and 6. The SOG is shown in Figure 17a, and the average at the starting point, coming from the waterway phase, was determined to be 10.25 knots. Beyond that, at the point of course change, the SOG decreased to an average of 8.86 knots, and at the last point of the phase, the average was analyzed to be 8.70 knots. In the case of COG, the change was evident. According to Figure 17b, the COG starts at an average of 338.93 degrees, gradually changes, reaches an average of 354.74 degrees at the point of the course change, and then gradually approaches 000 degrees. The summarized descriptive statistics are shown in Table 7.

Berthing Phase
The berthing phase refers to the process from passing the breakwater to passing Todo and berthing and is plotted in the map shown in Figure 11a as Turning Section 3, Leg 3, and Berthing. In passing Todo, the characteristics of the process of passing to the left or right side differ, and so the results are separately plotted, as shown in Figure 11b,c.
First, the trajectory pattern of ships berthing by passing Todo on the left side was analyzed. Figure 18 displays a frequency count plot of the SOG in Turning Section 3, Leg 3, and the berthing step through Todo on the left side. Figure 18a shows the frequency counts of Turning Section 3, with 7~7.5 knots being the most common with 28 counts and the average SOG analyzed being 7.50 knots. Figure 18b corresponds to Leg 3, and 64 counts are included at 4~5 knots, indicating the greatest distribution. The average was analyzed as 5.14 knots, as the ships decelerated toward berthing. Finally, Figure 18c shows the SOG at the berthing stage, with an average of 1.09 knots. In particular, as the berthing step consists of slow berthing using tug boats, the 0~0.5 knots range occupied the largest percentage, with 233 counts. Figure 19 is the result of analyzing the COG for each step. According to Figure 19a, 000~005 degrees was the most common, with 61 counts, and the average COG was 006.53 degrees. This is because ships move north to pass Todo on the left side. In the next leg, ships move east to approach the pier, which is why there are many counts of almost 000 degrees and 060 to 080 degrees, as shown in Figure 19b. In Figure 19c, counts from 060 to 080 degrees were the most common, totaling 251.
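When courses cluster around north, values just below 360 degrees and just above 000 degrees describe nearly the same direction, so an arithmetic average can be misleading. The source does not state how COG averages near 000 degrees were computed; the sketch below shows one common alternative, the circular mean, purely as an illustration. The function name `circular_mean_deg` and the sample courses are hypothetical.

```python
import math


def circular_mean_deg(angles_deg):
    """Mean direction of compass angles in degrees, respecting the 360/000 wraparound."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    if math.hypot(s, c) < 1e-12:
        raise ValueError("mean direction is undefined for these angles")
    return math.degrees(math.atan2(s, c)) % 360.0


courses = [358.0, 2.0, 4.0]  # illustrative northbound courses
arithmetic = sum(courses) / len(courses)
circular = circular_mean_deg(courses)
print(round(arithmetic, 2), round(circular, 2))
```

For these three northbound courses the arithmetic mean is about 121 degrees, which points southeast, whereas the circular mean is about 1.3 degrees, which matches the actual heading.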
Figure 20 shows the results of the time series analysis of the ships berthing by passing Todo on the left side. Based on the scaled time data, the turning section (TS 3) and berthing (Berth) stages were classified into three divisions, and the leg (Leg 3) was divided into four. In the case of the SOG, it was as shown in Figure 20a, and the average of TS 3_1, the starting point passing through the breakwater phase, was 8.02 knots, with the speed gradually decreasing. At Leg 3_1, the average speed decreased to 6.38 knots, and at Berth 1, the average speed rapidly decreased to 2.31 knots. After that, the berthing process was completed by decelerating to 0 knots. The COG time series boxplot was as shown in Figure 20b, and the ships that were moving north at an average of 002.49 degrees in TS 3_1 slowly turned eastward while passing Todo. Then, the course slowly moved north from Berth 1, starting at 072.64 degrees on average, at which point berthing was completed. The summarized descriptive statistics are shown in Table 8.
Next, the trajectory pattern of berthing ships passing Todo on the right was analyzed. Figure 21 shows the SOG frequency count boxplot of such ships. According to Figure 21a, the average SOG in Turning Section 3 was 7.99 knots, which is higher than that of ships passing on the left side in the same section, and the 7~8 knot range had the greatest number of counts. Figure 21b shows Leg 3, where 68 counts fell in the 5.5~6 knot range and the average was 5.79 knots, slightly different from the left-side case. Finally, Figure 21c exhibits an average value of 0.91 knots. A notable difference was that the frequency distribution of the berthing stage followed a lognormal distribution.

According to Figure 22a, 000 to 005 degrees was the most common range at 37 counts, but as ships gradually turned east to pass Todo on the right side, the COG changed toward 040 degrees. In the subsequent Leg 3, the ships passing Todo moved northeast to approach the pier. According to Figure 22b, the range of 040 to 050 degrees was the most frequent at 90 counts. In Figure 22c, counts of 040 to 060 degrees were the greatest and, as berthing must be completed by heading north, the proportion of counts near 000 degrees was also high.

Figure 23 shows the time series boxplots of ships berthing after passing Todo on the right side. The SOG is shown in Figure 23a; the trend is similar to that of the left-side case, with slight differences overall. The average SOG at the starting point, TS 3_1, was 8.42 knots, and the speed gradually decreased from there. At Leg 3_3, the average speed had decreased to 4.68 knots, and at Berth 1 it dropped sharply to 1.61 knots. Thereafter, the berthing process was completed by decelerating to 0 knots. The COG is shown in Figure 23b: ships that were heading north at an average of 004.99 degrees in TS 3_1 turned east to pass Todo. After that, the course changed back to the north from Leg 3_2, and berthing was completed. The detailed descriptive statistics are shown in Table 9.
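Average courses close to north, such as the 004.99 degrees reported for TS 3_1, sit next to the 000/360 wrap-around, where a plain arithmetic mean can be misleading. The text does not state how the COG averages were computed, so the following is only a minimal sketch of one common approach, a circular mean; the function name and the sample values are hypothetical and are not taken from the study.

```python
import math

def circular_mean_deg(courses_deg):
    """Mean direction of compass courses, returned in degrees within [0, 360)."""
    # Add the courses as unit vectors so that 359 and 001 average to about 000
    # instead of 180, which is what an arithmetic mean would give.
    east = sum(math.sin(math.radians(c)) for c in courses_deg)
    north = sum(math.cos(math.radians(c)) for c in courses_deg)
    return math.degrees(math.atan2(east, north)) % 360.0

# Hypothetical COG samples around north (not data from the study).
sample_cog = [358.0, 2.0, 6.0, 10.0]
mean_cog = circular_mean_deg(sample_cog)          # close to 4 degrees
naive_mean = sum(sample_cog) / len(sample_cog)    # 94 degrees, clearly off
```

For samples that stay well away from the wrap-around, such as the southerly departure courses, the two averages are practically the same.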

Pattern of the Departing Ship Trajectory
To analyze the pattern of the departing ship trajectories, the phases were classified as shown in Figure 24. As in the case of arrivals, the trajectory was divided into four phases: unberthing and passing Todo was referred to as the "unberthing phase"; the stage in which the ship passes through the breakwater was designated the "breakwater phase"; the stage through the Gadeog Sudo was designated the "waterway phase"; and, finally, exiting the port was designated the "departure phase". The result of plotting the departing ships' trajectory data on the British Admiralty Chart is shown in Figure 25a-c, which includes the cases of unberthing by passing Todo on the left or on the right. Therefore, as in the case of arrivals, the pattern of the unberthing phase was analyzed separately for ships passing on the left and right sides of Todo.

Unberthing Phase
The unberthing phase refers to the process of unberthing the vessel using tug boats, then passing Todo and the breakwater, which is plotted on the map in Figure 25a as Unberthing, Leg 1, and Turning Section 1. In passing Todo, as in the berthing phase, the characteristics of passing on the left and right sides differed, and so they were marked separately, with the plotting results shown in Figure 25b,c.
First, the case of passing Todo on the left side during the unberthing phase was analyzed. According to Figure 26a, during the unberthing stage the 0-0.5 knot range accounts for a high share of the SOG values, with over 70 counts. As Figure 26b indicates, the ship's speed slowly increased thereafter and, during the Leg 1 stage, the 1~2 knot range was noted for 106 counts, followed by 6~7 knots for 79 counts. For Turning Section 1, the SOG averaged 9.97 knots, and the frequency counts are shown in Figure 26c.

The COG frequency counts are shown in Figure 27. Whereas the arriving ships' trajectories were completed by heading north, departing ships headed south. Accordingly, in Figure 27a, which depicts the unberthing stage, the average was 192.21 degrees, and a high proportion of the counts fell between 190 and 200 degrees. After unberthing, the ships moved west to pass Todo on the left side and, accordingly, the COG range of 270~280 degrees was the most frequent at 173 counts, as shown in Figure 27b. Furthermore, as shown in Figure 27c, the average in Turning Section 1, near the breakwater, was about 188.82 degrees. The range with the most counts was 190-192 degrees, with 52 counts. The detailed descriptive statistics are shown in Table 10.

The time series boxplot of the unberthing phase passing Todo on the left side is shown in Figure 28. As shown in Figure 28a, the SOG gradually increased from 0 knots, in contrast to the arrival case. It increased to an average of 8.92 knots at Turning Section 1_1 (TS1_1) in the figure and to an average of 10.67 knots at the final point. As for the COG, shown in Figure 28b, after slowly unberthing to the south, the ship moved westward to pass Todo on the left side. After that, it turned to the south again from Leg 1_3. In Turning Section 1, it held a course of about 188.90 degrees.
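The stage labels used above (for example TS1_1 or Leg 1_3) come from binning each track on a scaled time axis and summarizing every bin as a boxplot. The sketch below illustrates that idea under two stated assumptions: that Equation (1) is a min-max scaling of time to the range 0 to 1, and that a sample is assigned to the nearest of the 11 stages. Neither detail is confirmed in this section, and the function names and sample track are hypothetical.

```python
from statistics import mean, median

def scale_time(timestamps):
    """Min-max scale timestamps to [0, 1] (assumed form of Equation (1))."""
    t0, t1 = min(timestamps), max(timestamps)
    span = t1 - t0
    return [(t - t0) / span if span else 0.0 for t in timestamps]

def stage_statistics(timestamps, values, n_stages=11):
    """Group values into stages 0..n_stages-1 and summarize each stage."""
    stages = {}
    for s, v in zip(scale_time(timestamps), values):
        stage = round(s * (n_stages - 1))  # assumption: nearest stage
        stages.setdefault(stage, []).append(v)
    return {k: {"n": len(v), "mean": mean(v), "median": median(v),
                "min": min(v), "max": max(v)}
            for k, v in sorted(stages.items())}

# Hypothetical track: one fix per minute, SOG rising from 0 to 10 knots.
times = list(range(0, 31))        # minutes since the start of unberthing
sog = [t / 3 for t in times]      # knots
stats = stage_statistics(times, sog)
```

Each entry of `stats` holds the values a boxplot would display for one stage; quartiles could be added with `statistics.quantiles` if needed.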
Next, the case of passing Todo on the right side was analyzed, as shown in Figure 29. Likewise, in Figure 29a, the 0~1 knot range accounted for the largest share during the unberthing stage, at 80 or more. After that point, according to Figure 29b, the SOG was more diversely distributed than in the left-side case. In particular, the 9-10 knot range was the second most frequent after the 1-2 knot range, and it seems that the SOG increased rapidly during Leg 1. In Turning Section 1, the average was 10.02 knots, and the ships continued to pass through at the increased SOG. This corresponds to Figure 29c, in which the most frequently measured speed was 10~11 knots. As for the COG, according to Figure 30a, as the ships unberthed to the south, the range between 150 and 200 degrees was the most frequent with 157 counts. In Leg 1, as the ships passed the right side of Todo to the southwest, the range of 230-240 degrees exhibited the highest frequency at 118 counts, as shown in Figure 30b. Moreover, in Turning Section 1 (Figure 30c), where the ships turned south again, the range of 195~200 degrees constituted the highest percentage.

The time series analysis of the ships passing on the right side of Todo is shown in Figure 31. For the SOG, shown in Figure 31a, the difference from the left-side case is that it increases rapidly in Leg 1. Furthermore, the SOG decreases slightly while changing course to the south during Turning Section 1, and then increases to 10 knots or more as the ship continues. For the COG, shown in Figure 31b, the amount of change is small compared to the left-side case because the ships pass on the right side of Todo. Correspondingly, a southerly course is gradually set from Leg 1_3. The detailed descriptive statistics are shown in Table 11.

The breakwater phase occurs when the outgoing ship passes the breakwater and changes course to enter the waterway. Figure 25a shows the ship's position during this phase. The average SOG was 10.45 knots and, as shown in Figure 32a, the range between 10.5 and 11 knots was the most frequent, at 32 counts. In addition, the COG had a value of between 180 and 190 degrees while passing through the breakwater, and the course changed to 170~180 degrees prior to entering the waterway; the resulting frequency distribution is shown in Figure 32b. The time series boxplot of this phase was analyzed in six steps. As shown in Figure 33a, the ship's SOG was maintained above 10 knots and then decreased to an average of 9.77 knots at the final point. The COG change is shown in Figure 33b. A vessel that sails at an average of 183.06 degrees at the starting point changes course and exits this phase at an average of 170.47 degrees. The summarized descriptive statistics are shown in Table 12.

The waterway phase is the process, shown in Figure 25a, of passing through the Gadeog Sudo. This phase is characterized by the pilot disembarking in the middle section of the waterway. Therefore, as shown in Figure 34a, the 8-9 knot and 11-12 knot ranges comprise the highest proportions. The COG is similar to the direction of the waterway and, as shown in Figure 34b, the range of 162 to 164 degrees was the most frequent at 297 counts, with an average of 169.92 degrees. The time series analysis of the waterway phase for departing ships is shown in Figure 35. As shown in Figure 35a, the SOG is gradually reduced to about 8 knots in order to disembark the pilot, and then gradually increased to 12 knots. As shown in Figure 35b, the COG, which averaged 166.41 degrees when entering the waterway, is maintained at about 163 degrees while passing through it. The detailed descriptive statistics are shown in Table 13.

The departure phase is the process by which the ship passes through the waterway and fully exits the port, as shown in Figure 25a. In this phase, the 11-12 knot range was the most frequent at 203 counts, as shown in Figure 36a, and the average SOG was 12.31 knots, which is considered fast in this setting. The distribution of the COG is shown in Figure 36b; the various courses are set according to the position of each ship's next port. The time series boxplot of the departure phase shows little change compared with the other phases. Figure 37 shows the time series of the SOG and COG, with the SOG usually being about 12 knots and increasing slightly. The ship's COG is maintained in the same direction as the waterway and then changes in accordance with the destination of each ship. The summarized descriptive statistics are shown in Table 14.
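The frequency counts quoted for each phase (for example, the most frequent SOG range and its count) amount to a histogram with fixed-width bins. A minimal sketch of such a count is given below; the bin width, the half-open bin convention, the function name, and the sample speeds are illustrative assumptions rather than details taken from the study.

```python
import math
from collections import Counter

def frequency_counts(values, bin_width):
    """Count values per half-open bin [k*w, (k+1)*w), keyed by the lower edge."""
    counts = Counter(math.floor(v / bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

# Hypothetical SOG samples in knots (not data from the study).
sog = [11.2, 11.8, 12.4, 11.5, 12.9, 13.1, 11.9]
counts = frequency_counts(sog, bin_width=1.0)
modal_bin = max(counts, key=counts.get)   # lower edge of the most frequent bin
```

With these samples, the 11-12 knot bin is the most frequent; the same function could be applied to COG values with a bin width in degrees.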

Discussion
This study collected AIS data from container ships arriving at and departing from Busan New Port in South Korea and analyzed the pattern of their trajectories using the DBSCAN algorithm. The methodology is novel in that it analyzes the specifics of ship maneuvering in a port, which is lacking in the literature, in contrast to navigational route pattern analyses between pilot stations, which have been studied extensively. Additionally, it is meaningful that the ship trajectories were quantitatively analyzed using a machine learning algorithm based on the AIS data of ships that were in fact safely berthed.
In particular, if this study is combined with existing berthing- and unberthing-related studies, it could form part of a basic evaluation of MASSs' in-port maneuvering and auto-berthing. For instance, these results can be combined with existing studies relating to range predictions of berthing velocity and the analysis of the patterns of berthing maneuvering by pilots [15,16,20]. All of these studies were conducted on the basis of measured data, using machine learning and artificial intelligence technologies. Therefore, this will constitute a core study relating to berth-to-berth ship operation by MASSs. In addition, a major advantage of implementing a MASS is that it can prevent human error. As with the crane collision accident of a container ship at Busan New Port in April 2020, which was the inspiration for this study, accidents in ports cause damage not only to ship hulls but also to human life and the environment, as well as significant economic damage, and so accident prevention is vital. This is achievable, as most incidents of this kind are a result of human error [6].
The short-term practical implications of the analytical results of this paper are as follows. The detailed pattern analysis, divided into arrival and departure phases, can inform ship maneuvering guidelines for major ports in Korea. For Busan New Port, the target port of this study, the time series-based guideline is shown as scatter graphs in Figures 38 and 39 for the arrival and departure segments, respectively.
The x-axis is the scaled time data for each ship multiplied by 100. As shown in Figure 38, the arrival data start from latitude 34.39 N, which is the starting point of the ship berthing process. The departure data, shown in Figure 39, run from unberthing until the passage of latitude 34.39 N. To express the SOG and COG according to the change in position, this study used the latitude, because ships tend to move north upon arrival and south upon departure. The R-squared values for the fitting lines of the scatter graphs are shown in Table 15. For the COG, the value was low, at less than 0.5, which is thought to be due to the ships moving along various courses with the tug boats during the berthing and unberthing processes.

The main limitation of this study is that it did not compare its data with those from the actual crane collision accident. A comparison with the accident case would make it possible to analyze how the maneuvering pattern differs between the accident ship and safely berthed ships. Furthermore, unlike when the data were collected for analysis, Todo has since been removed due to the crane collision accident. Therefore, as a future study, it is necessary to analyze the stability by comparing the ship trajectory patterns with and without Todo in Busan New Port. Moreover, a study to comparatively analyze the ship that was involved in the incident is also necessary.
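As an illustration of how an R-squared value such as those in Table 15 can be obtained, the sketch below fits a straight line by ordinary least squares and reports its coefficient of determination. The form of the fitting line used for Figures 38 and 39 is not specified in this section, so the linear model is an assumption, and the function name and data points are hypothetical.

```python
def linear_fit_r2(x, y):
    """Least-squares line y = slope * x + intercept and its R-squared."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Hypothetical points: scaled time (0 to 100) against SOG in knots.
scaled_time = [0, 25, 50, 75, 100]
sog = [8.0, 6.1, 3.9, 2.2, 0.1]
slope, intercept, r2 = linear_fit_r2(scaled_time, sog)
```

A steadily decelerating SOG gives an R-squared close to 1, whereas widely scattered values, as described for the COG during tug-assisted maneuvering, give a low one.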

Conclusions
In this study, in an effort to propose new ship maneuvering guidelines for the port, ship trajectories were divided into phases using the DBSCAN algorithm, and their patterns were analyzed. The results of this analysis can be used as basic research for the future development of MASS ship maneuvering guidelines based on AI technology. The findings of the study are summarized as follows:

• First, in order to propose a ship port maneuvering guideline, the target port was set as Busan New Port and AIS data on the arrival and departure of ships were collected. The collection period was from January to April 2020, and the targets were container ships with a gross tonnage of 100,000 tons or more. The collected data covered 42 ships, for both arrival and departure, with a total of 50 completed berthings.

• Secondly, the collected data were pre-processed to render them suitable for analysis. In addition to organizing the data within the scope of the analysis by means of data cleaning, the unit of "date and time" was scaled to suit the time series analysis via data scaling. Then, the total dataset was separated into ranges corresponding to arrivals and departures.

• Thirdly, the DBSCAN algorithm was applied to the datasets separated into arrivals and departures. The most important parameters of the DBSCAN algorithm, epsilon and minimum samples, were set to 0.2 and 50, respectively. As a result, arrivals were classified into seven clusters, including leg, turning section, and berth clusters, and departures were divided into six, including unberthing, leg, and turning section clusters. These clusters were in turn grouped into four phases, and the pattern was analyzed.

• Finally, the pattern of the ship trajectories was divided into four phases for both arrivals and departures and analyzed in detail. The analysis items included speed over ground, course over ground, and ship position. To analyze the patterns of the ships' trajectories, a frequency count analysis for each item of each phase was performed and a boxplot according to the time series was proposed. In particular, during the berthing and unberthing phases, the cases of passing Todo in Busan New Port on either the left or the right side were analyzed separately.

• By synthesizing the results, we derived the degree of change in speed over ground, course over ground, and ship position according to the stated time series. This can be utilized as a ship maneuvering guideline for the port. In addition, if trajectory data on ships safely berthed at each port are collected and analyzed, it will be possible to contribute to the development of MASS ship maneuvering technology in ports.
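To make the clustering step in the summary more concrete, the following is a minimal, self-contained sketch of the DBSCAN procedure with its two parameters, epsilon and minimum samples. It is not the implementation used in the study, which most likely relied on a library. The toy points are hypothetical, and the minimum samples value is lowered to 3 only because the toy dataset is tiny; the study's value of 50 presupposes the full AIS dataset and its own feature scaling, neither of which is reproduced here.

```python
import math

def dbscan(points, eps, min_samples):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    def neighbors(i):
        # All points within eps of point i, including point i itself.
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)        # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:     # i is not a core point
            labels[i] = -1               # provisional noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:          # former noise becomes a border point
                labels[j] = cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_samples:   # j is a core point: expand
                queue.extend(reachable)
    return labels

# Hypothetical scaled positions: a straight "leg", a compact "turning
# section", and one stray AIS fix.
leg = [(0.0, 0.0), (0.15, 0.0), (0.30, 0.0), (0.45, 0.0)]
turn = [(2.0, 2.0), (2.1, 2.1), (2.0, 2.1), (2.1, 2.0)]
stray = [(5.0, 5.0)]
labels = dbscan(leg + turn + stray, eps=0.2, min_samples=3)
```

In this toy case the two dense groups receive labels 0 and 1, and the stray fix is labeled -1 (noise), which mirrors how sparse or erroneous AIS fixes can be separated from leg and turning-section clusters.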

Figure 1. Flowchart of the study.

Figure 2. Geographical location of Busan New Port in Korea.
Figure 3 shows the topographic characteristics of Busan New Port at the time of the accident involving the gantry crane. Figure 3a displays a British Admiralty Chart of Busan New Port. According to the chart, when a ship arrives at Busan New Port, it first enters a waterway called Gadeog Sudo. After passing through the waterway, the ship passes a small island called Todo to berth at the pier, as shown in Figure 3b. Figure 3c shows the location of the pier to be analyzed. The pier where a container ship collided with a gantry crane was New Port 2 Pier No. 5. Therefore, ships berthed at New Port 2 Pier No. 4 and New Port 3 Pier No. 1, located on either side, as well as at New Port 2 Pier No. 5 itself, were designated as targets for analysis. The characteristics of the New Port 2 and 3 Piers are shown in Table 1 [24].

Figure 3. Topographical characteristics of Busan New Port: (a) British Admiralty Chart; (b) ship passing around Todo island; (c) location of the target pier.

Figure 4. Visualization of ship trajectory using AIS data: (a) raw data; (b) data to use for analysis.

Figure 6. Example of the density-based spatial clustering of applications with noise (DBSCAN) algorithm.

Figure 9. Framework for analyzing the pattern of a ship's trajectory.

Figure 10. Phase designation of the arrival cluster.

Figure 11. Plotting the arriving ship's trajectory data on the British Admiralty Chart: (a) all data; (b) passing the left side of Todo; and (c) passing the right side of Todo.

Figure 12. Frequency analysis of the port entry phase: (a) speed over ground (SOG); (b) course over ground (COG).

Figure 13. Boxplot time series analysis of the port entry phase: (a) SOG; (b) COG; * the time series analysis for ship trajectories in Leg 1 is expressed in 11 stages, from 0 to 10, by multiplying the result scaled in Equation (1) by 10. This phase is expressed as a 12-stage time series graph by adding Turning Section 1 to the 11 stages of Leg 1. The boxplot was used to visually express the descriptive statistics for 12 stages.

Figure 15. Boxplot time series analysis of the arrival waterway phase: (a) SOG; (b) COG; * the time series analysis for ship trajectories in this phase is expressed in 11 stages, from 0 to 10, by multiplying the result scaled in Equation (1) by 10. In addition, the boxplot was used to visually express the descriptive statistics for 11 stages.

Figure 17. Boxplot time series analysis of the arrival breakwater phase: (a) SOG; (b) COG; * the time series analysis for ship trajectories in this phase is expressed in 11 stages, from 0 to 10, by multiplying the result scaled in Equation (1) by 10. In addition, the boxplot was used to visually express the descriptive statistics for 11 stages.

Figure 18. Frequency analysis of the SOG on the left-side berthing phase: (a) Turning Section 3; (b) Leg 3; and (c) Berthing.

Figure 19. Frequency analysis for the COG of the left-side berthing phase: (a) Turning Section 3; (b) Leg 3; and (c) Berthing.

Figure 20. Boxplot time series of the berthing phase passing Todo on the left: (a) SOG; (b) COG; * the time series analysis for ship trajectories in the case of passing Todo on the left is expressed in 11 stages by multiplying the result scaled in Equation (1) by 10. It was divided into 3 stages in Turning Section 3 (TS 3), 4 stages in Leg 3, and 3 stages in Berth. In addition, the boxplot was used to visually express the descriptive statistics for 11 stages.

Figure 21. Frequency analysis of the SOG of the right-side berthing phase: (a) Turning Section 3; (b) Leg 3; and (c) Berthing.

Figure 22. Frequency analysis for the COG of the right-side berthing phase: (a) Turning Section 3; (b) Leg 3; and (c) Berthing.

Figure 23. Boxplot time series of the berthing phase passing Todo on the right side: (a) SOG; (b) COG; * the time series analysis for ship trajectories in the case of passing Todo on the right is expressed in 11 stages by multiplying the result scaled in Equation (1) by 10. It was divided into 3 stages in Turning Section 3 (TS 3), 4 stages in Leg 3, and 3 stages in Berth. In addition, the boxplot was used to visually express the descriptive statistics for 11 stages.

Figure 24. Phase designation of the departure cluster.

Figure 25. Plotting departing ships' trajectory data on the British Admiralty Chart: (a) all data; (b) passing on left side of Todo; and (c) passing on right side of Todo.

Figure 28. Boxplot time series of the unberthing phase passing Todo on the left: (a) SOG; (b) COG; * the time series analysis for ship trajectories in the case of passing Todo on the left is expressed in 11 stages by multiplying the result scaled in Equation (1) by 10. It was divided into 3 stages in Unberth, 4 stages in Leg 1, and 3 stages in Turning Section 1 (TS 1). In addition, the boxplot was used to visually express the descriptive statistics for 11 stages.

Figure 31. Boxplot time series of the unberthing phase, passing Todo on the right: (a) SOG; (b) COG; * the time series analysis for ship trajectories in the case of passing Todo on the right is expressed in 11 stages by multiplying the result scaled in Equation (1) by 10. It was divided into 3 stages in Unberth, 4 stages in Leg 1, and 3 stages in Turning Section 1 (TS 1). In addition, the boxplot was used to visually express the descriptive statistics for 11 stages.

Figure 33. Boxplot time series analysis of the departure breakwater phase: (a) SOG; (b) COG; * the time series analysis for ship trajectories in this phase is expressed in 6 stages, from 0 to 5, by multiplying the result scaled in Equation (1) by 10. In addition, the boxplot was used to visually express the descriptive statistics for 6 stages.

Figure 35. Boxplot time series analysis of the departure waterway phase: (a) SOG; (b) COG; * the time series analysis for ship trajectories in this phase is expressed in 11 stages, from 0 to 10, by multiplying the result scaled in Equation (1) by 10. In addition, the boxplot was used to visually express the descriptive statistics for 11 stages.

Figure 37. Boxplot time series analysis of the departure phase: (a) SOG; (b) COG; * the time series analysis for ship trajectories in this phase is expressed in 11 stages, from 0 to 10, by multiplying the result scaled in Equation (1) by 10. In addition, the boxplot was used to visually express the descriptive statistics for 11 stages.

Figure 38. Scatter graph of arriving ship trajectories for maneuvering guidelines.

Figure 39. Scatter graph of departing ship trajectories for maneuvering guidelines.

Table 1. Characteristics of the target piers.

Table 2. Automatic identification system (AIS) data characteristics for analysis.

Table 3. Number of ships and berthings.

Table 4. Number of times and on which side ships passed Todo.

Table 5. Descriptive statistics for the port entry phase of the scaled time series.

Table 6. Descriptive statistics for the arrival waterway phase of the scaled time series.

Table 7. Descriptive statistics for the arrival breakwater phase of the scaled time series.

Table 8. Descriptive statistics for the left-side berthing phase of the scaled time series.

Table 9. Descriptive statistics for the right-side berthing phase of the scaled time series.

Table 10. Descriptive statistics for the left-side unberthing phase of the scaled time series.

Table 11. Descriptive statistics for the right-side unberthing phase of the scaled time series.

Table 12. Descriptive statistics for the departure breakwater phase of the scaled time series.

Table 13. Descriptive statistics for the departure waterway phase of the scaled time series.

Table 14. Descriptive statistics for the departure phase of the scaled time series.

Table 15. R-squared analysis results of the fitting line.