A Machine-Learning Model for Zonal Ship Flow Prediction Using AIS Data: A Case Study in the South Atlantic States Region

Predicting traffic flow is critical in efficient maritime transportation management, coordination, and planning. Scientists have proposed many prediction methods, most of which are designed for specific locations or for short-term prediction. For the purpose of management, methods that enable long-term prediction for large areas are highly desirable. Therefore, we propose developing a spatiotemporal approach that can describe and predict traffic flows within a region. We designed the model based on a multiple hexagon-based convolutional neural network (mh-CNN) model that takes both the flow dynamics and environmental conditions into account. This model is highly flexible in that it predicts zonal traffic flow within variable time windows. We applied the method to measure and predict the daily and hourly traffic flows in the South Atlantic States region by taking the impacts of extreme weather events into consideration. Results show that our method outperformed other methods in daily prediction during normal days and hourly prediction during hurricane events. Based on the results, we also provide some recommendations regarding the future usage and customization of the model.


Introduction
Predicting traffic volume is critical in supporting efficient traffic management, design, and planning. Earlier attempts at traffic flow prediction were largely based on aggregated data [1,2]. With the increasing availability and accessibility of sensor data (e.g., tracking data from automatic identification systems (AIS)), scientists and practitioners can make predictions on vessel traffic at fine spatial and temporal levels. Although researchers have formulated numerous prediction methods, these methods tend to focus on predicting traffic volume for specific waterways or ports [3,4], or on predicting flow within a short period (e.g., minutes). The effectiveness of these methods in providing a relatively longer-term prediction (e.g., a lead time of several days) for large areas is unknown. Such long-term predictions are of great value to everyday flow management. Furthermore, these models do not account for the spatiotemporal characteristics of ship movements, weather, or environmental conditions. Therefore, they often fail to deliver accurate results in variable conditions such as extreme weather. This paper presents an approach that can predict the traffic volume for regions of high traffic volume such as ports and waterways [2,5]. These regions are termed regions of interest (ROIs) throughout the paper. We focused on busy regions because these regions are where frequent navigation activities occur and are thus associated with a high risk of accidents. In addition, those regions exhibit relatively higher traffic volumes and may have higher irregularities during extreme weather events [6]. We propose a new deep neural network model to predict both short-term and long-term traffic in those ROIs. Our method is built on a deep learning approach that leverages the convolutional neural network (CNN) model, which can perform forecasting at different temporal intervals ranging from hourly output to daily prediction.
This model also considers the spatiotemporal factors such as the influence of neighborhood flows as well as the vessel's movement behaviors like direction and different temporal granularities. We applied the method to investigate the model performance on predicting daily and hourly traffic flows in the South Atlantic States region, which is frequently affected by extreme weather events.
This research makes the following contributions: (1) It proposes a novel method that can accurately capture the changes of traffic volume based on spatiotemporal variations to make short-term and long-term predictions of future ship flow during both normal conditions and extreme weather events. (2) We customized the method to predict daily and hourly ship flow during normal and extreme weather conditions and provide recommendations regarding the use of such method.
The rest of this paper is organized as follows. Section 2 reviews the existing work in the field of machine learning algorithms to make predictions with a focus on ship flow forecasting. Section 3 describes the development of the multiple hexagon-based convolutional neural network model for making multivariate and multiple time-step ship flow prediction. In Section 4, we explain the experiments conducted to evaluate the approach. Finally, we summarize the results and draw conclusions in Section 5.

Related Work
Traffic prediction is an important topic in transportation research. Accurate traffic prediction is critical in enhancing situational awareness, supporting just-in-time (JIT) operations, and improving monitoring and navigation practices. Traffic prediction can target the trajectory of a specific ship, the traffic volume of a port, or the crossing behaviors of ships near bridges. Prediction can be based on annual aggregated flow data, observations from remote sensing products, or the tracking information from the AIS system [7-9].
Currently, many methods are available for prediction. The two major categories are parametric and non-parametric methods. The literature shows that the two categories perform differently under different conditions. A large number of parametric methods are time series models. One notable example is the autoregressive integrated moving average (ARIMA) model, which examines the periodical patterns of traffic flows and builds regression models to predict traffic flows [10,11]. Since parametric models rely heavily on an a priori distribution, they perform well under normal operations but fail to deliver a high level of performance in situations where the traffic flow is highly dynamic. On the other hand, non-parametric models such as the k-nearest neighbors (k-NN), which do not assume the regularity of traffic flow, can model complex non-linear patterns in data [12]. One of the popular sub-categories of non-parametric models is the deep learning-based methods. Deep learning methods predict traffic volume with knowledge derived from massive historical data. The deep learning model can take advantage of large volumes of data to explore patterns [13]. Deep learning architectures like deep belief networks [14] and convolutional neural networks have been successfully applied in many fields (e.g., speech recognition, computer vision, and natural language processing) [13]. For example, the CNN model has been used to predict the supply and demand of ride-sourcing services [15]. Considering the traffic flow patterns and the availability of tracking data, we chose machine-learning methods, specifically the CNN model, as the main prediction method.
While scientists have proposed a few machine-learning methods to support marine traffic flow prediction [16], these methods are usually applicable to port regions or straits with rigid routing and reporting mechanisms and predefined routes. Prediction for coastal waters is still rare, as methods usually do not take the following three factors into account. First, traditional methods stemming from network flow prediction treat the transportation network as a static system and usually do not incorporate the motion and movement patterns of ships, even though the number of approaching vessels is often an important factor in determining the traffic volume in a region [16]. Second, the majority of methods are only applicable to controlled regions or predict traffic flow for a specific time window. In particular, a configuration that enables prediction across horizons from short term to long term is missing. Third, environmental factors are usually not included in these methods. Research shows that periodic patterns in ferries may change during hurricane seasons [17]. Omitting environmental factors can lead to a low degree of prediction accuracy under extreme weather conditions. Therefore, the purpose of the proposed method is to make short-term and long-term predictions for hurricane-prone regions like the South Atlantic States to help people better prepare for extreme weather events. This method is applicable in a variety of locations (i.e., it is not limited to a specific harbor or waterway). By taking multivariate input data and using multiple time steps as inputs, the models can provide ship flow forecasts at different spatiotemporal scales in an accurate manner.

Method
In this study, we analyzed the South Atlantic States region due to its unique geographical settings. With more than one thousand miles of coastline, the South Atlantic States region mainly consists of Florida, Georgia, North Carolina, and South Carolina. In the past decades, as the sea surface temperature has risen, the power of Atlantic tropical cyclones has risen dramatically [18,19], making this region particularly vulnerable to extreme weather events. During the extremely active 2017 Atlantic Basin hurricane season, 17 named tropical cyclones (of which six became major hurricanes) and two weaker systems developed from April to early November [20]. It was one of the most destructive hurricane seasons in history, costing more than $250 billion in damage in the U.S. alone [21]. To demonstrate the performance of our model, we used the 2017 AIS dataset for Universal Transverse Mercator (UTM) Zone 17. This dataset is provided by Marine Cadastre [22], and its records are mainly located in the South Atlantic States region. The selected regions feature a high density of ship flows (more detail on the selection procedure is provided in Section 3.1.1). As the original datasets do not explicitly define the boundaries of zones, we used hexagons to partition and group ship flow data for the following reasons. First, no widely accepted boundary partition mechanism is available for this region. Second, hexagons can better support algorithmic efficiency, data representation, and semantic expressiveness [23,24]. To examine the impacts of hurricanes, we reconstructed the paths and impacted regions of hurricanes using the data from the National Oceanic and Atmospheric Administration Hurricane Center [25].
To generate spatially continuous hexagons, we used Uber's H3 hexagonal hierarchical spatial indexing (H3) grid system [26]. The H3 grid system developed by Uber is originally used for ride optimization, spatial data visualization, and data exploration. The H3 system can be used to group geolocation data points into hexagonal areas or cells. Furthermore, this H3 system supports 16 different resolutions to group data at different spatial scales. There is also a hierarchical relationship for cells at different resolutions. The smaller hexagonal cell (child cell) with a finer resolution is approximately one seventh of the area of its hexagonal cell (parent cell) with coarser resolution. Furthermore, since each of the H3 cells has a unique identifier, it is easy for a child cell to locate its parent cell at coarser resolution and identify the unambiguous neighboring hexagons with the same resolution based on a specific search radius. This efficient indexing system allows us to group and search hexagonally grouped data in an efficient manner to analyze the pattern of ship flow. We only used level 3 and 4 hexagons because these are reasonable sizes for traffic flow management in the coastal waters. Hexagons with coarser resolution cover much greater regions (e.g., states or nations) where prediction results can be less meaningful, whereas hexagons with finer resolution only cover smaller regions where there are only a few ships. For instance, Figure 1 shows the 11 hexagons considered as ROIs at level 3 for the case studies, which included four regions for the hourly prediction test and one region for the hurricane impact test. The details on selecting these regions are described in Section 3.1.1. We prepared two sets of ship flow data using different hexagon sizes (Table 1) based on the processing procedures as explained in the following sections.

Traffic Flow Analysis
Before formulating a prediction method, we examined the typical spatiotemporal factors in determining the traffic flow in ROI regions. The classic ship traffic simulation contains both micro models (focusing on the performance of the individual ship's navigation) and macro models (treating vessel traffic flow as a whole) based on the 'Network Simulation' of nodes and lines [27]. Our method was adapted from the macro method by considering the relative motions and distributions of ships. According to Xie and Liu (2018), traffic flow usually exhibits temporal regularity except for special conditions such as flooding events, hurricane seasons, and regulations. Furthermore, as ships move across different regions, the spatial association among regions can also alter the level of traffic flow. Therefore, we performed the pattern analysis outlined below to verify the impacts of typical spatiotemporal factors in contributing to the flow level. The pattern analysis serves as the basis for understanding the changes in traffic flow and for the model formulation. In the context of this work, for a given hexagon cell, we defined two types of ship traffic volume as follows: (1) the ship flow in the same direction, i.e., the total number of ships that move from each of the adjacent surrounding hexagons toward the center hexagon within the given interval of time (e.g., one hour or one day); and (2) the total ship flow, i.e., the total number of ships with a unique identification located within each of the hexagons within the given interval of time.
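The total ship flow defined in (2) can be illustrated with a minimal sketch that counts distinct ship identifiers per cell and time bucket. The record layout `(ship_id, cell_id, unix_timestamp)` and the function name `total_ship_flow` are assumptions made for illustration only; they are not taken from the AIS dataset or from our processing pipeline.

```python
from collections import defaultdict

def total_ship_flow(records, interval_seconds=3600):
    """Count distinct ships per (cell, time bucket).

    records: iterable of (ship_id, cell_id, unix_timestamp) tuples.
    This tuple layout is hypothetical and only serves the example.
    """
    ships = defaultdict(set)
    for ship_id, cell_id, ts in records:
        bucket = int(ts // interval_seconds)  # index of the time interval
        ships[(cell_id, bucket)].add(ship_id)  # a set keeps each ship once
    return {key: len(ids) for key, ids in ships.items()}

records = [
    ("A", "cell_1", 10),
    ("A", "cell_1", 900),    # same ship, same hour: counted once
    ("B", "cell_1", 1800),
    ("A", "cell_1", 3700),   # falls into the next hourly bucket
    ("C", "cell_2", 100),
]
flow = total_ship_flow(records)
```

Setting `interval_seconds=86400` would give the daily counts instead of the hourly ones.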
Below, we explain the analysis methods and use our case study region as an example to assess the impacts of different factors. We recognize that the impacts of different factors may change over regions. When performing predictions, our data-driven machine learning model automatically adjusted the weight of factors based on the data in the South Atlantic States region.

ROI Identification
In order to extract high density regions, we used the ship tracking points to reconstruct the ship trajectories to calculate (1) the track point-based density and (2) the track line-based density by assigning points and line segments to grids (Figure 2). Then, we performed reclassification based on the density threshold to identify high density cells as ROIs in the South Atlantic States region. After we identified the ROIs, we used the honeycomb model for spatiotemporal partitions to generate dynamic models [23]. Finally, we selected level 3 and level 4 hexagons that overlapped with high density regions and identified them as the final ROIs for ship flow analysis (Figure 1 and Table 1).

Spatial Distribution of Traffic Flow
We developed two ways to identify ship flow in the surrounding regions. For the first method, we simply calculated the total number of ships in each of the surrounding hexagons (Figure 3a). For the second method, we calculated the total number of ships moving toward the central hexagon (Figure 3b). In order to select the ships moving toward the central zone, we (1) calculated the compass bearing between the center of the surrounding H3 zone and the center of the central H3 zone; (2) calculated the compass bearing between the start and end points of the ship's track; and (3) calculated the absolute difference between the two bearings to select ships with bearing differences smaller than 30 degrees. The second method is better suited to hourly prediction, since at a daily level ships may have moved across several hexagonal zones, covered longer distances, and shown more complicated movement patterns.
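The three-step bearing filter can be sketched as follows. This is a minimal illustration, not the implementation used in the study: the function names are hypothetical, the bearing is the standard initial great-circle bearing, and wrapping the difference into the range 0 to 180 degrees is an assumption added so that bearings such as 350 and 10 degrees are treated as 20 degrees apart.

```python
import math

def compass_bearing(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(x, y)) % 360.0

def bearing_difference(b1, b2):
    """Absolute angular difference, wrapped into [0, 180] (assumed handling)."""
    diff = abs(b1 - b2) % 360.0
    return min(diff, 360.0 - diff)

def moves_toward_center(neighbor_center, central_center,
                        track_start, track_end, max_diff=30.0):
    """True if the ship's heading is within max_diff degrees of the
    direction from the neighboring zone to the central zone.
    All points are (lat, lon) tuples in degrees."""
    zone_bearing = compass_bearing(*neighbor_center, *central_center)  # step 1
    ship_bearing = compass_bearing(*track_start, *track_end)           # step 2
    return bearing_difference(zone_bearing, ship_bearing) < max_diff   # step 3

# A neighbor zone east of the central zone, and a ship heading west.
toward = moves_toward_center((0.0, 1.0), (0.0, 0.0), (0.0, 0.9), (0.0, 0.5))
away = moves_toward_center((0.0, 1.0), (0.0, 0.0), (0.0, 0.5), (0.0, 0.9))
```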

To measure the spatial association, we used local indicators of spatial association (LISA) to explore the degree of spatial autocorrelation [28] at each unique location based on the local Moran's I. This can help us find clusters based on each zone's ship flow values. The local Moran's I is calculated as follows:
$$I_i = \frac{x_i - \bar{X}}{S_i^2} \sum_{j=1,\, j \neq i}^{n} w_{i,j}\,(x_j - \bar{X}), \qquad S_i^2 = \frac{\sum_{j=1,\, j \neq i}^{n} (x_j - \bar{X})^2}{n - 1}$$
where $n$ is the total number of features; $x_i$ is the attribute of feature $i$; $\bar{X}$ is the average of the corresponding attribute; and $w_{i,j}$ is the spatial weight between features $i$ and $j$.
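As a rough illustration of the statistic, the sketch below computes a local Moran's I for each feature in plain Python. It assumes the commonly used form in which the deviation of feature i from the mean is divided by a variance term computed over the other features and multiplied by the weighted sum of its neighbors' deviations; the function name, the binary adjacency weights, and the toy values are all hypothetical.

```python
def local_morans_i(values, weights):
    """Local Moran's I per feature.

    values:  list of attribute values x_i (e.g., ship counts per hexagon).
    weights: n x n nested list of spatial weights w[i][j]; the diagonal
             is ignored. Assumes the values are not all identical.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    result = []
    for i in range(n):
        # Variance term over all features except i.
        s2 = sum(dev[j] ** 2 for j in range(n) if j != i) / (n - 1)
        # Spatially weighted sum of the neighbors' deviations.
        lag = sum(weights[i][j] * dev[j] for j in range(n) if j != i)
        result.append(dev[i] / s2 * lag)
    return result

# Four zones in a row (0-1-2-3) with binary adjacency weights.
values = [1.0, 1.0, 5.0, 5.0]
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = local_morans_i(values, weights)
```

In this toy case the two end zones sit next to a zone with a similar value and receive positive scores, while the two middle zones have one similar and one dissimilar neighbor and score zero.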
For each of the hexagons identified based on ROIs, we performed the following analysis to explore the clustering pattern as the distance of the adjacent hexagons from the central hexagon increased. We first identified hexagons at level 3 that intersected with the ROIs and selected the neighboring hexagons with the same resolution based on H3's traverse function with a distance of five. Figure 4 shows an example of all the neighboring hexagons selected based on the central hexagon (zone l3_Z2). Then, we calculated the total amount of ship flow in each of the hexagons. Based on the ship flow information, we analyzed the hexagons' clustering pattern using LISA. The result of the local Moran's I analysis based on a sample of daily data was about 0.2, indicating that there is a clustering of cells with similar values. In addition, there was also a significant clustering pattern in the central region (Figure 4), demonstrating that the central hexagon cell is more likely to be affected by its adjacent cells. Therefore, we used H3's traverse function at a distance of one to select adjacent hexagons based on the central hexagon's index to construct the ship flow forecasting model.


Time Series Analysis
To perform time series analysis, we examined the temporal changes in traffic flow of a region and its neighbors. For example, Figure 5a,b shows the daily ship flow fluctuations in two zones near Port Miami and Key West. Using the hexagon decomposition system, we created seven hexagonal areas for each region: one central area and six adjacent areas (e.g., zone l3_Z6 is the central zone and Z1-6 are the neighboring hexagons). First, we extracted the daily total ship flow in each of the seven hexagons. Then, we plotted the daily ship flow of all seven hexagons to explore their patterns. In Figure 5, we selected Port Miami (i.e., zone l3_Z6) and Key West (i.e., zone l3_Z2) with their surrounding coastal waters as examples to demonstrate how factors in the surrounding zones can affect the traffic flow in a region. Both regions have high daily ship flow. For instance, Port Miami is one of the largest passenger and cargo ports in the United States [29]. We found that the total ship flow in the central hexagonal region was highly correlated with the traffic flow in the surrounding regions, in that they demonstrated similar fluctuation patterns. Nevertheless, the two plots showed different fluctuation patterns from each other. First, the peak seasons for Port Miami and Key West are different, and ship counts are generally much higher around Port Miami. Moreover, different regions show different latencies for ship flow changes. For instance, when the central region is experiencing an increasing ship count, some of the surrounding regions remain relatively stable because they may not contain major shipping routes and exhibit low traffic density [30].
This shows that (1) simple statistical/machine learning models may not perform well if applied directly to any high-density region without sufficient training and taking of other variations into consideration, and (2) surrounding regions can contribute to the central region differently, so it is important to configure sub-models to capture their variations.
In addition, using the region of Port Miami (zone l3_Z6) as an example, we conducted a simple linear regression to explore the relationship between the ship flow in the central zone and the ship flow in the surrounding zones. In this regression, n is the total number of observations; C is the total ship flow in the central region; and S is the total number of ships in the surrounding regions. The results showed that the r-squared value for the sample zone was (1) 0.763 using the total number of ships in the surrounding zones and (2) 0.571 using the total number of ships moving toward the central zone, which indicates that roughly 57% to 76% of the variance in the central-zone ship flow can be explained by the ship flow in the surrounding regions.
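For reference, the r-squared of a simple linear regression of C on S can be computed as in the sketch below. This assumes an ordinary least-squares fit with an intercept, for which r-squared equals the squared Pearson correlation; the function name and the sample numbers are illustrative and are not data from the study.

```python
def simple_linear_fit(surrounding, central):
    """Least-squares fit of central = slope * surrounding + intercept.

    Returns (slope, intercept, r_squared). Assumes both series have the
    same length n and that neither is constant.
    """
    n = len(central)
    mean_s = sum(surrounding) / n
    mean_c = sum(central) / n
    sxy = sum((s - mean_s) * (c - mean_c) for s, c in zip(surrounding, central))
    sxx = sum((s - mean_s) ** 2 for s in surrounding)
    syy = sum((c - mean_c) ** 2 for c in central)
    slope = sxy / sxx
    intercept = mean_c - slope * mean_s
    r_squared = (sxy * sxy) / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r_squared

# Toy daily counts: S = ships in surrounding zones, C = ships in central zone.
S = [1.0, 2.0, 3.0, 4.0]
C = [3.0, 5.0, 7.0, 9.0]
slope, intercept, r2 = simple_linear_fit(S, C)
```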

Impacts of Extreme Weather Conditions
Extreme weather events are unusual, severe, or unseasonal weather conditions [31]. Although extreme weather events like tropical storms tend to last for a short period of time, they can still significantly affect the ship flow in the short term and can have a great impact on global commodity supply chains such as marine transportation [6]. In our study, we mainly considered the impact of hurricanes because they are very frequent in the South Atlantic States region, which makes many areas particularly vulnerable, and because they can significantly change the traffic flow in the region. We used H3's hexagons to establish a search radius (distance = 5 in this case) to find the presence of hurricanes and recorded the information as binary data (Figure 6). For example, Figure 7 shows the change in the total number of ships when the region (i.e., zone l3_Z2) is affected by hurricanes, with a drastic decrease in ship flow. Nevertheless, the ship count returns to a normal level quickly after the extreme weather event and remains stable. In our experiment, we generated a sub-model for the presence of hurricanes using the same procedure as above.
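The binary hurricane indicator can be sketched as below. Because this sketch avoids any dependency on the H3 library, it uses axial (q, r) hexagon coordinates and the usual hexagonal grid distance as a stand-in for H3's grid search; the coordinate scheme, function names, and sample storm positions are assumptions for illustration only.

```python
def hex_distance(a, b):
    """Grid distance between two hexagons given in axial (q, r) coordinates.

    This is a stand-in for an H3 grid-distance query, not an H3 call.
    """
    dq, dr = a[0] - b[0], a[1] - b[1]
    return (abs(dq) + abs(dr) + abs(dq + dr)) // 2

def hurricane_flags(zone, storm_cells_by_step, radius=5):
    """Return 1 for each time step in which any storm cell lies within
    `radius` hexagons of `zone`, otherwise 0."""
    return [
        int(any(hex_distance(zone, cell) <= radius for cell in cells))
        for cells in storm_cells_by_step
    ]

zone = (0, 0)
storm_track = [
    [(10, 0)],   # storm far away
    [(3, 2)],    # storm exactly five cells away
    [],          # no storm reported at this step
    [(6, -1)],   # storm six cells away, just outside the radius
]
flags = hurricane_flags(zone, storm_track)
```

The resulting 0/1 sequence has one value per time step and can be treated like any other input series.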


Our Proposed Model
In order to predict total ship flow within H3 cells by capturing the variations of zonal ship flow, we proposed an H3-based multivariate CNN model to extract and predict ship flow patterns using multiple previous time steps. Based on the analysis in previous sections, we constructed a deep neural network to make ship flow predictions by (1) using ship traffic around the high-density regions selected based on ROIs, and (2) taking the impact of hurricanes into consideration as an additional factor to predict ship flow during extreme weather events. In this framework, we integrated the CNN deep learning algorithm and the H3 grid search to support ship flow prediction. We (1) incorporated Uber's hexagonal hierarchical spatial index method (H3) to partition and organize trajectory points into identifiable grid cells; (2) utilized the deep learning architecture of CNN, which is a type of neural network (NN), to predict ship flows in H3 zones at different scales; and (3) used multiple time steps and multivariate inputs to train the model. The CNN model has been widely used for forecasting analysis. For instance, Kim and Lee (2018) developed the STENet model based on CNN to predict ship traffic in crowded harbor water areas [32]; Wu and Tan (2016) combined CNN with long short-term memory (LSTM) for traffic prediction [33]; and Ma et al. (2017) used CNN for large-scale transportation network speed prediction [34]. Furthermore, Yu et al. (2019) used a social media dataset to train a CNN model for typhoon disaster assessment [35]. Our model extends the basic CNN model so that it contains separate sub-CNN models to process each input variable. We split the actual ship flow and extreme weather data into training and testing datasets to train and validate our model.
We generated a sub-model for each of the variables (e.g., ship flow in the neighboring cells or the presence/absence of extreme weather events) that took a one-dimensional sequence of a pre-defined time window. The total ship flows over time in each of the surrounding regions and the central region were fed into their own sub-models, respectively. Each of the sub-models contained a separate kernel to read the ship flow input sequence onto a separate set of filter maps to learn features from the input time series data. We normalized the ship counts in the surrounding regions due to their high variation. Each sub-model contained two convolutional layers followed by a max-pooling layer as a down-sampling strategy. The first layer had a kernel size of three, whereas the second layer had a kernel size of one [36]. We used ADAM as the optimization algorithm [37]. Early stopping was used as a regularization technique to fine-tune the model. Then, the sub-model summarized the learned features from the sequence and produced a flat vector. All these flat vectors were merged through concatenation and interpreted by a fully connected layer to make a prediction (Figure 8).
By utilizing the H3 system, we pre-processed the AIS trajectory points in three steps so the model could run with the dataset: (1) we assigned unique H3 identifiers at different resolutions (level 3 and level 4) to the trajectory points using the H3 system based on their location; (2) we calculated the total number of ships in each H3 cell; and (3) we calculated the total number of ships moving toward the center cell. Based on the analysis of patterns, we formulated the model to include the factors summarized in Table 2.
Table 2. Summary of the model.
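To make the data flow through the sub-models concrete, the sketch below traces a forward pass in plain Python: one head per input variable, two convolutions (kernel sizes three and one), max-pooling, flattening, concatenation, and a fully connected output layer. It is only a structural illustration under several assumptions that are not taken from the paper: random untrained weights, ReLU activations, four filters per layer, a pooling size of two, eight input variables, and an 8-step input window with a 4-step output.

```python
import random

def conv1d(seq, kernels, bias):
    """Valid 1D convolution with ReLU. seq: [time][in_ch];
    kernels: [out_ch][k][in_ch]; bias: [out_ch]."""
    k = len(kernels[0])
    out = []
    for t in range(len(seq) - k + 1):
        row = []
        for f, kern in enumerate(kernels):
            total = bias[f]
            for dt in range(k):
                for c, w in enumerate(kern[dt]):
                    total += w * seq[t + dt][c]
            row.append(max(0.0, total))  # ReLU is an assumed activation
        out.append(row)
    return out

def max_pool1d(seq, size=2):
    """Non-overlapping max-pooling along the time axis."""
    channels = len(seq[0])
    return [[max(step[c] for step in seq[t:t + size]) for c in range(channels)]
            for t in range(0, len(seq) - size + 1, size)]

def random_kernels(rng, out_ch, k, in_ch):
    return [[[rng.uniform(-0.5, 0.5) for _ in range(in_ch)]
             for _ in range(k)] for _ in range(out_ch)]

def make_head(rng, filters=4):
    """One sub-model: conv (kernel 3) then conv (kernel 1)."""
    return {"k1": random_kernels(rng, filters, 3, 1), "b1": [0.0] * filters,
            "k2": random_kernels(rng, filters, 1, filters), "b2": [0.0] * filters}

def head_forward(head, series):
    x = [[v] for v in series]               # [time][1 channel]
    x = conv1d(x, head["k1"], head["b1"])   # kernel size three
    x = conv1d(x, head["k2"], head["b2"])   # kernel size one
    x = max_pool1d(x, 2)                    # down-sampling
    return [v for step in x for v in step]  # flat vector

def model_forward(heads, inputs, dense_w, dense_b):
    merged = []
    for head, series in zip(heads, inputs):
        merged.extend(head_forward(head, series))  # concatenation
    return [sum(w * v for w, v in zip(row, merged)) + b
            for row, b in zip(dense_w, dense_b)]   # fully connected layer

rng = random.Random(0)
n_vars, window, horizon, filters = 8, 8, 4, 4
heads = [make_head(rng, filters) for _ in range(n_vars)]
inputs = [[rng.random() for _ in range(window)] for _ in range(n_vars)]
features_per_head = len(head_forward(heads[0], inputs[0]))
dense_w = [[rng.uniform(-0.1, 0.1) for _ in range(features_per_head * n_vars)]
           for _ in range(horizon)]
dense_b = [0.0] * horizon
prediction = model_forward(heads, inputs, dense_w, dense_b)
```

With these assumed sizes, each head reduces its 8-step sequence to 3 pooled steps of 4 filters (12 features), so the merged vector has 96 features and the output has one value per forecast step. A trained version would normally be built in a deep learning framework rather than by hand.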
To demonstrate the performance of the model, we also selected other models for comparison: the autoregressive integrated moving average (ARIMA) model, lasso model, stochastic gradient descent (SGD) model, long short-term memory (LSTM) model, convolutional long short-term memory (ConvLSTM) model, and multilayer perceptron (MLP) model [11,16,38-40]. Of the six comparison models, lasso, ARIMA, and SGD are statistical models. ConvLSTM and LSTM were developed based on a recurrent neural network (RNN) model where the connections between nodes help them to process a sequence of inputs. The MLP model is a class of feedforward artificial neural network (ANN) model that contains an input layer, a hidden layer, and an output layer. We divided each dataset into training data and testing data. Due to the wide range of geolocations and the complexity of ship flow patterns, we adopted a variable multiple time-step approach to train the models and find the optimal results. To train the models and evaluate their accuracy in predicting ship flow a certain number of days or hours into the future, we used the past 3 days of ship flow data to predict the next 1 day and 3 days of ship flow, and used the past 8 hours of ship flow data to predict the ship flow for the next 4 hours and 8 hours.
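The multiple time-step setup can be illustrated by turning a ship flow series into input/output windows. This is a minimal sketch; the function name `make_windows` and the sample counts are hypothetical, and the window sizes mirror the 3-day input with 1-day and 3-day outputs described above.

```python
def make_windows(series, n_in, n_out):
    """Slide over a series and pair each n_in-step input window with the
    n_out steps that immediately follow it."""
    samples = []
    for start in range(len(series) - n_in - n_out + 1):
        x = series[start:start + n_in]
        y = series[start + n_in:start + n_in + n_out]
        samples.append((x, y))
    return samples

daily = [10, 12, 11, 15, 14, 13, 16]  # toy daily ship counts
one_day = make_windows(daily, n_in=3, n_out=1)    # past 3 days -> next 1 day
three_day = make_windows(daily, n_in=3, n_out=3)  # past 3 days -> next 3 days
```

The hourly case follows the same pattern with `n_in=8` and `n_out=4` or `n_out=8`.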
As mentioned earlier, we compared our model with six other machine learning methods. To compare the prediction performance, we configured the models for the two groups of tests. We tested the seven predictive models by forecasting the total number of ships in the selected central zones for each day/hour over the next few days/hours, using multiple time-steps as inputs. For the statistical models using Lasso and SGD, we used a recursive forecast strategy in which each prediction is fed back into the model to make the subsequent prediction. The MLP model has a 2-layer structure with 50 nodes in the hidden layer. ConvLSTM is a variant of LSTM; in our comparison, the LSTM model takes multiple variables as inputs. We used the root mean square error (RMSE) and mean absolute error (MAE) as evaluation metrics to gauge the prediction accuracy. The RMSE and MAE evaluation metrics are defined as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\mathrm{Predicted}_i - \mathrm{Actual}_i\right)^2}$$
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|\mathrm{Predicted}_i - \mathrm{Actual}_i\right|$$
where N is the number of test samples; Predicted_i is the predicted ship flow value for sample i; and Actual_i denotes the real ship flow value for sample i.
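The recursive forecast strategy and the two error metrics can be sketched in a few lines, using the standard definitions of RMSE and MAE. The one-step model below (`mean_model`, which simply averages the window) is a hypothetical stand-in for a fitted Lasso or SGD regressor, and the input values are illustrative only.

```python
import math

def rmse(predicted, actual):
    """Root mean square error over paired predictions and observations."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def mae(predicted, actual):
    """Mean absolute error over paired predictions and observations."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def recursive_forecast(one_step_model, history, n_in, n_steps):
    """Predict n_steps ahead by feeding each prediction back into the input window."""
    window = list(history[-n_in:])
    predictions = []
    for _ in range(n_steps):
        next_value = one_step_model(window)
        predictions.append(next_value)
        window = window[1:] + [next_value]   # drop the oldest value, append the prediction
    return predictions

def mean_model(window):
    """Hypothetical one-step model used only so the sketch runs."""
    return sum(window) / len(window)

preds = recursive_forecast(mean_model, [10, 12, 14], n_in=3, n_steps=3)
```

A consequence of this strategy is that errors in early steps propagate into later ones, which is one reason multi-step results are usually reported separately from 1-step results.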

Model Evaluation
We conducted two groups of tests (denoted as "Group 1" and "Group 2") for the ship flow prediction analysis. We pre-processed all data by assigning H3 identifiers (at resolution levels 3 and 4). Group 1 contained the ship flow datasets at the daily level, whereas Group 2 contained the datasets at the hourly level. We selected vessel track points from the hurricane events in September 2017 to test the impacts of hurricanes on ship flow prediction. Tests in Group 1 compared the performance of ship flow prediction using H3 hexagons at level 3 and level 4 resolutions with annual data. Tests in Group 2 compared the performance of ship flow prediction at the hourly level with monthly data. A separate test in Group 2 compared the performance of ship flow prediction using H3 hexagons at level 3 resolution while taking the impact of hurricanes into consideration. We tried 10, 20, and 30 for the convolutional layer's filter size and found that a filter size of 20 gave better prediction performance than 10 or 30. For the learning rate of the Adam optimization algorithm, we tried 0.0001, 0.0002, and 0.001. The model performed better with a learning rate of 0.001 when using level 3 hexagons and 0.0001 when using level 4 hexagons. For the model input data, we found that the model performed better using the number of ships in the surrounding hexagons for daily-level prediction and the number of ships moving toward the center in the surrounding hexagons for hourly-level prediction. This is probably because ships can move longer distances and show more complicated movement patterns at the daily level, so a simple ship direction calculation may be insufficient to capture the changes, which reduces the prediction accuracy. We adopted the optimal configurations described above for the two groups of tests.
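The configuration search described above amounts to a small grid search over the filter size and the learning rate. The sketch below shows one way such a search could be organized; it is not the authors' procedure. The `grid_search` helper and the `toy_evaluate` function are hypothetical, and the toy error surface is arbitrary and does not represent the paper's validation errors. In practice, `evaluate` would train the model with the given settings and return its error on held-out data.

```python
from itertools import product

def grid_search(evaluate, filter_options, learning_rates):
    """Return (best_error, filter_size, learning_rate) over the full grid."""
    best = None
    for filter_size, lr in product(filter_options, learning_rates):
        error = evaluate(filter_size, lr)
        if best is None or error < best[0]:
            best = (error, filter_size, lr)
    return best

def toy_evaluate(filter_size, lr):
    """Arbitrary toy error surface so the sketch runs; NOT real results."""
    return abs(filter_size - 20) / 10 + abs(lr - 0.001) * 100

best_error, best_filter_size, best_lr = grid_search(
    toy_evaluate,
    filter_options=[10, 20, 30],
    learning_rates=[0.0001, 0.0002, 0.001],
)
```

Running the search separately for each hexagon resolution would allow different settings per resolution, as reported for levels 3 and 4.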

Group 1: Daily Ship Flow Prediction
For the daily ship flow prediction, we split the annual ship flow data and used the first 300 days for training and the last 65 days for testing. Eleven sets of daily traffic flow data at the H3 level 3 resolution and 21 sets of daily traffic flow data at the H3 level 4 resolution were selected. In this test, we used seven different models to predict ship flows in the central zone for the next day and the next three days based on ship flows in the surrounding hexagonal zones. We used 3 days of historical ship flow data as input for model training. The test results showed that using the mh-CNN method could significantly improve the ship flow prediction accuracy (see complete results in Appendix A). The average MAE was about 11.65 for 1-day prediction and 13.75 for 3-day prediction, whereas the average RMSE was about 15.01 for 1-day prediction and 15.20 for 3-day prediction (Appendix A). Although the mh-CNN method outperformed the other methods in general, there were some cases where it did not perform on par with the others. This is probably because ship flow can be affected by many other factors, and the ship flow in the surrounding zones alone is insufficient to capture the impacts on the ship flow in the central zone. Overall, mh-CNN was the best-performing model in about 48% of the cases, the highest share among the models compared.
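The 300/65 split is chronological rather than random, so the test period always follows the training period. A minimal sketch, assuming the daily series is stored as a list ordered by date (the helper name `chronological_split` and the placeholder series are hypothetical):

```python
def chronological_split(series, n_train):
    """Split a time-ordered series without shuffling: first n_train items train, rest test."""
    return series[:n_train], series[n_train:]

# Placeholder for one year of daily ship counts (day index used as a stand-in value).
daily_series = list(range(365))

train_days, test_days = chronological_split(daily_series, n_train=300)
```

Keeping the split in time order avoids leaking future observations into training, which matters for any model that uses lagged ship flow as input.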
When we increased the hexagon resolution to level 4, using the mh-CNN method could again significantly improve the ship flow prediction accuracy (see Appendix B for the complete results). The average MAE was about 7.38 for 1-day prediction and 8.77 for 3-day prediction, whereas the average RMSE was about 9.54 for 1-day prediction and 11.38 for 3-day prediction. In about 30% of the cases, mh-CNN outperformed the other models. Although the performance decreased slightly when compared to the level 3 cases, mh-CNN was still better than the other models.

Group 2: Hourly Ship Flow Prediction
In this group of tests, we compared the model performance for two cases: (1) four sets of hourly traffic flow data in September at the H3 level 3 resolution, and (2) one set of hurricane traffic flow data at the H3 level 3 resolution (Figure 1). Level 3 hexagons were used to test the model performance at variable spatial scales. Ships usually cannot move over such a large area within a short period of time. In this group of tests, we used seven different models to predict ship flows in the central zone for the next four hours and eight hours based on the ship flows moving toward the central zone in the surrounding hexagons. We used the first 600 hours of ship flow for model training and the remaining 120 hours of ship flow for testing. In addition, we prepared a dataset to test the model performance during the hurricane period. We used the H3 hexagons to establish a search radius of five. A separate column of binary data (1 for present and 0 for absent) recorded whether extreme weather events were present in the search area, and we created an additional sub-CNN model for this variable.
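The binary hurricane indicator can be sketched as a grid-distance check between the central cell and the cells affected by the storm at each hour. This sketch makes several assumptions: hexagons are represented by axial (q, r) coordinates as a stand-in for H3 identifiers, the search radius is measured in hexagonal grid steps, and the storm track is hypothetical. With real H3 identifiers, the library's own grid-distance or neighborhood functions would take the place of `hex_distance`.

```python
def hex_distance(a, b):
    """Grid distance between two hexagons given in axial (q, r) coordinates."""
    dq, dr = a[0] - b[0], a[1] - b[1]
    return (abs(dq) + abs(dr) + abs(dq + dr)) // 2

def hurricane_flags(center, storm_cells_by_hour, radius=5):
    """Return 1 for each hour in which any storm cell lies within the search radius, else 0."""
    return [int(any(hex_distance(center, cell) <= radius for cell in cells))
            for cells in storm_cells_by_hour]

center = (0, 0)
# Hypothetical storm positions for four consecutive hours (empty list = no storm cells).
storm_track = [[(9, 0)], [(6, -1)], [(3, 2)], []]

flags = hurricane_flags(center, storm_track)   # one 0/1 value per hour
```

The resulting 0/1 sequence has the same length as the hourly ship flow series, so it can be windowed in the same way and passed to its own sub-model.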
Results showed that the mh-CNN method did not outperform the other prediction models for hourly prediction using monthly data (Appendices C and F). This is probably because there is less variation in ship flow at the hourly level, which makes it more difficult for mh-CNN to detect pattern changes accurately when compared to the other models. Nevertheless, mh-CNN still performed reasonably well, as its RMSE and MAE results were very close to those of the other deep-learning models. This shows that it can still predict more accurately than the LSTM-based models and the traditional statistical methods using Lasso or SGD. For the hurricane impact test, the mh-CNN method performed better than the other models for the next 4-hour and 8-hour forecasts. This is probably due to the greater change caused by hurricanes, which makes it more difficult for the other models to capture the variation of ship flow patterns over time.

Discussion and Recommendations
In summary, our approach introduced performance gains in most of the experiments, as summarized in Table 3. Our results confirm that, in combination with the daily ship flow from the surrounding H3 hexagonal cells, our mh-CNN model can accurately capture the ship flow variation at multiple spatiotemporal scales. Our mh-CNN model significantly outperformed the other models for long-term prediction (e.g., one day and three days, or four hours and eight hours of prediction in our case). However, when the hourly data were applied to our model for hourly prediction based on general monthly data, the model showed an average performance when compared to the other classic models. Nevertheless, the model performance increased significantly when we took the presence of hurricanes into consideration for ship flow prediction during extreme weather events. With a sub-model based on the search for the presence of hurricanes, the model showed much better performance in predicting the ship flow for the next four and eight hours when the region was affected by extreme weather events.
Table 3. Model performance and recommendation (n/a: tests are not available).

| Scenario | 3-day | 2-day | 1-day | 4 hours | 8 hours |
| Normal Model + Normal Days | mh-CNN | mh-CNN | mh-CNN | mh-CNN | mh-CNN |
| Normal Model + Hurricane Days | n/a | n/a | n/a | ARIMA | MLP |
| Hurricane Model + Hurricane Days | n/a | n/a | n/a | mh-CNN | mh-CNN |

Conclusions and Future Work
In this paper, we present a multiple hexagon-based CNN approach to predict ship flow by taking different variables into consideration so that it can be applied to regions involving major shipping activities. The forecast model implementation involves three major steps of data processing: (1) partitioning the data using H3 hexagons at various resolutions, (2) preparing sub-CNN models to process the inputs from each hexagonal cell, and (3) searching for extreme weather events to capture the changes and handle the spatiotemporal variations of ship movements. According to the test results, our approach can significantly outperform other state-of-the-art forecasting models for long-term hourly and daily ship flow predictions. Our hexagon-based machine learning approach also outperformed the comparison models in predicting the more drastic ship flow changes during hurricanes, which addresses the gap in existing models for predicting traffic flow in open water under extreme weather conditions. Nevertheless, due to its complexity, this model may require reconfiguration to optimize its performance when making forecasts for a specific region.
In the future, there are several options that can take this research to the next level: (a) using H3 hexagons with different resolutions to test the model performance at other scales; (b) collecting more historical data to forecast for longer periods of time in the future; (c) using a hybrid deep-learning model to improve the accuracy; and (d) using more detailed extreme weather events data to fine-tune the model.