Container Shipment Demand Forecasting in the Australian Shipping Industry: A Case Study of Asia–Oceania Trade Lane

Abstract: Demand forecasting has a pivotal role in making informed business decisions by predicting future sales using historical data. Traditionally, demand forecasting has been widely used in the management of production, staffing and warehousing for sales and marketing data. However, the use of demand forecasting has been little studied in the container shipping industry. Improved visibility into the demand for container shipments has been a long-held objective of industry stakeholders. This paper addresses the shortcomings of both short-term and long-term shipment demand forecasting for the Australian container shipping industry. In this study, we compare three forecasting models, namely, the seasonal auto-regressive integrated moving average (SARIMA), Holt–Winters' seasonal method and Facebook's Prophet, to find the best-fitting model for short-term and long-term import demand forecasting in the Australian shipping industry. Demand data from three years, i.e., 2016–2018, are used for the Asia–Oceania trade lane. The mean absolute percentage error (MAPE), root mean squared error (RMSE) and 2-fold walk-forward cross-validation are used for the model evaluation. The experimental results observed from the selected metrics suggest that Prophet outperforms the other models in the comparison for container shipment demand forecasting.


Introduction
Requirements for transportation arise from the need to move goods from one place to another in response to consumer demand. Sea transport is the oldest and cheapest mode of transportation. The economic growth of a country relies on the success of its shipping industry [1]. However, the global shipping supply chain is intricate, with container demand that is exceptionally seasonal and driven by buyer-related occasions such as Christmas, New Year and Easter. These factors are further complicated by extreme weather events and by changes to the geopolitical regulatory environment affecting trade, such as the US-China trade war. Despite its importance, the global shipping industry lacks digitization and hence visibility into industry statistics such as the real-time availability of supply and demand. Without real-time visibility of the market, market-wide information is limited, and this can result in poor procurement and pricing decisions [2]. Providing real-time visibility into current and future container shipment demand gives a market-wide view of the industry. This can not only reduce the risks introduced by the volatile nature of shipment demand but also offer the opportunity to make informed decisions regarding revenue and infrastructure development.
Demand forecasting in the supply chain helps to optimize stock and reduce costs, and can also enhance sales as well as customer loyalty. However, the forecasting of demand in various domains remains a neglected issue [3]. Despite the importance of demand forecasting in such an important industry, there is scant research in this domain. Demand forecasting has been researched and applied extensively for sales, electricity consumption and housing, but little analysis has been carried out for forecasting shipment demand in the shipping industry.
The aim of this study is to forecast container shipment demand for both the short-term and long-term. The dataset is sourced from the five major Australian ports operating internationally. In applied research, the gathering of a real-time dataset is a challenge. The real-time dataset (the Asia-Oceania trade lane) was chosen and made available by our industry partner, the Mizzen group, a digital pricing and rate management solution. The rest of this paper is organized as follows. In Section 2, a literature review in demand forecasting is summarized. Section 3 describes the research design. The experimental results and evaluation are discussed in Section 4. Section 5 outlines the conclusion and future directions.

Literature Review
Machine learning (ML) aids decision-making in almost every sector of life such as business, industrial engineering, medicine, physics and statistics [2]. However, there is no ML model that can forecast shipment demand to help make informed decisions. As seasonal variations drive the shipping industry, the prediction of future events is highly time dependent.
Forecasting from time series data is a long-standing problem. A time series allows future values to be predicted from the components of the series found in its historical data. The moving average is the simplest forecasting method. It calculates the average of a window of sample observations and uses that average as the forecast for the next period. For each new sample, the average is recalculated with the newest observation included and the oldest one dropped. Thus, a forecast is computed for every new data observation [4]. The method can generate accurate forecasts for a time series with regular trends, but it may produce misleading forecasts for a series whose trend changes with time [5]. The time series models can be categorized into three major classes, as shown in Figure 1.

Statistical Models
Statistical models are mathematical models that use a set of statistical assumptions about the data samples to produce forecasts. The weighted moving average is a variant of the simple moving average in which larger weights are assigned to the more critical periods; the higher the weight, the more critical the data value. This method is more sensitive to trends [5]. Simple exponential smoothing (SES) assumes that the data fluctuate around a constant level over time [6]. A variant of exponential smoothing is Holt-Winters' non-seasonal method, which includes a trend term that measures the expected increase or decrease per unit period in the local mean level. Holt-Winters' seasonal method extends the non-seasonal method by adding a smoothing factor for each period of the year to adjust the forecast according to the expected seasonal fluctuation [7]. In the 1970s, Box and Jenkins presented auto-regressive (AR) and moving average (MA) models for time series prediction [8]. AR expresses the current value of a time series as a linear combination of its past values, whereas MA expresses it as a function of the random disturbances that affect the series. These models proved to be quite useful for predictions in their initial era. As research continued, it was noticed that there are situations where a time series does not follow linear trends, and a range of new models has been presented to cater for these needs [9]. The auto-regressive integrated moving average (ARIMA) is the most widely used model for time series forecasting [9,10]. ARIMA exploits the dependency between an observation and its lagged observations, together with the dependency between an observation and the residual errors of a moving average model applied to lagged observations. It makes the time series stationary by differencing, i.e., by subtracting the previous observation from the current one.
The ARIMA model has variants such as the seasonal ARIMA (SARIMA), which caters for the seasonal variations in a time series, and ARIMAX, which incorporates exogenous covariates alongside the time series.
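The differencing step that ARIMA uses to remove a trend can be sketched as follows. This is a minimal illustration, assuming a hypothetical helper named `difference` and made-up demand values; a lag of 1 corresponds to ordinary differencing, and a lag equal to the seasonal period corresponds to the seasonal differencing used by SARIMA.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical weekly demand with a steady upward trend.
demand = [100, 104, 108, 112, 116]
first_diff = difference(demand)        # the linear trend becomes a constant
second_diff = difference(first_diff)   # differencing again (d = 2)
print(first_diff, second_diff)
```

A series with a linear trend becomes constant after one difference, which is why a single difference is often sufficient in practice.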

Hybrid Models
Hybrid models combine machine learning models and statistical models to cater for both linear and non-linear data more effectively. Artificial neural networks (ANNs) [11,12] have been found to be very efficient at handling the non-linearity of a time series. Support vector regression (SVR) can also handle the non-linear part of a time series well. Ebrahimian et al. [11] presented a novel method for energy demand prediction that combined SARIMA and SVR: SARIMA handled the linear data component and SVR handled the non-linear data components. Hybrid linear and non-linear models have also been employed for time series forecasting, with much of the focus on ANN and ARIMA models, where ARIMA handles the linear data components and the ANN handles the non-linear parts. In [9], ANNs were used for one-month-ahead price prediction in the liner shipping industry. The liner shipping industry is volatile and is impacted by seasonal variations, public holidays and travel routes. According to [13], ANNs can handle the volatility of the shipping industry and provide promising forecasts. However, ANNs suffer from overfitting problems. In [14], a new regression-based model was designed to forecast shipping container volumes, i.e., supply. The author claims that the designed regression model can cater for the non-stationary parts of a time series. In [15], support vector machines (SVMs) were used to forecast a dry bulk freight index.

Deep Learning-Based Models
Deep learning has more recently been explored for time series analysis. In [10], spot electricity prices were predicted using deep learning methods. The author proposed four deep learning models for this purpose: deep neural networks (DNNs), a hybrid long short-term memory DNN (LSTM-DNN), a hybrid gated recurrent unit DNN (GRU-DNN) and convolutional neural networks (CNNs). The study inferred that deep learning methods outperform statistical and ANN-based models. In [12], research was conducted on Bitcoin price prediction by comparing long short-term memory (LSTM), recurrent neural network (RNN) and ARIMA models. The results from the deep learning models were more promising than those from the other classes of models.

Comparative Analysis of Existing Time Series Techniques
Based on the literature review of time series forecasting models, it is evident that plenty of work has been done in varied domains to perform predictions based on time series. Various models from different classes have been designed and applied in different domains. However, applications of time series models in the container shipping industry are still scant, and only a limited number of models have been applied to or designed for this domain. To the best of our knowledge, there exists no forecasting capability in the industry that can predict future demand based on seasonality and trends from past data. To cover this gap, we applied existing time series forecasting models that can handle the trends and seasonality inherently present in the dataset. Table 1 shows the existing time series forecasting models and their application in the container shipping industry.

Research Design
Although researchers have been actively working on demand forecasting for various domains such as sales and electricity over the last few decades, demand forecasting for container shipment is still in the shadows. To our knowledge, there are no prominent studies that can assist in forecasting container shipment demand.

Methodology
To address the gap mentioned earlier, we performed a comparative study of three state-of-the-art time series models, namely, SARIMA, Holt-Winters' seasonal method and Facebook's Prophet, to forecast container shipment demand for both the short-term and long-term in the Australian container shipping industry. Three years of historical demand data, i.e., 2016-2018, were collected from five international Australian ports for the Asia-Oceania trade lane [16][17][18][19][20]. The root mean squared error (RMSE) and the mean absolute percentage error (MAPE) were used as evaluation metrics.

Data Sourcing and Cleansing
This section explains the data sourcing and cleansing carried out to make the data suitable for short-term demand forecasting. To our knowledge, there exists no dataset that can provide insights into the shipment demand in the Australian shipping industry. We started by collecting the real-time shipment demand datasets from five international ports to form a consolidated demand dataset. These ports were the Port of Melbourne [16], Fremantle Port [17], Flinders Port [19], the Port of Brisbane [18] and Port Botany [20], as shown in Figure 2. The detailed methodology for the shipment demand data collection and cleansing can be found in our research work presented in [21]. Figure 3 explains the process of cleansing the data sources to obtain the trade lane-specific dataset ready to be used in the machine learning algorithms. The initial step of the data cleansing was to select the time horizon. Historic data from 2016-2018 were extracted. This selected dataset consisted of data from all the trade lanes operating from Australia (import and export). As this research study targeted only Asia-Oceania trade, the data to and from the Asia-Oceania trade lane were segregated. The trade dataset acquired was the combination of full and empty containers coming in to and going out of the Australian ports over the Asia-Oceania trade lane. Full containers are the containers (measured in TEUs) coming in and out filled with goods. In contrast, empty containers are the ones that are emptied before entering Australian waters or are moved back without goods in them. In the total shipment demand, both the filled and empty containers are equally important. The features in the dataset are shown in Table 2. In the table, the date refers to the first day of the week and the values describe that whole week.
Hence, for both imports and exports, the dataset contains the total number of filled incoming (inbound) and outgoing (outbound) containers as well as empty containers. As the scope of this study was demand forecasting for Asia-Oceania trade lane imports only, we separated the imports from the exports in the next step. Table 3 shows the dataset features for the Asia-Oceania trade lane imports. The total shipment import demand can be calculated by adding the total incoming empty containers (Empty Containers) and the total incoming full containers (Full Containers), as expressed in Equation (1). Features from the final dataset are shown in Table 4.

$$\text{Import Demand}_{\text{Total}} = \text{Empty Containers} + \text{Full Containers} \qquad (1)$$
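As a small worked illustration of the import filter and Equation (1), the sketch below sums empty and full inbound containers per week. The field names (`date`, `direction`, `empty_containers`, `full_containers`) and all values are hypothetical; the actual column names are those listed in Tables 2-4.

```python
def total_import_demand(rows):
    """Map each week to empty + full inbound containers (TEUs), per Equation (1)."""
    return {
        row["date"]: row["empty_containers"] + row["full_containers"]
        for row in rows
        if row["direction"] == "import"  # exports are outside the study's scope
    }


weekly_rows = [  # illustrative values only, not taken from the study's dataset
    {"date": "2016-01-04", "direction": "import", "empty_containers": 120, "full_containers": 880},
    {"date": "2016-01-04", "direction": "export", "empty_containers": 300, "full_containers": 500},
    {"date": "2016-01-11", "direction": "import", "empty_containers": 150, "full_containers": 910},
]
import_demand = total_import_demand(weekly_rows)
print(import_demand)
```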

Missing Value Handling
Missing values were filled with the average of the shipment demand at the corresponding timestamps in the previous and following years [22]. A box plot analysis showed that there were no outliers that could affect the performance of the models. Figure 4 shows the box plot of the demand dataset (see also Figure 5) and Figure 6 shows the data description of the demand dataset.
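The imputation rule described above can be sketched as follows. This is an illustration under stated assumptions rather than the study's implementation: the series is keyed by a hypothetical `(year, week)` pair, and when only one of the two neighbouring years is available (as happens in the first and last year of a three-year dataset) that single value is used, which is a choice made here and not described in the text.

```python
def fill_from_adjacent_years(demand):
    """Fill None entries in a {(year, week): value} mapping.

    A missing value is replaced by the mean of the values for the same week
    in the previous and following year. If only one neighbour is available
    (an assumption made here), that value is used alone; if neither is
    available, the gap is left as None.
    """
    filled = dict(demand)
    for (year, week), value in demand.items():
        if value is not None:
            continue
        neighbours = [demand.get((year + offset, week)) for offset in (-1, 1)]
        known = [v for v in neighbours if v is not None]
        if known:
            filled[(year, week)] = sum(known) / len(known)
    return filled


# Hypothetical weekly demand with two gaps.
demand = {
    (2016, 10): 1000, (2017, 10): None, (2018, 10): 1200,
    (2016, 11): None, (2017, 11): 900,
}
filled = fill_from_adjacent_years(demand)
print(filled[(2017, 10)], filled[(2016, 11)])
```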

Test-Train Split
Once the missing values were filled, the dataset was divided into two parts: the train and the test dataset (see Figure 7). A total of 70% of the data were used for training the model; the remaining 30% of the data were used for testing the model. In time series models, the usual test-train split does not work as the values are dependent on time [23]. Performing random partitioning can cause misleading results. Hence, we selected a cut-off date that corresponded with approximately 70% of the dataset, i.e., February 2018 (see the vertical red line in Figure 7), for the training data to capture enough seasonality and trends of the time series under observation and use the rest of the data as test data.
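A chronological split of this kind can be sketched as below. The function name and the synthetic weekly series are assumptions for illustration; the only point being shown is that the data are ordered by date and cut once, with no shuffling.

```python
from datetime import date, timedelta


def chronological_split(observations, train_fraction=0.7):
    """Split (date, value) pairs into train/test sets without shuffling."""
    ordered = sorted(observations, key=lambda pair: pair[0])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Hypothetical weekly series: 156 Mondays starting on 4 January 2016.
start = date(2016, 1, 4)
weeks = [(start + timedelta(weeks=i), 1000 + i) for i in range(156)]
train, test = chronological_split(weeks)
cutoff = test[0][0]  # first date that belongs to the test set
print(len(train), len(test), cutoff)
```

With 156 synthetic weekly observations, the 70% cut falls in early February 2018, which is consistent with the cut-off date reported in the text.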

Forecasting Models
Three state-of-the-art time series models were selected to forecast shipment demand. These were SARIMA, Holt-Winters' seasonal method and Prophet. All these models were capable of a time series analysis using the seasonality present in the historical data.

Seasonal Auto-Regressive Integrated Moving Average (SARIMA)
SARIMA is an extension of the auto-regressive integrated moving average (ARIMA) [24] with the additional capability of handling seasonality in a time series. Hence, we used SARIMA to solve our research problem, as it adds three seasonal parameters, (P, D, Q), together with the seasonal period m, to the (p, d, q) parameters of ARIMA. The model is written as

$$\text{SARIMA}(p, d, q)(P, D, Q)_m \qquad (2)$$

where m is the seasonal period and p is the number of lag observations, read from the partial auto-correlation function (PACF) plot.
The PACF is the partial correlation between the series and its lagged values, i.e., the correlation that is not explained by their mutual correlation with the intermediate lags. The d is the number of times the series is differenced to make it stationary and q is the order of the moving average, set from the auto-correlation function (ACF) plot. A series with no trend has d = 0, whereas a series with a trend has d ≥ 1. P is the seasonal auto-regressive order and D is the seasonal difference order. If the series has a stable seasonal pattern, then D = 1; if the seasonal pattern is unstable, then D = 0. Q is the seasonal moving average order. Both P and Q are set from the ACF plots [24]. To find the optimal values of (p, d, q)(P, D, Q)_m, we used a grid search [24] with the Akaike information criterion (AIC) as the selection criterion. Table 5 shows the parameters used in this research to fit SARIMA for container shipment demand.
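The grid search over (p, d, q)(P, D, Q)_m can be sketched as follows. This is a minimal illustration, not the study's code: `grid_search` and `toy_aic` are hypothetical names, and `toy_aic` is a stand-in scorer so that the sketch runs without fitting any model. In practice, the scorer would fit a SARIMA model for each combination and return its AIC, for example with the `SARIMAX` class from statsmodels, as outlined in the comment.

```python
from itertools import product


def grid_search(score, p_values, d_values, q_values,
                P_values, D_values, Q_values, m):
    """Return (best_score, order, seasonal_order) minimising `score`.

    `score(order, seasonal_order)` should return the AIC of a fitted model,
    or raise an exception if the fit fails for that combination.
    """
    best = None
    for p, d, q, P, D, Q in product(p_values, d_values, q_values,
                                    P_values, D_values, Q_values):
        order, seasonal_order = (p, d, q), (P, D, Q, m)
        try:
            aic = score(order, seasonal_order)
        except Exception:
            continue  # skip combinations that cannot be fitted
        if best is None or aic < best[0]:
            best = (aic, order, seasonal_order)
    return best


# With statsmodels installed, a real scorer could look like this (not run here):
#   from statsmodels.tsa.statespace.sarimax import SARIMAX
#   def sarimax_aic(order, seasonal_order):
#       model = SARIMAX(train, order=order, seasonal_order=seasonal_order)
#       return model.fit(disp=False).aic

def toy_aic(order, seasonal_order):
    """Stand-in scorer (assumption): smallest at (1,1,1)(0,1,1)."""
    target = (1, 1, 1, 0, 1, 1)
    candidate = order + seasonal_order[:3]
    return sum(abs(a - b) for a, b in zip(candidate, target))


best = grid_search(toy_aic, range(2), range(2), range(2),
                   range(2), range(2), range(2), m=12)
print(best)
```

Keeping the scorer as a parameter separates the enumeration of candidate orders from the model-fitting library, so the same loop can be reused with a different selection criterion.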

Holt-Winters' Seasonal Method
Holt-Winters' methods are suitable for data with trends and seasonality [25]. Similar to SARIMA, the method has two variants: additive and multiplicative. Mathematically, the forecast can be written as Equation (3), where $L_t$ is the level, which is a weighted average of the seasonal and non-seasonal forecasts, $k b_t$ represents the trend of the data and $S_{t+k-s}$ represents the seasonal pattern; these can be calculated using Equations (4)-(6), respectively. The coefficients α, β and γ are smoothing factors and their values lie between 0 and 1.
The parameters used to fit Holt-Winters' seasonal methods are shown in Table 6.
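For reference, the additive Holt-Winters recursions are commonly written as below. This is the standard textbook form, given as an assumption about what Equations (3)-(6) contain and numbered to match the references in the text; here $y_t$ denotes the observed demand at time $t$, $s$ the length of the seasonal cycle, $F_{t+k}$ the forecast $k$ periods ahead (for $k \le s$), and $L_t$, $b_t$ and $S_t$ the level, trend and seasonal terms.

```latex
\begin{align}
F_{t+k} &= L_t + k\,b_t + S_{t+k-s} \tag{3}\\
L_t &= \alpha\,(y_t - S_{t-s}) + (1-\alpha)\,(L_{t-1} + b_{t-1}) \tag{4}\\
b_t &= \beta\,(L_t - L_{t-1}) + (1-\beta)\,b_{t-1} \tag{5}\\
S_t &= \gamma\,(y_t - L_t) + (1-\gamma)\,S_{t-s} \tag{6}
\end{align}
```

In the multiplicative variant, the seasonal term scales the level and trend instead of being added to them.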

Facebook's Prophet
Facebook's Prophet is an open-source time series forecasting model [26]. It is designed to have in-built parameters that can be adjusted without going deep into the model's implementation details. At its core is a decomposable time series model with three main components: trend, seasonality and holidays. Mathematically, it can be written as:

$$y(t) = g(t) + s(t) + h(t) + \epsilon(t) \qquad (7)$$

In (7), g(t) is a piece-wise linear or logistic growth curve, s(t) is a periodic change (e.g., weekly/yearly seasonality), h(t) is a user-provided holiday effect and ε(t) is an error term accounting for unusual changes. The role of the domain expert is significant in every phase of the modelling. The domain expert can tweak the Fourier order and identify whether the details present in the data points are noise or a trend. As Prophet treats forecasting as a curve-fitting problem, it is inherently robust to outliers and missing data. Custom-defined holidays can also be used when fitting this model. This capability has not been provided by any of the existing algorithms yet. Hence, we defined the custom holidays as desired by our industry partner [27]. Table 7 shows the parameters of Prophet used to fit the model for container shipment demand forecasting. Table 8 shows the list of holidays used in this research.
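The Fourier order mentioned above controls how flexible the seasonal component s(t) is. In the published description of Prophet, s(t) is a truncated Fourier series; the sketch below shows how such features can be built and evaluated. It is an illustration only: the function names, the coefficient values and the period of 52 weeks are assumptions, and the library's own internals may order or scale the terms differently.

```python
import math


def fourier_features(t, period, order):
    """Return [cos(2*pi*n*t/period), sin(2*pi*n*t/period)] for n = 1..order."""
    features = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t / period
        features.extend([math.cos(angle), math.sin(angle)])
    return features


def seasonal_component(t, period, coefficients):
    """Evaluate s(t) as a weighted sum of Fourier features.

    `coefficients` holds one (cos, sin) pair per harmonic, so its length
    is twice the Fourier order.
    """
    order = len(coefficients) // 2
    terms = fourier_features(t, period, order)
    return sum(c * x for c, x in zip(coefficients, terms))


# Hypothetical coefficients for a yearly pattern in weekly data (period = 52).
coefficients = [0.3, -0.2, 0.1, 0.4]  # Fourier order 2
print(seasonal_component(5, 52, coefficients))
```

A higher Fourier order lets s(t) follow faster seasonal changes but increases the risk of fitting noise, which is why the text assigns this choice to the domain expert.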

Experimental Results and Discussion
This section explains the experimental results achieved by applying the selected models to the shipping dataset. The first task was to analyze the dataset to understand the time series under observation, starting with a time series decomposition. Figure 8 shows the decomposition of the demand time series into its components: the trend, the cycle or seasonality, and the residual. The observed series is the combination of these three components (see Figure 8a). The trend describes the variation of the variable under observation with respect to time (see Figure 8b), whereas the seasonal component depicts the seasonality present in the time series (see Figure 8c). The residual is what remains of the observed series once the trend and seasonality have been removed; at this stage, it can be regarded as noise added to the observed time series (see Figure 8d).
From Figure 8, it is evident that the shipment demand dataset was non-stationary in nature. The overall trend showed a sudden decrease in shipment demand after the first quarter of 2017, followed by a gradual increase throughout 2018. However, holidays had no effect on the shipment demand. This is understandable because, as explained by our industry partner, the Mizzen group [27], the shipping industry does not close on any day. The weekly trend of the shipment demand (see Figure 9) showed that demand rose suddenly at the start of every week (i.e., on Monday), fell sharply on Tuesday and then remained consistent. Moreover, the monthly trend graphs showed that the shipment demand started to rise in July, continued rising until November and then fell, reaching a minimum in February.
The change point analysis of the shipping dataset is shown in Figure 10. The graph indicates that there was a change in demand between May 2016 and September 2017. However, the changes were well within the range of the shipping infrastructure, hence suggesting a linear growth of demand. Figure 11 shows the auto-correlation function (ACF) plot of the shipping dataset. From the ACF plot, it is clear that there was an apparent relation between the present and past values of the shipment demand. However, the auto-correlation values alternated between positive and negative; this was due to the trends and seasonality in the dataset.
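The sample auto-correlation behind an ACF plot can be computed as in the sketch below. The function name and the synthetic series are assumptions for illustration; the toy series repeats every four periods, so the auto-correlation is positive at lag 4 and negative at lag 2, the same kind of sign alternation that seasonality produces in the text.

```python
def autocorrelation(series, max_lag):
    """Return the sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    variance = sum(d * d for d in deviations)
    if variance == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    return [
        sum(deviations[t] * deviations[t - lag] for t in range(lag, n)) / variance
        for lag in range(max_lag + 1)
    ]


# Hypothetical series with a period of four observations.
series = [1, 2, 3, 4] * 6
acf = autocorrelation(series, max_lag=4)
print([round(value, 3) for value in acf])
```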
Once the time series dataset was analyzed, SARIMA was trained over the training data using the parameters shown in Table 5 and then tested over the test data. As explained earlier, SARIMA's configuration requires finding an optimal combination of parameters, which was achieved by performing a grid search in our research. The combination with the minimum AIC value was used to fit the model. For our dataset, (1,1,1)(0,1,1)_12 provided the minimum AIC value and was therefore used in this research. Finally, demand forecasts were made using the trained model for both the short-term (i.e., 6 weeks) and the long-term (i.e., 52 weeks). Figure 12a shows the test forecast provided by SARIMA. The train and test forecast can be seen in Figure 12b. The short-term and long-term forecasts are shown in Figure 12c,d, respectively.
Continuing the experiment, Holt-Winters' seasonal model was trained. From the decomposition of the time series, it was evident that the dataset was seasonal in nature and was suitable for additive seasonality. The parameters are shown in Table 6. The test forecast provided by Holt-Winters' seasonal model is shown in Figure 13a. The forecast over both the train and test data can be seen in Figure 13b. The short-term (i.e., 6 weeks) and long-term (i.e., 52 weeks) demand forecasts are shown in Figure 13c,d, respectively.
Finally, Prophet was trained using the parameters tabulated in Tables 7 and 8. Prophet offers two additional flexible tuning parameters for researchers, i.e., custom-defined holidays and a custom change point definition. In addition, the model provides the ability to impose a dominant or recessive effect of holidays, based on the requirements, by modifying the parameter named 'Holiday Prior Scale' (see Table 7). The greater the value of this parameter, the greater the effect of the holidays on the forecasts. Furthermore, the model provides an edge over other models by offering control over weekly, monthly and yearly seasonality. Based on the dataset, each of these could be set to true or false (see Table 7). The test forecast provided by Prophet is shown in Figure 14a. The collective train and test forecast is shown in Figure 14b. The short-term (i.e., 6 weeks) and long-term (i.e., 52 weeks) forecasts are presented in Figure 14c,d, respectively.

Evaluation and Discussion
An accurate evaluation is fundamental to identifying the best-fitting model. We selected the root mean squared error (RMSE) and mean absolute percentage error (MAPE) to evaluate the performance of the selected models. The RMSE can be computed using Equation (8):

$$\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(f_i - o_i\right)^2} \qquad (8)$$

In (8), f is the forecast, o is the observed value and n is the number of observations. The MAPE measures the forecast accuracy as a percentage and can be calculated as Equation (9):

$$\text{MAPE} = \frac{100\%}{n}\sum_{t=1}^{n}\left|\frac{A_t - F_t}{A_t}\right| \qquad (9)$$

In (9), A_t is the actual value and F_t is the forecast. Figure 15 shows the comparison of the RMSE achieved by the selected models (for both the short-term and long-term). Figure 14 shows the RMSE achieved using the train and test data. Prophet offered a lower RMSE for both the train and test data compared with its competing models, i.e., SARIMA and Holt-Winters' seasonal model. Holt-Winters' seasonal model provided a lower RMSE than SARIMA over the training dataset but lost its performance over the test data. Thus, it could be concluded that Prophet outperformed both SARIMA and Holt-Winters' seasonal model.
This conclusion was supported by the comparative MAPE values of the selected algorithms, as presented in Figure 16. Prophet reached approximately a 4% and 11% MAPE for the train and test datasets, respectively. On the other hand, the train and test MAPEs provided by SARIMA were 62% and 36%, respectively, and Holt-Winters' seasonal method attained a 17% and 12% MAPE for the train and test, respectively. Hence, it was concluded that Prophet outperformed both Holt-Winters' seasonal method and SARIMA. Table 9 shows the experiment results comparing the RMSE and MAPE for both the test and train datasets for SARIMA, Prophet and Holt-Winters' seasonal method.
Finally, we performed a walk-forward cross-validation to identify the best-fitting model. As the shipping dataset we used covered three years, only a 2-fold cross-validation could be performed. To do so, we divided the dataset into two groups, i.e., train and validation sets. Figure 17 shows the dataset division into the train dataset (in blue) and the validation dataset (in red). The average RMSE for all three selected algorithms was observed. The results from the 2-fold walk-forward cross-validation are tabulated in Table 10. The average RMSE values led to results similar to those explained earlier in this section: the walk-forward 2-fold cross-validation showed that Prophet surpassed both SARIMA and Holt-Winters' seasonal method.
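The two error metrics and a 2-fold walk-forward evaluation can be sketched together as follows. This is an illustration under stated assumptions: the fold boundaries (an expanding training window with a 52-week validation block), the synthetic series and the seasonal-naive placeholder forecaster are all hypothetical, and the study's actual folds are those shown in Figure 17.

```python
import math


def rmse(observed, forecast):
    """Root mean squared error between two equal-length sequences."""
    squared = sum((f - o) ** 2 for o, f in zip(observed, forecast))
    return math.sqrt(squared / len(observed))


def mape(actual, forecast):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def walk_forward_folds(n_obs, n_folds, horizon):
    """Yield (train_indices, validation_indices) with an expanding training window."""
    first_cut = n_obs - n_folds * horizon
    if first_cut < 1:
        raise ValueError("not enough observations for the requested folds")
    for k in range(n_folds):
        cut = first_cut + k * horizon
        yield range(0, cut), range(cut, cut + horizon)


def seasonal_naive(train, horizon, season):
    """Placeholder forecaster: repeat the values of the last observed season."""
    return [train[len(train) - season + (h % season)] for h in range(horizon)]


# Synthetic three-year weekly series: yearly step of 10 plus a seasonal wave.
series = [1000 + 10 * (t // 52) + 50 * math.sin(2 * math.pi * t / 52)
          for t in range(156)]
fold_rmse = []
for train_idx, valid_idx in walk_forward_folds(len(series), n_folds=2, horizon=52):
    train = [series[i] for i in train_idx]
    valid = [series[i] for i in valid_idx]
    forecast = seasonal_naive(train, len(valid), season=52)
    fold_rmse.append(rmse(valid, forecast))
average_rmse = sum(fold_rmse) / len(fold_rmse)
print(round(average_rmse, 3))
```

Each validation block always lies after its training window, so no future observation leaks into the model being evaluated; any of the three forecasting models could replace the placeholder forecaster.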

Conclusions and Future Work
In this research, three time series forecasting models were applied to a real-time shipping dataset to forecast short-term and long-term shipment demand for the Australian shipping industry, specifically for the Asia-Oceania trade lane (imports only). Shipment demand forecasting was performed with respect to time in the historical data while incorporating seasonality and volatility. To analyze the performance of the selected models, the MAPE and RMSE values were observed. The evaluation results suggest that Facebook's Prophet outperformed Holt-Winters' seasonal method and SARIMA for both short-term and long-term demand forecasting by offering lower RMSE and MAPE values on both the train and test datasets. In addition, Prophet offered the flexibility of incorporating custom-designed holidays into its forecasting results.
The study of the existing literature revealed that there is limited work on forecasting shipment demand in the shipping industry. In the absence of real-time visibility into the operations of the container shipping supply chain, the industry is experiencing a considerable loss of revenue. Hence, we selected three state-of-the-art time series forecasting models to forecast container shipment demand in both the short-term and long-term, consequently providing a real-time insight into the future shipment demand for making informed pricing and planning decisions. This research work is one of the first studies performed so far for the Australian shipping industry.
There exist a few limitations to this research that can offer future research directions. We forecasted the shipment demand for a single trade lane's import and with just two variables as desired by the research partner (due to their business requirement). The forecasting can be expanded for other operating trade lanes for both imports and exports and with more complex variables. Apart from other time series algorithms, deep learning models can also be applied to the dataset to determine the performance and find an even better performing forecasting model.

Data Availability Statement:
The dataset is proprietary to Mizzen and their allied industry partners. Hence, we have not provided the data for public access.