Container Shipment Demand Forecasting in the Australian Shipping Industry: A Case Study of Asia–Oceania Trade Lane

Ubaid, Ayesha; Hussain, Farookh; Saqib, Muhammad

doi:10.3390/jmse9090968

Open Access Case Report


by Ayesha Ubaid *, Farookh Hussain and Muhammad Saqib

Centre for Artificial Intelligence, School of Computer Science, University of Technology Sydney, 15 Broadway, Ultimo 2007, Australia

* Author to whom correspondence should be addressed.

J. Mar. Sci. Eng. 2021, 9(9), 968; https://doi.org/10.3390/jmse9090968

Submission received: 3 August 2021 / Revised: 31 August 2021 / Accepted: 31 August 2021 / Published: 6 September 2021

(This article belongs to the Special Issue Short Sea Shipping, Multimodality, and Sustainable Maritime Transportation)


Abstract

Demand forecasting plays a pivotal role in making informed business decisions by predicting future sales from historical data. Traditionally, demand forecasting has been widely used in the management of production, staffing and warehousing, based on sales and marketing data. However, the use of demand forecasting has been little studied in the container shipping industry. Improved visibility into the demand for container shipments has been a long-held objective of industry stakeholders. This paper addresses the shortcomings of both short-term and long-term shipment demand forecasting for the Australian container shipping industry. In this study, we compare three forecasting models, namely, the seasonal auto-regressive integrated moving average (SARIMA), Holt–Winters’ seasonal method and Facebook’s Prophet, to find the best fitting model for short-term and long-term import demand forecasting in the Australian shipping industry. Demand data from three years, i.e., 2016–2018, are used for the Asia–Oceania trade lane. The mean absolute percentage error (MAPE), root mean squared error (RMSE) and 2-fold walk-forward cross-validation are used for the model evaluation. The experimental results for the selected metrics suggest that Prophet outperforms the other models compared for container shipment demand forecasting.

Keywords:

forecasting; shipping industry; SARIMA; Prophet; short-term; long-term; Holt–Winters’ seasonal method

1. Introduction

Requirements for transportation arise from the need to move goods from one place to another in response to consumer demand. Sea transport is the cheapest and oldest mode of transportation. The economic growth of a country relies on the success of its shipping industry [1]. However, the global shipping supply chain is intricate, with container demand that is exceptionally seasonal and driven by buyer-related occasions such as Christmas, New Year and Easter. These factors are further complicated by extreme weather events and by changes to the geopolitical regulatory environment affecting trade, such as the US–China trade war. Despite its importance, the global shipping industry lacks digitization and hence visibility into industry statistics such as the real-time availability of supply and demand. Without real-time visibility of the market, market-wide information is limited. Such a lack of market information can result in poor procurement and pricing decisions [2]. Providing real-time visibility into the current and future container shipment demand gives a market-wide view of the industry. This can not only reduce the risks introduced by the volatile nature of shipment demand but also offer the opportunity to make informed decisions regarding revenue and infrastructure development.

Demand forecasting in the supply chain helps to optimize stock and reduce costs and can also enhance sales as well as customer loyalty. However, the forecasting of demand in various domains remains an under-studied issue [3]. Despite the importance of demand forecasting in the shipping industry, there is scant research in this domain. There is an adequate amount of research on, and application of, demand forecasting for sales, electricity consumption and housing, but not enough analysis has been carried out on forecasting shipment demand in the shipping industry.

The aim of this study is to forecast container shipment demand for both the short-term and long-term. The dataset is sourced from the five major Australian ports operating internationally. In applied research, the gathering of a real-time dataset is a challenge. The real-time dataset (the Asia–Oceania trade lane) was chosen and made available by our industry partner, the Mizzen group, a digital pricing and rate management solution. The rest of this paper is organized as follows. In Section 2, a literature review of demand forecasting is summarized. Section 3 describes the research design. Section 4 presents the selected forecasting models. The experimental results and evaluation are discussed in Section 5. Section 6 outlines the conclusions and future directions.

2. Literature Review

Machine learning (ML) aids decision-making in almost every sector of life such as business, industrial engineering, medicine, physics and statistics [2]. However, there is no ML model that can forecast shipment demand to help make informed decisions. As seasonal variations drive the shipping industry, the prediction of future events is highly time dependent.

Forecasting from time series data is a long-standing problem. A time series allows future values to be predicted from the components of the series observed in the historical data. The moving average is the simplest forecasting method. It calculates the average of the most recent sample observations and uses it as the forecast for the next period. For each new sample, the average is recalculated with the new observation included and the oldest one removed. Thus, a forecast is computed for every new data observation [4]. The method can generate accurate forecasts for a time series with regular trends, whereas a series whose trends change with time may yield misleading forecasts [5]. Time series models can be categorized into three major classes, as shown in Figure 1.
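The moving average forecast described above can be sketched in a few lines of Python. This is a minimal illustration only; the function name, the window length and the sample values are assumptions made for the example and are not taken from the study.

```python
def moving_average_forecasts(series, window):
    """Return one-step-ahead forecasts from a simple moving average.

    Entry j is the forecast for period window + j, computed as the mean of
    the `window` observations that precede it. The last entry is therefore
    the forecast for the first period after the end of the series.
    """
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i - window:i]) / window
        for i in range(window, len(series) + 1)
    ]


# Illustrative weekly demand values (hypothetical, in TEUs).
demand = [10, 20, 30, 40]
forecasts = moving_average_forecasts(demand, window=2)
```

Because the series above has a steady upward trend, every forecast lags the value that follows it, which illustrates why the method struggles when trends change over time.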

2.1. Statistical Models

Statistical models are mathematical models that rely on a set of statistical assumptions about the data samples to provide forecasts. The weighted moving average is a variant of the simple moving average. In this method, larger weights are assigned to the more critical periods, which makes the method more sensitive to trends [5]. Simple exponential smoothing (SES) assumes that the forecasted data fluctuate around a constant level over time [6]. A variant of exponential smoothing is Holt–Winters’ non-seasonal method. It includes a trend term that measures the expected increase or decrease per unit period at the local mean level. Holt–Winters’ seasonal method is an extension of Holt–Winters’ non-seasonal method. A smoothing factor for each period of the year is added to adjust the forecast according to the expected seasonal fluctuation [7]. In the 1970s, Box and Jenkins presented auto-regressive (AR) and moving average (MA) models for time series prediction [8]. AR considers the current value of a time series to be a linear combination of its past values. In contrast, MA models the series as a function of the random disturbances that affect it. These models proved to be quite useful for predictions in their initial era. As the research continued, it was noticed that there are situations where the time series does not follow linear trends. Thus, a range of new models has been presented to cater for these needs [9]. The auto-regressive integrated moving average (ARIMA) is the most widely used model for time series forecasting [9,10]. ARIMA exploits the dependency between an observation and a residual error from a moving average model applied to lagged observations, together with the relationship between observations and lagged observations. It makes the time series stationary by differencing, i.e., by subtracting the previous observation from the current one. The ARIMA model has variants such as the seasonal ARIMA (SARIMA), which caters for the seasonal variations in a time series, and ARIMAX, which extends ARIMA with exogenous (external) explanatory variables.
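The differencing step that ARIMA and SARIMA use to make a series stationary can be sketched as follows. The helper name and the sample values are assumptions for illustration; a lag of 1 corresponds to ordinary differencing and a lag equal to the seasonal period corresponds to seasonal differencing.

```python
def difference(series, lag=1):
    """Subtract the observation `lag` steps earlier from each observation."""
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


# A series with an accelerating trend (hypothetical values).
trend_series = [3, 5, 8, 12]
first_diff = difference(trend_series)   # removes a linear trend
second_diff = difference(first_diff)    # removes a quadratic trend

# A series that repeats every 4 periods while drifting upwards.
seasonal_series = [1, 2, 3, 4, 2, 3, 4, 5]
seasonal_diff = difference(seasonal_series, lag=4)
```

Here two rounds of differencing reduce the trending series to a constant, and one round of seasonal differencing removes the repeating pattern.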

2.2. Hybrid Models

Hybrid models combine machine learning models and statistical models to cater for both linear and non-linear data more effectively. Artificial neural networks (ANNs) [11,12] have been found to be very efficient at handling the non-linearity of a time series. Support vector regression (SVR) can also handle the non-linear part of a time series well. Ebrahimian et al. [11] presented a novel method for energy demand prediction that combines SARIMA and SVR: SARIMA handled the linear data component and SVR handled the non-linear data components. Hybrid linear and non-linear models have also been employed for time series forecasting, with much of the focus on ANN and ARIMA models. ARIMA models handle the linear data components and ANN models handle the non-linear parts. In [9], ANNs were used for one-month-ahead price prediction in the liner shipping industry. The liner shipping industry is volatile and is impacted by seasonal variations, public holidays and travel routes. According to [13], ANNs can handle the volatility of the shipping industry and provide promising forecasts. However, ANNs suffer from overfitting problems. In [14], a new regression-based model was designed to forecast shipping container volumes, i.e., supply. The author claims that the designed regression model can cater for the non-stationary parts of a time series. In [15], support vector machines (SVMs) are used to forecast a dry bulk freight index.

2.3. Deep Learning-Based Models

Deep learning has more recently been explored for time series analysis. In [10], spot electricity prices are predicted using deep learning methods. The authors proposed four deep learning models for spot electricity price prediction: deep neural networks (DNNs), a hybrid long short-term memory DNN (LSTM-DNN), a hybrid GRU-DNN and convolutional neural networks (CNNs). The study inferred that deep learning methods outperform statistical and ANN-based models. In [12], research was conducted on Bitcoin price prediction by comparing LSTM, RNN and ARIMA. The results from the deep learning models were more promising than those from the other class of models.

2.4. Comparative Analysis of Existing Time Series Techniques

Based on the literature review of time series forecasting models, it is evident that plenty of work has been done in varied domains to perform predictions based on time series. Various models from different classes have been designed and applied in different domains. However, the application of time series models in the container shipping industry is still scant, and only a limited number of models have been applied to, or designed for, this domain. To the best of our knowledge, there exists no forecasting capability in the industry that can predict future demand based on seasonality and trends from past data. To cover this gap, we applied existing time series forecasting models that can handle the trends and seasonality inherently present in the dataset.

Table 1 shows the existing time series forecasting models and their application in the container shipping industry.

3. Research Design

Although researchers have been actively working on forecasting demand in various domains such as sales and electricity for the last few decades, demand forecasting for container shipment has received little attention. To our knowledge, there are no prominent studies that can assist in forecasting container shipment demand.

3.1. Methodology

To address the gap mentioned earlier, we performed a comparative study of three state-of-the-art time series models, namely, SARIMA, Holt–Winters’ seasonal method and Facebook’s Prophet, to forecast container shipment demand for both the short-term and long-term in the Australian container shipping industry. Three years of historical demand data, i.e., 2016–2018, were collected from five international Australian ports for the Asia–Oceania trade lane [16,17,18,19,20]. The root mean squared error (RMSE) and the mean absolute percentage error (MAPE) were used as evaluation metrics.

3.2. Data Sourcing and Cleansing

This section explains how the data were sourced and cleansed to make them suitable for demand forecasting. To our knowledge, there exists no dataset that can provide insights into the shipment demand in the Australian shipping industry. We started by collecting the real-time shipment demand datasets from five international ports to form a consolidated demand dataset. These ports were the Port of Melbourne [16], Fremantle Port [17], Flinders Port [19], the Port of Brisbane [18] and Port Botany [20], as shown in Figure 2. The detailed methodology for the shipment demand data collection and cleansing can be found in our research work presented in [21].

Figure 3 explains the data cleansing process of the data sources to get the trade lane-specific dataset ready to be used in the machine learning algorithms. The initial step of the data cleansing was to select the time horizon. Historic data from 2016–2018 were extracted. This selected dataset consisted of data from all the trade lanes operating from Australia (import and export). As this research study aimed to target only Asia–Oceania trade, trade lane-specific data to and from the Asia–Oceania trade lane were segregated.

The trade dataset acquired was a combination of full and empty containers coming into and going out of the Australian ports over the Asia–Oceania trade lane. Full containers are the containers (measured in TEUs) coming in and out filled with goods. In contrast, empty containers are those that are emptied before entering Australian waters or are moved back without goods in them. In the total shipment demand, both the filled and empty containers are equally important. The features in the dataset are shown in Table 2. In the table, the date refers to the first day of the week and the corresponding values cover that whole week. Hence, for both imports and exports, the dataset contains the total number of filled incoming (inbound) and outgoing (outbound) containers as well as empty containers.

As the scope of this study was demand forecasting for Asia–Oceania trade lane imports only, we separated the imports from the exports in the next step and retained only the imports. Table 3 shows the dataset features for the Asia–Oceania trade lane imports.

The total shipment import demand can be calculated by adding the total incoming empty containers ($\mathrm{Empty}_{\mathrm{Containers}}$) and the total incoming full containers ($\mathrm{Full}_{\mathrm{Containers}}$). This can be expressed as Equation (1). Features from the final dataset are shown in Table 4.

$$\mathrm{ImportDemand}_{\mathrm{Total}} = \mathrm{Empty}_{\mathrm{Containers}} + \mathrm{Full}_{\mathrm{Containers}}. \qquad (1)$$
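Equation (1) amounts to a row-wise sum over the weekly import records. The sketch below shows one way to compute it; the field names (`week_start`, `full_import_teu`, `empty_import_teu`) and the numbers are hypothetical, as the actual column names of the proprietary dataset are not reproduced here.

```python
def total_import_demand(records):
    """Apply Equation (1) to each weekly record.

    Each record is a dict holding the week's first day and the incoming
    full and empty container counts in TEUs (field names are assumed).
    """
    return [
        {
            "week_start": row["week_start"],
            "import_demand_teu": row["empty_import_teu"] + row["full_import_teu"],
        }
        for row in records
    ]


# Made-up example rows, for illustration only.
weekly_imports = [
    {"week_start": "2016-01-04", "full_import_teu": 900, "empty_import_teu": 100},
    {"week_start": "2016-01-11", "full_import_teu": 950, "empty_import_teu": 75},
]
demand = total_import_demand(weekly_imports)
```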

3.3. Missing Value Handling

Missing values were filled with the average of the shipment demand at similar timestamps in the previous and following years [22]. A box plot analysis showed that there were no outliers that could affect the performance of the models. Figure 4 shows the box plot of the demand dataset.
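The imputation rule above can be sketched as follows, assuming a regularly spaced weekly series in which the same timestamp one year earlier or later lies 52 positions away. The function name, the use of `None` for missing values and the fallback to a single neighbour at the edges of the series are assumptions made for this illustration.

```python
def fill_from_adjacent_years(values, period=52):
    """Fill missing entries (None) with the mean of the values one
    `period` before and one `period` after, where those exist."""
    filled = list(values)
    for i, value in enumerate(values):
        if value is not None:
            continue
        neighbours = [
            values[j]
            for j in (i - period, i + period)
            if 0 <= j < len(values) and values[j] is not None
        ]
        if neighbours:  # leave the gap if neither neighbour is available
            filled[i] = sum(neighbours) / len(neighbours)
    return filled


# Three years of synthetic weekly values with two gaps.
weekly = [float(i) for i in range(156)]
weekly[60] = None   # both neighbours (weeks 8 and 112) exist
weekly[5] = None    # only the following year's value (week 57) exists
repaired = fill_from_adjacent_years(weekly)
```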

Figure 5 shows the container shipment demand in TEUs (y-axis) for the Asia–Oceania trade lane (imports only) for the years 2016 through 2018 (x-axis), and Figure 6 shows the data description of the demand dataset.

3.4. Test-Train Split

Once the missing values were filled, the dataset was divided into two parts: the train and the test dataset (see Figure 7). A total of 70% of the data were used for training the model and the remaining 30% were used for testing it. For time series models, the usual random train-test split does not work, as the values are dependent on time [23], and random partitioning can produce misleading results. Hence, we selected a cut-off date corresponding to approximately 70% of the dataset, i.e., February 2018 (see the vertical red line in Figure 7). The data before the cut-off were used for training, to capture enough of the seasonality and trends of the time series under observation, and the rest were used as test data.
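A chronological split of this kind keeps every training observation earlier than every test observation. The following sketch uses a synthetic weekly index starting on Monday, 4 January 2016; the start date, the function name and the representation of observations as (date, value) pairs are assumptions for illustration.

```python
from datetime import date, timedelta


def chronological_split(observations, train_fraction=0.7):
    """Split (date, value) pairs at a cut-off position, never at random."""
    ordered = sorted(observations, key=lambda pair: pair[0])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# 156 synthetic weekly observations covering 2016-2018.
start = date(2016, 1, 4)
weekly = [(start + timedelta(weeks=i), float(i)) for i in range(156)]
train, test = chronological_split(weekly)
cutoff = test[0][0]  # first date held out for testing
```

With this synthetic index, a 70% cut lands in early February 2018, which is in line with the cut-off reported above.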

4. Forecasting Models

Three state-of-the-art time series models were selected to forecast shipment demand. These were SARIMA, Holt–Winters’ seasonal method and Prophet. All these models were capable of a time series analysis using the seasonality present in the historical data.

4.1. Seasonal Auto-Regressive Integrated Moving Average (SARIMA)

SARIMA is an extension of the auto-regressive integrated moving average (ARIMA) [24] with the additional capability of handling seasonality in a time series. Hence, we used SARIMA for our research problem, as it adds three seasonal orders $(P, D, Q)$, together with the seasonal period $m$, to the ARIMA parameters to cater for the seasonality in the time series. A SARIMA model is specified as:

$$\mathrm{SARIMA} = (p, d, q)\,(P, D, Q)^{m}. \qquad (2)$$

In (2), $m$ is the seasonality pattern and $p$ is the number of lag observations, extracted through the partial auto-correlation function (PACF) plot. The PACF may be considered to be the partial correlation between the series and its lagged values that is not explained by their mutual correlations with the intermediate lags. The parameter $d$ is the number of differencing operations required to make the time series stationary and $q$ is the size of the moving average window, set from the ACF value. A series with no trend has $d = 0$, whereas a series with a trend has $d \geq 1$. $P$ is the seasonal auto-regressive order and $D$ is the seasonal difference order. If the series has a stable seasonal pattern, then $D = 1$. If the seasonal pattern is unstable, then $D = 0$. $Q$ is the seasonal moving average order. Both $P$ and $Q$ are set from the ACF plots [24]. To determine the optimal $(p, d, q)\,(P, D, Q)^{m}$ values, we used the grid search method [24]. The selection criterion was the Akaike information criterion (AIC) value. Table 5 shows the parameters used in this research to fit SARIMA for container shipment demand.
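A grid search by AIC can be sketched as below. The search ranges, the function names and the choice of the statsmodels `SARIMAX` class as the fitting routine are assumptions for this illustration; the study does not state which software was used, and the selected parameters are those reported in Table 5, not an output of this sketch. The scoring function is passed in as an argument so that the search logic can be exercised without fitting any model.

```python
import itertools


def sarima_grid(p_values, d_values, q_values, P_values, D_values, Q_values, m):
    """Yield every (order, seasonal_order) combination in the grid."""
    for order in itertools.product(p_values, d_values, q_values):
        for seasonal in itertools.product(P_values, D_values, Q_values):
            yield order, seasonal + (m,)


def select_by_aic(series, candidates, aic_of):
    """Return (aic, order, seasonal_order) with the lowest AIC, or None."""
    best = None
    for order, seasonal_order in candidates:
        try:
            aic = aic_of(series, order, seasonal_order)
        except Exception:
            # Some combinations cannot be fitted; skip them.
            continue
        if best is None or aic < best[0]:
            best = (aic, order, seasonal_order)
    return best


def statsmodels_aic(series, order, seasonal_order):
    """Fit one SARIMA candidate with statsmodels and return its AIC.

    Requires the third-party statsmodels package (imported lazily).
    """
    from statsmodels.tsa.statespace.sarimax import SARIMAX

    model = SARIMAX(
        series,
        order=order,
        seasonal_order=seasonal_order,
        enforce_stationarity=False,
        enforce_invertibility=False,
    )
    return model.fit(disp=False).aic


# Hypothetical grid: each order in {0, 1}, seasonal period m = 12.
candidates = list(sarima_grid(range(2), range(2), range(2),
                              range(2), range(2), range(2), m=12))
# Usage with real data: select_by_aic(train_values, candidates, statsmodels_aic)
```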

4.2. Holt–Winters’ Seasonal Method

Holt–Winters’ methods are suitable for data with trends and seasonality [25]. Similar to SARIMA, the method has two variants: additive and multiplicative. Mathematically, the additive forecast can be written as Equation (3), where $L_{t}$ is the level, a weighted average of the seasonally adjusted observation and the non-seasonal forecast, $k b_{t}$ represents the trend of the data projected $k$ periods ahead and $S_{t + k - s}$ represents the seasonal pattern, with $s$ denoting the length of the seasonal cycle. These terms can be calculated using Equations (4)–(6), respectively. The coefficients $\alpha$, $\beta$ and $\gamma$ are smoothing factors and their values lie between 0 and 1.

$$F_{t + k} = L_{t} + k b_{t} + S_{t + k - s}. \qquad (3)$$

$$L_{t} = \alpha (y_{t} - S_{t - s}) + (1 - \alpha) (L_{t - 1} + b_{t - 1}). \qquad (4)$$

$$b_{t} = \beta (L_{t} - L_{t - 1}) + (1 - \beta) b_{t - 1}. \qquad (5)$$

$$S_{t} = \gamma (y_{t} - L_{t}) + (1 - \gamma) S_{t - s}. \qquad (6)$$

The parameters used to fit Holt–Winters’ seasonal methods are shown in Table 6.
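The additive recursions in Equations (3)–(6) can be sketched directly in Python. The initialisation used here (level as the mean of the first season, trend as the average change between the first two seasons, seasonal terms as deviations from the initial level) is an assumption made for this illustration; the study does not describe its initialisation, and the smoothing factors actually used are those in Table 6.

```python
def holt_winters_additive(y, s, alpha, beta, gamma, horizon):
    """Forecast `horizon` steps ahead with additive Holt-Winters."""
    if len(y) < 2 * s:
        raise ValueError("need at least two full seasons of data")

    # Assumed initial values (not taken from the paper).
    level = sum(y[:s]) / s
    trend = (sum(y[s:2 * s]) - sum(y[:s])) / (s * s)
    seasonal = [y[i] - level for i in range(s)]  # one term per phase

    for t in range(s, len(y)):
        prev_level, prev_trend = level, trend
        s_old = seasonal[t % s]  # S_{t-s}
        # Equation (4): level update.
        level = alpha * (y[t] - s_old) + (1 - alpha) * (prev_level + prev_trend)
        # Equation (5): trend update.
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        # Equation (6): seasonal update for this phase of the cycle.
        seasonal[t % s] = gamma * (y[t] - level) + (1 - gamma) * s_old

    last = len(y) - 1
    # Equation (3): F_{t+k} = L_t + k * b_t + S_{t+k-s}.
    return [
        level + k * trend + seasonal[(last + k) % s]
        for k in range(1, horizon + 1)
    ]


# A purely seasonal, trend-free toy series with a cycle of 4 periods.
toy = [10, 20, 30, 40] * 3
forecast = holt_winters_additive(toy, s=4, alpha=0.5, beta=0.5, gamma=0.5, horizon=4)
```

For this toy series the level stays at 25, the trend stays at 0 and the forecast simply repeats the seasonal cycle.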

4.3. Facebook’s Prophet

Facebook’s Prophet is an open-source time series forecasting model [26]. It is designed to have in-built parameters that can be adjusted without going deep into the model’s implementation details. At its core is a decomposable time series model whose components are the trend, seasonality and holidays. Mathematically, it can be written as:

$$y(t) = g(t) + s(t) + h(t) + \epsilon(t). \qquad (7)$$

In (7), $g(t)$ is a piece-wise linear or logistic growth curve, $s(t)$ is a periodic change (e.g., weekly/yearly seasonality), $h(t)$ is a user-provided holiday effect and $\epsilon(t)$ is an error term accounting for unusual changes. The role of the domain expert is significant in every phase of the modelling. The domain expert can tweak the Fourier order and identify whether the details present in the data points are noise or a trend. As Prophet treats forecasting as a curve-fitting problem, it is inherently robust to outliers and missing data. Custom-defined holidays can also be used when fitting this model, a capability that is not offered by the other models compared in this study. Hence, we defined the custom holidays as desired by our industry partner [27]. Table 7 shows the parameters of Prophet used to fit the model for container shipment demand forecasting. Table 8 shows the list of holidays used in this research.
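A Prophet model with custom holidays can be set up roughly as follows. This is a sketch under stated assumptions: the holiday names and dates, the prior scale value, the Monday-based weekly frequency and the helper names are illustrative and do not reproduce Table 7 or Table 8. The fitting function needs the third-party `pandas` and `prophet` packages, which are imported lazily so that the holiday helper can be used on its own.

```python
def build_holiday_rows(names_and_dates, lower_window=0, upper_window=0):
    """Build rows in the layout Prophet expects for custom holidays:
    a `holiday` name, a `ds` date and optional day windows around it."""
    return [
        {
            "holiday": name,
            "ds": day,
            "lower_window": lower_window,
            "upper_window": upper_window,
        }
        for name, day in names_and_dates
    ]


def fit_prophet(train_rows, holiday_rows, holidays_prior_scale=10.0, horizon_weeks=6):
    """Fit Prophet on rows with `ds` (date) and `y` (demand) keys and
    forecast `horizon_weeks` beyond the training data."""
    import pandas as pd
    from prophet import Prophet  # published as `fbprophet` in older releases

    model = Prophet(
        growth="linear",
        holidays=pd.DataFrame(holiday_rows),
        holidays_prior_scale=holidays_prior_scale,  # larger = stronger holiday effect
        yearly_seasonality=True,
    )
    model.fit(pd.DataFrame(train_rows))
    # "W-MON" assumes each weekly record is dated on a Monday.
    future = model.make_future_dataframe(periods=horizon_weeks, freq="W-MON")
    return model.predict(future)[["ds", "yhat", "yhat_lower", "yhat_upper"]]


# Hypothetical holiday list, for illustration only.
holiday_rows = build_holiday_rows([
    ("Christmas", "2016-12-25"),
    ("New Year", "2017-01-01"),
])
```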

5. Experimental Results and Discussion

This section explains the experimental results achieved by applying the selected models to the shipping dataset. The first task was to analyze the dataset to understand the time series under observation, starting with a time series decomposition. Figure 8 shows the decomposition of the demand time series into its components: the trend, the cycle or seasonality and the residual. The observed series is the combination of its residual, trend and seasonality (see Figure 8a). The trend component explained the variation of the variable under observation with respect to time (see Figure 8b), whereas the seasonal component depicted the seasonality existing in the time series (see Figure 8c). The residual is what is left of the time series once the trend and seasonality are removed; at this stage, it may be considered to be the noise added to the observed time series (see Figure 8d). From Figure 8, it is evident that the shipment demand dataset was non-stationary in nature.

The overall trend of the dataset for the whole time series showed a sudden decrease in the shipment demand after the first quarter of 2017, which gradually increased throughout 2018. However, there was no effect of holidays over the shipment demand. This is understandable, as explained by our industry partner, the Mizzen group [27]. According to industry experts, the shipping industry does not close on any day. When we looked at the weekly trend of the shipment demand (see Figure 9), at the start of every week (i.e., on Monday), demand suddenly rose and then became consistent after a sudden fall on Tuesday. Moreover, looking at the shipment demand dataset’s monthly trend graphs, it was evident that the shipment demand started to rise from July until November and then started falling, reaching a minimum value in February.

The change point analysis in Figure 10 presents the change points in the shipping dataset. The graph shows that there was a change in demand between May 2016 and September 2017. However, the changes were well within the range of the shipping infrastructure, hence suggesting a linear growth of demand. Figure 11 shows the auto-correlation function (ACF) plot of the shipping dataset. From the ACF plot, it is clear that there was an apparent relation between the current and past values of the shipment demand. However, the correlations alternated between positive and negative values, owing to the trends and seasonality in the dataset.
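The sample auto-correlation plotted in an ACF graph can be computed as shown in this sketch. The function name and the toy series are assumptions; the alternating toy series is chosen only to show how the correlation can switch sign from one lag to the next.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag.

    Each value is the lagged covariance of the mean-centred series
    divided by the lag-0 covariance, so lag 0 is always 1.
    """
    n = len(series)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must be between 0 and len(series) - 1")
    mean = sum(series) / n
    centred = [x - mean for x in series]
    denominator = sum(c * c for c in centred)
    if denominator == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    return [
        sum(centred[t] * centred[t + lag] for t in range(n - lag)) / denominator
        for lag in range(max_lag + 1)
    ]


# A toy series that flips sign every period.
alternating = [1, -1] * 10
acf = autocorrelation(alternating, max_lag=2)
```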

Once the time series dataset was analyzed, SARIMA was trained over the training data using the parameters shown in Table 5 and was then tested over the test data. As explained earlier, SARIMA’s configuration requires finding an optimal combination of parameters, which was achieved by performing a grid search. The combination with the minimum AIC value was used to fit the model. For our dataset, the configuration (1,1,1)(0,1,1) with m = 12 provided the minimum AIC value and was therefore used in this research. Finally, demand forecasts were made using the trained model for both the short-term (i.e., 6 weeks) and the long-term (i.e., 52 weeks). Figure 12a shows the test forecast provided by SARIMA. The train and test forecast can be seen in Figure 12b. The short-term and long-term forecasts are shown in Figure 12c,d, respectively.

Continuing the experiment, Holt–Winters’ seasonal model was trained. From the decomposition of the time series, it was evident that the dataset was seasonal in nature and was suitable for additive seasonality. The parameters are shown in Table 6. The test forecast provided by Holt–Winters’ seasonal model is shown in Figure 13a. The forecast over both the train and test data can be seen in Figure 13b. The short-term (i.e., 6 weeks) and long-term (i.e., 52 weeks) demand forecasts are shown in Figure 13c,d, respectively.

Finally, Prophet was trained using the parameters tabulated in Table 7 and Table 8. Prophet offers two additional flexible tuning options for researchers, i.e., custom-defined holidays and custom change point definitions. In addition, the model provides the ability to impose a dominant or recessive effect of holidays, based on the requirements, by modifying the parameter named ‘Holiday Prior Scale’ (see Table 7). The greater the value of this parameter, the greater the effect of the holidays on the forecasts. Furthermore, the model provides an edge over other models by offering control over weekly, monthly and yearly seasonality. Based on the dataset, each of these could be set to true or false (see Table 7). The test forecast provided by Prophet is shown in Figure 14a. The collective train and test forecast is shown in Figure 14b. The short-term (i.e., 6 weeks) and long-term (i.e., 52 weeks) forecasts are presented in Figure 14c,d, respectively.

Evaluation and Discussion

An accurate evaluation is fundamental to identifying the best-fitting model. We selected the root mean squared error (RMSE) and mean absolute percentage error (MAPE) to evaluate the performance of the selected models. The RMSE can be computed using Equation (8):

$$\mathrm{RMSE} = \sqrt{\overline{(f - o)^{2}}}. \qquad (8)$$

In (8), $f$ is the forecast, $o$ is the observed value and the overbar denotes the mean over all observations. The MAPE measures the forecast accuracy as a percentage and can be calculated as Equation (9):

$$\mathrm{MAPE} = \frac{1}{N} \sum_{t = 1}^{N} \left| \frac{A_{t} - F_{t}}{A_{t}} \right|. \qquad (9)$$

In (9), $A_{t}$ is the actual value, $F_{t}$ is the forecast and $N$ is the number of observations; the result is multiplied by 100 to express it as a percentage. Figure 15 shows the comparison of the RMSE achieved by the selected models (for both the short-term and long-term). Figure 14 shows the RMSE achieved using the train and test data. Prophet offered a lower RMSE for both the train and test data than its competitors, i.e., SARIMA and Holt–Winters’ seasonal model. Holt–Winters’ seasonal model provided a lower RMSE than SARIMA over the training dataset, but its performance degraded over the test data. Thus, it could be concluded that Prophet outperformed both SARIMA and Holt–Winters’ seasonal model.
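The two metrics in Equations (8) and (9) can be sketched as plain Python functions. The function names and the sample numbers are illustrative assumptions; the MAPE is multiplied by 100 here so that it reads as a percentage, and it is undefined when any actual value is zero.

```python
import math


def rmse(forecast, observed):
    """Root mean squared error, Equation (8)."""
    if len(forecast) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, forecast):
    """Mean absolute percentage error, Equation (9), in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical actual and forecast demand values (TEUs).
actual = [100, 200]
predicted = [110, 180]
error_rmse = rmse(predicted, actual)
error_mape = mape(actual, predicted)
```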

This conclusion was supported by the comparative MAPE values of the selected algorithms, as presented in Figure 16. Prophet reached MAPEs of approximately 4% and 11% for the train and test datasets, respectively. In contrast, the train and test MAPEs provided by SARIMA were 62% and 36%, respectively, and Holt–Winters’ seasonal method attained MAPEs of 17% and 12% for the train and test datasets, respectively. Hence, it was concluded that Prophet outperformed both Holt–Winters’ seasonal method and SARIMA.

Table 9 shows the experiment results comparing the RMSE and MAPE for both the test and train datasets for SARIMA, Prophet and Holt–Winters’ seasonal method.

Finally, we performed a walk-forward cross-validation to analyze the best fitted model. As the shipping dataset we used as our source was for three years, only a 2-fold cross-validation could be performed. To do so, we divided the dataset into two groups, i.e., train and validation sets. Figure 17 shows the dataset division into train (in blue) and the validation dataset (in red). The average RMSE for all the three selected algorithms was observed. The results from the 2-fold walk-forward cross-validation are tabulated in Table 10. The observation of the average RMSE values depicted similar results to those explained earlier in this section. The walk-forward 2-fold cross-validation showed that Prophet surpassed both SARIMA and Holt–Winters’ seasonal method.
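A walk-forward cross-validation can be sketched as below. The fold layout is an assumption for illustration (an expanding training window, with the first fold training on the first third of the data and validating on the second third, and the second fold training on the first two thirds and validating on the rest); the study’s exact division is the one shown in Figure 17. The naive last-value forecaster is a hypothetical stand-in for any of the three models.

```python
import math


def walk_forward_splits(n_obs, n_folds=2):
    """Return (train_indices, validation_indices) pairs in time order."""
    if n_obs < n_folds + 1:
        raise ValueError("not enough observations for the requested folds")
    fold_size = n_obs // (n_folds + 1)
    splits = []
    for fold in range(1, n_folds + 1):
        train_end = fold_size * fold
        # The last fold absorbs any leftover observations.
        val_end = fold_size * (fold + 1) if fold < n_folds else n_obs
        splits.append((range(0, train_end), range(train_end, val_end)))
    return splits


def rmse(forecast, observed):
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(squared))


def walk_forward_rmse(values, fit_predict, n_folds=2):
    """Average RMSE over the folds; `fit_predict(train, horizon)` must
    return `horizon` forecasts using only the training values."""
    scores = []
    for train_idx, val_idx in walk_forward_splits(len(values), n_folds):
        train = [values[i] for i in train_idx]
        observed = [values[i] for i in val_idx]
        scores.append(rmse(fit_predict(train, len(observed)), observed))
    return sum(scores) / len(scores)


def naive_last_value(train, horizon):
    """Placeholder model: repeat the last training value."""
    return [train[-1]] * horizon


folds = walk_forward_splits(156, n_folds=2)
average_rmse = walk_forward_rmse(list(range(6)), naive_last_value, n_folds=2)
```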

6. Conclusions and Future Work

In this research, three time series forecasting models were applied to a real-time shipping dataset to forecast short-term and long-term shipment demand for the Australian shipping industry, specifically for the Asia–Oceania trade lane (imports only). Shipment demand forecasting was performed with respect to time in the historical data while incorporating seasonality and volatility. To analyze the performance of the selected models, the MAPE and RMSE values were observed. The evaluation results suggest that Facebook’s Prophet outperformed Holt–Winters’ seasonal method and SARIMA for both short-term and long-term demand forecasting by offering lower RMSE and MAPE values on both the train and test datasets. In addition, Prophet offered the flexibility of incorporating custom-designed holidays into its forecasting results.

The study of the existing literature revealed that there is limited work on forecasting shipment demand in the shipping industry. In the absence of real-time visibility into the operations of the container shipping supply chain, the industry is experiencing a considerable loss of revenue. Hence, we selected three state-of-the-art time series forecasting models to forecast container shipment demand in both the short-term and long-term, consequently providing a real-time insight into future shipment demand for making informed pricing and planning decisions. This research work is one of the first studies performed so far for the Australian shipping industry.

There exist a few limitations to this research that can offer future research directions. We forecasted the shipment demand for a single trade lane’s import and with just two variables as desired by the research partner (due to their business requirement). The forecasting can be expanded for other operating trade lanes for both imports and exports and with more complex variables. Apart from other time series algorithms, deep learning models can also be applied to the dataset to determine the performance and find an even better performing forecasting model.

Author Contributions

Conceptualization, A.U. and F.H.; methodology, A.U.; software, A.U.; validation, A.U., F.H. and M.S.; formal analysis, A.U.; investigation, A.U.; resources, A.U.; data curation, A.U.; writing—original draft preparation, A.U.; writing—review and editing, A.U., F.H. and M.S.; visualization, A.U.; supervision, F.H.; project administration, F.H.; funding acquisition, F.H. All authors have read and agreed to the published version of the manuscript.

Funding

Research was funded by University of Technology Sydney and Mizzen Group, Sydney.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset is proprietary to Mizzen and their allied industry partners. Hence, we have not provided the data for public access.

Acknowledgments

The Mizzen Group funded this research. The Mizzen Group (www.mizzengroup.com, accessed on 15 July 2021) is a digital pricing and rate management solution. The Mizzen team combines innovative digital capability in the shipping industry. The company delivers software for freight sellers, shipping lines and freight forwarders to set and distribute prices dynamically and in new ways to their customers in the digital channel. This enables them to deliver new products with a range of valuable attributes to serve their customers’ needs.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Lun, Y.H.V.; Lai, K.; Cheng, T.C.E. Shipping and Logistics Management; Springer: London, UK, 2010.
2. Drewry. Technology to Reduce Freight Rate Volatility and Capacity Risks 2019. Available online: https://www.drewry.co.uk/white-papers (accessed on 15 August 2019).
3. Kilimci, Z.H.; Akyuz, A.O.; Uysal, M.O.; Akyokus, S.; Bulbul, B.A.; Ekmis, M.A. An Improved Demand Forecasting Model Using Deep Learning Approach and Proposed Decision Integration Strategy for Supply Chain. Complexity 2019, 2019, 1–15.
4. Cortinhas, C.; Black, K. Statistics for Business and Economics, European ed.; Wiley: Chichester, UK, 2012.
5. Taylor, J.W. Exponentially weighted information criteria for selecting among forecasting models. Int. J. Forecast. 2008, 24, 513–524.
6. Ostertagova, E.; Ostertag, O. Forecasting using simple exponential smoothing method. Acta Electrotech. Inform. 2012, 12, 62–66.
7. Chatfield, C. Time Series Forecasting, illustrated ed.; CRC Press: Boca Raton, FL, USA, 2000.
8. Stellwagen, E.; Tashman, L. ARIMA: The Models of Box and Jenkins. Foresight Int. J. Appl. Forecast. 2013, 30, 28–33.
9. Biyik, A.C.; Tanyeri, M. Pricing Decisions in Liner Shipping Industry: A Study on Artificial Neural Networks. J. Mark. Mark. Res. 2018, 21, 125–150.
10. Lago, J.; De Ridder, F.; De Schutter, B. Forecasting spot electricity prices: Deep learning approaches and empirical comparison of traditional algorithms. Appl. Energy 2018, 221, 386–405.
11. Ebrahimian, H.; Barmayoon, S.; Mohammadi, M.; Ghadimi, N. The price prediction for the energy market based on a new method. Econ. Res.-Ekon. Istraživanja 2018, 31, 313–337.
12. McNally, S.; Roche, J.; Caton, S. Predicting the Price of Bitcoin Using Machine Learning. In Proceedings of the 2018 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP), Cambridge, UK, 21–23 March 2018.
13. Voyant, C.; Notton, G.; Kalogirou, S.; Nivet, M.-L.; Paoli, C.; Motte, F.; Fouilloy, A. Machine learning methods for solar radiation forecasting: A review. Renew. Energy 2017, 105, 569–582.
14. Chou, C.-C.; Chu, C.-W.; Liang, G.-S. A modified regression model for forecasting the volumes of Taiwan’s import containers. Math. Comput. Model. 2008, 47, 797–807.
15. Han, Q.; Yan, B.; Ning, G.; Yu, B. Forecasting Dry Bulk Freight Index with Improved SVM. Math. Probl. Eng. 2014, 2014, 12.
16. Port of Melbourne, VIC, Australia. Available online: www.portofmelbourne.com/about-us/trade-statistics/monthly-trade-reports/ (accessed on 27 May 2019).
17. Fremantle Ports, WA, Australia. Available online: www.fremantleports.com.au/trade-business/container-traffic-reports (accessed on 27 May 2019).
18. Port of Brisbane, QLD, Australia. Available online: www.portbris.com.au/Operations-and-Trade/Trade-Development/ (accessed on 27 May 2019).
19. Flinders Ports, SA, Australia. Available online: www.flindersports.com.au/ports-facilities/port-statistics/ (accessed on 27 May 2019).
20. Port Botany, NSW, Australia. Available online: www.nswports.com.au/resources/trade-results/ (accessed on 27 May 2019).
21. Ubaid, A.; Hussain, F.; Charles, J. Modeling Shipment Spot Pricing in the Australian Container Shipping Industry: Case of ASIA-OCEANIA trade lane. Knowl.-Based Syst. 2020, 210, 106483.
22. Missing Value Handling. Available online: https://scikit-learn.org/stable/modules/impute.html (accessed on 18 May 2020).
23. Time Series Split. Available online: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.TimeSeriesSplit.html (accessed on 25 August 2020).
24. Arlt, J.; Trcka, P. Automatic SARIMA modeling and forecast accuracy. Commun. Stat.-Simul. Comput. 2019, 1–22.
25. Pongdatu, G.; Putra, Y. Seasonal Time Series Forecasting using SARIMA and Holt Winter’s Exponential Smoothing. IOP Conf. Ser. Mater. Sci. Eng. 2018, 407, 012153.
26. Facebook Prophet. Available online: https://facebook.github.io/prophet/ (accessed on 22 September 2019).
27. Mizzen Group Pty Ltd. Available online: https://www.mizzengroup.com/ (accessed on 20 November 2019).

Figure 1. Classification of the time series models.

Figure 2. Data sources for the shipment demand dataset.

Figure 3. Data cleansing process.

Figure 4. Box plot for the container shipment demand dataset.

Figure 5. Shipment demand visual dataset.

Figure 6. Data description of the demand dataset [21].

Figure 7. Train-test split of the demand dataset (70–30%).

Figure 8. Shipment demand dataset decomposition.

Figure 9. Shipping dataset trend analysis.

Figure 10. Shipping dataset change point analysis.

Figure 11. ACF plot for the shipment demand dataset.

Figure 12. Demand forecast by SARIMA: (a) test forecast, (b) train-test forecast, (c) short-term forecast, (d) long-term forecast.

Figure 13. Demand forecast by Holt–Winters’ seasonal method: (a) test forecast, (b) train-test forecast, (c) short-term forecast, (d) long-term forecast.

Figure 14. Demand forecast by Prophet: (a) test forecast, (b) train-test forecast, (c) short-term forecast, (d) long-term forecast.

Figure 15. Comparative RMSE of the forecasting models.

Figure 16. Comparative MAPE values of the forecasting models.

Figure 17. Walk-forward 2-fold cross-validation dataset split.

Table 1. Existing time series forecasting models and their application in the container shipping industry.

Models	Application in the Shipping Industry	Domain	Supporting Models
Simple Exponential Smoothing (SES)	×	-	-
Holt–Winters’ Non-Seasonal Method	×	-	-
Holt–Winters’ Seasonal Method	×	-	-
Auto-Regressive (AR)	×	-	-
Moving Average (MA)	×	-	-
Auto-Regressive Integrated Moving Average (ARIMA)	√	Price Prediction	ANN
Seasonal Auto-Regressive Integrated Moving Average (SARIMA)	×	-	-
Facebook’s Prophet	×	-	-
Artificial Neural Network (ANN)	√	Price Prediction	ARIMA
Support Vector Regression (SVR)	√	Price Prediction	ARIMA
Support Vector Machine (SVM)	√	Freight Index	NA
Long Short-Term Memory (LSTM)	×	-	-
Recurrent Neural Network (RNN)	×	-	-

Table 2. List of demand dataset features.

Region	Feature Name	Description
Asia–Oceania	Imports	Number of Inbound Containers
	Full	Total Inbound Container (Filled)
	Empty	Total Inbound Container (Empty)
	Date	Starting Day of the Week
	Exports	Number of Outbound Containers
	Full	Total Outbound Container (Filled)
	Empty	Total Outbound Container (Empty)
	Date	Starting Day of the Week

Table 3. Demand dataset features for Asia–Oceania imports.

Region	Feature Name	Description
Asia–Oceania (Imports)	Full	Total Inbound Container (Filled)
	Empty	Total Inbound Container (Empty)
	Date	Starting Day of the Week

Table 4. Final shipment demand dataset.

Region	Feature Name	Description
Asia–Oceania (Imports)	Demand	Total Demand (Full + Empty Containers)
Asia–Oceania (Imports)	Date	Starting Day of the Week

Table 5. SARIMA parameters for demand forecasting.

Parameters	Value
Order (p, d, q)	(1,1,1)
Seasonal Order (P, D, Q, m)	(0,1,1,12)
m	12
Stationarity	False
Invertibility	False
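
The settings in Table 5 could be passed to a SARIMA implementation roughly as follows. This is a minimal sketch, assuming the statsmodels `SARIMAX` class was the implementation (the table does not name one) and that the two boolean rows correspond to its `enforce_stationarity` and `enforce_invertibility` flags. The helper names `sarima_kwargs` and `fit_sarima` are hypothetical.

```python
def sarima_kwargs():
    """Table 5 settings expressed as SARIMAX keyword arguments (assumed mapping)."""
    return {
        "order": (1, 1, 1),               # (p, d, q)
        "seasonal_order": (0, 1, 1, 12),  # (P, D, Q, m) with m = 12
        "enforce_stationarity": False,    # assumption: the "Stationarity" row
        "enforce_invertibility": False,   # assumption: the last boolean row
    }


def fit_sarima(train_series):
    """Fit a SARIMA model on a 1-D demand series (hypothetical helper)."""
    # Imported lazily so the settings can be inspected without statsmodels.
    from statsmodels.tsa.statespace.sarimax import SARIMAX

    model = SARIMAX(train_series, **sarima_kwargs())
    return model.fit(disp=False)  # disp=False silences optimizer output
```

Calling `fit_sarima(train)` returns a fitted results object whose `forecast(steps=n)` method produces out-of-sample predictions.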

Table 6. Holt–Winters’ seasonal method parameters for demand forecasting.

Parameters	Value
Seasonal Period	12
Trend	Additive
Seasonal	Additive
Damped	True
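
As a hedged illustration of Table 6, the same four settings map onto the statsmodels `ExponentialSmoothing` class as sketched below. The choice of statsmodels is an assumption, as are the helper names. Note that the keyword `damped_trend` is used by statsmodels 0.12 and later, while older releases call it `damped`.

```python
def holt_winters_kwargs():
    """Table 6 settings as ExponentialSmoothing keyword arguments (assumed mapping)."""
    return {
        "trend": "add",          # additive trend
        "damped_trend": True,    # damped trend component
        "seasonal": "add",       # additive seasonality
        "seasonal_periods": 12,  # seasonal period from Table 6
    }


def fit_holt_winters(train_series):
    """Fit Holt-Winters' seasonal method on a 1-D demand series (hypothetical helper)."""
    # Imported lazily so the settings can be inspected without statsmodels.
    from statsmodels.tsa.holtwinters import ExponentialSmoothing

    model = ExponentialSmoothing(train_series, **holt_winters_kwargs())
    return model.fit()
```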

Table 7. Prophet parameters for demand forecasting.

Parameters	Value
Growth	Linear
Holidays	Custom
Holiday Prior Scale	40
Seasonality Mode	Additive
Daily Seasonality	False
Weekly Seasonality	True
Yearly Seasonality	True
Monthly Seasonality Fourier Order	20
Monthly Period	30
Yearly Seasonality Fourier Order	15
Yearly Period	365

Table 8. Custom-defined holiday list.

Holiday Name	Dates
Golden Week	29 April 2019, 30 April 2019, 1 May 2019, 2 May 2019, 3 May 2019, 4 May 2019, 5 May 2019, 6 May 2019
Australian Holidays	29 December 2019, 19 April 2019, 2 May 2019
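
The Prophet settings in Table 7 and the holiday list in Table 8 might be combined as in the sketch below. It assumes the `prophet` Python package (formerly `fbprophet`) and pandas. The helper names are hypothetical, and passing `yearly_seasonality=15` is one possible way to request a Fourier order of 15; it uses Prophet's default yearly period rather than the 365 days listed in Table 7.

```python
from datetime import datetime

# Dates copied from Table 8.
CUSTOM_HOLIDAYS = {
    "Golden Week": [
        "29 April 2019", "30 April 2019", "1 May 2019", "2 May 2019",
        "3 May 2019", "4 May 2019", "5 May 2019", "6 May 2019",
    ],
    "Australian Holidays": ["29 December 2019", "19 April 2019", "2 May 2019"],
}


def holiday_rows():
    """Return one {'holiday', 'ds'} record per listed date, as Prophet expects."""
    rows = []
    for name, dates in CUSTOM_HOLIDAYS.items():
        for text in dates:
            day = datetime.strptime(text, "%d %B %Y").date()
            rows.append({"holiday": name, "ds": day})
    return rows


def build_prophet():
    """Configure (but do not fit) a Prophet model with the Table 7 settings."""
    # Imported lazily so holiday_rows() works without pandas or prophet.
    import pandas as pd
    from prophet import Prophet

    holidays = pd.DataFrame(holiday_rows())
    holidays["ds"] = pd.to_datetime(holidays["ds"])
    model = Prophet(
        growth="linear",
        holidays=holidays,
        holidays_prior_scale=40,
        seasonality_mode="additive",
        daily_seasonality=False,
        weekly_seasonality=True,
        yearly_seasonality=15,  # assumption: integer sets the Fourier order
    )
    # Monthly seasonality is not built in, so it is added explicitly.
    model.add_seasonality(name="monthly", period=30, fourier_order=20)
    return model
```

The returned model would then be fitted on a data frame with columns `ds` (date) and `y` (demand), which is the input layout Prophet requires.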

Table 9. Experiment results comparing SARIMA, Prophet and Holt–Winters’ seasonal method.

Method	Train RMSE	Test RMSE	Train MAPE (%)	Test MAPE (%)
SARIMA	33,703	8491	62	36
Prophet	2414	6737	4	11
Holt–Winters’ Seasonal Method	9354	21,892	17	12
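
For reference, the two metrics reported in Table 9 can be computed as in the following sketch, which uses the standard definitions of RMSE and MAPE in plain Python. The function names and the toy numbers in the usage lines are illustrative assumptions and are not taken from the study's data.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, forecast):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    ratios = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


# Toy values for illustration only.
toy_actual = [100.0, 200.0]
toy_forecast = [110.0, 190.0]
toy_rmse = rmse(toy_actual, toy_forecast)
toy_mape = mape(toy_actual, toy_forecast)
```

RMSE is expressed in the units of the series (containers), whereas MAPE is scale-free, which is why the two metrics can rank models differently.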

Table 10. Walk-forward 2-fold cross-validation results.

No of Attributes	Total Sample Size	Sample Size	Model	RMSE	Avg. RMSE
1	156	52–104	Prophet	2807	2787
		104–52	Prophet	2767	2787
		52–104	SARIMA	25,071	20,171
		104–52	SARIMA	157,071	20,171
		52–104	Holt–Winters	3840	3351
		104–52	Holt–Winters	2862	3351
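
The fold layout in Table 10 can be sketched as below, assuming the "Sample Size" column gives train and test sizes in that order, with the training window always anchored at the start of the 156 weekly samples. This reading of the column is an assumption, and `walk_forward_splits` and `average` are hypothetical helper names.

```python
def walk_forward_splits(n_samples, train_sizes):
    """Yield (train_indices, test_indices); the test set is everything after the training window."""
    for size in train_sizes:
        yield range(0, size), range(size, n_samples)


def average(values):
    """Arithmetic mean of the per-fold scores."""
    return sum(values) / len(values)


# Two folds over 156 samples: train on 52 then 104 observations.
fold_sizes = [(len(train), len(test))
              for train, test in walk_forward_splits(156, [52, 104])]

# Per-fold RMSE values reported for Prophet in Table 10.
prophet_avg_rmse = average([2807, 2767])
```

Keeping every training window ahead of its test window preserves temporal order, so no future observations leak into model fitting.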

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

