Prediction and Analysis of Container Terminal Logistics Arrival Time Based on Simulation Interactive Modeling: A Case Study of Ningbo Port

Wang, Ruoqi; Li, Jiawei; Bai, Ruibin

doi:10.3390/math11153271


by Ruoqi Wang ¹,²,*, Jiawei Li ² and Ruibin Bai ²

¹ College of Information and Intelligent Engineering, Zhejiang Wanli University, Ningbo 315104, China

² School of Computer Science, University of Nottingham Ningbo China, Ningbo 315104, China

* Author to whom correspondence should be addressed.

Mathematics 2023, 11(15), 3271; https://doi.org/10.3390/math11153271

Submission received: 30 March 2023 / Revised: 7 July 2023 / Accepted: 7 July 2023 / Published: 25 July 2023

(This article belongs to the Special Issue Advances in Statistical Modeling)


Abstract

This study presents a data-driven analysis of container terminal transfer data based on simulation interactive modeling technology. In the context of a container yard, a model was established to analyze and predict the arrival time of container transports, and the factors influencing it, using data from the yard’s control center. The economic benefit index in the index system was determined through expert consultation; for an existing automated terminal, the indexes can be obtained from the terminal’s actual operating parameters, whereas for a terminal yet to be built they are obtained mainly through simulation modeling. Therefore, when determining the design scheme before constructing an automated container terminal, a terminal simulation model needs to be established that meets the requirements of loading and unloading operations and terminal production operations. In addition, an automated container terminal simulation model needs to be implemented to verify the feasibility of the evaluation model. The results reveal that the accuracy of the current prediction model is still limited: the highest accuracy is only 72%, regardless of whether continuous or discrete variables, or traffic or weather variables, are used. Moreover, the study indicates that the relationship between weather and specific time factors and the arrival time of containers is weak, even negligible. This study provides guidance and decision-making support for the construction of automated terminals.

Keywords:

driving analysis; simulation interactive modeling; container terminal transshipment data

MSC:

70-10

1. Introduction

A container yard connects the transport flows in a supply chain. Trucks arrive at the port yard, unload and load containers full of goods, and then continue to their next destination. In modern logistics supply chains, container yards are therefore vital hubs. Transport companies collect export containers from production facilities or warehouses and deliver them to the container yard, which is then responsible for unloading, loading, storing, and arranging the onward transport of each container to its destination. A transport company usually delivers outbound containers from their origin to the container terminal and, on the same trip, picks up inbound containers from the yard for other customers, that is, the parties that ordered the containerized cargo. Late arrivals present a problem for the port yard. For example, a late container holds up the queue, and other containers may be waiting for the delayed truck. Similarly, early arrivals can cause heavy traffic and queues in the port yard. Both early and late arrivals thus prevent the port yard from making full use of its loading and unloading capacity. The port yard is plagued by inaccurate timing and, more generally, by poor synchronization of the trucks’ arrivals and departures (Van Belle, Valckenaers and Cattrysse 2012) [1]. Accurately estimating the arrival time of containers is therefore crucial for the smooth operation of the logistics supply chain. Delays can disrupt transportation schedules, hold up cargo, and interrupt the supply chain, causing economic losses and inconvenience to transportation companies and other stakeholders. It is thus important to study how a container’s arrival time can be estimated accurately and which factors influence it. This study aims to fill gaps in existing research, address these practical issues, and provide useful information and insight for the decisions of transportation companies and terminal yards.

To solve the synchronization problem, a method for reliably determining the arrival time of a container needs to be developed. If the arrival time is known, the port yard can act accordingly. Despite numerous studies on travel time and transport time prediction, few studies combine these fields. This study uses a data-driven analytical method to estimate the arrival time of export containers, an approach that has not been widely used in past studies. A prediction model was established by analyzing the control center data of the Ningbo terminal yard. The approach applies big data analysis technology with the aim of improving prediction accuracy, and it can help transportation companies and terminal yards to plan and manage logistics transportation with greater efficiency and accuracy.

Port yards can respond appropriately to exceptions by predicting container arrival time, as this study sought to achieve.

The following research questions are addressed:

How is a model built to explain and predict the time the container reaches the port yard?

Based on the literature/theory, what are the most critical factors?
According to the trucker, what factors are the most significant?
Can these factors be combined into the model for predicting when containers arrive at the port yard?

In a case study conducted in Ningbo, data from the Meishan port terminal were employed to develop a model for predicting when containers are expected to arrive at the terminal.

The objectives of the predictive model developed here are presented and the relevance of the prediction model to the port yard is explained.

A reliable plan matters to the port yard because the plan specifies the stages in which containers must be moved in preparation for the arrival of other containers. It also determines the number of personnel needed. Late goods accumulate at the terminal, and early-arriving truckers cause congestion in the yard. To limit the knock-on delay, the current plan lets delayed trucks bypass the trucks waiting in line, so that the goods on the delayed trucks do not hold up other trucks.

Staff at the port suspect that some truckers abuse this policy. A driver may be deliberately late because he knows he will be sent through the green passage instead of waiting at the gate. Seen this way, the port yard rewards lateness, which creates a perverse incentive. The port management firm has considered two options:

A: Penalize lateness. The port yard is reluctant to do this because the truckers’ companies are its customers.
B: Query the truckers’ location. This option is currently not feasible since the port terminal cannot know in advance who the trucker is.

The staff is therefore looking for alternative solutions, such as using historical patterns to predict container arrival time and considering environmental factors (e.g., weather, congestion).

There may be two objectives for predicting the arrival time of containers, according to Shmueli and Koppius (2011) [2]. One is to calculate accurate numerical results. For example, a container may be 37 minutes late. The other is to calculate a time range. For example, a container will be 30–60 minutes late. The latter prediction method is adopted here. It is unnecessary for the container yard to accurately specify the arrival time to the minute. Therefore, it is good enough if the truck’s arrival time can be determined within a specific interval.
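To make the interval-style target concrete, the following minimal sketch maps a lateness value (actual minus planned arrival, in minutes) to a labelled range. The bin edges and the function name `lateness_interval` are illustrative assumptions; they are not the intervals used in the study.

```python
from bisect import bisect_right

# Hypothetical bin edges in minutes (negative = early, positive = late).
# Illustrative only; the study's actual intervals are not reproduced here.
EDGES = [-60, -30, 0, 30, 60]


def lateness_interval(planned_min: float, actual_min: float) -> str:
    """Return the interval label for the lateness (actual - planned), in minutes."""
    lateness = actual_min - planned_min
    i = bisect_right(EDGES, lateness)  # number of edges <= lateness
    if i == 0:
        return f"< {EDGES[0]}"
    if i == len(EDGES):
        return f">= {EDGES[-1]}"
    return f"[{EDGES[i - 1]}, {EDGES[i]})"


if __name__ == "__main__":
    # A container planned for minute 600 that arrives at minute 637 is 37 minutes late.
    print(lateness_interval(600, 637))
```

Predicting a label of this kind turns the task into classification rather than regression, which matches the preference for time ranges stated above.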

To develop the container arrival time prediction model, the predictive analysis method proposed by Shmueli and Koppius was used. These two researchers emphasized the significance of predictive analysis and its distinction from causal explanatory statistical models, which are the principal method in information systems research. The goal of this study is to establish an automated container terminal simulation model through the analysis of container terminal transfer data to improve the efficiency of production work. This study offers guidance and decision support for constructing an automated terminal.

Predictive analysis typically involves building empirical models that predict future observations from observed data, combined with methods for evaluating the predictive performance of these models.

This paper is organized into five sections. Section 1 is the introduction, which describes the working process between transport companies and container terminals, puts forward the research methods, and explains the contribution and structure of this study. Section 2 is the literature review, which reviews studies on container arrival time prediction, analyzes the shortcomings of previous studies, and clarifies the research motivation. Section 3 describes the materials and methods, setting out the design of this study. Section 4 gives the results, evaluating how well different machine learning algorithms predict export container arrival time. Section 5 is the conclusion, which explains the results of this study and looks ahead to future research.

2. Literature Review

2.1. Review of the Literature

To put this study in context, 82 papers were reviewed, excluding those that do not discuss environmental, behavioral, or other factors bearing on transport time or a similar dependent variable. Based on this review, twelve concepts that affect travel time are proposed, and the factors influencing time-related dependent variables are evaluated. Combined with the study by Vander et al. (2017) [3], the concepts affecting travel time are shown in Table 1. Table 2 lists the contents of each paper reviewed.

Table 2 indicates that traffic congestion and time-related factors (day, week, month, and time of year) are the factors most commonly mentioned in the literature. Because these factors influence transport time, they should be available in the datasets and in the subsequent models. Most studies hold that traffic congestion is the main cause of delays; other factors are considered to cause additional congestion and therefore longer transport time. The recent literature on container logistics transportation was also reviewed. Hu et al. (2021) [40] studied a mixed fuel consumption prediction model for ocean-going container ships using sensor data and estimated the fuel cost of container ship transportation. The results indicate that fuel consumption data of container fleets should be collected in the future. Maldonado et al. (2019) [41] explored a decision support system for container stacking operations in port logistics and used analytical techniques to predict the residence time of each container. The results showed that the yard performed well in loading and unloading compared with the current practice of the port terminal and with well-known stacking strategies. Lei (2022) [42] studied an intelligent logistics scheduling model and algorithm based on Internet of Things (IoT) technology and proposed a smart distribution model. This model optimized the distribution process and provided an efficient distribution strategy for large amounts of data. The results indicated that this technology has practical value for improving the management efficiency of container logistics. Therefore, this study needs to combine logistics transportation time and arrival time to optimize container transportation and management strategies.

It is worth noting that only two studies provide empirical support for the claim that specific factors affect transport time or arrival time. Most papers use empirical data only to verify the overall prediction ability of their proposed models. Consequently, many papers hold that factors such as weather and time affect transport time (mainly through congested infrastructure) but do not provide evidence to support this view. The view that congestion explains transport time has been examined in many studies, but only by analyzing prediction models built on the assumption that congestion plays a crucial role.

Many studies cite earlier literature to explain the factors affecting transport time, so the cited studies were examined to see whether they provide empirical evidence. An often-cited paper by Tu H, Li H, Van Lint, Hugendom, and Van Zuren (2012) [43] outlines the sources of congestion, but uses only congestion as the contributing factor affecting transport time in its predictive model. Papers using empirical data measure flow at a specific time and place through loop detector data or (a variant of) floating vehicle data. None of them adopt other data sources such as weather variables. Moreover, no papers examine container arrival time; they all investigate transit time.

Container arrival time is determined by the combination of departure time and transport time. Hence, while the literature on transport time applies, the human factors involved in planning the departure time may also influence container arrival time (in particular, if the trucker has traveled the route before). The literature thus has a gap regarding container arrival time prediction.

2.2. Data from Yard Control Centers

Overall, 28,889 rows of planned and actual container arrival data were analyzed on the basis of the literature review described above. The data acquired from the yard terminal include the planned and actual arrival times and the city name of the transportation destination. In addition to the port yard data, weather data compiled by the Physical Sciences Laboratory (NOAA) were adopted.

3. Materials and Methods

3.1. Simulation Interactive Modeling Method

Simulation interactive modeling is a method to model and simulate complex systems through computer simulation technology. It can conduct interactive experiments to evaluate the system’s performance under different scenarios and predict its future behavior. This method typically adopts computer programs and graphical interfaces to simulate the system, using real-time data input and interactive operations to simulate and optimize the system’s operation.

The implementation process generally includes four steps. Firstly, the system is analyzed and modeled abstractly to determine its components, their attributes, and the relationships between attributes. Secondly, the simulation model is constructed using computer simulation technology and implemented with computer programs and algorithms. Thirdly, through interaction between the graphical interface and the model, real-time data and human interventions are input to simulate and adjust the system’s operating status. Finally, the accuracy of the simulation model is verified by comparing it with the real system, and the model construction and parameter settings are continuously optimized so that the state and evolution of the complex system are simulated more accurately. The process of simulation modeling and the research methods are shown in Figure 1.
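As a toy illustration of this modeling loop (define the components, implement the model, feed in data, compare outcomes), the sketch below simulates a single gate that serves trucks in order of actual arrival. The single-gate layout, the fixed service time, and all names are assumptions made for illustration; this is not the Simio model used in the study.

```python
from dataclasses import dataclass


@dataclass
class Truck:
    planned: float    # planned arrival, minutes from start of shift
    deviation: float  # actual minus planned, minutes (negative = early)


def simulate_gate(trucks, service_min=10.0):
    """Serve trucks first-come-first-served at one gate; return waits in minutes."""
    arrivals = sorted(t.planned + t.deviation for t in trucks)
    gate_free_at = 0.0
    waits = []
    for arrival in arrivals:
        start = max(arrival, gate_free_at)  # wait if the gate is still busy
        waits.append(start - arrival)
        gate_free_at = start + service_min
    return waits


if __name__ == "__main__":
    on_time = [Truck(0, 0), Truck(10, 0), Truck(20, 0)]
    # One late, one slightly late, and one early truck all show up at minute 15.
    bunched = [Truck(0, 15), Truck(10, 5), Truck(20, -5)]
    print(simulate_gate(on_time))
    print(simulate_gate(bunched))
```

Even in this small example, the same three trucks produce no waiting when they keep to the plan and a growing queue when early and late arrivals coincide.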

3.2. Preparation of Data

3.2.1. The Collection of Data

The data presented in this section are used to predict container arrival time. First, the literature is reviewed to provide an understanding of the data and variables that are considered good predictors of container arrival time (as well as those that are not); the port yard data are then described.

3.2.2. Missing Value

This section describes what is missing from the data and how the data are processed. The location of the truck is the primary missing variable, and its absence affects other variables to a certain degree.

3.2.3. Traffic and Weather Zones and Missing Value

According to the Ministry of Transport (MOT), the traffic flow data are taken from the primary intersection traffic control scheme data for city roads in Zhejiang province. The following information is used:

Container information (from Meishan port terminal, China): the order identifier and the date and time at which the container is due to arrive. The identifier is numeric and may be linked to other information about the container, such as weight.

Weather information (from NOAA): average precipitation (various types), average fog, average temperature, average rainfall, and maximum and minimum wind speed.

Traffic information (from the MOT): maximum, minimum, and average number of cars moving along a stretch of road per second.

The remainder of this section describes how the container arrival, weather, and traffic data are prepared as table rows. Each row includes one or more of the variables above together with the delay.

The literature suggests that traffic, weather, and the time of day can affect arrival time. To use weather and traffic data, the route to the port yard and the driving speed need to be known. Neither is explicitly provided, but the frequent departure sites can be identified from the survey. The primary routes that the trucks may travel are extracted with the Google Maps Directions Application Programming Interface (API), a Google Maps web service. It is assumed that the trucks drive at 80 km/h, the legal driving speed. These parameters can be used to estimate the truck’s location, for example, 1 hour before the expected arrival time of the container.

Since the origin and direction of each truck are unknown, the departure area is divided into 12 zones. Zones 1 and 2 are 0–80 km and 80–160 km from the storage yard, and zone 3 is beyond 160 km from the storage yard. As shown in Table 3, a truck is assumed to be in zone 3, 2, or 1 at 3 h, 2 h, or 1 h, respectively, before its planned arrival. The traffic and weather in zones 1, 2, and 3 are the main focus here.
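The zone logic can be sketched as follows, using the 80 km/h speed and the 80 km zone boundaries stated above. The function names and the treatment of the exact boundary values (80 km counts as zone 1, 160 km as zone 2) are assumptions for illustration.

```python
SPEED_KMH = 80.0      # assumed constant driving speed, as stated in the text
ZONE_WIDTH_KM = 80.0  # zones 1 and 2 are each 80 km wide; zone 3 is beyond 160 km


def zone_for_distance(distance_km: float) -> int:
    """Map a distance from the yard (km) to zone 1, 2, or 3."""
    if distance_km < 0:
        raise ValueError("distance must be non-negative")
    if distance_km <= ZONE_WIDTH_KM:      # boundary handling is an assumption
        return 1
    if distance_km <= 2 * ZONE_WIDTH_KM:
        return 2
    return 3


def estimated_zone(hours_before_arrival: float) -> int:
    """Zone a truck is assumed to occupy a given number of hours before planned arrival."""
    return zone_for_distance(SPEED_KMH * hours_before_arrival)


if __name__ == "__main__":
    for hours in (1, 2, 3):
        print(hours, "h before arrival -> zone", estimated_zone(hours))
```

Because the truck’s true position is unknown, this estimate decides which zone’s weather and traffic readings are attached to each container record.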

For zones 1, 2, and 3, the road condition data and the data from the meteorological stations in each zone are summarized. The road data are summarized as the minimum, maximum, and average number of cars on any road associated with the zone, where a road is associated with a zone if it is one of the possible routes. In the same way, the maximum and mean values of wind speed, temperature, fog, and precipitation in each zone are used. Following the literature (Servos et al., 2019) [44], and combined with the maximum wind speed and the average, minimum, and maximum temperatures from urban meteorological stations, the traffic in the different zones is shown in Table 3.
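A minimal sketch of this per-zone summary is given below. The input format (pairs of zone number and reading) and the sample vehicle counts are hypothetical; the real traffic and weather records are not reproduced here.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_zone(readings):
    """Aggregate (zone, value) pairs into per-zone minimum, maximum, and mean."""
    by_zone = defaultdict(list)
    for zone, value in readings:
        by_zone[zone].append(value)
    return {
        zone: {"min": min(vals), "max": max(vals), "mean": mean(vals)}
        for zone, vals in sorted(by_zone.items())
    }


if __name__ == "__main__":
    # Hypothetical vehicle counts on roads associated with each zone.
    counts = [(1, 120), (1, 80), (2, 200), (2, 240), (2, 160), (3, 50)]
    for zone, stats in summarize_by_zone(counts).items():
        print(zone, stats)
```

The same function can be applied to wind speed, temperature, fog, or precipitation readings, keeping only the statistics that the text lists for each variable.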

In addition to the missing truck location and route, some variables have missing values or values affected by information system recording. Missing values are not imputed; rows with insufficient data are discarded. After discarding these rows, 18,922 of 28,189 rows remain.

3.3. Data Exploration and Analysis

The study presents statistical results of the arrival time of logistics vehicles at container terminals. Table 4 shows the frequency, percentage, and average delay for different arrival time delays. The table provides a way to understand how logistics vehicles behave at arrival times. It presents the degree of delay in the arrival time of the logistics vehicle and its frequency distribution.

3.3.1. Findings of the Survey

As part of the study, the truckers arriving at the port yard are examined to determine whether the factors in the literature review correlate with their perception of early or late arrival. This survey is conducted on the basis of the factors in the literature review. It consists of eight sets of questions, each rated on a 5-point scale, as well as seven questions regarding prior and subsequent stations.

Similar factors are classified. Congestion, accidents, road work, and road conditions are all classified as traffic/congestion because they relate to containers and road usage. Among the factors above, congestion is the most common.

Among the factors in the literature review, speed and driving style are grouped together as driving style. The port yard requested that its planning be included as a factor because it considers that planning also affects container arrival time. The survey includes two “other” categories in which truckers can indicate additional factors affecting their arrival time.

Overall, 200 responses to the survey were received. This survey mainly investigates truckers’ experience and opinions on container arrival time. In the survey, truckers are asked to explain the factors (listed in the literature review) that affect their arrival times. If each factor negatively impacts their arrival time (i.e., earlier or later), they must account for it. Truckers are also required to explain the planned and actual container arrival time, to verify whether there is a relationship between these factors and their lateness. Since the investigation is anonymous, there is no way to compare the planned arrival expressed by the trucker with the one communicated to the port yard. Finally, the truckers must indicate their starting point and destination to determine the distance, the route they are taking, and where port yards fit into the overall transport chain.

Table 5 indicates that all eight factors were answered by a sufficient number of respondents. It lists how many truckers completed each number of factors; 31 drivers did not answer any factor question but did answer one or more other questions (such as those on previous and subsequent stations). Following Balster et al. (2020) [45], the number of survey factors completed is analyzed, with the results outlined in Table 5. The table reports, for each number of factors filled in, the corresponding frequency of questionnaires.

Table 6 displays the Pearson’s r correlation coefficients among the independent variables and between the independent and dependent variables (as reported by truckers). It suggests that all independent variables are significantly associated with one another. These results are inconsistent with the literature review, which identifies congestion and traffic as the most crucial predictors of transport time and container arrival time.

3.3.2. Data about the Port Yard

In this study, a dataset was collected that includes variables related to meteorological conditions and traffic conditions. The numbering of these variables is detailed in Table 7. Each variable represents a specific meteorological indicator or traffic flow measure, such as tardiness time, mean wind speed, maximum gusts, and mean temperature. To better understand the relationships between these variables, the strength of the correlation between them was calculated, and the results are summarized in Table 8, which shows the Pearson’s r coefficient (a measure of linear correlation) for each pair of variables. There is no fixed rule for what degree of correlation counts as strong, moderate, or weak at a significance level of α = 0.05. It is generally held that a correlation coefficient greater than 0.4 indicates a strong correlation, one between 0.2 and 0.4 a moderate correlation, and one less than 0.2 a weak correlation. Analyzing the data in Table 7 and Table 8 clarifies the relationships between the variables and provides a basis for further data interpretation and modeling.
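The correlation measure and the rule of thumb above can be sketched in a few lines. The function names are hypothetical, and labelling the exact values 0.2 and 0.4 as moderate is an assumption, since the text does not say how boundary values are treated.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


def strength(r):
    """Label |r| with the thresholds quoted in the text (0.2 and 0.4)."""
    a = abs(r)
    if a > 0.4:
        return "strong"
    if a >= 0.2:
        return "moderate"
    return "weak"


if __name__ == "__main__":
    print(strength(-0.21))
```

The sign of r gives the direction of the relationship, while the label depends only on its magnitude.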

The results suggest strong correlations among the independent variables. Several traffic variables, and similarly several weather variables, have a statistically significant relationship with the delay variable. Although these relationships are significant, the Pearson correlation coefficients are weak enough that most can be ignored. For example, there is a −0.21 correlation between lateness and the minimum value for traffic in zone 3. This means that lateness may increase when the minimum traffic in zone 3 is low. Therefore, this variable may influence the estimation of arrival time. The correlation coefficient of the traffic variable in zone 1 is about −0.2, indicating a certain relationship between the traffic conditions in zone 1 and lateness. The traffic variables in zone 2 are less correlated, but may still provide some useful information. Except for traffic volume, the variables have either no significant correlation with lateness or a negligible one.

3.4. Variable Selection

The variables used should be available at the time the prediction is made. Additionally, they must be of “high quality”. Shmueli and Koppius [2] also note that the set of potential predictors is usually larger for predictive models than for explanatory models.

The variables considered here are those identified in the literature review and the survey, assessed for their availability in advance and their quality. Regarding availability, real-time meteorological and traffic information can be provided by NOAA and is also continuously available through several channels (but not the source used here).

The data quality needs to be improved. Although the raw data are likely to be trustworthy, they have to be converted into data about a “zone” (Section 3.2.3), which may decrease their predictive ability. This is unavoidable because the location data of the trucks are not available and are unlikely to become available in the near future.

3.5. Selection of Data Mining (DM) Methods

Several DM methods are adopted to determine whether the container arrival time can be predicted even though the correlations between the independent and dependent variables are negligible. The quantity of interest is the difference between the scheduled and the actual container arrival time. The argument for DM is that a combination of independent variables may predict container arrival time fairly accurately even when each variable alone does not. Moreover, the correlation among the independent variables makes some statistical techniques inappropriate (because they assume that the variables are independent).

The five DM methods adopted here are the following:

A: A variant of Classification and Regression Trees (CART) (Breiman et al. 2001) [46].
B: A variant of the classification algorithm Support Vector Machine (SVM) (Cortes and Vapnik 1995) [47].
C: A variant of the nearest-neighbor technique K-Nearest Neighbors (KNN) (Tan, Steinbach and Kumar 2006) [48].
D: An ensemble machine learning (ML) classifier, Adaptive Boosting (Adaboost).
E: An ensemble ML classifier based on bagging of trees, Random Forest (RF).

This selection covers the primary types of classifier, which avoids depending on the performance problems of any specific classifier. Moreover, previous research indicates that the RF, SVM, and Adaboost classifiers should perform well (Spoel et al., 2012) [49]. The predictive data analysis was carried out in Python, following the practical implementation guide in Mastering ML with Python in Six Steps.
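A comparison of these five classifier families could be set up as in the sketch below. It assumes the scikit-learn library, which the text does not name explicitly, and it uses synthetic data as a stand-in for the traffic, weather, and time features; the hyperparameters are library defaults or arbitrary choices, not those of the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(random_state=0):
    """Fit five classifier families on synthetic data; return test accuracy per model."""
    # Synthetic stand-in for the real features and the lateness-interval labels.
    X, y = make_classification(n_samples=600, n_features=10, n_informative=5,
                               n_classes=3, random_state=random_state)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=random_state, stratify=y)
    models = {
        "CART": DecisionTreeClassifier(random_state=random_state),
        "SVM": SVC(),
        "KNN": KNeighborsClassifier(n_neighbors=5),
        "AdaBoost": AdaBoostClassifier(random_state=random_state),
        "RF": RandomForestClassifier(n_estimators=100, random_state=random_state),
    }
    # score() reports mean accuracy on the held-out test split.
    return {name: model.fit(X_train, y_train).score(X_test, y_test)
            for name, model in models.items()}


if __name__ == "__main__":
    for name, accuracy in compare_classifiers().items():
        print(f"{name}: {accuracy:.2f}")
```

Evaluating every model on the same held-out split keeps the accuracies comparable; the numbers from synthetic data say nothing about the accuracy reported in the study.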

3.6. Experiment Settings

In this study, specific mechanisms were adopted to implement the interaction of Pandas and TensorFlow software tools with Simio. First, Pandas was used for data processing and the data were converted into a format suitable for TensorFlow. Then, TensorFlow was utilized to implement predictions based on interactive simulation models, and Simio was employed to simulate the loading and unloading operations and production operations of automated container terminals. This software implementation aimed to optimize the operational efficiency of automated container terminals and improve overall productivity through real-time data analysis and prediction.

The experiment used a hardware environment with an Intel Pentium G2030 Central Processing Unit (CPU) at 3 GHz, 4 GB of memory, and the Microsoft Windows 10 operating system. Pandas was used as the data processing and analysis software to clean and compile the collected data and ensure their accuracy and consistency. TensorFlow was chosen as the software tool to establish the container arrival time prediction model. Simio was used as the experimental simulation software to simulate loading and unloading operations and production operations in automated container terminals. Moreover, the economic benefit index in the index system was determined through expert consultation; for an existing automated terminal, the indexes can be obtained from actual operating parameters, whereas for a terminal yet to be built they are obtained mainly through simulation modeling. Therefore, when determining the design scheme before constructing an automated container terminal, it is necessary to establish a terminal simulation model that meets the requirements of loading and unloading operations and production operations. In addition, the simulation model of the automated container terminal needs to be implemented to verify the feasibility of the evaluation model. The mechanism of interaction between the Pandas and TensorFlow software tools and Simio is shown in Figure 2.

3.7. Factors Affecting the Arrival Time of Export Containers

The following observations from field research in one port yard are expected to generalize to other port yards. Predicting container arrival times for trucks heading to a yard will involve some difficulties. Human factors (truckers) and organizational factors (e.g., the planning of the driver’s company) determine the departure time. An example of an organizational factor that affects departure time is a perverse incentive specific to the yard.

In this work, we examined the factors affecting the arrival time of export containers. The research questions were as follows:

A: How can we build a model to explain and predict the time when the container reaches the port yard?
B: On the basis of the literature/theory, what are the most significant factors?
C: According to the trucker, what factors are the most critical?
D: How can we combine these factors into the prediction model of when trucks arrive at the port yard?

The literature review indicates that congestion is the leading cause of delays and affects the time of containers’ arrival. Factors like weather conditions, accidents, and time of day will also influence the time of arrival. However, many authors hold that the above factors mainly affect congestion, and in the process, affect the time of arrival. Furthermore, the factors reported in the literature have rarely been empirically validated, and the literature mainly focuses on transport time. There are gaps in the studies on container arrival time.

Firstly, based on introducing the standardization of evaluation indexes, the relative membership degree matrix and multi-level evaluation model of automatic container terminal design were established. Secondly, according to the evaluation model, the evaluation example of the automated design scheme was analyzed, and the optimal scheme was obtained. Next, a sensitivity analysis was carried out on the decision of the terminal layout, including the perturbation analysis of the attribute weight and the upper limit of the attribute value, and the sensitivity analysis of the combined weight method. Finally, under the condition that the optimal solution remains unchanged in the case analysis, the stable interval of the numerical analysis of container transportation was obtained by using the software. The economic benefit index in the index system was determined through expert consultation, the automatic terminal can be obtained by acquiring the actual operating parameters of the terminal, and the terminal to be built can be obtained mainly through simulation modeling. Thus, when determining the design scheme before the construction of the automated container terminal, it is essential to implement a terminal simulation model that meets the requirements of loading and unloading operations and terminal production operations, and to implement an automated container terminal simulation model to verify the feasibility of the evaluation model.

However, the factors that the literature reports as predictors of container arrival time were not recognized by the 200 truckers in our survey. No factor has a significant correlation with lateness, although the strongest correlation is between traffic/congestion and lateness.

These findings are preliminary conclusions drawn from the prediction data analysis. When DM classifiers such as ensemble classifiers, decision trees, clustering, or SVM are adopted, models that include weather and traffic variables predict better than models without these variables, but their prediction ability remains limited.

Finally, the arrival prediction model must take the expected container arrival and departure times into account. As input data for the yard bin turning problem, artificial factors such as the expected arrival time and departure time of containers will be excluded as the premise.

Here, the focus is on traffic/congestion and container lateness. Containers must be moved rapidly in and out of the potentially thousands of stacks in this part of the terminal. Two primary problems concern retrieving containers from the yard once they have been placed. The first comprises the container stacking problem (Dekker, Voogd, and van Asperen, 2007) [50] and the container (or block) relocation problem, in which containers are extracted from stacks (Kim and Hong, 2006) [51]. The second is the container pre-marshaling problem (CPMP), in which containers are re-sorted into stacks. The second one is the research subject. As part of future work, the block relocation problem (BRP), also known as the container relocation problem (CRP), will be addressed. Because of space constraints, stacking containers vertically is common in warehouses, especially in container yards. However, stacking can make blocks difficult to retrieve: when the block needed is not at the top, all blocks above it must first be relocated to other places. These relocations are only temporary.

The total number of such relocations depends on two aspects: the order in which containers arrive and the locations to which blocks are relocated. For instance, placing a relocated block on top of the block to be retrieved next incurs a further relocation. This observation suggests that careful decisions about relocations can decrease the unnecessary workload, thus significantly improving the throughput of warehouses and container yards.
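The effect of the relocation target can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the function name `count_relocations`, the toy yard layout, and both placement rules (a "careful" rule and a naive "first free stack" rule) are hypothetical illustrations and are not the method used in this study.

```python
from typing import List


def count_relocations(stacks: List[List[int]], max_height: int,
                      careful: bool = True) -> int:
    """Retrieve containers in the order 1..n and count forced relocations.

    Each stack is listed bottom-to-top; each number is the retrieval order.
    careful=True  : move a blocking container to the stack whose
                    earliest-needed container is needed latest
                    (an empty stack counts as the best target).
    careful=False : move a blocking container to the first stack with room.
    Both rules are illustrative assumptions.
    """
    stacks = [list(s) for s in stacks]  # work on a copy
    total = sum(len(s) for s in stacks)
    relocations = 0
    for target in range(1, total + 1):
        src = next(i for i, s in enumerate(stacks) if target in s)
        while stacks[src][-1] != target:
            blocker = stacks[src].pop()
            candidates = [i for i, s in enumerate(stacks)
                          if i != src and len(s) < max_height]
            if not candidates:
                raise ValueError("no free slot for relocation")
            if careful:
                dest = max(candidates,
                           key=lambda i: min(stacks[i]) if stacks[i] else total + 1)
            else:
                dest = candidates[0]
            stacks[dest].append(blocker)
            relocations += 1
        stacks[src].pop()  # retrieve the target container
    return relocations


# Toy yard: container 3 sits on top of container 1; container 2 is alone.
yard = [[1, 3], [2], []]
print(count_relocations(yard, max_height=3, careful=True))
print(count_relocations(yard, max_height=3, careful=False))
```

In this toy yard, the naive rule puts container 3 on top of container 2, which is retrieved next, so container 3 has to be moved a second time; the careful rule uses the empty stack and avoids the extra move.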

4. Results

4.1. Evaluation, Validation, and Model Selection

The performance measure of the DM tests is the prediction score, that is, the share of predictions for which the distance between the actual and the predicted arrival time category is shorter than or equal to a specific threshold, as described below. When the actual category is [−3 h…−2.5 h] and the predicted category is [−1 h…−0.5 h], the distance between the predicted and actual categories is 3. Because 30 categories are adopted, the largest distance is 29. When 10 predictions have distances (3, 4, 7, 1, 2, 4, 6, 3, 2, 1), the score within the threshold 3 for these predictions is 6/10 = 0.6.
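The score described above can be written down in a few lines. This is a minimal sketch; the function names `category_distance` and `score_within_threshold` are hypothetical, and categories are assumed to be represented by their integer index (0 to 29).

```python
def category_distance(actual_index: int, predicted_index: int) -> int:
    """Distance between two arrival-time categories given by integer index."""
    return abs(actual_index - predicted_index)


def score_within_threshold(distances, threshold: int) -> float:
    """Share of predictions whose category distance is <= threshold."""
    if not distances:
        raise ValueError("at least one prediction is required")
    hits = sum(1 for d in distances if d <= threshold)
    return hits / len(distances)


# The ten distances used as an example in the text.
distances = [3, 4, 7, 1, 2, 4, 6, 3, 2, 1]
print(score_within_threshold(distances, threshold=3))  # 0.6
```

With 30 categories indexed 0 to 29, `category_distance(0, 29)` gives the largest possible distance of 29.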

The threshold acts as a form of reliability interval for the predicted distance. It gives a clearer indication of performance than the share of predictions that fall exactly in the correct category. The threshold was chosen on the basis of the port yard’s indication that a prediction error of 1 h would be “acceptable”.

Several dataset variations are adopted for testing classifiers’ performance changes when variables are added or changed. For instance, the prediction is tested based on weather as well as traffic data. Moreover, a test is performed to decide whether dimensional reduction by categorizing continuous variables (e.g., dividing precipitation into five categories of the same size) affects the classifier’s performance. A 10-fold cross-validation is conducted to confirm the results of the test.
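The two preprocessing ideas in this paragraph, discretizing a continuous variable and splitting the data into 10 folds, can be sketched as follows. The sketch uses only the Python standard library; the function names and the sample precipitation values are hypothetical, and "categories of the same size" is assumed here to mean equal-width bins (equal-frequency bins would be another reading). In a Pandas workflow, `pandas.cut` with an integer `bins` argument performs the same equal-width binning.

```python
import random


def equal_width_bins(values, n_bins=5):
    """Map each value to a bin index 0..n_bins-1 using equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0] * len(values)
    # The maximum value is clamped into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


# Hypothetical precipitation readings (mm), for illustration only.
precipitation = [0.0, 2.5, 5.0, 7.5, 10.0]
print(equal_width_bins(precipitation))

for train, test in k_fold_indices(25, k=10):
    # A classifier would be fitted on `train` and scored on `test` here.
    assert len(train) + len(test) == 25
```

Each sample appears in exactly one test fold, so the averaged score over the 10 folds uses every observation once for validation.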

In this test, the following steps are taken. First, the continuous variables are taken and the dependent variables are transformed into classification variables. Then, different test steps are performed according to specific settings, including using only continuous traffic variables, continuous traffic variables for specific zones, discrete traffic variables, etc. Each step uses a different combination of variables and processing, such as sorting continuous variables into different numbers of categories. Furthermore, other tests are conducted, such as considering different delay times and discrete variables for specific zones/times. The tests also involve dividing the weather and traffic variables into different numbers of categories according to specific criteria. These tests aim to study the effects of different data settings on classifier performance and to determine the best combination of variables and treatment. Through these tests, the model’s performance can be evaluated under various conditions, which supports more accurate predictions of the estimated arrival time of export containers.

Figure 3 shows that the most favorable result is 72%. This means that 72% of the predictions differ from the actual category by at most three categories, or fall within 2 h later or earlier than the expected arrival. It can be concluded that the RF has the most accurate prediction results. It gives the single most accurate prediction, and its average performance is superior to that of the other prediction methods. The best performance is 72.9% of arrival predictions with an error of less than 1 h. Through this prediction method, the accuracy will be 9% of actual arrivals. The accuracy of conducting DM is higher at 72.9%. Table 8 gives a more detailed look at the results of the tests, including the best and least desirable results for each classifier. According to the data in Table 9, in the test results in line 10, all classification algorithms have the highest accuracy when all continuous variables are used and the tardiness is limited to values greater than −3 h. Specifically, when the RF, Adaboost, and KNN algorithms are employed, the accuracy rate reaches 72%.

4.2. The Use of the Model and Its Reporting

Figure 4 shows that, in terms of the model’s performance, the most accurate results are obtained by discretizing the weather and traffic data. One of the best-performing RF models is number 15, in which the traffic and weather data are divided into five categories. Among the five prediction models, number 18 is the most accurate, since only two categories of traffic and weather data are used.

Moreover, the results of this study are compared with those of other studies. For example, Kim et al. (2021) [57] applied several ML models to predict container port accidents at different intervals. They selected the optimal model by comparing the accuracy, precision, recall, and F1 scores of the different models. The results demonstrated that the classified operation data and accident data of container ports could be used not only to predict the occurrence of accidents but also to analyze the risk of accidents. Islam et al. (2021) [58] investigated the propulsion power of inland container ships in open and restricted waterways. They presented simulation results for inland container ships designed for operation in China’s inland waters. The results showed a significant increase in drag when the ship was operating in a shallow and narrow passage, which limits its possible speed of operation. Larsen et al. (2021) [59] explored a model predictive control approach for the simultaneous route planning of containers and vehicles. They proposed a model predictive controller that determines, as one integrated problem, which combination of trucks, trains, and ships should be used to transport containers and which routes should be used by empty and full trucks. The results indicated that the efficiency of logistics transportation can be greatly improved by routing containers and trucks simultaneously. By analyzing the factors and variables of container arrival time, logistics transportation time can be saved and the efficiency of transportation management can be improved.

5. Conclusions

5.1. Conclusions and Future Work

This study used a literature review, survey, and data analysis to explore the factors and variables affecting the arrival time of containers. The literature review highlighted significant factors affecting arrival time, and the most crucial factor is traffic congestion. However, the survey results show no significant correlation between any factors and delays, and truckers believe that traffic congestion is the main cause of significant delays. In addition, in actual trucker data, the relationship between congestion, weather, and time with delays is also weak or negligible. The study also notes that truckers may be deliberately late for strategic reasons, which could affect the forecast results. At the same time, the ability and desire of truckers to deliver goods on time are also among the factors affecting the predictive ability. If the truckers do not know how long the route will take, they may choose to leave at random, resulting in unpredictable container arrival times.

Different prediction methods are used in this study, including RF, Adaboost, KNN, SVM, and CART, aiming at the prediction model of the arrival time of container terminal logistics. However, regardless of whether we use continuous or discrete variables, traffic or weather variables, the prediction accuracy is not high—the highest accuracy obtained is only 72%. This means that the current prediction model still has some limitations in predicting the arrival time of container terminal logistics. Additionally, both the literature review and trucker surveys indicate that traffic congestion is a major factor in container arrival delays. However, the variables and factors associated with traffic congestion do not predict arrival times well, suggesting that current models have challenges in capturing and exploiting traffic congestion’s effects.

It is found that weather and time factors (such as specific hours) have a weak or even negligible relationship with container arrival times. This indicates that the current prediction model cannot significantly improve accuracy using weather and specific time factors.

5.2. Contributions and Limitations

This study used data analysis, a literature review, and a survey to study the factors and variables affecting the arrival time of containers. The arrival time of export containers was explored with a data-driven analysis method and a simulation model. This study fills a research gap in related fields, provides an understanding of the prediction of container arrival time and its influencing factors, and analyzes and predicts the transfer data through the simulation interactive modeling method. This method can obtain information and patterns from actual data and make predictions and analyses based on this information, which helps improve prediction accuracy and reliability. Through the analysis of the data of the control center, the factors affecting the arrival time of container transportation are determined, which offers a basis for further improving the accuracy of arrival time prediction. Through simulation modeling, the actual operating parameters and the production and operation of the automated terminal can be obtained, thus providing guidance and decision support for the construction of the automated terminal. Despite the use of “big data” methods for data analysis and decision modeling, the study has limitations, as it does not fully account for all influencing factors. It is suggested that future studies consider more organizational factors and further study the relationship between transportation time and arrival time to improve prediction accuracy. Future efforts can also improve data collection and analysis methods, consider uncertainty and risk, integrate and optimize models, and implement practical applications and systems. In this way, the ability to predict the arrival time of export containers and its effect in practical applications can be further improved.

Author Contributions

Conceptualization, J.L. and R.W.; methodology, J.L. and R.W.; software, R.W.; validation, R.W.; formal analysis, J.L. and R.W.; writing (original draft preparation), R.W.; writing (review and editing), J.L. and R.W.; supervision, J.L. and R.B.; project administration, R.B.; funding acquisition, J.L. and R.B. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the National Natural Science Foundation of China (grant number 72071116) and the Ningbo Science and Technology Bureau (grant numbers 2019B10026, 2017D10034).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

The studies involving human participants were reviewed and approved by the University of Nottingham Ningbo China Ethics Committee. The participants provided their written informed consent to participate in this study. Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this study.

Data Availability Statement

The data that has been used is confidential.

Conflicts of Interest

The authors declare no conflict of interest.

References

Van Belle, J.; Valckenaers, P.; Cattrysse, D. Cross-docking: State of the art. Omega 2012, 40, 827–846.
Shmueli, G.; Koppius, O.R. Predictive analytics in information systems research. MIS Q. 2011, 35, 553–572.
Van der Spoel, S.; Amrit, C.; van Hillegersberg, J. Predictive analytics for truck arrival time estimation: A field study at a European distribution centre. Int. J. Prod. Res. 2017, 55, 5062–5078.
Hall, R.W. Route choice and advanced traveler information systems on a capacitated and dynamic network. Transp. Res. Part C Emerg. Technol. 1996, 4, 289–306.
Sheu, J.B.; Ritchie, S.G. A new methodology for incident detection and characterization on surface streets. Transp. Res. Part C Emerg. Technol. 1998, 6, 315–335.
Yang, H. Multiple equilibrium behaviors and advanced traveler information systems with endogenous market penetration. Transp. Res. Part B Methodol. 1998, 32, 205–218.
Amini, B.; Shahi, J.; Ardekani, S.A. An observational study of the network-level traffic variables. Transp. Res. Part A Policy Pract. 1998, 32, 271–278.
Bell, M.G.H. A game theory approach to measuring the performance reliability of transport networks. Transp. Res. Part B Methodol. 2000, 34, 533–545.
Bates, J.; Polak, J.; Jones, P.; Cook, A. The valuation of reliability for personal travel. Transp. Res. Part E Logist. Transp. Rev. 2001, 37, 191–229.
Rietveld, P.; Bruinsma, F.R.; Van Vuuren, D.J. Coping with unreliability in public transport chains: A case study for Netherlands. Transp. Res. Part A Policy Pract. 2001, 35, 539–559.
Golob, T.F.; Regan, A.C. Impacts of highway congestion on freight operations: Perceptions of trucking industry managers. Transp. Res. Part A Policy Pract. 2001, 35, 577–599.
Stathopoulos, A.; Karlaftis, M.G. A multivariate state space approach for urban traffic flow modeling and prediction. Transp. Res. Part C Emerg. Technol. 2003, 11, 121–135.
Zhang, X.; Rice, J.A. Short-term travel time prediction. Transp. Res. Part C Emerg. Technol. 2003, 11, 187–210.
Fowkes, A.S.; Firmin, P.E.; Tweddle, G.; Whiteing, A.E. How highly does the freight transport industry value journey time reliability—And for what reasons? Int. J. Logist. Res. Appl. 2004, 7, 33–43. [Google Scholar] [CrossRef]
Wu, C.H.; Ho, J.M.; Lee, D.T. Travel-time prediction with support vector regression. IEEE Trans. Intell. Transp. Syst. 2004, 5, 276–281.
De Feijter, R.; Evers, J.J.M.; Lodewijks, G. Improving travel-time reliability by the use of trip booking. IEEE Trans. Intell. Transp. Syst. 2004, 5, 288–292.
Clark, S.; Watling, D. Modelling network travel time reliability under stochastic demand. Transp. Res. Part B Methodol. 2005, 39, 119–140.
Van Lint, J.W.C. Online learning solutions for freeway travel time prediction. IEEE Trans. Intell. Transp. Syst. 2008, 9, 38–47.
Golob, T.F.; Regan, A.C. Trucking industry preferences for traveler information for drivers using wireless Internet-enabled devices. Transp. Res. Part C Emerg. Technol. 2005, 13, 235–250.
Lo, H.K.; Luo, X.W.; Siu, B.W.Y. Degradable transport network: Travel time budget of travelers with heterogeneous risk aversion. Transp. Res. Part B Methodol. 2006, 40, 792–806.
Hollander, Y.; Liu, R. Estimation of the distribution of travel times by repeated simulation. Transp. Res. Part C Emerg. Technol. 2008, 16, 212–231.
Paterson, D.; Rose, G. A recursive, cell processing model for predicting freeway travel times. Transp. Res. Part C Emerg. Technol. 2008, 16, 432–453.
Yeon, J.; Elefteriadou, L.; Lawphongpanich, S. Travel time estimation on a freeway using Discrete Time Markov Chains. Transp. Res. Part B Methodol. 2008, 42, 325–338.
Lam, W.H.K.; Shao, H.; Sumalee, A. Modeling impacts of adverse weather conditions on a road network with uncertainties in demand and supply. Transp. Res. Part B Methodol. 2008, 42, 890–910.
Van Lint, J.W.C.; Van Zuylen, H.J.; Tu, H. Travel time unreliability on freeways: Why measures based on variance tell only half the story. Transp. Res. Part A Policy Pract. 2008, 42, 258–277.
Jula, H.; Dessouky, M.; Ioannou, P.A. Real-time estimation of travel times along the arcs and arrival times at the nodes of dynamic stochastic networks. IEEE Trans. Intell. Transp. Syst. 2008, 9, 97–110.
Van Lint, J.W.C.; Hoogendoorn, S.P.; van Zuylen, H.J. Accurate freeway travel time prediction with state-space neural networks under missing data. Transp. Res. Part C Emerg. Technol. 2005, 13, 347–369.
Van Hinsbergen, C.P.I.J.; Van Lint, J.W.C.; Van Zuylen, H.J. Bayesian committee of neural networks to predict travel times with confidence intervals. Transp. Res. Part C Emerg. Technol. 2009, 17, 498–509.
Nie, Y.M.; Wu, X. Shortest path problem considering on-time arrival probability. Transp. Res. Part B Methodol. 2009, 43, 597–613.
Li, Z.; Hensher, D.A.; Rose, J.M. Willingness to pay for travel time reliability in passenger transport: A review and some new empirical evidence. Transp. Res. Part E Logist. Transp. Rev. 2010, 46, 384–403.
Chen, A.; Zhou, Z. The α-reliable mean-excess traffic equilibrium model with stochastic travel times. Transp. Res. Part B Methodol. 2010, 44, 493–513.
Ng, M.W.; Waller, S.T. A computationally efficient methodology to characterize travel time reliability using the fast Fourier transform. Transp. Res. Part B Methodol. 2010, 44, 1202–1219.
Figliozzi, M.A. The impacts of congestion on commercial vehicle tour characteristics and costs. Transp. Res. Part E Logist. Transp. Rev. 2010, 46, 496–506.
Figliozzi, M.A. The impacts of congestion on time-definitive urban freight distribution networks CO₂ emission levels: Results from a case study in Portland, Oregon. Transp. Res. Part C Emerg. Technol. 2011, 19, 766–778.
Yu, B.; Lam, W.H.K.; Tam, M.L. Bus arrival time prediction at bus stop with multiple routes. Transp. Res. Part C Emerg. Technol. 2011, 19, 1157–1170.
Fei, X.; Lu, C.C.; Liu, K. A Bayesian dynamic linear model approach for real-time short-term freeway travel time prediction. Transp. Res. Part C Emerg. Technol. 2011, 19, 1306–1318.
Khosravi, A.; Mazloumi, E.; Nahavandi, S.; Creighton, D.; Van Lint, J. A genetic algorithm-based method for improving quality of travel time prediction intervals. Transp. Res. Part C Emerg. Technol. 2011, 19, 1364–1376.
Li, L.; Chen, X.; Li, Z.; Zhang, L. Freeway travel-time estimation based on temporal–spatial queueing model. IEEE Trans. Intell. Transp. Syst. 2013, 14, 1536–1541.
Lederman, R.; Wynter, L. Real-time traffic estimation using data expansion. Transp. Res. Part B Methodol. 2011, 45, 1062–1079.
Hu, Z.; Zhou, T.; Osman, M.T.; Li, X.; Jin, Y.; Zhen, R. A novel hybrid fuel consumption prediction model for ocean-going container ships based on sensor data. J. Mar. Sci. Eng. 2021, 9, 449.
Maldonado, S.; González-Ramírez, R.G.; Quijada, F.; Ramírez-Nafarrate, A. Analytics meets port logistics: A decision support system for container stacking operations. Decis. Support Syst. 2019, 121, 84–93.
Lei, N. Intelligent logistics scheduling model and algorithm based on Internet of Things technology. Alex. Eng. J. 2022, 61, 893–903.
Tu, H.; Li, H.; van Lint, H.; van Zuylen, H. Modeling travel time reliability of freeways using risk assessment techniques. Transp. Res. Part A Policy Pract. 2012, 46, 1528–1540.
Servos, N.; Liu, X.; Teucke, M.; Freitag, M. Travel time prediction in a multimodal freight transport relation using machine learning algorithms. Logistics 2019, 4, 1.
Balster, A.; Hansen, O.; Friedrich, H.; Ludwig, A. An ETA prediction model for intermodal transport networks based on machine learning. Bus. Inf. Syst. Eng. 2020, 62, 403–416.
Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
Tan, P.N.; Steinbach, M.; Kumar, V. Introduction to Data Mining; Addison-Wesley: Boston, MA, USA, 2006.
Van der Spoel, S.; van Keulen, M.; Amrit, C. Process prediction in noisy data sets: A case study in a Dutch hospital. In Proceedings of the International Symposium on Data-Driven Process Discovery and Analysis, Campione d’Italia, Italy, 18–20 June 2012; Springer: Berlin/Heidelberg, Germany, 2012; pp. 60–83.
Dekker, R.; Voogd, P.; van Asperen, E. Advanced methods for container stacking. In Container Terminals and Cargo Systems; Springer: Berlin/Heidelberg, Germany, 2007; pp. 131–154.
Kim, K.H.; Hong, G.P. A heuristic rule for relocating blocks. Comput. Oper. Res. 2006, 33, 940–954.
Antoniadis, A.; Lambert-Lacroix, S.; Poggi, J.M. Random forests for global sensitivity analysis: A selective review. Reliab. Eng. Syst. Saf. 2021, 206, 107312.
Huang, X.; Li, Z.; Jin, Y.; Zhang, W. Fair-AdaBoost: Extending AdaBoost method to achieve fair classification. Expert Syst. Appl. 2022, 202, 117240.
Lu, J.; Qian, W.; Li, S.; Cui, R. Enhanced K-nearest neighbor for intelligent fault diagnosis of rotating machinery. Appl. Sci. 2021, 11, 919.
Gaye, B.; Zhang, D.; Wulamu, A. Improvement of support vector machine algorithm in big data background. Math. Probl. Eng. 2021, 2021, 5594899.
Carrizosa, E.; Molero-Río, C.; Romero Morales, D. Mathematical optimization in classification and regression trees. Top 2021, 29, 5–33.
Kim, J.H.; Kim, J.; Lee, G.; Park, J. Machine learning-based models for accident prediction at a Korean container port. Sustainability 2021, 13, 9137.
Islam, H.; Soares, C.G.; Liu, J.; Wang, X. Propulsion power prediction for an inland container vessel in open and restricted channel from model and full-scale simulations. Ocean Eng. 2021, 229, 108621.
Larsen, R.B.; Atasoy, B.; Negenborn, R.R. Model predictive control for simultaneous planning of container and vehicle routes. Eur. J. Control. 2021, 57, 273–283.

Figure 1. Specific process structure diagram of simulation modeling and research methods.

Figure 2. Mechanism structure of interaction between Pandas and TensorFlow software tools and Simio.

Figure 3. DM test results. In the horizontal coordinate, 1: all variables are continuous; 2: only the traffic variables are continuous; 3: only the traffic zone 3 variables are continuous; 4: discrete traffic variables only; 5: discrete traffic zone 3 variables only; 6: all weather and traffic variables are discrete; 7: discrete weather per zone/hour and discrete traffic variables; 8: discrete weather per zone/hour and discrete traffic variables with three categories; 9: discrete weather per zone/hour and discrete traffic variables with two categories; 10: all continuous variables, tardiness greater than −3 h; 11: continuous traffic variables only, tardiness greater than −3 h; 12: continuous traffic zone 3 variables only, tardiness greater than −3 h; 13: discrete traffic variables only, tardiness greater than −3 h; 14: discrete traffic zone 3 variables, tardiness greater than −3 h; 15: all weather and traffic variables discrete, tardiness greater than −3 h; 16: discrete weather per zone/hour and discrete traffic variables, tardiness greater than −3 h; 17: discrete weather per zone/hour and discrete traffic variables with three categories, tardiness greater than −3 h; 18: discrete weather per zone/hour and discrete traffic variables with two categories, tardiness greater than −3 h. (Antoniadis et al., 2021 [52]; Huang et al., 2022 [53]; Lu et al., 2021 [54]; Gaye et al., 2021 [55]; Carrizosa et al., 2021 [56]).

Figure 4. The portions predicted at the specific absolute distance between the predicted and actual arrival gaps. The table displays the best and worst results of each classifier on the basis of the proportion of results with a distance ≤ 2. (a) Adaboost; (b) CART; (c) SVM; (d) RF; (e) KNN.

Table 1. Key factor statistics of logistics arrival time prediction and analysis based on simulation interactive modeling.

Factors	Description
Road capacity	Road capacity refers to the maximum passing capacity of vehicles on the road. Lower capacity will lead to vehicle congestion, extending the time required to cross the road.
Population density	Population density is the number of people per unit area. High-population-density areas usually have more traffic congestion because more vehicles share limited road resources.
Estimated arrival time	The estimated arrival time depends on traffic conditions, driving speed, and distance. The higher the congestion and the slower the driving speed, the longer it will take to reach the destination.
Transportation mode	Different transportation modes (such as land, rail, and water) affect logistics arrival time differently. Each transportation mode has its specific speed and efficiency characteristics that affect the transport time of goods.
Reliability	Reliability indicates the degree to which the logistics transportation can reach the destination on time. Lower reliability means greater uncertainty and risk of delays, which can increase logistics arrival times.
The value of goods	The value of goods will affect the safety and speed of logistics transportation. High-value goods may require more security and protection measures, thus increasing logistics arrival times.
Seasonal factor	Traffic and road conditions in different seasons may change logistics arrival times. For example, adverse weather conditions (such as heavy rain and snow) can lead to traffic congestion and deteriorating road conditions, thereby extending the transport time of goods.
Relationship between supply and demand	The relationship between logistics supply and demand directly impacts the transportation efficiency and speed of goods. The imbalance between supply and demand can lead to congestion and delays, increasing logistics arrival times.
Transportation cost	Transportation cost includes fuel cost, manpower cost, equipment cost, and so on. To reduce transportation costs, logistics operators may adopt strategies such as choosing economical routes or reducing the number of transfers. These strategies may have an impact on the transport time of goods.

Table 2. The contents discussed in each paper.

Authors	Year	Weather	Congestion/ Flow	Speed	Distance	Type of Cargo	Type of Truck	Time of Day/Week/ Month/Year	Cumulative Previous	Accidents/Incidents	Road Work	Traffic Signal	Road Condition	Driving Style	Empirical	Simulated Data	Factor Validation
Hall [4]	1996										X					X
Sheu and Ritchie [5]	1998		X								X				X	X
Yang [6]	1998		X													X
Amini et al. [7]	1998								X						X
Bell [8]	2000													X	X	X
Bates et al. [9]	2001								X						X
Rietveld et al. [10]	2001										X				X
Golob and Regan [11]	2001		X												X
Stathopoulos and Karlaftis [12]	2003		X	X											X
Zhang and Rice [13]	2003		X	X											X
Fowkes et al. [14]	2004				X		X	X	X						X
Wu et al. [15]	2004	X	X	X					X		X				X		X
De Feijter et al. [16]	2004		X													X
Clark and Watling [17]	2005	X									X	X	X			X	X
Van-Lint et al. [18]	2008	X	X												X	X	X
Golob and Regan [19]	2005	X	X					X	X					X		L	X
Lo et al. [20]	2006								X	X	X				X	L
Hollander and Liu [21]	2008	X	X	X		X	X					X			X	N	X
Paterson and Rose [22]	2008		X	X										X	X	L
Yeon et al. [23]	2008	X	X				X							X	X	L	X
Lam et al. [24]	2008	X	X				X		X	X					X	L	X
Van Lint et al. [25]	2008		X				X							X		E
Jula et al. [26]	2008		X				X								X	N
Van Lint [27]	2005		X											X		L
Van Hinsbergen et al. [28]	2009		X											X		L
Nie and Wu [29]	2009						X								X	L
Li et al. [30]	2010	X							X			X	X	X		L	X
Chen and Zhou [31]	2010	X	X				X		X		X				X	L	X
Ng and Waller [32]	2010		X							X					X	L
Figliozzi [33]	2010		X											X		L
Figliozzi [34]	2011		X											X		L
Yu et al. [35]	2011		X											X		L
Fei et al. [36]	2011		X											X		L
Khosravi et al. [37]	2011	X		X	X			X	X				X	X		L	X
Li and Rose [38]	2013	X				X		X	X						X	L	X
Lederman and Wynter [39]	2011	X				X							X	X		L	X
Times mentioned		11	23	7	1	2	4	11	3	12	4	3	3	4	22	17	36

Factor validation abbreviations: L = literature, E = empirical, N = none.

Table 3. Matrix of traffic zones.

Traffic and Weather Zones	Distance	Scheduled Arrival Time (h)	Departure City Zone
Zone 1	0–80 km	0–1	Ningbo
Zone 2	80–160 km	1–2	Shaoxing, Taizhou
Zone 3	160–240 km	2–3	Jinhua, Hangzhou, Wenzhou
Zone 4	240–320 km	3–4	Quzhou
Zone 5	320–400 km	4–5	Nanjing
Zone 6	400–480 km	5–6	Hefei
Zone 7	480–560 km	6–7	Jiujiang
Zone 8	560–640 km	7–8	Xuzhou
Zone 9	640–720 km	8–9	Shanghai
Zone 10	720–800 km	9–10	Wuhan
Zone 11	800–880 km	10–11	Changsha
Zone 12	880–960 km	11–12	Chongqing
Zone 13	960–1040 km	12–13	Chengdu
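
The zone layout in Table 3 follows a regular pattern: each zone spans 80 km and one hour of scheduled arrival time. A minimal sketch of that mapping is given below. The function names and the handling of boundary values (exactly 80 km is assigned to Zone 1) are illustrative assumptions, not taken from the study.

```python
import math

ZONE_WIDTH_KM = 80  # each zone in Table 3 spans 80 km
MAX_ZONE = 13       # Table 3 lists Zones 1 to 13 (up to 1040 km)


def zone_for_distance(distance_km: float) -> int:
    """Return the Table 3 zone number for a distance in km.

    Assumption: a boundary value such as exactly 80 km belongs to the
    lower zone; Table 3 does not say which zone owns the boundary.
    """
    if distance_km < 0 or distance_km > ZONE_WIDTH_KM * MAX_ZONE:
        raise ValueError("distance outside the 0-1040 km range of Table 3")
    return max(1, math.ceil(distance_km / ZONE_WIDTH_KM))


def scheduled_arrival_window_h(zone: int) -> tuple:
    """Return the (start, end) scheduled arrival window in hours for a zone."""
    if not 1 <= zone <= MAX_ZONE:
        raise ValueError("zone must be between 1 and 13")
    return (zone - 1, zone)


if __name__ == "__main__":
    z = zone_for_distance(250)
    print(z, scheduled_arrival_window_h(z))
```

For a distance of 250 km, the sketch returns Zone 4 with a scheduled arrival window of 3 to 4 h, which matches the corresponding row of Table 3.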

Table 4. Statistics on the arrival time of logistics vehicles at container terminals.

Lateness	Frequency	Percentage (%)	Average Delay (min)
More than 8 h early	1500	8	−480
−8 to −6 h	400	2	−420
−6 to −4 h	300	2	−300
−4 to −2 h	600	3	−180
−2 to 0 h	1200	6	−60
On-time (0 h)	5000	25	0
0 to 2 h late	5000	25	60
2 to 4 h late	2000	10	180
4 to 6 h late	1000	5	300
More than 6 h late	1000	5	360
Total	20,000	100

Table 5. Frequency of the number of factors filled out in the survey.

Factors Filled Out	Frequency	Percentage (%)
0	20	10
1	15	8
2	10	5
3	5	3
4	30	15
5	40	20
6	35	18
7	25	13
8 or more	20	10
Total	200	100
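
The percentage column of Table 5 can be recomputed from the frequency column. The sketch below assumes that percentages were rounded to whole numbers with halves rounded up (so 7.5 becomes 8); this rounding rule is inferred from the table values, and the helper name `percent_half_up` is hypothetical.

```python
def percent_half_up(count: int, total: int) -> int:
    """Whole-number percentage of count/total, rounding halves up.

    Integer arithmetic is used so that values such as 7.5 are not
    affected by floating-point error or by banker's rounding.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return (200 * count + total) // (2 * total)


# Frequencies copied from Table 5 (number of factors filled out -> responses).
frequencies = {
    "0": 20, "1": 15, "2": 10, "3": 5, "4": 30,
    "5": 40, "6": 35, "7": 25, "8 or more": 20,
}

total = sum(frequencies.values())
percentages = {k: percent_half_up(v, total) for k, v in frequencies.items()}

if __name__ == "__main__":
    for label, pct in percentages.items():
        print(f"{label}\t{frequencies[label]}\t{pct}")
    print(f"Total\t{total}")
```

Because each row is rounded independently, the rounded percentages need not add up to exactly 100.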

Table 6. Pearson’s r correlation coefficients between questionnaire variables.

Factors	Weather	Traffic	Driving	Cargo	Planning	Distance	Truck/Trailer	Time of Day
Weather	1	0.401	−0.101	−0.202	−0.125	−0.245	−0.425	−0.651
Traffic	0.401	1	0.301	0.202	0.301	0.202	0.301	0.301
Driving	−0.101	0.301	1	0.270	0.602	0.701	0.601	0.525
Cargo	−0.202	0.202	0.270	1	−0.157	0.342	−0.152	−0.152
Planning	−0.125	0.301	0.602	0.157	1	0.282	−0.157	−0.135
Distance	−0.245	0.202	0.701	0.342	0.282	1	−0.136	−0.159
Truck/trailer	−0.425	0.301	0.601	0.152	0.157	0.136	1	−0.057
Time of day	−0.651	0.301	0.525	0.152	0.135	0.159	−0.057	1

Note: Bold values indicate a significant correlation at α = 0.05. This significance level means that there is a 5% risk of concluding that a correlation exists when there is actually none.
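
For reference, Pearson's r and the t statistic commonly used to test it against zero can be computed as sketched below. The sample data are hypothetical and are not taken from the questionnaire; the function names are illustrative only.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom (requires |r| < 1)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


if __name__ == "__main__":
    # Hypothetical scores for two questionnaire factors from five respondents.
    x = [1, 2, 3, 4, 5]
    y = [1, 3, 2, 5, 4]
    r = pearson_r(x, y)
    print(round(r, 3), round(t_statistic(r, len(x)), 3))
```

The correlation is significant at α = 0.05 when the absolute t statistic exceeds the two-sided critical value of the t distribution with n − 2 degrees of freedom.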

Table 7. The variable numbers in the dataset.

Variable Name	Number
tardiness	1
mean.wind	2
maxgust	3
meantemp	4
mintemp	5
maxtemp	6
max.traffic1	7
min.traffic1	8
mean.traffic1	9
max.traffic2	10
min.traffic2	11
mean.traffic2	12
max.traffic3	13
min.traffic3	14
mean.traffic3	15
mean_t.1	16
mean_w.1	17
max_w.1	18
precip_amount.1	19
fog.1	20
rain.1	21
mean_t.2	22

Table 8. Statistical results of correlation strength values between variables in the dataset.

	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22
1	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-
2	−0.01	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-
3	0	0.78	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-
4	0.01	−0.13	0.03	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-
5	0	0.09	0.15	0.92	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-
6	−0.2	−0.24	−0.05	0.96	0.82	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-
7	−0.2	0.14	0.01	−0.03	−0.03	−0.14	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-
8	−0.18	0.06	0.01	0.14	0.06	0.03	0.81	-	-	-	-	-	-	-	-	-	-	-	-	-	-	-
9	−0.19	0.03	0.03	−0.01	0.03	−0.03	0.98	0.88	-	-	-	-	-	-	-	-	-	-	-	-	-	-
10	−0.19	0.01	0	−0.03	−0.03	−0.14	0.84	0.83	0.87	-	-	-	-	-	-	-	-	-	-	-	-	-
11	−0.17	−0.03	−0.03	0.08	0.09	0.07	0.76	0.91	0.83	0.88	-	-	-	-	-	-	-	-	-	-	-	-
12	−0.21	0.01	0	−0.01	0	−0.03	0.85	0.87	0.88	0.98	0.91	-	-	-	-	-	-	-	-	-	-	-
13	−0.18	0.01	0	0	0.03	−0.03	0.68	0.76	0.71	0.92	0.83	0.93	-	-	-	-	-	-	-	-	-	-
14	−0.07	0.14	0.03	0.03	0.01	−0.03	0.58	0.84	0.66	0.81	0.87	0.81	0.83	-	-	-	-	-	-	-	-	-
15	−0.03	0.01	0.03	−0.03	0.03	−0.01	0.65	0.78	0.68	0.91	0.84	0.92	0.98	0.88	-	-	-	-	-	-	-	-
16	−0.03	−0.14	0.14	0.89	0.84	0.83	0.22	0.41	0.26	0.29	0.38	0.3	0.3	0.37	0.31	-	-	-	-	-	-	-
17	0.01	0.82	0.65	−0.05	0.14	−0.15	0.14	0.25	0.16	0.19	0.22	0.2	0.21	0.25	0.22	0.09	-	-	-	-	-	-
18	0.03	0.8	0.66	−0.14	0.14	−0.15	0.17	0.28	0.19	0.22	0.26	0.23	0.23	0.28	0.25	0.12	0.99	-	-	-	-	-
19	0.01	0.13	0.13	−0.14	0.05	−0.07	0	−0.03	−0.03	0	−0.01	0	0.03	−0.03	0	−0.03	0.17	0.19	-	-	-	-
20	0.03	−0.22	−0.22	0.03	−0.03	0.07	−0.16	−0.22	−0.17	−0.2	−0.2	−0.2	−0.2	−0.21	−0.2	−0.13	−0.3	−0.3	−0.14	-	-	-
21	0.01	0.34	0.3	−0.07	0.08	−0.15	−0.03	−0.01	−0.03	−0.01	−0.03	−0.01	−0.03	−0.01	−0.01	−0.03	0.39	0.41	0.5	−0.15	-	-
22	−0.05	−0.03	0.05	0.9	0.86	0.84	0.15	0.38	0.2	0.24	0.35	0.26	0.27	0.36	0.29	0.98	0.11	0.14	−0.01	−0.11	0.03	-

Note: α = 0.05. There is no fixed rule for classifying a correlation as strong, moderate, or weak. For data of this kind, correlations (in absolute value) above 0.4 are usually considered relatively strong, those between 0.2 and 0.4 moderate, and those below 0.2 weak.
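
The thresholds in the note can be applied mechanically to the entries of Table 8, as in the sketch below. It assumes that the thresholds refer to the absolute value of the coefficient and that a value of exactly 0.2 or 0.4 counts as moderate; the function name is hypothetical.

```python
def correlation_strength(r: float) -> str:
    """Classify a correlation coefficient using the thresholds in the note.

    Assumptions: the absolute value is used, and the boundary values
    0.2 and 0.4 are counted as moderate.
    """
    magnitude = abs(r)
    if magnitude > 0.4:
        return "strong"
    if magnitude >= 0.2:
        return "moderate"
    return "weak"


# Selected entries from Table 8; variable names follow Table 7.
examples = {
    ("tardiness", "mean.wind"): -0.01,
    ("tardiness", "maxtemp"): -0.2,
    ("tardiness", "mean.traffic1"): -0.19,
    ("tardiness", "mean.traffic2"): -0.21,
    ("mean.wind", "maxgust"): 0.78,
}

if __name__ == "__main__":
    for (a, b), r in examples.items():
        print(f"{a} vs {b}: {r} -> {correlation_strength(r)}")
```

Under these assumptions, the correlation between tardiness and mean.traffic2 (−0.21) is classed as moderate, while that between tardiness and mean.wind (−0.01) is weak.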

Table 9. Test results: mean proportion of predictions within a distance of ≤ 2 categories (i.e., 1 h) of the actual category.

Tests	RF (Antoniadis et al., 2021) [52]	Adaboost (Huang et al., 2022) [53]	KNN (Lu et al., 2021) [54]
All variables are continuous	0.664	0.660	0.663
Only the traffic variables are continuous	0.601	0.595	0.595
Only the traffic zone 3 variables are continuous	0.582	0.580	0.573
Discrete traffic variables only	0.580	0.578	0.598
Discrete traffic zone 3 variables only	0.573	0.580	0.574
All weather and traffic variables are discrete	0.666	0.664	0.663
Discrete weather per zone/hour and discrete traffic variables	0.665	0.664	0.659
Discrete weather per zone/hour and discrete traffic variables, tardiness greater than −3 h	0.735	0.733	0.728
Discrete weather per zone/hour and discrete traffic variables with three categories, tardiness greater than −3 h	0.735	0.733	0.730
Discrete weather per zone/hour and discrete traffic variables with two categories, tardiness greater than −3 h	0.734	0.733	0.729
Max	0.737	0.733	0.729
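
The score reported in Table 9 counts a prediction as correct when it falls within two categories of the actual tardiness category. A minimal sketch of such a tolerance-based accuracy is shown below. It assumes that categories are encoded as consecutive integers, each covering 30 min (so that a distance of 2 corresponds to 1 h); the function name and the sample values are hypothetical.

```python
def within_tolerance_accuracy(actual, predicted, tolerance=2):
    """Share of predictions whose category index is within `tolerance`
    of the actual category index.

    Assumption: categories are ordinal and encoded as consecutive integers.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and equal in length")
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= tolerance)
    return hits / len(actual)


if __name__ == "__main__":
    # Hypothetical category indices for five trucks.
    actual = [0, 3, 5, 8, 10]
    predicted = [1, 6, 5, 6, 13]
    print(within_tolerance_accuracy(actual, predicted))
```

In this example three of the five predictions lie within two categories of the actual value, so the score is 0.6. Averaging this score over repeated test runs would give a figure comparable in form to the entries of Table 9.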

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

