4.2. Determining the Parameters for Two-Tier Architecture
Suburbs with weight and volume less than 3% of the total were neglected to simplify the model. The driver’s main service areas are concentrated in Wigram and Sockburn. Therefore, addresses in Wigram and Sockburn were considered as one cluster. The proportion of consignment quantities, weight and volume for suburbs in Christchurch is shown in
Table 3.
One-year consignment data were input in ArcGIS. The number of data points was 3195.
Figure 4 exhibits hotspots of consignment data with attributes of the consignment number, weight and volume in ArcGIS. These hotspots were calculated based on the target features within Euclidean Distance [
61].
All three hotspot maps show a similar distribution by different attributes. Therefore, the centre of the cluster can incorporate the feature of the number of consignments, consignment weight and volume in the two-tier architecture.
Individual customer locations are challenging to embody in logistics models. Therefore, clusters may be considered in models instead of specific locations. The principle of cluster analysis is to make the objects in the same cluster have the greatest possible similarity. Then a representative centre is applied to express the cluster [
62]. Cluster centres and centroids have emerged in research, including wireless sensor networks [
63], the internet of things [
64], biological systems [
65] and the global navigation satellite system [
66].
Cluster centres are also defined in the K-means clustering algorithm. K-means clustering is based on the concept of dividing
n points into
k clusters, so that each point belongs to the cluster corresponding to the nearest centre [
67]. Combined with the K-means clustering algorithm, GIS may be used to identify the peak travel of residents and hot spots for taxis [
68]. K-means clustering was used to establish hot spots where road accidents occur [
69]. Similarly, a modified clustering method (DP-Dip) has been used to estimate the centre of the cluster [
70]. DP-Dip does not make any assumptions about the data distribution and only admits that each cluster has a unimodal distribution. This method adaptively splits some clusters according to their density.
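The assign-to-nearest-centre principle of K-means can be illustrated with a minimal sketch of Lloyd's algorithm. This is not the implementation used in the cited studies; the function name `kmeans`, the `init` parameter and the toy coordinates are assumptions made for illustration only.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0, init=None):
    """Lloyd's algorithm for K-means on 2-D points given as (x, y) tuples."""
    rng = random.Random(seed)
    # Use the supplied starting centres, or draw k data points at random.
    centres = list(init) if init is not None else rng.sample(points, k)

    def nearest(p):
        # Index of the centre closest to point p (Euclidean distance).
        return min(range(k), key=lambda j: math.dist(p, centres[j]))

    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p)].append(p)
        new_centres = []
        for j, members in enumerate(clusters):
            if members:
                mx = sum(x for x, _ in members) / len(members)
                my = sum(y for _, y in members) / len(members)
                new_centres.append((mx, my))
            else:
                new_centres.append(centres[j])  # leave an empty cluster's centre in place
        if new_centres == centres:
            break  # centres are stable, so assignments can no longer change
        centres = new_centres

    labels = [nearest(p) for p in points]
    return centres, labels


# Toy data (hypothetical): two well-separated groups of addresses.
toy_points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
toy_centres, toy_labels = kmeans(toy_points, 2, init=[(0, 0), (11, 11)])
```

The result depends on the starting centres: a poor random start can leave the algorithm in a local optimum, which is why the example passes explicit starting centres.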
The benefits of clustering are that it provides a simple representation of the problem, which may be used to infer structure in a wider data set of addresses. However, applications of clustering to last-mile delivery are scarce. Successful applications include food delivery [
71]. Costs were estimated by K-means clustering for restaurants in a city. However, the difficulty with the clustering approach for logistics is the high degree to which the operational reality has been simplified. Therefore, a cluster model can represent an average operational day, but does not correspond to the operational reality for any one day. Also, vehicles do not actually travel backwards and forwards from a cluster centre, but rather follow a delivery run. In situations where the delivery addresses have high day-to-day variability, clustering methods are inadequate because vehicle routing needs to be represented.
Mean centres and median centres with volume and weight attributes were calculated in ArcGIS, as shown in
Figure 5. It is observed that the multiple measures of centredness are all very similar.
The mean centre identifies the geographic centre (or the centre of concentration) for a set of features. It is the average of the x and y coordinates of all locations. The median centre identifies the location that minimises the overall Euclidean distance to the features in a dataset. The mean centre $(\bar{X}, \bar{Y})$ is given by Equation (5):

$$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n}, \qquad \bar{Y} = \frac{\sum_{i=1}^{n} y_i}{n} \tag{5}$$

where $x_i$, $y_i$ are the coordinates for feature $i$, and $n$ is the total number of features. In comparison, the median centre $(u, v)$ is typically found by minimising the Euclidean distance $d_i$ from each point to the centre, as Equation (6) shows:

$$\min_{(u, v)} \sum_{i=1}^{n} d_i, \qquad d_i = \sqrt{(x_i - u)^2 + (y_i - v)^2} \tag{6}$$

Kuhn and Kuenne [72] introduced the method of calculating the median centre, and Burt and Barber [73] provided an iterative process.
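The two measures of centredness can be compared with a short sketch. The mean centre is a direct average of the coordinates. For the median centre, a Weiszfeld-style iteration is assumed here; it may differ in detail from the iterative process of Burt and Barber [73] and from the ArcGIS implementation. The function names and the sample coordinates are illustrative only.

```python
import math


def mean_centre(points):
    """Average of the x and y coordinates of all points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def total_distance(centre, points):
    """Sum of Euclidean distances from every point to the centre."""
    return sum(math.dist(centre, p) for p in points)


def median_centre(points, tol=1e-9, max_iter=1000):
    """Weiszfeld-style iteration towards the point minimising total_distance."""
    cx, cy = mean_centre(points)  # start from the mean centre
    for _ in range(max_iter):
        wsum = xsum = ysum = 0.0
        for x, y in points:
            d = math.hypot(x - cx, y - cy)
            if d < tol:
                continue  # simplification: skip a point that coincides with the estimate
            wsum += 1.0 / d
            xsum += x / d
            ysum += y / d
        if wsum == 0.0:
            break  # every point coincides with the estimate
        nx, ny = xsum / wsum, ysum / wsum
        shift = math.hypot(nx - cx, ny - cy)
        cx, cy = nx, ny
        if shift < tol:
            break
    return (cx, cy)


# Hypothetical addresses: four close together and one remote outlier.
sample_points = [(0, 0), (1, 0), (0, 1), (1, 1), (100, 100)]
sample_mean = mean_centre(sample_points)
sample_median = median_centre(sample_points)
```

In this example the outlier pulls the mean centre well away from the group of four addresses, whereas the median centre stays close to them. When the points are distributed symmetrically, the two centres coincide.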
It is observed that the six centres are geographically similar because weight values make small differences in hotspots, as
Figure 3 shows.
The purpose of the median centre is to find a centre that approximates each data point. The minimum distance was not involved in this research, so median centres were excluded. In addition, as
Figure 5 shows, distributions of data points with weight, volume and consignment number are similar, which results in the mean centres with these factors being approximately coincident. For convenience of data processing, the mean centre with consignment number was selected to represent the cluster centre in the two-tier architecture, consistent with [
74].
The first-tier distance is 762.5 m, and all Euclidean distances were fitted to a Gamma distribution, as shown in
Figure 6. The result was computed statistically with Statistica.
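As an illustration of fitting a Gamma distribution to a set of distances, the following sketch uses the method of moments. This is an assumption made for simplicity: the fitting procedure used in Statistica is not stated here and may differ (for example, maximum likelihood). The sample distances are synthetic, not the consignment data.

```python
import random
import statistics


def fit_gamma_moments(samples):
    """Method-of-moments estimates (shape, scale) for a Gamma distribution.

    For a Gamma distribution, mean = shape * scale and variance = shape * scale**2.
    """
    m = statistics.fmean(samples)
    v = statistics.variance(samples)
    shape = m * m / v
    scale = v / m
    return shape, scale


# Synthetic Euclidean distances in metres; shape 2.0 and scale 300.0 are
# placeholder values, not the parameters fitted in this study.
rng = random.Random(1)
distances = [rng.gammavariate(2.0, 300.0) for _ in range(5000)]
shape_hat, scale_hat = fit_gamma_moments(distances)
```

With a sufficiently large sample, the estimates should lie close to the parameters used to generate the synthetic data.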
Ensembles of 10 and 15 consignment addresses were randomly selected from the consignment data. Example TSP routes for the 10- and 15-consignment deliveries are shown in
Figure 7 and
Figure 8, respectively. The full results are listed in
Figure A1 and
Figure A2.
Note that the random selection was done by consignments, not addresses, to reflect actual operations. The difference is small: it causes an occasional duplicate address, because consolidation means that one address may receive multiple consignments from different senders. The models allow the truck to economise in these cases, which is realistic. The effect is to broaden the uncertainty in the models, which is conservative.
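The selection-by-consignment step and the subsequent route determination can be sketched as follows. The sketch is not the ArcGIS procedure used in this study: it assumes straight-line distances rather than a road network, and it solves the TSP exactly with the Held-Karp dynamic programme. The coordinates and the function name `tsp_held_karp` are hypothetical.

```python
import itertools
import math
import random


def tsp_held_karp(depot, stops):
    """Exact shortest closed tour depot -> every stop -> depot (Held-Karp)."""
    if not stops:
        return 0.0, []
    pts = [depot] + list(stops)
    n = len(pts)
    d = [[math.dist(a, b) for b in pts] for a in pts]

    # best[(mask, j)] = (cost, previous stop) of the cheapest path that starts
    # at the depot, visits exactly the stops in bitmask `mask`, and ends at j.
    best = {(1 << j, j): (d[0][j], 0) for j in range(1, n)}
    for size in range(2, n):
        for subset in itertools.combinations(range(1, n), size):
            mask = sum(1 << j for j in subset)
            for j in subset:
                prev_mask = mask ^ (1 << j)
                best[(mask, j)] = min(
                    (best[(prev_mask, k)][0] + d[k][j], k)
                    for k in subset if k != j
                )

    full = sum(1 << j for j in range(1, n))
    cost, last = min((best[(full, j)][0] + d[j][0], j) for j in range(1, n))

    # Follow the stored predecessors backwards to recover the visiting order.
    order, mask = [], full
    while last != 0:
        order.append(last)
        mask, last = mask ^ (1 << last), best[(mask, last)][1]
    route = [pts[j] for j in reversed(order)]
    return cost, route


# Hypothetical consignment list: one (x, y) address per consignment, so an
# address that receives several consignments appears several times.
consignments = [(120, 40), (300, 310), (120, 40), (50, 400), (410, 90), (260, 150)]
selected = random.Random(0).sample(consignments, 4)  # select by consignment
length, route = tsp_held_karp((0, 0), selected)
```

A duplicate address contributes a zero-length leg, which is how the truck economises when one address receives several consignments. The exact method is practical for about 10 stops; its run time grows exponentially with the number of stops, so a heuristic would normally be preferred for larger cases.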
The total distances were obtained from the two-tier architecture and TSP models.
Table 4 presents the results of 10 and 15 consignment cases.
The R-value for each run was used to form a normal distribution with a mean of 0.806 and a standard deviation of 0.153. It was applied to
n = 10, 11, 12, 13, 14, 15 cases (number of consignments). Therefore, the second-tier distance was formed by multiplying the gamma distribution and the normal distribution.
Table 5 indicates the parameters of the two-tier architecture for the cluster delivery.
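The construction of the second-tier distance as a product of two random variables can be sketched as below. The normal parameters (0.806 and 0.153) are those reported above. The Gamma shape and scale are placeholders, not the fitted values, and independence of the two draws is assumed.

```python
import random


def second_tier_distance(rng, gamma_shape, gamma_scale, r_mean=0.806, r_sd=0.153):
    """One second-tier distance: Gamma-distributed Euclidean distance times R."""
    euclidean = rng.gammavariate(gamma_shape, gamma_scale)  # metres
    r = rng.gauss(r_mean, r_sd)  # R-value drawn from the normal distribution
    return euclidean * r


rng = random.Random(42)
# gamma_shape=2.0 and gamma_scale=300.0 are placeholders, not the fitted values.
draws = [second_tier_distance(rng, 2.0, 300.0) for _ in range(20000)]
mean_draw = sum(draws) / len(draws)
```

Because the two draws are independent in this sketch, the mean of the product approaches the product of the means (shape × scale × 0.806) as the number of draws increases.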
4.3. Results and Validation of Two-Tier Architecture
To validate the TSP simulation results, a delivery case for
n = 10 was selected and the corresponding customer locations were input to ArcGIS. The TSP simulation was conducted and compared with real truck GPS data, see
Figure 9.
The close correspondence between the GPS and TSP results indicates that the truck driver intuitively found the optimal route for the delivery. This means the TSP result can generally reflect the actual delivery route.
Stochastic simulations were conducted for each n case with 100 replications. The mean values of route distance are shown in
Figure 10.
The results in
Figure 10 show reasonably close agreement between the two methods. The TSP method is the more accurate, but is not a practical method from an operational perspective because of the large amount of effort required to implement it. Hence the two-tier method has significant practical advantages. As the second stage of validation of the model, an additional set of ten TSP simulations was conducted for
n = 12 using the method described above. These results were then compared to the corresponding two-tier result for
n = 12. An ANOVA shows the differences are not significant [F(1, 108) = 0.01745,
p = 0.895]; see the box-and-whisker plot in
Figure 11. The mean for the two-tier architecture is 6560 m and for TSP 6595 m.
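The F statistic of a one-way ANOVA can be computed from first principles, as the following sketch shows. It is illustrative only and uses made-up numbers rather than the simulation outputs; the p-value is omitted because it requires the F distribution, which is not in the Python standard library.

```python
import statistics


def one_way_anova(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        sum((x - statistics.fmean(g)) ** 2 for x in g) for g in groups
    )
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up example with two small groups.
f_example, df1_example, df2_example = one_way_anova([1, 2, 3], [2, 3, 4])
```

With two groups of 100 and 10 observations, the degrees of freedom are (1, 108), which would be consistent with the reported statistic if the 100 two-tier replications were compared with the ten TSP simulations.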
The validation compares the distances estimated by the two-tier architecture for two cases, and the mean value approximates the GPS data. Therefore, freight delivery with stochastic customer locations can be validly represented by the two-tier architecture.
Hence the entire delivery last-mile region can be simplified into a cluster model with a functional dependency on the number of consignments. This potentially moves the field forward, because it allows the complexity of a variable last-mile PUD situation to be reduced into a stochastic formulation. This can be used for operational planning purposes, in real-time, by use of a suitable stochastic engine. Such an engine might be @Risk, or DES software for a more comprehensive model. A DES model could, in principle, be expanded to include multiple such suburbs, as well as the additional complexity of consolidation and line haul.
While the resulting model is relatively simple, the key enabling method is GIS. This is because the TSP algorithm within GIS provides the virtual data representing the real route. The real route could alternatively be obtained from GPS, which would be superior. However, GPS data are limited and difficult to interpret, and not always available. Furthermore, the proposed scheme using GIS overcomes the problem when the delivery runs are not yet established, as occurs in a new PUD territory.
4.4. Discrete-Event Simulation (DES) in Arena®
DES gives the opportunity to incorporate not only the distance, but also other operational realities such as time taken, truck utilisation, etc.
The simulation model was developed in Arena in accordance with the two-tier architecture: (a) first-tier movement, (b) second-tier movement, and (c) the return to the depot. The architecture of the DES model is shown in
Figure 12.
A description of the modelling approach follows. Firstly, in
Figure 12a, consignments were generated with random addresses. Then, weight and volume were assigned to each consignment from probability distributions. Freight consignments were loaded by a forklift and consolidated on the truck. The forklift was assumed to carry one consignment for each movement. Secondly, in
Figure 12b, when the truck finished the first-tier movement, the truckload sequentially completed the second-tier movement with stochastic distances. The freight was assumed to be unloaded by customers. Last, after completing the transportation of all consignments, a return module was applied to the truck. As
Figure 12c shows, the truck was moved back to the depot.
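The three phases described above can be approximated by a simplified Monte Carlo accumulation of times. This is not the Arena model and not an event-queue DES: it ignores queueing and resource contention. Apart from the first-tier distance (762.5 m) and the R-value parameters (0.806, 0.153), every input is a hypothetical placeholder, and the return leg is assumed to equal the first-tier distance.

```python
import random


def simulate_delivery(n, rng, first_tier_m=762.5, speed_m_per_min=300.0,
                      load_min=(1.0, 3.0), unload_min=(2.0, 6.0),
                      gamma_shape=2.0, gamma_scale_m=300.0,
                      r_mean=0.806, r_sd=0.153):
    """Accumulate loading, travel and unloading times (minutes) for n consignments."""
    # (a) One forklift loads the consignments one at a time, then the truck
    #     makes the first-tier movement from the depot to the cluster centre.
    loading = sum(rng.uniform(*load_min) for _ in range(n))
    travel_m = first_tier_m

    # (b) Second-tier movements with stochastic distances; the customer unloads.
    unloading = 0.0
    for _ in range(n):
        travel_m += rng.gammavariate(gamma_shape, gamma_scale_m) * rng.gauss(r_mean, r_sd)
        unloading += rng.uniform(*unload_min)

    # (c) Return to the depot (assumed equal to the first-tier distance).
    travel_m += first_tier_m

    travel = travel_m / speed_m_per_min
    return {
        "loading": loading,
        "travel": travel,
        "unloading": unloading,
        "total": loading + travel + unloading,
    }


# 100 replications of a 10-consignment run with placeholder inputs.
rng = random.Random(1)
runs = [simulate_delivery(10, rng) for _ in range(100)]
mean_total = sum(r["total"] for r in runs) / len(runs)
```

Separating the time components in this way makes it possible to see which activity dominates the total time under a given set of inputs.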
Simulation inputs are shown in
Table 6. The truck speed and freight unloading time were obtained from the GPS data. The speed includes the truck stop and start time on roads. Distributions of consignment weight and volume were fitted based on the consignment data. The freight loading time was recorded on-site. First-tier and second-tier distances were obtained from
Section 4.2.
The total time for the truck delivery
Ttotal in the simulation is theoretically calculated by Equation (7).
4.5. Simulation Results
The simulation was run for 100 replications for each of the 10 and 15 consignment cases.
Figure 13 presents simulation results for time values.
The travel time accounts for a small portion of the total time in both cases. This means most of the time was spent on freight loading and unloading. In addition, the average queueing time for each consignment is large in the delivery operations. The loading and unloading activities could be improved by adding more forklifts or optimising depot operations.
The freight volume is also of concern to freight companies. The truckload weight and volume were obtained from the simulation, see
Table 7. The ideal weight limit and volume limit for the PUD truck are 11 tons and 40 m³. However, the actual volume limit is 70% of the ideal limit from the consideration of health and safety, which is 28 m³. All truckload results, including maximum values, are under the truck limits. When the consignment number is 10, the capacity utilisation is low.