A Crowdsourcing Approach for Sustainable Last Mile Delivery

Abstract: Sustainable transportation is one of the major concerns in cities. This concern involves all types of movement motivated by different goals (mobility of citizens, transportation of goods and parcels, etc.). The main goal of this work is to provide an intelligent approach for Sustainable Last Mile Delivery by reducing (or even eliminating) the need for dedicated logistic moves (by cars and/or trucks). The method attempts to reduce the number of movements originated by parcel delivery by taking advantage of the citizens' movements. In this way, our proposal follows a crowdsourcing approach in which the citizens who move through the city because of their own needs become temporal deliverers. The technology behind our approach relies on Multi-Agent System techniques and complex network-based algorithms for optimizing sustainable delivery routes. These artificial intelligence approaches help to reduce the complexity of the scenario, providing an efficient way to integrate the citizens' routes, which can be executed using the different transportation means and networks available in the city (public system, private transportation, eco-vehicle sharing systems, etc.). A complex network-based algorithm is used for computing and proposing an optimized Sustainable Last Mile Delivery route to the crowd. Moreover, the executed tests show the feasibility of the proposed solution, together with a high reduction of the CO₂ emissions coming from the delivery trucks that, in the case studies, are no longer needed for delivery.

In [8], the authors proposed a methodological approach that applies a crowdsourcing solution to LMD. In [9], a case study is presented in which a crowdsourcing approach is used for library deliveries.
Despite these attempts, there is still a lack of an appropriate treatment of sustainability issues when optimizing the routes.

Transport Networks and Complex Networks Analysis
Transport networks have been one of the areas of application of complex network science. Usually, the static and dynamic views of transportation networks are studied separately. Ducruet and Lugo [10] provided a complete review of how complex networks have been used in this domain. The topology of these networks has a great impact on their performance. It has been studied theoretically to determine how it affects the efficiency of movements across the city [11]. Characteristics such as a scale-free degree distribution or the small-world properties of the network determine the effectiveness of the transport network or its resilience. It is also important to observe the network at different scales [12] and to follow its evolution over time, as has been done with the public transport rail network of Kuala Lumpur [13] or the U.S. Air Transportation Network [14].
Air transportation is one of the most relevant practical domains in which complex networks have been applied [14,15]. The most relevant topic is the characterization of delays and the study of how they propagate. Fleurquin et al. [16] studied the delays in the US air transportation network through the topological properties of the network and the aircraft rotation. The paper shows that airports in remote areas tend to accumulate longer delays and that these delays mainly affect the destination airports. Furthermore, these delays can propagate, affecting the neighborhood of the airport and, eventually, involving a significant part of the network [17]. Resilience to failures is an extremely important feature in airports [18], and diffusion models, such as epidemiological models, are used to study the propagation of delays and congestion, yielding more accurate predictions than those obtained with probabilistic models [19].
In the specific case of urban transport, several views can be considered. Usually, the network models the physical connections among stops and stations of the different transport modes. For example, Haznagy [20] studied the urban transportation systems of five Hungarian cities, taking into account the capacities of the different lines for the available transports, which are modeled as weights in the network. Network characterization and centrality measures give the authors a detailed picture of the differences in the public transport organization in each city. Zhong et al. [21] created a network from travel records in which nodes are urban areas and edges denote the possibility of travel between two areas, weighted by the number of trips made. Graph properties provide a view of the global travel demand, centrality measures identify hubs in the network, and community detection uncovers socioeconomic clusters. An alternative view considers how people move, detecting patterns [22]. In this case, the authors did not use the physical transport network; instead, they constructed the network by connecting the origin and destination points of each trip in two cities with edges.
As in the case of airports, the robustness of the transport network is a relevant topic. Resilience depends on the network structure and, taking the traffic flow into account, it is possible to determine which kinds of networks are optimal (i.e., minimize traffic jams and congestion) under low and high traffic density [23]. Furthermore, not all failures are of the same kind. Birdsall et al. [24] distinguished between the types of failure considered. Most studies focus on gradual failures. However, this work considers a sudden failure in a node of the network infrastructure and proposes a methodology to determine the vulnerability of a system and to calculate the cost of the consequences. Besides physical damage, overload can also break the network [25]. To deal with this situation, two network models can be combined: one network that contains edges modeling the routes of the buses, and another in which two nodes are connected if a bus runs between the corresponding bus stops.
Usually, a single transport network is taken into account. However, in reality cities host a combination of transport modes that are interconnected in such a way that the performance of one of them affects the rest. To study the complete public transport network as a whole, multilayer networks seem to be a useful approach [26]. For example, Tsiotas and Polyzos [27] integrated different aviation companies as separate layers in a multiplex network to identify the strategic plans of each company and its contribution to the total air transport network. A practical application of multilayer networks to the whole UK public transport system integrates airports, ferry, docks, rail, metro, coach and bus stations [28]. Aleta et al. [29] showed the relevance of multiplex networks for modeling transportation networks, but also of the superlayer approach, in which all the transport modes are integrated in the same layer. The authors claimed that both are relevant and that each of them shows different, complementary aspects of the network.
An important factor is how the population is distributed along the network. Some works use social network information to model how people move. Lenormand et al. [30] used a Twitter database with 5 million geotagged tweets from 39 countries to explore the usage of social networks in transport networks, discovering differences between railway and road networks. In another work, they studied how people are attracted by different cities when they travel worldwide, using geolocated information from social networks. They selected 58 cities and analyzed their influence [31].
Therefore, according to the related work reviewed above, integrating a complex network analysis module is of interest to improve the quality of the services provided in transport networks.

Materials and Method
This section introduces the resources, data, and concepts that must be included to support the crowdsourcing approach for parcel LMD with the aim of optimizing sustainability goals (CO₂ emissions, moving towards zero dedicated vehicles for urban logistic distribution, etc.) while satisfying, at the same time, the economic and temporal goals. Moreover, the methods and techniques that are used to develop the approach are also described.
Smart transport of parcels in a city [6] is critical for the efficient implementation of Smart Cities. City logistics must consider the optimization of logistics and transport activities in an urban area taking into account three pillars: economic, social and environmental. Logistic activities in an urban area require a set of resources (physical and cyber-physical) that are, sometimes, already in place. Many cities have already adopted the Smart Cities philosophy and provide information platforms that publish open data for developers to build on. These sources provide the materials and resources to work with. Moreover, the cities' public transportation system is another resource that must be integrated into any approach that aspires to reduce the number of dedicated logistics vehicles accessing congested urban areas. The citizens who move in the city due to their own needs and requirements must also be integrated, since in the proposed approach any willing citizen may become a temporal parcel deliverer.
Citizens' movements in cities form a complex information system. To work appropriately with the complex data required, and with the intricate interrelations involved in combining the different transportation means with an optimized collaborative distribution of parcels by citizens, this approach uses a set of intelligent methods. These methods are described below.
In our approach, the collaborative and sustainable distribution of parcels is solved by an open fleet approach [1]. In this approach, the vehicles (from different transportation modes) may be used by different deliverers for transporting parcels while they, as citizens, travel to their own destinations. This parcel distribution scenario has the following characteristics: new parcel delivery requests appear dynamically; the number of vehicles and temporal deliverers is neither fixed nor uniform; control is autonomous (there is no mandatory requirement for a central manager); and the size of the network of potential deliverers may be significant.
The advanced techniques behind our proposal come from the research field of Multi-Agent Systems (MAS) [32]. Three layers support the execution of the proposal. The first layer is the SURF Framework, which provides the required elements for collaborative and trustworthy delivery of parcels. The second layer is the complex network-based module (Transport Network Analysis Module, TNAM), which computes optimized parcel delivery paths and recommends them to the network of temporal deliverers (the crowd). Finally, the third layer is a transportation ontology that facilitates the specification of the particular transportation elements, structures, vehicles, temporal deliverers, routes of the citizens and networks available for moving in the city.
The proposed approach is presented to the users as a mobile application, CALMeD SURF (Crowdsourcing Approach for Last Mile Delivery). In this way, the mobile application encapsulates the complex framework and makes it accessible to two types of users: those who want to have a parcel delivered, and those who wish to serve as occasional deliverers in an urban area. The users register in the system, and CALMeD SURF locates them in the city in real time, sharing their position with the SURF Framework. When there is a delivery request, a dynamic network analysis module uses a network of geo-localized temporal deliverers (provided by SURF) to compute an optimized path for delivering the parcel to its final destination. It is important to point out that multiple objectives are used when calculating the optimized path, such as sustainable means, economic issues, temporal constraints, etc. The optimized path may be constructed as a chain of collaborative deliverers in which the parcel is passed to different deliverers, each of whom covers a different sub-section (sub-path) of the total optimized path. The main objective is to minimize the new emissions originated by paths (or sub-paths) that divert the deliverer from his/her daily routes (to the store, school, home, work, etc.).
The next subsections describe the main components of the proposed system that support CALMeD SURF.

Results: SURF Framework
This section describes the results, namely the intelligent components and layers that support the proposed approach.
The framework that supports CALMeD SURF is an extension of the one presented in [1]. This framework implements a series of services and utilities for open fleet management. The main components of the framework are: a fleet operator, for global control and monitoring of fleets; a fleet coordination, a complex component that offers different services such as fleet tracker, event processing, tasks allocation, predictive redeployment, persuasion module and trust and reputation module; an agent organization, a team of agents that manage the urban transportation entities; and the vehicle layer, the dynamic group of vehicles that forms the fleet.

CALMeD SURF Ontology
CALMeD SURF relies on an ontology for capturing the transportation network and its users in an urban area. The complete ontology is described in [3]. The main components of the ontology are:
• The public transport system, which comprises: public transport infrastructure systems such as bus, tram, metro, and rail; the urban public transport network (see Figure 1); the means of public transport; people; and the network infrastructure.
• The transportation multi-modality or transportation mode.
• A dynamic team of private transportation means for completing some deliveries.
• The features that describe the concepts of the ontology, which are tailored to capture sustainable, economic, and temporal attributes.
• The users, who may play two roles: customer and deliverer.

Transport Network Analysis
The Transport Network Analysis Module (TNAM) is the third principal component that supports CALMeD SURF. Its primary goal is to compute optimized delivery paths and propose them to the crowd. To accomplish this goal, TNAM uses a dynamically generated graph that is provided by the SURF framework from an instantiated CALMeD SURF ontology. In this graph, the nodes are GPS locations in a city tagged as potential deliverer location, customer location or UDC location. A connection between two nodes represents a feasible route plan in the transportation infrastructure of the city connecting the two GPS locations. The route plan may be a walking route, a cycling route, an own-vehicle route, a public transport route, etc., or a combination of them. To bound the computation time of the graph analysis algorithm when dealing with highly connected networks, TNAM allows the length of the connecting routes to be restricted by defining a reachability constraint for the nodes.
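As a minimal sketch of how such a reachability-constrained graph could be assembled, the following Python code connects tagged GPS nodes whose separation is within a given limit. It is not the SURF or TNAM implementation: the names `haversine_m` and `build_reachability_graph`, the node identifiers and coordinates, and the use of great-circle distance as a stand-in for the length of a real route plan are all assumptions made for illustration.

```python
import math
from itertools import combinations

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def build_reachability_graph(nodes, max_route_m):
    """Connect every pair of nodes whose separation is within the constraint.

    nodes: dict mapping a node id to (tag, (lat, lon)), where tag is
    'deliverer', 'customer' or 'udc' (hypothetical labels).
    Assumption: straight-line distance replaces the real route-plan length.
    """
    adjacency = {name: set() for name in nodes}
    for u, v in combinations(nodes, 2):
        if haversine_m(nodes[u][1], nodes[v][1]) <= max_route_m:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


# Illustrative coordinates only; they are not taken from the case study.
nodes = {
    "udc": ("udc", (39.4699, -0.3763)),
    "d1": ("deliverer", (39.4720, -0.3763)),
    "c1": ("customer", (39.5000, -0.3763)),
}
graph = build_reachability_graph(nodes, max_route_m=500)
```

Lowering `max_route_m` prunes edges and therefore reduces the work of any later path search, at the cost of possibly disconnecting some nodes.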
To compute the optimized delivery path, TNAM applies complex network and spatial network analysis, as well as the user activity information.
Recent studies have shown that transportation infrastructures are related and the effects of some events in one of them may affect other networks. Therefore, to facilitate the study of the transportation system in this work, we consider the transport network as a multilayer network.
In a multilayer transport network, each layer represents a different transportation mode (see Section 4.1). For those transportation modes with planned and fixed routes such as the lines from a public transport system, the layer represents those routes. On the other hand, for the transport modes with no fixed lines, such as taxi services, bike rental or car sharing services, the layers need to be generated as available transportation networks in the transportation infrastructure of the city.
To generate a structure that can be integrated with the multilayer network, TNAM uses spatial network techniques.
In Figure 2, the multilayer graph of the public transport system of Valencia (Spain) can be seen. TNAM uses this graph to build a path that allows interchanges among the layers. In this way, deliverers that connect different transportation modes for delivering a given parcel are taken into account when searching for the optimized delivery path.
Let us consider the bike sharing service of the city of Valencia to show how a network is generated for those transportation modes with no fixed routes or lines. The region around a station is defined as the set of points that are closer to this station than to any other. With this information, a Voronoi diagram is generated centered on the locations of the stations (Figure 3, left). This representation generates a tessellation that identifies the position of the bikes. Then, the Delaunay triangulation is obtained as the dual graph that links the stations of adjacent Voronoi regions (Figure 3, right). It is taken as the network associated with the bike sharing service. When a user rides a bike, the route passes through at least two Voronoi regions. We can create a path through the stations associated with each one of these regions, which will be a path over the Delaunay triangulation.
To compute the optimized delivery route and the exchanges needed to reach the final destination (i.e., the customer address), TNAM uses a simple, greedy algorithm. TNAM takes into account the current GPS location of the parcel and analyzes the reachability vicinity (i.e., the GPS locations of the deliverers within the reachability constraint) in the network of deliverers. Any of these deliverers can move through the different layers of transportation modes of the complex network. Among them, TNAM determines which neighbor is closest to the GPS location of the parcel's customer. The delivery path is proposed to these deliverers, and they are free to accept it, engaging in the delivery agreement, or to reject it.
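The bike-sharing construction described above can be illustrated with a short sketch. A point belongs to the Voronoi region of its nearest station, so the sequence of regions crossed by a ride can be approximated by sampling points along the route and recording the nearest station each time it changes. The function names, the planar coordinates and the sampling density are assumptions; the code does not build the Delaunay triangulation explicitly, and a coarse sampling could skip a region that the route only clips.

```python
import math


def nearest_station(point, stations):
    """Return the station whose Voronoi region contains `point`."""
    return min(stations, key=lambda name: math.dist(point, stations[name]))


def station_path(route, stations, samples_per_leg=50):
    """Stations whose Voronoi regions a polyline route crosses, in order.

    route: list of (x, y) waypoints; stations: dict name -> (x, y).
    Assumption: planar coordinates and uniform sampling along each leg.
    """
    path = []
    for (x0, y0), (x1, y1) in zip(route, route[1:]):
        for k in range(samples_per_leg + 1):
            t = k / samples_per_leg
            sample = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            station = nearest_station(sample, stations)
            if not path or path[-1] != station:
                path.append(station)
    return path


# Hypothetical stations on a line and a ride passing alongside them.
stations = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (2.0, 0.0)}
ride = [(0.0, 0.2), (2.0, 0.2)]
crossed = station_path(ride, stations)
```

With a sufficiently fine sampling, consecutive stations in the returned list have adjacent Voronoi regions, so the list corresponds to a walk over the Delaunay edges.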
A proposal rejected by all the neighbors requires a new TNAM analysis to compute a different set of neighbors. This process is repeated until the parcel is finally delivered to its customer. Figure 4 shows an example of this behavior. Each circle contains the coverage distance of the corresponding agent. It represents the area at which the agent can deliver or exchange the parcel with the customer or another agent. In this example, the origin is the green agent. It exchanges the parcel with another agent that is: (i) inside its area of influence; and (ii) near the endpoint (red). This process is repeated until the parcel reaches its destination. Section 5 evaluates different approaches for the agents to choose the most promising alternative.
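The greedy hand-over process described above can be sketched as follows. This is an illustrative reading, not TNAM's code: the function name, the `accepts` callback that models acceptance or rejection of a proposal, and the requirement that every hop strictly reduces the distance to the customer (added here to guarantee termination) are assumptions. When every candidate rejects or no candidate exists, the sketch returns `None` instead of triggering a new analysis.

```python
import math


def greedy_delivery(positions, origin, customer, reach, accepts=lambda agent: True):
    """Forward a parcel hop by hop towards the customer.

    positions: dict agent -> (x, y); reach: coverage distance shared by all agents.
    Returns the chain [origin, ..., customer], or None if the parcel gets stuck.
    """
    def gap(agent):
        # Remaining distance from an agent to the customer.
        return math.dist(positions[agent], positions[customer])

    chain = [origin]
    holder = origin
    while gap(holder) > reach:
        candidates = [
            a for a in positions
            if a != customer and a not in chain
            and math.dist(positions[a], positions[holder]) <= reach
            and gap(a) < gap(holder)   # assumption: only hops that make progress
            and accepts(a)
        ]
        if not candidates:
            return None                # dead end: nobody suitable within reach
        holder = min(candidates, key=gap)
        chain.append(holder)
    chain.append(customer)             # final hand-over to the customer
    return chain


# Hypothetical agents in the unit square.
agents = {"s": (0.0, 0.0), "a": (0.1, 0.0), "b": (0.2, 0.05),
          "c": (0.3, 0.0), "r": (0.4, 0.0)}
chain = greedy_delivery(agents, "s", "r", reach=0.15)
```

Because each step only looks at the current neighborhood, the procedure is cheap but can stop at a local dead end even when a longer path exists.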

Discussion
To validate the proposal, this section analyzes the proposed model and algorithmic approach. The tests are done using the network data model to represent the city and synthetic data to evaluate the performance of the proposed delivery algorithm.
The network model used to represent a generic city is a random geometric graph (RGG) G(n, r). It is a graph defined within the boundaries of the unit square $[0, 1]^2$, where n nodes are uniformly distributed over the area, and two nodes are connected if their Euclidean distance is at most the radius r. These graphs are interesting because they introduce geometrical constraints to the network, which is why they are also called spatial networks. One of their most interesting applications in relation to this work is epidemic spreading, which has many points in common with the problem addressed in this paper.
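The definition of G(n, r) translates almost directly into code. The sketch below uses only the Python standard library; the function name and the `seed` parameter are assumptions, and the quadratic pair loop is chosen for clarity rather than speed.

```python
import math
import random
from itertools import combinations


def random_geometric_graph(n, r, seed=None):
    """Sample G(n, r): n uniform points in the unit square, linked when within r."""
    rng = random.Random(seed)
    positions = [(rng.random(), rng.random()) for _ in range(n)]
    adjacency = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if math.dist(positions[i], positions[j]) <= r:
            adjacency[i].add(j)
            adjacency[j].add(i)
    return positions, adjacency


positions, adjacency = random_geometric_graph(128, 0.14, seed=1)
```

For the network sizes used in this section (up to 1000 nodes) the all-pairs comparison remains affordable; a spatial grid would be the usual optimization for larger n.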
To generalize the model, the area of a city is rescaled to a unit square and the rest of the measures are renormalized to that scale. For example, the city of Valencia lies within a square of side 7 km. The figures in this section use rescaled data: the X-axis represents scaled longitude coordinates of Valencia and the Y-axis represents scaled latitude coordinates of Valencia. Figure 5a shows an example of an RGG formed by 128 nodes. The circles represent the distance of coverage (or reachability constraint) of each node. Two nodes are connected in the graph (Figure 5b) if their distance is under the radius. That means that, if the circles overlap, then there is a connection. In our case, two users can exchange a parcel if they are near each other. In that case, there is a link between both nodes or, equivalently, the intersection between their circles is not empty. To simplify the model, it is assumed that all nodes have the same radius, that is, all the users have the same affordable distance. This constraint can be relaxed without affecting the formalization. It is important to point out that the network must be connected to achieve the desired results. If there are nodes isolated from the rest of the network, they will not be reachable. Furthermore, if there are areas that are not covered by any circle, these parts of the city would remain unattended. There are two ways to avoid these situations:
• Increase the number of nodes
• Increase the radius
In this section, the effects of both strategies are studied.
To ease the interpretation of the results and the comparisons among different cities, let us use some examples. If the city of Valencia is considered and its area is scaled to the unit square, a distance of 0.1 in the RGG is equivalent to 700 m. The population density of the metropolitan area is around 6500 inhabitants per km². Our examples use networks from 100 to 1000 nodes. That means that the proportion of active users of the system varies from 1/60 to 1/6, approximately. The corresponding distances of coverage for the examples shown in Figures 5 and 6 are $d_{cov} = 0.14$ for a network with 128 nodes and $d_{cov} = 0.07$ for a network with 512 nodes, which means distances of 850 m and 450 m, respectively. Figure 6 compares an RGG of 512 nodes using radii r ∈ {0.07, 0.1, 0.15, 0.2}. The lowest value already provides a connected network, so there exists a path between any pair of nodes.
Some properties of the system can be ensured and analyzed through the theoretical study of the properties of the network. RGGs share most of their characteristics with random graphs, but the spatial limitations establish some differences that have to be considered. For example, for an RGG with 512 nodes, the characteristic measures are shown in Table 1. The number of edges indicates the number of connections among the participants. The density measures the proportion of existing edges among all the possible ones (that is, relative to a complete network with a link between every pair of nodes). Next, the average degree denotes how many neighbors the nodes have on average. We can see how it increases quickly with the distance of coverage. For example, in the last row, the network with radius r = 0.20 has an average degree of 53. That means that each user can exchange a parcel with 53 other users. Equivalently, the user's area intersects with the areas of 53 other people. The average shortest path length and the diameter are measures of the exchanges needed to complete a delivery. The diameter of the network (or the geodesic) is the longest of the shortest paths. It is the maximum number of exchanges needed in the worst case. The average shortest path indicates how many transfers are required on average. For example, an average shortest path of 3 means that we can deliver any parcel with just two exchanges, with a maximum trip of 1.5 km (in the case of Valencia). Finally, the clustering coefficient counts how many of the possible triads are present in the network (that is, how many of a node's neighbors are also neighbors of each other). This is an essential factor in networks because the combination of high clustering and short average path lengths defines the appearance of the phenomenon known as small-world. It is crucial because it improves the efficiency of the network: the information (or the deliveries, in this case) needs only a few exchanges.
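The measures discussed above can be computed for any adjacency structure with a few lines of standard-library Python. The sketch below is illustrative: the function names are assumptions, the clustering coefficient is taken to be the average of the local clustering coefficients (the text does not state which variant was used), and the path statistics assume a connected graph with at least one edge.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Hop counts from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def local_clustering(adjacency, u):
    """Fraction of pairs of neighbours of u that are themselves linked."""
    neighbours = list(adjacency[u])
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if neighbours[j] in adjacency[neighbours[i]])
    return 2 * links / (k * (k - 1))


def network_measures(adjacency):
    """Edges, density, average degree, clustering, average path length, diameter."""
    n = len(adjacency)
    edges = sum(len(nb) for nb in adjacency.values()) // 2
    # Hop counts over all ordered pairs of distinct, mutually reachable nodes.
    hops = [d for u in adjacency
            for v, d in bfs_distances(adjacency, u).items() if v != u]
    return {
        "edges": edges,
        "density": 2 * edges / (n * (n - 1)),
        "average_degree": 2 * edges / n,
        "average_clustering": sum(local_clustering(adjacency, u) for u in adjacency) / n,
        "average_shortest_path": sum(hops) / len(hops),
        "diameter": max(hops),
    }


# A four-node chain: three edges, no triangles, longest shortest path of 3 hops.
chain_graph = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
measures = network_measures(chain_graph)
```

Running one breadth-first search per node is enough here because all edges have unit length, so hop counts are the shortest path lengths.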
Figure 7 shows how both measures evolve as the distance of coverage increases. The small-world effect appears for values of r from 0.15 onwards. Figure 8 compares the average shortest path lengths obtained for RGGs from 100 to 1000 nodes at their minimum distance of coverage $d_{cov}$ (the minimum distance that guarantees that all nodes are connected). Bigger networks have longer paths because there are more nodes in between and $d_{cov}$ generates the worst possible scenario.
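For a given set of points, the minimum distance of coverage equals the longest edge of the Euclidean minimum spanning tree: any smaller radius would cut that edge and disconnect the graph. The sketch below uses this property with Prim's algorithm and averages the result over random samples. It is one possible way to obtain the value, not necessarily the procedure followed in the experiments, and the function names, sample count and seed are assumptions.

```python
import math
import random


def coverage_distance(points):
    """Smallest radius that connects all points (longest MST edge, Prim's algorithm)."""
    n = len(points)
    best = [math.inf] * n      # cheapest link from each outside node to the tree
    best[0] = 0.0
    in_tree = [False] * n
    longest = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        longest = max(longest, best[u])
        for v in range(n):
            if not in_tree[v]:
                best[v] = min(best[v], math.dist(points[u], points[v]))
    return longest


def mean_coverage_distance(n, samples=100, seed=0):
    """Average coverage distance over `samples` uniform point sets of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        points = [(rng.random(), rng.random()) for _ in range(n)]
        total += coverage_distance(points)
    return total / samples


estimate = mean_coverage_distance(50, samples=5, seed=1)
```

The quadratic version of Prim's algorithm needs no explicit edge list, which suits this setting because every pair of points is a potential edge.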
Another important factor is the degree distribution. It determines how the number of neighbors is distributed among the users. Typically, random networks exhibit a Poisson distribution, which leads to the absence of extreme values, or hubs, that concentrate most of the connections. Hubs are a well-known characteristic of other kinds of networks, in which the degree distribution follows a power law. The absence of hubs provides an advantage: networks are more resilient and failure tolerant. There is no central node, which is usually more vulnerable, so the efficiency of the network remains high even when a high percentage of the nodes have been removed. Figure 9 shows the degree distribution for different network sizes. It can be seen that it follows a Poisson distribution and the median value is bounded between 5 and 10.
In the previous measures, the distance of coverage has been taken into account many times as a critical lower bound to guarantee the connectivity of the network. Figure 10 shows the results of a set of experiments to calculate it empirically. RGGs from 100 to 1000 nodes have been generated. For each size, 100 samples have been created, and the obtained distances of coverage averaged. The results decrease from $d_{cov} = 0.15$ for 100-node networks to $d_{cov} = 0.05$ for 1000-node networks. In the case of the city of Valencia, this represents distances from 1 km to 300 m.
Once the RGG model has been characterized, the movement patterns and the performance of the proposed algorithm have to be checked. Three alternatives are considered.
1. Static network, where each node remains in the same position.
2. Model the movements as random walks. This model assumes that users travel short distances.
3. Model the trips as Lévy flights, in which large movements can occasionally occur. It is a good model to represent public transport usage, and it has already been used to model animal and human trips.
Figure 11 shows the difference between a random walk and a Lévy flight. In a random walk, the distances between two consecutive points are small and similar. Lévy flights, in contrast, combine periods of random walk-like movements with occasional long-range journeys.
Figure 11. Random walk (left) and Lévy flight (right) movement patterns. In a random walk, users move randomly in the vicinity of their current location. The difference with a Lévy flight is that users occasionally perform long displacements.
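The two mobile patterns can be sketched as step generators in the unit square. The parameter values (`step`, `min_step`, `alpha`), the use of a Pareto distribution for the heavy-tailed Lévy step length, and the clamping of positions to the square are assumptions chosen for illustration; the text does not specify how the experiments parameterized the movements.

```python
import math
import random


def _clamp(value):
    """Keep a coordinate inside the unit interval."""
    return min(1.0, max(0.0, value))


def _move(position, length, rng):
    """Displace `position` by `length` in a uniformly random direction."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    return (_clamp(position[0] + length * math.cos(angle)),
            _clamp(position[1] + length * math.sin(angle)))


def random_walk_step(position, rng, step=0.02):
    """Short step of fixed length: consecutive points stay close together."""
    return _move(position, step, rng)


def levy_flight_step(position, rng, min_step=0.02, alpha=1.5):
    """Step whose length follows a power-law tail, so long jumps occur occasionally."""
    # paretovariate(alpha) returns values >= 1 with a heavy tail.
    return _move(position, min_step * rng.paretovariate(alpha), rng)


rng = random.Random(7)
walker = flyer = (0.5, 0.5)
walk_track, flight_track = [walker], [flyer]
for _ in range(200):
    walker = random_walk_step(walker, rng)
    flyer = levy_flight_step(flyer, rng)
    walk_track.append(walker)
    flight_track.append(flyer)
```

Smaller values of `alpha` make long jumps more frequent, which moves the pattern further away from the random walk.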
Experiments are divided into two sets. The first one analyzes the performance of the algorithms when the networks are formed taking as radius the minimum distance of coverage $d_{cov}$ for each network size. In the second one, different ranges are considered. Figure 12 shows the results for the first set. As in the previous case, RGGs from 100 to 1000 nodes have been generated. For each size, 10 samples have been created, and 10 senders and receivers have been randomly chosen for each sample. The results from these 100 executions have been averaged. The obtained paths include the sender and the receiver, so the total number of exchanges, excluding the delivery and the reception, is the path length minus 2. Therefore, a path of length 3 involves just one messenger.
The static network is not efficient, as expected. It represents a scenario in which users move only from fixed positions and return to them. It does not exploit the advantages of the mobility of people. Besides, the greedy algorithm does not guarantee the delivery of the parcel, since it can get stuck in dead ends (in general, RGGs are not navigable). Lévy flights perform as badly as static networks, since the person that carries the parcel can occasionally travel far away from the receiver. However, this occurs because the algorithm does not take into account the next location of the receiver or of the candidate neighbors (see the explanation of the algorithm in Section 4.2). A simple solution is to include in the system information about the next expected position of the different actors. Therefore, the process is slightly modified as follows: the parcel is handed over to the neighbor that, in its next position, is closest to the next estimated position of the final receiver.
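The modified selection rule can be written as a single comparison over predicted positions. In this sketch, `next_position` is a hypothetical mapping from each actor to its next expected location; how those predictions are obtained is outside the scope of the example.

```python
import math


def choose_next_carrier(neighbours, next_position, receiver):
    """Pick the neighbour whose next position is closest to the receiver's next position.

    neighbours: iterable of candidate agents within reach of the current holder.
    next_position: dict agent -> predicted (x, y) (assumed to be available).
    Returns None when there are no candidates.
    """
    target = next_position[receiver]
    return min(neighbours,
               key=lambda agent: math.dist(next_position[agent], target),
               default=None)


# Hypothetical predictions: 'a' is about to travel close to the receiver 'r'.
predicted = {"a": (0.9, 0.9), "b": (0.2, 0.2), "r": (0.8, 0.8)}
carrier = choose_next_carrier(["a", "b"], predicted, "r")
```

The only change with respect to the basic greedy rule is that distances are measured between predicted positions rather than current ones, which is what lets a long upcoming trip work in favor of the delivery.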
With this modification, two new experiments were carried out, and their results are labeled in Figure 12 as "random walk pred" and "levy flight pred". The introduction of this change barely affects the random walks, but it has a significant influence on the performance of the Lévy flights, which now are the best model. In most of the cases, the process is completed without exchanges among messengers (path length = 3), and it is independent of the size of the network. These results show the performance in the worst case, when $r = d_{cov}$ for each network size. The second set of experiments analyzes the performance for bigger radii. However, we must ensure that $r \geq d_{cov}$ to guarantee the connectivity. Besides, according to Figure 10, $d_{cov}$ varies with the size of the network.
Only three of the network sizes have been included in the paper to simplify the explanation: 100, 500 and 1000 nodes (see Figure 13). The radii r have been chosen as follows. For each network size, the distance of coverage $d_{cov}(n)$ is calculated. Then, 10 radii are obtained by multiplying $d_{cov}(n)$ by an integer factor, $r_i = i \times d_{cov}(n)$ with $i = 1, \dots, 10$, so that $r_1 = d_{cov}(n)$. In all cases, it can be seen that, with a factor between three and four times $d_{cov}(n)$, the parcel can be delivered using at most one exchange between messengers (path length = 4), even in static networks. Again, the Lévy flight using the next available positions obtains the best results, almost independently of the distance of coverage.
This section has presented different tests that show that the idea proposed in this work, that is, using a crowdsourcing approach for LMD, is feasible. It has to be underlined that, since no historical log of the approach is available yet, the experiments shown above have been generated with random loads and with restrictions common to all potential deliverers that are as demanding as possible. For instance, in the experiments, the potential deliverers have 500 m of reachability; in other words, they would be able to move up to 500 m to get or deliver a parcel. In real scenarios, it would be normal to have users able to move farther than this reachability constraint. Moreover, in the experiments, we assume deliverers that only use walking routes, so the interaction possibilities are lower than when using public or private transport. In any case, the experiments have shown that, even in such restricted situations, the crowdsourcing approach can give satisfactory solutions.

Conclusions
This paper proposes a sustainability-oriented solution for the LMD problem. The approach takes advantage of the citizens' movements and combines them with the delivery of parcels. To do this, a multi-agent system approach based on the SURF framework was developed. The framework offers a set of optimized functions for the participants that are willing to collaborate in order to transport parcels to different destinations in the city, and it has been implemented by three main components: the SURF Framework, the transportation analysis and optimization module, and a transportation ontology that specifies the different concepts of the transportation model.
The proposed framework employs complex network-based algorithms and measures that consider the urban area as a graph whose nodes capture the real-time geo-location of the participants and whose arcs link adjacent participants through a possible delivery path. This allows us to calculate how to get from the origin of the path to its endpoint by employing several users that may pass the package from one to another. The performed tests show that the proposal is feasible and can give satisfactory solutions for the LMD problem.
As future work, we want to introduce more aspects into the network analysis, such as delivery deadlines and security issues. Regarding security issues, a trust and reputation model can be used to estimate the future behavior of participants based on an analysis of their historical behavior.
A multi-agent supported approach can be successfully used to implement an efficient crowdsourced solution for Sustainable Last Mile Delivery. Moreover, the executed tests show the feasibility of the proposed solution, together with a high reduction of the CO₂ emissions coming from the delivery trucks that, in the case studies, are no longer needed for delivery.