Assessing the Impacts of Crowdshipping Using Public Transport: A Case Study in a Middle-Sized Greek City

Crowdsourced deliveries or crowdshipping is identified in recent literature as an emerging urban freight transport solution, aiming at reducing delivery costs, congestion, and environmental impacts. By leveraging the pervasive use of mobile technology, crowdshipping is an emerging solution of the sharing economy in the transport domain, as parcels are delivered by commuters rather than corporations. The objective of this research is to evaluate the impacts of crowdshipping through alternative scenarios that consider various levels of demand and adoption by public transport users who act as crowdshippers, based on a case study example in the city of Volos, Greece. This is achieved through the establishment of a tailored evaluation framework and a city-scale urban freight traffic microsimulation model. Results show that crowdshipping has the potential to mitigate last-mile delivery impacts and effectively contribute to improving the system’s performance.


Introduction
Every year actions are initiated to promote sustainable urban freight transportation (UFT), with great interest in pioneering initiatives that enable the participation of citizens [1]. These initiatives aim at addressing the recognizable threats of urban unsustainability i.e., urbanization, aging population, globalization, etc., focusing on the rapid growth of e-commerce and the high service expectations for fast and flexible deliveries [2]. Crowdshipping, an expression of the sharing economy in the field of city logistics, is an emerging UFT solution, aiming at reducing delivery costs, congestion, and environmental impacts [3]. The concept of crowdshipping as a commercial business model is based on a platform provided by a company through which last-mile delivery is performed by commuters, and not corporations [4]. Although crowdshipping is not a new concept, the booming of Information and Communication Technologies (ICT) and the rapid growth of e-commerce owing to online purchases have set the ground for widespread adoption [5].
For crowdshipping to be proved beneficial, certain parameters must be sufficiently determined. There are factors that affect the organizational design and task environment, while there are behaviors that are affected by them [6]. For example, the success of the solution depends upon the incentives that will be given to crowdshippers or the length of the detour to perform a delivery [7]. Variables such as weight, size and price of the package, and delivery urgency also seem to affect the willingness of crowdshippers to perform a delivery [8]. Previous literature has also reported accountability and insurance issues associated with crowdshipping [9]. Furthermore, requests on popular crowdshipping platforms may result in generating dedicated new delivery trips rather than modifying existing ones (rebound effect) [10].
This study examines the use of public transportation (PT) for the deployment of crowdshipping services, along with smart locker installations which will be strategically positioned in areas with high demand, to allow final recipients to receive their parcels with flexibility and at low cost as compared to home deliveries. Analytically, the concept dictates that a parcel is picked up from an automated smart locker installation located near a PT stop close to the depot and gets delivered to the final locker installation using intermediate smart lockers near PT stops. The intermediate pick-ups/deliveries are performed by PT users. The final recipient is then notified to pick up the parcel from the final locker installation, which is just a couple of minutes away on foot. This way, the existing spare capacity of PT is exploited for delivering parcels, without further increasing congestion and pollution. To pick up a parcel from a locker installation, the PT user's personalized monthly season ticket is required, which resolves accountability issues.
Future Transp. 2022, 2 56
Unlike other crowdshipping studies that focus on the behavioral aspects contributing to crowdshipping services and the solution's feasibility, this research aims at specifying the benefits of the UFT solution in environmental and traffic terms from the viewpoint of the operator, as well as the local public authority. This is achieved through an evaluation framework, adapted to the context and needs of the specific UFT solution, that follows a hierarchical structure, considering selected performance indicators [11]. The formulation of the alternative scenarios considers variables such as demand, location and number of smart lockers, range of considered service area around a smart locker, and adoption level by crowdshippers. For the analysis of the scenarios, a before-after comparison is performed. The "before" state is based on the analysis of a city-scale urban traffic microsimulation model that is developed in PTV Vissim, which incorporates the daily door-to-door freight trips of a private courier company in Volos. The "after" state is based on the analysis of newly formulated scenarios which are coded in the same microsimulation model in PTV Vissim.
The rest of the paper is structured as follows: Section 2 reports the previous research concerning the main groups of studies dealing with the analysis of crowdshipping. Section 3 outlines the methodological steps and elaborates on the scenario formulation. Section 4 presents the results. Finally, Section 5 concludes the paper and provides suggestions for future research.

Literature Review
Crowdshipping, as an emerging UFT solution, is getting increasing attention from scholars and industry [12,13]. According to Buldeo Rai et al. [14], the success of the solution depends on information capacity and crowd engagement. On the one hand, information capacity has been substantially improved due to the pervasive use of mobile technology that enables peer-to-peer platforms, allowing the instantaneous matching of demand and ad-hoc crowdshippers [15]. On the other hand, fundamental research on assessing people's willingness to contribute to the solution has yet to be undertaken [16]. Extended literature review showed that there are three main groups of studies dealing with the analysis of crowdshipping: those focusing on (a) determining the factors for a successful deployment, (b) determining the impacts associated with its deployment, and (c) exploring various business models.
To begin with the first group of analyses, Serafini et al. [17] used stated preference to identify the key elements associated with the choice of acting as a crowdshipper. In their attempt to study the underlying behavior, authors used discrete choice models. Punel et al. [18] explored the differences between crowdshippers and non-crowdshippers. The results showed that: (i) crowdshipping is more prevalent among young, full-time employed men, (ii) urban areas are preferential for deploying the service, and (iii) crowdshippers are willing to perform medium-distance deliveries. In a similar study, Marcucci et al. [19] investigated under which conditions commuters are willing to act as crowdshippers and under which conditions receivers are willing to have their parcels delivered by a crowdshipper. Results show that 87% of students would, in principle, be willing to act as crowdshippers (i.e., supply) with adequate compensation, while 93% of them are willing to receive their goods through a crowdshipping system under certain conditions, especially characterized by delivery timing and punctuality. 
Furthermore, Ermagun [20], considering a national data set of 16,850 crowdshipping requests across the US, researched the probability of receiving a bid from a crowdshipper, and came up with the following insights for transportation planners and crowdshipping companies: (i) supply is unevenly distributed across the U.S., (ii) this geographical disparity is a function of not only the shipping request and service characteristics, but also the socioeconomic and built-environment attributes, (iii) the supply has denser pockets in areas with a higher percentage of African-American population, high wage workers, and families with two or more vehicles, (iv) the supply peters off in areas with higher population and employment densities, while it accumulates in geographical areas with higher destination accessibility and regional employment diversity, and (v) the out-of-state and the business-to-customer shipments present the highest elasticity in receiving a bid, while posted requests with a delivery deadline form the most inelastic segment.
As per the second group of analyses, Simoni et al. [7] researched the effects of crowdshipping on traffic and emissions by means of a hybrid simulation model for the city center of Rome, Italy. The generated externalities were associated with strategic as well as operational aspects with respect to the network's level of congestion. Buldeo Rai et al. [14] performed an environmental impact assessment analysis for delivering a parcel through an operating crowdshipping platform in Belgium, by comparing it with the traditional delivery way. Results indicated that the crowdsourced delivery led to higher external transport costs, thus higher environmental impacts. Paloheimo and Waris [21] investigated whether crowdshipping has sustainability benefits based on a case study of book deliveries from a library in the city of Jyväskylä, Finland. In this case, results showed that crowdshipping environmental benefits were directly proportional to the saved distances. Moreover, indicative extrapolations from the trial developed in the study concluded that crowdsourced delivery can save on average 1.6 km/delivery, and that if all deliveries had been made by cyclists, savings would rise to 2.5 km/delivery. Moreover, Comi and Savchenko [22] proposed a methodology to support the assessment of urban delivery means within the inner area of a city, through a specific scenario analysis of an online bookstore. In their study, they formalized internal (salary, amortization, fuel, maintenance and repair, other operation costs) and external (pollutant and greenhouse gas emissions, safety, noise, congestion) cost components for traditional means for last-mile delivery (car, motorcycle, bicycle, on-foot/public transport) by private users (i.e., crowdshipping), providing a global view on last-mile deliveries. Their results show that the car mode is the best alternative, while on-foot delivering is the most expensive, as carrying capacity is low and salary costs are high, regardless of the low external costs.
Moreover, a number of studies focus on the development of tailored business models to crowdshipping and economic aspects. In the study of Rougès and Montreuil [23], a review of 18 crowdshipping companies and start-ups revealed a variety of business models determined, among others, by their revenue model. Based on their review, a typology of business models in the crowdsourced delivery industry was developed. Mak [24] studied a novel crowdshipping strategy aiming at foreseeing the exploitation of in-store customers as potential couriers, with respect to the impacts on retailers' marketing strategies. In a similar approach, Dayarian and Savelsbergh [25] also considered in-store customers as potential couriers, investigating the interrelation of the dynamic exchange of information depending on store status with demand and delivery capacity. Recent literature revealed other emerging business models, i.e., the exploitation of social media platforms to assist with last-mile delivery, or other operational research approaches [26,27]. For example, Suh and Linhoff [28] presented a framework for leveraging social networks with a mobile and communication platform to assist in parcel pick-up in last-mile delivery. Devari et al. [16] attempted to assess the benefits of using friendship connections for performing last-mile delivery, by bridging the gap between the crowdsourced last-mile delivery research, friendship modeling in social networks, and travel activity modeling.
Finally, there are studies in the literature that examine the concept of crowdshipping as such, highlighting the trade-offs between privacy, reliability, and cost [29-31].

Methodology
To examine the impacts of the proposed set of solutions, 18 scenarios will be formulated to be compared with the current base scenario. The base scenario regards home deliveries with a fleet of four (4) mopeds and two (2) minivans starting from the company's depot. The 18 scenarios regard a combination of smart locker installations, with crowdshipping and PT use for delivering parcels. Analytically, a grid of strategically positioned smart lockers will be established in areas with high demand near PT stops. These smart lockers will act as transshipment points to transport the parcels to the final recipients.

Description of the Experiment
The following example provides a step-by-step description of the delivery service concept to facilitate a clear understanding of the methodological approach that is adopted.
Two smart lockers, "E" and "A", are to be established. Smart locker "E" is located near the PT stop in the area of the depot and smart locker "A" is located near a major PT stop in the city center. For this example, the city center is assumed to be the destination of the specific parcel delivery. A PT user who is heading to the city center (or for whom the city center is an intermediate stop on his/her journey) decides to act as a crowdshipper. The user enters the crowdshipping platform and gets notified about a delivery request. The user breaks his/her trip at the PT stop near smart locker "E", uses the code generated by the platform to open the smart locker and picks up the parcel. Then, the user continues his/her trip, e.g., by taking the next bus to the city center. When in the city center, the user gets off the bus and uses the code again to deliver the parcel to smart locker "A".
The realization of the aforementioned crowdsourced delivery results in reducing the number of deliveries that are to be performed by the company's fleet. The number of deliveries that will be reduced depends (i) on the adoption by PT users, and (ii) on the local demand served by the smart locker, given that all recipients accept to receive their parcels in the city lockers. For example, a 0% adoption would mean that all deliveries will be performed as in the base scenario. Respectively, a 100% adoption would mean that all deliveries around the smart locker will be realized through crowdshipping.
Further relevant background information regarding the example is given in the form of a Q&A: Why should a courier company enable crowdshippers to perform deliveries? Changing consumer habits and their expectations for instant deliveries and same or next day deliveries put pressure on last-mile logistics networks, carriers, and supply chain stakeholders to replace legacy and conventional business models and achieve faster delivery services, offering real-time transparency throughout the fulfillment process [32]. Crowdshipping emerges as an immediate response to these new challenges and helps serve on-demand deliveries, without needing frontloaded investments.
Are there any benefits for the final recipients who should reach the smart locker instead of having their parcels delivered to their home? The benefits of smart lockers are already well recognized and identified in the literature [33,34]. Smart lockers provide flexibility and security to the recipient, ensuring that nothing gets lost or stolen, while they also provide contactless deliveries, an issue that received increased attention in the years of COVID-19.
Why participate as a crowdshipper if the PT-based trip will be significantly delayed due to the additional trip breaks or potential detours? The incentive for PT users to perform the delivery task is the discount on their next monthly ticket. When a crowdshipper performs a number of deliveries, then scalable discounts are offered to their next monthly ticket. The discounts can be covered jointly by the city's authorities and the PT service operator as PT usage will be increased with resulting economic benefits. In general, successful implementation of the solution requires specific sets of rules. For example, excessive incentivization for servicing as a crowdshipper could result in creating dedicated delivery trips with private vehicles and, thus, higher impacts than those that the solution was meant to mitigate.
Is it convenient or practically feasible for crowdshippers to carry bulky and/or heavy parcels, especially in the oversaturated PT vehicles within peak hours? There is evidence from relevant literature that carrying parcels in, e.g., trams or metro lines during peak hours might not be feasible from an organizational perspective, due to oversaturation [35]. The envisaged services presented in this study regard the delivery of light parcels, i.e., those that weigh less than 2 kg, which also assumes that parcels can easily fit into the lockers' boxes. Therefore, no travel implications are expected either for passengers or crowdshippers when traveling during peak hours. Any bulky and/or heavy parcels continue to be delivered as in the base scenario, thus by the company's conventional fleet.
How is accountability resolved in this case? To register on the platform, the crowdshipper has provided personal information including his/her personal PT card number.

Methodological Approach
To achieve the above, the study geocodes the delivery locations of the ELTA Courier company in the city of Volos within a typical day and, based on the identified hotspots, recommends the installation of automated smart lockers near PT stops. Easy walking access areas (service areas) are created around the smart lockers, within which the number and location of deliveries that fit certain weight and size restrictions are determined. Then, 18 alternative scenarios are formulated considering (a) demand, and (b) various rates of adoption by the PT users to act as crowdshippers. The 18 scenarios are configured in an urban freight traffic microsimulation model and are compared with the base scenario, in which all deliveries in the typical day are realized Point-to-door (P2D) by the company's fleet.
For the configuration of each of the 18 scenarios, an online GIS-based commercial route planning optimization tool was used [36] to recalculate the freight vehicle routes. This was necessary, as in the 18 scenarios the number of deliveries decreases compared to the base scenario due to crowdshipped deliveries. Details for the recalculation process of the delivery routes as well as for the above methodological steps are given in the following Sections 3.2.1-3.2.7.
It is noted that the only PT services in Volos are PT bus services and that all freight trips in the base scenario regard delivery trips and not pick-up trips.

Location of Smart Lockers' Installation
The location of smart lockers was defined by geocoding parcels' delivery locations. Delivery locations along with other data were provided by the ELTA Courier company (Appendix A). Based on the geocoded locations, a heatmap was created which shows the magnitude of these delivery locations by clustering them in obvious visual cues that allow easy identification of the hotspots [37] (Figure 1).
Considering the hotspots of Figure 1, it is safe to conclude that there are two locations where high demand for deliveries is concentrated: Locations (A) and (B). As a next step, PT stops near the two locations were denoted as candidate PT stops for installing smart lockers. The candidate PT stops were tested against their routes' frequency, as shown in Table 1. This was done because PT users are more likely to accept breaking up their trip to pick up or deliver a parcel from/to a smart locker when they know that the next bus is coming in a few minutes rather than in, e.g., 15 min.
The top candidate PT stop for Location (A) was the "Ermou" PT stop and for Location (B) the "Mavrokordatou" PT stop. Specifically, the "Ermou" PT stop was preferred over the "Pavlou Mela" stop for three reasons: (a) the frequency of PT line number four is higher than the frequency of PT line number five (the remaining lines serving the two stops are common), (b) the "Ermou" PT stop is closer to the center of the visual cue of Location (A), as shown in Figure 1, and (c) the "Ermou" PT stop lies on "Ermou" street, which is a walkway exclusively for pedestrians that passes through the center of the city. Respectively, the "Mavrokordatou" PT stop was selected for Location (B), as it is the only PT stop nearby.
In addition to the two smart lockers at Locations (A) and (B), another smart locker installation was established at the PT stop in front of the ELTA Courier depot (E) (see Figure 2 below). It is noted that, according to the frequency classes defined by Poelman and Dijkstra [38] for groups of transport modes, the "Ermou" PT stop is a "high" frequency stop, as more than 10 departures per hour take place during weekdays, while the "Mavrokordatou" PT stop is of "medium" frequency (≥4 and <10 departures per hour). Respectively, the PT stop in front of the ELTA Courier depot is also characterized by "high" frequency.

Service Areas around Smart Locker Installations
Service areas were created to represent areas with easy walking access for picking up/delivering parcels to a smart locker. Following the definition of an accessibility area around a PT stop [38], a service area is the equivalent of a 5-min walk to a bus or a tram stop. There are cases, however, in which a 5-min walk can be considered inconvenient when carrying a parcel, e.g., bad weather conditions, heavy parcels, low quality walking infrastructure, high delinquency rates, etc. Hence, instead of 5-min service areas, 3- and 1.5-min areas were considered. The service areas were designed using distances along the actual road network around the smart lockers' location, instead of Euclidean distances [39], see Figure 2.
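The difference between a network-based and a Euclidean service area can be illustrated with a minimal sketch. It assumes the walking network is available as an adjacency list with edge lengths in metres and uses a shortest-path search (Dijkstra) bounded by the walking-time budget. The walking speed, the function name `service_area`, and the toy network are illustrative assumptions and are not taken from the study.

```python
import heapq

# Assumed walking speed (about 4.8 km/h); not a value reported in the study.
WALK_SPEED_M_PER_MIN = 80.0


def service_area(graph, locker, max_minutes):
    """Return {node: walking minutes} for nodes reachable within max_minutes.

    graph maps each node to a list of (neighbor, edge_length_m) pairs,
    so distances follow the road network rather than straight lines.
    """
    budget_m = max_minutes * WALK_SPEED_M_PER_MIN
    dist = {locker: 0.0}
    queue = [(0.0, locker)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for neighbor, length in graph.get(node, []):
            nd = d + length
            if nd <= budget_m and nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(queue, (nd, neighbor))
    return {n: d / WALK_SPEED_M_PER_MIN for n, d in dist.items()}


# Hypothetical toy network around a locker (lengths in metres).
toy_graph = {
    "locker": [("a", 100.0), ("b", 200.0)],
    "a": [("locker", 100.0), ("c", 100.0)],
    "b": [("locker", 200.0), ("c", 300.0)],
    "c": [("a", 100.0), ("b", 300.0)],
}
area_3min = service_area(toy_graph, "locker", 3.0)
area_1_5min = service_area(toy_graph, "locker", 1.5)
```

In this toy case the 3-min area contains all four nodes, while the 1.5-min area shrinks to the locker and node "a"; delivery addresses would then be assigned to a service area according to the network node they are attached to.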

Demand
Punel et al. [18] recognized in their study that some parcels are too bulky or heavy to be delivered by crowdshipping. On this basis, Crowdshipping Demand (CD), expressed as the number of deliveries, was determined as follows:
$$CD_i = TD_i - d_i \qquad (1)$$
where
• i freight vehicle (1-4 for mopeds and 5, 6 for minivans).
• TD total number of deliveries that are to be performed per vehicle within the service area.
• d deliveries that are to be performed per vehicle and weigh more than 2 kg (given that parcels' volume data were unavailable, this assumes also that parcels can easily fit into the lockers' boxes).
Table 2 summarizes the total deliveries per freight vehicle, and the TD and CD within the 3- and 1.5-min service areas per freight vehicle.
The scenarios in each cluster are determined by (a) the demand, and (b) the adoption rates of PT users who act as crowdshippers (Table 3). The adoption rates were assumed as 20%, 50% and 70% of CD served by crowdshippers. The range 20-70% was selected as it appears to be more realistic than considering no adoption at all (0%) or 100% adoption. It also promises interesting insights into the association of impacts with a low/medium/high adoption percentage. It is noted that 20% represents low adoption by crowdshippers, while 70% represents high adoption.
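As an illustration of how the demand definition and the adoption rates combine, the following minimal sketch computes CD per vehicle as TD minus the deliveries heavier than 2 kg, and then the number of deliveries handed over to crowdshippers at each adoption rate. The per-vehicle figures are hypothetical and do not reproduce Table 2; rounding down to whole parcels is an assumption of the sketch, as the rounding rule is not stated in the text.

```python
ADOPTION_RATES_PCT = (20, 50, 70)  # low, medium, high adoption (from the text)


def crowdshipping_demand(td, d_heavy):
    """CD_i = TD_i - d_i for each freight vehicle i."""
    return {i: td[i] - d_heavy[i] for i in td}


def crowdshipped_deliveries(cd, adoption_pct):
    """Deliveries served by crowdshippers per vehicle.

    Integer arithmetic rounds down to whole parcels (assumed rule).
    """
    return {i: cd_i * adoption_pct // 100 for i, cd_i in cd.items()}


# Hypothetical inputs: vehicles 1-4 are mopeds, 5-6 are minivans.
td = {1: 12, 2: 11, 3: 25, 4: 10, 5: 13, 6: 2}   # deliveries in service area
d_heavy = {1: 2, 2: 1, 3: 5, 4: 0, 5: 3, 6: 2}    # deliveries above 2 kg

cd = crowdshipping_demand(td, d_heavy)
totals = {
    pct: sum(crowdshipped_deliveries(cd, pct).values())
    for pct in ADOPTION_RATES_PCT
}
for pct, total in totals.items():
    print(f"{pct}% adoption: {total} of {sum(cd.values())} eligible deliveries")
```

The deliveries that remain with the company's fleet in each scenario are then the total deliveries minus the crowdshipped ones, which is the input to the route recalculation.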

Evaluation
The evaluation of the scenarios was performed through an open city-oriented web application, Evalog [40]. Evalog is composed of three levels of evaluation components. The first level consists of seven impact areas (Economy and energy, Environment, Transport and mobility, Society, Policy and measure maturity, Social acceptance, and User Uptake), which are disaggregated into 26 criteria and 140 indicators [11], see Table 4. Following the above structure of Evalog's methodology, weights for each component (impact area, criterion, indicator) are calculated based on a hierarchical pairwise comparison, according to the Analytic Hierarchy Process (AHP) method [41], see Figure 3. Specifically, the user is called to indicate the importance (or preference) of element 1 (Environment) compared to element 2 (Transport and mobility) by rating them on a scale from 1 to 9, where: 1 = same; 3 = moderately; 5 = very; 7 = much more; 9 = exceptionally more.
Then, the inserted indicator values are normalized (following the Max and Min normalization method), multiplied by their weights, and a final index is estimated per impact area. The generated impact area indices are further aggregated into the Logistics Sustainability Index (LSI) that is used for the comparison of the different scenarios. In this specific example, there are two impact areas, equally weighted (as anything different from that would be unfounded), thus the LSI is the average of the impact area indices. A full demonstration of the evaluation framework can be found in Nathanail and Karakikes [42].
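The weighting, normalization, and aggregation steps can be sketched as follows. This is not Evalog's implementation: the AHP weights are approximated with the row geometric-mean method (a common substitute for the principal eigenvector), the reversal of the scale for non-beneficial indicators is an assumption, and all function names and numerical values are hypothetical.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    geo_means = [math.prod(row) ** (1.0 / len(row)) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def min_max(value, lo, hi, beneficial=True):
    """Scale a value to [0, 1]; flip the scale for non-beneficial indicators."""
    if hi == lo:
        return 0.0
    score = (value - lo) / (hi - lo)
    return score if beneficial else 1.0 - score


def weighted_index(scores, weights):
    """Weighted sum of normalized scores (weights are assumed to sum to 1)."""
    return sum(s * w for s, w in zip(scores, weights))


# Two impact areas rated as equally important (rating 1 on the 1-9 scale).
area_weights = ahp_weights([[1.0, 1.0], [1.0, 1.0]])

# Hypothetical single-indicator impact areas.
environment = weighted_index(
    [min_max(80.0, 50.0, 100.0, beneficial=False)], [1.0]
)
transport = weighted_index([min_max(0.8, 0.0, 1.0)], [1.0])

# With equal weights the LSI reduces to the average of the two indices.
lsi = weighted_index([environment, transport], area_weights)
print(round(lsi, 3))
```

For a rating of 3 in favor of the first element, `ahp_weights([[1.0, 3.0], [1/3, 1.0]])` yields weights of approximately 0.75 and 0.25, which shows how unequal preferences would shift the LSI toward one impact area.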
In our experiment, "before" is considered to be the base scenario and "after" is considered to be each of the 18 alternative scenarios. The selected evaluation components considered in this study are a subset of Evalog's components and are shown in Table 5. The Environment impact area refers to the preservation of natural resources and the limits within which activities should take place without depleting non-renewable resources. The environmental impact of logistics is addressed through emissions and air quality on communities. The Transport and Mobility area refers to the continuous pursuit of improving the transport of goods and mobility of people and is translated into terms of attractiveness, accessibility, level of service, safety, as well as the availability of infrastructure. The valuation of the indicators of the two impact areas aims at assisting:
• the operator in monitoring and controlling the company's performance (O1-O13), and
• the public authorities in assessing the impacts of crowdshipping, projected to the whole network performance (N1-N9).

Simulation Analysis Approach
For the configuration of each of the 18 scenarios, an online GIS-based commercial route planning optimization tool was used [36]. The tool is based on a combinatorial optimization algorithm aiming for the cheapest solution while taking into account realistic constraints that apply to the fleet, the orders, and the end customers. Specifically, the tool splits the target into sub-problems, following the logic of Wang and Kopfer [43]. "Winner Determination Problem" cases are solved through an iterative route creation process to achieve the minimization of costs by solving the central "Collaborative Transportation Planning" problem with the use of advantageous heuristic algorithms.
Analytically, the tool is used to replan freight vehicles' routes, as in the 18 scenarios the number of deliveries decreases compared to the base scenario, due to crowdshipped deliveries. It is noted here that the "base scenario" freight vehicles' routes were reorganized, as compared to the routes provided by the courier company, using the same optimization tool (OptimoRoute). The reason for this recalculation was to ensure comparability with the results of the 18 scenarios, as optimizers tend to over-perform as compared to real-life applications [44], and thus any resulting improvements could otherwise be attributed to the optimizer rather than to crowdshipping. Specifically, as input data the tool was given: In addition, to ensure comparability of results and to measure impacts attributed only to crowdshipping, any other parameters, i.e., service duration at each stop, were kept the same for all scenarios.
Based on the route planning optimization tool's results, respective PTV Vissim scenarios were developed, following the approach described in the next section.

PTV Vissim Configuration
The main steps for developing the city-scale traffic freight microsimulation model (available datasets, coding the model, calibration, and validation) are elaborated in Appendix A.
For the configuration of the base scenario, two new vehicle types (610: ELTA Motorbike and 620: ELTA Van) were created in PTV Vissim to represent the freight vehicles. The first vehicle type represents the four motorbikes. This type was assigned PTV Vissim's default distribution attributes (accelerations/decelerations) for motorbikes. The second vehicle type represents the two minivans, to which default distribution attributes for cars were assigned.
Freight vehicle routes were coded in the model as "Public Transport Lines" [45]. The route of each freight vehicle was split into sub routes, each indicating a trip with origin and destination the ELTA Courier station (depot). Public transport stops were used to represent delivery stops.
As the density of the modeled road network in PTV Vissim is not the same as that of the real network, the real addresses of the delivery stops were projected to the closest point of the modeled road, see Figure 4. This design assumes that parking spaces are available at all delivery addresses. In the city of Volos, due to low enforcement and poor urban design, there are many cases in which freight motorbikes park briefly wherever there is adequate space, e.g., in front of garage entrances, on the sidewalk, between parked vehicles, etc. This has a negative effect on, for example, walkability or urban aesthetics, but only slightly affects the rest of the vehicular traffic. Thus, for motorbikes, the model's design and real parking availability do not yield any differences. Furthermore, minivans are primarily used for remote deliveries or deliveries of heavyweight or bulk parcels. While remote deliveries do not face parking problems, heavy or bulk deliveries are made by blocking a lane of the road for several seconds, which significantly affects the vehicular traffic. However, the modeling of such events was not considered as it requires detailed study.
Furthermore, as the model's results can vary highly due to Vissim's stochastic variations, a minimum number of simulation runs was determined to achieve statistically confident results for each scenario. According to the one-step approach of Tian et al. [46], the required number of runs was equal to nine and was computed as follows:

$$ n = \left\lceil \left( \frac{z_{\alpha/2}\,\sigma}{E} \right)^{2} \right\rceil \qquad (2) $$

where,
• n denotes the number of runs.
• ⌈ ⌉ ceiling function.
• σ sample's standard deviation (based on five initial runs).
• α significance level.
• z_{α/2} threshold value (for 95% confidence interval, z_{α/2} = 1.96).
• E error range at the set confidence level (taken as 10% in this study which is considered acceptable for general practice [46]).
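The run-count calculation can be illustrated with a short sketch. It assumes that σ and E are expressed in the same units (here both as fractions of the sample mean, which is one reading of the 10% error range). The function name and the value σ = 0.15 are hypothetical and are not taken from the study's pilot runs.

```python
import math
from statistics import NormalDist


def required_runs(sigma, error, alpha=0.05):
    """Minimum number of simulation runs: ceil((z_{alpha/2} * sigma / error)^2).

    sigma : sample standard deviation from the initial runs
    error : tolerated error range, in the same units as sigma
    alpha : significance level (0.05 gives a 95% confidence interval)
    """
    if error <= 0:
        raise ValueError("error must be positive")
    # Two-sided standard normal threshold, about 1.96 for alpha = 0.05.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil((z * sigma / error) ** 2)


# Hypothetical input: relative standard deviation of 15%, error range of 10%.
runs = required_runs(sigma=0.15, error=0.10)  # (1.96 * 1.5)^2 ≈ 8.64, so 9 runs
```

The number of runs grows with the square of σ/E, so halving the tolerated error roughly quadruples the number of runs needed.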
Finally, the calculation of the environmental indicators' values was made via COPERT Street level [47] based on the output of the PTV Vissim scenarios and freight vehicles' specifications.

Results
This section highlights the logistics' performance interdependencies of crowdshipping scenarios at an operator and network level. This is achieved by calculating the change between alternative scenarios' values and base scenario values, and the change of Before-After LSI values for each indicator, using the utilities of the Evalog platform.
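As a minimal sketch of the comparison described above, the relative change of an indicator against the base scenario, and whether that change counts as an improvement, could be computed as follows. The function names and the example values are assumptions for illustration and do not reproduce the Evalog platform's internals.

```python
def percent_change(base_value, scenario_value):
    """Relative change of a scenario value against the base scenario, in percent."""
    if base_value == 0:
        raise ValueError("base value must be non-zero for a relative change")
    return (scenario_value - base_value) / abs(base_value) * 100.0


def is_improvement(change_pct, beneficial):
    """A beneficial indicator improves when it rises, a non-beneficial one when it falls."""
    return change_pct > 0 if beneficial else change_pct < 0


# Hypothetical example: average roundtrips per vehicle rise from 1.0 to 1.2.
change = percent_change(1.0, 1.2)                   # about +20%
improved = is_improvement(change, beneficial=False)  # False: more roundtrips is worse
```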

Cross Scenario Analysis Results-Indicators
Focusing on the indicators, it can be concluded that the indicator values improve as the number of crowdshipping deliveries per scenario increases, both in environmental and traffic terms (beneficial indicators' values increase, while non-beneficial indicators' values decrease), as shown in Table 6. Although this observation holds for the majority of the operator performance indicators, three indicators move in the opposite direction: (i) number of roundtrips (O7), (ii) vehicle utilization factor (O10), and (iii) load factor (O11).
Analytically, the number of roundtrips indicates how many trips a freight vehicle has made to deliver its daily volume. The lower the number of roundtrips performed, the more efficient the delivery process is, as a higher number of roundtrips adds more vehicle-kilometers. According to Table 6, in the base scenario, all freight vehicles perform only one roundtrip to deliver their daily volume (O7, Base scenario value: 1 (average value)). However, there are scenarios in which the number of roundtrips per freight vehicle increases by 20% (scenarios 1, 2, 3, 5, 9, 10, 11, 12, 15, 17, and 18). This appears to be a paradox, given that in the crowdshipping scenarios the number of deliveries performed by freight vehicles is reduced and thus fewer roundtrips should be needed. The explanation for this paradox is that in those scenarios the delivery process is reorganized with three instead of four mopeds, which necessitates an additional roundtrip for one of them (see O8 indicator "Number of delivery mopeds"). The delivery process reorganization is the outcome of the route planning optimizer according to the number and location of deliveries that are to be performed in each scenario.
Second, one of the weaknesses of low freight demand, when a courier operates their own vehicles, is the low vehicle utilization factor (O10). Vehicle utilization factor is defined as the average number of hours a vehicle is in service over 24 h. More crowdshipping deliveries correspond to less service time for freight vehicles, and thus, lower values for the indicator.
Third, increased load factors usually represent the efficient transport of goods, as the vehicle-kilometers needed to transport the same load are reduced [48]. The average load factor for all freight vehicles in the base scenario was estimated as 60%. This has been computed considering the vehicles' maximum loads, with the load decreasing by the weight of the delivered parcel after every successful delivery. In the crowdshipping scenarios, the load factors drop to around 50% (see O11). This is attributed to the fact that in the crowdshipping scenarios fewer parcels are delivered by the vehicles, and thus the loads they carry are lower than the respective loads in the base scenario.
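The effect on the load factor can be illustrated with a small sketch. It assumes one possible reading of the computation described above: the load factor is the load carried on each leg to a delivery stop, averaged over those legs and divided by the vehicle's capacity. The function name, parcel weights, and capacity are hypothetical, and the empty return leg to the depot is deliberately left out.

```python
def average_load_factor(parcel_weights_kg, capacity_kg):
    """Average share of capacity used over the legs of one roundtrip.

    The vehicle leaves the depot carrying every parcel; after each
    successful delivery the load drops by that parcel's weight.
    """
    if capacity_kg <= 0:
        raise ValueError("capacity must be positive")
    load = sum(parcel_weights_kg)
    if load > capacity_kg:
        raise ValueError("initial load exceeds vehicle capacity")
    if not parcel_weights_kg:
        return 0.0
    leg_loads = []
    for weight in parcel_weights_kg:
        leg_loads.append(load)  # load carried on the leg towards this stop
        load -= weight          # parcel handed over at the stop
    return sum(leg_loads) / (len(leg_loads) * capacity_kg)


# Hypothetical moped with 10 kg capacity and 2 kg parcels.
base = average_load_factor([2, 2, 2, 2], capacity_kg=10)        # 0.5
with_crowdshipping = average_load_factor([2, 2, 2], capacity_kg=10)  # 0.4
```

Under these assumptions, handing one parcel over to a crowdshipper lowers the starting load and therefore the average load factor, which is the direction of change reported for O11.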
Another interesting observation is that for scenarios with a low number of crowdshipping deliveries, i.e., 1, 2, 7, 10, and 13, values do not follow a specific pattern. Indicators' values seem to be affected more by the advancements of a more efficient route planning process, rather than the UFT solution per se. In scenarios, however, with a high number of crowdshipping deliveries, i.e., scenarios 6, 17 and 18, values show clear benefits.
The Operator analysis results for all scenarios in absolute numbers can be found in Appendix B. It is noted that indicators O6 "Traffic throughput" and O12 "Average freight vehicles' speed" are estimated as a direct output of the Vissim model evaluation attributes.
Regarding the Network performance indicator values, the freight trips removed by crowdshipping deliveries are a negligible proportion of the total daily trips. Thus, any changes to the total daily traffic impacts are attributed solely to the stochastic nature of the model and not to crowdshipping per se. Figure 5 shows the change of the Before-After LSI values at an operator and network performance level, with respect to the ratio of crowdshipping deliveries over the total deliveries for each scenario. Focusing on the operators' results, all crowdshipping scenarios are evaluated positively, as After scenario values were higher than the Before values.

Cross Scenario Analysis Results-LSI
Results show that scenarios with a higher number of crowdshipping deliveries tend to improve the overall performance of the operator by at least 6%, while networks' overall performance remains steady. A further analysis was conducted to reveal any correlations between the LSI change and the CD/TD ratio with regards to the adoption rates by public transport users. There is not a significant relationship between the operator LSI change and the CD/TD ratio, r(16) = 0.36, p > 0.05 (alpha = 0.05), see Figure 6.
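The reported non-significance can be checked with a short sketch that converts the correlation coefficient into a t statistic, t = r·sqrt(df / (1 − r²)), and compares it with the tabulated two-tailed critical value of Student's t for 16 degrees of freedom. The function and variable names are illustrative assumptions, and the critical value is hard-coded from a standard t-table rather than computed.

```python
import math


def pearson_t_statistic(r, df):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(df / (1.0 - r * r))


# Tabulated two-tailed critical value of Student's t for df = 16, alpha = 0.05.
T_CRITICAL_DF16 = 2.120

t_stat = pearson_t_statistic(r=0.36, df=16)   # about 1.54
significant = abs(t_stat) > T_CRITICAL_DF16   # False, consistent with p > 0.05
```

With df = 16, the coefficient would need to exceed roughly 0.47 in absolute value before the relationship became significant at the 5% level.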

Multiple
Table 7.

Concluding Discussion
This study shows that crowdshipping has the potential to contribute towards the overall sustainability of the urban freight transport system. Based on the analysis results, scenarios achieve an overall improvement from 6 to 13%.
Interestingly, LSI improvement cannot be correlated with the CD/TD ratio, which indicates that assigning only a small percentage (0-8%) of the daily deliveries to crowdshippers instead of freight vehicles can have measurable benefits for the operators, but an unclear contribution to the system's sustainability. A higher number of crowdshipping deliveries could probably reveal a clearer relationship between the CD/TD ratio and the system's sustainability. That is the reason why many similar studies started referring to crowdshipping as a way to address a latent need, rather than as a replacement for commercial deliveries [21]. However, certain benefits emerge, even without reaching high scalability levels for the solution. Analysis results captured a strong correlation between the CD/TD ratio and vehicles' utilization, and a medium relationship between the CD/TD ratio and VOC emissions, PM emissions, freight vehicle traffic throughput, and freight vehicles' average speed. As for vehicle utilization, more crowdshipping deliveries correspond to lower utilization of freight vehicles, and thus, less service time with various resulting benefits. Going one step further and inspired by the Physical Internet vision for moving from individual to interconnected logistics networks, crowdshipping could contribute to the transition to a shared fleet scheme of eco-friendly vehicles that promotes the priority of the European Green Deal for drastically less polluting freight transport [49].
Furthermore, a well-designed operation framework with a dense network of smart lockers which also supports deliveries with green vehicles could bring even more benefits than the ones described in this study. Coupling this framework with a crowdshipping app that aligns a PT user's origin and destination with those of the parcel could significantly increase the adoption rates and decrease delivery times. Especially in the era of COVID-19, in which courier companies are pushed to their limits, crowdshipping can provide immediate aid to last-mile delivery, as smart lockers combine both flexibility and contactless deliveries. Furthermore, it is logical to assume that a wider accessibility area of 5-10 min walking distance (instead of 1.5 and 3 min) and a higher number of installations across the city would bring more benefits. Restrictions related to spatial, economic, and safety issues, however, should not be overlooked [17].
Comparing the results of this study with other relevant studies in the literature, interesting conclusions can be drawn. Direct result comparison, however, cannot be performed as in many cases the assumptions and the scope are different. Analytically, Gatta et al. [50] analyzed the likely impact of a metro-based crowdshipping system in Rome from an economic and environmental point of view. The results show significant environmental benefits (more than 100% emission savings in CO, CO2, NOx, PM2.5), however, assuming substantially higher adoption rates (16.4% to 66.1%) and not 0.3% to 7.7%, as in this study. In addition, the demand in that study reflected the total number of potential crowdshipping deliveries in Rome (3500 to 14,100 per day), and not the proportionally very small number of crowdshipped deliveries that could be removed from the delivery planning of one courier company that serves 350 deliveries on a daily basis. Another difference between the two studies is that the study of Gatta et al. [50] assumes catchment areas of 800 m, instead of the 1.5 min (~150 m) to 3 min (~300 m) assumed in this study. Having to walk, as a final recipient, 800 m to get your parcel from a metro station locker and another 800 m back to your house may not be possible in many cases. Moreover, a similar study describes how crowdshipping services can be designed considering the proximity of delivery points and home addresses, with students' flows between origins and destinations and PT [51]. Although no impacts were calculated, the authors conclude that headways of at most 7 min can support successful deployment of the solution. In our study, the headways of buses are somewhat higher (ranging from 8 min, in some cases even lower, to 30 min), which inevitably limits the potential of wide adoption. Moreover, similar conclusions can also be drawn from different approaches that focus on private vehicle-based crowdshipping. For example, Dai et al. [52] estimated the environmental and economic benefits that would be brought by crowdshipping in comparison with traditional delivery, according to participants' stated willingness. Their results are also based on very high adoption rates (~43%), which could achieve a ~68% reduction in CO2 and 97% in PM and NOx, but also a ~25% increase in CO.

Limitations
Delivery data availability from more courier companies is a major limitation of this study. Combining the delivery trips of all six major courier companies in Volos, instead of one, could shed more light on the evaluation of the performance of the solution, not only at an operator level, but also at a network level.
Another limitation is that certain aspects of freight vehicles' trips in the model have not been thoroughly determined. For example, the scenarios in Vissim assume that parking spaces are available for minivans at all delivery addresses, which is not accurate. However, the modeling of such events needs detailed study. This is an interesting matter which could be further researched in the future, especially with the use of traffic microsimulation.
Finally, various operational aspects that go beyond the one-day horizon of this study, e.g., the length of time that parcels can stay in the locker until they are picked up, have not been studied.

Further Research
Apart from parking availability matters, future research could also be directed into determining the specifications for a wider adoption of crowdshipping by PT users. Buses, as the most common public transport mode, have the potential to contribute in this direction. Such a set of requirements could involve a dense network of smart lockers, reciprocal benefits for crowdshippers, ICT platforms for matching administered by public transport operators, or other micromobility issues that may discourage PT users from undertaking a delivery (e.g., maximum distances between a smart locker and a PT stop).

Acknowledgments: This research has been based on data provided by ELTA Courier company (https://www.elta-courier.gr (accessed on 9 April 2019)).

Conflicts of Interest:
The authors declare no conflict of interest.

Delivery data were requested and given by a private courier company which provides door-to-door deliveries in the city of Volos. As the dataset indicated that deliveries were performed between 09:00 and 19:27, traffic data were sought for the corresponding time period, 09:00-20:00. Morning and noon peak hour traffic volumes and travel times were given by the Traffic, Transportation and Logistics Laboratory (TTLog) of the University of Thessaly based on a travel survey conducted in 2011, 2012 and 2013 aiming at developing an origin-destination matrix of travel demand in the city of Volos. Evening peak hour traffic volumes were taken from a diploma thesis of the Department of Civil Engineering of the same University, which recorded traffic volumes at major intersections in an effort to store the city's traffic data in a structured way by using GIS (Tsitsogiannis, 2017). Traffic signal programs were provided by the Traffic Planning Department of the city of Volos, while public transportation routes, stops, and timetables were downloaded from the official website of Volos' public transportation (Astiko Ktel Volou). The remaining operational elements required were determined either from Google Maps or on-site observation.

Appendix A.2. Coding the Traffic Model in PTV Vissim
The traffic model was developed using PTV Vissim traffic simulation software. Vissim is a microscopic, time step and behavior-based simulation model developed to model urban traffic and public transit operations. The software can analyze traffic and transit operations considering constraints such as lane configuration, traffic composition, traffic signals, transit stops, etc.; thus, it can be a useful tool for the evaluation of various alternatives based on transportation planning measures of effectiveness (Matsuhashi et al., 2005). Regional, arterial, and characteristic collector roads running through the study area were selected to form the model's road network, as modeling all of the city's road sections would demand a very high amount of data and resources. The network was designed based on information on the geometrical characteristics of the routes drawn from Google Earth. The traffic lights were placed at the intersections of the network with the corresponding operating programs as provided by Volos' Traffic Planning Department. Vehicle composition during the morning and noon hours was determined by visual observation of historical images provided via Google Earth, as traffic volumes in the relevant dataset were given in passenger vehicle units, while for the evening hours it was determined by processing the traffic volumes. Emphasis was also given to the warmup period and the selection of driving parameters to imitate as accurately as possible the driving behavior of Greek drivers.

Appendix A.3. Model Calibration and Validation
The model's calibration and validation were based on the U.S. Department of Transportation (2019) criteria. Representative data collection points and travel time sections of the Vissim network were selected. The results of the selected points and sections were compared with real-world data in five simulation runs with different random seeds. The validation process is the one that ultimately verifies that a model outputs reliable updated data which can be used for various purposes. To validate the model, the generated average travel times of the time sections were compared to real travel time data from taxi GPS devices. Taxi GPS data were given by Volos' Taxi association for the period of 1 February-31 May 2019.
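A comparison of this kind can be sketched with two measures that are widely used in microsimulation practice: the GEH statistic for traffic volumes and the relative error for travel times. The text does not state which exact measures or thresholds were applied, so the thresholds below (GEH below 5, travel time within 15%) are assumptions chosen as common rules of thumb, and all names and numbers are hypothetical.

```python
import math


def geh(modeled_volume, observed_volume):
    """GEH statistic for hourly volumes: sqrt(2 * (M - C)^2 / (M + C))."""
    total = modeled_volume + observed_volume
    if total == 0:
        return 0.0
    return math.sqrt(2.0 * (modeled_volume - observed_volume) ** 2 / total)


def relative_error(simulated, observed):
    """Absolute relative deviation of a simulated value from the observed one."""
    if observed == 0:
        raise ValueError("observed value must be non-zero")
    return abs(simulated - observed) / abs(observed)


def share_meeting(values, threshold):
    """Fraction of values that fall strictly below the given threshold."""
    if not values:
        raise ValueError("at least one value is required")
    return sum(1 for v in values if v < threshold) / len(values)


# Hypothetical (modeled, observed) hourly volumes at data collection points.
volumes = [(1100, 1000), (480, 500), (950, 700)]
geh_values = [geh(m, c) for m, c in volumes]
geh_share = share_meeting(geh_values, threshold=5.0)  # assumed threshold

# Hypothetical (simulated, GPS-observed) travel times in seconds.
travel_times = [(110.0, 100.0), (200.0, 240.0)]
tt_errors = [relative_error(s, o) for s, o in travel_times]
tt_share = share_meeting(tt_errors, threshold=0.15)   # assumed threshold
```

The shares could then be compared against whatever acceptance target the chosen guideline prescribes, averaged over the simulation runs with different random seeds.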