Performance and Intrusiveness of Crowdshipping Systems: An Experiment with Commuting Cyclists in The Netherlands

Abstract: Crowdshipping systems are receiving increasing attention in both industry and academia. Different aspects of crowdshipping (summarized as platform, supply, and demand) are investigated in research. To date, the mutual influence of crowdshipping platform design and its supply side (with participating crowdshippers) has not yet been thoroughly investigated. This paper addresses this mutual influence by investigating the relations between shipping performance and intrusiveness to daily trips of commuters who voluntarily act as cycle couriers. In an experiment in The Hague, cyclists were asked to transport small parcels during a simulated daily commuting routine. The grid of commuting trips acted as a relay network to move parcels to their individual destinations. All the movements of the parcels were recorded by GPS trackers. The analysis indicates that a higher degree of complexity of rules in crowdshipping systems can lead to better system performance. Meanwhile, it also imposes higher intrusiveness, as participants need to deviate more from their routines of daily, uninterrupted trips. The case also suggests that a well-designed crowdshipping system can increase system performance without having to ask too much from crowdshippers. This study provides reference to better design such systems, and opens up directions for further research that can be used to provide thorough guidelines for the implementation of crowdshipping platforms.


Introduction
With the developing technology and society's growing concern over the environment, academic and industrial communities are rethinking the way we organize production [1], transportation [2], and logistics [3]. The development of novel methods and technologies provides previously unforeseen solutions to better management, operations and planning of industrial activities towards improved efficiency and sustainability. One example is tracing [4] and decision support systems [5] in perishable goods logistics.
Among the emerging conceptual solutions, crowdsourced delivery, or crowdshipping, is receiving increasing attention. Crowdsourced delivery can be defined as a delivery service, a business mode that designates the outsourcing of logistics to a crowd, while achieving economic benefits for all parties involved [6]. By making use of crowdsourced transportation capacities, deliveries of goods are performed without having to deploy dedicated logistics services. Instead, they are transported in a non-dedicated context. This means a reduced delivery cost for the owners of products and a decreased impact on the environment. Crowdshipping can take the form of almost all transportation modes: passenger cars [7], public transport [8], and taxis [9]; it can also be implemented in different ways in terms of network layout: point-to-point [10], hub-and-spoke [11], and multi-hop relays [12].
We narrow down our scope to city logistics, which by itself is complex and can be improved on many dimensions [13]. Literature sees crowdshipping as the more sustainable solution to city logistics [14], because of its potential in terms of economic and environmental benefits [15]. Pilot projects of crowdshipping platforms are increasingly reported in scientific studies and by industrial sectors. Rougès et al. [16] analyze 26 businesses run by companies and start-ups that provide platforms for crowdsourced delivery and point out that the potential power of crowdshipping could be one of the alternative transportation solutions. In particular, a multi-segment multi-carrier delivery mode called "TwedEx" is discussed in [12]: people carry packages secondary to their daily lives, e.g., commuting or shopping. In this context, each package is handled from person to person as relay based on overlaps in time and space until the package is delivered. This business model is further researched in simulated numerical studies in [17]. The analysis shows great potential in this business model, as it has remarkable speed and coverage. The authors call for "constructing and fielding (such) services", which could provide new insights for crowdsourced activities and business models.

Literature Review
Literature on crowdshipping mainly focuses on 3 aspects, namely supply, demand, and platform (readers are referred to [18] for a comprehensive literature review on these three parts of crowdshipping). The supply side focuses mainly on crowdshippers, who, with some compensation, provide casual delivery service according to their availability and willingness. For instance, Miller et al. [19] study commuters' behavior by sending out surveys to understand how willing people are to participate as workers in crowdsourced logistics. Chi et al. [20] use surveys and reveal that being recognized by an organization or a community is an important motivating reason to participate in crowdshipping. Le and Ukkusuri [21] suggest that socio-demographic characteristics, freight transportation experience, and social media usage significantly influence people's decision in participating as crowdshippers. Ermagun and Stathopoulos [22] carry out a comprehensive analysis of factors that can contribute to the response from the supply side.
The demand side concerns mainly parcel senders and receivers regarding topics such as characteristics of demand and customer trust. Frehe et al. [23] show that the usability of crowdshipping platforms and customer trust are important factors that influence customer acceptance of crowdshipping. Dablanc et al. [24] carry out a survey of crowdshipping platforms and find that prepared meals, groceries, retailing goods, and laundry are the most common delivery items. Gatta et al. [15] use a stated preference survey to estimate demand for crowdshipping and evaluate its economic and environmental impacts. Punel et al. [25] use a survey to study the motivations of people making use of crowdshipping services. They find that saving money is not the dominating factor that motivates customers to choose crowdshipping services. Rather, users tend to be driven by environmental concerns.
The performance of crowdshipping platforms has been analyzed extensively at a system level, mostly by optimizing the overall system performance with mathematical approaches in matching or task assignments. Chen et al. [26] develop a method for recommending tasks to mobile crowdworkers with the aim of maximizing the expected total rewards collected by all agents. Soto Setzke et al. [27] develop a matching algorithm that assigns items to drivers for delivery, with the objective of minimizing the additional travel time apart from planned routes. Chen et al. [9] use taxi data in a city as a reference to develop a strategy to minimize package delivery time by assigning paths to each package request. Arslan et al. [10] study a dynamic pickup and delivery problem in order to match the delivery requirements to existing traffic flow. All of these studies address crowdsourced delivery at the system level.
A few studies also report the joint effect of these aspects, which is an essential step towards comprehensive analysis that can support crowdshipping initiatives from multiple aspects. Marcucci et al. [8] conduct a survey that estimates people's willingness to participate in and pay for a crowdshipping service to analyze the market potential of crowdshipping. Rai et al. [28] reveal the relation between a crowdshipping platform and the crowdshippers. They recognize that having a "happy crowd" is important for a crowdshipping platform. In addition, a few studies focus on supply-platform relations. Zheng and Chen [29] investigate a crowdsourced task-assigning problem considering the possibility that participants may reject a task. They measure the willingness of participation using a probability function of rejection. Gdowsk et al. [30] aim to minimize the total cost in matching and routing with a crowdshipping platform. They use a probability function to consider the crowdshippers' willingness to accept or reject jobs. Kim et al. [31] introduce a "Hit-or-Wait" approach in order to balance the timing when participants are matched with tasks with minimal disruptions of their existing route. These studies consider either only the influence of supply (that is, crowdshippers) on platforms [29,30] or the platform's influence on the supply aspect [31]. A gap is identified here, as the two-way interaction between supply aspects and platforms is not fully investigated.
In this study, we take a unique standpoint by investigating the two-way interaction between the supply side and the platform aspect of crowdshipping. We particularly focus on the load of tasks imposed by the platform on crowdshippers and how participants mentally perceive this load. Unlike many other studies with data gathered from questionnaires, we use real-world experiments to reveal crowdshippers' behavior. As mentioned in [12], crowdshippers are not dedicated employees of delivery companies, thus the delivery tasks are only perceived as a "side-objective" apart from their daily life. Carrying a parcel and giving it to someone will inevitably introduce some degree of disruption to their normal living patterns. Naturally, the more disruption the system imposes on each participant, the more likely it is to reduce participants' willingness. On the other hand, a more demanding load can increase the overall performance of the crowdshipping platform. Therefore, understanding the relations among overall performance, the degree of disruption that tasks impose, and how this disruption is perceived can help design better crowdshipping platforms.
Our objective for this study is to investigate the relations between system-level performance and individual-level intrusiveness by means of real-world experiments. We conducted a case study in a small area in the Dutch city of The Hague. Volunteers were invited to cycle in this area. Meanwhile, they were asked to perform ad-hoc relays to deliver small mango parcels. The parcels were tracked by Global Positioning System (GPS) trackers.
The rest of the paper is organized as follows. In Section 3, we briefly look at the elements of crowdsourced systems and explain the experiment design for the case study. In Section 4, the results of the experiments are analyzed. We also discuss the experiment results and the potential of this form of crowdsourced logistics. Section 5 concludes this article and points out future research directions.

Case Study
Bicycles are considerably popular in The Netherlands. It is reported that, on average, 2.8 million cycling trips are made daily in the country [32]. Such a vast number of trips can provide substantial potential for bicycle crowdshipping. In our experiments, volunteers simulate commuting behavior on bicycles in an area in The Hague. Each volunteer follows his/her own route repeatedly. The volunteers are also asked to deliver packages of one mango to specific locations. When they run into each other, they may pass on the packages until the mangoes arrive at the destination. The package is GPS tracked. Thus, we can observe how the packages are transported in this area. This section starts with a brief overview of crowdsourced logistics systems to motivate our design approach for the experiments.

Crowdsourced Systems
A crowdsourced delivery system is also a self-organizing system. These bio-inspired systems, sometimes with high complexity, are based on entities that exhibit rather simple behaviors [33,34]. An ant colony is a great example: each ant follows rather simple patterns of behavior, yet together the ants form a crowd that can carry out highly complex tasks. In the same way, a well-defined set of rules is essential to mimic such a self-organizing system, because the complexity of the rules imposed on an individual participant is closely related to both the effectiveness of the system and the task load brought to each participant. A balance needs to be struck in designing a crowdsourced delivery system: the set of rules should be simple for each participant, so that the effort needed to comply with them does not discourage participants from taking the task, yet the rules also need to facilitate a logistics system that is complex enough to perform in a practical and profitable way.
We define the effort, physically and mentally, that each participant needs to make to complete a delivery task as "complexity of rules", and the perceived disruption caused to crowdshippers as "the level of intrusiveness". Figure 1 illustrates the relation of these elements: more complex rules and tighter constraints lead to higher efficiency in achieving better performance at a system level; on the other hand, they may be highly disruptive. We use the term "level of intrusiveness" to describe the tendency of rule-following to disrupt participants' daily lives. A higher level of intrusiveness will likely decrease the willingness of participants, for they need to go further to fulfill crowdsourced tasks. The experiment design in this study considers the complexity of rules and its impacts at both the system level and the individual level, to gain insight and give suggestions on designing crowdsourced logistics systems that balance system-wise effectiveness against the level of intrusiveness on participants.
Sustainability 2020, 12, x FOR PEER REVIEW 4 of 14
We name our case study the "Contingent Cycle Courier" (CCC) project. To best simulate the ad-hoc nature of crowdsourced activities, the CCC project adopts a multi-hop peer-to-peer approach for parcel deliveries. The simulated commuting routes of cyclists form a grid system (shown in Figure 2). The cyclists can form an ad-hoc relay to carry and hand over a parcel until it reaches its destination. In comparison with the traditional point-to-point [10] and the more recently discussed hub-and-spoke [11] methods, this approach requires more collaboration among participants, and thus provides more freedom to make extra steps (deviating from their default routes) to ensure a task is successfully completed. These extra steps can serve as indicators of how much intrusiveness a task brings to each participant.

Route Selection and Parcel Design
We invited 9 volunteers for parcel delivery. To simulate participants' different commuting routes, we chose an area in The Hague as shown in Figure 2. For each participant, a route was selected and numbered from 1 to 9, and the participants cycled back and forth along their designated routes. In route selection, we took into consideration the urban traffic and the safety of the participants, avoiding areas with heavier or more complex traffic. Before starting the experiments, the participants were gathered indoors to practice the activity using smaller-scale simulations so that they became familiar with the rules. At each of the points A, B, and C, shown in Figure 2, a crew member was present to give out parcels or to collect arrived parcels.
We designed parcels that are easy to carry on bicycles. The small parcel was given the nickname "Mango Equivalent Unit" (MEU). Figure 3 shows the design and actual size of an MEU. Before each MEU was given out to a participant, a GPS tracker was placed inside the parcel, with full awareness of all participants, to track the movements of the mangoes. The tracking data were then used for analysis.

Scenario Design
We designed 2 scenarios, each with a set of rules of a different degree of complexity. Note that we did not aim for optimal overall performance. Rather, the aim was to observe the impact of the complexity of rules on participants' behaviors, which in turn can affect system effectiveness. The experiment for each scenario lasted 30 min.

Scenario 1
In Scenario 1, the complexity of rules is lower. Cyclists follow the routes designated to them. When experiments start, parcels are handed over to Cyclist 1 and Cyclist 9 from point A. The cyclists can approach any other cyclists they encounter when following their own routes, to hand over a parcel. In the end, the parcels need to be delivered to point B or C.

Scenario 2
Scenario 2 has a more complex set of rules than Scenario 1. In Scenario 2, MEUs are handed out at point B and point C. The ones from point B need to be delivered to point C, and the ones from point C need to be delivered to point B. Each parcel carries a sticker with an icon and a color denoting its expected destination, so that a cyclist passes the parcel only to a person who can bring it closer to that destination. To ensure that the deliveries are fulfilled, we design a grid system and relevant rules to help the cyclists fulfill their tasks.
The grid system is applied to all cyclists on all routes, as shown in Figure 4. Each cyclist is assigned to one of the two dimensions of the grid, represented by colors or icons, and wears a hat that indicates the current direction of travel. Cyclists traveling in the east-west dimension put on a red hat when traveling east and a blue hat when traveling west. Cyclists traveling in the north-south dimension put on a hat with a "tin" icon when traveling north and a hat with a "flower" icon when traveling south. In this way, point B on the map, in the north-west corner, is denoted by the "tin" icon and the color blue. Similarly, point C in the south-east corner is denoted by the "flower" icon and the color red.
Half of the MEUs are handed out from point C, with a blue-and-tin sticker denoting their destination at point B; the other half start from point B and end at point C, which is denoted by red and flower. A cyclist carrying an MEU with a blue tin sticker can only pass the parcel to another cyclist wearing a blue hat or a hat with a tin icon. A cyclist carrying an MEU with a red flower sticker can only pass the parcel to another cyclist wearing a red hat or a hat with a flower icon. These rules ensure that each handover brings the parcel one step closer to its destination.
Table 1 compares the rules imposed on participants in the 2 scenarios. In Scenario 1, the destination of a parcel can be point B or C, thus only very basic rules are imposed to allow the parcels to "flow" in the network. In Scenario 2, extra instructions are given to increase the efficiency of the system. Note that in both scenarios, the rules for each participant do not specify the overall objective of the system: to deliver parcels to specific points. Rather, the instructions to each participant only state to whom they can pass on the parcel. This design is in line with the principle of a self-organizing system: simple rules imposed on each individual participant can achieve system-wise objectives that are more complex.
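The local passing rule of Scenario 2 can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not the procedure used in the experiment (participants followed the rule by sight). The names `HAT_MARKERS`, `STICKERS`, and `may_pass` are hypothetical and introduced here for the example.

```python
# Hypothetical encoding of the Scenario 2 markers (illustrative names only).
# East-west cyclists show a color, north-south cyclists show an icon.
HAT_MARKERS = {
    "east": "red",
    "west": "blue",
    "north": "tin",
    "south": "flower",
}

# Sticker on each MEU: the color and icon of its destination corner.
STICKERS = {
    "B": {"blue", "tin"},      # point B, north-west corner
    "C": {"red", "flower"},    # point C, south-east corner
}


def may_pass(destination: str, receiver_direction: str) -> bool:
    """Return True if a parcel bound for `destination` may be handed to a
    cyclist currently traveling in `receiver_direction`."""
    return HAT_MARKERS[receiver_direction] in STICKERS[destination]


if __name__ == "__main__":
    # A parcel bound for B may go to a westbound or northbound cyclist only.
    for direction in ("east", "west", "north", "south"):
        print(direction, may_pass("B", direction))
```

The check uses only what the carrier can see (the sticker and the receiver's hat), which mirrors the point made above: no participant needs to know the overall objective of the system.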

Results and Discussion
In this section, we discuss the results of the experiments by comparing the two scenarios. Figures 5 and 6 show typical routes of an MEU from Scenarios 1 and 2, respectively.

Indicators
Since the objectives, origins, and destinations of the two scenarios are different, we cannot directly compare the overall performance of the two scenarios. Therefore, the system performance and the individual's behavior should be observed from the details of journeys made by all the parcels in the experiment. For instance, in Figure 6, we observe a repetitive pattern caused by a cyclist traveling back-and-forth a few times, indicating reduced efficiency of the system performance. We need to investigate how each (successful) delivery and each relay (contributing to successful deliveries) can influence the system-wise performance. We also need to observe how individuals perceive the complexity of the rules, i.e., the intrusiveness imposed on them. With these, the relation between the complexity of the rules, the level of intrusiveness, and how they could affect the system's overall performance can be observed in the two scenarios.

Pass
The number of passes denotes how many times each parcel hops from one cyclist to another. Note that when a mango-carrying cyclist turns around at the endpoint of the route and begins to travel backward (with a cap-switching motion), it also counts as one pass. This indicator gives an idea of how long the journey is before the parcel is delivered, and thus reflects the effectiveness of the overall system.

Long-Wait
Each cyclist may choose to wait at an intersection to hand the parcel over to someone else. This was not specified in the rules but was not forbidden either. If a cyclist waits for more than 30 s at an intersection in order to give the parcel away, this pass is counted as one long waiting pass. Note that when a mango-carrying cyclist turns around without giving the mango to others, it also counts as one long waiting pass if the cyclist waits for more than 30 s at the turning point. This indicator reflects the extent to which cyclists are willing to deviate from their own routines to complete a relay. This helps us understand the level of intrusiveness of relay tasks.
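As a minimal sketch of this indicator, assume that a waiting time in seconds has already been derived for every pass from the GPS trace (that preprocessing step is not described here and is an assumption). The function names and the sample values below are hypothetical.

```python
LONG_WAIT_THRESHOLD_S = 30  # threshold stated in the text


def count_long_waits(wait_times_s):
    """Count passes whose waiting time is strictly more than 30 s."""
    return sum(1 for w in wait_times_s if w > LONG_WAIT_THRESHOLD_S)


def long_wait_share(wait_times_s):
    """Share of long-wait passes among all passes (0.0 if there are none)."""
    if not wait_times_s:
        return 0.0
    return count_long_waits(wait_times_s) / len(wait_times_s)


if __name__ == "__main__":
    # Made-up waiting times for five passes of one parcel, in seconds.
    waits = [5, 45, 30, 31, 0]
    print(count_long_waits(waits), long_wait_share(waits))
```

A wait of exactly 30 s is not counted here, following the wording "more than 30 s".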

Turn-Around
We count the number of turning-around actions of mango-carrying cyclists. If a cyclist turns around with a parcel in hand, the mango travels a longer distance than necessary to be successfully delivered, which could lead to lower efficiency of the overall system. This gives us an idea of the effectiveness of the logistics system, and in particular, the efficiency of relaying activities.
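The pass and turn-around indicators can be illustrated together. The sketch below assumes that the journey of one parcel is available as a time-ordered list of (cyclist, direction) states, with a new entry whenever the carrier or the travel direction changes. This representation and the function name are assumptions made for the example.

```python
def count_passes_and_turnarounds(holder_states):
    """Return (passes, turnarounds) for one parcel.

    `holder_states` is a time-ordered list of (cyclist_id, direction) tuples.
    A change of cyclist is a pass; a change of direction by the same cyclist
    is a turn-around, which also counts as one pass.
    """
    passes = 0
    turnarounds = 0
    for (prev_cyclist, prev_dir), (cyclist, direction) in zip(
        holder_states, holder_states[1:]
    ):
        if cyclist != prev_cyclist:
            passes += 1          # parcel hops to another cyclist
        elif direction != prev_dir:
            passes += 1          # turn-around is counted as a pass as well
            turnarounds += 1
    return passes, turnarounds


if __name__ == "__main__":
    # Made-up journey: cyclist 1 turns around, hands over to cyclist 4,
    # who turns around and hands over to cyclist 7.
    journey = [(1, "east"), (1, "west"), (4, "south"), (4, "north"), (7, "west")]
    print(count_passes_and_turnarounds(journey))
```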

Overlap
We count the number of routes that each mango covers more than once. This helps us understand how much of the distance traveled by each mango is non-effective, which directly relates to the effectiveness of the overall logistics system.
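One possible reading of this indicator is sketched below, under two assumptions that are not stated in the text: the journey is given as a list of links between intersections, and a link counts as overlapped regardless of the direction in which it is traveled. The function name and the intersection labels are hypothetical.

```python
from collections import Counter


def count_overlaps(links):
    """Number of distinct links that one parcel traversed more than once.

    `links` is a time-ordered list of (from_node, to_node) pairs.
    Direction is ignored, so (a, b) and (b, a) are the same link.
    """
    visits = Counter(frozenset(link) for link in links)
    return sum(1 for n in visits.values() if n > 1)


if __name__ == "__main__":
    # Made-up journey over intersections a, b, c, d.
    journey = [("a", "b"), ("b", "a"), ("a", "b"), ("b", "c"), ("c", "d"), ("d", "c")]
    print(count_overlaps(journey))
```

If repeated traversals rather than distinct links were to be counted, the sum would instead run over `n - 1` for each link visited more than once.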

Results
We summarize the data collected from our GPS trackers in Table 2. The list comes in two sections: total count (which includes all parcels' movements) and successful delivery (which only includes movements of parcels that are successfully delivered within the given time). Number of successful deliveries, the average number of passes, long-waits, turn-arounds, and overlaps per delivery are shown in Table 3. Numbers of long-waits, turn-arounds, overlaps, time spent per pass/relay that leads to successful deliveries are shown in Table 4. The standard deviation of time spent per relay and the number of passes per 5 min are also shown in the table.

Analysis and Findings
We do not compare the number of successful deliveries between the two experiments, because this number is affected by the origin-destination arrangements. We only investigate the motions of parcels in terms of "per delivery" and "per pass" to gain insight into the behaviors of the participants.
More complex rules bring higher intrusiveness. In both Figures 7 and 8, Scenario 2 has a higher percentage of long-waiting passes than Scenario 1. This indicates that participants put more effort into adjusting their commuting activity in order to finish delivery tasks when the rules are stricter, as in Scenario 2. In other words, as the complexity of rules increases, the logistics activity becomes more demanding. As a result, the participants respond by putting more effort into fulfilling their tasks, deviating to a greater extent from their commuting behavior. This suggests that a more complex set of rules brings higher intrusiveness to participants.
More complex rules can contribute to better effectiveness. As shown in Table 3, the average time spent per successful delivery is lower in Scenario 2. Table 4 shows that the average time spent per pass is also lower. This indicates that the system with more complex rules yields higher effectiveness of the whole system.
Well-designed rules increase the reliability of the system. Comparing Figures 7 and 8, trips of MEUs differ greatly between unsuccessful and successful deliveries in Scenario 1. Passes leading to successful deliveries in Scenario 1 have a higher percentage of long-waits and a lower percentage of turn-arounds and overlaps. This could be partly because each experiment lasted 30 min, and MEUs relayed with less efficiency were stopped manually when the time was up. In Scenario 2, however, the two figures show consistent shares of long-waits, turn-arounds and overlaps for delivered and undelivered MEUs. If the experiment had continued for a longer time, the undelivered parcels in Scenario 2 would probably have been delivered with the same efficiency. This concerns the reliability of delivery efficiency. To explain this we use Figure 9, which depicts the relation between the number of passes per 5 min and the number of long-wait passes within each delivery. Each point represents one successfully delivered MEU. The position of a point reflects system effectiveness (on the x-axis) and the level of intrusiveness (on the y-axis). In Scenario 2, the system effectiveness increases along with the level of intrusiveness (although not strictly linearly, a trend is shown). However, in Scenario 1, the points are scattered, denoting that efforts made by participants are not directed to contribute to system effectiveness. In other words, higher intrusiveness imposed by rules does not necessarily yield higher effectiveness. As a result, the reliability of overall system performance is lower with the set of rules for Scenario 1. This resonates with the higher standard deviation of travel time per pass in Scenario 1 compared with Scenario 2 in Table 4.
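The contrast between a visible trend and a scattered cloud of points could be made more explicit with a correlation coefficient. The following is a minimal sketch, not the analysis performed in this study: the `pearson` helper and the point lists are hypothetical, and the values are invented for illustration rather than read from Figure 9.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical (passes per 5 min, long-wait passes) pairs, one per delivered MEU.
# These are placeholder values, NOT the points plotted in Figure 9.
scenario_1 = [(1.0, 3), (2.0, 1), (3.0, 1), (4.0, 3)]
scenario_2 = [(1.0, 1), (2.0, 2), (3.0, 2), (4.0, 4)]

for name, points in (("Scenario 1", scenario_1), ("Scenario 2", scenario_2)):
    xs = [p[0] for p in points]  # effectiveness proxy (x-axis)
    ys = [p[1] for p in points]  # intrusiveness proxy (y-axis)
    print(f"{name}: r = {pearson(xs, ys):.2f}")
```

With real data, a coefficient near zero would correspond to the scattered pattern described for Scenario 1, while a clearly positive value would correspond to the trend described for Scenario 2. Given the small number of delivered MEUs, such a coefficient should be read as descriptive only.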
Marginal system effectiveness may not be at the same level as the marginal level of intrusiveness. In Scenario 2, the system-wise improvements are reflected (in Table 4) by (1) 13.0% less time spent per pass in successful deliveries and (2) a 52.2% lower standard deviation of time spent per pass in successful deliveries. Meanwhile, these improvements come with an increase in the intrusiveness imposed by the system on the participants (reflected by an 18.0% increase in long-wait passes). On the other hand, there is a significant increase (149%) in overlapping links traveled by participants, which introduces some ineffectiveness even though the task is more complex and the average time spent per pass or per delivery is lower. It is certain that a change in rule complexity can change participants' behaviors when they perform delivery tasks, which can affect overall system performance. It is, however, unclear which aspects of the behavior change will have a greater impact on system-wise performance. A sound assumption is that the impact of increased task complexity/intrusiveness on overall system performance depends on the system design. Similarly, different parameters may contribute to the overall performance to different extents, also depending on the system design. Hence it is necessary to conduct further quantitative analysis on the benefits and downsides of increasing task complexity/intrusiveness in a crowdsourced system. This is especially noteworthy since crowdshipping platform designers have to consider the impact of changing task loads on performance at the system level as well as on participant behavior at the individual level. With this insight, they can adjust the task load in the most beneficial way according to the needs of the platform. In this way, a well-designed crowdshipping platform can accomplish much without requiring more than necessary from crowdshippers.
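The percentage differences quoted above are relative changes of a Scenario 2 indicator with respect to its Scenario 1 value. A minimal sketch of that arithmetic follows, assuming per-pass travel times are available as plain lists; the function name `relative_change` and the sample times are hypothetical and are not taken from Table 4.

```python
from statistics import mean, stdev


def relative_change(baseline, new):
    """Percentage change of `new` relative to `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline * 100.0


# Hypothetical per-pass travel times in minutes; placeholder values,
# NOT the measurements summarized in Table 4.
times_s1 = [4.0, 8.0, 12.0, 16.0]
times_s2 = [6.0, 8.0, 10.0, 12.0]

# A negative value means the indicator is lower in Scenario 2.
mean_change = relative_change(mean(times_s1), mean(times_s2))
std_change = relative_change(stdev(times_s1), stdev(times_s2))

print(f"mean time per pass: {mean_change:+.1f}%")
print(f"std of time per pass: {std_change:+.1f}%")
```

Computing every indicator this way, for effectiveness and intrusiveness alike, puts them on a common percentage scale, which is what allows the changes to be set side by side as in the paragraph above.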

Conclusions
Rarely does scientific research focus on the mutual influence between supply aspects and crowdshipping platform design. This paper takes a first, yet preliminary, step to analyze the two-way relations between these two important pillars of crowdshipping systems by means of real-world experiments. In particular, the investigation seeks to understand how the design of a crowdshipping system may influence the interactions between the complexity of rules, the level of intrusiveness, and overall system effectiveness. We recruited volunteers to participate as bicycle crowdshippers in the case study, where they simulated their daily commuting actions. In the meantime, they formed an ad-hoc relay network to move small parcels of mangoes and eventually deliver them to certain locations. Each parcel was equipped with a GPS tracker. The experiments were performed in an area in The Hague, Netherlands. Two scenarios with rules of different complexity were applied. Data from the GPS trackers were retrieved and analyzed.
The analysis reveals that the way a crowdshipping platform is designed influences both the system level and the individual level of crowdshippers. On the one hand, more complex rules impose a higher level of intrusiveness. Thus participants, apart from pursuing their primary goals (i.e., their daily lives), may need to take extra steps, mentally and in practice, to follow the instructions given by the logistics system. On the other hand, more complex rules may contribute to better overall system performance (in this study, efficiency and reliability). In addition, our analysis indicates that when rules become more complex, the selected indicators may not change by the same amount as the level of intrusiveness. Crowdshipping platform designers need to analyze the impact of a change in task load on system-wise performance as well as on participants' behaviors, in order to adjust the design in the best way. This is especially noteworthy as it illustrates the significance of rule design in crowdsourced systems: a well-designed system can accomplish much without being too intrusive and without requiring more than necessary from participants. The study also provides a reference from an economic perspective, as more intrusiveness could be paired with higher rewards, which keeps the logistics activity attractive to participants.
There are certain limitations to this study. Firstly, the experiments involve only two comparison groups with relatively few participants at a small scale, which makes it difficult to conduct more thorough, quantitative studies. Secondly, the participants in the experiments might have seen these tasks as their primary goal rather than a secondary one, as the simulated commuting behavior is not their actual commuting behavior. Nevertheless, this does not diminish the value of the study, as it reveals the importance of system design in balancing system-wise performance and intrusiveness for individual participants in a crowdsourced logistics system. It also points to directions for further and more thorough study of crowdshipping. In future research, it is worthwhile to conduct larger-scale experiments based on people's real daily activities. It is also of interest to quantify the level of intrusiveness and/or the complexity of rules/tasks in a more explicit form, which may serve as a step towards providing analytical support for better crowdshipping system design.