Simulating the Cost of Cooperation: A Recipe for Collaborative Problem-Solving

Abstract: Collective problem-solving and decision-making, along with other forms of online collaboration, are central phenomena within ICT. There have been several attempts to create systems able to go beyond the passive accumulation of data. However, those systems often neglect important variables such as group size, the difficulty of the tasks, the tendency to cooperate, and the presence of selfish individuals (free riders). Given the complex relations among those variables, numerical simulations could be the ideal tool to explore such relationships. We take into account the cost of cooperation in collaborative problem solving by employing several simulated scenarios. The role of two parameters was explored: the capacity, the group's capability to solve increasingly challenging tasks coupled with the collective knowledge of a group, and the payoff, an individual's own benefit in terms of new knowledge acquired.


Introduction
Crowdsourcing and, more generally, group decision-making and collective problem-solving are central topics in the cognitive computation field [1][2][3]. There have been many attempts to exploit the properties of human information exchange in order to improve collective decision-making [1][2][3]. In this paper, we employ a numerical simulation framework for crowdsourcing [4], built on social and cognitive-inspired simulations in the sociophysics tradition, to investigate the role of the cost of cooperation and its interaction with other variables (group size, difficulty of the task, the presence of selfish individuals, etc.). We highlight the importance of the cost of cooperation and determine the conditions under which higher costs do not hinder the overall performance.

The Importance of Crowdsourcing
Engaging a community of experts in solving complex problems, or stakeholders in gathering new ideas, has become an increasingly common practice. Such processes are generally known as crowdsourcing.
For example, in 2009, the mathematician Tim Gowers started the Polymath Project, a collaboration in which many mathematicians coordinated their efforts to solve difficult mathematical problems. The basic idea was to persuade them to collaborate in order to find the best way to the solution. In just a few weeks, this community of mathematicians was able not only to solve the proposed problem but also to find the solution to a more difficult, generalized version of it [5].
Moreover, group decision-making and collective intelligence are the core concepts of certain crowdsourcing models (e.g., Open Collaboration) [6]. Nowadays, problem-solving is no longer seen as the action of a single individual. Groups and communities have become central in ensuring a distributed, plural, and collaborative decision-making process [7]. In this sense, the crowd has proved capable of solving highly complex problems that traditional problem-solving teams cannot settle.
Although there are various definitions of crowdsourcing, a feature common to many of them [8][9][10][11][12] is that they conceive of such dynamics as a widespread problem-solver.

Limitations of the Extant Literature
Given the new possibilities created by information and communication technologies, collaborative decision-making has become a central topic within many fields, including cognitive computing. For example, the authors of [2] developed CO-WORKER, a real-time and context-aware system able to exploit information exchange in human interactions, going beyond passive data storage. By inferring contextual information during several different activities (learning, discussion, cooperation, decision-making, and problem-solving), the system actively engages the participants with respect to communication, meetings, information sharing, and work processes, among other activities. However, CO-WORKER assumes that people will collaborate with the platform: issues such as the number of interacting individuals, the difficulty of the task, and, in particular, the cost of cooperation (i.e., the possibility that some participants will not put enough effort into engaging with the system) are neglected, although they are crucial factors in determining the success of the system. The same applies to other collaborative knowledge-building architectures (e.g., TeamWork station, Virtual Math Team, and Dolphin) and, more generally, to systems that employ specific techniques (such as fuzzy logic and aggregation operators) in order to improve group decision-making by reaching a certain level of consensus [1]. In this case too, the above-mentioned variables are usually neglected, even though they are crucial when a community of experts solves problems. Another important example is the development of semantically structured data and metadata through the annotation of resources [1]. Much effort has been devoted to the development of semantic web-based annotation systems able to facilitate the creation of user annotations. However, even in this case, the cost of cooperation may hinder the entire system. What if the user does not engage in the activity because of laziness, or a lack of attention or motivation? Exploring the factors that influence group decision-making and, more generally, online collaboration may contribute important information to the extant literature on the development of systems aimed at exploiting collaborative problem-solving. However, those insights would not be applicable to all forms of crowdsourcing, since crowdsourcing itself is a broad and complex theme.

Factors Affecting Group Decision-Making: A Numerical Simulation Approach
Many variables affect group decision-making in problem-solving [13], such as cognitive [14], social [15], motivational [16], and evolutive [17] factors. Therefore, it can be assumed that a group needs to solve a problem whose solution may produce benefits for the entire community as well as for single individuals. Although many crowdsourcing projects include individuals who do not necessarily know each other (i.e., who do not share a common identity), we have chosen to use the term "community" in order to cover other typologies of crowdsourcing, for instance, those related to organized and online communities [18,19], as it better represents our perspective (i.e., the production of collective knowledge by means of direct interaction among individuals).
Depending on personality factors, motivation, and cognitive variables, an individual may choose to combine his or her effort with that of other members or to remain an individualist (the so-called free rider). In the first case, if the subgroup of people who cooperate finds a positive solution to the problem, that solution can benefit each individual even if their contribution to it was small. However, free riders play an important role from an evolutive point of view: they have smaller chances of solving the problem, but if they find a solution, they learn much more than when the solution is found collectively. In the real world, as well as in virtual environments, cooperative individuals live and interact with those who behave selfishly. In this sense, it is important to understand which factors affect the decision to act in a pro-social manner (i.e., to cooperate in order to achieve a common goal). Individual differences in the tendency to cooperate are not attributable only to genetic factors (or, in a broader sense, to individual aspects), even though these certainly play a significant role. The environment, and therefore learning processes, also sharply influences cooperation and competition dynamics. For instance, social contexts (e.g., culturally related socialization experiences) appear to predispose individuals to adopt one strategy or another [20]. According to the social heuristics hypothesis [21], people internalize those strategies that are generally advantageous in everyday social interactions and carry them over into atypical social environments (e.g., virtual environments and laboratory experiments).
Recently, cognitive science has paid special attention to the role of contextual variables that influence cooperation dynamics. In fact, today's technological society has prompted individuals to confront increasingly complex cognitive tasks, and one of the ways in which humans have responded to this complexity is by working in groups, of which crowdsourcing could be considered the numerically largest instance [22]. The environment that is created within a team (e.g., shared and interactive team cognition) can facilitate or hinder the achievement of a cooperative goal [23,24]. In addition, the interaction with situational variables (e.g., the time available to make a choice, group size, or the complexity of the task) influences the outcome of the decision-making process in a non-trivial manner by making certain problem-solving strategies more or less salient [25,26]. Furthermore, computational models [27] and field studies [28] from other disciplines emphasize the role of group size in supporting the level and the quality of interactivity among individuals (i.e., the production of collective knowledge). For instance, experimental literature on social dilemmas suggests that different types of group-size effects on cooperation are possible (negative, positive, and curvilinear), depending on the payoff structure of the game [29,30]. In a recent study, task complexity was further investigated [31]. Although micro-tasks have become increasingly common within crowdsourcing practices, not all problem-solving situations can be addressed with such an approach.
Another factor that can influence the tendency of individuals to cooperate is the cost of cooperation. In fact, every human interaction involves a cost. In the simplest case, these costs concern communication and coordination among individuals (e.g., the Ringelmann effect). However, one way to think about the cost of cooperation brings up the concept of reciprocity, that is, the risk that our own cooperative behavior will not be reciprocated. With few guarantees that cooperation will not be exploited, the cost (the risk) of cooperative behavior increases, and this has a negative effect on all cooperation levels [32]. Conversely, a lower exploitation risk (lower cost) positively affects cooperative dynamics. For instance, the possibility to identify our social partners effectively [33,34], to reward or punish them [35,36], or to spread rumors about them (i.e., to gossip) [37,38] seems to positively affect the establishment and the maintenance of good levels of cooperation. This phenomenon, which involves the intricate relationship among group size, the difficulty of the problem, the tendency to collaborate or not, and many other variables, is very complex, all the more so given that the results provided by recent literature refer to small-group situations.
Contributions in psychology have successfully handled the complexity of such psychological aspects by resorting to agent-based modeling (ABM) [39]. The ABM approach has proved able to account for dynamics characterized by many interdependent individuals who adapt their behavior to the demands of the social environment [40][41][42]. Moreover, some of the aforementioned psychological aspects that influence cooperation dynamics (e.g., reputation, peer influence, and empathy) have been modeled in order to replicate human decision-making [43].
A simulation approach based on social and physical principles can be useful to model this phenomenon by taking into account larger groups, as in the case of crowdsourcing. It is worth stressing that sociophysical models have already been shown to be very useful for understanding social phenomena related to crowdsourcing. In particular, we remind the reader of opinion dynamics models proper, such as the voter and the Deffuant models, which describe the evolution of opinions in a population of agents that share their ideas (the former for discrete opinions, the latter for continuous ones) and, with suitable modifications, make it possible to simulate simple but realistic situations [44][45][46]. Galam's works [47,48], on the other hand, focused on the effects of minorities or of agents with anti-social behavior, such as contrarians, further refining the efficacy of sociophysical models and simulations and showing the versatility and usefulness of the numerical approach alongside the purely theoretical and experimental ones. Finally, a further step forward was accomplished in [49,50], where cooperation and defection were explicitly added to the models as strategies adoptable by individuals, allowing for a better understanding of the interplay among the cost of cooperation, the irrationality of the agents, and the topology in the emergence and evolution of pro-social behaviors. Following this path, together with the work carried out in [4], we believe more light can be shed on the phenomenon of crowdsourcing and its implications in human societies.
As hinted above, this work is based on a recent paper [4] in which the authors proposed a modeling framework for crowdsourcing in relation to the level of collectivism that characterizes the community facing the problem. More specifically, the model investigated how dividing a population with a fixed number of subjects (called players) into several smaller groups affects the ability of these groups to solve problems of variable difficulty (tasks). Several scenarios were explored, ranging from one in which everybody was in the same group to one in which each player worked alone. The idea was to determine the optimal group size that would allow its players to learn the most. More precisely, the role of two parameters was explored: the capacity, the group's capability to solve increasingly challenging tasks coupled with the collective knowledge of a group, and the payoff, an individual's own benefit in terms of new knowledge acquired. The rationale behind these two scores was to model the incremental nature of human advances. The latest scientific discoveries depend on previous discoveries, which literally set up the conditions for such advancement. The famous quote by Isaac Newton, "If I have seen further, it is by standing on the shoulders of giants", clearly describes such dynamics. In other words, we are speaking about a chain of fitness gains, where the total gain is larger than the simple sum of the payoffs due to single advancements. Therefore, the framework postulates the two distinct gaining schemes cited above, the capacity and the payoff. In short, the former reflects society's knowledge accumulated over history, whereas the latter reflects the individual knowledge related to skills for daily problem-solving in a given time and context.

Aim of the Study: Protecting Crowdsourcing from the Costs of Cooperation
Previous simulations have shown that, when facing not-so-hard tasks, the tendency to collaborate in a group is inversely proportional to its size. Moreover, regardless of the difficulty of the task, there is an optimal group size at which collectivism and individualism are balanced, achieving the highest fitness and capacity. However, such simulations did not take into account the cost associated with collaboration. Experimental literature on social dilemmas has stressed that the cost of cooperation greatly impacts cooperation itself. Indeed, cooperation levels are related negatively to the cost and positively to the benefits of cooperation [51,52]. Many studies have shown that trying to solve a problem with other people involves many different kinds of costs, such as cognitive [53] and communicative ones [54], as well as the need to reach consensus and deal with relationships among members [55]. Exploiting the collective intelligence [56] of a group requires each member to pay a variety of costs. The crucial point is to evaluate the trade-off between such costs and the individual gain associated with collaboration. In this study, we added a cost for cooperation to Guazzini's model. It is trivial to predict that, by adding a cost for cooperation, the rate of cooperation decreases. However, for authors like Rachlin, the capability of acting altruistically (i.e., paying a cost to benefit someone else) resides in the ability to ignore the short-term benefits of behavioral alternatives and to give greater importance to the long-term gains of pro-social actions [57]. If this is true, we might expect a lack of sensitivity to the magnitude of the cost of cooperation in a whole range of possible scenarios. Furthermore, the presence of a cost certainly cannot motivate "selfish" agents to change their strategies, so any decrease in cooperation levels would presumably be due to the abandonment of their basic strategies by those agents with a greater tendency to cooperate. Nevertheless, these agents may offer some resistance to changing their strategies as the cost increases. In addition, given the complex interaction among the variables at stake, we can expect that this decrement will interact with the size of the group and the difficulty of the task.

The Model: Settings and Simulations
Modern sociophysics and cognitive modeling frequently merge their approaches and languages, developing hybrid methods and model architectures [58]. Such a trend has allowed the sociocognitive sciences to go beyond the limitations of the "classic" approach based on game theory (e.g., public goods games), sometimes capturing the minimal complexity required to "understand" the dynamics of human social systems [59]. The complexity of our approach lies mainly in the way we implemented the collective dynamics of the agents. Despite such complexity, the computational model describing the cognitive dynamics of the agents is very simple and represents a standard in the "computational modeling of cognition" [60,61]. On the other hand, the "toy sociophysical model" we propose is devoted to bridging the agents' dynamics with the study of the collective competition between groups. Such a model has already been validated in a previous publication and represents the first attempt to capture the concurrent interplay between group competition and agent cooperation within the groups [62]. Moreover, in order to mimic the "indirect reciprocity" effect [33], we introduced an explicit representation of the "group knowledge", defined as the result of the past altruistic behaviours of its agents. In this way, the agent's "basic" tendency to free-ride on the others is dynamically moderated by the evolutionary selection of the agents, which is also based on their "group knowledge". Finally, such an interplay, merging cognitive and psychosocial modeling, has been implemented as follows.
We divided a population of $N$ players into $n$ groups with the same size $S$, so that $S = N/n$. We took $N = 64$ and seven values of $S$ = 1, 2, 4, 8, 16, 32, 64 ($N = 64$ remained fixed). The algorithm assigned a value $p_i$ (chosen uniformly between 0 and 1) to each player $i$ in each group. The value $p_i$ characterized each player, remained the same over time, and measured the player's tendency to collaborate with other group members when solving a certain task (i.e., the propensity to work collectively as opposed to individually). More precisely, a small value of $p_i$ (close to 0) indicated a tendency towards individualism, while a large value (close to 1) indicated a propensity towards collaboration. We stress that $p_i$ is in all respects the strategy of player $i$: it must therefore be considered an innate feature of each individual, independent of other quantities. As usual in most game-theoretic models (see, for example, [63]), under the evolution rule individuals with higher fitness are more likely to reproduce, so that their strategies survive to the detriment of the others. Indeed, our goal is to understand which strategies are best depending on the values of the model parameters.
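The population setup described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' MATLAB code: the function name `make_groups` and the choice to represent a player solely by its propensity $p_i$ are assumptions made here for brevity.

```python
import random

N = 64                                   # total number of players (from the text)
GROUP_SIZES = [1, 2, 4, 8, 16, 32, 64]   # values of S explored in the text

def make_groups(n_players, size, rng):
    """Split n_players into n = n_players // size groups of equal size.

    Assumption: each player is represented only by its cooperation
    propensity p_i, drawn uniformly in [0, 1).
    """
    if n_players % size != 0:
        raise ValueError("group size must divide the population")
    return [[rng.random() for _ in range(size)]
            for _ in range(n_players // size)]

rng = random.Random(0)                   # fixed seed, for reproducibility only
groups = make_groups(N, 8, rng)          # 8 groups of 8 players
```

Keeping the groups as lists of equal length makes it easy to preserve the group size $S$ when players are later replaced.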
In a subsequent phase, a task was assigned to a group. The task was represented by the value $R$, which indicated the simplicity of the task and was chosen randomly from six values ($R$ = 0.01, 0.1, 0.3, 0.5, 0.7, 0.9). Values of $R$ close to 0 indicated a hard task, while values close to 1 indicated an easy task. Each of the $n$ groups worked in parallel to solve a task with the same simplicity $R$. For each group size $S$ and each value of the task simplicity $R$, we ran a sequence of games. Each iteration of the game was divided into three steps.
(1) First, we determined whether each player in a group was a collectivist or an individualist. Each player $i$ had a probability $p_i$ of being a collectivist, in which case the player collaborated with the other collectivists in the group to solve the problem; otherwise, the player was an individualist, who still benefited from the group but tried to solve the task alone.
(2) Second, if player $i$ belonging to group $j$ was a collectivist, the expected gain ($G_i$) was fixed at $G_i = C_j + 1$, with $C_j$ representing the cardinality of group $j$, described in detail below, that is, the level of knowledge reached by the group during the previous turns (i.e., experience). On the contrary, if player $i$ decided to adopt an individualistic strategy, the desired gain $G^*_i$ was chosen uniformly at random (between 1 and 10). A larger $G^*_i$ meant a smaller probability of solving the task, but a potentially greater gain if the task was solved. This reflected the greater effort that the individualist needed to solve the task, in exchange for a greater reward that was not shared with the group. The choice to draw the individualists' gain at random, unlike in the collectivists' case, is a conservative one: while collectivists work together for a common goal, an individualist struggles for a given objective, which is harder or easier according to the specific instance. We could have set up the model so that individualists selected the possible gain following a given rule; however, since on average individualists face every kind of task, for simplicity we preferred to draw it at random.
(3) Third, the algorithm determined whether the task was actually solved by each player. A collectivist player solved it with probability $R$, whereas an individualist player solved it with probability $R^{G^*_i}$. Since $R \le 1$, a larger desired gain $G^*_i$ meant a smaller chance of solving the problem. As a consequence, the advantage of being a collectivist is always having the opportunity to gain a fitness equal to the group knowledge plus one ($C_j + 1$) with probability $R$, while an individualist gains a certain amount of fitness ($F$) with probability $R^F$.
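The three steps of one iteration can be sketched for a single player as follows. This is a minimal sketch under stated assumptions: the function name `play_turn` is hypothetical, and the desired gain $G^*_i$ is drawn here as a continuous value in [1, 10], since the text does not say whether it is continuous or integer.

```python
import random

def play_turn(p_i, R, C_j, rng):
    """Return (role, target_gain, solved) for one player in one iteration."""
    if rng.random() < p_i:
        # Step 1: the player acts as a collectivist with probability p_i.
        role = "C"
        gain = C_j + 1              # Step 2: G_i = C_j + 1
        p_solve = R                 # Step 3: solves with probability R
    else:
        role = "I"
        gain = rng.uniform(1, 10)   # Step 2: G*_i uniform in [1, 10] (assumed continuous)
        p_solve = R ** gain         # Step 3: solves with probability R^(G*_i)
    solved = rng.random() < p_solve
    return role, gain, solved

rng = random.Random(42)
print(play_turn(p_i=0.7, R=0.5, C_j=3, rng=rng))
```

The sketch only decides the role, the target gain, and whether the task is solved; how the resulting gain is shared within the group is a separate step.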
The expected gain used to study free-riding dynamics cannot always be known at the beginning. Indeed, the success (and thus the expected gain) of crowdsourcing applications and platforms relies massively on users' adoption and participation [64]. In this sense, this first phase of crowdsourcing projects resembles a social dilemma [65]. The gain resulting from crowdsourcing is unpredictable and depends on the use that others make of such platforms. Choosing not to tie the decision to cooperate or compete to the expected earnings could be considered a conservative solution that reflects this first phase of crowdsourcing projects.
Each group, regardless of its iteration-dependent division into collectivists and individualists, was indexed with $j$, to distinguish it from the index $i$ of the players within a group. For the players who solved the task in a given iteration, the algorithm assigned the scores as follows:
• The cardinality $C_j$ equaled the group's capacity to solve increasingly challenging tasks (e.g., the collective knowledge of a group). It was an integer parameter equal to the number of iterations in which one collectivist solved the task, regardless of $R$. At the beginning of the experiment, it was set to $C_j = 0$ for all groups and then updated as $C_j \to C_j + 1$ each time one collectivist player solved the task.
• The player's fitness or payoff $\pi_i$ represented a player's own benefit in terms of new knowledge acquired. If a collectivist (C) or an individualist (I) failed to solve the task, their fitness increased only through the others' contribution, $\pi_i = \frac{C_j}{S} \sum C_j$, with $\sum C_j$ equal to the number of cooperators belonging to the group $j$ of player $i$ who solved the task in the game turn. However, if a collectivist solved the task, it contributed an additional fitness of $C^*_j / S$, with $C^*_j = C_j + 1$ becoming the updated cardinality of the group. In addition to the gain shared by the collectivists in the group, an individualist who solved the task gained an additional fitness equal to the desired gain $G^*_i$.
• Furthermore, the cooperative players in the group needed to coordinate and synchronize their efforts to solve the problem, whereas individualists, acting alone, did not have to pay this cost. To represent this difference, the collectivist player's fitness was always computed net of a term $\delta_c$, an additional cost of cooperation that every collectivist was assumed to pay in order to synchronize their effort with the group. Individualists were not directly affected by this cost.
Such a payoff model aims to represent the idea that collectivists distribute new knowledge both to themselves and to all the others, while individualists keep it for themselves. Collectivists solved tasks more easily since they worked together, but with potentially less new knowledge (fitness) for each of them separately. In contrast, by working alone, individualists who solved harder tasks learned much more, since they avoided sharing this new knowledge with the others.
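The trade-off between the two strategies can be made concrete by comparing their expected direct gains in a single turn. The sketch below is an illustration built from the probabilities stated in the text, not the authors' formula: it ignores the shares received from other group members, and it assumes that the cost equals $\delta_c$ times the collectivist's expected gain $C_j + 1$ and is paid whether or not the task is solved. The function names are hypothetical.

```python
def expected_collectivist(R, C_j, delta_c=0.0):
    """Expected direct gain of a collectivist in one turn.

    Gains C_j + 1 with probability R; the cost delta_c * (C_j + 1) is
    assumed to be paid in every turn, solved or not.
    """
    gain = C_j + 1
    return R * gain - delta_c * gain

def expected_individualist(R, F):
    """Expected direct gain of an individualist aiming at gain F.

    Gains F with probability R ** F, and pays no cooperation cost.
    """
    return F * R ** F

# Compare the strategies for a hard, an intermediate, and an easy task.
for R in (0.01, 0.5, 0.9):
    best_F = max(range(1, 11), key=lambda F: expected_individualist(R, F))
    print(R,
          expected_collectivist(R, C_j=0, delta_c=0.1),
          best_F,
          expected_individualist(R, best_F))
```

Under these assumptions, a collectivist facing a very hard task pays the cost far more often than it collects a gain, which is consistent with the negative final fitness reported in the Results.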
In summary, the dynamics of the system implemented by our model are ruled by two linked equations (Equations (1)-(4)), which determine, respectively, the agent's personal gain (i.e., the gain coming from its own game turn) and the payoff of an agent, which also depends on the possible cooperators' contribution. The average gain ($\gamma_i$) is the direct contribution to each player's own fitness in a single turn of the game; it is expressed by Equation (1), or separately for collectivists (C) and individualists (I) by Equations (2) and (3). The fitness of each player in a turn of the game, $\pi_i$, is then defined as the total gain of that player at the end of the turn, deriving both from its own contribution ($\gamma_i$) and from the contribution of the cooperators $k$ who solved the task during the turn within the same group as $i$. $\pi_i$ is expressed by Equation (4), with $k \neq i$. Again, $\pi_i$ can be expressed separately for collectivists ($\pi^C_i$) and individualists ($\pi^I_i$), as in Equations (5) and (6). Finally, if we introduce the cost of cooperation ($\delta_c$) and consider time, the expected fitness of an agent $i$ at a certain time $t$ is a sum in which the first term represents the contribution of the cooperative actions, while the second term represents that of the individualistic actions.
The simulations involved $n$ groups of size $S$ simultaneously for a given $R$. An entire game consisted of 2000 rounds, and each round was interrupted after 1000 iterations in order to check the fitness of the players. The average fitness $\bar{\pi}$ of all players, regardless of the group they belonged to, was computed. At random, 20% of the players whose fitness was below $\bar{\pi}$ were removed and replaced by new ones, whose $p_i$ was drawn anew, so that the group size $S$ was preserved. From one round to another, all group capacities $C_j$ and all players' fitnesses $\pi_i$ were reset to 0, while the value of $R$ remained the same; only the composition of the groups in terms of the players' $p_i$, and hence the distribution of $p_i$ within each group, changed from one round to another. The fitter players were kept in the game, as well as 80% of the less fit players. It was the player's $p_i$ and its relationship with the other players' $p_i$ that dictated the player's overall performance in any game. The system evolved over 2000 rounds, with an evolutionary selection applied at the beginning of each round and then after a number of iterations; these rounds were sufficient to reach a stable configuration. Finally, a different series of simulations was run in order to test the effect of the cost of cooperation ($\delta_c$). The control parameter $\delta_c$ took six different values: $\delta_c$ = 0, 10%, 30%, 50%, 70%, and 90% of the collectivist players' expected gain. A version of the MATLAB code implementing the numerical simulations is provided in Appendix A.
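The evolutionary selection applied between rounds can be sketched as follows. This is an illustrative Python sketch, not the MATLAB code of Appendix A: the function name `select` is hypothetical, players are stored in flat lists, and the number of replaced players is assumed to be 20% of the below-average players, rounded down.

```python
import random

def select(p, fitness, rng, frac=0.2):
    """Replace a random fraction of the below-average players.

    p       : list of cooperation propensities p_i, one per player
    fitness : list of fitness values pi_i, in the same order
    Replaced players keep their position, so group sizes are preserved.
    """
    mean_fit = sum(fitness) / len(fitness)
    below = [i for i, f in enumerate(fitness) if f < mean_fit]
    n_replace = int(frac * len(below))   # assumption: rounded down
    new_p = list(p)
    for i in rng.sample(below, n_replace):
        new_p[i] = rng.random()          # newcomer with a freshly drawn p_i
    return new_p

rng = random.Random(1)
p = [0.5] * 10
fitness = list(range(10))                # players 0..4 are below the mean of 4.5
new_p = select(p, fitness, rng)
```

Because only below-average players can be replaced, strategies associated with higher fitness persist from round to round, which is what drives the population towards a stable distribution of $p_i$.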

Results
Considering the effect of the cost of cooperation as a function of problem simplicity (Figure 1 Left), the simpler the task (from $R = 0.9$ to $R = 0.1$), the smaller the difference in the final agent fitness. Moreover, the difference increases from 15 to 55% as the cooperation cost grows. On the contrary, for a difficult task ($R = 0.01$), the relationship between the cost of cooperation and the difference in fitness is almost linear. The main reason for such behavior is that the cost of cooperation reduces the fitness measure in two ways: (i) directly, since the agent has to pay a cost to cooperate, and (ii) indirectly, since fewer agents want to cooperate because of the direct cost, so that cooperation is infrequent and the agents have fewer advantages. However, from the point of view of the community size (Figure 2 Right), the cost of cooperation affects the smaller groups more than the larger ones. For the smallest community size (i.e., $S = 1$), as well as for the smallest problem simplicity (i.e., $R = 0.01$), the final difference in fitness is greater than 100%. Such an effect is due to the fact that, especially for very difficult tasks (i.e., $R = 0.01$), the cost of cooperation is frequently paid without any subsequent payoff, therefore producing a negative final fitness for the agent.
Regarding the maximum group capacity reached by the system at equilibrium (Figure 1), a general decrease emerges as the cost of cooperation grows. The effect is caused by the reduction of collectivist behavior within the system. Nevertheless, its magnitude is largely affected by the two control parameters of the system (i.e., problem simplicity and the size of the group). In particular, as shown in the left plot of Figure 1, the greatest reduction affects the systems facing the hardest problem (i.e., simplicity of the task = 0.01): almost independently of the cost parameter (δ_c), the final group capacity is reduced by about 40% in comparison to the zero-cost condition. On the contrary, the systems facing the easiest problems (i.e., simplicity of the task = 0.9 and 0.7) do not appear to be greatly affected by the cost, with a reduction of the final group capacity that always stays below 5%, almost independently of the cost of cooperation. Interestingly, the systems that faced problems of intermediate complexity (i.e., R = 0.1, 0.3, 0.5) prove to be the most sensitive to the cost of cooperation. For instance, a group challenged by a problem simplicity of R = 0.1 showed a loss of around 13% when the cost of cooperation was equal to 10% of the expected gain, reaching a loss of 35% for a cooperation cost of 90%. Likewise, the results show an effect of community size (right plot of Figure 1). The larger the community, the smaller both the magnitude of the capacity reduction and the sensitivity to the cost of cooperation appear to be. In particular, in the extreme case of isolated individuals, the group capacity reduction ranges from 2%, for a cooperation cost of 10%, to 60%, for a cooperation cost of 90%.

In general, when the cost of cooperation is zero, there is an inversely proportional relation between the average probability of cooperation and the size of the group (Figure 3, left). Moreover, this relation holds regardless of the difficulty of the task. Another important aspect is that the optimal equilibrium is reached when there are two groups of size 32 each. In fact, for smaller groups, more competition is required because an agent has no interest in splitting the gain equally with the others. This aspect is accentuated when cooperation has a cost: the average probability of cooperation decreases significantly for the harder tasks (r = 0.1 and r = 0.01), in particular for smaller groups. Similarly, considering the complexity of the problem in Figure 3 (right) without the cost of cooperation, the average probability of cooperation is stable when the size is between 1 and 32, while it decreases directly with more complex tasks from s = 64. With the cooperation cost, the tendency in larger groups (s = 64) is to defect regardless of the complexity of the problem. For smaller groups, this effect is clear for harder tasks (complexity < 0.3), while for simpler tasks (complexity 0.9) the average probability of cooperation converges to values similar to those reached without any cooperation cost.

Figure 4 presents the final average probability of cooperation as a function of the problem simplicity (left subfigure) and of the community size (right subfigure). In both plots, the case of a cooperation cost of 90% is represented in red and compared to the baseline condition (i.e., cooperation cost of 10%) in black. In contrast to the case of a cooperation cost of 10%, the payment required to cooperate dramatically changes the final configuration of the system. It is worth noting, as shown in the right subfigure, that the effect of the cost appears to be similar in the two extreme conditions S = 1 and S = 64. In other words, in both extreme cases, the introduction of a cooperation cost reduces the average probability of collectivist behavior almost independently of the complexity of the problem faced. The remaining system sizes (i.e., S = 32, 16, 8, 4, 2) also appear to be strongly affected by the cost, showing a noticeable increase of the average cooperation probability only for very simple tasks (i.e., R = 0.7, 0.9). Finally, the left subfigure shows another qualitative shift with respect to the cooperation cost of 10% as regards the relation between the frequency of collectivist behavior and the community size. With the hardest tasks (R = 0.01, 0.1), the tendency to cooperate collapses for all community sizes, because the final values are always below 30%. On the other hand, for less challenging tasks (R = 0.9, 0.7, 0.5, 0.3), we observe a maximum of the functions for intermediate values of the community size. A relation between this maximum and the size of the group, in the case of a cooperation cost of 90%, can also be observed. In particular, it appears that the greater the simplicity of the task, the smaller the fragmentation of the most cooperative system.

Conclusions
The simulations presented here allowed us to investigate the complex relationship among the tendency to cooperate, group size, the cost of cooperation, and the difficulty of the task. Our results indicate that, when an agent has to pay a cost, this price reduces the fitness both directly and indirectly (cooperation is less frequent and brings fewer advantages). These dynamics are modulated by the difficulty of the task, i.e., increasing the cooperation cost has a greater impact on the fitness of the agents in the case of very difficult problems. The reduction of cooperation due to the cost is mitigated by task simplicity and group size. To sum up, the larger the community, the smaller the decrease in capacity and the lower the sensitivity to the cost of cooperation. Such results indicate that, when dealing with small groups and hard tasks in concrete applications, it is advisable to control and reduce the cost of cooperation with ad hoc interventions. At the same time, however, we have to consider the effects that already emerged in [4] and that are confirmed by our numerical simulations: beyond a certain size of a given interacting group, we registered a collapse in its performance (i.e., the production of collective knowledge). Taken together, our findings could provide valuable insights for structured virtual environments and for the psychosocial ergonomics of web-based systems in relation to widespread scientific and laboratory problem-solving.

These results also underline the importance of the design of crowdsourcing tasks. Complex problems do not need to be divided into smaller parts to be solved. Sometimes simple tasks are better than little (or micro) tasks. An effective design allows people whose experience or knowledge is limited to perform like expert individuals, i.e., to produce qualitatively better knowledge than expected [66]. Therefore, as our simulations also seem to suggest, making complex problems simpler (i.e., easily understandable, executable, and with the least possible degree of inherent uncertainty) helps to establish a higher level of cooperation within the group. Furthermore, the reduction of complexity obtainable through task design may have another effect, which concerns the cost of cooperation. Although in our model the cost of cooperation and the complexity of the task are treated as two separate parameters, in reality there is an area of overlap, albeit a limited one. In fact, difficult tasks intrinsically involve higher task-related costs. Therefore, it is reasonable to expect that a reduction of complexity also indirectly affects the levels of cooperation through the reduction of the costs linked to the task. As demonstrated above, one of the ways in which the cost of cooperation can be conceived is through the concept of reciprocity. Signs indicating a lower risk of exploitation of cooperation could reduce the cost of cooperation itself and facilitate the collaboration process [33,38,67] by means of an accurate modeling of task ergonomics.
We hope that our results can serve as guidelines for the development of systems aimed at exploiting group problem-solving, group decision-making, and, more generally, online collaboration [1][2][3]. For example, CO-WORKER [2] could benefit from our results by adapting its architecture to account for the fact that the difficulty of the task, the size of the community engaged, and, of course, the cost of cooperation can affect the final result. The ergonomics of the interface, across all facets of human-computer interaction, could be made more user-centred in all the cases where the collective decision may be hindered by the cost of cooperation. In such cases, other solutions could embed incentive systems (or other motivational elements) in the proposed architecture, or automatically warn users that the size of the community is not sufficient given the difficulty of the problem and the level of engagement. The same may be applied to the more general field of group decision-making modeling, where the proposed formalism [3] could take into account the factors investigated in this paper and their relations. A similar consideration holds for collaboration tasks of trivial difficulty (such as annotating videos) [1], where the motivation and the cost of cooperation can nonetheless be crucial for the effectiveness of the task (i.e., the development of semantic web technologies).
Clearly, our results appear to be strictly applicable only to certain types of crowdsourcing. For instance, interactions within groups must not be episodic. Therefore, our indications are unlikely to greatly benefit virtual labor marketplaces (e.g., Amazon's M-Turk and Crowdflower) and those activities known as tournament crowdsourcing, while open collaboration projects appear better placed to exploit our findings [6].
However, it is necessary to stress that the observed results are based on a simulation study. Given the difficulty and the cost of performing empirical investigations of similar scenarios, it is preferable to start with numerical simulations under reasonable assumptions and then move on to empirical investigations. Therefore, these studies need to be completed with a direct empirical test of the observed results. Such an empirical investigation could also be carried out by employing an architecture such as CO-WORKER.
In conclusion, the cost of cooperation can affect the tendency to cooperate in a non-trivial way, so future simulations and empirical research should investigate this point further, and concrete applications should take it into account.

Compliance with Ethical Standards
This article does not contain any studies with human participants or animals performed by any of the authors.

Data Availability Statement
The dataset collected in this research can be obtained by writing to Andrea Guazzini at andrea.guazzini@gmail.com.
Acknowledgments: An abstract of a preliminary version of this work appeared in the Proceedings of the 38th Annual Meeting of the Cognitive Science Society (Philadelphia, 10-13 August 2016) with the title "Simulating the cost of cooperation: A recipe for collaborative problem-solving".

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
Algorithm A1 presents the pseudocode for a single round. The procedure takes six inputs: N is the number of players, r is the complexity factor, s is the group size, P is a vector with the probability of cooperation for each player i, and C and F are vectors with the cardinality of the groups and the fitness of each group j, respectively. First, we initialize a support vector groupList in which we store the indices of the groups that solved the task (Line 2). Each player i faces the task either by cooperating within the group (Lines 4-15) or as an individualist (Lines 16-20). In the former case, the player pays the cost of cooperation (Line 5) and tries to solve the task (Lines 6-15). If the task is solved, the fitness F is incremented and the group is credited with a reward. The group receives only one reward, regardless of how many of its players solved the task (Lines 12-14). In the latter case, the individualist player has a random expected gain, which is a positive integer (Line 17); the greater the expected gain, the lower the chance of solving the task (Line 18) and receiving the gain (Line 19).
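The round described above can be sketched in Python as follows. This is an illustrative reading rather than the authors' implementation: the success probability `r ** gain`, the per-player fitness list, the cooperative gain C_j + 1, the cost charged as a fraction of the expected gain, the `max_gain` bound, and a reward that raises the group capacity by one are all assumptions made to obtain a runnable example.

```python
import random


def play_round(r, group_of, p_coop, capacity, fitness, cost, max_gain, rng):
    """Play one round in place; return the sorted indices of the groups that solved.

    r        -- task simplicity in [0, 1]; assumed success probability is r ** gain
    group_of -- group index of each player
    p_coop   -- probability of cooperation of each player
    capacity -- per-group capacity C_j (updated in place)
    fitness  -- per-player fitness (updated in place; assumed bookkeeping)
    cost     -- cost of cooperation as a fraction of the expected gain
    max_gain -- hypothetical upper bound of the individualist's random gain
    """
    solved = set()
    for i, g in enumerate(group_of):
        if p_coop[i] > rng.random():          # the player cooperates
            gain = capacity[g] + 1            # assumed expected gain: C_j + 1
            fitness[i] -= cost * gain         # the cost is paid before trying
            if rng.random() < r ** gain:      # harder tasks succeed less often
                fitness[i] += gain
                solved.add(g)
        else:                                 # the player acts as an individualist
            gain = rng.randint(1, max_gain)   # random positive integer gain
            if rng.random() < r ** gain:
                fitness[i] += gain
    for g in solved:                          # one reward per group, at most
        capacity[g] += 1
    return sorted(solved)


rng = random.Random(42)
groups = [0, 0, 1, 1]
capacity, fitness = [0, 0], [0.0] * 4
print(play_round(0.5, groups, [0.8] * 4, capacity, fitness, 0.1, 3, rng))
```

Collecting the solving groups in a set and updating the capacity only after the player loop is one simple way to guarantee that a group is rewarded once per round, however many of its members succeed.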
Algorithm A1 Game Round Algorithm.
    G_i = C_j + 1 ▷ Expected gain
    return groupList, C
26: end procedure

Algorithm A2 presents the pseudocode for the simulation. The procedure takes seven inputs: R is the vector with the complexity factors, S is the vector with the group sizes, Epochs and Rounds are the number of epochs and rounds, respectively, N is the number of players, f is the genetic evolution coefficient, and iteration is the number of repetitions in each experiment. The first lines, from 2 to 5, define the cycles that repeat the experiment for all conditions. We repeat the experiment iteration times to gather statistical evidence (Line 2), with the different complexity factors in vector R (Line 4) and with groups of different sizes in vector S. These cycles are independent of each other, so we use a parallel for (ParFor) to speed up the process. With a fixed complexity factor r and group size s, the population of players is initialized (Lines 6-7): each player i has a collaborative strategy, which is the probability to collaborate CS(i), and belongs to a group identified by an index stored in groupOf(i). The experiment runs for a certain number of epochs (Line 9). At the start of each epoch e, the cardinality of the group C_j and the fitness F are reset (Lines 10-11). The players then try to solve a task for a certain number of rounds (Lines 12-16). At the end of epoch e, the population evolves, keeping the best players and replacing the worst (Lines 18-22). More precisely, each player whose fitness is below the average fitness of the population has a chance of changing its p_i. The change is regulated by the genetic evolution coefficient f = 0.2, which means that we remove about 20% of the population below the average fitness.
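The evolution step and the enumeration of the experimental conditions can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' code: the function names `evolve` and `conditions` are hypothetical, and the rule that a replaced player receives a new probability of cooperation drawn uniformly from [0, 1) is an assumption, since the text does not specify how the new strategy is chosen.

```python
import random
from itertools import product


def evolve(p_coop, fitness, f, rng):
    """Return the strategies for the next epoch.

    Each player whose fitness is below the population average is replaced
    with probability f (the genetic evolution coefficient).
    """
    mean_fitness = sum(fitness) / len(fitness)
    new_p = list(p_coop)
    for i, fit in enumerate(fitness):
        if fit < mean_fitness and rng.random() < f:
            new_p[i] = rng.random()  # assumption: fresh uniform strategy
    return new_p


def conditions(R, S, iterations):
    """Enumerate the independent (iteration, r, s) experiments.

    The cycles do not depend on each other, so each tuple could be
    dispatched to a separate worker (the ParFor of the pseudocode).
    """
    return list(product(range(iterations), R, S))


rng = random.Random(0)
strategies = [0.1, 0.2, 0.3, 0.4]
scores = [0.0, 0.0, 10.0, 10.0]          # hypothetical end-of-epoch fitness
print(evolve(strategies, scores, 0.2, rng))
print(len(conditions([0.01, 0.1, 0.9], [1, 2, 4], 5)))  # 45 experiments
```

With f = 0.2, roughly one in five of the below-average players is replaced at the end of each epoch, while players at or above the average always keep their strategy.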

Figure 1. Percentage differences of the final group cardinality (i.e., the maximum complexity of the problem solved in the past) as a function of the cost of cooperation, for each problem simplicity (left plot) and for each group size (right plot).

Figure 2. Percentage differences of the final agent fitness as a function of the cost of cooperation, for each problem simplicity (left plot) and for each group size (right plot).

Figure 3. Comparison of the final average probability of cooperation for the condition with the cost of cooperation equal to 0 (dark lines) and the cost of cooperation equal to 10% (red lines), as a function of the problem simplicity (left plot) and of the group size (right plot).

Figure 4. Final average probability of cooperation for the condition with the cooperation cost equal to 0 (dark lines) and the cooperation cost equal to 90% (red lines), as a function of the problem simplicity (left plot) and of the group size (right plot).

1: procedure PLAYROUND(N, r, s, P, C, F)
2:     groupList ← empty ▷ Groups which solved the task
3:     for i = 1 to N do ▷ The players try to solve the task
4:         if p_i > rand then ▷ The player cooperates
5: