Emergence of simple characteristics for heterogeneous complex social agents

Models of interacting social agents often represent agents as very simple entities with a small number of degrees of freedom, as exemplified by binary opinion models. Understanding how such simple individual characteristics may emerge from potentially much more complex agents is thus a natural question. It has been proposed recently in [E. Bertin, P. Jensen, C.R. Phys. 20, 329 (2019)] that some types of interactions among agents with many internal degrees of freedom may lead to a 'simplification' of agents, which are then effectively described by a small number of internal degrees of freedom. Here, we generalize the model to account for agents' intrinsic heterogeneity. We find two different simplification regimes: one dominated by interactions, where agents become simple and identical as in the homogeneous model, and one where agents remain strongly heterogeneous although they effectively have simple characteristics.


Introduction
When modeling complex systems, statistical physicists often posit that the interacting entities they consider have simple individual properties, and that the possibly more complex behavior observed at a collective level, when considering many such simple entities, is simply the result of interactions between entities [1,2,3,4]. The emergence of macroscopically different properties from simpler interacting entities, often through a collective symmetry-breaking mechanism, was emphasized long ago by P. W. Anderson in his seminal paper "More is Different" [5] as one of the key mechanisms at play to account for the wealth of different objects or behaviors found in the real world. In that paper, Anderson actually presents the statistical physics approach as a generic and somewhat iterative method to deal with the emerging complexity in a multilevel way, the outcome of a given level of analysis being the building blocks (i.e., the interacting entities) of the next level. The key role played by symmetry-breaking phenomena in the emerging complexity at the collective scale has been widely acknowledged in many different contexts (see, e.g., [6,7,2,8] among many others). Interestingly, a perhaps less explicit suggestion of Anderson's paper is also to consider statistical physics models where the interacting entities are not simple objects with a handful of characteristics, but are already rather complex objects with many internal degrees of freedom. Although it was formulated almost fifty years ago, this suggestion looks very timely today.
In the last decades, statistical physics has gone beyond the equilibrium paradigm based on molecular entities, and has indeed started to consider assemblies of macroscopic and potentially complex objects like grains of sand [9,10], or active particles modeling for instance some types of bacteria, or self-phoretic colloids [8]. However, in these examples, macroscopic particles may in practice still be considered as simple particles, as their many internal degrees of freedom can be subsumed into a small number of effective parameters encoding their macroscopic, non-equilibrium character, like dissipation coefficients [9,10] or self-propulsion forces [8]. To find genuine examples of assemblies of complex entities in a theoretical context, one may rather turn to the field of population dynamics and evolution in theoretical biology, where for instance large populations of individuals characterized by a complicated genome evolve under some evolutionary rules [11,12].
When considering models of social systems, it would be necessary at first sight to take into account the intrinsic complexity of human beings [1]. However, this complexity is too extreme to be captured by any type of statistical physics model, so that statistical physicists have often considered very simple models of social agents, retaining a small number of characteristics, for instance a binary [13] or continuous [14] opinion. A natural question is then to understand how more complex agents could in some situations reduce their intrinsic complexity to effectively appear as simple. This question can already be addressed within the framework of statistical physics, because there is no need to model the full complexity of human beings to address at least some aspects of this issue. A simple tentative answer has been given in [15], by illustrating on a toy model how the simplification of agents with many internal degrees of freedom may result from interactions among agents. This point of view is qualitatively consistent with the idea advocated by some sociologists that 'the whole is less than the parts' [16,17], in the sense that, roughly speaking, human beings may leave aside part of their complexity to build a group. It has been argued in [15] that this point of view can be reconciled with the seemingly antagonistic viewpoint of statistical physicists according to which 'the whole is more than the sum of the parts', due to collective phenomena and symmetry breaking. Yet, the model introduced in [15] considered an assembly of identical agents, while agents' heterogeneity may be expected to be an important characteristic of human beings. In this note, we extend the model of [15] by including heterogeneity between agents. We show, using methods inspired by the physics of glasses, that two different types of agent simplification can occur in this model, one driven by interactions as in [15], and the other one driven by heterogeneity.

Model
We introduce a model of complex agents that generalizes the one introduced in [15] by considering heterogeneous agents. The model is composed of $N$ interacting agents, each having an internal state described by a configuration $C_i \in \{1, \dots, H\}$, with $i = 1, \dots, N$, where the number $H$ of configurations is large. For later convenience, we write $H$ in the form $H = n_0^M$, where $n_0$ is a fixed integer and $M$ is assumed to be large. This is typically the case if the configuration $C_i$ is composed of $M$ degrees of freedom, each taking $n_0$ possible values. Each agent is endowed with a characteristic that can be either present or absent depending on the configuration $C_i$. Intuitively, this characteristic could be a preference for a specific kind of music or tempo for the members of a vocal ensemble [17], or more generally be related to a binary opinion [13]. This feature is conveniently encoded by a variable $S_i(C_i) \in \{0, 1\}$: the characteristic is present when $S_i(C_i) = 1$, and absent when $S_i(C_i) = 0$. The general idea is that the characteristic is present only in a small number of internal states, so that it typically remains unobserved unless there is a strong probability bias towards the few configurations for which $S_i(C_i) = 1$. To implement this idea in practice, we assume that the characteristic is present in a single configuration, which we label $C_i = 1$.
We now need to define the dynamics of agents. Following standard practice in the modeling of social agents [18], we assume that the dynamics is driven by an individual utility function $u_i = u_i(C_i | C_{j \neq i})$ that accounts both for individual preferences and for interactions with other agents. An agent $i$ stochastically changes configuration according to the following rule. Given the current configuration $C_i$, the new configuration $C_i'$ is randomly chosen with a probability rate given by the logit rule,
$$W(C_i' | C_i) = \frac{1}{1 + e^{-\Delta u_i / T}}, \tag{1}$$
where $\Delta u_i = u_i(C_i' | C_{j \neq i}) - u_i(C_i | C_{j \neq i})$ is the variation of utility generated by the change of configuration (note that the configurations $C_j$ of the other agents $j \neq i$ are kept fixed). The parameter $T$ plays a role similar to temperature in statistical physics, and characterizes the degree of stochasticity in the decision rule.
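As an illustration of the decision rule, the logit rate can be sketched in a few lines of Python. This is a minimal sketch that assumes the standard logit form $1/(1 + e^{-\Delta u_i/T})$ with unit prefactor; the function name `logit_rate` is hypothetical and not part of the model definition.

```python
import math

def logit_rate(delta_u: float, temperature: float) -> float:
    """Logit transition rate 1 / (1 + exp(-delta_u / T)).

    Assumption: unit prefactor, so the rate lies between 0 and 1.
    """
    x = delta_u / temperature
    # Evaluate the logistic function without overflowing for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

# A move that leaves the utility unchanged is accepted at rate 1/2.
rate_neutral = logit_rate(0.0, 1.0)
# The ratio of forward and backward rates is exp(delta_u / T).
rate_ratio = logit_rate(2.0, 0.5) / logit_rate(-2.0, 0.5)
```

The ratio of forward to backward rates depends only on $\Delta u_i / T$. Low $T$ makes utility-increasing moves nearly deterministic, while high $T$ makes all moves almost equally likely.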
Our goal is to model heterogeneous agents that interact through their characteristic $S_i$ (their internal state is otherwise invisible to other agents). With this aim in mind, we choose the following form of the utility function of agent $i$,
$$u_i(C_i | C_{j \neq i}) = U_i(C_i) + \frac{K}{N} \sum_{j (\neq i)} S_i(C_i)\, S_j(C_j), \tag{2}$$
where $U_i(C_i)$ is the intrinsic (or idiosyncratic) utility of configuration $C_i$ for agent $i$, and $K$ is the coupling constant characterizing the interaction with the other agents. Note the $1/N$ scaling of the interaction term, typical of fully connected models where all particles or agents interact with each other in a similar way. This interaction term was already present in the homogeneous model of Ref. [15]. We then model the heterogeneity of agents as a quenched randomness of the intrinsic utilities (which were absent from the model of [15]). More precisely, for all $i = 1, \dots, N$ and $C_i = 1, \dots, H$, the intrinsic utility $U_i(C_i)$ is randomly drawn from a Gaussian distribution
$$\rho(U) = \frac{1}{\sqrt{\pi M J^2}} \exp\left(-\frac{U^2}{M J^2}\right), \tag{3}$$
where $J$ sets the strength of the disorder and where we recall that $M$ is defined by $H = n_0^M$. The utilities $U_i(C_i)$ do not change in time.
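The quenched disorder can be illustrated by drawing the intrinsic utilities once and for all. The sketch below assumes a zero-mean Gaussian of variance $M J^2/2$, which is the value implied by the variance quoted later in the text for sums of the $U_i(1)$. The helper name `draw_intrinsic_utilities` and the parameter values are purely illustrative.

```python
import math
import random

def draw_intrinsic_utilities(n_agents, n0, m, j, seed=0):
    """Return U[i][c]: one frozen Gaussian utility per agent and configuration.

    Assumption: zero mean and variance M * J**2 / 2.
    Index c = 0 stands for the configuration labelled C = 1 in the text.
    """
    rng = random.Random(seed)
    h = n0 ** m                     # number of configurations, H = n0**M
    sigma = j * math.sqrt(m / 2.0)  # standard deviation of each utility
    return [[rng.gauss(0.0, sigma) for _ in range(h)] for _ in range(n_agents)]

# Illustrative sizes only: 50 agents, H = 2**10 configurations each.
U = draw_intrinsic_utilities(n_agents=50, n0=2, m=10, j=1.0)
flat = [u for row in U for u in row]
empirical_variance = sum(u * u for u in flat) / len(flat)
```

Because the utilities are drawn once and then kept fixed, every agent has its own frozen landscape of preferences, which is what makes the agents heterogeneous.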
The utility variation $\Delta u_i$ can be reformulated as the variation $\Delta E = \Delta u_i$ of a global observable $E$ that plays a role similar to the energy in physics (up to a change of sign). Note that, contrary to $u_i$, the pseudo-energy $E$ does not depend on the agent $i$.
Here, the function $E$ takes the form
$$E(C_1, \dots, C_N) = \sum_{i=1}^{N} U_i(C_i) + \frac{K}{2N} \sum_{i,j\,(i \neq j)} S_i(C_i)\, S_j(C_j). \tag{4}$$
The quantity $E$ is thus different from the total utility $\sum_i u_i$, due to the factor $\frac{1}{2}$ in the interaction term. Note also that the present model shares similarities with the Random Energy Model [19,20,21], as well as with the Ising model [7], the Potts model [22] or the Blume-Emery-Griffiths spin-1 model [23]. However, it also exhibits important differences with each of these models. Given the property $\Delta u_i = \Delta E$, the dynamics defined by the transition rate Eq. (1) obeys the detailed balance property with respect to the equilibrium distribution
$$P(C_1, \dots, C_N) = \frac{1}{Z}\, e^{\beta E(C_1, \dots, C_N)}, \tag{5}$$
with $\beta \equiv 1/T$, and where $Z$ is a normalization constant. Now that we have determined the equilibrium distribution of the model, our goal is to investigate its phase diagram to assess the effect of the competition between agents' heterogeneity (due to their quenched intrinsic utility) and collective effects that could arise from interactions. As recalled in the introduction, the homogeneous version of the model, studied in [15], exhibits a transition driven by interactions between a phase where agents essentially visit all their internal states and thus have no strongly preferred configurations, and an ordered state where all agents 'standardize' in the same configuration $C_i = 1$, for which $S_i = 1$ (i.e., the characteristic of the agent becomes visible). In the present generalization of the model, we wish to explore whether the standardized state survives the heterogeneity of the agents' intrinsic utilities. Indeed, the internal state $C_i = 1$ may have a lower intrinsic utility $U_i(1)$ than other, more favored configurations of agent $i$, and this may prevent a common standardization of all agents in the same state.
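The property $\Delta u_i = \Delta E$ can be checked numerically on a small system. The sketch below assumes the utility $u_i = U_i(C_i) + (K/N)\, S_i \sum_{j \neq i} S_j$ and the pseudo-energy $E = \sum_i U_i(C_i) + (K/2N) \sum_{i \neq j} S_i S_j$. These forms are inferred from the description in the text (the $1/N$ scaling and the factor $\frac{1}{2}$), and the function names are hypothetical.

```python
import random

def s(c):
    # Characteristic S(C): present only in configuration index 0 (C = 1 in the text).
    return 1 if c == 0 else 0

def utility(i, configs, U, K):
    # Assumed form: intrinsic utility plus (K/N) * S_i * sum over the other agents' S_j.
    n = len(configs)
    others = sum(s(c) for j, c in enumerate(configs) if j != i)
    return U[i][configs[i]] + (K / n) * s(configs[i]) * others

def pseudo_energy(configs, U, K):
    # Assumed form: sum of intrinsic utilities plus (K/2N) * sum_{i != j} S_i S_j.
    n = len(configs)
    total_s = sum(s(c) for c in configs)
    intrinsic = sum(U[i][c] for i, c in enumerate(configs))
    # sum_{i != j} S_i S_j = (sum S)^2 - sum S^2, and S^2 = S for a binary S.
    pair_sum = total_s * total_s - total_s
    return intrinsic + K / (2 * n) * pair_sum

rng = random.Random(1)
N, H, K = 6, 5, 3.0  # tiny illustrative sizes
U = [[rng.gauss(0.0, 1.0) for _ in range(H)] for _ in range(N)]
configs = [rng.randrange(H) for _ in range(N)]

# Try every single-agent move and record the largest mismatch |du - dE|.
max_gap = 0.0
for i in range(N):
    for new_c in range(H):
        trial = list(configs)
        trial[i] = new_c
        du = utility(i, trial, U, K) - utility(i, configs, U, K)
        dE = pseudo_energy(trial, U, K) - pseudo_energy(configs, U, K)
        max_gap = max(max_gap, abs(du - dE))
```

Under these assumptions, `max_gap` vanishes up to rounding errors. The factor $\frac{1}{2}$ compensates for the double counting of pairs in the sum over $i \neq j$.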
To investigate this issue, we introduce the order parameter $q$ defined as
$$q = \frac{1}{N} \sum_{i=1}^{N} S_i(C_i). \tag{6}$$
If the number $N$ of agents is large, the pseudo-energy $E$ can be expressed in terms of the order parameter $q$ as
$$E = \sum_{i=1}^{N} U_i(C_i) + \frac{1}{2} K N q^2. \tag{7}$$
We now wish to determine the distribution $P(q)$ of the order parameter $q$. This distribution is obtained by summing the joint distribution $P(C_1, \dots, C_N)$ over all configurations $(C_1, \dots, C_N)$ having a given value of $q$, namely
$$P(q) = \sum_{C_1, \dots, C_N} P(C_1, \dots, C_N)\; \delta_{qN,\, \sum_i S_i(C_i)}, \tag{8}$$
with $\delta$ the Kronecker delta. For a given value of $q$, there are $qN$ agents in configuration $C_i = 1$ and $(1 - q)N$ agents in any of the other $n_0^M - 1$ configurations. We denote by $S_{q,N}$ a subset with $qN$ elements of the set $[1, N]$. With these notations, $P(q)$ can be written as
$$P(q) = \frac{1}{Z}\, e^{\frac{1}{2} \beta K N q^2} \sum_{S_{q,N}} \left[ \prod_{i \in S_{q,N}} e^{\beta U_i(1)} \right] \left[ \prod_{i \notin S_{q,N}} \sum_{C_i = 2}^{H} e^{\beta U_i(C_i)} \right], \tag{9}$$
where in the second product, the index $i$ is implicitly restricted to the interval $[1, N]$. The distribution $P(q)$ defined in Eq. (8) actually depends on the specific realization of the random utilities $U_i(C_i)$, so that $P(q)$ should in principle eventually be averaged over these random utilities. However, to make calculations easier, we do not explicitly compute the average over $U_i(C_i)$, but rather use heuristic arguments to evaluate the typical values of the random quantities appearing in Eq. (9). Such an estimate is expected to be sufficient to determine the leading exponential behavior $P(q) \sim e^{N f(q)}$ of the distribution $P(q)$. The first product between brackets in Eq. (9) can be rewritten as the exponential of $\sum_{i \in S_{q,N}} \beta U_i(1)$. The latter sum is (up to the factor $\beta$) a sum of independent and identically distributed Gaussian random variables drawn from the distribution Eq. (3). The sum is thus also Gaussian distributed, with zero mean and variance $\frac{1}{2} q N M (\beta J)^2$. It follows that the sum is typically of order $\sqrt{N}$, so that
$$\prod_{i \in S_{q,N}} e^{\beta U_i(1)} \sim e^{\mathcal{O}(\sqrt{N})}, \tag{10}$$
and this product can thus be neglected (i.e., replaced by 1) when looking for the behavior of $P(q)$ at exponential order in $N$.
The key point in order to determine $P(q)$ explicitly is now to evaluate the typical value $Z_{\rm typ}$ of the sum
$$Z_i = \sum_{C_i = 2}^{H} e^{\beta U_i(C_i)}, \tag{11}$$
where we recall that each $U_i(C_i)$ is an independent random variable drawn from the distribution $\rho(U)$ given in Eq. (3). Once this estimate is known, the distribution $P(q)$ can be approximated as
$$P(q) \approx \frac{1}{Z}\, \frac{N!}{(qN)!\, [(1 - q)N]!}\; e^{\frac{1}{2} \beta K N q^2}\; Z_{\rm typ}^{(1 - q)N}. \tag{12}$$
The quantity $Z_i$ has the same form as the partition function of the Random Energy Model (REM) [19,20], and we can thus borrow some methods from the REM to evaluate it. A standard approach to study the REM is to evaluate the density of states of a typical realization of the disorder. It is convenient at this stage to define the rescaled variable $u = U/M$. Denoting as $n(u)$ the density of states of a given realization, the average $\langle n(u) \rangle$ over the disorder is given by
$$\langle n(u) \rangle = n_0^M\, \tilde{\rho}(u) \sim e^{M (\ln n_0 - u^2/J^2)} \tag{13}$$
to exponential order in $M$, having defined $\tilde{\rho}(u) = M \rho(M u)$. Hence if $\ln n_0 - u^2/J^2 > 0$, corresponding to $|u| < u_0 = J \sqrt{\ln n_0}$, the average density of states $\langle n(u) \rangle$ is exponentially large in $M$, and for a typical sample $n(u) \approx \langle n(u) \rangle$. By contrast, when $|u| > u_0$, the average density of states is exponentially small in $M$, meaning that in most realizations there are actually no states with $|u| > u_0$ (the exponentially small value of the average density of states comes from very rare and atypical realizations having a few states in this range). Therefore, to describe a typical realization, one can in practice consider that the density of states is equal to zero for $|u| > u_0$, and equal to $\langle n(u) \rangle$ for $|u| < u_0$. We can thus evaluate the typical value of $Z_i$ as
$$Z_{\rm typ} \approx \int_{-u_0}^{u_0} du\, \langle n(u) \rangle\, e^{\beta M u} \sim \int_{-u_0}^{u_0} du\, e^{M (\ln n_0 - u^2/J^2 + \beta u)}. \tag{14}$$
The integral in Eq. (14) can be evaluated by a saddle-point calculation in the large-$M$ limit. Defining $g(u) = \ln n_0 - u^2/J^2 + \beta u$, $Z_{\rm typ}$ is given for large $M$ by $Z_{\rm typ} \sim e^{M g_{\max}}$, where $g_{\max}$ is the maximum value of $g(u)$ over the interval $[-u_0, u_0]$. Let us first look for the maximum of $g(u)$ over the whole real axis. Defining $u^*$ such that $g'(u^*) = 0$, we find $u^* = \frac{1}{2} \beta J^2$.
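The saddle-point argument can be checked numerically by comparing a brute-force maximization of $g(u)$ over $[-u_0, u_0]$ with the value of $g$ at $\min(u^*, u_0)$. This is a minimal sketch in which the function names and the grid size are illustrative choices, and $n_0 \geq 2$ is assumed so that $\ln n_0 > 0$.

```python
import math

def g(u, beta, j, n0):
    # Exponent of the integrand in the saddle-point evaluation of Z_typ.
    return math.log(n0) - u * u / (j * j) + beta * u

def g_max_saddle(temperature, j, n0):
    """Maximum of g on [-u0, u0]: the saddle u* if it lies inside, else the edge u0."""
    beta = 1.0 / temperature
    u0 = j * math.sqrt(math.log(n0))
    u_star = 0.5 * beta * j * j
    return g(min(u_star, u0), beta, j, n0)

def g_max_numeric(temperature, j, n0, n_grid=20001):
    # Brute-force maximum of g over an evenly spaced grid on [-u0, u0].
    beta = 1.0 / temperature
    u0 = j * math.sqrt(math.log(n0))
    step = 2.0 * u0 / (n_grid - 1)
    return max(g(-u0 + step * k, beta, j, n0) for k in range(n_grid))

# Largest deviation between the two estimates over a few temperatures (J = 1, n0 = 2).
max_error = max(
    abs(g_max_saddle(t, 1.0, 2) - g_max_numeric(t, 1.0, 2))
    for t in (0.2, 0.5, 1.0, 3.0)
)
```

At low temperature the saddle $u^*$ moves beyond $u_0$, and the maximum is then pinned at the edge of the interval. This is the mechanism behind the freezing of the agents' internal dynamics.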
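Recalling that $\beta = 1/T$, and defining
$$T_g = \frac{J}{2 \sqrt{\ln n_0}}, \tag{15}$$
we find that $-u_0 < u^* < u_0$ for $T > T_g$, whereas $u^* \geq u_0$ for $T \leq T_g$. Hence $g_{\max} = g(u^*)$ for $T > T_g$, and $g_{\max} = g(u_0)$ for $T \leq T_g$, taking into account the fact that $g(u)$ is monotonically increasing for $u < u^*$. We thus obtain
$$Z_{\rm typ} \sim \begin{cases} \exp\left[ M \left( \ln n_0 + \frac{J^2}{4 T^2} \right) \right], & T > T_g, \\ \exp\left[ \frac{M J}{T} \sqrt{\ln n_0} \right], & T \leq T_g. \end{cases} \tag{16}$$
Using Eqs. (12) and (16), we obtain, after expanding the factorials using the Stirling formula, that $P(q)$ takes a large deviation form $P(q) \sim e^{N f(q)}$. In physical terms, $f(q)$ may be thought of as (the opposite of) a free energy. The explicit expression of $f(q)$ depends on the temperature range. For $T > T_g$, $f(q)$ reads
$$f(q) = -q \ln q - (1 - q) \ln(1 - q) + \frac{K q^2}{2T} + (1 - q)\, M \left( \ln n_0 + \frac{J^2}{4 T^2} \right), \tag{17}$$
whereas for $T \leq T_g$,
$$f(q) = -q \ln q - (1 - q) \ln(1 - q) + \frac{K q^2}{2T} + (1 - q)\, \frac{M J}{T} \sqrt{\ln n_0}. \tag{18}$$
Note that we did not take into account in $f(q)$ the contribution coming from the normalization factor $Z$, as this would simply add a constant to $f(q)$. Inspired by Ref. [15], where interesting results were obtained for a coupling constant $K \sim M$, we assume in what follows that
$$K = k M, \tag{19}$$
and take the reduced constant $k$ as the relevant control parameter in the model (in addition to the temperature $T$). For large $M$, the expression of $f(q)$ then simplifies to
$$f(q) \approx \begin{cases} M \left[ \frac{k q^2}{2T} + (1 - q) \left( \ln n_0 + \frac{J^2}{4 T^2} \right) \right], & T > T_g, \\ M \left[ \frac{k q^2}{2T} + (1 - q)\, \frac{J}{T} \sqrt{\ln n_0} \right], & T \leq T_g. \end{cases} \tag{20}$$
We first observe that in this large-$M$ approximation, $f(q)$ is a convex function of $q$ for all values of the temperature $T$, so that the maximum of $f(q)$ over the interval $0 \leq q \leq 1$ is either $f(0)$ or $f(1)$. The most probable state is found to be $q = 0$ for $k < k_c(T)$ and $q = 1$ for $k > k_c(T)$, where the critical line $k_c(T)$, obtained from the condition $f(0) = f(1)$, is given by
$$k_c(T) = \begin{cases} 2 T \ln n_0 + \frac{J^2}{2T}, & T > T_g, \\ 2 J \sqrt{\ln n_0}, & T \leq T_g. \end{cases} \tag{21}$$
The curve $k_c(T)$ thus separates the $(k, T)$ phase diagram into two regions, a region with $q = 0$ at low coupling and a region with $q = 1$ at high coupling. The corresponding phase diagram is plotted in Fig. 1. Note that for $J = 0$ (i.e., in the absence of disorder), the glassy region in the phase diagram disappears, and one recovers the phase transition at $T_c = k/(2 \ln n_0)$ between a high-temperature phase with $q = 0$ and a low-temperature phase with $q = 1$ found in the homogeneous model [15]. Besides, we have also seen that a change of behavior occurs at $T = T_g$.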
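The statement that the maximum of $f(q)$ sits at one of the endpoints can be illustrated numerically. The sketch below evaluates $f(q)/M$ in the large-$M$ limit, assuming $K = kM$ and neglecting the mixing entropy $-q \ln q - (1-q) \ln(1-q)$, which is of order one. The expressions used for $\ln Z_{\rm typ}/M$ and for $T_g$ follow from the saddle-point result $g_{\max}$ of the text, while the function names and grid resolution are illustrative.

```python
import math

def reduced_f(q, k, temperature, j, n0):
    """f(q)/M in the large-M limit: interaction term plus (1 - q) * ln(Z_typ)/M."""
    beta = 1.0 / temperature
    t_g = j / (2.0 * math.sqrt(math.log(n0)))  # temperature at which u* reaches u0
    if temperature > t_g:
        ln_z = math.log(n0) + (beta * j) ** 2 / 4.0
    else:
        ln_z = beta * j * math.sqrt(math.log(n0))
    return 0.5 * beta * k * q * q + (1.0 - q) * ln_z

def most_probable_q(k, temperature, j, n0, n_grid=1001):
    # Scan q on a regular grid over [0, 1] and return the maximizer of f(q)/M.
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    return max(grid, key=lambda q: reduced_f(q, k, temperature, j, n0))

# Illustrative values J = 1, n0 = 2: weak versus strong coupling at T = 1.
q_weak = most_probable_q(k=1.0, temperature=1.0, j=1.0, n0=2)
q_strong = most_probable_q(k=3.0, temperature=1.0, j=1.0, n0=2)
```

Since the reduced function is convex in $q$, the scan only ever returns $q = 0$ or $q = 1$, and comparing $f(0)$ with $f(1)$ is enough to locate the transition.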
For $T > T_g$, the agents dynamically visit a large number of configurations, while for $T < T_g$ their dynamics becomes essentially frozen, and only a few configurations have a significant probability of being visited. In other words, for $T < T_g$ agents become 'stuck' in a small number of configurations having the highest utility. In the context of the Random Energy Model for glasses, the temperature $T_g$ corresponds to the glass transition.
Hence there are actually three different regions in the phase diagram shown in Fig. 1. For $T > T_g$ and $k < k_c(T)$, agents have no preferred configurations and visit many different configurations over time. For $T < T_g$ and $k < k_c(T_g)$, each agent spends a lot of time in a small set of preferred configurations. In other words, agents look simple, but they remain different from one another. This regime is dominated by agents' heterogeneity, and there is on average no macroscopic overlap between agents' configurations ($q = 0$). In the last region, $k > k_c(T)$, the coupling between agents dominates over agents' heterogeneity, and all agents are essentially in the same configuration, leading to a strong overlap ($q = 1$) and to the emergence of a common characteristic, a phenomenon that has been called standardization in [15]. To summarize the region $k < k_c(T)$: for $T > T_g$, agents may be in any of their internal configurations, while for $T < T_g$, they are dynamically blocked in the few configurations with the highest intrinsic utility. In this latter case, agents appear simple but remain heterogeneous.
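The three regions can be summarized by a small classification routine. This sketch assumes the closed forms $T_g = J/(2\sqrt{\ln n_0})$ and $k_c(T) = 2T \ln n_0 + J^2/(2T)$ for $T > T_g$, with $k_c = 2J\sqrt{\ln n_0}$ for $T \leq T_g$. These follow from the conditions $u^* = u_0$ and $f(0) = f(1)$ in the large-$M$ limit, and they reduce to $T_c = k/(2 \ln n_0)$ when $J = 0$. The phase labels and function names are hypothetical.

```python
import math

def glass_temperature(j, n0):
    # Temperature below which the saddle point u* leaves the interval [-u0, u0].
    return j / (2.0 * math.sqrt(math.log(n0)))

def critical_coupling(temperature, j, n0):
    # Reduced coupling k_c(T) at which f(0) = f(1) in the large-M limit.
    if temperature > glass_temperature(j, n0):
        return 2.0 * temperature * math.log(n0) + j * j / (2.0 * temperature)
    return 2.0 * j * math.sqrt(math.log(n0))

def phase(k, temperature, j, n0):
    """Return an illustrative label for the region containing the point (k, T)."""
    if k > critical_coupling(temperature, j, n0):
        return "standardized"  # q = 1: all agents share the characteristic
    if temperature > glass_temperature(j, n0):
        return "complex"       # q = 0: agents explore many configurations
    return "glassy"            # q = 0: agents frozen in their own preferred states

# Illustrative values J = 1, n0 = 2, for which T_g is close to 0.60.
labels = [phase(k, t, 1.0, 2) for k, t in [(1.0, 1.0), (3.0, 1.0), (1.0, 0.3), (2.0, 0.3)]]
```

With these assumptions the critical line is continuous at $T_g$ and flat below it, so the boundary of the glassy region is a horizontal segment in the $(k, T)$ plane.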

Conclusion
We have extended here the model proposed in [15], where agents have many internal configurations, but can select a specific configuration thanks to interactions, leading to simplified, or standardized, agents. While Ref. [15] focused on homogeneous agents, we have in the present work extended the model to account for agents' heterogeneity, introduced through random idiosyncratic utilities associated with each configuration of each agent. Heterogeneity is introduced in a minimal way, directly inspired by the Random Energy Model for glasses [19]. Including heterogeneity in the model leads to the onset of a new phase at low temperature and small coupling, when heterogeneity dominates over interactions among agents. In physical terms, this new phase shares similarities with a glass phase. The main difference here is that we consider not a single glassy system as in physics, but a large assembly of interacting agents, each of which experiences an internal glass transition.
In this glassy phase, agents are stuck in a small number of possible internal configurations, those with maximal idiosyncratic utility. In spite of the individual simplification of agents, the assembly remains heterogeneous, because all agents are in different configurations and do not share common characteristics. Increasing the coupling strength, interactions eventually dominate over heterogeneity, effectively leading to simple agents all occupying the same configuration and sharing the same characteristic. The latter phase is similar to the low temperature phase found in the homogeneous model of [15], where agents have been called standardized. Finally, at high temperature, agents keep their internal complexity and do not significantly feel interactions or heterogeneity, again similarly to the homogeneous model [15].
A further step in the study of this model would be to study possible collective effects that could arise once agents are simplified. This has been done in the homogeneous model by assuming that standardized agents could be in one of two distinct states, corresponding for instance to two different opinions (or to a spin in physical terms). Interactions between these binary degrees of freedom may then lead, at low temperature, to a collectively ordered state. It would be interesting to see in more detail how agents' heterogeneity could possibly modify this simple picture. In addition, the role of a lower connectivity (in the present work, agents interact with all other agents) would clearly deserve to be investigated, as connectivity is known in many situations to modify critical properties [7].