Examining the Schelling Model Simulation through an Estimation of Its Entropy

The Schelling model of segregation allows for a general description of residential movements in an environment modeled by a lattice. The key factor is that occupants change positions until they are surrounded by a designated minimum number of similarly labeled residents. An analogy to the Ising model has been made in previous research, primarily due the assumption of state changes being dependent upon the adjacent cell positions. This allows for concepts produced in statistical mechanics to be applied to the Schelling model. Here is presented a methodology to estimate the entropy of the model for different states of the simulation. A Monte Carlo estimate is obtained for the set of macrostates defined as the different aggregate homogeneity satisfaction values across all residents, which allows for the entropy value to be produced for each state. This produces a trace of the estimated entropy value for the states of the lattice configurations to be displayed with each iteration. The results show that the initial random placements of residents have larger entropy values than the final states of the simulation when the overall homogeneity of the residential locality is increased.


Introduction
Human actors are very complex objects to model and the factors of the environment which affect their behavioral state is a challenge to encapsulate in a mathematical framework. This difficulty is further increased when long term macroscopic changes are taken into account. Modeling residential patterns as they evolve is of great interest, as the housing market is a large part of the economy, and the spatial arrangements shape the interactions among community members. The Schelling model of segregation [1], presents a model of residential movement dynamics based upon a homophilic threshold of satisfaction (Appendix A) for the number of similarly labeled neighbors in a grid. The model can be described as a set of hypothetically labeled residents in a grid allotment which, during a simulation, proceeds to be given a random position allocation when there are not enough of their adjacent neighbors sharing the same label. A simulation continues to allocate random position reassignments until all residents have the required amount of identically labeled neighbors needed to cease their movements. This results in an iterative procedure of spatially orienting residents to be placed in a macroscopic set of similarly labeled clusters. In an random initial allocation on the lattice a visual depiction is expected to look like 'noise' when different colors are provided for each type of resident. In the final iteration when the homogeneity satisfactions are met, the grid will look 'organized' as groups of similarly labeled actors cluster together.
The work of [2] provides an analytical formulation of the main steps of the Schelling model and reviews the variations possible. The variations are typically introduced in order to provide different proposals for new locations of moving actors or for a modification on the determination of homogeneity satisfaction. The set of Equations provided allows for the parameterization to adapt towards the different implementations without changing significant aspects of a program. This also has a benefit in that the comparisons between different modeling approaches can be seen in a compact form. The analysis of the states provided leads to a calculation of the 'density' of the homogeneity of local actors across all the residents. Presented here in Section 2.1 is a concise mathematical representation of the Schelling model which is formulated in such as way as to present a convenient form to produce a set of states that can be analyzed in a manner which allows a quantification of the entropy values of these states during the simulation.
A set of random initializations of the resident labels which appear 'organized' would be indicative of a mistake in the implementation. The final 'organized' state (in that there clusters of cells sharing a spatial arrangement of the same labels) could be considered to belong to a smaller class of configurations than the ones which resemble 'noise', from which a simulation initially begins. The paper of [3], describes in general various class sizes of lattice state space models and provides insight into the manner for the estimations of their values of entropy. By considering the Schelling model to be a system analogous to that of an ideal gas permits a similar approach for the entropy estimation between state transitions. The approach here aims to use an estimation of sizes as a definition for the model macrostates along each time point to compute the entropy trace. The change in the entropy values for the time ordered state transitions of the Schelling model aims to provide an estimate of the uncertainty of the configurations of the spatial arrangements.
In terms of detecting segregation the cellular configurations can be observed as they change over time and be interpreted by inspection but this does require qualitative descriptions which are hard to compare between studies. Various quantitative approaches for analyzing the resulting residential configurations of the simulation trajectory exist. These approaches generally involve a specification of a functional form which captures a change in the homogeneity densities of the actor arrangements over the iterations. The application of monitoring the density of the resident satisfactions across the iterations is applied commonly in practice [2,4,5]. Given that there are N residents the model explores the cell occupation state changes which results in the expected satisfaction among all non-empty cells. This value can be taken according to dissatisfaction so that the value is seen to decrease over the simulation progression.
The work of [6] develops a method based upon the cluster geometry to measure the label segregation. Approaches to finding a spatial measure which captures obscure cases is an ongoing area of research and can be relied upon for specified classes of patterns [7] produces a physically inspired model for the gradient of the cluster formations so that the arrangements over the iterations resemble that in other physical phenomena but these states are still assess qualitatively by inspection as noted in [8]. From the methodology of [9] a representation for certain 'patterns' (specific macroscopic configurations) are considered for their size examining a range of predefined scenarios. The approach also employs a Markov chain with Metropolis transition probabilities for the change of states between and although very interesting it does deviate from the archetypical Schelling model paradigm.
In [10,11] the density function of energy differences ∆E and average magnetization M for different time steps of the model is shown when the Schelling model is approached as a thermodynamic system. These models provide a great deal of insight into the phase changes of the system which delivers a potentially invaluable method to detect paradigm changes in the macrostates. An accurate application to an actual sociological scenario could potentially divert large scale negative consequences. Their main strength is in determining the changes in the value of the parameters of the model which can affect the spatial arrangements under the premise that those values can be mapped to values observed in a real world scenario (a task which is another challenge in itself). To monitor the state of the system [10], uses the energy per site, E / N . The results show a reduction in the energy over the system as the residents in satisfied states will cease to move.
There are also measures to examine the motifs of the spatial arrangements for clustering patterns that are then aggregated to assess the degree that those features of interest have manifested themselves. An example of this approach is in [12] that applies the community triads (3-cycle) analysis which is supported in social science [13]. This could be considered a mesoscopic feature of the state space over the homogeneity assignments and for situations where these features are specifically important, provides a specific insight.
These differences in the measures applied to monitor the Schelling model trajectory can be seen in a comparison of the figures produced in [2,10], which represent the degree of residential homogeneity satisfaction across the lattice for the time steps. Although they have similar starting and final points, as defined by the dynamics, the trajectories followed are different based upon the measures used to evaluate the overall homogeneity between these points. As the application is motivated by residential movements, these iterations can correspond to many years with differences in values representing large changes in populations and residential prices. The entropy estimation on the trajectory of the simulation is fundamentally different from macroscopic measures built upon the aggregate microscopic state values, or mesoscopic configuration properties. By using a measure based upon the size of a set of states which are considered equivalent given the macrostate values, the investigation can answer questions regarding the ability for certain dymamics to search the configuration space. With respect to the Schelling model an estimation of the entropy values along the trajectory allows for an assessment of the degree in which a simple set of decision rules for the agents leads to configurations that are highly unlikely to be derived from random allocations.
The Schelling model of the residential population movements can be considered as a system analogous to a lattice gas [14]. This permits a similar manner of analysis in which there is a definition of microstate configurations which are analyzed as being part of a macrostate for a thermodynamic system, and therefore the entropy can be estimated for the macrostates. Given this framing of the Schelling model environment of the actors, the homogeneity threshold calculation as a discrete measure dependent upon the state values of its adjacency set can be developed to associate it with other physically focused models. The homogeneity aggregate in the Schelling lattice model can be used to develop an analogy to the Ising model [15][16][17], and Reference [18] provides a thorough formulation of the Schelling model in a manner where an evident extraction of a statistical mechanics treatment of the model simulation analysis is presented. In the Ising model, there are 2 states that represent spin, 'up'/'down' ({+1, −1}). The vicinity of these states of spin alignments produces different aggregate forces for the system as do the homogeneous/heterogeneous neighbor associations in the Schelling model. The Ising model where on each location of the lattice (generalizable to d dimensions), it is occupied by an object with 2 states is: This is the main starting point for the similarity of the comparisons of the Ising model with the Schelling model. The work of [8,19,20] provide examples of methodological investigations of the Schelling model with approaches derived for the Ising model. A definition for a macrostate of the Schelling model allows for a set of configurations to be considered as microstates whose estimated set size provides for a means to calculate the entropy along the iterations of the simulation.
A different operator than multiplication is in use for the lattice adjacency set to assess the equality of state label assignments. It does not follow from the same motivation of the group equality test as is described in the sociological research [21]. Equation (1) provides a functionally equivalent formulation for the 2 dimensional lattice utilizing the Kronecker delta (testing for label equality). This can easily be generalized to larger group sets without a 'spin' property being closely adhered to (similarly to the general feature set homogeneity summation employed in [22]). As well, the approach taken here gives an explicit representation for the empty cells which is required in the Schelling model and not the Ising model. This extra state label is excluded from contributing in the macrostate assignment as a resident lacking a threshold of satisfaction for homogeneity. A physics model interpretation is the '3-state voter-type non-equilibrium model on a lattice' as described in [23].
The equal a priori probability postulate [24], does not apply to systems operating with these dynamics as the ergodic hypothesis does not necessarily always hold for the Schelling model. The configuration trajectory allows for the systems to become fixed without movement in the case where all residents have a sufficient number of adjacent homogeneous members in their locality. Another case is when there is a very limited number of empty cells and that explorations fail to produce an effective move. Even with increasing the number of iterations this will not change the state to produce new configurations (microstates) during a simulation. A lack of changes in the agent satisfactions from the initialization step are not conditioned for in the standard Schelling model although their addition does not impose any contradictions upon the framework. Such a situation can be treated with a terminating condition based upon the number of iterations which have not caused a change in the agent satisfactions. Another permitted case, which is ergodic, is when a limited number of residents, less than the remain threshold value for each group exist amongst a large number of empty cells so that they will be guaranteed movement in each iteration. Therefore the residents would continue to explore every microstate configuration over enough iterations (random walk over the whole state space) while failing to produce homogeneity satisfaction.

Methodology
The two subsections here outline the Schelling model implementation/variant applied in this work and the framework for the analysis of the Schelling model evolution as an entropy trace is described.

Schelling Model Outline
We consider a lattice Λ, of 2 dimensions (d = 2) to place the hypothetical residents within cells of the lattice. The number of cells which can be occupied for a square lattice is N = |Λ| which is equal to the side length raised to the value of d. The lattice cells can be indexed as n ∈ [1, 2, . . . , N] without taking into account the prefix ordering. According to the application of this work, each cell is occupied by a 'resident' which is a member of a particular group. Each cell is allocated the state of one of 3 different categories, group1, group2, and empty. The membership for each cell, belongs to a unique element in the set m n ∈ {m group1 , m group2 , m empty }∀n and metrics of the model can condition upon these labels as needed. The use of the label for empty units of cells, m n ∈ {m empty }, allows for there to be movement of a single resident independently of the other residents in {m group1 , m group2 }. These are the only members that can affect the residency satisfaction thresholds, and whether other residents remain in the same position in the next iteration of the simulation. Simulation require that The mapping of a resident's position in the lattice is denoted as m n ↔ m (ni,nj) when describing the position and placement amongst cell occupants in the space of d. The iterations of the simulation take place over a hypothetical unit of discrete time, t. Since the dynamics of the system do not parameterize upon the amount of time passed, the unit of time is a representation of the steps the residents have taken since the initialization. The cell occupation of a resident can include the time index to denote the temporal locations, m n,t ↔ m (ni,nj),t .
The basic premise of the Schelling simulation is that each cell resident performs a check for equality of group membership upon the adjacent cells. This affects the choice for a resident to remain in the same cell between iterations. While excluding for a check against the resident's self, it evaluates the membership of all the cells in the adjacency of a single unit. The aggregate of the equality checks can be found for m n via: Here δ m n ,m n , is the Kronecker delta so that group membership equality is 1 and 0 otherwise. The examination of cell values outside of the lattice bounds are treated as m empty making no contribution to the residential satisfaction as this function is not called on behalf of empty cells. This state determination for each cell based upon the adjacency of the grid is the main motivation for the comparison to the Ising model in the previous research [25,26].
The aggregate of the observed equality checks is then used to determine if a homogeneity threshold is satisfied. This value, h, is a homogeneity constraint upon the locality of the adjacent cells that determines whether the resident in a cell will remain in the same cell or attempt a move to another one in the subsequent iteration of the simulation. The threshold for whether the resident remains in the same cell is based upon the value of local homogeneity from Equation (1): The value for threshold is set to h = 6 in this work (as in [25]), and it is typical for this to be adjusted to examine the sensitivity with the final {m group1 , m group2 } (non-empty residents) spatial arrangements. This binary value represents the state for each of the non-empty occupants for which the movement is conditioned upon. The empty set of residents are considered to not have a value of residential choice as their 'displacement' between time units depends upon other group states.
If r(m n ) = 1, the resident remains in the same cell location between time steps, m n,t , m n,t+1 → m (ni,nj) . In the situations where r(m n ) = 0, the occupant no longer remains in the same cell, there is a move to another cell location which is occupied by a resident who is a member of the 'empty' label, This condition for the sampling requires that the proposed position replacement results in Equation (2) and the remain state for residents over the course of the simulation can subsequently be given by: t+1 ) , . . . , r (m n,t+1 ) , . . . , r (m N,t+1 )] .
This gives an expression for the state of every resident in the lattice and the coordinates that are occupied for the indexed time unit. Given that the allocation of a position for which a resident can change their state of 'remain', from 0 to 1, arises from this sampling procedure; this results in different trajectories of the simulation upon multiple realizations.

Estimating the Entropy of the Schelling Model from the Microstate and Macrostate Assignments
Using the formulation of the Schelling model in Equations (1)-(6) a representation of the model state space as a closed thermodynamic system can be provided. The simulation is modeled as a microcanonical ensemble where the microstates are the configurations of the binary valued entries of each resident's 'remain' value for the set of all lattice members in r t (Equation (6)).
The macrostate, denoted by R, of the Schelling model will be defined as the aggregate remain value over all the resident members: This can be indexed to derive the macrostate value for each time point as R t = ∑ N n=1 r(m n,t ). For the different states of remain that the residents produce this macrostate effectively groups the microstates by the local homogeneity satisfaction over all non-empty members. This aggregate can be satisfied by multiple configurations and the size of the sets which satisfy different values of R will be estimated via a sampling procedure. R represents an intuitive macrostate property that can be interpreted and the number of corresponding microstates are of interest for modelers to examine the extent to which the system follows various microstate explorations. The Schelling model phase space is therefore described here as having this 1 degree of freedom.
The number of microstate configurations which corresponds to a particular macrostate, R, is represented as, Ω (R). It is assumed that each microstate r t ∈ R has an equal probability of being selected in the macrostate; r t = 1 / Ω(R) . The entropy of the macrostate can then be found via: The value of Ω is found by enumerating all of the microstates which produces the same remain level of R for the set of residents. For a particular time point, the entropy for the macrostate of the Schelling model can be addressed as S t = k B lnΩ(R t ).
Given that the enumeration for the microstate space of the ensemble is not analytically tractable, with the state positions depending upon discrete non-linear operations, Ω (R) will be sampled with Monte Carlo [27] (similar manner employed in [18,23], and thoroughly described in [28]). The Monte Carlo sampling scheme draws from the space of Equation (6), U (r). The size of the collection of microstates for all macrostates, is equal to: N!/(Π group g N g !), is the de-labeling factor, N the number of positions in the lattice and N g the number of agents (members) within each group. The lattice structure is represented as an adjacency list so that the different microstates can be seen as a rearrangement upon an array which the de-labelling factor removes redundant permutations from as they do not change the macrostate assignment.
It is of particular interest to obtain the trace of the entropy values of the model during the time steps. These values S t , depend upon Ω(R t ) which will be proportions of the overall count from all macrostate sizes in Equation (9). This is because each non-redundant permutation of the labels will correspond to a particular value of R, The Monte Carlo sample for the probability of a microstate belonging to a particular macrostate is effectively the macrostate membership assignment ratio for a sample size k; |{r ∈ R}|/k. This allows for an estimation of S and at every time point S t = k B Ω(R t ). The convergence of the sample distribution over R (via Monte Carlo on the space of r) is tested by examining the first half of the samples against the second half and performing a K-S test. This allows for the Schelling model trajectory to have every lattice configuration mapped to a macrostate value through Equation (7) and the corresponding entropy value to be produced based upon the estimate of the number of microstates. Alongside the visual representations of the lattice configurations over time with color codings for the different actors, an entropy trace can be produced.

Results and Discussion
This section explores the Schelling model in terms of the macrostate value at each iteration defined in Equation (7) and the entropy value estimated S t = k B Ω(R t ). In the examples used N = 100, and 2 types of non-empty agents are considered m agent1 and m agent2 each set having 45 members. The members of the empty set are allocation subsequently as a default allocation to those cells which a label is unassigned so that m empty has 10 members in these cases.
The Monte Carlo sampling scheme required for Equation (10) is computationally demanding. It is possible to obtain reliable convergence on the estimated macrostate densities given new processor speeds, and the parallel processing capabilities readily available. The methodology is implemented in Julia-Lang [29,30], which is a relatively new programming language designed for high performance computing in mind. The computations were parallelized with minimal syntactic additions to the single threaded implementation via the intuitive programming macros which can encapsulate regions that require parallel operations. Figure 1 examines the value of the macrostate R at every time point (iteration) in the Schelling model. In Subfigure (a) a single run of a simulation is presented where it can be seen that the aggregate value of R increasing until stabilization at the value drawn in 'green'. This green line component represents the time points where no movement of the agents can change their state of l(m n ) (Equation (1)). The occasional increases and decreases emphasizes the aspect of the simulation dynamics of how the random uniform allocation of the new positions does not guarantee improvements upon the macroscopic state where the threshold satisfactions over the complete lattice are to be monotonically increasing. Subfigure (b) presents the results from 1000 independent simulations of the model with box plots for the values encountered at every iteration. Towards the end of the simulation the independent simulations maintain the same values. The dashed line represents the maximum value that could be achieved which is the total number of non-empty residents (90). It can be seen that as the simulation progresses the macrostate value of R also increases with the residents changing positions until a cell with h or more similarly labeled residents surround it. As R is the aggregate of these satisfactions across the lattice it can be expected that the dynamics provide an avenue to increase this macroscopic measure. In (b) the spread between the maximum R values for the first iteration (initialization) does not surpass the median value of the 5th iteration. This highlights the effectiveness of the dynamics to search for large R values in contrast to a random sample and that large R contain fewer microstates r.  Figure 2 shows the results of examining the macrostate values R (Equation (7)) from independent simulations of the Schelling model at the start (initialization) and at the end (final value). Subfigure (a) shows the Monte Carlo sample distribution of the R values and also displays the maximum possible value R = 90 in the dashed vertical line. Each possible value that can be sampled is given an initial observation count of 1 so that comparisons with states the simulation enters that are rare to sample can still be compared. Each initialization is independent of the previous ones and aims to depict the relative size of the different macrostate values. It can be seen that sampling from the microstates the values close to the maximum can be expected to require a large number of draws. In terms of the Schelling model, and the interpretation of the R value, it can be said that there are more random initializations (microstates) which have less than half of the agent thresholds satisfied. Subfigure (b) presents the final R value at the end of a set of independent simulations. The final value is determined by a stretch of iterations in which the R value does not change or that the maximum possible value has been reached. Although the microstates which correspond to such large values of R are encountered in a small fraction of the Monte Carlo simulation, they are arrived at consistently, given the iterations of the simulation dynamics. The Schelling model is therefore effective in navigating the space to find microstates which correspond to large R values.   as there is only a single microstate member counted for that value of R. From the progression of R in Figure 1 and the distribution of the sampled values of R in Figure 2, it can be seen that the larger R values the Schelling dynamics produces are of a smaller ensemble which results in the decrease in entropy. Subfigure (b) shows the results from 100 independent simulations where the maximum, mean and minimum values of the entropy S t (Equation (8)) are drawn. All the simulations reach the zero entropy and it can be seen how some simulations arrive at that value requiring less or more time points. This can be due to the initializations and the sequence of random allocation of moving residents.

Discussion
The work presented here provides a formulation of the Schelling model of social segregation [1] and a manner in which the states of the simulation can be analyzed with the framework of statistical mechanics. From the concise set of equations for the dynamics of the Schelling model iterations it is seen that there is a resemblance with that of the Ising model [15] which is also noted in previous research [15][16][17][18]25]. This similarity rests predominantly upon the premise that both models utilize a lattice for the placement of cell occupants whose state depends exclusively upon the states of the occupants in the adjacent/neighboring cells. The states of the cells are updated based upon the satisfaction of the aggregate of similarity (homogeneity) counts for the 'values' of the occupants and the surrounding cells. In the Ising model these states are defined by 'spins' given binary values that are aggregated in order to define a state transition. For the Schelling model these states are represented by societal identifiers (categorical variables) which can possibly be extended to arbitrarily sized sets and it therefore is a more direct translation of the model motivation to employ the use of the Kronecker delta as shown in Equation (1). This can be understood by the residential actors performing a label equality check whose satisfaction upon the adjacency set is compared to a threshold which determines the cell state.
Equations (1)-(5) are the Equations for the states of the lattice which governs the residential system state changes between time points (iterations). Each time point the lattice has a particular state defined by Equation (6), that produces a r t , which is the set of the homogeneity satisfactions for all the residential members of the lattice. These configurations can be treated as 'microstates' which change according the dynamics of the Schelling model. A 'macrostate' for the state of system is defined upon the microstates in Equation (7) producing a phase space with a single degree of freedom R. The macrostate value is the aggregate homogeneity satisfaction within the lattice and provides insight into a valuable macroscopic aspect of the system. Equations (9)-(10) describe the approach to estimating the entropy values for the simulation along the time points based upon the macrostates R. Figure 4 shows that the Schelling simulation upon randomly initialized residential placements upon the lattice acts to reduce the entropy value over time. These expected transitions into smaller ensembles, based upon the simulations, show that although there are sporadic deviations from the decrease in entropy; the dynamics of the system reliably reduce the entropy. From Figure 3 the initial time point and a latter time point provide an intuitive interpretation for this by the noticing the increase in R along with the organization of the 2 residential groups. To see how the simple dynamics of the model are able to navigate the space and obtain it with consistently low density high values of R, Figure 2 shows the distribution of R from the initializations and the R values at the end of the simulation. It is intriguing how the model dynamics, representative of residential movement patterns, produces a trajectory that goes against the arrrow of time.
Future work would entail expanding upon the model to take into account the cost/energy required to perform actions such as change position or assess the aggregate value of the homophily in a cell's locality. The subsequent investigations could look into the limit case in 1 dimension as explored in [8], and produce phase diagrams for the transitions based upon the homogeneity threshold values. The current implementation has a cubic run time in the number of grid cells for each iteration which may be improved upon using different programming data structures. It should be noted that there are alternative re-phrasings for the underpinning interpretation of the model outcomes, especially to highlight the phobias the agents can be acting out upon as an explanation for their decision to change position. As most models describe the dynamics of residents reaching stationarity as the sufficient aggregate of their locality for homophilic label assignments being discovered; this is a maximization process. This could be rewritten as minimization of a cost associated with heterogeneity label assignments where the remain decision (Equation (2)) has the comparator facing in the other direction to be based upon a heterophobic threshold on the aggregate of locality assignments from Equation (1), for disimilarity. Although this may not appear to be an error the that comparator threshold is written for a maximization and the language alludes to a minimization, it is misleading. This can be seen as justifiable in the effort to attract attention towards this area of research by choosing sensitive language. The degree to which it is an actual error is equivalent to referring to an application of Reinforcement Learning (RL), that searches to maximize a reward function, but is actually emphasizing that the state sequences are optimally reducing a cost. For this reason it is considered that there is a set of dynamics which the agents follow as a search mechanism for a satisfaction of this homophilic dependency or preference.