Opinion Models, Election Data, and Political Theory

A unifying setup for opinion models originating in statistical physics and stochastic opinion dynamics is developed and used to analyze election data. The results are interpreted in the light of political theory. We investigate the connection between Potts (Curie–Weiss) models and stochastic opinion models in view of the Boltzmann distribution and stochastic Glauber dynamics. In particular, we find that the q-voter model can be considered a natural extension of the zealot model, which is adapted by Lagrangian parameters. We also discuss strong and weak effects (also called extensive and nonextensive) continuum limits for the models. The results are used to compare the Curie–Weiss model, two q-voter models (weak and strong effects), and a reinforcement model (weak effects) in explaining electoral outcomes in four western democracies (United States, Great Britain, France, and Germany). We find that the weak effects models in particular are able to fit the data (Kolmogorov–Smirnov test), with the weak effects reinforcement model performing best (AIC). Additionally, we show how the institutional structure shapes the process of opinion formation. By focusing on the dynamics of opinion formation preceding the act of voting, the models discussed in this paper give insights both into the empirical explanation of elections as such and into important aspects of the theory of democracy. This paper therefore shows the usefulness of an interdisciplinary approach that studies real-world political outcomes by means of mathematical models.


Introduction
The search for common compromises in a discursive social process is at the heart of all democracies. In these debates, citizens seek their standpoint on the basis of information and news, but also based on discussions with family, colleagues, and acquaintances. Opinion dynamics aims to model the basic structure of precisely this process [1][2][3].
Many opinion dynamics models are constructed on communication graphs, where the nodes represent persons who only interact with neighboring persons [4]. Basically, there are two groups of models: either individuals are equipped with one of a finite number of opinions (typically pro and contra), or their opinions are characterized by a continuous spectrum of possibilities (typically the interval [0, 1], see [5]). In particular, the models with a continuous state space are often used to investigate whether a consensus can be reached in the long run [6][7][8].
Most papers do not aim to validate their models by analyzing empirical data in a quantitative way [9]. Those papers that do address empirical data mainly focus on the dynamics of two opinions within homogeneous groups and do not use an underlying graph structure [10][11][12][13][14][15][16][17], as many interesting aspects, like different interaction patterns or phase transitions, already appear in models describing homogeneous populations and do not require interaction graphs. Here, we find a parallel in mathematical epidemiology, where it is clear that infections spread via contact graphs, but most quantitative models and methods aiming at the description and prediction of the dynamics of a real-world outbreak are based on models that assume homogeneous mixing [18,19]. In the present work, we also find that these simple models are sufficiently rich to match the structure of empirical data. Technically, the main objective of those more empirically oriented papers is to obtain the stationary distribution of the underlying stochastic process and then to use this result to fit model parameters to data. In this way, these papers are able to discuss possible mechanisms that generate striking patterns in data [11][12][13][14][15], aim to reveal changes in communication patterns in the course of time [16], or address spatial communication distances [17]. Not only in political processes, but also in other fields such as vaccination hesitancy, opinion models contribute to an adequate description of the underlying communication mechanisms and thereby potentially open up ways to handle (in this case) public health problems [9,20,21]. It is, however, noticeable that all these aforementioned approaches from socio-physics and socio-mathematics have so far found little or no echo in the social and political sciences.
In the present paper, we aim to achieve three goals. In the first part, we connect two different approaches to opinion models: one driven by stochastic dynamics and the other originating in statistical physics. In the second part, we target a model comparison to find out which models are able to explain empirical data. The third part then discusses the usefulness of mathematical models of this type for political science.
For the first part of the paper: if we review the literature, we find two main approaches to describe opinion models. One approach formulates mechanisms about how people change their minds when interacting with others in the form of stochastic dynamics. In the simplest case, the voter model [22], it is assumed that a person faithfully copies the opinion of some randomly chosen other person. A slightly more refined version of this idea introduces zealots (also called stubborns or activists) who never change their mind [23,24], which results in the zealot (or noisy voter) model. In the long run, the opinion dynamics reach an equilibrium, which is termed the stationary distribution. This distribution can be used to analyze empirical data.
A completely different approach originates in the statistical physics of spin systems. Here, the dynamics of the opinions are not considered; instead, it is assumed that only a little is known about the state of the population, e.g., surveys could inform us about the abundance of some opinions. To express this partial knowledge appropriately, a distribution is constructed that maximizes the Shannon entropy under the constraint of the known information. The models constructed in this way are called Potts models, and the distribution is called the Boltzmann distribution [25,26]. Like the stationary distribution of the dynamical opinion models, the Boltzmann distribution is also used to analyze data [26].
We connect both approaches by identifying a dynamical stochastic model (first approach) that generates a stationary distribution coinciding with a Boltzmann distribution (second approach). In this way, we associate a dynamical model with a statistical physics Potts model and vice versa. The advantage of this procedure is the construction of a unifying framework for both approaches. Based on this framework, the q-voter model [27][28][29] appears as a natural extension of the zealot model, not in view of the mechanisms modeled by the q-voter model, but in view of the underlying mathematical structure.
We aim to apply the models to data. Election data usually aggregate information about a large number of people, as constituencies usually comprise thousands to hundreds of thousands of voters. Therefore, a continuum limit is of interest. Here, we note that the zealot model is identical to the Moran model with mutation [30], which forms the basis of population genetics [31]. In population genetics, two different diffusion limits have been established: the weak and the strong effects limit. These two approaches differ in the assumptions regarding the scaling of the parameters with respect to the population size. We investigate ways to transfer this idea to general opinion models.
In the second part, we analyze election data from the United States (US), United Kingdom (UK), France (FRA), and Germany (GER) based on four opinion dynamics models (the Curie-Weiss model, the weak and strong effects continuum limit of the q-voter model, and, additionally, the weak effects limit of the reinforcement model [21]). We use this analysis to test to what extent the models are able to describe the data not only qualitatively, but also quantitatively. The central finding is that models derived by a weak effects limit perform much better than their strong effects counterparts. Potentially, these findings will be useful for future empirical studies. In the last part, we again change the focus and turn to the interpretation of our results in the light of political science. In particular, political processes and changes in social interactions potentially leave their traces in the election data and thereby in the estimated parameters. However, this conjecture can only be confirmed based on in-depth political research. These considerations lead us to another important aspect: the question of the extent to which socio-mathematical models of this kind can also be a fruitful instrument as an integral part of political science, or whether the methods, objectives, and research questions of socio-mathematics and socio-physics on the one hand and political science on the other are too different.

General Structure
In this section, we first introduce the notation before introducing the two different types of models that we will investigate.
We consider a population of $N$ individuals numbered $1, \ldots, N$, where each of the individuals supports either opinion A or opinion B. The opinions are coded by $1$ (for A) and $-1$ (for B). The state space is given by $\Sigma_2 = \{\pm 1\}^N$, such that the $i$'th component $\sigma_i \in \{\pm 1\}$ of state $\sigma \in \Sigma_2$ indicates the opinion of individual $i$. We consider Cannings models [32], that is, all individuals are exchangeable and the population is homogeneous. In particular, we do not have an interaction graph or, equivalently, the interaction graph is the full graph. We introduce the functions which count the supporters of opinion A (function $n_+(\sigma)$) and the supporters of opinion B (function $n_-(\sigma)$), respectively, $$n_+(\sigma) = \#\{i : \sigma_i = 1\}, \qquad n_-(\sigma) = \#\{i : \sigma_i = -1\} = N - n_+(\sigma).$$ Our knowledge on the opinion distribution in the population will be expressed by random measures on $\Sigma_2$. Due to the assumption of Cannings models, the random measures are necessarily invariant with respect to permutations of individuals: two states $\sigma_1, \sigma_2 \in \Sigma_2$ with $n_+(\sigma_1) = n_+(\sigma_2)$ have the same probability.
That is, any random measure $Q : \Sigma_2 \to [0, 1]$ describing the state of the population induces a random measure $P$ on the state space $V_N := \{0, \ldots, N\}$, which indicates the number of opinion-A supporters. Let $\sigma^{(k)} \in \Sigma_2$ be a given state with $n_+(\sigma^{(k)}) = k$; then, for combinatorial reasons, $$P(k) = \binom{N}{k}\, Q(\sigma^{(k)}).$$ We will find out later that the binomial coefficient, which appears here for symmetry reasons, plays a distinct role in the theory developed below.
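The combinatorial relation between a permutation-invariant measure on $\Sigma_2$ and the induced measure on $V_N$ can be checked by brute force for a small population. The following is a minimal sketch; the weight function `weight` and the value `0.3` are hypothetical choices made only for illustration and are not taken from the models discussed in this paper.

```python
from itertools import product
from math import comb, exp

N = 4  # small population so that all 2^N states can be enumerated


def n_plus(sigma):
    """Number of individuals holding opinion A (coded as +1)."""
    return sum(1 for s in sigma if s == 1)


def weight(sigma):
    """Hypothetical exchangeable weight: depends on sigma only via n_plus."""
    return exp(0.3 * n_plus(sigma))


# Permutation-invariant measure Q on the full state space {+1, -1}^N.
states = list(product((1, -1), repeat=N))
total = sum(weight(s) for s in states)
Q = {s: weight(s) / total for s in states}

# Induced measure on {0, ..., N}, obtained by summing Q over all states with k supporters.
P_by_summation = [
    sum(q for s, q in Q.items() if n_plus(s) == k) for k in range(N + 1)
]

# The same measure via the binomial coefficient times one representative state.
P_by_formula = []
for k in range(N + 1):
    representative = tuple([1] * k + [-1] * (N - k))
    P_by_formula.append(comb(N, k) * Q[representative])

print(P_by_summation)
print(P_by_formula)
```

Both lists agree up to rounding, which illustrates why the binomial coefficient appears whenever a measure on individual states is rewritten as a measure on supporter counts.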
In the next two sections, we introduce two very different ways used in the literature to construct random measures for the opinion state of the population, that is, on $\Sigma_2$ or, respectively, $V_N$. The first approach is based on stochastic processes, while the second is statistical in design. We should keep in mind that, due to the symmetry discussed, every rate and every function used to define the models can be constructed in such a way that it depends on $\sigma \in \Sigma_2$ only via $n_\pm(\sigma)$ and the population size $N$.

Opinion Process
A stochastic opinion process is a $\Sigma_2$-valued Markov process $\tilde\sigma_t$, where single persons reconsider their opinion at rate $\nu$. There is a certain probability that this person indeed changes her mind. These probabilities depend on the opinion distribution in the population; as mentioned above, this dependency is established via $n_\pm(\sigma)$ and not via the fine structure of the state (which individual is an A-supporter and which individual is a B-supporter). For mathematical convenience, but without loss of generality, we assume that the rate for $\sigma_i$ to switch from $-1$ to $1$ is a function of $n_+(\sigma)$ and $N$, while the rate to switch from $1$ to $-1$ depends on $n_-(\sigma)$ and $N$, $$\sigma_i: -1 \to 1 \ \text{ at rate } \ \nu f^+(n_+(\sigma); N), \qquad \sigma_i: 1 \to -1 \ \text{ at rate } \ \nu f^-(n_-(\sigma); N).$$ To obtain a feeling for which terms to use for $f^\pm$, we can look at the voter model. Here, $\nu$ represents the rate at which a person reconsiders her opinion and interacts with some randomly chosen person in the population (with so-called selfing, that is, the person might also choose herself). The functions $f^\pm(n_\pm(\sigma)) = n_\pm(\sigma)/N$ simply specify the probability of interacting with a person of the other opinion. Below, we also consider other examples, where the probability to change the mind is slightly more involved, but the overall structure of the terms will be similar. All other entries in state $\sigma$ are not affected by a flip of the $i$'th person's opinion.
As mentioned above, the $\Sigma_2$-valued process $\tilde\sigma_t$ induces a $V_N = \{0, \ldots, N\}$-valued process $X_t$ via $X_t = n_+(\tilde\sigma_t)$. The transition rates of $X_t$ are given by $$k \to k+1 \ \text{ at rate } \ \nu\,(N-k)\, f^+(k; N), \qquad k \to k-1 \ \text{ at rate } \ \nu\, k\, f^-(N-k; N).$$ We call a Markov process with transition rates given in this form an opinion process. It is straightforward to determine the stationary distribution of an opinion process if we have no absorbing states. As we aim at a specific notation that parallels the usual notation of Potts models, we derive the stationary distribution step by step. In what follows, we suppress the dependency of $f^\pm$ on $N$.
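To make the dynamics of the induced process $X_t$ concrete, the following minimal sketch simulates one trajectory with the Gillespie algorithm. It assumes that $X_t$ jumps from $k$ to $k+1$ at rate $\nu (N-k) f^+(k)$ and from $k$ to $k-1$ at rate $\nu k f^-(N-k)$, as suggested by the voter-model description above; the function name `simulate_opinion_process` and all parameter values are hypothetical.

```python
import random


def simulate_opinion_process(N, f_plus, f_minus, nu=1.0, k0=None, t_max=10.0, seed=0):
    """Gillespie simulation of the number of opinion-A supporters.

    Assumed rates (see lead-in): k -> k+1 at nu*(N-k)*f_plus(k),
    k -> k-1 at nu*k*f_minus(N-k).
    """
    rng = random.Random(seed)
    k = N // 2 if k0 is None else k0
    t = 0.0
    times, path = [0.0], [k]
    while True:
        up = nu * (N - k) * f_plus(k)
        down = nu * k * f_minus(N - k)
        total = up + down
        if total == 0.0:
            break  # absorbing state (consensus in the pure voter model)
        t += rng.expovariate(total)  # exponential waiting time until the next flip
        if t > t_max:
            break
        k += 1 if rng.random() * total < up else -1
        times.append(t)
        path.append(k)
    return times, path


N = 50
voter = lambda n: n / N  # voter model with selfing: f(n) = n / N
times, path = simulate_opinion_process(N, voter, voter, t_max=5.0, seed=1)
print(len(path), path[-1])
```

In the pure voter model the states $k = 0$ and $k = N$ are absorbing, which is why the stationary analysis that follows requires strictly positive rates.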
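Proposition 1 (stationary distribution). Assume $f^\pm(k) > 0$ for $k \in V_N = \{0, \ldots, N\}$. Furthermore, let $F^\pm : V_N \to \mathbb{R}$ be defined by $$F^\pm(k) = \sum_{\ell=0}^{k-1} \ln f^\pm(\ell), \qquad F^\pm(0) = 0, \qquad H(k) = -F^+(k) - F^-(N-k), \qquad Z = \sum_{k=0}^{N} \binom{N}{k} e^{-H(k)}.$$ Denoting the probability of state $k \in V_N$ in the stationary distribution by $p_k$, we have $$p_k = Z^{-1} \binom{N}{k} e^{-H(k)}. \qquad (5)$$
Proof. The detailed balance equation for the stationary distribution $p_k = P(X = k)$ yields $$\nu\,(N-k)\, f^+(k)\, p_k = \nu\,(k+1)\, f^-(N-k-1)\, p_{k+1}. \qquad (6)$$ Hence, $$p_k = C \binom{N}{k} \prod_{\ell=0}^{k-1} f^+(\ell) \prod_{\ell=0}^{N-k-1} f^-(\ell),$$ where $C$ is determined by $\sum_{k=0}^{N} p_k = 1$. Together with the definition of $F^\pm$, $H(k)$, and $Z$, this formula proves the proposition.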
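The stationary distribution can be evaluated numerically from the rate functions alone. The sketch below assumes transition rates $k \to k+1$ at $\nu (N-k) f^+(k)$ and $k \to k-1$ at $\nu k f^-(N-k)$, and takes the environments to be the cumulative log-rates $F^\pm(k) = \sum_{\ell < k} \ln f^\pm(\ell)$; these conventions are reconstructed assumptions. The rate functions `f_plus` and `f_minus` are hypothetical examples.

```python
from math import comb, exp, log


def environments(f, N):
    """Cumulative log-rates F(k) = sum_{l=0}^{k-1} ln f(l), with F(0) = 0."""
    F = [0.0]
    for l in range(N):
        F.append(F[-1] + log(f(l)))
    return F


def stationary_distribution(N, f_plus, f_minus):
    """p_k proportional to binom(N, k) * exp(F_plus(k) + F_minus(N - k))."""
    F_plus = environments(f_plus, N)
    F_minus = environments(f_minus, N)
    log_w = [log(comb(N, k)) + F_plus[k] + F_minus[N - k] for k in range(N + 1)]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [exp(v - shift) for v in log_w]
    Z = sum(w)
    return [v / Z for v in w]


N = 30
# Hypothetical strictly positive rate functions (no absorbing states).
f_plus = lambda n: (n + 2.0) / (N + 5.0)
f_minus = lambda n: (n + 3.0) / (N + 5.0)

p = stationary_distribution(N, f_plus, f_minus)

# Detailed balance: probability flux k -> k+1 equals flux k+1 -> k.
max_violation = max(
    abs(p[k] * (N - k) * f_plus(k) - p[k + 1] * (k + 1) * f_minus(N - k - 1))
    for k in range(N)
)
print(max_violation)
```

The detailed balance check at the end does not depend on the sign convention chosen for the Hamiltonian; it only uses the rates and the computed probabilities.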
Interestingly, the corresponding stationary distribution on $\Sigma_2$ can be written as where this is similar for $n_{\sigma_i}$. Each individual has an independent contribution $\exp(F^{\sigma_i}(\cdot))$ to the probability of state $\sigma$, where $F^\pm(\cdot)$ depend on the global statistics of the state via $n_\pm(\sigma)$. That is, $F^{\sigma_i}(n_{\sigma_i}(\sigma))$ can be regarded as the environment of individual $i$, which determines the probability of the opinion that individual $i$ has adopted. It is, furthermore, interesting to observe that the population size $N$ does not explicitly appear in the expression $\exp(F^{\sigma_i}(n_{\sigma_i}(\sigma)))$. The number of opposite-opinion supporters only comes in indirectly, as $n_+(\sigma)$ and $n_-(\sigma)$ add up to the given population size $N$.
For obvious reasons, we call the functions F ± (.) the environmental conditions, or simply the environments, of the opinion process.
We have a degree of freedom in Equation (5). We might add a real constant $A$ in the exponent, $p_k = Z^{-1} \binom{N}{k} e^{-H(k)+A}$. Then, $Z$ is still defined as the normalizing constant, guaranteeing that $\sum_{k=0}^{N} p_k = 1$. Thus, the term $e^A$ also appears in $Z$, such that $A$ cancels out and does not affect the value of $p_k$. Below, in the definition of the Boltzmann distribution, we will find a similar invariance. This invariance will be used later to eliminate singularities appearing in Section 3.2, where we investigate the large population limit with weak effects.
For now, let us discuss the freedom given by this invariance in more detail and, in particular, explore the implication for the choice of the environments. If we replace $F^\pm(k)$ by $\hat F^\pm(k)$, where we choose $U^\pm(0) = 0$ in accordance with $F^\pm(0) = 0$ and require $U^\pm(k)$ to satisfy Then, the stationary distribution $p_k$ is not affected. We now go backward and determine transition rates $k f^\pm(N - k)$ that produce the new environments. Here, we use ln( f The function $U^+$ indeed cancels out in the detailed balance of Equation (6).
Please note that we are free to choose $U^+(k)$ (with the understanding that $U^+(0) = 0$). Next, we defined In particular, the choice $A = U^+(N)$ is equivalent to the requirement $U^-(0) = 0$, such that $A$ is not free. The choice $A = U^+(N)$ is unique. All in all, the freedom we have is completely characterized by the choice of $U^+$.
Corollary 1. The rates $k f^\pm(k)$ (or, alternatively, the environments $F^\pm(k)$) determine the stationary distribution of an opinion process. The set of all opinion processes with stationary distributions coinciding with that given distribution is characterized by where $U^\pm$ are functions satisfying $U^+(0) = U^-(0) = 0$ and

Potts Machinery
Next, we introduce Potts models, where we again consider Potts models only on a full graph. Potts models on a full graph are often termed mean-field or Curie-Weiss models. In agreement with the literature, we return for the moment to the individual-based formulation of the opinion model, that is, we use $\Sigma_2$ as a state space.
We do not know the state of the population. The knowledge we assume to have consists of the results of some polls. For example, we could observe/measure the fraction of individuals with opinion $+1$. Consequently, we will know the expected number of persons in state $1$, that is, the expectation of $n_+(\cdot)$. As the Potts models originate in physics, we follow the tradition and call these polls "observations" and, accordingly, the function $n_+(\cdot)$ an "observable".
In general, observables are defined as functions $\tilde F : \Sigma_2 \to \mathbb{R}$ without further restrictions. Another example of an observable that is often used is the number of pairs with identical opinions minus the number of pairs with different opinions, $$\tilde F(\sigma) = \sum_{i<j} \sigma_i \sigma_j.$$ This observable incorporates information about correlations in the population.
Let us assume that we have $m$ observables $\tilde F_1, \ldots, \tilde F_m$. Our knowledge about the state of the population is restricted to the knowledge of $\tilde f_\ell := \tilde F_\ell(\sigma)$, $\ell = 1, \ldots, m$. We represent our knowledge, and, particularly, the absence of complete knowledge about the state, in the form of a random measure $Q$ on $\Sigma_2$. We, of course, require that $E(\tilde F_\ell(\cdot)) = \tilde f_\ell$ such that $Q$ does express our partial knowledge appropriately. However, there are many random measures that will satisfy this requirement. We express this lack of complete knowledge by uniquely defining $Q$ as the random measure that maximizes the Shannon entropy $S(Q) = -\sum_{\sigma \in \Sigma_2} Q(\sigma) \ln(Q(\sigma))$ under the constraints $E(\tilde F_\ell(\cdot)) = \tilde f_\ell$. The random measure $Q(\cdot)$ we construct in this way is called the Boltzmann distribution.
We restrict ourselves to Cannings models such that the observables $\tilde F_\ell$ only depend on $\Sigma_2$ via $n_\pm(\sigma)$ and thus can be defined as maps $\tilde F_\ell : V_N \to \mathbb{R}$. Also, the Boltzmann distribution can be defined directly on $V_N$ via $P(n_+(\sigma)) = \binom{N}{n_+(\sigma)} Q(\sigma)$; as $Q(\sigma)$ depends on $\sigma$ only via $n_+(\sigma)$, this formula defines $P$ consistently. However, as the lack of knowledge still concerns the state of the population in $\Sigma_2$, we do not use the original Shannon entropy for $P$, but measure the entropy for $P$ by the entropy for the associated measure $Q$ on $\Sigma_2$.
Proposition 2. Let $m \in \mathbb{N}$ denote the number of observables and $\tilde F_\ell : V_N \to \mathbb{R}$, $\ell \in \{1, \ldots, m\}$, the observables themselves. The Boltzmann distribution is the distribution $P : V_N \to \mathbb{R}_+$ that maximizes the Shannon entropy $$S_{\Sigma_2}(P) = -\sum_{k \in V_N} P(k) \left[ \ln(P(k)) - \ln \binom{N}{k} \right]$$ under the constraint $E(\tilde F_\ell(\cdot)) = \tilde f_\ell \in \mathbb{R}$. If the Boltzmann distribution $P$ exists, then where $\lambda_\ell$, $\ell \in \{1, \ldots, m\}$, are Lagrange multipliers.
Proof. The proof consists of a short computation (see, e.g., [25,26]), based on the standard Lagrangian approach for the maximization of $S(P)$ under the constraints $E(\tilde F_\ell(\cdot)) = \sum_{k \in V_N} \tilde F_\ell(k) P(k) = \tilde f_\ell$, $\ell = 1, \ldots, m$, and $\sum_{k \in V_N} P(k) = 1$. Let $P(\cdot)$ denote the set of probabilities We determine all values $P(k)$ by maximizing the function Fix $\tilde k \in V_N$. If we equate the derivative of $L$ with respect to $P(\tilde k)$ to zero, we find and $P(\tilde k) = e^{-H(\tilde k)} e^{\lambda_{m+1} - 1}$. We obtain $e^{\lambda_{m+1} - 1} = 1/Z$ by the condition that the probabilities add up to 1, that is, $\sum_{k \in V_N} P(k) = 1$.
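The maximum entropy property can be illustrated numerically for a single observable. The sketch below assumes the observable $\tilde F(k) = k$ (the number of opinion-A supporters) and the sign convention $P(k) \propto \binom{N}{k} e^{-\lambda k}$; the sign convention for $\lambda$ and the target value `target_mean` are assumptions made for illustration. Under this convention, $P$ is a binomial distribution with success probability $e^{-\lambda}/(1+e^{-\lambda})$, so the multiplier can be solved for in closed form.

```python
from math import comb, exp, log

N = 10
target_mean = 3.0  # hypothetical poll result for the expected number of A-supporters

# Solve the constraint E[k] = target_mean in closed form (binomial mean N * p).
p_success = target_mean / N
lam = log((1.0 - p_success) / p_success)

weights = [comb(N, k) * exp(-lam * k) for k in range(N + 1)]
Z = sum(weights)
P = [w / Z for w in weights]


def entropy_sigma2(dist):
    """Entropy of the associated measure on the full state space:
    S = -sum_k P(k) * (ln P(k) - ln binom(N, k))."""
    return -sum(q * (log(q) - log(comb(N, k))) for k, q in enumerate(dist) if q > 0)


def mean(dist):
    return sum(k * q for k, q in enumerate(dist))


# Perturbation direction that preserves both normalization and the mean:
# entries (+1, -2, +1) at k = 2, 3, 4 sum to zero and have zero first moment.
eps = 1e-3
direction = [0.0] * (N + 1)
direction[2], direction[3], direction[4] = 1.0, -2.0, 1.0
P_up = [q + eps * d for q, d in zip(P, direction)]
P_down = [q - eps * d for q, d in zip(P, direction)]

print(entropy_sigma2(P), entropy_sigma2(P_up), entropy_sigma2(P_down))
```

Both perturbed distributions satisfy the same constraints but have strictly smaller entropy, as expected for the maximizer of a strictly concave functional on a convex constraint set.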

Remark 1. (a) In accordance with the literature, the Lagrangian multipliers $\lambda_1, \ldots, \lambda_m$ are not specified to actually determine a Boltzmann distribution that indeed satisfies $E(\tilde F_\ell(\cdot)) = \tilde f_\ell \in \mathbb{R}$; instead, the Lagrangian multipliers are from now on considered as parameters of the Boltzmann distribution. (b) We note that additive constants in the Hamiltonian do not affect the stationary distribution, as these constants appear in a multiplicative way in the numerator $e^{-H(k)}$ as well as in the denominator $Z$ of the stationary distribution. We make use of this observation below to get rid of singularities appearing in the weak effects limit.
The Curie-Weiss model sensu stricto is defined by observables that are polynomials of the second order, as we will discuss in Section 4.1. For the time being (Sections 2.3 and 3), we allow for more general functions as observables, as this is necessary to obtain and utilize a connection between the Curie-Weiss models and the opinion processes, which we discuss next.

Connection between Opinion Processes and the Curie-Weiss Model
An opinion process has a stationary distribution, and a Potts model a Boltzmann distribution. In order to connect opinion processes and Potts models, we ask which conditions the observables (Potts models) and, respectively, the environments (opinion processes) need to satisfy such that the stationary distribution of an opinion model coincides with the Boltzmann distribution. If we combine Propositions 1 and 2, we find the following corollary.
Corollary 2. The stationary distribution of an opinion model with population size $N$ and environment $F^\pm : V_N \to \mathbb{R}$ and the Boltzmann distribution for observables $\tilde F^\pm$ :
Note that this corollary only states sufficient but not necessary conditions. Corollary 1 allows, for example, for constructing more observables that generate the same distribution. Considering the settings of this corollary for general $\lambda_\pm \in \mathbb{R}$, we observe Therefore, a Boltzmann distribution with observables derived from environments of an opinion process is, for general $\lambda_\pm \in \mathbb{R}$, the stationary distribution of the opinion process with transition rates We call this family of opinion models the Glauber family for the given observables/environments. Please note that the standard Glauber dynamics for the spin up/spin down mean-field Ising model [25] utilizes the freedom discussed in Corollary 1 in choosing a non-trivial function $U^+(k)$.
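The role of the Lagrangian parameters in the Glauber family can be illustrated numerically. The sketch below assumes that the parameters enter the transition rates as exponents, i.e., $k \to k+1$ at rate $(N-k)\, f^+(k)^{\lambda_+}$ and $k \to k-1$ at rate $k\, f^-(N-k)^{\lambda_-}$, so that the environments are simply multiplied by $\lambda_\pm$; this parametrization, the rate function `f`, and the parameter values are assumptions made for illustration. The code works with rates only and is therefore independent of the sign convention for the Hamiltonian.

```python
from math import comb, exp, log


def cumulative_log(f, N):
    """F(k) = sum_{l=0}^{k-1} ln f(l), with F(0) = 0."""
    F = [0.0]
    for l in range(N):
        F.append(F[-1] + log(f(l)))
    return F


def glauber_stationary(N, f_plus, f_minus, lam_plus, lam_minus):
    """p_k proportional to binom(N,k) * exp(lam_plus*F_plus(k) + lam_minus*F_minus(N-k))."""
    Fp = cumulative_log(f_plus, N)
    Fm = cumulative_log(f_minus, N)
    log_w = [
        log(comb(N, k)) + lam_plus * Fp[k] + lam_minus * Fm[N - k]
        for k in range(N + 1)
    ]
    shift = max(log_w)
    w = [exp(v - shift) for v in log_w]
    Z = sum(w)
    return [v / Z for v in w]


def balance_residual(N, p, f_plus, f_minus, lam_plus, lam_minus):
    """Largest violation of the global balance equations (inflow minus outflow)."""
    up = [(N - k) * f_plus(k) ** lam_plus for k in range(N + 1)]
    down = [k * f_minus(N - k) ** lam_minus for k in range(N + 1)]
    worst = 0.0
    for k in range(N + 1):
        inflow = 0.0
        if k > 0:
            inflow += p[k - 1] * up[k - 1]
        if k < N:
            inflow += p[k + 1] * down[k + 1]
        outflow = p[k] * (up[k] + down[k])
        worst = max(worst, abs(inflow - outflow))
    return worst


N = 40
f = lambda n: (n + 1.0) / (N + 2.0)  # hypothetical strictly positive rate function
p = glauber_stationary(N, f, f, lam_plus=1.5, lam_minus=0.7)
residual = balance_residual(N, p, f, f, lam_plus=1.5, lam_minus=0.7)
print(residual)
```

Unequal parameters $\lambda_+ \neq \lambda_-$ break the symmetry between the two opinions even though both use the same rate function.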

Large Population Limits
Below, we consider applications of opinion processes to data, where the population size is often of the order of $N \approx 10^5$. Therefore, a continuum limit is of interest. As above, we might consider the dynamics (opinion process) and work out a diffusion limit, or we might focus on the Boltzmann distribution and consider a continuum limit for that distribution. The interesting point is to understand the connection between the two resulting objects.
Here, we note that the rates $f^\pm$, the corresponding environments $F^\pm$, and the observables $\tilde F^\pm$ incorporate parameters. Two different assumptions about the dependencies of these parameters on the population size $N$ (which itself is a parameter) lead to sensible continuum limits for $N \to \infty$: one is the strong effects limit and the other the weak effects limit, as introduced for the Moran model as part of population genetics [33][34][35] (where the Moran model with mutation is mathematically identical to the zealot model). In physics, the terms "extensive" and "nonextensive" models are used to describe the same idea [36,37].
To explain the difference between strong and weak effects, let us focus on the zealot model. In the strong effects models, the number of zealots is assumed to increase proportionally to the population size. Consequently, the probability for a person to interact with a zealot is independent of the population size, and zealots clearly affect the invariant distribution, also in the limiting case $N \to \infty$.
In the weak effects setting, however, the number of zealots is constant and independent of the population size. A person's probability to interact with a zealot tends toward zero if the population size tends toward infinity. The effects of zealots become weaker and weaker if $N$ increases, which is why these models are termed "weak effects models". Mobilia, indeed, asked in his paper [23]: "Does a single zealot affect an infinite group of voters?" Though it sounds unlikely that this single zealot has some noteworthy effect on a huge population, it turns out that this is the case.
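The difference between the two scalings can be made visible numerically for the zealot model. The sketch below assumes $N$ free voters plus $z_+$ and $z_-$ zealots and the rate functions $f^\pm(n) = (n + z_\pm)/(N + z_+ + z_-)$; with these assumed rates, the stationary distribution of the number of A-supporters among the free voters is a beta-binomial distribution, which is the standard result for the Moran model with mutation. The function names and the zealot numbers are hypothetical.

```python
from math import exp, lgamma, sqrt


def zealot_stationary(N, z_plus, z_minus):
    """Beta-binomial stationary law of the zealot model (assumed rates, see lead-in)."""
    a, b = float(z_plus), float(z_minus)
    probs = []
    for k in range(N + 1):
        log_p = (
            lgamma(N + 1) - lgamma(k + 1) - lgamma(N - k + 1)  # ln binom(N, k)
            + lgamma(k + a) + lgamma(N - k + b) - lgamma(N + a + b)
            + lgamma(a + b) - lgamma(a) - lgamma(b)
        )
        probs.append(exp(log_p))
    return probs


def std_of_share(p):
    """Standard deviation of the vote share x = k / N under the distribution p."""
    N = len(p) - 1
    mean = sum(k / N * q for k, q in enumerate(p))
    var = sum((k / N - mean) ** 2 * q for k, q in enumerate(p))
    return sqrt(var)


sizes = (100, 1000, 10000)
# Weak effects: the number of zealots stays fixed (here 2 per opinion).
weak = {N: std_of_share(zealot_stationary(N, 2, 2)) for N in sizes}
# Strong effects: the number of zealots grows proportionally to N (here 10% per opinion).
strong = {N: std_of_share(zealot_stationary(N, 0.1 * N, 0.1 * N)) for N in sizes}

for N in sizes:
    print(N, round(weak[N], 4), round(strong[N], 4))
```

Under the weak effects scaling, the spread of the vote share stays of order one as $N$ grows, so a fixed handful of zealots keeps shaping the limiting distribution; under the strong effects scaling, the distribution concentrates around its mean.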
We define the weak/strong effects limit assumptions in a mathematically precise way in Sections 3.1 and 3.2 below (please do not confuse the weak effects limit, which refers to the scaling of the parameters, with a weak limit, which refers to the topology of convergence).
We assume throughout the current section that $f^\pm$ and $\tilde F^\pm$ depend on $k/N$ for $k \in V_N$, that is, on the share of an opinion Since the observables are defined via $f^\pm$, we rewrite them separately; see Section 3.1 below. For given $N$, the old and the new scaling are mathematically equivalent. However, if we aim at a limit $N \to \infty$, the new scaling is a reasonable and very helpful assumption that the most commonly used models actually fulfill.

Strong Effects Limit
In the strong effects limit, we assume that is well-defined. For clarity of notation, we write the limits of rate functions, environments, and observables for $N \to \infty$ in bold. To adapt the domain of the associated environments from the discrete state space $V_N$ to the continuous state space $x \in [0, 1]$, we re-define the environments as For $x = k/N$, $k \in V_N$, we recover the environments as defined in Proposition 1. Therewith, the environments also satisfy a proper limit for $N \to \infty$, We also introduce a limit for the observables. Here, a certain subtlety appears: we have In leading order, the observables are $O(N)$. We thus scale them by $1/N$ and define We emphasize at this point that when using $\tilde F^\pm(x)$ below, we need to take the scale $1/N$, introduced at this point, into account.
To obtain the behavior of the opinion process under the strong effects limit, we briefly sketch the Kramers-Moyal [38] expansion of the model, which is, as usual, truncated at the second order to obtain a Fokker-Planck (or Kolmogorov forward) equation.

Proposition 3. The Kramers-Moyal expansion up to the second order for the Glauber family with limiting observables $\tilde F^\pm(x) = \int_0^x \ln(f^\pm(y))\, dy$ is given by
Proof. We start with the master equations If we now assume that $p_k \approx h\, u(x, t)$ for some smooth probability density $u(x, t)$, where $x = k h$ and $h = 1/N$, we have $\partial_t u(x, t) \approx \dot p_k$, and The Taylor expansion of the last two terms up to the second order, neglecting the error term and using the limit of $f^\pm(\,\cdot\,; N)$ for $N \to \infty$, yields the Kramers-Moyal expansion.
Next, we turn to the Boltzmann distribution. In the continuum limit, we will denote the Boltzmann distribution (and later the stationary distribution of a limiting stochastic opinion process as well) by $\varphi(x)$.
Proposition 4. The Boltzmann distribution for observables $\tilde F^\pm(x) = \int_0^x \ln(f^\pm(y))\, dy$ and large $N$ is given in leading order by the Hamiltonian where The result is a consequence of Equation (10), the scale of $\tilde F^\pm$ with respect to $N$ introduced in (15), and the well-known approximation of the binomial coefficient by means of the binary entropy The connection between the stationary distributions of the Kramers-Moyal expansion Equation (16) and the stationary distribution Equation (17) is not clear; though they are derived from the associated Potts models and opinion processes, they look rather different. The next proposition clarifies the connection.
Proof. We first investigate the Boltzmann distribution. Since we have a critical point of $H(x)$ at $x = \mu$. The second derivative reads, again using $f(\mu) = 0$, Hence, Locally, at $x = \mu$, the stationary distribution $e^{-H(x)}/Z$ behaves as $\mathcal{N}(\mu, \sigma^2)$ with Next, we proceed to the stationary distribution based on the Kramers-Moyal expansion (16).
The leading order terms of the linearization of the drift and the noise term at $x = \mu$ yield the Ornstein-Uhlenbeck approximation where we used (1 In order to identify the stationary distribution, we substitute $u(x, t) = \varphi(x)$ with $\varphi(x) = e^{-a (x - \mu)^2}$ into the term bracketed with curly brackets, This term becomes zero and, therewith, $\varphi(x)$ an invariant measure, if . Therefore, the stationary distribution of this approximate Fokker-Planck equation is a normal distribution $\mathcal{N}(\mu, \sigma^2)$, where For large $N$, the stationary distribution will be concentrated close to $\mu$ and, hence, in the relevant regions both stationary distributions, that of Equation (19) and that of Equation (17), coincide. In this sense, we consider the Kramers-Moyal expansion (16) as the Glauber dynamics of the Boltzmann distribution.
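The approximation of the binomial coefficient by the binary entropy, which underlies the strong effects Hamiltonian, can be checked numerically. The sketch below assumes the natural-logarithm form $H_2(x) = -x \ln x - (1-x) \ln(1-x)$ and the leading-order statement $\ln \binom{N}{Nx} \approx N H_2(x)$; the helper names are hypothetical.

```python
from math import lgamma, log


def log_binom(N, k):
    """Natural logarithm of the binomial coefficient, via the log-gamma function."""
    return lgamma(N + 1) - lgamma(k + 1) - lgamma(N - k + 1)


def binary_entropy(x):
    """Binary entropy in natural units, H_2(x) = -x ln x - (1 - x) ln(1 - x)."""
    return -x * log(x) - (1.0 - x) * log(1.0 - x)


def relative_error(N, numerator=3, denominator=10):
    """Relative error of N * H_2(x) against ln binom(N, N x) at x = numerator/denominator."""
    k = numerator * N // denominator
    exact = log_binom(N, k)
    approx = N * binary_entropy(k / N)
    return abs(approx - exact) / exact


errors = {N: relative_error(N) for N in (100, 10000, 1000000)}
for N, err in errors.items():
    print(N, err)
```

The relative error decays as $N$ grows because the neglected correction is only logarithmic in $N$, whereas the leading term is linear in $N$.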

Weak Effects Limit
The idea of the weak effects limit is to take a simple reference model as a basis and to perturb this model in such a way that for $N \to \infty$, the transition rates converge back to those of the reference model. For good reasons, as we will find out later, we use the voter model as the reference model. In the voter model, at rate $\nu$, a person copies the opinion of a randomly chosen person in the population (including "selfing", which means that the focal person might by chance copy her opinion from herself). Therewith, and hence, $f^\pm_{\text{voter}}(x) = x$. For the weak effects limit, we allow for rates $f^\pm(x; N)$, which depend on $N$, as long as $\lim_{N \to \infty} f^\pm(x; N) = f^\pm_{\text{voter}}(x)$. Also, the Lagrangian parameters are allowed to depend on $N$, $\lambda_\pm = \lambda_\pm(N)$, and the paradigm of weak effects requires again $\lim_{N \to \infty} f^\pm(x; N)^{\lambda_\pm(N)} = f^\pm_{\text{voter}}(x)$. Therefore, the expansion of $f^\pm$ with respect to $1/N$ has $f^\pm_{\text{voter}}(x)$ as the zero order term and some arbitrary (well-behaved) function $g^\pm(x)$ as the first order coefficient. Similarly, the zero order term of the expansion of $\lambda_\pm(N)$ is 1, while the first order terms are some parameters $\kappa_\pm \in \mathbb{R}$, which we are free to choose. All in all, the Glauber family suited for the weak effects limit assumes the form As we will see, for the weak effects limit, we not only assume the appropriate scaling of the parameters, but also rescale time. In this way, even for $N \to \infty$, we still obtain a non-trivial limiting process and do not simply return to the voter model. As a consequence, we need to reconsider the large population limit conducted in (14) and (15) in more detail and also pay attention to the terms of order $O(N^{-1})$.
Proposition 6. Consider the observables for the opinion process defined by (21), With the definition the leading order terms of the expansion of $\tilde F^\pm(x; N)$ in $1/N$ read
Proof. We first rewrite the sum such that it extends to Next, we replace the sum by an integral. Here, we take the Euler-Maclaurin correction terms into account in the step from sum to integral. Furthermore, we note that an additive constant in the Hamiltonian does not change the stationary distribution. Instead of $\sum_{\ell=0}^{\lfloor Nx \rfloor} (\ldots)$, we can change the lower starting value to any value independent of $x$ and might, e.g., consider $\sum_{\ell=\lfloor N/2 \rfloor}^{\lfloor Nx \rfloor} (\ldots)$ instead. Only the upper bound of the sum matters; we only need an anti-derivative. To express this fact, we skip the lower bound of the integral and proceed (where the first equal sign has to be interpreted with the knowledge that we did drop some irrelevant term): Note that $\lim_{N \to \infty} \int^x \ln\left(1 + \frac{1}{yN}\, g^\pm(y)\right) dy = 0$ (in the sense that 0 is a possible limiting anti-derivative) such that this integral only contributes to terms of order $O(N^{-1})$ or higher. We introduce $\int^x g^\pm(y)/y\, dy$ and obtain the result.
Therewith, we are in the position to establish the following proposition.
Proposition 7. Assume that the functions $f^\pm(x)$ scale with $N$ as described above, and scale the Lagrangian parameters by $\lambda_\pm = 1 + \kappa_\pm/N$. Then, in leading order, the Hamiltonian reads We use the approximation of the binomial coefficient (18) to obtain the result with , where we drop terms independent of $x$ and terms of a higher order in $N^{-1}$.
It is remarkable and typical for the weak effects scaling that the Hamiltonian and, thereby, also the Boltzmann distribution become independent of $N$. We now turn to the underlying opinion process and discuss the Kramers-Moyal expansion under the scaling assumed.
Proposition 8. The Kramers-Moyal expansion of the Glauber family under the weak effects scaling in rescaled time $T = \nu t/N$ is given by
Proof. The Glauber family with the weak effects scaling is defined by The drift term becomes in leading order (where h.o.t. is a placeholder for higher order terms), and the coefficient of the noise term becomes in leading order If we rescale time $T = \nu t/N$, we obtain the result.
The Kramers-Moyal expansion in rescaled time becomes, like the Hamiltonian, independent of $N$.
Proposition 9. The stationary distribution of the Kramers-Moyal expansion is identical to the Boltzmann distribution $\varphi(x) = \exp(-H(x))/Z$, where the Hamiltonian $H(x)$ is given in (23).
Proof. We plug the Boltzmann distribution into the right-hand side of the Kramers-Moyal expansion. If we take the derivative of the resulting identity with respect to x, we indeed find that φ(x) is a stationary solution of Equation (24).
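The argument of this proof follows a standard pattern for one-dimensional Fokker-Planck equations, which can be sketched in generic notation. The symbols a(x) (drift) and b(x) (noise coefficient) are assumptions introduced only for this illustration; they are not the specific coefficients of the expansion in the text, and the factor 1/2 is one common convention.

```latex
% Generic notation (assumption): drift a(x), noise coefficient b(x) > 0.
\begin{align}
  \partial_T u &= -\partial_x\bigl(a(x)\,u\bigr)
                 + \tfrac12\,\partial_x^2\bigl(b(x)\,u\bigr), \\
  0 &= a(x)\,\varphi(x) - \tfrac12\,\partial_x\bigl(b(x)\,\varphi(x)\bigr)
     \qquad\text{(zero flux)}, \\
  \varphi(x) &= \frac{C}{b(x)}\exp\!\left(\int^{x}\frac{2a(y)}{b(y)}\,dy\right),
  \qquad
  H(x) = \ln b(x) - \int^{x}\frac{2a(y)}{b(y)}\,dy + \text{const}.
\end{align}
```

Differentiating the zero-flux identity once with respect to x shows that the right-hand side of the first equation vanishes for u = φ, which is the sense in which a Boltzmann-type density exp(−H(x))/Z can be verified as a stationary solution.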
Remark 2. For the weak effects limit, we have chosen the voter model with f ± (x) = x as the reference model.If N → ∞, the rates of the model at hand converge back to this model.This choice looks, at first glance, arbitrary.It is, however, up to the freedom characterized in Corollary 1, a unique choice: the binary entropy H 2 (x) generates in the Hamiltonian terms Nx ln(x) and N(1 − x) ln(1 − x).For a weak effects limit to exist, these terms of order O(N) need to be balanced and annihilated by the environments of the reference model, which already forces the reference model to be the voter model (or some model, which is, according to Corollary 1, equivalent to the voter model).
4. Four Models: Curie-Weiss, Weak and Strong q-Voter Model, and Reinforcement
We use the framework introduced above to briefly introduce four opinion models that we intend to apply to data.

Curie-Weiss Model
To introduce the classical Curie-Weiss model, we start with the Potts machinery. Recall that the central ingredients are observables, that is, functions F± : V N → R, which form constraints when determining the random measure maximizing the Shannon entropy. Perhaps the simplest non-trivial case is given by observables that are polynomials of second order. We do not need a zero order term, as additive constants in the Hamiltonian do not influence the Boltzmann distribution. Furthermore, we scale the quadratic term by 1/N to balance the squared terms in the case of large N. With this setting, we rewrite the quadratic terms of the Hamiltonian defined in (10) and define J = λ+ b+ + λ− b− and h± appropriately. Furthermore, for historical reasons, we introduce h− = 0 and drop terms independent of x. Therewith, we obtain the standard form of the model on the state space V N [25]. We can still reduce the number of parameters from four (N, J, h+, h−) to three (N, J, h) with h = h+ − h− as the additive constant, which, apparently, can again be dropped. Last, we check the existence of the strong and the weak effects limit. The strong effects limit can be derived trivially, simply by replacing the binomial coefficient by N H2(x), cf. (18). The binary entropy introduces logarithmic terms of order N into the Hamiltonian H(x). As the observables of the Curie-Weiss model consist of polynomial terms, they cannot cancel these logarithmic terms, such that the Hamiltonian always incorporates nontrivial terms of order O(N). In a proper weak effects limit, however, terms of this order are not present. Thus, a weak effects limit for the Curie-Weiss model is not possible (cf. Remark 2). This will be different for the other models we shall discuss next.
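The interplay of the binomial coefficient and the quadratic coupling can be illustrated numerically. The following is a minimal sketch that assumes the textbook magnetization form of the Curie-Weiss weights, weight(k) = C(N, k) exp(N (J m²/2 + h m)) with m = 2k/N − 1; this parametrization, as well as the function names `curie_weiss_pmf` and `modes`, are assumptions for illustration and may differ from the paper's convention by constants. In this parametrization, and for large N, the symmetric distribution is expected to turn bimodal once J exceeds 1.

```python
import math


def curie_weiss_pmf(N, J, h=0.0):
    """Boltzmann probabilities on k = 0..N (k = number of +1 opinions).

    ASSUMPTION: textbook magnetization form,
    weight(k) = C(N, k) * exp(N * (J * m**2 / 2 + h * m)), m = 2k/N - 1.
    """
    log_w = []
    for k in range(N + 1):
        m = 2.0 * k / N - 1.0
        log_binom = (math.lgamma(N + 1) - math.lgamma(k + 1)
                     - math.lgamma(N - k + 1))
        log_w.append(log_binom + N * (0.5 * J * m * m + h * m))
    shift = max(log_w)  # subtract the maximum to avoid overflow
    w = [math.exp(v - shift) for v in log_w]
    Z = sum(w)
    return [v / Z for v in w]


def modes(pmf):
    """Indices of strict local maxima of a probability mass function."""
    out = []
    for k, p in enumerate(pmf):
        left = pmf[k - 1] if k > 0 else -1.0
        right = pmf[k + 1] if k < len(pmf) - 1 else -1.0
        if p > left and p > right:
            out.append(k)
    return out


weak_coupling = curie_weiss_pmf(100, 0.5)    # below the critical coupling
strong_coupling = curie_weiss_pmf(100, 1.5)  # above the critical coupling
print(modes(weak_coupling), modes(strong_coupling))
```

The entropy term (the binomial coefficient) favors balanced vote shares, while the coupling J favors consensus; the number of modes reports which of the two dominates.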

Two Flavors of the q-Voter Model
For the q-voter model, we do not start with the stationary distribution but with the transition rates. Perhaps the simplest extension of the voter model in which no opinion can die out is the zealot model, where we have N± zealots for the opinion ±1. Palombi and Toti ([24] p. 337) call zealots "stubborn agents [...] who never change political preference". In our case, zealots are not real persons but represent sources of information that stand for a specific opinion. These could be politicians, friends, newspapers, or social media channels. As in the voter model, individuals copy their opinion from a randomly chosen person, now also from the zealots. In the corresponding Glauber family, the case λ+ = λ− > 1 is the q-voter model for a homogeneous population [27]. In the case of λ± = 1, we are back in the zealot model: a person simply copies the opinion of another person or a zealot. If λ± > 1, the model can be interpreted as follows: the person will ask λ± other persons for their opinion and will only change her opinion if all these other persons have the identical opinion.
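The verbal description above can be turned into a small numerical sketch. The rates used here are an assumption chosen to match that description, not a transcription of the paper's formulas: an individual of opinion −1 switches at a rate proportional to f_plus(k)^λ+ with f_plus(k) = (N+ + k)/(N + N+ + N−), and symmetrically for opinion +1. All function names are hypothetical. Under these assumed rates, the case λ± = 1 reproduces a beta-binomial distribution with parameters N+ and N−, which the sketch uses as a cross-check.

```python
import math


def stationary_distribution(N, n_plus, n_minus, lam_plus=1.0, lam_minus=1.0):
    """Stationary law of k, the number of +1 supporters among N individuals.

    ASSUMED birth-death rates (an illustration, not copied from the text):
      k -> k + 1 at rate (N - k) * f_plus(k) ** lam_plus
      k -> k - 1 at rate k * f_minus(k) ** lam_minus
    Requires n_plus > 0 and n_minus > 0 (at least one zealot per opinion).
    """
    total = N + n_plus + n_minus

    def up(k):
        return (N - k) * ((n_plus + k) / total) ** lam_plus

    def down(k):
        return k * ((n_minus + N - k) / total) ** lam_minus

    # Detailed balance: pi[k + 1] / pi[k] = up(k) / down(k + 1).
    log_pi = [0.0]
    for k in range(N):
        log_pi.append(log_pi[-1] + math.log(up(k)) - math.log(down(k + 1)))
    shift = max(log_pi)
    weights = [math.exp(v - shift) for v in log_pi]
    norm = sum(weights)
    return [w / norm for w in weights]


def beta_binomial_pmf(N, a, b):
    """Beta-binomial probabilities on k = 0..N, used only as a cross-check."""
    def log_beta(x, y):
        return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

    return [
        math.exp(
            math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)
            + log_beta(k + a, N - k + b) - log_beta(a, b)
        )
        for k in range(N + 1)
    ]


zealot = stationary_distribution(50, 3, 5)              # lambda = 1
q_voter = stationary_distribution(50, 2, 2, 3.0, 3.0)   # lambda = 3
```

With λ± = 3 and two zealots per side, the balanced state k = 25 becomes a local minimum of the stationary distribution in this sketch, which illustrates how asking several persons for a unanimous opinion favors polarized outcomes.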

q-Voter Model-Strong Effects
We consider the strong effects limit, N± = η± N; that is, the number of zealots scales linearly with the population size, where η± are the proportionality constants. Then f±(x; N) becomes independent of N, and f±(x; N) ≡ f±(x). Therewith, we obtain the limiting observables (up to a constant C) and the stationary distribution in the strong effects limit, φ(x) = Z^−1 exp(−H(x)). In the Hamiltonian, we again made use of the fact that we are allowed to drop constant terms.

q-Voter Model-Weak Effects
We now turn to the weak effects limit for the q-voter model. The basis is the zealot model with f±(k; N) = (N± + k)/(N + N+ + N−). For the weak effects limit, the number of zealots N± does not scale with the population size, so zealots become rare if N becomes large. We can choose the time units, and thus have a degree of freedom in the form of a multiplicative positive constant. This freedom can be used to replace the original denominator N + N+ + N− by N and work with f±(x; N) = x + N±/N. That is, for our particular choice, the functions g±(x) = N± are independent of x. Recall that we also expand the Lagrangian parameters in terms of N and write λ± = 1 + κ±/N. To obtain the weak effects limit, we note that G±(x) = ∫^x g±(y)/y dy = N± ln(x) and obtain the Hamiltonian together with the corresponding stationary distribution. The stationary distribution becomes a beta distribution in the case of the zealot model (κ± = 0); this result was first derived in the context of population genetics, where the zealot model is termed the Moran model ([33] page 108). The extension to the weak effects limit of the Glauber family presented here is novel.
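The step from the zealot rates to the anti-derivative G± can be written out explicitly. This is a short worked sketch under two assumptions: x = k/N, and the weak effects scaling has the form f±(x; N) = x + g±(x)/N, so that the voter model f±(x) = x is recovered for large N.

```latex
% Assumptions: x = k/N; denominator N + N_+ + N_- replaced by N;
% weak effects scaling f_\pm(x;N) = x + g_\pm(x)/N.
\begin{align}
  f_\pm(x;N) &= \frac{N_\pm + N x}{N} = x + \frac{N_\pm}{N}
     \quad\Longrightarrow\quad g_\pm(x) = N_\pm, \\
  G_\pm(x) &= \int^{x} \frac{g_\pm(y)}{y}\,dy = N_\pm \ln(x), \\
  e^{G_\pm(x)} &= x^{N_\pm}.
\end{align}
```

The last line shows why power-law factors of the kind found in a beta density appear: exponentiating a term N± ln(x) yields x raised to the number of zealots.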

Reinforcement Model-Weak Effects
We add one more model, which also allows for a strong as well as a weak effects limit and which is, like the q-voter model, a descendant of the zealot model: the reinforcement model [21]. The idea of the reinforcement model is to express the psychological mechanisms that lead to filter bubbles and echo chambers: several kinds of cognitive biases let individuals communicate with persons of the opposite opinion with less awareness than with individuals of their own opinion. Some interactions with the opposite group are ignored. In that, the effective size of the opposite group is reduced by a factor θ± ∈ (0, 1], and the zealot model is modified accordingly. We focus on the weak effects limit and hence keep (as in the weak limit of the q-voter model) N± independent of N. Furthermore, we choose θ± = 1 − ϑ±/N such that we return to the voter model if N becomes large; the Lagrangian parameters are taken to be λ± = 1 and are not scaled. It turns out that the computations assume a simpler form (additive constants will vanish below in the first order term of the expansion) if we use a time scale such that a multiplicative term N+/N + 1 + N−/N appears in the rates.
Therewith, we obtain the expansion of the rates with respect to 1/N and, consequently, the stationary distribution (we use Equation (23) with κ± = 0). If we compare the stationary distribution of the weak effects q-voter model and the weak effects reinforcement model, we find a striking similarity: in both cases, the measure φ(x) is an adaptation of the beta distribution governed by parameters a± and a function ζ(x), where a± = κ± and ζ(x) = −x(1 − ln(x)) in the q-voter case, while a± = ϑ±/2 and ζ(x) = −x(2 − x) in the reinforcement case. As both functions ζ(x) resemble each other in that ζ(0) = 0, ζ(1) = −1, and both are convex, we expect very similar behavior for the two models if we take ϑ± = 2κ± (also inspect Figure 1).
Figure 1. Phase transitions of the four models in the symmetric case. For a given parameter (x-axis), the density of the distribution is indicated (by the heat and contour plot) over x (y-axis). The blue lines indicate local maxima (solid lines) and local minima (dashed line) if the parameter (given at the x-axis) is fixed, while the dot marks the phase transition. (Curie-Weiss: h± = 0; strong q-voter: η± = 5 and N = 20; weak q-voter: N± = 10, κ± = κ; weak reinforcement: …)
As a last remark, we note that we allow, in the weak q-voter model and the weak reinforcement model, not only positive parameter values κ± and ϑ± but also negative values. While positive values for these parameters lead to filter bubbles and echo chambers (a person hesitates to change her mind), negative values have the interpretation that the person is open-minded and pays particular attention to the opposite opinion. For the functioning of democracies and their institutions, open-minded people are much preferable because they find compromises for solving political problems more easily.
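The resemblance of the two functions ζ(x) compared above can be checked numerically. The sketch below evaluates both formulas as printed, using the limit value 0 for x ln(x) at x = 0, and tests convexity through second differences on a grid. The function names are hypothetical. Evaluated this way, both functions take the value −1 at x = 1.

```python
import math


def zeta_q_voter(x):
    """zeta(x) = -x * (1 - ln x), extended by its limit 0 at x = 0."""
    return 0.0 if x == 0.0 else -x * (1.0 - math.log(x))


def zeta_reinforcement(x):
    """zeta(x) = -x * (2 - x)."""
    return -x * (2.0 - x)


def is_convex_on_grid(f, n=200):
    """Check non-negative second differences on an equidistant grid in [0, 1]."""
    ys = [f(i / n) for i in range(n + 1)]
    return all(ys[i - 1] - 2.0 * ys[i] + ys[i + 1] >= -1e-12
               for i in range(1, n))


grid = [i / 200 for i in range(201)]
max_gap = max(abs(zeta_q_voter(x) - zeta_reinforcement(x)) for x in grid)

print(zeta_q_voter(1.0), zeta_reinforcement(1.0))
print(is_convex_on_grid(zeta_q_voter), is_convex_on_grid(zeta_reinforcement))
print(max_gap)
```

The two functions share their end-point values and convexity and stay close to each other on the whole interval, which is consistent with the expectation that the two weak effects models behave similarly for ϑ± = 2κ±.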

Model Behavior
We will not go deeper into the analysis (which can be found, e.g., for the Curie-Weiss model in [25,26], for the q-voter model in the strong effects limit in [27], and for the reinforcement model in [21]), but simply refer to Figure 1, which shows that all four models undergo a phase transition if the coupling between the individuals is sufficiently large. We emphasize that the voter and zealot models themselves do not allow for phase transitions, whereas the models built on them (the two kinds of q-voter model and the reinforcement model) do exhibit phase transitions. The mechanisms modifying the effects of zealots target in/outgroup communication. If in/outgroup communication is sufficiently strong, a bimodal distribution appears via a phase transition. Under non-symmetric conditions (parameters), too, all four models behave similarly.
It is interesting to note that the Curie-Weiss and the strong effects q-voter model incorporate the population size N explicitly, while the weak effects limits of the q-voter and the reinforcement model become independent of N. Usually, in applications, N is very large, and if we naively take N to be the population size, the variance generated by the model is much smaller than the variance that is present in empirical data. The way out is to assume that individuals cluster together and to estimate an effective population size N = N_eff along with the other parameters, which, of course, is slightly dubious but pragmatic [26]. The weak effects models elegantly circumvent this difficulty.

Data Analysis
We use data from four different Western democracies that represent different types of government and electoral systems (see [39] pp. 145-161, 271, [40], and Table 1 for further details). Concerning the governmental system, we have one presidential (US), one semi-presidential (France), and two parliamentary systems (UK, Germany). In the case of the electoral systems, we have two majority systems with a first-past-the-post design and relative majority (US, UK) and one majority system in France with a two-round system and absolute majority. In Germany, we have a mixed member proportional system that combines both a first-past-the-post vote and proportional representation. Finally, for each country, we study different numbers of elections (US: six presidential elections, 2000-2020; UK: 20 parliamentary elections, 1945-2019; FRA: second round of five presidential elections, 2002-2022; GER: two parliamentary elections, 2017-2021; the data sources are indicated in the data availability statement). Such a design is useful in understanding how the models used can explain the dynamics in different institutional settings with different political cultures and in varying periods of time. This comparative approach gives us more information about the functioning of the mechanisms in different contexts (e.g., [12,13]) and contributes to the existing research often based on single-case studies (e.g., [24,41]). In the analysis, we consider each election district as an i.i.d. repetition of the election. Herein, we obviously reduce the complexity of the data by neglecting social co-factors and spatial effects. In this way, we obtain an empirical distribution of vote shares and can use a maximum likelihood estimator. The technical details, particularly the algorithm used to perform the maximum likelihood estimation, can be found in Appendix A.
The present approach to data analysis is based on a steady state assumption, that is, the opinion formation process is assumed to be approximately in equilibrium.If there is a huge shift in the vote share of a candidate or party in recent times, this steady state assumption might not be met.The tables with estimates, p-values for the Kolmogorov-Smirnov test, and AICs can be found in Appendix A.
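The two criteria used for model assessment can be sketched in a few lines. This is an illustration of the standard definitions (Kolmogorov-Smirnov distance between the empirical and a model distribution function; AIC = 2k − 2 ln L), not the estimation algorithm of Appendix A. The function names, the toy vote shares, and the two toy candidate models (a uniform density and the density 2x on [0, 1]) are made up for this example and carry no empirical meaning.

```python
import math


def ks_statistic(samples, cdf):
    """Kolmogorov-Smirnov distance between the empirical CDF and a model CDF."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        F = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x.
        d = max(d, i / n - F, F - (i - 1) / n)
    return d


def log_likelihood(samples, pdf):
    """Log-likelihood of i.i.d. samples under a model density."""
    return sum(math.log(pdf(x)) for x in samples)


def aic(log_lik, n_params):
    """Akaike information criterion; smaller values indicate a better trade-off."""
    return 2.0 * n_params - 2.0 * log_lik


# Hypothetical vote shares of one candidate in five districts (toy data).
shares = [0.35, 0.48, 0.52, 0.61, 0.70]

# Two toy candidate models without free parameters.
models = {
    "uniform": (lambda x: 1.0, lambda x: x),
    "linear": (lambda x: 2.0 * x, lambda x: x * x),
}

for name, (pdf, cdf) in models.items():
    ll = log_likelihood(shares, pdf)
    print(name, round(ks_statistic(shares, cdf), 3), round(aic(ll, 0), 3))
```

In an actual analysis, the model densities would be the stationary distributions of the four opinion models with parameters fitted by maximum likelihood, and the number of fitted parameters would enter the AIC.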
United States data: The densities of the four models (Curie-Weiss, weak and strong q-voter model, and reinforcement model) are rather similar (Figure 2a), and the Kolmogorov-Smirnov tests also resemble each other (see table in Appendix A.1). Only in the year 2000 were the p-values of this test small (between 0.018 and 0.04); in that year, apart from the Democrats and the Republicans, the Green candidate won a small but non-negligible fraction of votes, such that the dichotomous models might not be completely suited. In 2016, we also have about four percent of third-party votes, but at that point, the models fit better. This may be due to the fact that polarization in American society had grown during the preceding 16 years. The AIC selects the (weak) reinforcement model as the best-suited model in every year except 2020, when the weak q-voter model fits best; in that year, however, the reinforcement model and the weak q-voter model are very close. The strong models always perform worse (Figure A1). We clearly find a trend in the parameters, which shows that the model moves over time more and more towards a phase transition.
United Kingdom data: The trend in the reinforcement parameters/coupling is particularly interesting (Figure 2d). As is clearly visible in the empirical and the estimated distributions for the 2015 election, the UK indeed became super-critical: we observe a bimodal distribution in 2015. It is most interesting to see that the theoretical prediction of possible phase transitions is realized in the UK.
France data: Particularly in 2022, the empirical distribution of vote shares is skewed.The strong effects models have difficulties dealing with this result, while the weak effects models are more flexible; in particular, the reinforcement model still performs well.
German data: As we have a proportional electoral system, the dichotomous model requires adaptation: for each party, we distinguish between the votes in favor of this party and the votes for all other parties. In Figure 3, we find the reinforcement parameters together with the coupling J of the Curie-Weiss model for those parties in the 2017 and 2021 elections that received at least 5% of the votes (we disregard the CSU, as this is a Bavarian local party that only stands for election in a few election districts). If we focus on the best models (lowest AIC), these models are indeed able to meet the empirical vote share distributions quite well (Kolmogorov-Smirnov test), with only one exception: the left-wing Left Party (Die Linke) in the 2017 election. For historical reasons, this party performs very differently in the federal states of the former East and West Germany, respectively. Though these historical effects can also be understood to be based on opinion dynamics and in/out-group behavior, all models have difficulties in capturing this data structure. The assumption of a homogeneously mixed population may no longer be appropriate; instead, a two-island model would capture the communication structure better. The (relatively recent) right-wing party AfD, which also performs rather differently in the two regions (former East/West Germany), likewise poses a problem for all models but the reinforcement model, at least according to the Kolmogorov-Smirnov test. However, the difference between the two regions is less pronounced than in the case of the Left Party, which might be the reason why the reinforcement model is still able to handle the vote share data of the AfD.
Figure captions (fragments): … 2022), together with the probability density for the four models. Left: Histogram of ϑ± and J (together with critical level, orange) for the parties in 2017 and 2021 that hold more than 5% of the votes (threshold), apart from the CSU, which is a Bavarian local party.

Summary of the estimations:
In almost all elections, at least some of the models describe the empirical data adequately (according to the Kolmogorov-Smirnov test).If we compare the performance of the models according to the AIC, we find that the weak effects models outperform the strong effects models, and the weak effects reinforcement model is superior to the weak effects q-voter model (Figure 4).It is interesting that although the structures of the weak effects models are very similar, we nevertheless find a difference in their suitability for practical applications.

Abbreviations: CW, Curie-Weiss; SqV, strong effects q-voter; WqV, weak effects q-voter; Re, reinforcement.
We should keep in mind that we work with aggregated data (only the outcome of elections in election districts that typically have a population of hundreds of thousands of individuals). We might consider the model parameters as a reduction of the data complexity to a low-dimensional parameter space, which allows us to better interpret the data. As the Kolmogorov-Smirnov test indicates the appropriateness of the models, we can be confident that the models capture at least some fundamental structure in the data. Hence, the interpretation of the parameters suggested by the models should be appropriate.

Political Science Interpretations
United States: Looking at the parameters, which measure the actual strength of reinforcement in the reinforcement model, we can see that since the year 2000, there has been a continuous trend towards much more polarized voting behavior in the United States. For the voters of the Republican Party, the switch could already be seen in 2008 with the election of Barack Obama, and it continued with his re-election, while it became dominant in 2016 and 2020, when Donald Trump was the candidate of the Republican Party. The voters of the Democratic Party clearly also changed from open- to closed-mindedness in the political realm. In fact, what we can see is that during those 20 years, the political discourse in the US public arena became so polarized that both the parties and their voters are now becoming more and more separated from each other. For example, Binder [42] shows that the frequency of legislative gridlock has risen since the 1990s, and other studies show that polarization in the US citizenry is not going down [43].
Great Britain: Due to the electoral system (first-past-the-post) and the governmental system (parliamentary), it is clearly useful for the two dominant parties, the Labour Party on the left and the Conservative Party (Tories) on the right, to keep a certain or even a high degree of polarization so that they can form the government on their own. The bars of the reinforcement model show this clearly for Labour for almost all elections since 1945, but for the Tories, this strategy started only in the 1970s and has been very dominant since the second election in 1974 and the Thatcher years, 1979-1990. Additionally, we can see that in 2019, the reinforcement parameters were not as strong. This might be the result of the Brexit decision in 2016 and its political aftermath, in which the Conservatives became the party of the Leavers (in support of Brexit) and the Remainers split between Labour and the Liberal Democrats [44].
France: The French party system has undergone political change since 2017, and some already argue that we may be seeing the rise of a new French party system [45]. In 2017, the presidential candidates of both the traditional left-wing party (Parti Socialiste) and the traditional right-wing party (Les Républicains) were eliminated in the first round of the presidential elections [46]. This happened again in 2022, when Emmanuel Macron of the centrist "La République En Marche" and Marine Le Pen of the renamed right-wing populist "Rassemblement National" were, for the second time, the political opponents in the second round of the election [47]. That said, we can see in Figure A3 that, for the previous elections, the models explained little for the second round.
Germany: Here, we can see that the right-wing populist party Alternative for Germany (AfD) was able, in both elections, to create its own space for resonating with its voters. In 2017, the Social Democrats (SPD), the Left Party, and the Greens were able to mobilize their voters against the AfD, but in 2021, it was the SPD and the liberal party (FDP). Compared to the left parties, the FDP tried to position itself as the party of freedom, and some parts of the party also raised their critical voice against the political measures used during the COVID-19 pandemic by the former government of the two conservative parties CDU and CSU and the Social Democrats. This can be observed in their parliamentary work, where they used so-called "Kleine Anfragen" to question the government parties more than any of the other opposition parties in the German Bundestag [48]. Therefore, they were a competitor both to the AfD and to the other left and center parties and also gained a lot of support from young voters [49].

Discussion of the Findings and Their Interpretation in Political Sciences
In the present paper, we first discussed the connection between Potts models and stochastic opinion models, both for well-mixed populations. In particular, we provided an alternative approach to the q-voter model as a natural extension of the zealot model in the view of Glauber dynamics. Subsequently, motivated by similar constructions in population genetics, we introduced a strong and, in particular, a weak effects continuum limit. While the strong continuum limit is generically possible (and locally approximates a normal distribution), the weak effects limit requires additional structure. In that, the Curie-Weiss model only allows for the strong limit, while the q-voter model has both limits. Afterwards, we additionally introduced the reinforcement model, which has its foundation not in the Potts machinery but is derived from considerations of the impact of several kinds of cognitive biases on communication, especially on the resulting in/out-group communication strategies. That model also allows for both continuum limits, and it turned out that the weak q-voter and the weak reinforcement model are very similar in their mathematical structure. Basically, both are an adaptation of the beta distribution, which is the well-known weak effects limit of the zealot (or Moran) model [33]. We also found that only models that are based on the voter model allow for a weak effects limit, which indicates that these models are in some sense special.
After these theoretical considerations, we turned to testing the models on election data. Herein, we used each election district as an i.i.d. repetition of the election, neglecting social co-factors, which vary between election districts, as well as spatial factors. We also assumed the opinion process to be approximately in equilibrium such that the stationary distribution is an adequate description of the data; in the case of a large shift in the vote share of a candidate or party, this assumption can be called into question. Though the approach was simple, we mostly found that the models meet the data quite well, where the weak models performed better than the strong models, and the weak reinforcement model outperformed the weak q-voter model. It is interesting that the weak effects models seem to be better suited to describe the data appropriately. The comparison of models and data is always challenging, but population genetics, where weak effects such as weak selection are often used, also seems to support the view that this kind of model is well suited for real world applications [50]. The background could be that striking and immediately disruptive events are rare. Most stimuli are weak and require time to unfold their effect. If this observation is correct, weak effects models with their slow time scale might indeed be a better description of reality than strong effects models with a fast time scale. As a practical consequence, we propose to use weak effects rather than strong effects opinion models in empirical studies, which also has the advantage of not needing to choose an appropriate population size, which is a well-known problem in itself [26,33]. Though the models allowing for a weak effects limit are rather special, they seem to be a powerful description of reality and still have sufficient flexibility to address different mechanisms and different real world (electoral) systems.
Elections are at the center stage of modern representative democracies. Correspondingly, research on elections and attempts to explain the formation of their results are also central. Prominent and established approaches use statistical data concerning the social characteristics of voters to determine their voting behavior (e.g., [51]). However, there is an ongoing debate among scholars, according to which the correlation between social characteristics and voting behavior has diminished over the last few decades (e.g., [52][53][54]). Additionally, party membership is in decline, which also has consequences for voter turnout and voting behavior (e.g., [55,56]). As Clarke et al. [57] bluntly declared, for understanding electoral choice, one has to look elsewhere. It is not that the classic approaches have lost all their explanatory power, but it makes sense to look for explanations that are less context-dependent.
By focusing on the dynamics of opinion formation preceding the act of voting, the models discussed in this paper promise insights both into the empirical explanation of elections as such, as well as important aspects of the theory of democracy.
Our leading assumption, ensuring a larger independence of specific social contexts, is opinion formation via frequently contacting social sources of information, constituting a ubiquitous mechanism of collective decision-making.For sure, this assumption also holds for elections.The sources of information here may be real persons or media of all kinds.Pamphlets, newspapers, magazines, radio, TV, and the diversity of social media have accompanied political discussion since the early modern period.Albeit the basic mechanism is taken to be the same everywhere, its effects may be modulated by the impact of or the interaction with other mechanisms of other sections within an "organized complexity".Electoral systems in their specific forms are nested within the broader construction of a political system.The effects of opinion formation processes concerning elections are shaped by specific institutional settings as well as the political culture of the respective countries.
The model's reduction to only two opinions may look like too crude a simplification. But it is not as implausible as it may appear at first glance. As Denver and Johns [58] stress, when preparing their decisions, voters don't "sit down before an election to comb through the parties' manifestos and make detailed calculations of the costs and benefits of voting for each party. (...) Such a process is neither realistic nor particularly rational ([58] p. 294)". Rather, voters base their decision on just a few subjects dominating the discussion. "Issue Voting" and "Valence Voting" denominate the approaches based on that assumption. Where Issue Voting stresses the main issues of the election campaign like economic questions or social policy, Valence Voting focuses, for example, on the performance of the incumbent government. "Neither involves complex calculations; indeed, the simpler versions of both approaches have fared better when confronted with the empirical evidence (ibid.)". And both aspects fare better than explaining results with respect to voters' social characteristics. Therefore, looking at opinion and opinion formation in this simple form offers a promising starting point to delve deeper into campaigning and voting.
In this view, the models in their present form may be interpreted as presupposing a stage of the election process where the main issues are already settled. From here, the model may be enhanced by stepwise nesting the basic opinion formation process within a whole set of similarly modeled ones. The determination of the salient issues and valences may be the next step. The party that succeeds in putting its topics in place may be at an advantage. A variant of the Issue Voting approach stresses that it is not only the preferences on the specific topics of the current election that affect the voter's decision, but also general values and principles ([58] p. 295). We can think of processes concerning ideological backgrounds running on a larger time scale spanning two or more elections, affecting the probability with which a voter makes up their mind. Another perspective would look at the developments within the zealots in particular. This would mean looking at the development of party programs and strategies, also in the form of opinion formation among party members. The possibilities are manifold.
The reinforcement model in particular also highlights important aspects concerning the theory of democracies.Especially in the liberal tradition of democracy, it is a common view to interpret campaigning and elections as a market analogous competition, where votes are exchanged for programs and personnel (see [59]).Competition appears as a form of regulated and limited conflict.The opponent's purpose is not to harm the antagonist, but only to be better.The idea behind this is that the aspiration to trump the adversary leads to the advancement of a common good, qualitatively better or cheaper products and processes in economy, better theories and methods in science, and better programs and personnel in politics.
The zealot model, as a predecessor of the reinforcement model, was used before in economics for market analysis [60].But behind this application lies a model designed to explain foraging processes of ant colonies [61].It describes a form of collective information processing against an uncertain environment.Time and again the colony has to leave an established feeding ground and look for another in time.It is inspiring to see the similarities.Political communities also have to alter their processes and organization because of altering circumstances, for example, transforming their way of living to a more sustainable way.In this way, an open political process, defining problems and looking for solutions, is a form of collective information processing, too.This idea was emphasized by Karl Popper [62], for example, and further developed by John Dryzek [63].It is not implausible to assume that a part of the success of democracies in general is their dealing with the world's shakiness in an analogous way to modern science.
Whatever makes the workers of the ant colony change their paths, the driving force behind the parliamentary process, at least from the perspective of the liberal standard model of democracy, is party competition.Since party competition is itself affected by special interests and personal ambition of politicians, democracies need additional features to balance these forces and bring the wanted effects of the competition to the fore.This has been part of the considerations from Harrington [64] to Tocqueville [65] to Dewey [66] but shall not be the point here.
What the reinforcement model enables us to see is the possible polarization between the opponents. It is important to note that polarization indicating a higher grade of conflict is not a problem per se, nor are higher grades of conflict. As the sociologist Lewis Coser (1967) [67] argued, conflicts point in the first place to societal problems and urge society to deal with their causes. If the community productively addresses that challenge, society reintegrates on a new level. We could see such effects, for example, in the course of the environmental movement in the 1970s to 1990s in Germany. The polarization on the side of the Greens was high in the beginning, when they cracked open the consensus of the established parties on the use of nuclear power, and became lower again when environmental issues were successfully established on the agenda.
Polarization may become problematic when the reinforcement effects are strong on both sides of the debate. Conflicts can be disruptive, too. Looking at the polarization in the US, the American philosopher of law Ronald Dworkin was already asking in 2006 [68]: "Is Democracy possible here?" The polarization in the US has not declined since then (Pew Research Center 2022, [43]). Polarization may also become problematic when actors show no tendency toward the consent or compromise that would enable reintegration. This appears to be the case with the populist movements and parties of the last decade. On the other hand, party polarization may generate stronger party attachments, so polarization could also be a desirable strategy for political competitors ([69] p. 350). Here is probably the point where political scientists (at least at the moment) also have to reach for methods other than mathematical modeling. Qualitative analysis of texts or focus groups may be an appropriate means here. However, it should be noted that the reinforcement model supplies us with a strong indicator of an important variable of political processes. Since the claim that a society is polarized is often used in an alarmist way that impedes compromise, it is all the more helpful to have this indicator. It should further be noted that, because of its context independence, the model will be useful when we look not only at the well-established democracies of the West but also at young ones or democracies in other world regions.

Appendix A.3. France Data
This is the second round of the presidential election, and we considered the vote share of the winning candidate. The performance of the AICs for the models is shown in Figure A3.

Appendix A.4. German Data
We include all parties that reached a vote share of at least 5%, except the CSU, which is a local party and thus only stands for election in a few districts. Note that "Die Linke" is present in the parliament of 2021 although this party did not reach 5%; we therefore excluded this party in 2021. In order to fit our dichotomous model, we focused on the vote share of the focal party, essentially distinguishing between supporters of this party and supporters of any other party. The performance of the AICs for the models is shown in Figure A4.
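
The reduction to a dichotomous outcome described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the code used for the study: the function name `focal_share` and the district counts are hypothetical, and the share is taken relative to all votes recorded for the district.

```python
def focal_share(votes, focal):
    """Share of the focal party among all votes cast in one district.

    `votes` maps party names to vote counts (hypothetical input format).
    Every other party is lumped together as "any other party".
    """
    total = sum(votes.values())
    if total <= 0:
        raise ValueError("district has no votes")
    return votes[focal] / total


# Made-up district counts, for illustration only (not election data).
district = {"A": 400, "B": 350, "C": 250}
share_a = focal_share(district, "A")
```

Applying such a function to every district yields one vote share per district, which is the kind of quantity the histograms and the fitted densities refer to.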

Figure 1. Phase transitions of the four models in the symmetric case. For a given parameter (x-axis), the density of the distribution is indicated (by the heat and contour plot) over x (y-axis). The blue lines indicate local maxima (solid lines) and local minima (dashed line) if the parameter (given at the x-axis) is fixed, while the dot marks the phase transition. (Curie-Weiss: h ± = 0; strong q-voter:

Figure 2. (a,b): Histograms of the vote shares for (a) the US presidential elections in 2020 and (b) the parliamentary elections in the UK in 2015, together with the probability density for the four models. (c,d): Bars indicate the reinforcement parameters ϑ ± for (c) Democrats and Republicans, and (d) the Conservatives and the Labour Party (left axis); orange bullets indicate parameter J of the Curie-Weiss model (right axis), while the horizontal dashed orange line indicates the threshold for the phase transition of that model for h = 0.

Figure 3. Right: Histogram of the vote shares for the presidential elections in France (2022), together with the probability density for the four models. Left: Histogram of ϑ ± and J (together with critical level, orange) for the parties in 2017 and 2021 that held more than 5% of the votes (threshold), apart from the CSU, which is a Bavarian local party.

Figure 4. Performance of the four models. For the 42 elections we consider in the present paper, we indicate the percentage of elections for which each model performed best according to the AIC (yellow) or acceptably (AIC within a distance of at most 2 from the best model).
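
The classification summarized in Figure 4 can be illustrated by a short sketch. It is not the authors' implementation: the function names, the label "other", and the AIC values are hypothetical, and we assume here that "acceptable" refers to models that are not the best one but lie within 2 AIC units of it.

```python
def classify_by_aic(aics, tol=2.0):
    """Label each model 'best', 'acceptable', or 'other' by its AIC gap to the minimum."""
    best = min(aics.values())
    labels = {}
    for name, aic in aics.items():
        if aic == best:
            labels[name] = "best"
        elif aic - best <= tol:
            labels[name] = "acceptable"  # assumption: excludes the best model itself
        else:
            labels[name] = "other"
    return labels


def share_per_label(elections, label):
    """Percentage of elections in which each model received the given label."""
    counts = {}
    for aics in elections:
        for name, lab in classify_by_aic(aics).items():
            counts[name] = counts.get(name, 0) + (lab == label)
    return {name: 100.0 * c / len(elections) for name, c in counts.items()}


# Hypothetical AIC values for two elections (not taken from the paper).
elections = [
    {"Curie-Weiss": 105.0, "strong q-voter": 110.0,
     "weak q-voter": 101.5, "reinforcement": 100.0},
    {"Curie-Weiss": 98.0, "strong q-voter": 120.0,
     "weak q-voter": 99.0, "reinforcement": 103.0},
]
best_pct = share_per_label(elections, "best")
```

Only differences in AIC between models fitted to the same election enter the classification, so the absolute level of the values plays no role.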

Table 1. Design of the study: data set used.
We investigated the vote share of the Conservatives among the Conservative and Labour votes. The performance of the AICs for the models is shown in Figure A2.

+ /N + h/η − /N − λ + /κ + /ϑ + λ − /κ − /ϑ − N

Comparison of the models' AICs for the UK.