Models of Strategic Decision-Making under Informational Control

Abstract: A general complex model is considered for collective dynamical strategic decision-making with explicitly interconnected factors reflecting both psychic (internal state) and behavioral (external action, result of activity) components of agents' activity under the given environmental and control factors. This model unifies and generalizes approaches of game theory, social psychology, theories of multi-agent systems, and control in organizational systems by simultaneous consideration of both internal and external parameters of the agents. Two special models (of informational control and informational confrontation) contain formal results on controllability and properties of equilibria. Interpretations of the general model are conformity (threshold behavior), consensus, cognitive dissonance, and other effects with applications to production systems, multi-agent systems, crowd behavior, online social networks, and voting in small and large groups.


Introduction
What factors influence the decisions one makes? Each scientific domain gives its own answer, which is correct within the paradigm of that domain. For example, the theory of individual decision-making says that the main factor is the utility of the decision-maker. Game theory answers that it is the set of decisions made by others. Psychology says that it is a person's internal state (including beliefs, attitudes, etc.). Table 1 contains factors of decision-making (columns), scientific domains (rows), and the author's subjective expert judgment on the degree to which each domain takes each factor into account (conventionally reflected by the number of plus signs in the corresponding cell). Since all these domains are immense (but none of them explores a combination of more than two factors), references are given to several main books or representative survey papers.
In this paper, we consider a model of strategic collective decision-making that treats all of the factors listed in the columns of Table 1 on an equal footing. The model includes explicit interconnected parameters reflecting both psychic (state) and behavioral (action and activity result, see [1]) components of an agent's activity. Following the methodology proposed in [2], we study the mutually influencing dynamics of the agent's internal states, actions, and activity results, as well as the properties of the corresponding equilibria.
The first three groups of sources of informational influence are "passive." The fourth source of influence (control) is active, and several agents may affect a given agent; see the model of informational confrontation in Section 5 below.
In this paper, we introduce a general complex model of collective decision-making and control with explicit interconnected factors, reflecting both the psychic and behavioral components of activity. Some practical interpretations are conformity effects [10,11] as well as applications to production systems [25,27], multi-agent systems [23], crowd behavior [28], online social networks [29], and voting in small and large groups [9].
The main results are:
• The general model of decision-making, which embraces all the factors listed above that influence the decisions made by a strategic agent (see Figure 1 and Equations (1)-(3));
• Particular cases of the general model, reflecting many effects well known in social psychology and organizational behavior: consensus, conformity, hindsight, cognitive dissonance, etc.;
• Two models (of informational control and informational confrontation) and formal results on controllability and the properties of equilibria.
This paper is organized as follows. In Section 2, the general structure of the decision-making process is considered. In Section 3, the well-known particular models of informational control, conformity behavior, etc., are discussed. In Section 4, the simple majority voting model is used as an example to present the original results on the mutually influencing processes of the dynamics of the agent's states and actions (the psychic and behavioral components of activity) and the properties of the corresponding equilibria. Section 5 is devoted to the model of informational confrontation between two agents who simultaneously try to control (influence) a third one in their own interests.

Decision-Making Model
Consider a set $N = \{1, 2, \ldots, n\}$ of interacting agents. Each agent is assigned a number (subscript). Discrete time instants (periods) are indicated by superscripts. Assume that there is a single control authority (principal) purposefully affecting the activity of different agents by controls $\{u_i \in U_i\}$.
We introduce a parameter $r_i \in R_i$ (internal "state") of agent $i$, which reflects all his characteristics of interest, including his personality structure [1]. In applications, the agent's state can be interpreted as his opinion, belief, or attitude (e.g., his assessment of some object or agent), the effectiveness of his activity, the rate of his learning, the desired result of his activity, etc.
Let agent $i$ choose actions from a set $A_i$ of admissible ones. His action is denoted by $y_i \in A_i$. Once the agent chooses his action, the result of his activity is realized; it is denoted by $z_i \in A_{z_i}$, where $A_{z_i}$ is the set of admissible activity results of agent $i$. The agent's action and the result of his activity may mismatch due to uncertainty factors, including an environment with a state $\omega \in \Omega$ or the actions of other agents; see Figure 1.
The connection between the agent's action and the result of his activity may have a complex nature described by probability distributions, fuzzy functions, etc. [26]. For the sake of simplicity, assume that the activity result $z_i$ of agent $i$ is a given real-valued deterministic function $R_i(y_i, y_{-i}, \omega)$ that depends on his action, the vector $y_{-i} = (y_1, \ldots, y_{i-1}, y_{i+1}, \ldots, y_n)$ of actions of all other agents (the so-called opponents' action profile for agent $i$), and the environment's state $\omega$. The function $R_i(\cdot)$ is called the technological function [27,30].
Suppose that each agent always knows his state, and his action is completely observable for him and all other agents.
Let agent $i$ have preferences on a set $A_{z_i}$ of activity results. In other words, agent $i$ has the ability to compare different results of his activity. The agent's preferences are described by his utility function (goal function, or payoff function) $\Phi_i : A_{z_i} \times R_i \to \mathbb{R}^1$: under a fixed state, of two activity results, the agent prefers the one with the greater value of the utility function. The agent's behavior is rational in the sense of maximizing his utility.
When choosing an action, the agent is guided by his preferences and how the chosen action affects the result of his activity. Given his state, the environment's state, and the actions of other agents, agent $i$ chooses an action $y_i^*$ maximizing his utility:

$$y_i^*(y_{-i}, r_i, \omega) = \arg\max_{y_i \in A_i} \Phi_i\bigl(R_i(y_i, y_{-i}, \omega), r_i\bigr), \quad i \in N. \tag{1}$$

The expression (1) defines a Nash equilibrium of the agents' normal form game [8], in which they choose their actions once, simultaneously, and independently under common knowledge about the technological functions, utility functions, the states of different agents, and the environment's state [26].
(Whenever several factors appear simultaneously in a process or phenomenon, the corresponding arrows in a sequence are conventionally separated by commas.) Let us specify the decision-making model.

General Model
We introduce a series of assumptions. (Their practical interpretations are discussed below.) Assumption 1 is purely "technical": as seen in the subsequent presentation, many results remain valid for a more general case of convex and compact admissible sets.

Assumption 3. Under a fixed state r i of agent i, his utility function
Assumption 2 is more significant, as it declares the following. First, the activity result (collective decision) $z = R(y_i, y_{-i})$ is the same for all agents. Second, there is no uncertainty about the environment's state, so the dependence of the activity result (and the equilibrium actions of different agents) on the parameter $\omega$ is omitted. The agent's state determines his preferences, i.e., his attitude towards the results of collective activity. The vector of individual results of the agents' activity depending, among other factors, on the actions of other agents can be considered by analogy. This line seems promising for future research.
According to Assumption 3, the agent's utility function, defined on the set of activity results, has a unique maximum achieved when the result coincides with the agent's state. In other words, the agent's state parameterizes his utility function, reflecting the goal of his activity. (Recall that a goal is a desired activity result [3].) Also, the agent's state can be interpreted as his assessment, opinion, or attitude [1] towards certain activity results; see the terminology of personality psychology in [1].
Assumption 4 is meaningfully transparent: if the goals of all agents coincide, then the corresponding result of their joint activity is achievable.
The expression (1) describes an agent's single decision (single choice of his action). To consider repetitive decision-making, we need to introduce additional assumptions. The decision-making dynamics studied below satisfy the following assumption.

Assumption 5.
The agent's action dynamics are described by the indicator behavior procedure [26]:

$$y_i^t = (1 - \gamma_i^t)\, y_i^{t-1} + \gamma_i^t\, y_i^*\bigl(y_{-i}^{t-1}, r_i^t\bigr), \quad t = 1, 2, \ldots, \quad i \in N, \tag{2}$$

with given initial values $y_i^0$, $r_i^0$, $i \in N$, where $\gamma_i^t \in (0, 1]$ are known constants. The action $y_i^*(y_{-i}^{t-1}, r_i^t)$ is called the local (current) position of the goal of agent $i$. In each period, the agent makes a "step" (proportional to $\gamma_i^t$) from his current action to his best response (1) to the action profile in the previous period.

Mathematics 2021, 9, 1889

Assumption 6. The agent's state dynamics are described by the procedure (3).

Assumption 9. The nonnegative constant degrees of trust $(b_i, c_i, d_i, e_i)$ and the trust functions $B_i(\cdot)$, $C_i(\cdot)$, and $D_i(\cdot)$, $i \in N$, satisfy the condition:

Assumptions 7-9 guarantee that the state of the dynamic system (2) and (3) stays within the admissible set.
The constant weights $(b_i, c_i, d_i, e_i)$ can be interpreted as the attitude (trust) of agent $i$ to the corresponding information source, whereas $B_i(\cdot)$, $C_i(\cdot)$, $D_i(\cdot)$, and $E_i(\cdot)$ are the corresponding trust functions. The weight of the agent's own previous state (see the first term on the right-hand side of the procedure (3)) conditionally reflects the power of the agent's beliefs.
Note that, for unit values of the trust functions, the expression (3) also has a conditional probabilistic interpretation: with some probability, the agent does not change his state (opinion); with the probability $b_i$, the state becomes equal to the control; with the probability $c_i$, to his action; etc.
Let us present and discuss practical interpretations of the five terms on the right-hand side of the expression (3). According to (3), the state $r_i^t$ of agent $i$ in period $t$ is a linear combination of the following parameters:
I. his state $r_i^{t-1}$ in the previous period $(t-1)$ (see Figure 1);
II. his action $y_i^{t-1}$ in the previous period $(t-1)$ (arrow no. 6 in Figure 1);
III. the actions $y_{-i}^{t-1}$ and, generally, the activity results $z_{-i}^{t-1}$ of other agents in the previous period $(t-1)$ (arrows no. 11 and 9 in Figure 1, possibly indirect influence via the agent's activity result);
IV. the activity result $z^{t-1}$ in the previous period $(t-1)$ (arrow no. 7 in Figure 1);
V. the external impact (control) $u_i^t$ applied to him in period $t$ (arrow no. 1 in Figure 1).
Thus, the model (2)-(3) embraces both external (explicit) and internal (implicit) informational control of decision-making.
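For orientation, the five terms listed above suggest the following structure of the procedure (3). This is only a sketch under assumptions: the arguments of the trust functions are left unspecified, and the aggregate $G_i(\cdot)$ of the other agents' actions is a hypothetical placeholder rather than notation taken from the source.

```latex
% Assumed structure of (3): a convex combination of the five information sources I-V.
% G_i(.) is a hypothetical aggregate of the other agents' actions.
r_i^t = \bigl[1 - b_i B_i(\cdot) - c_i C_i(\cdot) - d_i D_i(\cdot) - e_i E_i(\cdot)\bigr]\, r_i^{t-1}
      + c_i C_i(\cdot)\, y_i^{t-1}
      + e_i E_i(\cdot)\, G_i\bigl(y_{-i}^{t-1}\bigr)
      + d_i D_i(\cdot)\, z^{t-1}
      + b_i B_i(\cdot)\, u_i^t
```

With unit trust functions and weights summing to at most one, the right-hand side is a convex combination of quantities from the admissible set, which agrees with the probabilistic interpretation given above.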
An example is the interaction of group members in an online social network. Based on their beliefs (states), they publicly express their opinions (assessments or actions) regarding some issue (phenomenon or process). In this case, the collective decision (opinion or assessment) may be, e.g., the average value of the expressed assessments (opinions). Some agents can apply informational control (without changing their states and actions); some honestly reveal their beliefs in assessments; some try to bring the collective assessment closer to their beliefs. The beliefs of some agents may "drift," depending on the current actions (both their own and other agents), control, and (or) collective assessment.
An equilibrium $y_i^*(a, \ldots, a) = r_i^* = a \in [0,1]$, $i \in N$, is called unified: the final decision and all states and actions of all agents are the same.
Under Assumptions 1-9, we have the following result.
Proposition 1. Let Assumptions 1-9 hold, and let all constant degrees of trust and trust functions be strictly positive. Without any control ($b_i = 0$, $i \in N$), the unified equilibrium is a fixed point of the dynamic system (2) and (3).
Indeed, substituting the unified equilibrium into the expressions (2) and (3), we obtain identities; in addition, the unified equilibrium satisfies (1) due to the properties of the utility function (see Assumption 3).
The unified equilibrium of the dynamic system (2) and (3) always exists, but its domain of attraction does not necessarily include all admissible initial states and actions. Moreover, it may be nonunique. Therefore, the properties of equilibria of the dynamic system (2) and (3) should be studied in detail, focusing on practically important particular cases.

Particular Cases
Several well-studied models represent particular cases of the dynamic model (2) and (3). Let us consider some of them; also, see the survey in [2].

Models of Informational Control
In models of informational control [29], the agents' opinions evolve under purposeful messages, e.g., from the mass media. In these models, $c_i = d_i = e_i = 0$, $i \in N$, so only the agent's previous state and the control remain on the right-hand side of (3). The agent's state dynamics model (6) was adopted in the book [29] to pose and solve informational control problems.
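The informational control case can be illustrated numerically. The sketch below assumes a unit trust function, so the state update reduces to $r^t = (1 - b)\, r^{t-1} + b\, u^t$; the function name and the parameter values are hypothetical and chosen only for illustration.

```python
def informational_control(r0, controls, b):
    """Iterate r_t = (1 - b) * r_{t-1} + b * u_t (unit trust function assumed)."""
    trajectory = [r0]
    for u in controls:
        # The new opinion is a convex combination of the old opinion and the message.
        trajectory.append((1.0 - b) * trajectory[-1] + b * u)
    return trajectory


# Hypothetical example: the constant message u = 1 is repeated for 20 periods.
trajectory = informational_control(r0=0.2, controls=[1.0] * 20, b=0.25)
print(trajectory[1], trajectory[-1])
```

Under a constant message, the opinion approaches the message geometrically at the rate $(1 - b)$, so a larger degree of trust $b$ means faster persuasion.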
The dynamics of opinions, beliefs, and attitudes of a personality can be described by analogy; see a survey of the corresponding models of personality psychology in [1,21].

Models of Consensus
Models of consensus (see [29] and surveys in [23,31]). In this class of models, $b_i = c_i = d_i = 0$, and each agent averages his state with the states or actions of other agents. In other words, the expression (3) reduces to a weighted averaging procedure in which the elements of the matrix $\|e_{ij}\|$ (the links between different agents) satisfy the condition $\sum_{j \in N \setminus \{i\}} e_{ij} = 1$, $i \in N$.
The existence conditions of equilibria can be found in [23,29].
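A minimal sketch of such an averaging procedure is given below. It assumes the update $r_i^t = (1 - e_i)\, r_i^{t-1} + e_i \sum_{j \ne i} e_{ij}\, r_j^{t-1}$ with unit trust functions; this specific form, the function names, and the numbers are illustrative assumptions, not the exact expression from the source.

```python
def consensus_step(r, e, links):
    """One averaging step; links[i][j] are link weights whose off-diagonal row sums equal 1."""
    n = len(r)
    new_r = []
    for i in range(n):
        neighbor_avg = sum(links[i][j] * r[j] for j in range(n) if j != i)
        new_r.append((1.0 - e[i]) * r[i] + e[i] * neighbor_avg)
    return new_r


def run_consensus(r0, e, links, steps):
    """Apply the averaging step repeatedly and return the final states."""
    r = list(r0)
    for _ in range(steps):
        r = consensus_step(r, e, links)
    return r


# Hypothetical example: three agents with symmetric links.
links = [[0.0, 0.5, 0.5],
         [0.5, 0.0, 0.5],
         [0.5, 0.5, 0.0]]
final = run_consensus([0.1, 0.5, 0.9], e=[0.5, 0.5, 0.5], links=links, steps=50)
print(final)
```

In this symmetric example the average opinion is preserved at every step, so the states converge to the initial mean; for asymmetric links the limit generally depends on the link structure.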

Models of Conformity Behavior
Models of conformity behavior (see [9,11] and a survey in [28]). In this class of models, $b_i = c_i = d_i = 0$, $e_i = 1$, and each agent makes a binary choice between being active or passive ($A_i = \{0; 1\}$). Moreover, his action coincides with his state, which evolves according to a threshold rule (6), where $\varsigma_i \in [0,1]$ is the agent's threshold. The agent demonstrates conformity behavior [9,11]: he begins to act when the weighted share of active agents exceeds his threshold (the weights are the strengths of links between different agents). Otherwise, the agent remains passive. The dynamics of conformity behavior (6) were studied in the book [28].
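The threshold rule described above can be sketched as follows. The strict inequality follows the wording "exceeds his threshold"; the synchronous update, the function names, and the numbers are assumptions made for illustration.

```python
def conformity_step(active, links, thresholds):
    """Synchronous threshold update: agent i is active iff the weighted share of
    other active agents strictly exceeds his threshold (assumed form of the rule)."""
    n = len(active)
    return [
        1 if sum(links[i][j] * active[j] for j in range(n) if j != i) > thresholds[i] else 0
        for i in range(n)
    ]


def run_conformity(active0, links, thresholds, max_steps=100):
    """Iterate the threshold update until a fixed point or until max_steps is reached."""
    active = list(active0)
    for _ in range(max_steps):
        nxt = conformity_step(active, links, thresholds)
        if nxt == active:
            break
        active = nxt
    return active


# Hypothetical example: four agents, uniform links, two initially active agents.
n = 4
links = [[0.0 if i == j else 1.0 / (n - 1) for j in range(n)] for i in range(n)]
thresholds = [0.2, 0.2, 0.5, 0.9]
final_active = run_conformity([1, 1, 0, 0], links, thresholds)
final_passive = run_conformity([0, 0, 0, 0], links, thresholds)
print(final_active, final_passive)
```

The same thresholds admit two different fixed points depending on the initial actions, which is the usual source of cascade effects in threshold models.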
In the models of informational control, consensus, and conformity behavior, the main emphasis is on the agent's states: his actions are not considered, or the action is assumed to coincide with the state.

Models of Social Influence
Models of social influence (see a meaningful description of social influence effects and numerous examples in [13,16]). On the one hand, the models of informational control, consensus, and conformity behavior can undoubtedly be attributed to the models of social influence. On the other hand, the general model (3) reflects other social influence effects known in social psychology, including the dependence of beliefs, relationships, and attitudes on the previous experience of the agent's activity [20-22].
Similar effects occur under cognitive dissonance: an agent changes his opinions or beliefs that are in dissonance with the performed behavior, e.g., with the action he chooses (see arrow no. 6 in Figure 1). In this case, an adequate model is the particular case of (3) in which the agent changes his state depending on the actions chosen. Another example is the hindsight effect (explaining events by the retrospective view, "It figures"). This effect is the agent's inclination to perceive events that have already occurred, or facts that have already been established, as obvious and predictable despite insufficient initial information to predict them. In this case, an adequate model is the particular case of (3) in which the agent changes his state depending on the activity result (see arrow no. 7 in Figure 1). The two models mentioned were considered in detail in [2].

Model of Voting
Consider a decision-making procedure by simple majority voting. Assume that the agents report their true opinions (actions) $y_i^t \in \{0; 1\}$: they either support a decision ($y_i^t = 1$) or not ($y_i^t = 0$). (Truth-telling means no strategic behavior.) The decision (the result of collective activity) is accepted ($z^t = 1$) if at least half of the agents voted for it; otherwise, the decision is rejected ($z^t = 0$):

$$z^t = I\Bigl(\sum_{i \in N} y_i^t \ge \frac{n}{2}\Bigr),$$

where $I(\cdot)$ is the indicator function.
Agent $i$ has a type (opinion or belief) $r_i^t \in [0,1]$ reflecting his inclination to support the decision. Assume that the agent chooses his action depending on his type. Let the dynamics of the agent's type be described by the procedure

$$r_i^t = (1 - b_i - c_i - d_i)\, r_i^{t-1} + b_i u_i^t + c_i y_i^{t-1} + d_i z^{t-1}, \quad t = 1, 2, \ldots, \quad i \in N, \tag{7}$$

where $u_i^t \in [0, 1]$ is the control (i.e., informational influence via mass media, social media, or personal communication), and the nonnegative constant degrees of trust $(b_i, c_i, d_i)$ satisfy the constraints

$$b_i + c_i + d_i \le 1, \quad i \in N. \tag{8}$$

(Also, see the expression (3).) Due to relations (8), the state of the dynamic system (7) stays within the admissible set $[0,1]^n$.
According to the expression (7), the type $r_i^t$ of agent $i$ in period $t$ is a linear combination of the following parameters:
i. his type $r_i^{t-1}$ in the previous period $(t-1)$;
ii. the external impact (control) $u_i^t$ applied to him in period $t$;
iii. his action $y_i^{t-1}$ in the previous period $(t-1)$ (a change in the agent's type due to mismatch with the action chosen can be treated as the cognitive dissonance effect);
iv. the activity result $z^{t-1}$ in the previous period $(t-1)$ (a change in the agent's type due to mismatch with the collective decision can be treated as conformity behavior).
Within this model, an active system is controllable if the action of any agent can be changed to the opposite in finite time using admissible controls according to (7).
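The voting dynamics can be explored with a short simulation. The sketch below assumes that the type is updated as a convex combination of the previous type, the control, the previous action, and the previous result with weights $b_i$, $c_i$, $d_i$, and that an agent supports the decision when his type is at least 1/2; the threshold 1/2, the function names, and the numbers are assumptions made for illustration.

```python
def majority(actions):
    """Collective decision: accepted (1) if at least half of the agents support it."""
    return 1 if sum(actions) >= len(actions) / 2 else 0


def action(r):
    """Assumed action rule: support the decision when the type is at least 1/2."""
    return 1 if r >= 0.5 else 0


def voting_step(r, y_prev, z_prev, u, b, c, d):
    """One step of the type dynamics for all agents."""
    return [
        (1 - b[i] - c[i] - d[i]) * r[i] + b[i] * u[i] + c[i] * y_prev[i] + d[i] * z_prev
        for i in range(len(r))
    ]


def simulate(r0, u, b, c, d, steps):
    """Return the history of (types, actions, result) under a constant control u."""
    r = list(r0)
    y = [action(x) for x in r]
    z = majority(y)
    history = [(list(r), list(y), z)]
    for _ in range(steps):
        r = voting_step(r, y, z, u, b, c, d)
        y = [action(x) for x in r]
        z = majority(y)
        history.append((list(r), list(y), z))
    return history


# Hypothetical example: strong control (b = 0.4) against weak dissonance and conformity.
history = simulate(
    r0=[0.3, 0.6, 0.4], u=[1.0, 1.0, 1.0],
    b=[0.4] * 3, c=[0.05] * 3, d=[0.05] * 3, steps=10,
)
print(history[0][2], history[-1][2])
```

In this example the initial decision is rejected, and a single period of control is enough to flip the actions of the first and third agents and hence the collective decision.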
Let $\{r_i^0 \in [0, 1]\}$ be given initial types of all agents. Consider different modifications of the model (7), as described in Table 2.

Table 2. Modifications of model (7).

| Modification | Control | Cognitive Dissonance | Conformity Behavior |
|---|---|---|---|
| 1 | - | - | - |
| 2 | + | - | - |
| 3 | - | + | - |
| 4 | - | - | + |
| 5 | + | + | - |
| 6 | + | - | + |
| 7 | - | + | + |
| 8 | + | + | + |

Modification 1 corresponds to no influence on the types of any agents. In these conditions, the types are static: $r_i^t = r_i^0$, $t = 1, 2, \ldots$, $i \in N$.

Modification 2. Here the expression (7) takes the form

$$r_i^t = (1 - b_i)\, r_i^{t-1} + b_i u_i^t.$$

Proposition 2. In modification 2 with $b_i > 0$, $i \in N$, the system (7) is controllable. For $u_i^t \in \{0; 1\}$ and $b_i > \max$ [...]

Lower bounds for the constants $\{b_i\}$ in Propositions 2, 4, 5, and 6 characterize the minimal "strength" of informational control, or the minimal trust in the source of the control information, that provides the system's controllability.
Modification 3. Here the expression (7) takes the form:

$$r_i^t = (1 - c_i)\, r_i^{t-1} + c_i y_i^{t-1}.$$

In this modification, the types of agents vary, but their actions and activity result are stationary: $y_i^t = y_i^0$, $z^t = z^0$, $t = 1, 2, \ldots$, $i \in N$. The agents become increasingly convinced of the correctness of their beliefs and initial action.

Modification 4. Here the expression (7) takes the form:

$$r_i^t = (1 - d_i)\, r_i^{t-1} + d_i z^{t-1}.$$

In this modification, the types and actions of agents vary, but the activity result is stationary: $z^t = z^0$, $t = 1, 2, \ldots$ The prior majority of agents do not change their actions and, affecting those who prefer another alternative, gradually draw the latter to their side.

Modification 5. Here the expression (7) takes the form:

$$r_i^t = (1 - b_i - c_i)\, r_i^{t-1} + b_i u_i^t + c_i y_i^{t-1}. \tag{10}$$

Writing the monotonicity condition for the agent's type depending on the control goal, we easily establish the following result.

Proposition 4. In modification 5 with $b_i > c_i$, $i \in N$, the system (10) is controllable.

Modification 6. Here the expression (7) takes the form:

$$r_i^t = (1 - b_i - d_i)\, r_i^{t-1} + b_i u_i^t + d_i z^{t-1}. \tag{11}$$

Writing the monotonicity condition for the agent's type depending on the control goal, we easily establish the following result.

Proposition 5. In modification 6 with $b_i > d_i$, $i \in N$, the system (11) is controllable.
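The monotonicity argument behind Proposition 4 can be spelled out as follows. This is a sketch under two assumptions not stated explicitly in the text: modification 5 keeps only the control and cognitive dissonance terms of (7), and the agent supports the decision when his type is at least 1/2.

```latex
% Goal: switch the action of agent i from 0 to 1 using the control u_i^t = 1.
% While y_i^{t-1} = 0, the type increment is
r_i^t - r_i^{t-1} = b_i - (b_i + c_i)\, r_i^{t-1} > 0
  \iff r_i^{t-1} < \frac{b_i}{b_i + c_i}.
% The type grows monotonically towards b_i / (b_i + c_i), so it crosses the
% (assumed) threshold 1/2 in finite time if and only if
\frac{b_i}{b_i + c_i} > \frac{1}{2} \iff b_i > c_i.
% The symmetric case (from 1 to 0 with u_i^t = 0) gives the limit
% c_i / (b_i + c_i) < 1/2, i.e., the same condition b_i > c_i.
```

The same reasoning with $d_i$ in place of $c_i$ and $z^{t-1}$ in place of $y_i^{t-1}$ gives the condition of Proposition 5.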

Modification 7. Here there is no control, and the expression (7) takes the form:

$$r_i^t = (1 - c_i - d_i)\, r_i^{t-1} + c_i y_i^{t-1} + d_i z^{t-1}.$$

In this modification, the types of agents and, generally speaking, their actions vary, but the activity result is stationary: $z^t = z^0$, $t = 1, 2, \ldots$ The prior majority of agents do not change their actions and, affecting those who prefer another alternative, possibly gradually draw the latter to their side (depending on the relation between the parameters $c_i$ and $d_i$).

Modification 8. Here the type dynamics are described by the general expression (7). Writing the monotonicity condition for the agent's type depending on the control goal, we easily establish the following result.

Proposition 6. In modification 8 with $b_i > 3 (c_i + d_i)$, $i \in N$, the system (7) is controllable.
Concluding this subsection, we also mention an interesting modification of the procedure (7): no control and anti-conformists (the agents choosing actions to obtain a result different from the majority's one).

Example. Consider an illustrative example of three agents with the initial types $r_1^0 = 0.3$, $r_2^0 = 0.6$, and $r_3^0 = 0.4$. Assume that the cognitive dissonance effect is absent ($c_i = 0$, $i = 1, 2, 3$). The first agent does not change his type: $d_1 = 0$. The second and third agents are anti-conformists: $d_2 = 0.1$ and $d_3 = 0.1$. The dynamics of the agents' types (second and third agents) and activity result (unstable!) are shown in Figure 2.
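The qualitative behavior in this example can be reproduced with a short simulation. The exact anti-conformist procedure is not reproduced in the text, so the sketch below assumes that an anti-conformist moves his type towards the opposite of the previous result, $r_i^t = (1 - d_i)\, r_i^{t-1} + d_i (1 - z^{t-1})$, and that an agent supports the decision when his type is at least 1/2; both rules and all names are assumptions.

```python
def majority(actions):
    """Collective decision: accepted (1) if at least half of the agents support it."""
    return 1 if sum(actions) >= len(actions) / 2 else 0


def simulate_anticonformists(r0, d, steps):
    """Return the sequence of results z^t and the final types.

    Assumed update: r_i <- (1 - d_i) * r_i + d_i * (1 - z), i.e., each
    anti-conformist drifts away from the previous collective decision.
    """
    r = list(r0)
    results = []
    for _ in range(steps):
        y = [1 if x >= 0.5 else 0 for x in r]  # assumed action rule
        z = majority(y)
        results.append(z)
        r = [(1 - d[i]) * r[i] + d[i] * (1 - z) for i in range(len(r))]
    return results, r


# Data from the example: the first agent is static, the other two are anti-conformists.
results, final_types = simulate_anticonformists([0.3, 0.6, 0.4], [0.0, 0.1, 0.1], steps=30)
print(results)
```

Under these assumptions the collective decision does not settle: as soon as a majority forms, the anti-conformists start drifting away from it, which agrees with the unstable activity result mentioned in the example.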

Model of Informational Confrontation
Consider three agents: the first and second agents perform informational control (choose controls as their actions), affecting (due to the informational influence) the type (internal state-opinion or belief) of the third agent. The common activity result for all agents is the state of the third agent by a terminal period T.

Let the opinion $r^t$ of the third agent in period $t$ be a linear combination of his opinion and the opinions of the first and second agents in the previous period:

$$r^t = (1 - b_1 - b_2)\, r^{t-1} + b_1 r_1^{t-1} + b_2 r_2^{t-1}, \quad t = 1, 2, \ldots$$

(All opinions have the range $[0, 1]$.) Assume that the goals of the first and second agents are opposite (the first one is interested in turning $r^t$ to the state "0", while the second one, to the state "1") and their states are invariable: $r_1^t \equiv 0$, $r_2^t \equiv 1$. Interpretations of the agents' states are the same as in Section 4 above. The relation above describes the opinion dynamics if the agents exchange their opinions (true states) in each period. The controls of the first and second agents are to inform the third agent about their opinions in some periods only. The sets of admissible actions have the form $y_i^t \in \{0; 1\}$, $i = 1, 2$ (such controls are called binary), where $y_i^t = 1$ means that agent $i$ reports his opinion in period $t$. Substituting $r_1^t \equiv 0$, $r_2^t \equiv 1$, we arrive at the following state dynamics of the third agent:

$$r^t = (1 - b_1 y_1^t - b_2 y_2^t)\, r^{t-1} + b_2 y_2^t, \tag{12}$$

where $b_1 + b_2 \le 1$ and $r^0$ is a given initial state. (Also, see the expressions (3) and (7) above.) Let the first agent be interested in minimizing the terminal state $r^T$, whereas the second in maximizing it. Note that the consumption of resources and other costs are not included in the goal functions.
In a practical interpretation, the state of the third agent (his opinion, belief, or attitude towards some issue or phenomenon) is reduced by the first agent and increased by the second. There is an informational confrontation between the first and second agents, described by game theory. In the dynamic case considered below, we have a differential game; static models of informational confrontation and models of repeated games can be found in [28,29].
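According to (12), the combinations presented in Table 3 are possible in each period.

Table 3. The combinations in each period.

| | $y_2 = 0$ | $y_2 = 1$ |
|---|---|---|
| $y_1 = 0$ | $\Delta r^t = 0$ (the state of the third agent is invariable) | $\Delta r^t = b_2 (1 - r^{t-1}) \ge 0$ |
| $y_1 = 1$ | $\Delta r^t = -b_1 r^{t-1} \le 0$ | $\Delta r^t = b_2 (1 - r^{t-1}) - b_1 r^{t-1}$ |

In the latter case ($y_1 = y_2 = 1$), the state of the third agent has a nonnegative increment if $b_2 \ge b_1 \frac{r^{t-1}}{1 - r^{t-1}}$. A differential counterpart of the difference Equation (12) has the form:

$$\dot r(t) = -b_1 y_1(t)\, r(t) + b_2 y_2(t)\, \bigl(1 - r(t)\bigr). \tag{13}$$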
Assume that the actions of the first and second agents are subjected to the integral resource constraints (e.g., resources for customized publications in mass media or posts in social media, advertising costs, etc.):

$$\int_0^T y_i(\tau)\, d\tau \le C_i, \quad i = 1, 2. \tag{14}$$

First, let us study several special cases.
Case 1 (control applied by the first agent only). Substituting $y_2(t) \equiv 0$ or (and) $b_2 = 0$ into (13), we obtain the differential equation $\dot r(t) = -b_1 y_1(t)\, r(t)$. Its solution is $r(t) = r^0 \exp\{-b_1 \int_0^t y_1(\tau)\, d\tau\}$. When the first agent consumes his entire resource, the constraint (14) yields the estimate $r(T) = r^0 \exp\{-b_1 C_1\}$ of the terminal state, which is independent of the trajectory $y_1(t)$.
Case 2 (control applied by the second agent only). Substituting $y_1(t) \equiv 0$ or (and) $b_1 = 0$ into (13), we obtain the differential equation $\dot r(t) = b_2 y_2(t)\, (1 - r(t))$. Its solution is $r(t) = 1 - (1 - r^0) \exp\{-b_2 \int_0^t y_2(\tau)\, d\tau\}$. When the second agent consumes his entire resource, the constraint (14) yields the estimate $r(T) = 1 - (1 - r^0) \exp\{-b_2 C_2\}$ of the terminal state, which is independent of the trajectory $y_2(t)$.
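The independence of the terminal state from the control trajectory in Cases 1 and 2 can be checked numerically. The sketch below integrates the opinion dynamics $\dot r = -b_1 y_1 r + b_2 y_2 (1 - r)$ by the explicit Euler method and compares an "early" and a "late" spending of the same resource; the integrator, the helper names, and the parameter values are assumptions made for illustration.

```python
import math


def simulate(r0, b1, b2, y1, y2, T, dt=1e-3):
    """Explicit Euler integration of dr/dt = -b1*y1(t)*r + b2*y2(t)*(1 - r)."""
    r = r0
    for k in range(int(round(T / dt))):
        t = k * dt
        r += dt * (-b1 * y1(t) * r + b2 * y2(t) * (1.0 - r))
    return r


def pulse(start, length):
    """Binary control equal to 1 on [start, start + length) and 0 elsewhere."""
    return lambda t: 1.0 if start <= t < start + length else 0.0


def zero(t):
    """Control that is never applied."""
    return 0.0


# Case 1: only the first agent acts; resource C1 = 2 spent early or late on [0, 5].
case1_early = simulate(0.8, 0.5, 0.0, pulse(0.0, 2.0), zero, T=5.0)
case1_late = simulate(0.8, 0.5, 0.0, pulse(3.0, 2.0), zero, T=5.0)
case1_exact = 0.8 * math.exp(-0.5 * 2.0)

# Case 2: only the second agent acts; resource C2 = 2 spent early or late.
case2_early = simulate(0.2, 0.0, 0.5, zero, pulse(0.0, 2.0), T=5.0)
case2_late = simulate(0.2, 0.0, 0.5, zero, pulse(3.0, 2.0), T=5.0)
case2_exact = 1.0 - (1.0 - 0.2) * math.exp(-0.5 * 2.0)

print(case1_early, case1_late, case1_exact)
print(case2_early, case2_late, case2_exact)
```

Up to the discretization error, both schedules give the same terminal state, which matches the closed-form estimates: with a single active agent, only the total amount of resource matters.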
Case 3 (unlimited resources, both agents choose the actions $y_1(t) \equiv 1$, $y_2(t) \equiv 1$ in all periods). In this case, Equation (13) takes the form:

$$\dot r(t) = -b_1 r(t) + b_2 \bigl(1 - r(t)\bigr) = b_2 - (b_1 + b_2)\, r(t).$$

The solution is given by:

$$r(t) = \frac{b_2}{b_1 + b_2} + \Bigl(r^0 - \frac{b_2}{b_1 + b_2}\Bigr) e^{-(b_1 + b_2)\, t}.$$

The characteristic time is $\tau_0 \sim \frac{3}{b_1 + b_2}$, and the asymptotic value is $r_\infty = \frac{b_2}{b_1 + b_2}$.
Now, we return to the general case (13). Let $c_i(t) = \int_0^t y_i(\tau)\, d\tau$, $i = 1, 2$. Consider the differential zero-sum two-person (antagonistic) game in normal form [32,33] of the first two agents. At the initial time instant of this game, the first and second agents choose their open-loop strategies $y_1(t)|_{t=0}^{T}$ and $y_2(t)|_{t=0}^{T}$, respectively, once, simultaneously, and independently of one another.
Further analysis will be restricted to the class of strategies with a single switch. In this class, at the initial time instant, the first and second agents simultaneously and independently choose some instants $t_1$ and $t_2$, respectively, when they start consuming their resource (apply controls) until complete exhaustion. Therefore, the open-loop strategies have the form:

$$y_i(t) = \begin{cases} 1, & t \in [t_i, t_i + C_i], \\ 0, & \text{otherwise}, \end{cases} \quad i = 1, 2.$$

The functional (17) monotonically decreases in $c_1(\cdot)$ and increases in $c_2(\cdot)$. Hence, the first and second agents benefit from consuming the entire resource, and consequently, $t_1 \le T - C_1$ and $t_2 \le T - C_2$.
There are four possible relations among the parameters $C_1$, $C_2$, and $T$.
The first relation: $T \le \min\{C_1; C_2\}$ (both agents have enough resources). Here the Nash equilibrium strategies are $y_i(t) \equiv 1$ for all $t \in [0, T]$, $i = 1, 2$, due to the monotonicity mentioned above.
The second and third relations: for some $i \in \{1, 2\}$, $C_i \ge T$ and $C_{3-i} < T$. Here, for agent $i$, the optimal strategy is $y_i(t) \equiv 1$ for all $t \in [0, T]$. For agent $(3 - i)$, the optimal switching instant $t_{3-i}$ is the solution of a scalar optimization problem. The case $t_{3-i} = T - C_{3-i}$ is of practical interest. Note that the binary control is optimal under the constraints $y_i(t) \in [0, 1]$, $i = 1, 2$, due to the linearity of (13) in the controls.
The fourth relation: $T > \max\{C_1; C_2\}$ (both agents lack resources). Here the agents play a complete game. If $\tau_0 \ll \min\{C_1; C_2\}$, then the equilibrium of this game is $t_1^* = T - C_1$, $t_2^* = T - C_2$. Therefore, both agents start spending resources as late as possible, and the terminal value is $r(T) \approx r_\infty$. The same pair of strategies will be an equilibrium for $T \gg C_1 + C_2$ (when the quantities of resources are such that the controls are short-term on the scale of the period $T$). The practical interpretation is "save all reserves until the last decisive moment".
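The "save all reserves until the last decisive moment" interpretation can be illustrated by comparing the late-switching pair of strategies with two unilateral deviations. The sketch below uses explicit Euler integration of the opinion dynamics with single-switch controls; it checks only these two deviations rather than the full equilibrium property, and the function name and the parameter values (chosen so that the characteristic time is much smaller than the resources) are assumptions.

```python
def simulate(r0, b1, b2, t1, t2, C1, C2, T, dt=1e-3):
    """Euler integration of dr/dt = -b1*y1*r + b2*y2*(1 - r), where agent i
    applies the control y_i = 1 on [t_i, t_i + C_i) and y_i = 0 elsewhere."""
    r = r0
    for k in range(int(round(T / dt))):
        t = k * dt
        y1 = 1.0 if t1 <= t < t1 + C1 else 0.0
        y2 = 1.0 if t2 <= t < t2 + C2 else 0.0
        r += dt * (-b1 * y1 * r + b2 * y2 * (1.0 - r))
    return r


# Hypothetical parameters: characteristic time 3 / (b1 + b2) = 0.3 << C1 = C2 = 2 < T = 10.
params = dict(r0=0.9, b1=5.0, b2=5.0, C1=2.0, C2=2.0, T=10.0)

late = simulate(t1=8.0, t2=8.0, **params)          # both agents switch as late as possible
first_early = simulate(t1=0.0, t2=8.0, **params)   # the first agent deviates and acts early
second_early = simulate(t1=8.0, t2=0.0, **params)  # the second agent deviates and acts early

print(late, first_early, second_early)
```

With these numbers, the late-switching pair gives a terminal state close to $r_\infty = b_2 / (b_1 + b_2) = 0.5$, while an agent who spends his resource early leaves the final interval to the opponent and ends up with a worse terminal state for himself.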
Hence, the results of this section give optimal strategies of the first two agents and characterize the equilibrium of their informational confrontation.

Conclusions
The main result is a general model (1)-(3) of the joint dynamics of agents' actions and internal states, depending both on previous actions and states and on the environment and the results of activity (see Figure 1). It allows combining methods and approaches of various decision-making paradigms, game theory, and social psychology to describe external and internal aspects of collective strategic decision-making.
Many known models and results of the above-mentioned scientific domains (reflecting the effects of consensus, threshold behavior, cognitive dissonance, informational influence, control, and confrontation) turn out to be particular cases of the general model.
Three main directions seem promising for future research. The first is the analysis of the general model in order to obtain conditions for equilibrium existence and uniqueness, as well as its comparative statics, that are as general as possible yet still analytical. The second is generating new particular (applied) models of collective activity, organizational behavior, and management that take into account not only "economic" rationality but also psychological aspects. The third direction is model identification and verification, which would bring the models closer to reality and practical applications.