Modeling Dynamic Decision-Making of Virtual Humans

Imagine a person visiting an urban event. At each moment in time, the person has to weigh up different possible actions and make consecutive decisions. For instance, a person might be hungry or thirsty and would therefore like to go somewhere to eat or to drink, or a person might need to go to the toilet and thus go searching for the restrooms. Other possible desires might be to go dancing or to have a rest due to exhaustion. All these examples can be seen in the context of dynamic decision-making. To implement the dynamic decision-making of virtual humans living their lives in a persistent microworld, an advanced concept is required to solve what artificial intelligence research commonly calls the action selection problem. This article focuses on a novel approach to model the activation of motivations, as an attempt to answer the recurring question of the virtual humans: "What to do next?". The novelty is to use System Dynamics, in general defined as a top-down simulation approach, from the bottom up inside each instance of the agent population and to implement an action selection mechanism on the basis of this methodology. This approach enables us to model the dynamic decision-making of the virtual humans with stocks and flows, resulting in nonlinear motivation evolution. A case study in the context of an urban event documents the application of this innovative method.


Introduction
In general, dynamic decision-making (DDM) focuses on the question of how people make repeated decisions in complex environments that change over time, due both to the actions of the decision-maker(s) and to occurrences and changing conditions of the system under consideration [1,2]. As most real-life situations are not so much a matter of static one-time decisions [3] but rather a seemingly endless chain of dynamic decisions made from time to time, feedback effects in decision-making, in terms of how previous decisions affect future decisions, are of fundamental importance in DDM-research [4].
The aim of dynamic decision-making research is to investigate how people make decisions in complex real-world environments, to observe how the environment responds to actions and how, depending on the context, different strategies serve to achieve certain objectives. Figure 1 depicts that experimental actions in the real world are often risky, possibly expensive, dangerous or even unethical. An alternative is to conduct computational pretests in a so-called microworld that reacts to possible actions just as the real world would, in order to examine different dynamic decision-making strategies in this risk-free modeling and simulation environment. For that reason, DDM-researchers conduct computational laboratory experiments to investigate how possible decisions will affect the complex system in question. Mapping the real world to a microworld makes it possible to investigate dynamic decision-making on a risk-free basis [5].
The goal of DDM-research is to systematically investigate different key characteristics of complex dynamic systems. The characteristics identified by Gonzalez [4] are outlined in the following.
Dynamics. A dynamic system undergoes continuous change, and the state of the system depends on its previous state [6]. In dynamic systems, there is autonomous evolution. The dynamics within a system result from positive (escalating) and negative (balancing) feedback processes that lead to amplified, oscillating and delayed behavior responses [7].
Complexity. A system is regarded as complicated if a high number of components is embedded in the observed system. A system is regarded as complex if these different components feature internal states that change over time and if the different components interact with each other to a high degree [8]. The degree of complexity increases with the number of components, the number of interconnections and the number of different relationship types among these interconnections. In consequence, complex systems may lead to unintended and sometimes counter-intuitive consequences [9].
Opaqueness. In complex systems, some parts of the system remain invisible to the outside observer [2]. The number of observable aspects tends to increase with the degree of systemic complexity. On the one hand, an observer might not possess the right sensor instrument to get the required information; on the other hand, the observer is biased regarding which aspects to focus on and how much attention to pay to each of the observed aspects [10].
In the next subchapter, the relevance of agent-based simulations for dynamic decision-making is discussed. Further, the subchapter serves to introduce some DDM-related microworld examples and to explain the motivation of the case study.

Agent-Based-Modeling and Dynamic Decision-Making
Since the 1990s, agent-based modeling (ABM) has been an increasingly popular modeling approach to simulate complex systems, especially in connection with designing synthetic software agents that interact with each other and live their lives in computational microworlds [11]. Microworlds enable scholars to conduct experimental studies to investigate dynamic decision-making and complex problem-solving tasks [12]. According to Gonzalez [13], ABM is useful to kill two birds with one stone in the context of DDM. On the one hand, it enables investigations of how decisions of the synthetic agents change the system behavior from the bottom-up perspective, while, on the other hand, a planner or manager can alter an environment to improve a system's behavior from a top-down control perspective. In other words, ABM helps to understand how people actually make decisions in a dynamic real-world environment (naturalistic decision-making [14]) because it makes it necessary to specify the behavior of computational agents explicitly. At the same time, using a microworld as an advanced management flight simulator enables a planner to investigate the system on a risk-free basis to achieve objectives. Both aspects are fundamentally relevant for DDM-research [13].
The first assumption for most agent-based microworlds is that micro-motives of low-level agents cause behavioral patterns on a macro-scale, as depicted in Figure 2 [15]. The second assumption is that an understanding of the micro-motives is necessary to explain the emerging macroscopic phenomena.
DDM-related agent-based simulations cover a broad range of macroscopic patterns in several fields of application, and they may be related to ecological, social or economic issues. An example of such an emerging phenomenon on the macro-scale is a traffic congestion on a highway that spreads in the direction opposite to the driving direction. This phenomenon is caused by delayed reaction times of the drivers at the front of the congestion [16]. In other words, the macroscopic pattern emerges from these delayed microscopic responses. Using ABM, the phenomenon can be explained in detail by modeling a traffic congestion and incorporating the delayed behavior responses into the agents.
The same is feasible for the Beer Game experiment, by computerizing it [17]. Erroneous ordering policies of the agents in the supply chain lead to systematic delays in the upward-directed flow of information and the downward-directed flow of materials within the supply chain, resulting in misinterpreted limited information for the different decision-makers controlling the stocks in the supply chain [18]. This lack of knowledge concerning the overall system state is responsible for the erroneous agent-based ordering policies. In this example, too, it is the specific micro-motives concerning the beer orders at each distribution echelon that are responsible for the unintended overall system behavior. The literature features several other microworld-related case studies that were designed to improve dynamic decision-making in complex environments in which macroscopic phenomena occur. One very early study describes the Fungus Eater Game [19]. In this example, humans try to control a robot that searches for fuel on a hypothetical planet. Other case studies focus on DDM-related tasks in domains such as agricultural land use [20], disease outbreak prevention [21], transport logistics [22], urban evacuation [23], firefighting [24], prey and predator dynamics [25], health-care management [26], cash flow tasks [27], supervisory control [28] or dynamic problems resulting from human and social behaviors [29].
The next subchapter focuses on the motivation for the conducted case study in the area of urban event management.

Motivation for the Event Management Case Study
Managing an urban event successfully is a wicked problem because of the complexity of interdependencies [30]. Wicked problems like this are difficult to manage due to the uncertainties involved. In the case of urban events, these uncertainties result from the huge number of individuals (each a complex system in itself), the massive number of interactions between the individuals, the reinforcing and balancing feedback dynamics, and the processes that remain opaque and unobservable for the managers and planners involved. The Love Parade disaster in Duisburg, Germany, in 2010, which resulted in 21 deaths and more than 700 injuries [31], and other tragic crowd disasters contributed to the demand for dynamic decision support systems and the pre-evaluation of event locations. This is one main motivation for this work.
Building a microworld of an event and executing pretests in the resulting simulation environment makes it possible to examine different research questions. These concern, for example, overall evacuation times (the time it takes to evacuate all visitors in an emergency situation), the optimization of human resource allocation (e.g., waiters and other service personnel) and improvements to the event layout (where to place which event facility) to enhance the comfort of the visitors. Further practically oriented research questions are:
• Which corridors at the event have a high potential for jams and congestion?
• Which measures can be taken to distribute the people at the event more uniformly to avoid high-density conditions?
• How many toilets are needed?
• How much service staff is needed at the northern bar to avoid exceeding an average waiting time of five minutes?
• How should the timetable at the end of the event be designed to avoid batch departures?
In short, a microworld of an urban event helps to tackle mismanagement issues by enabling the planner to gain control over the environment. Providing control over the research environment under consideration is a key aspect of DDM, as Gonzalez concludes: "To bolster our knowledge of dynamic tasks, microworlds must provide the characteristics of DDM environments and facilitate researchers' control over these environments." [13].
For a realistic simulation of an urban event such as a music festival, it is necessary to embed realistic virtual humans as agents within the simulated microworld, where realistic refers to both dynamic decision-making and spatial movement. The design of autonomous virtual humans requires solving the so-called action selection problem by creating a mechanism that generates decisions at each moment in time for each virtual human [32]. This action selection problem is widely discussed in the scope of artificial intelligence research and is closely related to DDM-research. Therefore, the aim is to remove cross-community barriers and to benefit from both research areas to build a better decision architecture.

Dynamic Decision-Making as the Action Selection Problem of Artificial Intelligence Research
As stated, the dynamic problem of DDM-research is closely associated with the action selection problem (ASP) of artificial intelligence research, which addresses the following question for virtual agents such as robots or autonomous virtual humans: "What to do next?" [33]. The problem of action selection can be defined as "how to choose, at each moment in time, the most appropriate action out of a repertoire of possible actions" [34]. Alternatively, the ASP can also be described as a problem of time allocation. In the context of virtual humans within a microworld, each virtual agent has to decide how to allocate the available time to satisfy many different needs. In the three layer model, which is rooted in robotic research, action selection is separated from navigation and locomotion.
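The quoted definition of action selection can be illustrated with a minimal sketch. It assumes that each possible action carries a single numeric motivation value and that the "most appropriate" action is simply the one with the highest value; the function name `select_action` and the example values are hypothetical and not taken from the cited literature.

```python
from typing import Dict


def select_action(motivations: Dict[str, float]) -> str:
    """Return the action whose motivation value is currently highest.

    Assumption: 'most appropriate' is reduced to 'highest motivation'.
    """
    if not motivations:
        raise ValueError("the repertoire of possible actions is empty")
    return max(motivations, key=motivations.get)


# Hypothetical motivation levels of one virtual human at one moment in time.
current = {"eat": 0.4, "drink": 0.7, "rest": 0.2, "dance": 0.5}
chosen = select_action(current)
print(chosen)
```

Calling such a function once per time step for each agent yields a sequence of actions, which is the time allocation view of the problem mentioned above.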

Three Layer Model
The three layer model (see Table 1) provides a logical architecture for the computational design of agent-based simulations in which agents, such as virtual humans, can select different actions while moving in time and space under given locomotion constraints. In other words, the three layer architecture makes it possible to hierarchically structure the behavior of synthetic agents that change their motivation and satisfy different needs by navigating in and interacting with the environment. A distinction by means of different levels facilitates qualitative understanding. Further, distinguishing different layers helps to accomplish the technical aspects of the computational simulation.
Table 1. The Three Layer Model as the logical architecture of agent-based models in which agents dynamically make decisions and move in time and space.

Level (Hoogendoorn and Bovy [35]) | Layer (Blumberg [36]) | Layer (Reynolds [37]) | Meaning
Strategic Level | Motivation Layer | Action Selection Layer | Implements basic strategies, goals and objectives, thus the action selection of the virtual humans.
Tactical Level | Task Layer | Navigation Layer | Implements the wayfinding behavior of the agents. Further distinction by Kapadia [38]. Navigation: detection of a global collision-free path. Steering: movement of the agent along the path by avoiding static and dynamic obstacles.
Operational Level | Motor Layer | Locomotion Layer | Constrains the body movements of the agents in consideration of the performed action (e.g., walking, running, talking, etc.).

Regarding decision-making processes, Hoogendoorn and Bovy [35] distinguish three different levels, namely the strategic level, the tactical level and the operational level. On the strategic level, actors come to fundamental decisions regarding planned basic activities. On the tactical level, individuals make decisions in terms of destination control and route choice to execute planned activities. The lowest, operational level takes mobility constraints into account (e.g., moving in a wheelchair involves different mobility constraints than walking).
Blumberg [36] and Reynolds [37] distinguish three different layers of dynamic decision-making. These layers are distinguished in order to effectively embed autonomous agents in a virtual reality. In other words, these different layers are used to structure the agents' behavior in agent-based simulations in terms of architectural organization and the coding of the software. Rooted in the domain of robotic research, Blumberg differentiates between the motivation layer, the task layer and the motor layer. Reynolds' layers are, similarly to Blumberg's concept, named action selection layer, navigation layer and locomotion layer. The first layer (action selection) implements fundamental motivations, goals and objectives that determine the dynamic decision-making of agents over time. The second layer (navigation) implements the wayfinding behavior of the agents. On this layer, Kapadia [38] differentiates between navigation and steering. While navigation addresses the aspect of finding a collision-free global path, steering ensures that the agents are able to move along the global path by avoiding static and dynamic obstacles. On the lowest layer (locomotion), constraints resulting from the agents' type of movement are taken into account. In the scope of visualization, locomotion determines the animation sequences of the virtual characters.
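The separation of concerns between the three layers can be sketched in code. The following is only an illustrative skeleton using Reynolds' layer names: the class interfaces, the straight-line "path" and the constant walking speed are assumptions made for this example and do not reproduce any of the cited architectures.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


class ActionSelectionLayer:
    """Strategic level: decides WHAT to do (here: strongest motivation wins)."""

    def decide(self, motivations: Dict[str, float]) -> str:
        return max(motivations, key=motivations.get)


class NavigationLayer:
    """Tactical level: decides WHERE to go. A straight line stands in for real
    path planning and steering (an assumption of this sketch)."""

    def __init__(self, facilities: Dict[str, Point]) -> None:
        self.facilities = facilities

    def plan(self, position: Point, action: str) -> List[Point]:
        return [position, self.facilities[action]]


class LocomotionLayer:
    """Operational level: constrains HOW the body moves (here: a speed limit)."""

    def __init__(self, speed: float) -> None:
        self.speed = speed

    def step(self, position: Point, target: Point, dt: float) -> Point:
        dx, dy = target[0] - position[0], target[1] - position[1]
        distance = (dx * dx + dy * dy) ** 0.5
        reach = self.speed * dt
        if distance <= reach:
            return target
        return (position[0] + dx / distance * reach,
                position[1] + dy / distance * reach)


# Hypothetical event layout and agent state.
navigation = NavigationLayer({"drink": (3.0, 4.0), "eat": (10.0, 0.0)})
action = ActionSelectionLayer().decide({"drink": 0.9, "eat": 0.3})
path = navigation.plan((0.0, 0.0), action)
new_position = LocomotionLayer(speed=1.0).step(path[0], path[-1], dt=1.0)
print(action, new_position)
```

Because each layer only consumes the output of the layer above it, the action selection logic can be exchanged without touching navigation or locomotion, which is the property this article relies on.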
The main focus of this paper is on the first layer, the action selection layer. With respect to the simulation of pedestrians, different approaches exist for the implementation of the second layer, such as lattice gas models [39] that are based on the Boltzmann equation or the Navier-Stokes equations, cellular automata models in discrete space [40] and the social force model in continuous space [41], plus network-based approaches [42] and hybrids. The third layer is related to the animation of the scene, as it implements the motoric and kinesthetic constraints of movement. Post-visualization software based on the Unity game engine is used to animate the virtual humans. Results will be shown later in the case study.
Some introductory thoughts about different fundamental assumptions and modeling attitudes are discussed in the next subchapter.

Modeling People
The human organism is a complex system, with a brain consisting of more than 10^12 neurons and more than 10^15 synapses [43]. With electroencephalograms (EEGs), neuroscientists are able to detect chaotic time signals in the brain system [44,45]. Taking this neuroscientific perspective into account, it should be kept in mind that modeling the decision-making of virtual humans is always a task of abstraction.
Thorngate's postulate of commensurate complexity states that "it is impossible for a theory of social behaviour to be simultaneously general, simple or parsimonious, and accurate" [46]. Gergen [47] adds that "the more general a simple theory, the less accurate it will be in predicting specifics". Weick [48] interprets this issue in such a way that there is always a trade-off between Thorngate's three different virtues. Thus, he describes research concerning social behavior as a clock, with twelve o'clock representing general virtues, four o'clock representing accurate virtues and eight o'clock representing simple virtues. Research attempts can try to fulfill two of these objectives, but will fail in trying to fulfill the third. For example, six o'clock research that aims to be accurate and simple will fail to be general in its results.
Taking into account the never-ending debate on how rationally or irrationally people behave, make decisions and select actions over time, there are basically two contrasting attitudes towards the modeling of virtual humans. On the one hand, the basic assumption of the rational actor modeler is that actors possess all the required information to select an action and to maximize some kind of utility in doing so. In these types of models, the agents can draw on perfect models of their environment and never systematically err [49]. As this approach is based on a strong simplification of reality, it falls into the category of ten o'clock or six o'clock research, aiming to be general or accurate, respectively, while aiming to build a simple but smart model. On the other hand, there is also a range of models that do not assume rational actors, based on the experience that, in reality, people do not always make optimal decisions. Scholars who work on this kind of model emphasize that humans are bound by cognitive capabilities, limited information and time when it comes to making the best decision and performing the most suitable action [50]. Studies have shown that preferences might be intransitive [51] and that even simple models might outperform individuals in decision-making [52]. Behavioral modeling aims to collect all information about how people actually make decisions and how they are biased in their action selection behavior, and to incorporate this knowledge into an inevitably complex behavioral model.
With respect to solving the action selection problem, that is, deciding how virtual humans dynamically make decisions, there is another approach apart from the rational and the behavioral one (compare with [53]). This approach aims to identify and incorporate a set of rules or mechanisms, earlier referred to as the micro-motives, which describe how people actually behave. A famous model for such a pure rule-based approach is Schelling's segregation model [15]. In this model, the incorporated rule can be summarized as follows: if an individual feels overwhelmed by being surrounded by people with different social characteristics, and if a certain threshold of his feel-good value is exceeded, the individual decides at some point in time to move away. With this rule, the model can neither be classified as a rational actor model with some kind of utility maximization, nor does it incorporate complex behavioral modeling. Instead, the model builds on one essential rule, which is able to model the emergence of segregated neighborhoods in towns with diverse populations, resulting from collective dynamic decision-making processes. In summary, with such a rule-based modeling attitude in mind, the aim is not to incorporate an image of reality within the agent, but rather to identify a set of rules and mechanisms that is essential to solve the research problem. With this attitude, quantitative results are interpreted as useful benchmarks and not as precise values of reality [30].
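The single rule summarized above can be written down in a few lines. This is a minimal sketch of one common reading of the rule, assuming that the "feel-good" threshold is expressed as the tolerated share of unlike neighbors; the function name, the group labels and the tolerance value are hypothetical.

```python
from typing import Sequence


def wants_to_move(own_group: str, neighbor_groups: Sequence[str],
                  tolerance: float) -> bool:
    """Return True if the share of unlike neighbors exceeds the tolerance.

    Assumption: an agent without neighbors has no reason to move.
    """
    if not neighbor_groups:
        return False
    unlike = sum(1 for group in neighbor_groups if group != own_group)
    return unlike / len(neighbor_groups) > tolerance


# Hypothetical neighborhood: three of four neighbors belong to another group.
print(wants_to_move("A", ["B", "B", "A", "B"], tolerance=0.5))
```

Applied repeatedly to every agent on a grid, a rule of this kind is what produces the macroscopic segregation pattern; the sketch deliberately contains no utility maximization and no behavioral detail.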
With regard to the research questions of the last chapter, the aim is to identify rules and mechanisms that determine the dynamic decision-making or, respectively, the dynamic action selection of virtual humans, which has an impact on the whereabouts of these agents within a microworld representing an urban event such as the one discussed later in the case study section. In this context, the focus is on those kinds of actions that lead to a change in the spatial position of the virtual event visitors. Furthermore, the scope within the DDM domain is set to naturalistic decision-making in view of the implemented actions.

Virtual Humans as Intelligent Agents
As discussed, the design of autonomous virtual humans requires an action selection mechanism in order to model dynamic decision-making. The criterion of autonomy is essential to create virtual humans as intelligent agents that live their own lives in a persistent microworld [32]. Intelligent agents can be defined as autonomous entities that perceive their internal and external environment through sensors and act in their environment through effectors [54]. Apart from sensing the environment (the outer world), the agent also senses its internal world in the form of internal states that affect intrinsic motivation. Figure 3 illustrates the basic framework for designing an action selection mechanism of virtual humans living in a persistent microworld.
Autonomy is essential for virtual humans to be unique, to pursue their own goals, to be self-motivated and to make decisions in an effective and coherent way [55]. From this perspective, it is important that the action selection mechanism takes into account the current motivations of the agent as a result of its internal states. In addition, however, it is essential to pay attention to the opportunities and demands coming from the environment as a consequence of what is currently happening around the agent [56].
The literature also contains different key criteria to fulfill the mentioned requirements and to increase the degree of autonomy. The behavior of the agent should be individual, motivational, reactive and proactive [32]. Individuality requires distinct motivations that are defined and self-generated for each agent [57]. Motivations are necessary for any cognitive system and are relevant for emotions [58,59]. Reactivity is essential as it includes opportunity-driven and demanding behavior. Proactivity is essential to let the agents start to pursue their personal objectives [60].

Modeling Dynamic Decision-Making for Virtual Humans
This chapter focuses on the framework of the implemented action selection mechanism for virtual humans living in a persistent microworld. Each of the virtual humans selects different actions over time out of a repertoire of possible actions, causing a change of their actual spatial position. The dynamic decision-making process that leads to actions is based on the activation of motivations or, in other words, the dynamic change of preferences over time (internal stimuli) and the sensor information coming from the environment (external stimuli). To model the motivational activation processes, the approach uses the methodology of System Dynamics. Although the field of System Dynamics conventionally focuses on modeling system behavior from a top-down perspective in order to map causal interdependences of aggregated stocks and flows [7,61,62], the research results so far suggest that it is promising and beneficial to apply System Dynamics inside each agent of an agent population. This approach brings about new opportunities for the System Dynamics (SD), DDM and ABM communities.

Related Research
Before going into detail about how the action selection mechanism (ASM) is implemented, some related research concepts are explained.

Busemeyer: Decision Field Theory
The decision field theory introduces a stochastics-based concept of how the preferences of a decision-maker evolve over time. The theory "provides for a mathematical foundation leading to a dynamic, stochastic theory of decision behavior in an uncertain environment" [63]. The theory is based on a psychological model that was originally named field theory and later renamed avoidance conflict model [64]. Just like the rule-based concept discussed earlier, the aim of the theory is neither to formulate a logical formula of how preferences evolve over time for an ideal decision-maker (rationalist approach), nor to determine the behavioral principles of how preferences are obeyed (behaviorist approach). Instead, the purpose of the theory is "to understand the motivational and cognitive mechanisms that guide the deliberation process involved in decisions under uncertainty" [63]. The decision-finding mechanism is based on four stages (or five, provided that the final action selection is seen as a stage of its own): subjective expected utility generation, variability of subjective probability weights (valence difference), accumulation of preference states in the deliberation process, a random walk mechanism based on expected utility plus the subjective probability weights, and finally a stopping rule controlled by a threshold. Figure 4 illustrates an example of the evolution of a decision-making process based on the decision field theory.
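The interplay of accumulation, random walk and threshold-based stopping rule can be sketched as follows. This is a strongly simplified illustration for a choice between two options, not the full formulation of the theory in [63]: the preference state is a single number, the valence difference is drawn from a normal distribution, and the function name `deliberate` as well as all parameter values are assumptions of this sketch.

```python
import random
from typing import Tuple


def deliberate(mean_valence: float, noise_sd: float, threshold: float,
               rng: random.Random, max_steps: int = 10_000) -> Tuple[str, int]:
    """Accumulate noisy valence differences until a threshold is crossed.

    Positive preference favors option "A", negative preference option "B".
    Returns the chosen option and the number of deliberation steps.
    """
    preference = 0.0
    for step in range(1, max_steps + 1):
        # Momentary valence difference: systematic part plus random fluctuation.
        preference += mean_valence + rng.gauss(0.0, noise_sd)
        if preference >= threshold:   # stopping rule for option A
            return "A", step
        if preference <= -threshold:  # stopping rule for option B
            return "B", step
    return "undecided", max_steps


choice, steps = deliberate(mean_valence=0.1, noise_sd=1.0, threshold=5.0,
                           rng=random.Random(42))
print(choice, steps)
```

In this sketch, a higher threshold leads to longer deliberation and fewer choices against the systematic valence, which mirrors the role of the threshold as the control of the stopping rule.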

De Sevin: Activation of Motivations
De Sevin provides an action selection mechanism for autonomous virtual humans that is based on the activation of motivations over time [32]. In this research, De Sevin introduces a case study in which a virtual human lives in an apartment where it is able to engage in different actions. Such actions are, for instance, eating, drinking, exercising, resting, cleaning up, and so on. Hence, the research can be classified into the domain of naturalistic decision-making. De Sevin's action selection mechanism includes concepts such as a hierarchical classifier system [66] and the free flow hierarchy [67]. Using hierarchical classifier systems allows reactive and goal-oriented behavior, because the rule base contains two different kinds of rules: external classifiers to generate actions and internal classifiers to modify the internal states of the agent. The free flow hierarchy is a concept that allows the execution of parallel actions that are compatible with one another (e.g., eating and drinking). For the activation of the different motivations, De Sevin uses a separate equation to describe the activation process of each possible motivation over time. To do so, the concept of hysteresis is used to keep part of the motivation from the previous iteration in each time step. Figure 5 gives an idea of how the action selection in De Sevin's work operates. From the graphs of the figure, it can be seen that the missing causalities between the different executed actions are one drawback of De Sevin's concept. Each activity is isolated from all other activities. This does not necessarily lead to an incorrect overall time allocation of a human living in an apartment in which the human executes different activities from time to time. However, taking a complex environment and a huge population of agents into account, the causality between different actions may be vital for realistic simulations and for understanding problematic patterns. Another drawback is the missing implementation of actions focusing on interactions among different virtual humans.
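The idea of keeping part of the motivation from the previous iteration can be illustrated with a simple update rule. The weighted-average form below and the parameter name `retention` are assumptions chosen for illustration; they are not De Sevin's actual activation equations, which are not reproduced in this article.

```python
from typing import List


def update_motivation(previous: float, stimulus: float,
                      retention: float = 0.5) -> float:
    """Blend the previous motivation with the current stimulus.

    retention = 0 ignores the past, retention = 1 ignores the present
    (assumed form, for illustration only).
    """
    if not 0.0 <= retention <= 1.0:
        raise ValueError("retention must lie between 0 and 1")
    return retention * previous + (1.0 - retention) * stimulus


# Hypothetical stimulus: present for three time steps, then gone.
motivation = 0.0
trace: List[float] = []
for stimulus in [1.0, 1.0, 1.0, 0.0, 0.0]:
    motivation = update_motivation(motivation, stimulus)
    trace.append(motivation)
print(trace)
```

The motivation neither jumps to the stimulus level nor vanishes the moment the stimulus disappears, which is the smoothing effect the retained share is meant to provide. Note that each motivation updated in this way remains independent of all others, in line with the isolation criticized above.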

Schmidt: The PECS Model
The PECS model aims to simulate human behavior in social environments [68]. The model is based on the Adam model by the same author [68]. PECS is an acronym for physical conditions, emotional state, cognitive capabilities and social status. The model is based on the assumption that these factors influence social behavior. As shown in Figure 6, the different factors can be interpreted as internal state reservoirs. For the modeling of the internal state variables, Schmidt uses transfer functions. Thus, action selection once again depends on the activation level of the internal states, so that it is always the most activated internal state that triggers the corresponding action. Schmidt's work mentions causal interdependences among the different internal states, but without explicitly linking them to specific levels of activity or motivation. Unfortunately, no documented case study shows how the decision-making mechanisms work in a simulation environment and what the results look like.

Silverman: PMF Reservoirs
Silverman et al. aim to improve the realism of socially intelligent agents and to use more suitable human performance moderator functions (PMFs) that are rooted in the behavioral literature. They hope to integrate existing PMF models that explain "physiology and stress, cognitive and emotive processes, individual differences, and group and crowd behavior" [69]. A further aim of the authors is to bring this kind of new-era psycho-socio-physiological model into the gaming industry to enhance the realism of gaming environments that are populated by human agents [70,71]. Using modern game engines, it is indeed possible to create characters of high physical realism that are geometrically accurate and move around their environment in a kinesthetically natural manner. However, even game characters that show a high level of cognitive behavior can lead to an unfulfilling and shallow game experience [72]. In other words, even modern computer games often lack realistic behavior in terms of high-level cognition and especially in terms of reasonable decision-making that leads to the actions that are visible to the users.
Figure 7 illustrates the reservoir concept based on the physiological module that is part of the agent-based decision-making architecture PMFserv. A reservoir, such as the energy store, is a form of memory that keeps track of some specific factors that influence the decision-making [69]. The rates that determine the inflows and outflows depend on other factors such as stress, injury or physical exertion. The figure shows a flow from a stomach reservoir to an energy store reservoir, but without making the rate of digestion dependent on the stomach reservoir. Given that the authors generally describe the reservoir behavior of PMFserv as linear, it seems that the authors did not incorporate any causality from a reservoir (stock) to a rate (flow), because that would lead to nonlinearity. Hence, the described concept is, so far, limited to first-order stock and flow dynamics. All in all, it seems as if the authors are not familiar with stock and flow dynamics and could benefit from the System Dynamics literature. The next chapter describes the developed concept for an action selection mechanism with nonlinear motivation evolution.
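The difference between a flow that ignores its source reservoir and a flow that depends on it can be made concrete with a small simulation. The sketch below is not taken from PMFserv: the two rate functions, the numerical values and the simple Euler integration are assumptions used only to contrast the two cases for the stomach and energy store reservoirs.

```python
from typing import Callable, List, Tuple


def simulate_digestion(stomach: float, energy: float, steps: int, dt: float,
                       rate_fn: Callable[[float], float]) -> List[Tuple[float, float]]:
    """Move content from the stomach stock to the energy stock (Euler steps)."""
    history = []
    for _ in range(steps):
        # The flow can never drain more than the stomach currently holds.
        flow = min(rate_fn(stomach), stomach / dt)
        stomach -= flow * dt
        energy += flow * dt
        history.append((stomach, energy))
    return history


# Case 1: digestion rate independent of the stomach stock (straight-line drain).
constant = simulate_digestion(10.0, 0.0, steps=5, dt=1.0, rate_fn=lambda s: 2.0)
# Case 2: digestion rate proportional to the stomach stock (stock-to-flow link).
dependent = simulate_digestion(10.0, 0.0, steps=5, dt=1.0, rate_fn=lambda s: 0.2 * s)

for (s1, _), (s2, _) in zip(constant, dependent):
    print(f"constant rate: {s1:6.3f}   stock-dependent rate: {s2:6.3f}")
```

With the constant rate, the stomach empties along a straight line and stops abruptly at zero; with the stock-dependent rate, the drain slows down as the stomach empties and the trajectory is curved. The latter kind of stock-to-flow causality is what the concept in the next chapter builds on.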

The Concept
Taking the described three layer model into account, the aim is to create a decision architecture that implements an action selection mechanism for virtual humans. The assumption is that a navigation layer and a locomotion layer exist. In other words, the concept is fully focused on high-level action selection and thus omits questions that concern navigation (e.g., which path an agent should choose) and locomotion (e.g., how an agent should use its body to move and to get somewhere). As stated, the concept is implemented based on a case study in the domain of urban event management. The explicit actions that are included in the repertoire of possible actions are the ones that mostly require a position change of the agent in an urban event environment (e.g., drinking, eating, dancing, resting, and so on).
The basic concept depicted in Figure 8 shows that each agent features a stock and flow model (SFM) that affects the agent's action selection. The building blocks for the SFM are the ones known from the methodology of System Dynamics: sources, sinks, stocks and flows, plus auxiliary variables and connectors. This set of building blocks is well suited to describing nonlinear accumulation processes. The concept of implementing a System Dynamics model inside an agent was previously proposed by Borshchev [5] and by Größler and Schieritz [73]. In the next subchapter, the concept is introduced by drawing a behaviorist analogy.

Behaviorist Analogy to the Human Physiological Homeostasis
The analogy is used to elucidate that the human physiological system is essentially subject to accumulation processes that govern human behavior [74]. It should be stressed that the aim is not to explicitly model the physiological homeostasis. Rather, it is assumed that the resulting behavior exhibits more realism if the model abstracts from the actual accumulation processes.
Accumulation processes inside a human are not limited to the processes of food or fluid intake or, respectively, the energy balance in general, as hinted in Figure 7. Apart from the neural network, accumulation dynamics are essential for dynamic decision-making in the brain, because the dynamics of neurotransmitters (e.g., adrenalin, endorphin, serotonin or adenosine) are based on accumulation processes, and these neurotransmitters are essential for the construction of mind [75,76], affecting human decision-making. However, stock and flow dynamics are of course also essential with regard to the ingestion and excretion of food, water and toxic substances (e.g., alcohol) and in terms of other internal bodily fluid exchange processes. Physiological homeostasis means the self-regulation of these accumulation processes by several effective balancing mechanisms. Homeostasis aims to maintain equilibrium states by preventing levels from exceeding certain thresholds.
With regard to the dynamics of neurotransmitters, it is possible to assign a System Dynamics building block to each essential biochemical component involved in the process [77].
Chemo-Containers as Stocks. The purpose of a chemo-container is to store a substance. The amount of the substance inside the chemo-container is changed through chemo-pipes. On the one side, a chemo-emitter increases the amount; on the other side, a chemo-receptor decreases the amount inside a chemo-container. In SD-specific methodological terms, a chemo-container is a stock whose level is determined by flows.
Chemo-Pipes as Flows. Chemo-pipes are the connectors between chemo-emitters, chemo-containers and chemo-receptors. Thus, a chemo-pipe can exist between a chemo-emitter and a chemo-container, between a chemo-container and a chemo-container or between a chemo-container and a chemo-receptor. In SD terminology, chemo-pipes are the in- and outflows that connect stocks with sources and sinks as well as stocks with stocks.
Chemo-Emitters as Sources. In biological terms, chemo-emitters generate neurotransmitters based on input signals. Hence, chemo-emitters influence how much of a substance is generated and accumulated via chemo-pipes in chemo-containers. For example, a dangerous situation leads to the release of adrenalin and other stress hormones such as cortisol, thus affecting the human decision-making process. In the scope of SD, chemo-emitters are to be seen as the sources. The initial source stock, the one from which the flow arises, lies outside the model boundaries.
Chemo-Receptors as Sinks. A chemo-receptor is able to absorb a substance of a particular type. In other words, chemo-receptors are responsible for the decrease of a substance inside a chemo-container, and they may generate signals based on the quantity of the absorbed substance. The signals can trigger other chemo-emitters and lead to the release of other neurotransmitters. In SD terminology, chemo-receptors correspond to sinks. Like a source, a sink marks the model boundary.

Activation of Motivations as an Abstraction of the Physiological Homeostasis
As mentioned before, the aim is neither to model the physiological homeostasis of the brain system nor to use the stated terminology in the decision architecture. Instead, thanks to the abstraction, the autonomous virtual humans are able to self-regulate their motivations, leading to autonomous decisions and actions. In other words, the general concept of physiological homeostasis is used to satisfy the various needs and motivations of each virtual human by abstracting the agents' self-regulating internal state variables (e.g., hunger, thirst, etc.) from homeostatic mechanisms. To do so, the building blocks of the System Dynamics methodology are used.
To give an idea of how this can be done, a generic stock and flow model with three different internal state variables is shown in Figure 9. In this model, the repertoire of internal state variables contains hunger, thirst and the need to visit the toilet. These internal states are modeled as stocks that are regulated by corresponding in- and outflows. While the inflows of hunger and thirst are causally dependent on the exhaustion factor of the agent's current activity, the inflow to the stock representing the need to visit the toilet is dependent on the outflows of the two previous stocks. The two double bars on the two connector arrows mark them as delayed causal relationships, here resulting in third-order exponential delays. The delay times are set by delay parameters. While the inflows are regulated continuously, the outflows are triggered when a certain threshold is exceeded. Then, the virtual human decides to eat, to drink or to go to the toilet. The corresponding trigger values lead to a full release of the corresponding stock. As seen in the next chapter, this approach is even suitable for giving the different virtual humans personal characteristics.
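The generic model can be illustrated with a minimal Python sketch. It is not the Anylogic implementation of the case study; the class names, the inflow coefficients, the delay times and the normalized threshold of 1.0 are assumptions chosen for illustration only. The third-order exponential delay is approximated by three cascaded first-order stages, integrated with a simple Euler step.

```python
from dataclasses import dataclass, field


@dataclass
class ThirdOrderDelay:
    """Third-order exponential delay built from three cascaded first-order stages."""
    delay_time: float  # total average delay (hypothetical value)
    stages: list = field(default_factory=lambda: [0.0, 0.0, 0.0])

    def step(self, inflow: float, dt: float) -> float:
        """Advance by dt (Euler integration) and return the delayed outflow rate."""
        stage_time = self.delay_time / 3.0
        rate_in = inflow
        for i in range(3):
            rate_out = self.stages[i] / stage_time  # first-order drain of stage i
            self.stages[i] += (rate_in - rate_out) * dt
            rate_in = rate_out  # the drain of this stage feeds the next one
        return rate_in


@dataclass
class MotivationStocks:
    """Hunger, thirst and toilet stocks of one agent (all parameters assumed)."""
    threshold: float = 1.0
    hunger: float = 0.0
    thirst: float = 0.0
    toilet: float = 0.0
    food_delay: ThirdOrderDelay = field(default_factory=lambda: ThirdOrderDelay(60.0))
    drink_delay: ThirdOrderDelay = field(default_factory=lambda: ThirdOrderDelay(30.0))

    def step(self, exertion: float, dt: float = 1.0) -> list:
        actions = []
        # Continuous inflows scale with the exertion factor of the current activity.
        self.hunger += 0.01 * exertion * dt
        self.thirst += 0.02 * exertion * dt
        eaten = drunk = 0.0
        # Threshold-triggered outflows release the full stock.
        if self.hunger >= self.threshold:
            eaten, self.hunger = self.hunger, 0.0
            actions.append("eat")
        if self.thirst >= self.threshold:
            drunk, self.thirst = self.thirst, 0.0
            actions.append("drink")
        # The released amounts reach the toilet stock through the delays.
        self.toilet += self.food_delay.step(eaten / dt, dt) * dt
        self.toilet += self.drink_delay.step(drunk / dt, dt) * dt
        if self.toilet >= self.threshold:
            self.toilet = 0.0
            actions.append("toilet")
        return actions


def simulate(steps: int, exertion: float = 1.0) -> list:
    """Run one agent and log (time step, action) pairs."""
    agent = MotivationStocks()
    log = []
    for t in range(steps):
        for action in agent.step(exertion):
            log.append((t, action))
    return log
```

Calling `simulate(600)` returns a log of eating, drinking and toilet decisions for one agent. Giving each agent its own coefficients and delay times would be one way to express the personal characteristics mentioned above.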

Modeling Internal Motivations of Event Visitors
The model above exhibits generic causality that is indisputably valid for all humans, as all humans get hungry and thirsty and, as a consequence, need to visit the toilet. If this model is extended to take other motivations into account, such as dancing or resting, it becomes more and more difficult to identify corresponding motivational causalities that can be verified for all humans. It is assumed that there is no universal causal model that is indisputably valid for all humans if a high level of abstraction is chosen (remember Thorngate's postulate of commensurate complexity from Section 2.2). However, with this approach, it would be possible to model different types of characters based on psychological or sociological reasoning, on empirical data or on other research methods. It is not the aim to reach such a high degree of complexity here, because, taking the dynamic problem into account, the aim is rather to generate usable benchmarks for an event planner and not to conduct accurate behaviorist research by incorporating highly detailed behavioral models. Thus, in the conducted case study, a model for a unified virtual human was implemented. One could argue that this virtual human is a kind of stereotypical visitor of a music festival. Later, however, it will be shown how close a simulation based on this character gets to reality, making it possible to answer some of the research questions discussed earlier.
Figure 10 shows the stock and flow model describing the activation processes of the motivations for the virtual event visitors. It includes types of motivations that, in general, lead to position changes in the urban event environment of the envisaged case study. The internal state variables that influence the motivations are the stocks in red, including hunger, thirst, the need to visit the toilet, a mood to dance, the need for rest and a tiredness stock. The first three internal states are modeled as shown and described before. In addition, it is assumed that a proportion of the consumed beverages are alcoholic drinks. Further, it is assumed that an increase in blood alcohol leads to increasing exuberance and, thus, to the desire to go dancing. The submodel describing the alcohol consumption dynamics was, in its logic, adapted from [78,79]. Moreover, it is assumed that the more an event visitor engages in physical exertion, the thirstier and hungrier the visitor will get, the higher the need for rest will be and the faster the visitor will become tired; these assumptions close the main feedback loops. Finally, when a certain threshold is exceeded in the tiredness stock, the virtual human decides to leave the event.
Figure 10. Apart from the internal states of hunger, thirst and the need to visit the toilet, the extended stock and flow model also incorporates the motivations to go dancing, to take a rest and to leave the event due to tiredness.
Besides these intrinsic motivation activation processes, the action selection mechanism also takes environmental information into account. This sensory information is provided by implemented program functions. The return values of these sensory requests are normalized and used as multipliers to modify the intrinsic motivation activation levels. Basically, the following two functions affecting the preference order are implemented.

• getDistance(): Returns the shortest air distance to the closest place where the activity can be executed.
• getQueueLength(): Returns the number of waiting people in the service line(s). This takes into account how long the agent will have to wait before the activity can be executed.
These two functions demonstrate how to consider opportunities and demands coming from the environment. For example, if the distance to the stall that sells beverages is very large, the virtual human will first conduct an action that is on a similar activation level but spatially close. Conversely, if the virtual human passes a food stand and the queue is very short, the virtual human may spontaneously decide to buy food, as the food activation level is multiplied (opportunist behavior).
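How such multipliers might reorder the preferences can be sketched as follows. The sketch assumes that the return values of getDistance() and getQueueLength() are passed in as plain numbers; the normalization to the range 0.5 to 1.5, the caps of 200 m and 20 people, and the function names are illustrative assumptions, not the values used in the case study.

```python
def distance_multiplier(distance_m: float, max_distance_m: float = 200.0) -> float:
    """Map a distance to [0.5, 1.5]; close targets are boosted (assumed scaling)."""
    clipped = min(max(distance_m, 0.0), max_distance_m)
    return 0.5 + (1.0 - clipped / max_distance_m)


def queue_multiplier(queue_length: int, max_queue: int = 20) -> float:
    """Map a queue length to [0.5, 1.5]; short queues are boosted (assumed scaling)."""
    clipped = min(max(queue_length, 0), max_queue)
    return 0.5 + (1.0 - clipped / max_queue)


def select_action(activations: dict, distances: dict, queues: dict,
                  threshold: float = 1.0):
    """Return the activity with the highest modified activation, or None
    if no modified activation reaches the threshold."""
    scored = {
        activity: level
        * distance_multiplier(distances[activity])
        * queue_multiplier(queues[activity])
        for activity, level in activations.items()
    }
    best = max(scored, key=scored.get)
    return best if scored[best] >= threshold else None
```

With these assumed scalings, an agent whose thirst activation is slightly higher than its hunger activation would still choose to eat if the food stand is right next to it and the beverage stall is far away, which corresponds to the opportunist behavior described above.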
Taking into account the activation of motivations over time, together with the sensory information that is provided, the created event visitors reach a high degree of autonomy. All four previously mentioned characteristics, which are essential to increase the degree of autonomy, are met: the agent exhibits individual, motivational, reactive and proactive behavior.

Resulting Nonlinear Motivation Evolution
The action selection mechanism is used to demonstrate how virtual humans repeatedly make decisions over time. As described in the last chapter, the developed action selection mechanism takes both aspects into account: internal stimuli and external stimuli that influence the dynamic decision-making and, thus, lead to actions that will change the current position of the virtual human in the environment. An exemplary activation evolution inside a virtual human is shown in Figure 11. The diagram serves as a preview of the results of the conducted case study that is discussed in the next chapter. Here, it serves to close the description of the nonlinear motivation evolution based on the implemented agent-based decision architecture. The upper diagram shows the dynamic precedence changes for the six different activities for a single agent that moves about in the urban event environment. If the precedence for one activity exceeds a threshold on the normalized y-axis, the agent decides to execute the corresponding action. In consequence, this leads to a drop of the motivation activation level, because the virtual human was able to satisfy the need. The initial values are randomly generated based on a normal distribution function. The eating precedence was relatively high right from the beginning, whereas the other motivations were activated at later moments in time. As the tiredness stock is the only stock modeled without an outflow, the precedence of going home increases up to a point where the virtual human decides to leave the event.

Implementation of the Case Study
To test the decision architecture or, respectively, the action selection mechanism against a real-world urban event, a case study was conducted on the Back to the Woods music festival in Garching, Germany. The event took place in July 2014 and July 2015, both times with approximately 5000 visitors. A detailed description of the event can be found in [80]. The simulation of the event was conducted with the Java-based software Anylogic. A detailed description of Anylogic can be found in [5]. Figure 12 shows Anylogic's working environment. A major iteration loop controls the agents' dynamic decision-making. Condition values determine which path in the iteration loop is taken. The virtual event visitors are generated in the first pedestrian source building block in this structure. The input rate of visitors is adjusted to the empirical values measured at the 2014 event [80]. All generated virtual humans are stored in an array list of the object type visitor. For each of the visitors, this makes it possible to access the variables and functions (in technical terms) or, respectively, the decisions and actions (in logical terms).
To incorporate interaction with the environment, use was made of the various building blocks provided by the software. The spatial model boundary for the pedestrians was defined by using wall object polygons and rectangles. The targets for the different actions are defined by using polygonal area objects for the activities dancing and resting, and there are building blocks to represent service points with queues for the activities of drink and food consumption as well as visits to the toilet.
With regard to the three-layer model, the navigation layer in Anylogic is based on the social force model (personal conversation with Vladimir Koltchanov (Anylogic Europe) at the 32nd International Conference of the System Dynamics Society, Delft, the Netherlands).

Collection of Key Parameters for Calibration and Validation Purposes
As empirical data is the key to accurate simulation, a large set of data was collected based on video observations [80]. Data from semi-open questionnaires will follow. Data regarding the internal states of the humans will be evaluated based on over 300 questionnaires. An investigation based on videos involves considerable effort (see Figure 13), but leads to accurate results since the collected data can be double-checked. The following data is inserted into the simulation as a result of the empirical data collection.

• Average duration time for a person to get served (service time) and average duration time for a person to visit the toilet.
• Length of waiting queues over time.
• People entering and leaving.
• Density assessment for different areas.

From a managerial perspective, the average duration times and accurate figures concerning the number of people arriving at and departing from the event seem to be most vital for increasing the simulation's accuracy. This assumption is based on the consideration that the critical congestion in regard to action execution depends on the average duration times. Many relevant questions are related to the maximum throughput of the various event facilities. This maximum throughput depends, firstly, on how many people can be served at the same time (number of service personnel or, respectively, number of toilets) and, secondly, on the average time it takes to serve one visitor (service time or, respectively, duration of stay). Once collected, the average duration times can be reused, provided the boundary conditions do not change too much. In the case under consideration, the number of tickets sold was known in advance, so the expected number of visitors could be estimated quite accurately. If this number is not known (if there was no previous event and if the number of sold tickets is unknown), an educated guess must be made. With regard to the arrival distribution, an S-shaped arrival pattern (logistic growth) is a good starting point.
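The relationship between service capacity, service time and arrivals can be made concrete with a short Python sketch. The logistic parameters (midpoint and steepness) and the function names are hypothetical; only the visitor count of approximately 5000 is taken from the case study.

```python
import math


def max_throughput(servers: int, service_time_min: float) -> float:
    """Maximum visitors served per minute: parallel service points / service time."""
    return servers / service_time_min


def cumulative_arrivals(t_min: float, total_visitors: float,
                        midpoint_min: float, steepness: float) -> float:
    """S-shaped (logistic) cumulative number of arrivals at time t_min."""
    return total_visitors / (1.0 + math.exp(-steepness * (t_min - midpoint_min)))


def arrival_rate(t_min: float, total_visitors: float,
                 midpoint_min: float, steepness: float) -> float:
    """Arrivals per minute, i.e., the derivative of the logistic curve."""
    n = cumulative_arrivals(t_min, total_visitors, midpoint_min, steepness)
    return steepness * n * (1.0 - n / total_visitors)


# Illustrative check with assumed parameters: a queue builds up at the
# entrance whenever the arrival rate exceeds the maximum throughput.
entrance_capacity = max_throughput(servers=4, service_time_min=0.5)  # 8 per minute
peak_rate = arrival_rate(180.0, 5000.0, midpoint_min=180.0, steepness=0.05)
queue_builds_up = peak_rate > entrance_capacity
```

The arrival rate of a logistic curve peaks at the midpoint, so comparing this peak with the maximum throughput gives a first indication of whether a facility will become a bottleneck.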

Visualization of Simulation Results
Using Anylogic, the simulation can be visualized in 2D and 3D. Figure 14 shows a visualization of one experimental run. The 2D environment shows the macro behavior of all virtual humans within the microworld from a top-down perspective. The people are added to and removed from the simulation at a target line at the end of the bottom-left street in the simulation. The pedestrians walk into the entrance area at a rate limited by the maximum possible throughput of the ticket desk and the security checks. If the arriving inflow exceeds the maximum throughput, a queue is formed. For virtual humans that have entered the area, the action selection mechanism takes control of their high-level movement targets and, thus, determines where the agents decide to go. In 2D, the visualization of the human bodies is based on top-down human images, while the 3D visualization features an example character.
The top-right corner of the figure shows the built-in 3D visualization. The 3D objects were modeled to resemble the facilities of the real-world event [81]. However, Anylogic's built-in visualization tool does not feature animated agents, which is why the 3D visualization is not able to provide a realistic live animation of the scene. To obtain better results, the trajectories and the geometry are exported from the Anylogic software and imported into a 3D post-visualization software based on the Unity game engine. Figure 15 features screenshots from the 3D post-visualization. While the picture on the left shows results from the software SumoViz [82], the right picture includes the 3D objects of the event facilities, as it is a screenshot from the newer software derivative named PedViz [81]. As the small inset picture shows, this post-visualized animation can be compared to the videos from the real-world event.

Analysis of Simulation Results
Apart from assessing qualitative factors, a quantitative data analysis was conducted as well. For this purpose, two different views were created: one named pedestrian view and the other named management view.

The Pedestrian Perspective
The pedestrian view (Figure 16) provides various sets of information about the virtual humans' individual and collective activity preferences. Based on this view, it is possible both to assess how the internal states of each individual are changing and to monitor the changes of the internal states on an aggregate level. Furthermore, the decision control loop provides information about how often the different activities were performed collectively. The upper left chart in the figure shows the internal states for each virtual human and, on an aggregate level, the mean and the variance values of all virtual humans. In the bottom half of the figure, the small diagrams show the probability density functions (PDF) and the cumulative distribution functions (CDF) of the activation of the different motivations. These charts consider the whole agent population at the current time of the simulation. The plots in columns one and three show how often the different activation levels occur within the agent population (PDFs as bar diagrams and CDFs as red line diagrams; x-axis: level of motivation; y-axis: count in the population). The other diagrams in green (columns two and four) show 2D histogram envelopes capturing how the distribution of the activation levels behaves over time (x-axis: time; y-axis: distribution). The structure at the right shows the major iteration loop, providing information about how often each action was performed and, in even more detail, how often each event facility was visited.
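The population statistics behind such charts can be approximated in a few lines of Python. This is only a sketch of the kind of aggregation described above, assuming activation levels normalized to the range 0 to 1 and ten bins; it does not reproduce Anylogic's own statistics objects, and the function names are hypothetical.

```python
from itertools import accumulate
from statistics import mean, pvariance


def activation_histogram(levels, bins: int = 10) -> list:
    """Count how many agents fall into each activation bin (bar diagram)."""
    counts = [0] * bins
    for level in levels:
        clipped = min(max(level, 0.0), 1.0)
        index = min(int(clipped * bins), bins - 1)  # level 1.0 goes into the last bin
        counts[index] += 1
    return counts


def cumulative_counts(counts) -> list:
    """Running total of the bin counts (cumulative line diagram)."""
    return list(accumulate(counts))


def population_summary(levels) -> tuple:
    """Mean and population variance of one motivation across all agents."""
    return mean(levels), pvariance(levels)
```

Collecting one histogram per simulation time step and stacking them over time would yield the kind of data shown in the 2D histogram envelopes.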

The Management Perspective
The management view (Figure 17) provides more aggregated, practice-related managerial information. The four different diagrams capture information concerning the queue length, the visitor count, the accumulated frequenting and the expected revenue over time. The upper left chart shows the dynamic change in the length of the waiting queues in front of the different event facilities over time (occupancy information). The lower right chart illustrates how often the different event facilities were visited over time (throughput information). The upper right chart depicts how many visitors have arrived at and departed from the event, plus the number of visitors currently present at the event. Finally, the bottom left chart provides information on the expected revenue, depending on how often the different event facilities were frequented as well as on the expected average shopping baskets.

Empirical Data Comparison
For validation purposes, it is essential to assess the simulation results and to compare them with data from the real world. There are many possibilities to assess the results. Apart from comparing quantitative values, it is also possible to qualitatively estimate whether the structure and behavior of the simulation make sense. Since the reasons for designing the action selection mechanism the way it was built have already been explained, this subchapter addresses a quantitative empirical data comparison. As stated before, more than 300 interviews with semi-open questionnaires were conducted in addition to the video recordings, aiming to collect real-world information about the internal states of the event visitors (hunger, thirst, comfort, etc.) as well as some other information. Analyzing this data is expected to help enhance the accuracy of the simulation results and will be documented in a follow-up publication.
As avoiding long waiting queues is essential for the event visitors to feel comfortable, an empirical comparison based on this issue is conducted. Data from monitoring the real-world waiting queues is compared to the simulation results. Figure 18 shows the length of the waiting queues measured in the real world. Figure 19 shows the length of the waiting queues from the simulation. If the graphs of the real-world data are compared to the graphs of the simulated data, it can be stated that, basically, the shape and the characteristics are very much alike. Even the peak values match quite well. Therefore, it is assumed that the incorporated decision architecture is able to generate useful benchmarks that are suitable for planners and managers of urban events.
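The comparison reported here is based on the visual agreement of the graphs. If a numerical measure were desired in addition, one possible option (an assumption of this sketch, not a method used in the study) would be to compute the root-mean-square error and the difference of the peak values between the observed and the simulated queue lengths, sampled at the same points in time.

```python
import math


def rmse(observed, simulated) -> float:
    """Root-mean-square error between two equally sampled queue-length series."""
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    squared = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    return math.sqrt(squared / len(observed))


def peak_difference(observed, simulated) -> float:
    """Simulated peak queue length minus observed peak queue length."""
    return max(simulated) - max(observed)
```

Because the simulation is not deterministic, such measures would have to be averaged over several simulation runs before drawing conclusions.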

Summary and Conclusion
In this article, dynamic decision-making was conceptualized as an action selection problem in the scope of artificial intelligence research. For autonomous virtual agents such as robots or virtual humans, the question "What to do next?" was addressed. The three-layer model, which distinguishes action selection from navigation and locomotion, served to flesh out computerized microworlds, making it possible to conduct risk-free experiments with agent-based modeling to examine how people repeatedly make decisions in a complex environment that changes over time. Different approaches towards modeling virtual humans were discussed, such as the rationalist, the behaviorist and the rule-based approach. The basic design of the action selection mechanism was introduced, drawing on literature about intelligent agents. As a result, the aim was to treat action selection as being governed both by internal stimuli depending on the internal states of the virtual humans and by external stimuli in the form of sensory information coming from the environment, in order to model individual, motivational, reactive and proactive behavior within the individual agents, desirably resulting in high autonomy. Before the developed action selection mechanism was documented, related research by Busemeyer, De Sevin, Schmidt and Silverman was discussed.
The concept was introduced with a behaviorist analogy to the human physiological homeostasis. The assumption was that a model that abstracts from these homeostatic processes will exhibit more realistic behavior for virtual humans. The developed ASM, classified as a naturalist decision-making ASM in the scope of low-stake decisions, focuses on the activation of motivations over time, leading to self-motivated autonomous decision-making that is influenced by multiplier functions that accumulate information from the environment and thus result in nonlinear motivation evolution. Finally, a case study was documented, showing that the concept can be used in the domain of urban event management. This final part addresses the architecture of the Anylogic simulation, the collection of key parameters for the purpose of calibration and validation, the built-in 2D/3D feature and a 3D post-visualization, plus the analysis of the simulation results. On the one hand, the latter analysis part includes a pedestrian view to show statistics on an individual and on an aggregate level with respect to the action selection processes of the virtual event visitors. On the other hand, it includes a management view to collect more practically relevant data for a hypothetically involved planner or manager. Finally, a comparison between simulation data and data collected in the real world was documented, based on the length of the waiting queues.
In conclusion, the developed approach turned out to be quite promising and encouraging. It is an exciting approach to use the concept of System Dynamics to model the decision-making of virtual humans that live their lives inside a microworld. Hopefully, the documented case study will lead to many different laboratory experiments to evaluate the dynamic decision-making of individuals that act and interact inside complex environments. The developed approach is suitable for abstracting from homeostatic processes to self-regulating mechanisms of virtual humans. The approach covers causalities in decision-making, even leading to feedback loops within the decision architecture. Understanding the mechanisms behind human decisions, determined by micro-motives that influence action selection, is the key to understanding how and when macroscopic phenomena will arise. According to [83], the highest leverage point to intervene in a system is the mindset of the individuals or, in other words, their decision architectures that lead to actions. From this point of view, it might be envisaged to develop policies that change the feedback logic based on the mindsets and, thus, to conduct research on such effects by conducting experiments, but that is another matter.
This article was inspired by various sources such as dynamic decision-making literature, literature about artificial intelligence with a special focus on the action selection problem, as well as literature from the areas of System Dynamics and agent-based simulations. In the future, it could be useful to remove cross-community barriers and to create a set of shareable resources. The question of how people make decisions in complex environments and how the environment responds is quite interdisciplinary. Ultimately, however, modeling the decision-making of virtual humans probably remains more an art than a science.
Limitations in the decision architecture are at the same time aspects to be considered in future implementation attempts. On the one hand, it is planned to extend the sensors the virtual humans use to gather information from the environment in such a way that they will first explore the partially unknown environment and then assemble knowledge about discovered places in an agent's knowledge base. On the other hand, more attention will be paid to group interactions. It is planned to incorporate relationships between agents to enhance the realism of individual behavior. All in all, the approach seems to have high potential and, judging from the research done so far, more research in this direction is considered to be worthwhile.

Figure 1. Mapping the real world to a microworld makes it possible to investigate dynamic decision-making on a risk-free basis [5].

Figure 2. Feedback loop in agent-based modeling (ABM). Microscopic motives of agents generate macroscopic patterns within the environment. These macroscopic patterns then feed back on the micro-motives.

Figure 3. The action selection mechanism is responsible for choosing an appropriate action at each moment in time. Internal stimuli are triggered by internal states in the form of agent-intrinsic memory reservoirs, and the external stimuli result from the external sensor information the agents receive from the simulated environment. The executed action feeds back on both the internal states and the environment in which the agent is living.

Figure 4. Each trajectory in the figure shows the preference for one risky prospect that can be chosen by the decision-maker. The shape of the graphs elucidates the random walk generation as part of the theory. (Source: [65])

Figure 5. Each graph in the figure shows the evolution of one motivation or a set of motivations over time. Hysteresis lets the activation levels increase over time and internal classifiers reset them after external classifiers led to an action execution. (Source: [32])

Figure 6. Schmidt defines a set of different states that affect decision-making. From the perspective of the System Dynamics methodology, the different internal states can be interpreted as reservoirs that are filled up and emptied over time. The mention of different causal interdependences among the different internal states can be interpreted to mean that flows may exist between different reservoirs.

Figure 7. The picture shows the physiology module of the PMFserv decision architecture (PMF: performance moderator function). The authors describe the aim of the reservoirs as provoking alarms when a certain threshold is exceeded (e.g., hunger or fatigue). (Source: [69,71])

Figure 8. A stock and flow model operates in each instance of the agent population.

Figure 9. A stock and flow model aiming to generically describe self-regulation in view of the motivations eating, drinking and visiting the toilet.

Figure 11. The chart illustrates the action selection mechanism in action. The six different motivations vary over time as a result of, firstly, the stock and flow dynamics that change the internal states (internal stimuli) and, secondly, the multipliers processing the information from the environment (external stimuli).

Figure 12. The working environment of Anylogic. On the left side, the projects view is folded out. The graphical representation of the simulation environment with all the different simulation elements can be found in the middle. On the right side, the properties view shows the details for the selected element.

The Architecture of the Simulation
The simulation is organized within two major classes:
1. The visitor agent class contains the action selection mechanism, which generates the internal stimuli. This class comprises (a) the System Dynamics building blocks; (b) the variables, parameters and condition values; (c) the functions to organize the action selection mechanism; (d) the statechart elements defining locomotion states, and a few more.
2. The main class contains the major iteration loop that governs the decision-making, the functions generating the external stimuli multiplier values and several functions to export the simulation results, plus all elements of (a) the simulation environment; (b) the built-in visualization; (c) the pedestrian library building blocks; (d) the statistic objects and data analysis charts; (e) the data set collection and variables; (f) the event handlers; (g) the 3D objects in the *.x3d format as the result of a student project [81], and others.

Figure 13. An example of a scene from the video recordings. Collecting data like this is time-consuming, but it leads to accurate and valuable results.

Figure 14. The simulation view allows a 2D and a 3D visualization of the experimental simulation run.

Figure 15. Post-visualization of the simulation data with the software SumoViz (left) and PedViz (right) based on the Unity game engine.

Figure 16. The pedestrian view provides various sets of information about the virtual humans on an individual and on an aggregate level.

Figure 17. The management view provides several more practice-related sets of information.

Figure 18. Empirical data on the waiting queue length at the 2014 event.

Figure 19. The queue length from the simulation. As the simulation is not deterministic, the pattern changes slightly from simulation run to simulation run.