Measuring Dynamics in Evacuation Behaviour with Deep Learning

Bounded rationality is a crucial component of human behaviour. It plays a key role in the typical collective behaviour of evacuation, in which heterogeneous information can lead to deviations from optimal choices. In this study, we propose a deep learning framework to extract a key dynamical parameter that drives crowd evacuation behaviour in a cellular automaton (CA) model. On simulated data sets from a replicator dynamics CA model, trained deep convolutional neural networks (CNNs) can accurately predict the dynamics from multiple frames of images. The dynamical parameter can be regarded as a factor describing the optimality of path-choosing decisions in evacuation behaviour. In addition, the performance of this method is robust to incomplete images: the information loss caused by cutting images does not hinder the feasibility of the method. Moreover, this framework provides a platform to quantitatively measure the optimal strategy in evacuation, and the approach can be extended to other well-designed crowd behaviour experiments.


Introduction
As one of the collective behaviours under extreme conditions, crowd congestion in emergencies is routinely related to disasters such as clogging and stampedes [1][2][3]. It is therefore important to investigate collective patterns and individual behaviours in such cases. Relevant research could also help bridge the gap between individual decisions and collective behaviours under extreme conditions. To this end, many researchers have investigated collective behaviours in simulations and experiments for decades [1][2][3][4][5]. In the evacuation scenario, irrational behaviours are inevitable in decision-making [6][7][8], and diverse behaviour patterns emerge [9]. In principle, "rationality" describes an optimal strategy that brings a maximum payoff at both the individual and the collective level in game theory [10,11], which can be quantified by, e.g., minimizing escape time in evacuation behaviour [8,12]. In the rest of the paper, we measure the optimality of exit decisions related to escape time in evacuation.
Without sufficient information from the environment or enough capacity in a closed space, the optimality of the path-choosing decisions made by each individual also depends on others' decisions, which can lead to deviations from optimal strategies or, more generally, introduce heterogeneous decision-makers [13][14][15][16][17][18]. Here, the deviation from the optimum is closely related to strategies of processing information [11,19,20,21], which is regarded as one possible origin of bounded rationality [10]. In our study, we specify one of the crucial factors as the processing of heterogeneous information [12,22,23,24,25], which includes environment-related attributes and the dynamics of surrounding pedestrians [6,26,27]. To combine them in a concrete case, we established a simulation model that describes typical evacuation behaviour in a closed space [12]. Many works have applied macroscopic or microscopic models to study evacuation behaviours [2,9,28,29,30]. For evaluating individual strategies, however, micro-models such as the social force model, the CA model, and the magnetic field force model [31][32][33] give a more accurate description of individual behaviours [34][35][36][37][38]. Thus, to measure the deviation from optimal decisions, we build a deep learning framework based on the CA evacuation simulation, in which the optimality of exit decisions is quantified as a dynamical parameter of our CA simulations [12]. More details are given in Section 2.1.
Over the past few years, the development of sensor technology and the improvement of microchip computing power have yielded remarkable results in diverse fields. This makes it feasible to collect abundant data on evacuation behaviours and to process them with state-of-the-art machine learning methods [25,39,40,41,42,43]. Deep learning (DL), a branch of artificial intelligence (AI), efficiently integrates statistical and inference algorithms and thus offers opportunities to uncover hidden structures of evolution in complex data and to describe them with a finite number of dynamical parameters. Therefore, combining DL algorithms with spatio-temporal models of evacuation based on bounded rationality is a promising option. Existing research mainly focuses on applying DL to design evacuation strategies based on data [44][45][46]. The other potential application is to train deep neural networks (DNNs) on simulated data sets and transfer them to real data sets to evaluate realistic situations or recognize hidden signals, an approach that has been verified in both physics and epidemiology [47][48][49]. Based on this methodology, as Figure 1 shows, we first introduce DL into the evacuation model to measure the optimality of decisions in such extreme scenes.
Figure 1. Flowchart of learning dynamical parameters in evacuation models and estimating their values in other cases. The left panel shows evacuation maps simulated from a CA model, which are the inputs of a neural network. The middle panel represents a generic neural network model, which is specified as a deep CNN model in our case. The right panel contains two parts: the upper one shows testing maps, which come from simulations but can be conveniently extended to real observed images, and the bottom icons indicate the key dynamical information that the well-trained neural network can predict. Here it is a rational factor, but it can be generalized to more dynamical parameters.
In this paper, we use replicator dynamics to simulate the evacuation, which combines bounded rational behaviour and rational decision-making [11,12,50,51]. Adopting the simulation model proposed in Ref. [12], we deploy a DNN model to extract the dynamical parameters that determine individual behaviour in a CA model of evacuation. To train deep convolutional neural networks (CNNs), we prepare data sets with various dynamical factors from multi-frame images generated by the CA model; specifically, the deep CNN is trained on images cut from the whole evolution process. In addition, the framework is evaluated on four different CNN models and further examined on a data set consisting of incomplete images cut from the original images.

Cellular Automaton Modeling Evacuation with Bounded Rationality
A cellular automaton model was proposed in Ref. [12] for simulating pedestrian flow with bounded rationality in a two-dimensional system. The underlying structure is an L × L cell grid, where L is the system size. Each cell is either empty, occupied by exactly one pedestrian, or part of a wall. The Moore neighbourhood is adopted, and pedestrians update their positions according to transition matrices P(i, t), where P_{m,n}(i, t) is the probability that pedestrian i, located at (x(i, t), y(i, t)) at time t, moves to the neighbouring position in direction (m, n) at the next time step. The directions are labelled by (m, n), where m, n = 1, 2, 3 are the row and column indices of the 9 directions. Thus, (1, 1) is the upper-left direction, (1, 2) is upper, (1, 3) is upper right, (2, 2) is the centre, and the others are defined in a similar fashion. At each time step, a pedestrian can either move to a new location or stay put. Once the location of the exit is chosen, the synchronously updated cellular automaton simulates the escape process [32,52].
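The direction labelling can be made concrete with a small lookup. The following is a minimal sketch, not the authors' code; the function name `direction_offset` and the convention that row indices grow downward (so that "upper" means a row offset of −1) are assumptions made for illustration.

```python
# Map the direction labels (m, n), with m, n = 1, 2, 3, to grid offsets.
# Assumption: rows grow downward, so the "upper" directions have d_row = -1.

def direction_offset(m, n):
    """Return (d_row, d_col) for the Moore-neighbourhood label (m, n)."""
    if m not in (1, 2, 3) or n not in (1, 2, 3):
        raise ValueError("m and n must each be 1, 2 or 3")
    return (m - 2, n - 2)

# All 9 directions, including the centre (2, 2), which means "stay put".
MOORE_OFFSETS = {(m, n): direction_offset(m, n)
                 for m in (1, 2, 3) for n in (1, 2, 3)}
```

With this convention, (1, 1) maps to the upper-left neighbour and (2, 2) maps to a zero offset.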
The escape rules of the model are as follows:
1. Set the position of the exit (x, y) and generate a population of N(t) individuals on the L × L lattice.
2. At time t = 0, the disaster occurs and individuals begin to move. At time step t, individual i chooses its position for time step t + 1 according to the matrix P(i, t).
3. Update all individuals synchronously; conflicts are handled by comparing transition probabilities.
4. Handle conflicts. A conflict occurs when two or more individuals want to move into the same position. It is resolved by comparing their transition probabilities P_{m,n}(i, t), which reflect their willingness to move. For example, if individuals j and k both want to move into position (x, y), with probabilities P_{m,n}(j, t) and P_{m',n'}(k, t) respectively, then j moves and k stays where it is if P_{m,n}(j, t) > P_{m',n'}(k, t), and vice versa. If the two are equal, one individual is selected at random. The rule extends directly to conflicts among more than two individuals.
5. Individuals whose destination at the next time step is an exit escape successfully and are removed from the space, reducing the population as N(t + 1) = N(t) − 1.
6. If N(t) = 0, all individuals have exited and the evolution stops; otherwise, update the transition matrices according to the above strategies.
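The conflict-handling rule can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the simulation code of Ref. [12]: the function name `resolve_conflicts` and the representation of proposals as a dictionary are hypothetical.

```python
import random

def resolve_conflicts(proposals, rng=random):
    """Pick one winner per contested cell.

    proposals: dict mapping pedestrian id -> (target_cell, probability),
               where probability is the transition probability of the
               chosen move (hypothetical data layout).
    Returns a dict mapping target_cell -> id of the pedestrian who moves.
    Pedestrians who lose a conflict simply stay where they are.
    """
    claims_by_cell = {}
    for pid, (cell, prob) in proposals.items():
        claims_by_cell.setdefault(cell, []).append((prob, pid))

    winners = {}
    for cell, claims in claims_by_cell.items():
        best = max(prob for prob, _ in claims)
        tied = [pid for prob, pid in claims if prob == best]
        # Equal probabilities: one claimant is selected at random.
        winners[cell] = rng.choice(tied)
    return winners
```

Because claims are grouped by target cell, the same code covers conflicts among any number of pedestrians.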
The extreme situation of escaping from a disaster constrains people's behaviour: only intuition or social habits remain, and there is no long-term trade-off. Replicator dynamics modelling [50,51] links the different behaviours, whether practical or spiritual, during an escape process. It recasts the transition probability P(i, t) in terms of R(i, t) and B(i, t), the weights from the rational and the bounded rational part, respectively. They differ between individuals i and between time steps t, so both matrices are updated as the system evolves. The components R_{m,n}(i, t) of the rational part are defined through an occupancy term O_{m,n}(i, t) and an exit term E_{m,n}(i, t). If the position (m, n) around individual i at time t is empty, O_{m,n}(i, t) = 1; otherwise it takes a different value. E_{m,n}(i, t) = α holds only when (m, n) indicates the exit direction. Each individual has a location relative to the exit, and that location is assigned to the one of the 9 directions mentioned before that has the smallest azimuth to the exit. The other directions take a minimum value, the smallest that the calculation accuracy can reach. The parameter α represents the attraction of the exit to persons who want to escape, or the importance of the information on the exit position in the model. As shown in Ref. [12], increasing α decreases the escape time, which eventually saturates, at both the individual and the system level. This indicates that α is a potential indicator of optimality in evacuation behaviour. Thus, we name α the rational parameter of the CA model. In our case, measuring the optimality of path-choosing decisions in crowd behaviour amounts to extracting the corresponding rational parameter α. The definition of the bounded rational part B_{m,n} relies on dynamic information from the other individuals, which leads to deviations from the optimum.
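The assignment of the exit to one of the labelled directions can be illustrated as follows. This is a sketch under stated assumptions rather than the authors' implementation: the function name `exit_direction` is hypothetical, rows are assumed to grow downward, and only the 8 non-centre directions are compared, with the centre (2, 2) returned when the individual already stands on the exit cell.

```python
import math

def exit_direction(pos, exit_pos):
    """Return the label (m, n) whose direction has the smallest angle
    to the line from pos to exit_pos. Positions are (row, col) pairs."""
    d_row, d_col = exit_pos[0] - pos[0], exit_pos[1] - pos[1]
    if d_row == 0 and d_col == 0:
        return (2, 2)  # assumption: already at the exit cell
    target = math.atan2(d_row, d_col)

    best_label, best_gap = None, None
    for m in (1, 2, 3):
        for n in (1, 2, 3):
            if (m, n) == (2, 2):
                continue
            angle = math.atan2(m - 2, n - 2)
            diff = target - angle
            # Wrap the difference into (-pi, pi] before taking its size.
            gap = abs(math.atan2(math.sin(diff), math.cos(diff)))
            if best_gap is None or gap < best_gap:
                best_label, best_gap = (m, n), gap
    return best_label
```

In the model described above, the returned direction is the one whose exit term equals α.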
Transport theory suggests that escape dynamics requires information on the distribution of persons' positions and velocities, the basic variables of the theory. Since the full information cannot easily be observed by individuals, the mean-field approximation (MFA) can provide a global perception for people on the move. This gives B_{m,n}(i, t) = 1 for rational choices and B_{m,n}(i, t) = n_{m,n}(i, t) for choices influenced by the crowd. Under the rational rule, the transition probabilities are decided only by R(i, t), which contains the neighbours' states and the direction of the exit, or other objective features of the environment. Under the crowd rule, n_{m,n}(i, t) is the proportion of individuals in the (m, n) orientation, as a mean-field approximation, so people are attracted to the direction with higher density. We use it to mimic "crowd" behaviour for individuals, which also means that people can potentially obtain more information on the population density. The crowd effect induced by the population affects human behaviour indirectly, since people can gather and process information from the environment [22,25]. In this work, the distribution is discrete and the individual processes it as a background, which is what the above definition means. People's perception of the distribution is reduced to its average value in a certain direction, a mean background field, as is done for many-body systems in statistical physics.
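As an illustration of the crowd field, the proportion of individuals per orientation might be computed as below. This is only a sketch: the text does not specify how other pedestrians are binned into orientations, so the 45-degree sectors centred on each direction, the function names, and the downward-growing row convention are all assumptions.

```python
import math

# Labels (m, n) ordered by angle, starting at "right" in 45-degree steps.
# Assumption: rows grow downward, so (3, 3) is the lower-right direction.
LABELS = [(2, 3), (3, 3), (3, 2), (3, 1), (2, 1), (1, 1), (1, 2), (1, 3)]

def orientation(pos, other):
    """Assign another pedestrian to the nearest of the 8 directions."""
    angle = math.atan2(other[0] - pos[0], other[1] - pos[1])
    return LABELS[round(angle / (math.pi / 4)) % 8]

def crowd_field(pos, others):
    """Proportion of the other pedestrians found in each orientation."""
    counts = {label: 0 for label in LABELS}
    for other in others:
        if other != pos:
            counts[orientation(pos, other)] += 1
    total = sum(counts.values())
    if total == 0:
        return {label: 0.0 for label in LABELS}
    return {label: c / total for label, c in counts.items()}
```

The centre direction (2, 2) is left out of this sketch because no orientation can be assigned to it.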

Data-Set Generation and Network Capacity
The data sets prepared for training the neural networks come from the CA model and include 50,000 images in total. There are 5 different initial populations ρ_0 ranging from 0.1 to 0.5, with 10,000 images generated for each. For each initial population, there are 100 different values of the rational parameter α ∈ (0, 5) and, for each value, 100 frames of evolution at time steps T ∈ [1, 100]. Each image is one snapshot of the evacuation process on a square of side length 24, so each image has 576 pixels. Each pixel is either 0 or 1, where 0 represents empty space and 1 represents an individual present at that spot.
The main architecture of the CNNs used in this study is shown in Figure 2. Images generated from the CA model are fed to the input layer, a Conv2D layer follows the input layer, and a MaxPooling layer is used to coarsen the features extracted by the convolution. The second Conv2D layer can be expanded to more convolution layers, whose performance is demonstrated in Section 3. The fully connected layers before the output layer process the signals from the preceding convolution layers. A Dropout module and L2 regularization are deployed to alleviate possible over-fitting. To prepare inputs for the above CNN model, we select 10,000 as a standard batch size of samples, in which 2000 samples come from each of the 5 initial population panels and are mixed into one training data set. For each group of 2000, we label every frame with its rational parameter α (100 values in all) and prepare 20 groups from different frame selections. In other words, we prepare training data sets with different numbers of consecutive frames, which helps us evaluate how well the CNNs extract dynamical information from the collective behaviour. Starting with frame No. 36, as shown in Figure 3, we cut one frame as the first channel of the 2000 samples, and then cut different numbers of frames (ranging from 1 to 32) after the first frame to form the various channels of the image inputs.
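The counts quoted above can be checked with a few lines of arithmetic. The variable names are illustrative, and the evenly spaced density values 0.1, 0.2, ..., 0.5 are an assumption (the text only gives the range and the number of values).

```python
# Bookkeeping for the simulated data set described in the text.
side = 24                                # side length of each square image
densities = [0.1, 0.2, 0.3, 0.4, 0.5]    # assumed evenly spaced rho_0 values
n_alpha = 100                            # values of the rational parameter
n_frames = 100                           # time steps T = 1, ..., 100

images_per_density = n_alpha * n_frames
total_images = len(densities) * images_per_density
pixels_per_image = side * side

print(images_per_density, total_images, pixels_per_image)
```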
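Stacking consecutive frames into the channels of one input sample might look as follows. This is a sketch with plain nested lists rather than the authors' pipeline; the function name `stack_frames`, the channel-last layout, and the mapping of "frame No. 36" to list index 35 are assumptions.

```python
def stack_frames(frames, start, n_channels):
    """Stack n_channels consecutive frames into one channel-last sample.

    frames: list of square images, each a list of rows of 0/1 pixels.
    start:  0-based index of the first frame in the window.
    Returns a nested list indexed as sample[row][col][channel].
    """
    window = frames[start:start + n_channels]
    if len(window) < n_channels:
        raise ValueError("not enough frames after the start index")
    side = len(window[0])
    return [[[frame[r][c] for frame in window]
             for c in range(side)]
            for r in range(side)]

# Usage with dummy data: 100 frames of 24 x 24 pixels, frame t filled with t % 2.
dummy = [[[t % 2] * 24 for _ in range(24)] for t in range(100)]
sample = stack_frames(dummy, start=35, n_channels=8)  # frames No. 36 to 43
```

The resulting sample has one channel per frame, so an 8-frame window on 24 by 24 images has shape (24, 24, 8).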

Validating CNN Models
To find a relatively optimal CNN model for learning rational parameters from the training data sets, we first examined the performance of CNN models with different numbers of convolution operations. In this examination, we set eight consecutive frames as eight channels per sample and tested CNN models containing 1, 2, 3 and 4 convolution layers. The performances are shown in Figure 4, in which the training and validation losses (mean square error, MSE) decrease with training. In Figure 4, the simplest CNN model shows distinct over-fitting after the first five epochs, which is understandable, as a relatively concise model tends to over-fit on a large data set. Although the models with three and four convolution layers reach training losses as small as those of the other models, their validation losses are highly unstable, which could be interpreted as a lack of training data for models of this depth. The CNN model with two convolution layers is comparatively superior because of its stable performance on both the training and the validation data sets. Thus, in what follows, we choose the 2-layer CNN model visualized in Figure 2 for further investigation.

Extracting the Dynamical Parameter via Deep Learning
In Figure 5, we show the testing performance of the CNN model on different numbers of consecutive frames. The MSE and R^2 = 1 − SS_res/SS_tot are chosen to evaluate the results learnt from different numbers of consecutive frames, where SS_res is the sum of squares of residuals between predictions and ground truths and SS_tot is the total sum of squares in the testing data set, which is proportional to the variance of the data. As the number of frames increases from 1 to 32, the predicted rational parameter α_p approaches the ground truth α. As a relatively ideal result, eight consecutive frames achieve a testing loss of 0.062 and an R-squared value of 0.9771. Although increasing the frame number does raise overall accuracy marginally, the amount of data (that is, time in real-life applications) that must be analysed grows disproportionately relative to the gain in accuracy. Selecting as few frames as possible is therefore more realistic for generalizing our framework to real-life applications that must react quickly.
To analyse the sensitivity of the CNN model in extracting the rational parameter α under diverse population densities, we tested the CNN model on data sets with five initial populations ρ_0. For this purpose, we prepare 2000 images for each ρ_0 using the method introduced previously, but feed the images from each ρ_0 value into the model separately rather than mixed together. The five trained models are tested and the results are shown in Table 1: for all ρ_0 values examined, the R-squared of the evaluations is above 0.98, while a 2000-image mixed model gives 0.95.
We now turn to a more realistic scenario, in which evacuation information is partially missing. In reality, observations of collective behaviour are routinely noisy and/or incomplete.
How to process and understand hidden behaviour patterns is an urgent topic in, e.g., stewardship [53] and social networks [54]. Here, the CNN is trained on parts of the images prepared before, which means that the side length and position of the input images are varied instead of the number of frames. In short, we select a square area with a given side length from the 24 by 24 images we generated and fix its position by specifying the coordinates of its upper-left corner on the original image. Using the same 10,000 images and the 2-layer CNN, the input images are set with side lengths from 8 to 24 (images with side lengths below 8 contain too little information to train a 2-layer CNN model). In Figure 6, the results show that the longer the side length, the more accurate the prediction, which is consistent with the completeness of the information. With regard to the position of the cut images, the top-left corner and the right centre (where the exit is) are tested. The comparison across side lengths shows the clear advantage of including information at the exit. In addition, when monitoring the exit, a 12 by 12 image section achieves an accurate prediction of the entire situation with MSE = 0.094 and R^2 = 0.982, which is close to the performance of training on complete images.
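For reference, the two test scores used in this section, the MSE and R^2 = 1 − SS_res/SS_tot, can be computed as in the following sketch. The function names and the sample numbers are illustrative and are not taken from the paper's results.

```python
def mse(y_true, y_pred):
    """Mean square error between ground truths and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination, R^2 = 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Illustrative values only (not results from the paper).
alpha_true = [1.0, 2.0, 3.0, 4.0]
alpha_pred = [1.1, 1.9, 3.2, 3.8]
print(mse(alpha_true, alpha_pred), r_squared(alpha_true, alpha_pred))
```

Because SS_tot is proportional to the variance of the ground truths, R^2 rescales the squared error by the spread of the data, which is why the two scores need not rank different test sets in the same order.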
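Cutting a square section out of a full frame, given its upper-left corner and side length, can be sketched as follows. The function name `crop` and the example coordinates for a section at the right centre are illustrative assumptions.

```python
def crop(image, top, left, side):
    """Return the side x side section of a square image whose
    upper-left corner sits at (top, left) on the original image."""
    n = len(image)
    if top < 0 or left < 0 or top + side > n or left + side > n:
        raise ValueError("section does not fit inside the image")
    return [row[left:left + side] for row in image[top:top + side]]

# Usage: a 12 x 12 section at the right centre of a 24 x 24 frame.
# Pixel values here just encode their own position for easy checking.
frame = [[r * 24 + c for c in range(24)] for r in range(24)]
section = crop(frame, top=6, left=12, side=12)
```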

Robustness Examinations
In addition to the dynamical factor α, other factors can also affect evacuations, such as the initial population density and the number of exits. In Figure 7, we validate our approach on these two cases. In Figure 7a, we use the same deep CNN model to learn the initial population densities from a series of intermediate states given as (24, 24, 8) images. With the same size of data set, the testing performance is MSE = 6.73 × 10^{-4} and R^2 = 0.906. This is an acceptable performance, but not as good as for the prediction of α. This is understandable, since α is the only dynamical parameter of the simulations and is more important for the intermediate processes than the initial condition. For the double-exit case, we set two symmetric exits on both sides of the exit location of the single-exit case, each with the same width as the single exit. Under the same CNN model and size of data set, the testing performance is MSE = 0.053 and R^2 = 0.973, which is comparable to the single-exit case. The predicted α and the ground truth are plotted in Figure 7b and are consistent with each other.

Measuring Deviations from the Optimal Decision
With a well-trained CNN model, we can predict the rational parameter α, which reflects the importance of the exit information to individuals in such evacuation behaviour. The crowd rule was introduced to characterize bounded rational behaviour [12], and the resulting deviation from the optimum is measured in our framework. Following the same procedure as in Section 2 to prepare data sets, we generated 10,000 images under the crowd rule. In a transfer learning manner, the CNN model trained on a data set with the optimal strategy is transferred to predict the rational parameters on the data set with the crowd rule. As Table 2 shows, the predictions of the rational parameter for different initial population densities reveal a distinct deviation δα from the baseline α = 2.475. The effect of the crowd rule on rational behaviour is to reduce the influence of the exit information or, equivalently, to strengthen the importance of the population density in evacuation. As the population increases, the deviation δα changes from negative to positive, that is, from overestimating the rational parameter at small populations to underestimating it at large populations. In other words, the bounded rationality induced by the crowd rule in evacuation behaviour is quantitatively characterized by the deviation δα.
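The sign convention of the deviation can be made explicit with a short sketch. It assumes δα is the baseline minus the mean prediction, which is inferred from the statement that a negative δα corresponds to overestimation; the function name and the prediction values are hypothetical and are not the entries of Table 2.

```python
def deviation(alpha_baseline, predictions):
    """delta_alpha = baseline - mean prediction (assumed convention:
    negative means the network overestimates the rational parameter)."""
    mean_pred = sum(predictions) / len(predictions)
    return alpha_baseline - mean_pred

baseline = 2.475
# Hypothetical predictions, chosen only to show both signs.
low_density_preds = [2.6, 2.7, 2.5]    # overestimates -> negative deviation
high_density_preds = [2.2, 2.3, 2.4]   # underestimates -> positive deviation

print(deviation(baseline, low_density_preds))
print(deviation(baseline, high_density_preds))
```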

Conclusions
In this study, based on a CA model that generates spatio-temporal maps of the evacuation process, we propose a deep learning framework to extract the optimality of decisions and its deviation induced by heterogeneous information. The latter is introduced through replicator dynamics describing bounded decision-making. The well-trained deep CNN accurately predicts dynamical factors from multi-frame images generated by the CA model.
In addition, the performance of this method is robust to incomplete images, which correspond to a loss of global information.
This framework provides a platform in which the optimality of decisions is measured as a dynamical parameter in evacuation simulations, which in turn can be simulated by replicator dynamics. The deep CNN is just one of the machine learning approaches that can learn the dynamical factor from replicator dynamics; Bayesian methods can also achieve this goal [55,56]. Although CNNs capture spatial correlations in image-type data more naturally, the performances of different methods deserve to be compared in the future. Furthermore, the scheme could be generalized to other well-designed experiments. If trained on observed image data sets from experiments or the real world, it has the potential to recognize collective patterns and help avoid trampling. On the other hand, combined with online games, the deep learning framework can help us measure the optimality of individual or group decisions in more human behaviours [57][58][59][60], because the evacuation simulation provides a platform in which human instinct dominates behaviour in extreme scenes [23,61,62]. It brings opportunities to investigate human behaviours effectively without complex social relations, which will help us understand the diverse and fascinating collective behaviours that occur in both virtual and real space (in social networks, financial networks and social norms, virtual social connections naturally incubate collective behaviour; in real space, collective modes are common in urban dynamics, traffic flow, and pedestrian dynamics [63,64]). An online game simulating multiple players in evacuation has been developed, and the measurement results will be released in our future work. In summary, this study provides an insight into measuring human decisions in collective behaviours with deep learning approaches.