A Mechanistic Data-Driven Approach to Synthesize Human Mobility Considering the Spatial, Temporal, and Social Dimensions Together

Modelling human mobility is crucial in several areas, from urban planning to epidemic modelling, traffic forecasting, and what-if analysis. Existing generative models focus mainly on reproducing the spatial and temporal dimensions of human mobility, while the social aspect, though it influences human movements significantly, is often neglected. Those models that capture some social perspectives of human mobility utilize trivial and unrealistic spatial and temporal mechanisms. In this paper, we propose the Spatial, Temporal and Social Exploration and Preferential Return model (STS-EPR), which embeds mechanisms to capture the spatial, temporal, and social aspects together. We compare the trajectories produced by STS-EPR with respect to real-world trajectories and synthetic trajectories generated by two state-of-the-art generative models on a set of standard mobility measures. Our experiments conducted on an open dataset show that STS-EPR, overall, outperforms existing spatial-temporal or social models demonstrating the importance of modelling adequately the sociality to capture precisely all the other dimensions of human mobility. We further investigate the impact of the tile shape of the spatial tessellation on the performance of our model. STS-EPR, which is open-source and tested on open data, represents a step towards the design of a mechanistic data-driven model that captures all the aspects of human mobility comprehensively.

Mobility data resulting from the rise of ubiquitous computing (e.g., mobile phones, the Internet of Things, social media platforms) provides a precise way to sense human movements and face these societal challenges. Unfortunately, access to individual mobility data is restricted because they contain sensitive information about the individuals whose movements are described, and due to the EU General Data Protection Regulation (GDPR). Even when personal identifiers are removed to anonymize the dataset, there is no guarantee about the protection of the geo-privacy of individuals because they can be re-identified with a small amount of information [16][17][18][19][20][21].
A solution to deal with geo-privacy issues consists of designing generative models of individual mobility, i.e., algorithms able to generate a collection of synthetic trajectories

Related Work
Among the many mechanistic generative models proposed for human mobility, the Exploration and Preferential Return (EPR) model [36] has turned into a modelling platform given its robustness and modularity, allowing researchers to test their hypotheses by easily replacing or adding specific mechanisms to it. Specifically, EPR relies on two complementary mobility mechanisms: exploration and preferential return. In the exploration mechanism, an agent chooses a new location never visited before, based on a random walk process with a truncated power-law jump size distribution. In the preferential return mechanism, an agent returns to a previously visited location with a probability that depends on the number of visits to that location, which reproduces the propensity of humans to return to locations they visited before. An agent chooses to explore a new location with probability $P_{exp}$ and, with complementary probability $P_{ret} = 1 - P_{exp}$, returns to a previously visited location.
Several studies subsequently extended the EPR model by adding increasingly complex mechanisms to reproduce statistical laws more realistically. In the d-EPR model [37], an agent visits a new location depending on both its distance from the current position and its collective relevance. In the recency-EPR model [38], the preferential return phase includes information about the recency of location visits. In the memory-EPR model [39], during the preferential return mechanism, the agent selects a location with probability proportional to the number of times it visited that location during the previous M days. EPR and its extensions focus only on the spatial aspect of human mobility, neglecting to reproduce realistic temporal patterns. For example, the displacements of individuals are not uniformly distributed during the day but follow the circadian rhythm, a property that is not captured by EPR-like models. Two refined models, namely TimeGeo [40] and DITRAS [31], overcome this problem by including a more refined temporal mechanism.
TimeGeo [40] is a mechanistic modelling framework to produce individual mobility trajectories with realistic spatiotemporal properties. TimeGeo models the temporal dimension through a time-inhomogeneous Markov chain that captures the circadian propensity to travel and the likelihood of arranging short and consecutive activities [40]. It integrates the temporal mechanism with a rank-based version of the EPR model (r-EPR), which assigns a rank to each unvisited location during the selection of a new location to visit, depending on its distance from the trip origin [40].
DITRAS (DIary-based TRAjectory Simulator) [31] generates synthetic trajectories by exploiting two probabilistic models: a diary generator and a trajectory generator. The diary generator consists of a Markov model trained on mobility trajectory data of real individuals, able to capture the probability of individuals to follow or break their routine at specific times of the day [31]. The trajectory generator is an algorithm that, given a weighted spatial tessellation, translates the abstract locations into physical locations using the d-EPR model [34].
Notwithstanding the definite correlation between human mobility and sociality, one of the few mechanistic models that attempts to replicate socio-mobility patterns is GeoSim [33]. GeoSim takes into account both the mobility and the social dimension, although it incorporates a trivial temporal mechanism. GeoSim proposes two mechanisms beyond the exploration and preferential return ones: individual preference and social influence. The agent has to decide whether or not its next displacement will be influenced by its social contacts, with probability α and 1 − α, respectively.

Position of Our Work
An overview of the literature reveals the lack of mechanistic generative models able to reproduce realistically the spatial, temporal, and social dimensions at the same time. On the one hand, GeoSim can capture meaningful patterns representing the link between mobility and sociality, but cannot reproduce realistic spatiotemporal patterns. On the other hand, TimeGeo and DITRAS reproduce spatial and temporal patterns well but ignore the social dimension. In this paper, we propose the STS-EPR model, in which we combine the mechanisms of existing mechanistic models to reproduce the three dimensions of human mobility.

Trajectory
The trajectory of an individual is a sequence of records that allows for reconstructing their movements during the period of observation [41,42]. Formally, a spatiotemporal trajectory is defined as a sequence $T = \langle (r_1, t_1), \dots, (r_n, t_n) \rangle$, where $t_i$ is a timestamp such that $\forall i \in [1, n),\ t_i < t_{i+1}$, and $r_i = (x_i, y_i)$ is a pair of coordinates on a bi-dimensional space in a given coordinate reference system (CRS), e.g., latitude and longitude. A pair $(r_i, t_i)$ denotes a visit at location $r_i$ at timestamp $t_i$.
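As an illustration of this definition, the following minimal Python sketch represents a trajectory as a list of (coordinates, timestamp) records and checks the strict ordering of timestamps. The names `Record` and `is_valid_trajectory`, the sample coordinates, and the choice of seconds as the time unit are illustrative assumptions, not part of the original model.

```python
from typing import List, Tuple

# A record is ((x, y), t): a coordinate pair and a timestamp.
# Assumption: timestamps are plain numbers (e.g., seconds since the start).
Record = Tuple[Tuple[float, float], float]


def is_valid_trajectory(trajectory: List[Record]) -> bool:
    """Return True if timestamps are strictly increasing (t_i < t_{i+1})."""
    return all(
        t_prev < t_next
        for (_, t_prev), (_, t_next) in zip(trajectory, trajectory[1:])
    )


# Hypothetical trajectory with three visits, one hour apart.
trajectory = [
    ((43.72, 10.40), 0.0),
    ((43.71, 10.41), 3600.0),
    ((43.72, 10.40), 7200.0),
]
print(is_valid_trajectory(trajectory))
```

Two records sharing the same timestamp would make the check fail, since the definition requires a strict inequality.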

Spatial Tessellation
For modelling purposes, the geographic space is discretized through a weighted spatial tessellation. The tiling of the geographic space aims at creating the covering of the area of interest using regular tiles, such as squared or hexagonal tiles, or irregular tiles that may define the shape of buildings, census cells, or administrative units.
Formally, given an area $A$, a set of geographical polygons called tessellation, $G$, is defined with the following properties: (1) $G$ contains a finite number of polygons $l_i$, called tiles, $G = \{l_i : i = 1, \dots, n\}$; (2) the tiles are non-overlapping, $l_i \cap l_j = \emptyset,\ \forall i \neq j$; and (3) the union of all tiles completely covers $A$, $\bigcup_{i=1}^{n} l_i = A$. In a weighted tessellation $L$, each tile is associated with its relevance, namely the popularity of a location among real individuals [31]. The overall number of visits to a location is usually used as an estimate of its relevance. Formally, $L = \langle (r_1, w_1), \dots, (r_n, w_n) \rangle$, where $w_j$ represents the relevance of tile $j$ and $r_j$ is the representative point of tile $j$.

Mobility Diary
A Mobility Diary (MD) is an abstract trajectory that describes the locations (in terms of placeholders called abstract locations) and the timestamp at which the user visits that specific abstract location [31]. Generally, in generative models for human mobility, the abstract locations are mapped into physical ones through diverse spatial mechanisms (e.g., EPR [36], or d-EPR [37]).
A mobility diary is generated by a Mobility Diary Generator (MDG) and is defined as $MD = \langle (ab_0, t_0), (ab_1, t_1), \dots, (ab_j, t_j) \rangle$, where $ab_j$ denotes an abstract location and $t_j$ the timestamp at which the individual visits $ab_j$. The abstract location $ab_0$ refers to the home location of an individual.

Visitation Pattern
The visitation pattern of an individual $a$ is represented as a vector $lv_a$ of $|L|$ elements, called location vector, where $|L|$ is the total number of locations in $L$. The $j$-th element of the location vector, $lv_a[j]$, contains the number of times $a$ visited the location $r_j$. The visitation frequency of $a$ to a location $r_i$ is $f_a(r_i) = lv_a[i] / \sum_{j=1}^{|L|} lv_a[j]$.
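A minimal sketch of the location vector and the derived visitation frequency follows. It assumes that the visitation frequency is the location vector normalized by the agent's total number of visits; the function name and the sample vector are hypothetical.

```python
from typing import List


def visitation_frequency(location_vector: List[int]) -> List[float]:
    """Normalize visit counts so that they sum to 1.

    Assumption: an agent with no recorded visits gets an all-zero vector.
    """
    total = sum(location_vector)
    if total == 0:
        return [0.0] * len(location_vector)
    return [count / total for count in location_vector]


# Hypothetical agent on a tessellation with four tiles.
lv_a = [3, 0, 1, 4]
f_a = visitation_frequency(lv_a)
print(f_a)
```

The zero entry corresponds to a tile the agent has never visited, i.e., a candidate for exploration.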

Contact Graph
An individual's network of contacts G may influence their movements. We define G = (V, E) as a graph in which V indicates the set of individuals and E the social ties between individuals.

Problem Definition
A generative mobility model $M$ is any algorithm able to generate a set of $n$ synthetic trajectories $T_M = \{T_{a_1}, \dots, T_{a_n}\}$, which describe the movements, during a certain period of time, of $n$ independent agents $a_1, \dots, a_n$ on a spatial tessellation $L$ [23]. The realism of $M$ is evaluated with respect to:

1. A set $K$ of mobility patterns. The patterns refer to the distributions of mobility measures that quantify the spatial, temporal, or social aspects of an individual's mobility (e.g., radius of gyration, mobility entropy, mobility similarity). A realistic $T_M$ is expected to reproduce as many mobility patterns as possible.

2. A set $X = \{T_{u_1}, \dots, T_{u_m}\}$ of real mobility trajectories corresponding to $m$ real individuals $u_1, \dots, u_m$ that move on the same region as the one on which synthetic trajectories are generated. $X$ is used to compute the set $K$ of patterns, which are compared with the patterns computed on $T_M$.

3. A function $D$ that computes the dissimilarity between two distributions. Specifically, for each measure $f \in K$, $D(P_{(f, T_M)} \| P_{(f, X)})$ indicates the dissimilarity between $P_{(f, T_M)}$, the distribution of the measure computed on the synthetic trajectories in $T_M$, and $P_{(f, X)}$, the distribution of the measure computed on the real trajectories in $X$. The lower $D(P_{(f, T_M)} \| P_{(f, X)})$, the more realistic model $M$ is with respect to $f$ and $X$.
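The evaluation scheme above can be sketched in a few lines of Python. The sketch assumes that $D$ is the Kullback-Leibler divergence over binned values, takes the arguments in the order written above (synthetic first, real second), and floors empty bins of the second distribution at a small `eps` to keep the logarithm finite. The binning, the floor, and all names and sample values are illustrative assumptions rather than the authors' implementation.

```python
import math
from typing import List, Sequence


def to_distribution(values: Sequence[float], bin_edges: Sequence[float]) -> List[float]:
    """Bin values into [edge_k, edge_{k+1}) intervals and normalize the counts."""
    counts = [0] * (len(bin_edges) - 1)
    for value in values:
        for k in range(len(counts)):
            if bin_edges[k] <= value < bin_edges[k + 1]:
                counts[k] += 1
                break
    total = sum(counts)
    if total == 0:
        raise ValueError("no value falls inside the bins")
    return [count / total for count in counts]


def dissimilarity(p_first: Sequence[float], p_second: Sequence[float],
                  eps: float = 1e-12) -> float:
    """KL divergence D(p_first || p_second) with natural logarithm.

    Assumption: bins that are empty in p_second are floored at eps.
    """
    return sum(
        a * math.log(a / max(b, eps))
        for a, b in zip(p_first, p_second)
        if a > 0
    )


# Hypothetical values of one measure (e.g., jump lengths in km).
bin_edges = [0.0, 1.0, 2.0, 3.0]
p_synthetic = to_distribution([0.5, 1.5, 1.7, 2.5], bin_edges)
p_real = to_distribution([0.4, 1.2, 2.1, 2.9], bin_edges)
score = dissimilarity(p_synthetic, p_real)
print(round(score, 4))
```

Since the divergence is not symmetric, swapping the two arguments generally yields a different score, so the direction should be fixed once and used consistently across models.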

The STS-EPR Model
The Spatial, Temporal and Social Exploration and Preferential Return model (STS-EPR) extends the Exploration and Preferential Return model (EPR) [34] by considering the social dimension together with the spatial and temporal ones. It also includes a temporal mechanism that reproduces realistically the distribution of the number of movements during the day.
STS-EPR takes as input a spatial tessellation $L$, a mobility diary generator MDG, the time interval of the simulation $[t_{start}, t_{end}]$, and an undirected graph $G$ describing the social relationships between the $N$ agents. The model outputs $N$ synthetic trajectories describing the displacements of the $N$ agents on $L$ during the period $[t_{start}, t_{end}]$.
STS-EPR consists of four phases: initialization, action selection, location selection, and action correction (see Figure 1). After the initialization phase, the agents perform the action selection, the location selection, and, when needed, the action correction phases until a stopping criterion is satisfied (e.g., the number of hours to simulate is reached).

Figure 1. A schematic description of STS-EPR. When an agent $a$ moves according to the entry in its mobility diary $MD_a$, if the abstract location is $ab_0$ the individual returns to its starting location; otherwise it decides whether to explore a new location or return to a previously visited one. At that point, $a$ determines whether or not its social contacts affect its choice for the location to visit next. If the selected action cannot be performed, it is corrected with an executable one (dashed arrows indicate action corrections).

Initialization
The edge weights in $G$ are initialized to zero and updated during the simulation. The weight of an edge indicates the mobility similarity of the linked agents, i.e., the cosine similarity of their location vectors. The model assigns to each agent a mobility diary produced by a Mobility Diary Generator (MDG); in STS-EPR, the MDG considered is a Markov model that captures the individuals' probability to follow or break their routine at specific times of the day, exploiting conditional probabilities computed from real trajectory data [31]. The mobility diary of an agent $a$ is defined as $MD_a = \langle (ab_0, t_1), (ab_1, t_2), \dots, (ab_j, t_{j+1}), (ab_0, t_{j+2}), (ab_1, t_{j+3}), \dots \rangle$, where $ab$ is an abstract location, $ab_0$ denotes $a$'s starting location, and $t_i$ is a timestamp. Two distinct consecutive abstract locations must be mapped to two distinct physical locations. After assigning the mobility diaries, the model assigns each agent to a starting location. The probability of an agent being assigned to a starting location $r_i \in L$ is $p(r_i) \propto w_i$, where $w_i$ is the location's relevance. Each agent moves according to the entries in its mobility diary at the times specified. If the current abstract location is $ab_0$, the agent returns to its starting location; otherwise, $ab_i$ is converted into a physical location through the next steps.
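The assignment of starting locations can be illustrated with a short sketch that draws each agent's starting tile with probability proportional to the tile relevance. The function name, the sample relevances, and the fixed seed are assumptions made only for this example.

```python
import random
from typing import List


def assign_starting_locations(relevances: List[float], n_agents: int,
                              rng: random.Random) -> List[int]:
    """Draw one starting tile index per agent with p(r_i) proportional to w_i."""
    tile_indices = list(range(len(relevances)))
    return rng.choices(tile_indices, weights=relevances, k=n_agents)


# Hypothetical tessellation with three tiles; tile 0 has zero relevance.
relevances = [0.0, 1.0, 3.0]
rng = random.Random(0)  # fixed seed, only for reproducibility of the example
starts = assign_starting_locations(relevances, 1000, rng)
print(starts.count(0), starts.count(1), starts.count(2))
```

With these weights, tile 0 is never selected and tile 2 is drawn roughly three times as often as tile 1.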

Action Selection
When moving, an agent decides whether to explore a new location or return to a previously visited one by selecting one of two competing mechanisms: exploration and preferential return. Exploration models the decreasing tendency to explore new locations over time [34]. Preferential return reproduces individuals' significant propensity to return to locations they explored before [31,34,37]. An agent explores a new location with probability $P_{exp} = \rho S^{-\gamma}$, or returns to a previously visited one with complementary probability $P_{ret} = 1 - \rho S^{-\gamma}$, where $S$ is the agent's number of unique visited locations and $\rho = 0.6$, $\gamma = 0.21$ are constants (for these two parameters, we use the values estimated in the literature on mobile phone records [34]). The parameters $\rho$ and $\gamma$ influence the agent's tendency to explore a new location versus returning to a previously visited location [34]. When the agent returns, it selects a location with a probability proportional to its visitation frequency. At that point, independently of the spatial mechanism selected, the agent determines whether or not the choice of the next location to visit is affected by the other agents, selecting between the individual and the social influence mechanisms. With probability $P_{soc} = \alpha$, the agent's social contacts influence its movement [33]. With complementary probability $1 - \alpha$, the agent's choice is not influenced by the other agents. The social factor $\alpha$ is equal to 0.2, as in the GeoSim model [33]. Indeed, Toole et al. [33] find that an exponential distribution with a mean value of 0.2 produces a close fit to the distribution of mobility similarity observed in the population. Moreover, this value is consistent with the results of Cho et al. [32], who find that 10-30% of trips are motivated by social reasons. Table 1 summarizes the default value and the role of each parameter.
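The action selection step can be sketched as two independent random draws: one for the spatial mechanism (exploration versus return) and one for the social mechanism (social versus individual). The parameter values are those reported in the text; the function names, the action labels, and the requirement that an agent has visited at least one location are assumptions of this sketch.

```python
import random

RHO, GAMMA, ALPHA = 0.6, 0.21, 0.2  # values reported in the text


def exploration_probability(n_unique_locations: int,
                            rho: float = RHO, gamma: float = GAMMA) -> float:
    """P_exp = rho * S^(-gamma), for S >= 1 unique visited locations."""
    if n_unique_locations < 1:
        raise ValueError("the agent must have visited at least one location")
    return rho * n_unique_locations ** (-gamma)


def select_action(n_unique_locations: int, rng: random.Random,
                  alpha: float = ALPHA) -> str:
    """Return one of: individual/social x exploration/return."""
    p_exp = exploration_probability(n_unique_locations)
    spatial = "exploration" if rng.random() < p_exp else "return"
    social = "social" if rng.random() < alpha else "individual"
    return f"{social}_{spatial}"


rng = random.Random(42)  # fixed seed, only for reproducibility of the example
for s in (1, 10, 100):
    print(s, round(exploration_probability(s), 3), select_action(s, rng))
```

Because $P_{exp}$ decays with $S$, an agent that has already seen many locations explores less often, which is the decreasing tendency to explore described above.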

Location Selection
At this point, the agent $a$ decides which location is the destination of its displacement. The sets of locations $a$ can explore or return to are $exp_a = \{i \mid lv_a[i] = 0\}$ and $ret_a = \{i \mid lv_a[i] > 0 \wedge i \notin \{s_a, c_a\}\}$, respectively, where $s_a$ and $c_a$ denote the indices of the starting and current location of agent $a$. The set of locations visited, without the constraints on the current and starting location, is $vis_a = \{i \mid lv_a[i] > 0\}$. During the location selection step, $a$ can choose among the following actions:
• Individual Exploration (IE): $a$ chooses a new location to explore from $exp_a$. Individuals are more likely to move at small rather than long distances but also take into account the location's collective relevance [31]. We use the gravity law to couple distance and relevance [37]. If $a$ is currently at location $r_j$, it selects an unvisited location $r_i$, with $i \in exp_a$, with probability $p(r_i) \propto \frac{w_i w_j}{d_{ij}^2}$, where $d_{ij}$ is the geographic distance between locations $r_i$ and $r_j$ with relevances $w_i$, $w_j$.
• Social Exploration (SE): $a$ selects an agent $c$ among its social contacts in the social graph $G$, i.e., $c \in \{v \in V \mid (a, v) \in E\}$. The probability $p(c)$ for $c$ to be selected is proportional to the mobility similarity between them: $p(c) \propto mob_{sim}(a, c)$. After the contact $c$ is chosen, the candidate location to explore is an unvisited location for $a$ that was visited by $c$, i.e., the location is selected from the set $A = exp_a \cap vis_c$. The probability $p(r_i)$ for a location $r_i$, with $i \in A$, to be selected is proportional to the visitation pattern of $c$, namely $p(r_i) \propto f_c(r_i)$.
• Individual Return (IR): $a$ chooses the return location from the set $ret_a$ with a probability proportional to its visitation pattern. The probability for a location $r_i$ with $i \in ret_a$ to be chosen is $p(r_i) \propto f_a(r_i)$.
• Social Return (SR): $c$ is selected as in SE, and the location $a$ returns to is picked from the set $A = ret_a \cap vis_c$. The probability $p(r_i)$ for a location $r_i$ to be selected is proportional to the visitation pattern of the agent $c$, namely $p(r_i) \propto f_c(r_i)$.
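The two exploration actions can be sketched as follows. The sketch assumes a gravity law of the form $w_i w_j / d_{ij}^2$, planar coordinates with Euclidean distance in place of geographic distance, strictly positive relevances, and a uniform fallback when all contacts have zero mobility similarity; none of these choices is claimed to match the reference implementation, and all names and sample data are hypothetical.

```python
import math
import random
from typing import Dict, List, Optional, Tuple


def cosine_similarity(u: List[int], v: List[int]) -> float:
    """Cosine similarity of two location vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def individual_exploration(current: int, lv_a: List[int],
                           coords: List[Tuple[float, float]],
                           relevance: List[float],
                           rng: random.Random) -> Optional[int]:
    """Pick an unvisited tile i with weight w_i * w_j / d_ij^2 (gravity law)."""
    candidates = [i for i, visits in enumerate(lv_a) if visits == 0]
    if not candidates:
        return None  # to be handled by the action correction phase
    weights = [
        relevance[i] * relevance[current]
        / math.dist(coords[i], coords[current]) ** 2
        for i in candidates
    ]
    return rng.choices(candidates, weights=weights, k=1)[0]


def social_exploration(lv_a: List[int], contacts: Dict[str, List[int]],
                       rng: random.Random) -> Optional[int]:
    """Pick a contact by mobility similarity, then one of its tiles unvisited by a."""
    names = list(contacts)
    if not names:
        return None
    sims = [cosine_similarity(lv_a, contacts[name]) for name in names]
    if sum(sims) == 0:
        contact = rng.choice(names)  # assumption: uniform fallback
    else:
        contact = rng.choices(names, weights=sims, k=1)[0]
    lv_c = contacts[contact]
    candidates = [i for i in range(len(lv_a)) if lv_a[i] == 0 and lv_c[i] > 0]
    if not candidates:
        return None  # to be handled by the action correction phase
    # Weights proportional to the contact's visit counts, i.e., to f_c(r_i).
    return rng.choices(candidates, weights=[lv_c[i] for i in candidates], k=1)[0]


# Hypothetical four-tile tessellation; agent a has only visited tile 0.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
relevance = [10.0, 5.0, 8.0, 1.0]
lv_a = [4, 0, 0, 0]
rng = random.Random(7)
print(individual_exploration(0, lv_a, coords, relevance, rng))
print(social_exploration(lv_a, {"b": [2, 0, 3, 0]}, rng))
```

In the example, the only tile visited by contact `b` and unvisited by the agent is tile 2, so the social exploration is deterministic; the return actions follow the same pattern with $ret_a$ in place of $exp_a$.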

Action Correction
The set of possible locations an agent can reach is limited. For example, it may happen that the agent visited all locations at least once and there are no new locations to explore. To comply with these kinds of constraints, we introduce an action correction phase, executed when the location selection phase yields no location the agent can move to.

• No location in social choices: If an agent $a$ decides to move with the influence of a social contact $c$, but $ret_a \cap vis_c = \emptyset$ or $exp_a \cap vis_c = \emptyset$ (no locations visited by both $c$ and $a$ or no locations visited by $c$ and unvisited by $a$), we execute an individual action preserving $a$'s choice to explore or return.
• No new location to explore: When an agent $a$ decides to explore but it visited all the locations at least once ($exp_a = \emptyset$), we force the agent to make an IR action.
• No return location: If an agent $a$, currently at location $r_i$, decides to perform an IR, and $r_i$ is the only location visited so far (besides the starting location), it cannot return to any location ($ret_a = \emptyset$). We force $a$ to make an IE.
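The three correction rules can be sketched as a small function over sets of tile indices. The action labels follow the abbreviations in the text (IE, SE, IR, SR); the function name and the set-based representation are assumptions, and the sketch does not handle the degenerate case in which both $exp_a$ and $ret_a$ are empty.

```python
from typing import Optional, Set


def correct_action(action: str, exp_a: Set[int], ret_a: Set[int],
                   vis_c: Optional[Set[int]] = None) -> str:
    """Replace a non-executable action with an executable one.

    action: one of "IE", "SE", "IR", "SR".
    vis_c:  tiles visited by the chosen contact (social actions only).
    """
    vis_c = vis_c or set()
    # Rule 1: no candidate in the social choice -> individual action,
    # preserving the choice to explore or return.
    if action == "SE" and not (exp_a & vis_c):
        action = "IE"
    elif action == "SR" and not (ret_a & vis_c):
        action = "IR"
    # Rule 2: nothing left to explore -> individual return.
    if action == "IE" and not exp_a:
        action = "IR"
    # Rule 3: nothing to return to -> individual exploration.
    elif action == "IR" and not ret_a:
        action = "IE"
    return action


# Hypothetical agent: tiles 1 and 2 unvisited, tile 3 available for return.
print(correct_action("SE", {1, 2}, {3}, vis_c={3}))  # contact only visited tile 3
print(correct_action("IE", set(), {3}))              # nothing left to explore
print(correct_action("IR", {1}, set()))              # nothing to return to
```

Note that the rules can chain: a social exploration with no shared candidates first becomes an individual exploration and, if no unvisited tile remains, an individual return.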

Experiments
In this section, we present the experiments to evaluate the performance of STS-EPR. We simulate the mobility of individuals in eight cities around the globe using STS-EPR and two state-of-the-art models: DITRAS [31] and GeoSim [33]. We evaluate the realism of synthetic trajectories generated by the mentioned models in terms of their statistical similarity with real ones extracted from Foursquare checkins [43] (Figure 2). Furthermore, we also conducted studies to examine the effect of different tile shapes on the trajectories' generation.

Trajectories Dataset
We use a public dataset $D_{FS}$, collected by Yang et al. [43], which includes a set of global-scale checkins gathered from the social network platform Foursquare over 22 months (from April 2012 to January 2014). We use a lookup dataset $D_{loc}$ to associate the location's identifier with the corresponding geographic coordinates. Table 2 shows some examples of records in the two datasets.
A checkin shares a user's real-time position with their social contacts. A user's time-ordered sequence of checkins can be used to reconstruct their movement, considering each checkin as a point in their trajectory. Note that the reconstructed trajectory represents only a portion of the user's mobility, and it is biased towards the places that users consider most worth sharing on social media (e.g., points of interest).
The authors of the dataset collected the Foursquare checkins from Twitter by searching the Foursquare hashtag [43]. The dataset is associated with a snapshot of the social network obtained from Twitter, taken before the collection period.
From the $D_{FS}$ dataset, we extracted, for each city, a validation and a calibration dataset. The validation dataset contains the checkins of a set of users connected through the social graph for three months, from 10 April 2012 to 10 July 2012 (Figure 3); it represents a benchmark of genuine trajectories to be used during the validation phase. The calibration dataset contains, for the same period, the checkins of a set of users not included in the validation dataset. The calibration dataset is used to calibrate the model, i.e., to compute the location relevance and fit the Mobility Diary Generator. We report the characteristics of these datasets in Table 3.

Social Graph
Each of the validation datasets is associated with a social graph that describes the social relationships (i.e., mutual follow on Twitter) between those users that made at least two checkins between the 10 April 2012 and the 10 July 2012. Table 3 reports the characteristics of the contact graphs.

Measures
We quantify the models' realism (Section 3.6) with respect to several mobility measures using the datasets of Section 5.1 and the Kullback-Leibler divergence (KL), defined as $KL(p \| q) = \sum_x p(x) \log \frac{p(x)}{q(x)}$, where $p$ is the ground truth distribution and $q$ is a synthetic distribution. We consider the following spatial, temporal, and social mobility measures, which capture well-known statistical patterns of individual human mobility [22,23]:
• Jump Length $\Delta r$: the distance between two consecutive locations visited by an individual [28][29][30]. Formally, $\Delta r = d(r_i, r_{i+1})$ is the geographical distance between two points $r_i$ and $r_{i+1}$ in a trajectory. A truncated power-law well approximates the empirical distribution $P(\Delta r)$ within a population of individuals, with the value of the exponent slightly varying based on the type of data and the spatial scale [28,29].
• Radius of Gyration $r_g$: the typical distance travelled by an individual $u$ during the period of observation [29,35]. The $r_g$ of individual $u$ is defined as $r_g(u) = \sqrt{\frac{1}{n_u} \sum_{i=1}^{n_u} d(r_i, r_{cm})^2}$, where $n_u$ is the number of points in $T_u$, $r_i \in T_u$, and $r_{cm} = \frac{1}{n_u} \sum_{i=1}^{n_u} r_i$ is the position vector of the centre of mass of the set of points in $T_u$. A truncated power-law well approximates the distribution of $r_g$ [29,30].
• Visits per Location $V_l$: the relevance of a location described as its attractiveness at a collective level, indicating the popularity of locations according to how people visit them on the geographic space [31,45].
• Location Frequency $f(r_i)$: the probability of an individual to visit a location $r_i$ [29], identifying the importance of a location to an individual's mobility: the most visited location (likely home or work) has rank 1, the second most visited location (e.g., school or local shop) has rank 2, and so on. The probability of finding an individual at a location of rank $L$ is well approximated by $P(L) \sim 1/L$ [29,30].
• Waiting Time $\Delta t$: the elapsed time between two consecutive visits of an individual $u$, $\Delta t = t_{i+1} - t_i$. Empirically, the distribution of waiting times is well approximated by a truncated power-law [34].
• Uncorrelated Entropy $S_{unc}$: the predictability of the movements of an individual $u$ [36], defined as $S_{unc}(u) = -\sum_{i=1}^{n_u} p_u(i) \log_2 p_u(i)$, where $n_u$ is the number of distinct locations visited by $u$ and $p_u(i)$ is the probability that $u$ visits location $i$ [36,46].
• Activity per Hour $t(h)$: the number of movements made by the individuals at every hour of the day [31,40]. The movements of individuals are not distributed uniformly during the hours of the day but follow a circadian rhythm: people tend to be stationary during the night hours, while they prefer moving at specific times of the day, for example, to reach the workplace or return home.
• Mobility Similarity $mob_{sim}$: the cosine similarity of two individuals' location vectors [32,33,43]. We define the mobility similarity $mob_{sim}$ between two individuals $u_i$, $u_j$ as the cosine similarity of their location vectors $lv_i$, $lv_j$.
Several studies demonstrate the correlation between human mobility and sociality [32,33,43,47,48]: the movements of friends are more similar than those of strangers, mainly because we are more likely to visit a location if a social contact explored that location before.
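Three of these measures can be computed with a few lines of Python. The sketch below assumes planar coordinates with Euclidean distance in place of geographic distance, and estimates $p_u(i)$ as the fraction of visits to location $i$; the function names and sample inputs are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence, Tuple


def radius_of_gyration(points: Sequence[Tuple[float, float]]) -> float:
    """Root mean squared distance of the visited points from their centre of mass."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)


def uncorrelated_entropy(visits: Sequence[Hashable]) -> float:
    """S_unc = -sum_i p(i) log2 p(i), with p(i) the fraction of visits to location i."""
    n = len(visits)
    return -sum((c / n) * math.log2(c / n) for c in Counter(visits).values())


def mobility_similarity(lv_i: List[int], lv_j: List[int]) -> float:
    """Cosine similarity of two location vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(lv_i, lv_j))
    norm = math.sqrt(sum(a * a for a in lv_i)) * math.sqrt(sum(b * b for b in lv_j))
    return dot / norm if norm > 0 else 0.0


print(radius_of_gyration([(0.0, 0.0), (2.0, 0.0)]))
print(uncorrelated_entropy(["home", "work", "home", "work"]))
print(mobility_similarity([1, 2, 0], [2, 4, 0]))
```

Two individuals whose location vectors are proportional have similarity close to 1, while individuals who never visit the same tile have similarity 0.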

Experimental Settings
We synthesize the trajectories of individuals moving for three months in each city using STS-EPR, GeoSim, and DITRAS.

Results
For each combination of city and model, we run the trajectory generation five times and take, for each measure, the average and standard deviation of the resulting KL divergence. Table 4 shows the results for all cities, obtained using a squared tessellation with tile size of 300 m. In Sections 6.1-6.3, we present the results obtained using this tessellation. We discuss the role of the type and size of tiles in Section 6.4. The distributions and tables for Kuala Lumpur, Sao Paulo, Jakarta, Bangkok, and Istanbul, together with the distributions and table for the hexagonal tessellation, are reported in the Supplementary Material.

Spatial
Our results highlight the importance of coupling distance and relevance in the location selection phase. The models that use the gravity law during the Individual Exploration, i.e., STS-EPR and DITRAS, capture the distribution of the jump length realistically, obtaining similar KL scores for all the cities (best score obtained with STS-EPR in Istanbul with a KL = 0.011, Table 4). In contrast, since GeoSim does not include any mechanism to consider the geographical distance or the relevance, it fails to reproduce the power-law behaviour of the ∆r distribution.
We obtain similar results for the radius of gyration: GeoSim achieves the worst performance for all the cities examined (KL ∈ [4.604, 5.361]) and fails to capture the shape of the real distribution. The trajectories generated by STS-EPR and DITRAS capture correctly the power-law behaviour of the radius of gyration; they achieve similar scores, nevertheless STS-EPR (KL ∈ [0.032, 0.239]) outscores DITRAS (KL ∈ [0.084, 0.815]) in six cities out of eight.
GeoSim is the best model in terms of KL regarding the location frequency measure, although it underestimates the real distribution (Figures 4d and 5d). STS-EPR captures this measure accurately, achieving a KL score that is on average 90.80% better than the one obtained by DITRAS (Table 4). DITRAS, when applied in some cities, cannot generate trajectories whose users have visited a sufficient number of locations to compute the distribution.
STS-EPR generates trajectories that preserve the distribution of location relevance, too: it is the best model for all the cities, with a KL score that is on average 99.64% and 87.59% better than that of GeoSim and DITRAS, respectively (Figure 6d). STS-EPR outscores DITRAS even though both models use the concept of relevance in their spatial mechanisms.
Finally, none of the models approximates the distribution of the entropy measure; however, STS-EPR is the best model for this measure in six cities (Figures 4h and 5h).

Figure 6. Average improvement in terms of KL divergence (percentage) achieved by using STS-EPR with respect to GeoSim (cyan bars) and DITRAS (orange bars). Note that 100% is an upper bound for the improvement, while there is no lower bound.

Temporal
The use of the mobility diary generator is essential to generate realistic temporal patterns. GeoSim fails to capture the distribution of the waiting times (KL close to 1 in all the cities, Table 4) because it extracts the waiting times from a pre-defined statistical distribution that does not consider the characteristic waiting time associated with a particular group. The two models that use the Mobility Diary Generator (MDG), STS-EPR and DITRAS, can better capture this distribution (Figures 4f and 5f). In particular, STS-EPR achieves an average KL improvement of 76.50% and −18.87% with respect to GeoSim and DITRAS, respectively (Figure 6e); the negative value indicates that, on this measure, DITRAS scores better than STS-EPR on average.
As for the imitation of people's circadian rhythm, GeoSim produces an unrealistic flat distribution in which the probability of moving is distributed uniformly across the day (Figures 4f and 5f). The two models that use the MDG, namely STS-EPR and DITRAS, achieve similar KL scores and capture accurately the shape of the distribution of the activity per hour, as well as the peaks of activity during the day (Figures 4g and 5g).

Social
Our proposal, STS-EPR, is the model that best captures the distribution of mob sim : it achieves KL ∈ [0.0099, 0.0828] for all the cities but New York City, in which the KL score associated is 0.1874 (Table 4). Since DITRAS does not employ any social mechanism during the generation of the trajectories, as expected, it cannot capture accurately the shape of the distribution (KL ∈ [0.783, 1.814]). GeoSim fails in reproducing values close to 1, achieving KL scores in the range [0.098, 0.555] resulting in the second-best model after STS-EPR. STS-EPR achieves by far the best scores, with a striking KL = 0.0099 for Bangkok (Table 4).
Overall, using STS-EPR yields an average KL improvement, computed over the eight cities, of 84.84% and 96.34% (Figure 6h) with respect to GeoSim and DITRAS, respectively.
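The text does not state how the improvement percentages are computed. One plausible reading, consistent with the remark that 100% is an upper bound while there is no lower bound, is the relative reduction of the KL score with respect to the baseline model, averaged over cities. The sketch below implements this assumed definition on made-up KL values; neither the formula nor the numbers are taken from the paper.

```python
from typing import Sequence


def kl_improvement(kl_sts_epr: float, kl_baseline: float) -> float:
    """Assumed definition: relative KL reduction with respect to a baseline, in percent."""
    return 100.0 * (kl_baseline - kl_sts_epr) / kl_baseline


def average_improvement(kl_sts_epr: Sequence[float],
                        kl_baseline: Sequence[float]) -> float:
    """Average of the per-city improvements."""
    values = [kl_improvement(s, b) for s, b in zip(kl_sts_epr, kl_baseline)]
    return sum(values) / len(values)


# Hypothetical KL scores for three cities (not taken from Table 4).
sts_scores = [0.05, 0.10, 0.60]
baseline_scores = [0.50, 0.40, 0.50]
print(round(average_improvement(sts_scores, baseline_scores), 2))
```

Under this definition, a KL score of zero gives exactly 100%, whereas a model that scores worse than the baseline gives a negative value that can be arbitrarily large in magnitude.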

Impact of Spatial Tessellation
We repeat our experiments for two tessellations that differ in tile shape and surface:
1. Squared tessellation with tiles of size 300 m and area 0.09 km²;
2. Hexagonal tessellation with tiles of H3 resolution 9 and area 0.10 km².

The tiles in the hexagonal tessellation have an area close to that of the tiles in the squared tessellation. We compute the squared tessellations using the scikit-mobility [44] library and the hexagonal tessellations using Uber H3 geospatial indexing (https://eng.uber.com/h3, accessed on 10 September 2021). Table 5 reports the number of squared and hexagonal tiles that cover each of the eight cities. The choice of the spatial tessellation impacts the results. For the spatial measure jump length, in which distances are fundamental, instantiating STS-EPR with the squared tessellation produces an average KL decrease of 0.02 compared with the hexagonal one (Figure 7a). For the spatial measure radius of gyration in Bangkok and Kuala Lumpur, the hexagonal tessellation produces a lower KL score (Figure 7b). In general, however, STS-EPR instantiated on a squared tessellation produces trajectories whose radius of gyration is more similar to the real distribution, lowering the KL score by 0.012 on average. For other non-distance-based spatial measures, such as location frequency, a hexagonal tessellation produces the best results (Figure 7c). The number of visits per location (Figure 7d) is reproduced best when STS-EPR is instantiated with a squared tessellation. The temporal measures are not affected by the choice of the spatial tessellation, since they do not depend on space (Figure 7e,f). Similarly, the predictability of the trajectories (entropy) is not affected by the tessellation choice. In contrast, the mobility similarity is reproduced best with a hexagonal tessellation (Figure 7h), with which STS-EPR produces trajectories with an average KL score decrease of 0.007.

Discussion
The inclusion of a mechanism that couples spatial distance and location relevance significantly improves the realism with respect to spatial measures such as jump length, radius of gyration, and location relevance. Indeed, GeoSim, which does not use the gravity law, cannot capture the shape of these distributions at all.
Although DITRAS and STS-EPR both use the EPR and the gravity law as spatial mechanisms, STS-EPR captures the number of visits per location and the location frequency significantly better. This result is due to the inclusion in STS-EPR of the social mechanism, which allows an agent to visit locations far from its current position if a social contact visited such a location. Concerning the number of visits per location, the social mechanism rewards relevant locations. A location that is relevant for an individual may also become relevant for its social contacts when they choose to perform a social action.
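The effect of a social action on the number of visits per location can be sketched as follows. Both steps are simplifying assumptions made for this example: the contact is drawn uniformly at random, and the destination is drawn in proportion to the visit counts of that contact. The names and counts are hypothetical.

```python
import random

def social_choice(contacts, visits, rng):
    """Pick a contact, then one of the locations visited by that contact.

    Assumptions of this sketch: uniform choice of the contact, and a
    location choice proportional to the contact's visit counts, so that
    locations relevant to the contact are favoured regardless of distance.
    """
    contact = rng.choice(contacts)
    locations = list(visits[contact])
    counts = [visits[contact][loc] for loc in locations]
    return contact, rng.choices(locations, weights=counts)[0]

# Hypothetical visit counts of the two social contacts of an agent.
visits = {
    "bob": {"downtown": 40, "stadium": 10},
    "carol": {"airport": 5, "downtown": 45},
}

rng = random.Random(42)  # fixed seed so the simulation is repeatable
draws = [social_choice(["bob", "carol"], visits, rng)[1] for _ in range(2000)]
share_downtown = draws.count("downtown") / len(draws)
print(f"share of social actions ending downtown: {share_downtown:.2f}")
```

Since "downtown" is the most visited location of both contacts, most social actions of the agent end there, which reinforces its visit count.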
The use of the mobility diary generator is crucial to capture the temporal patterns of human mobility. While the waiting time distribution used in GeoSim cannot model the temporal characteristics of the population, the diary generator allows reproducing both the waiting time and the propensity to move at specific times during the day.
STS-EPR does not depend on the specific characteristics of the geographic area since it produces good quality trajectories in all the examined cities, i.e., it is geographically transferable.
Finally, the combination of realistic social and temporal mechanisms allows STS-EPR to reproduce the mobility similarity realistically. Although GeoSim embeds a social mechanism, its results are comparable to those of DITRAS, which does not embed any social mechanism. This result highlights the importance of sociality: though often neglected in generative mobility models, it is essential to properly model individual human mobility and capture mobility patterns accurately. Indeed, the inclusion of the social mechanism allows capturing the spatial aspects better, as shown by the results achieved by the models for the visits per location measure.
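For illustration, the mobility similarity between two individuals can be sketched as the cosine similarity between their visit-count vectors. This choice of similarity function is an assumption of the example, and the tile identifiers and counts are hypothetical.

```python
import math

def mobility_similarity(visits_a, visits_b):
    """Cosine similarity between two visit-count dictionaries (tile -> visits)."""
    tiles = set(visits_a) | set(visits_b)
    dot = sum(visits_a.get(t, 0) * visits_b.get(t, 0) for t in tiles)
    norm_a = math.sqrt(sum(v * v for v in visits_a.values()))
    norm_b = math.sqrt(sum(v * v for v in visits_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an individual without visits is similar to nobody
    return dot / (norm_a * norm_b)

# Hypothetical visit counts per tile.
alice = {"t1": 3, "t2": 4}
bob = {"t1": 6, "t2": 8}   # same visitation pattern as alice, twice the visits
carol = {"t3": 5}          # no tile in common with alice
dave = {"t1": 3}           # shares only one tile with alice

sim_same = mobility_similarity(alice, bob)
sim_disjoint = mobility_similarity(alice, carol)
sim_partial = mobility_similarity(alice, dave)
print(sim_same, sim_disjoint, sim_partial)
```

Under this assumption, two individuals who split their visits over the same tiles in the same proportions have similarity 1, while individuals with no tile in common have similarity 0.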

Example of Execution
The code of STS-EPR is included in the Scikit-mobility (https://github.com/scikitmobility, accessed on 10 September 2021) Python library. Listing 1 shows how to generate a set of synthetic mobility trajectories using STS-EPR.
In lines 1, 2, and 3, we perform the basic imports to guarantee the correct execution of STS-EPR. In lines 5 and 6, we specify the time interval of the simulation. In lines 12, 15, and 18, we load the weighted spatial tessellation, the social graph, and the mobility diary generator, respectively. Finally, lines 21 and 24 instantiate STS-EPR and generate the synthetic trajectories.
The full example of the instantiation and generation of synthetic mobility trajectories using the STS-EPR model and the input files used are available at https://jovian.ai/giulianocornacchia/example-sts-epr (accessed on 10 September 2021).

Conclusions
STS-EPR is a mechanistic data-driven, generative mobility model that embeds the spatial, temporal, and social dimensions together. Our results show that, overall, the modelling of the three dimensions together brings several advantages, making STS-EPR better than existing models that lack either the social, the spatial, or the temporal mechanism.
STS-EPR is particularly suitable in the field of computational epidemiology, in which sociality and mobility are the key factors in the spreading process of a disease. Simulating epidemics may help policymakers make crucial decisions about non-pharmaceutical interventions (e.g., imposing mobility reduction or social distancing).
STS-EPR also has some weaknesses. First, the mechanisms embedded in the model can capture a limited set of mobility measures, and the realism in the distribution of some measures must be improved. Future work should consider adding features to STS-EPR to capture out-of-routine trips and the environmental spatial constraints imposed by buildings and road infrastructures. Another opportunity for future improvement consists of embedding deep learning techniques (e.g., Generative Adversarial Networks and Variational Autoencoders) to model aspects of mobility that are not captured by the current mechanisms, thus improving the realism of the model.
In the meantime, our model is a step towards the design of a mechanistic data-driven model that can capture all the aspects of human mobility comprehensively.