Modeling Spatio-Temporal Evolution of Urban Crowd Flows

Metropolitan cities are facing many socio-economic problems (e.g., frequent traffic congestion, unexpected emergency events, and even human-made disasters) related to urban crowd flows, which can be described in terms of the gathering process of a flock of moving objects (e.g., vehicles, pedestrians) towards specific destinations during a given time period via different travel routes. Understanding the spatio-temporal characteristics of urban crowd flows is therefore of critical importance to traffic management and public safety, yet it is very challenging as it is affected by many complex factors, including spatial dependencies, temporal dependencies, and environmental conditions. In this research, we propose a novel matrix-computation-based method for modeling the morphological evolutionary patterns of urban crowd flows. The proposed methodology consists of four connected steps: (1) defining urban crowd levels, (2) deriving urban crowd regions, (3) quantifying their morphological changes, and (4) delineating the morphological evolution patterns. The proposed methodology integrates urban crowd visualization, identification, and correlation into a unified and efficient analytical framework. We validated the proposed methodology under both synthetic and real-world data scenarios using taxi mobility data in Wuhan, China as an example. Results confirm that the proposed methodology can enable city planners, municipal managers, and other stakeholders to identify and understand the gathering process of urban crowd flows in an informative and intuitive manner. Limitations and further directions with regard to data representativeness, data sparseness, pattern sensitivity, and spatial constraint are also discussed.


Introduction
As an increasing proportion of the world's population migrates to urbanized areas, many metropolitan cities are facing serious socio-economic problems, such as frequent traffic congestion, unexpected emergency events, and tragic human-made disasters, to list a few [1]. Many of these problems are caused by huge urban crowd flows, specifically referring to the gathering process of a flock of moving objects (e.g., vehicles, pedestrians) towards specific destinations during a given time period via different travel routes [2]. As a large-scale gathering of urban crowds involves potential threats to public safety [3], it is crucial to inform city planners, municipal managers, and other stakeholders of the risk at an early stage. Understanding the gathering process of urban crowd flows can help mitigate the risk in case the situation evolves towards a dangerous incident.
At present, with the assistance of the growing number of Global Positioning System (GPS) trackers installed in vehicles and the widespread penetration of mobile devices (e.g., smart-phones, tablets) equipped with positioning modules, we are able to capture digital traces from individual citizens in space and time directly and easily [4]. The use of GPS trackers and mobile positioning devices as sensor probes substantially overcomes the main drawbacks of traditional monitoring systems (e.g., fixed sensors, video cameras), namely, limited coverage of the geographical space and high costs of installation and maintenance [5]. It therefore enables us to observe, quantify, analyze, and predict the level of crowdedness of residents in nearby urban areas by measuring dynamic population density at arbitrary locations and identifying densely populated routes in the road network [6]. As a result, the spatio-temporal characteristics of urban crowd flows (e.g., average speed significantly lower than normal speed and space occupancy significantly higher than normal situation) have been deeply explored [7]. Taking vehicular movements as an example, a branch of transportation studies has highlighted the formation process of road traffic congestion in urban areas as well as its social, economic, and environmental impacts on urban life [8][9][10]. Moreover, the inherent daily rhythms of urban mobility dynamics largely lead urban crowd flows to be nonrecurrent in the short-term, recurrent in the long-term, and correlated in geographical space [11]. These facets have already served as fundamentals for the modeling and prediction of urban crowd flows in many practical applications.
However, the existing studies typically assume or neglect morphological correlations of crowdedness, leaving the spatio-temporal evolution patterns of urban crowd flows largely untouched [12]. Indeed, there is an urgent need for identifying, analyzing, and modeling the morphological evolutionary patterns of urban crowd flows. This will provide insights into citywide population concentration (e.g., road traffic congestion), into which factors are correlated with urban crowdedness, and into how crowdedness propagates from one place (e.g., road, block) to another. Facilitated by this information, we will be able to build various applications, including road planning, traffic prediction, and congestion management. To fill this gap in current studies on urban crowd flow analysis, we propose a novel method to model the morphological evolutionary patterns of urban crowd flows and validate it under both synthetic and real-world data scenarios.
The remainder of this article is organized as follows. In Section 2, we review and summarize existing research works on the analysis of urban crowd flows in terms of visualization, identification, prediction, and correlation. In Section 3, we elaborate on the methods for delineating morphological changes of urban crowd flows. In Section 4, we validate the proposed methodology under both simulation and real-world scenarios. In Section 5, we highlight our primary contributions, summarize the research findings, and discuss potential limitations.

Related Work
Analyzing the spatio-temporal distribution of urban crowd flows is a long-standing research focus. In metropolitan cities, crowd flows are influenced by the complex land uses and frequent mass gatherings, so it is more likely to form a crowded hotspot in a limited range of space and time [13]. Several studies have investigated this phenomenon by counting instant population via camera videos [14], telecommunications [15], social media footprints [16], and other ubiquitous sensing techniques. At the citywide scale, existing studies on urban crowd flow analysis can be generally categorized into four major strands concentrating on visualization, identification, prediction, and correlation.
Visualization techniques (e.g., isosurface and kernel density map) qualitatively reveal the macro-patterns of urban crowd flows as well as the micro-patterns of an individual trajectory to inform stakeholders about where and when crowded areas are formed, developed, and moved on one or several frequent routes [17,18]. Yet, it is nontrivial to quantitatively perceive the level and state of crowdedness directly from point and flow density-based visualizations [19]. Quantitatively, crowd density is the most important metric to evaluate the criticality of crowd situations by locally counting the population per unit area [20]. Local areas where urban inhabitants are likely to congregate over the predefined density threshold can thus be detected for careful monitoring during an event to secure crowd safety [21,22]. Supplemented by mobility flow mapping [23], the visualization paradigm informatively and intuitively depicts urban dynamics such as where people are converging within the city over the course of a day and how people occupy and travel through certain urban spaces as a response to special events [24]. Nonetheless, the local crowd density alone is insufficient for a comprehensive assessment of the criticality of a crowd situation. Many other factors related to urban crowds are adopted for a proper situational understanding, including the local speed variance, the local environment dependence, and the movement intentions [25]. Considering that individuals typically exhibit high mobility in a sparse region but, in contrast, move slowly among densely neighboring crowds, the crowdedness of a spot has also been taken as a non-density-based measurement in terms of the instant, maximum, and minimum moving speeds [26]. This intertwined relationship between the moving speed of individuals and the crowd density [27] eventually led to the combination of both density and speed for better identification of urban crowd flows [28]. 
In particular, computer scientists have developed many efficient tools for querying densely populated regions in spatio-temporal databases [29][30][31].
Beyond visualization and identification, many efforts have also been devoted to the accurate prediction of crowd density at a citywide level for the early preparation of emergent crowd situations in the real world [32]. The basic rationale is that late arrivals in urban crowds are predictable based on the historical observations of inhabitants arriving early to attend gatherings [33]. With longitudinal observations, the crowd population distribution can be predicted based on the diurnal dynamic changes as well as the sources and sinks of the observed population movements [34]. Recently, deep-learning frameworks (e.g., convolutional neural networks, long short-term memory) have provided novel and promising tools for coupling periodicity, trends, residuals, and spatial locality into the prediction of urban crowd dynamics [35]. Yet, to make previous implicit methods interpretable, the mechanisms beneath the spatio-temporal formation and propagation processes of urban crowd flows are of vital importance. Fortunately, urban crowd flows manifest significant spatial and temporal correlations [36], as they usually have daily and weekly periodic patterns as well as instantaneous responses due to environmental and social conditions [37]. For instance, adjacent crowded spots have strong interactions with each other, and a crowded spot remains crowded in consecutive time periods [38,39]. A great deal of research has formed a macroscopic description of urban crowd flows and their propagation in time and space based on crowd simulation [40] and traffic flow theories [41]. Particularly, the emerging multiple sources of data enable urban crowd correlation to be captured, mined, and analyzed in very fine spatial and temporal granularities (e.g., road segments, street blocks) [42,43]. However, the existing literature on urban crowd propagation in large-scale networks mainly focuses on graphical representations of crowdedness without a metric or a dynamic model [44,45].
To support spatio-temporal modeling, there exists a rich body of research on the evolution of spatio-temporal phenomena in the domain of Temporal GIS [46,47]. Many popular models have been developed based on raster-oriented, event-oriented, and spatio-temporal object-based perspectives [48,49]. Most of these models describe natural phenomena (e.g., wildfire, rainfall) without considering the human activities involved (e.g., traffic congestion, urban crowds) [50]. For natural phenomena, thematic characteristics are represented as attributes of spatial objects and further utilized to associate objects for tracking their spatio-temporal changes [51]. Under this scenario, an attribute denotes a single object (i.e., one-to-one mapping). Yet, to the best of our knowledge, the situation is substantially different for urban crowd modeling as there are fundamental differences between natural phenomena and human activity. For human activities (e.g., urban crowd flow), an object is a spatially cohesive region with a similar attribute, and a specific attribute might contain multiple objects (i.e., one-to-many mapping) [52,53]. Considering that target objects (i.e., crowd regions) are not readily traceable by their attributes between consecutive time frames, additional research efforts are required to intuitively represent the emergence and spread of urban crowds in order to enable the real-time control of critical gathering regimes in urban environments. It is also noteworthy that existing spatio-temporal data models seldom quantify the spatio-temporal changes from the nested perspective (i.e., to support the monitoring of multiple levels of crowdedness). In summary, to fill the aforementioned research gaps, all the factors related to urban crowd visualization, identification, and correlation need to be considered together in the analysis of urban crowd flows.

Methodology
In this research, we propose a matrix-computation-based methodology for understanding the spatio-temporal evolution patterns of urban crowd flows. A brief overview of the proposed analytical framework is illustrated in Figure 1. Details for each processing step are elaborated in terms of matrix algebra as follows. Note that the proposed framework is raster-based (i.e., in the form of a matrix), which can be directly implemented by matrix computation manipulations. Due to this characteristic, it is independent of any complex spatial relation operation and spatio-temporal database required by conventional spatio-temporal modeling in the domain of temporal GIS [54]. Therefore, we argue that the proposed model and framework have remarkable generalization ability.

Urban Lattice
Given the input data (e.g., human mobility observations and the spatial coverage of the study area), we adopt regular spatial partitioning to monitor the spatio-temporal distributions of urban crowd flows in terms of the temporal series of matrices.

Spatial Partition
To quantify crowdedness, a city is divided into regular spatio-temporal grids characterized as a set of matrices $\mathbf{I}$, one per time frame:

$$\mathbf{I} = \{ I^{t} : t = 1, 2, \cdots \},$$

where the matrix $I^{t}$ of all grids at a certain moment $t$ is defined as

$$I^{t} = \left[ I^{t}_{x,y} \right]_{m \times n},$$

where $I^{t}_{x,y}$ is the value of the crowd level at the moment $t$ within a given spatial cell $(x, y)$ with the centroid at $(x, y)$ and the size (i.e., both width and height) of $\delta$, and $m \times n$ is the size of the lattice. Note that the centroid $(x, y)$ of a grid cell is used interchangeably with its corresponding spatial cell $(x, y)$ in the following equations for the sake of brevity.
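The spatial partition can be illustrated with a minimal sketch. It assumes projected coordinates in meters, a lattice origin `(x_min, y_min)` at the lower-left corner of the study area, and row indices that grow with y; the function names are hypothetical and not part of the original framework.

```python
import math


def cell_index(x, y, x_min, y_min, delta):
    """Return the (row, col) index of the grid cell that contains point (x, y).

    Assumption: (x_min, y_min) is the lower-left corner of the lattice and
    delta is the cell size (both width and height), in the same units as x, y.
    """
    col = math.floor((x - x_min) / delta)
    row = math.floor((y - y_min) / delta)
    return row, col


def cell_centroid(row, col, x_min, y_min, delta):
    """Return the centroid (x, y) of the cell at (row, col)."""
    return (x_min + (col + 0.5) * delta, y_min + (row + 0.5) * delta)


def empty_lattice(m, n):
    """Create an m x n matrix of crowd levels initialised to 0."""
    return [[0] * n for _ in range(m)]
```

For example, with a cell size of 500 m, `cell_index(1250.0, 300.0, 0.0, 0.0, 500.0)` places the point in row 0, column 2, whose centroid is `(1250.0, 250.0)`.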

Crowd Level
The record of an individual $p$ at time $t$ is denoted as a tuple $\langle p, x^{t}, y^{t}, v^{t} \rangle$. Between time $t$ and $t + 1$ (i.e., $[t, t + 1)$), the trajectory $P^{t}$ of individual $p$ contains a set of consecutive records with length $L$ ordered by time, where $L \ge 2$ and $t_1 < t_2 < \cdots < t_L$. For a given cell $(x, y)$ of element $I^{t}_{x,y}$ at time $t$, we can obtain its (1) speed $v^{t}_{x,y}$ (i.e., the average moving speed within a given spatial cell and a given time frame), (2) volume (i.e., the number of moving objects entering into, exiting from, traversing through, or remaining in a given spatial cell at a given time frame), (3) flux $f^{t}_{x,y}$ (i.e., the total number of moving objects that have appeared in a given spatial cell at a given time frame), and (4) crowd rate $s^{t}_{x,y}$ (i.e., the ratio of detention of all the moving objects out of the flux in a given spatial cell at a given time frame). These quantities are expressed as cardinalities of trajectory sets of the form $\mathrm{stay}^{t}_{x,y} = |\{P^{t} : \cdots\}|$, where the operator $|\cdot|$ counts the cardinality of the input set (e.g., the number of unique individuals that have ever appeared in the cell during the given time period).
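As a rough illustration of these per-cell quantities, the following sketch aggregates the records of one time frame. It rests on assumptions that the text does not spell out: an individual is counted as "staying" in a cell when all of its records in the frame fall inside that cell, the crowd rate is the staying count divided by the flux, and the speed is the plain mean of the recorded speeds. The function name `cell_statistics` is hypothetical.

```python
import math
from collections import defaultdict


def cell_statistics(records, delta, x_min=0.0, y_min=0.0):
    """Aggregate speed, flux, staying count and crowd rate per grid cell.

    records: iterable of (p, x, y, v) tuples observed within one time frame.
    Assumption: an individual "stays" in a cell if every one of its records
    in this frame lies in that same cell.
    """
    speeds = defaultdict(list)    # cell -> speeds observed in the cell
    visitors = defaultdict(set)   # cell -> individuals seen in the cell
    cells_of = defaultdict(set)   # individual -> cells it appeared in

    for p, x, y, v in records:
        cell = (math.floor((y - y_min) / delta), math.floor((x - x_min) / delta))
        speeds[cell].append(v)
        visitors[cell].add(p)
        cells_of[p].add(cell)

    stats = {}
    for cell, people in visitors.items():
        flux = len(people)  # unique individuals that appeared in the cell
        stay = sum(1 for p in people if len(cells_of[p]) == 1)
        stats[cell] = {
            "speed": sum(speeds[cell]) / len(speeds[cell]),
            "flux": flux,
            "stay": stay,
            "crowd_rate": stay / flux,
        }
    return stats
```

With two individuals, one remaining in a single cell and one crossing into the neighboring cell, the first cell has a flux of 2 and a crowd rate of 0.5 under these assumptions.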
We treat cells with low speed and high crowd rate as target areas, where the former indicates congested areas and the latter indicates areas that have been visited by a large volume of population, of which the majority stays in the area for a long time. Based on the speed, volume, flux, and crowd rate thresholds, the crowd levels of cells are categorized into $N$ (e.g., $N = 3$ for the sake of illustration and case study hereafter) distinct crowd states as
• Free flow: $I^{t}_{x,y} = 0$ for $v^{t}_{x,y}$ above the speed threshold;
• Slowed flow: $I^{t}_{x,y} = 1$ for $v^{t}_{x,y}$ at or below the speed threshold and $s^{t}_{x,y} < \lambda$;
• Crowded flow: $I^{t}_{x,y} = 2$ for $v^{t}_{x,y}$ at or below the speed threshold and $s^{t}_{x,y} \ge \lambda$;
and only those cells with a flux $f^{t}_{x,y} > \kappa$ will be further analyzed in order to avoid potential data sparseness that may arise in the empirical analysis.
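The categorization into crowd states can be sketched as a small function. The default thresholds (20 km/h, 50%, and 10 moving objects) are the values reported for the case study; coding free flow as 0 and returning `None` for cells with insufficient flux are assumptions made for this illustration.

```python
FREE, SLOWED, CROWDED = 0, 1, 2  # free flow coded as 0 is an assumption


def crowd_level(speed, crowd_rate, flux, speed_max=20.0, rate_min=0.5, flux_min=10):
    """Classify one cell into a crowd state for N = 3 levels.

    speed: average speed in the cell (km/h); crowd_rate: ratio in [0, 1];
    flux: number of moving objects that appeared in the cell.
    Returns None when the flux does not exceed flux_min, i.e., the cell is
    left out of the analysis to avoid data sparseness.
    """
    if flux <= flux_min:
        return None
    if speed > speed_max:
        return FREE
    return CROWDED if crowd_rate >= rate_min else SLOWED
```

Under these defaults, a cell with an average speed of 15 km/h, a crowd rate of 0.3, and a flux of 40 is a slowed flow, while the same cell with a crowd rate of 0.5 is a crowded flow.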

Urban Crowd Hotspot
Hereafter, we extract individual crowd regions by the two-directional associations between neighboring (or nearby) urban crowds in space and time. We formulate the process to identify individual crowd regions based on the connectivity of crowd cells as follows.

Connectivity
For each cell of element $I_{x,y}$ in matrix $I$, its Moore neighborhood (containing two vertical, two horizontal, and four diagonal neighbors) is given by

$$N^{8}_{x,y} = \{ I_{x + i\delta,\, y + j\delta} : i, j \in \{-1, 0, 1\},\ (i, j) \neq (0, 0) \},$$

where $x, y$ are the centroid coordinates and $\delta$ is the size of the cell, as previously mentioned. It is noteworthy that certain cells in $N^{8}_{x,y}$ will be missing if $I_{x,y}$ lies on the border of the matrix. Based on the Moore neighborhood, two cells $I^{t}_{x,y}$ and $I^{t}_{x',y'}$ are defined as directly reachable crowds if they are topological neighbors and their values of crowd level satisfy the predefined criterion of similarity, which is governed by a parameter $\mu$. Recall that the crowd level of cells can be categorized into $N$ arbitrary states, and $\mu$ is therefore a parameter that is dependent on the number of categories $N$ (e.g., $N = 3$ for our research context; see Section 3.1.2).
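A minimal sketch of the Moore neighborhood on a matrix, using row and column indices instead of centroid coordinates, is given below. The similarity criterion in `directly_reachable` (both cells have a crowd level of at least `mu`) is an assumption made for illustration; both function names are hypothetical.

```python
def moore_neighbors(row, col, m, n):
    """Return the in-bounds Moore (8-connected) neighbors of (row, col)
    in an m x n matrix; border cells simply have fewer neighbors."""
    neighbors = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the cell itself
            r, c = row + dr, col + dc
            if 0 <= r < m and 0 <= c < n:
                neighbors.append((r, c))
    return neighbors


def directly_reachable(grid, a, b, mu=1):
    """Check whether cells a and b are directly reachable crowds.

    Assumed criterion: a and b are Moore neighbors and both crowd levels
    are at least mu.
    """
    m, n = len(grid), len(grid[0])
    if b not in moore_neighbors(a[0], a[1], m, n):
        return False
    return grid[a[0]][a[1]] >= mu and grid[b[0]][b[1]] >= mu
```

An interior cell has eight neighbors, whereas the corner cell `(0, 0)` of a 3 × 3 matrix has only three.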

Connected Component
Two cells $I^{t}_{x,y}$ and $I^{t}_{x',y'}$ are further defined as reachable crowds if there exists a path $I^{t}_{x_1,y_1}, \cdots, I^{t}_{x_n,y_n}$ with $I^{t}_{x_1,y_1} = I^{t}_{x,y}$ and $I^{t}_{x_n,y_n} = I^{t}_{x',y'}$, where each cell $I^{t}_{x_{i+1},y_{i+1}}$ is directly reachable from cell $I^{t}_{x_i,y_i}$ according to Equations (12) and (13). Note that the so-called "directly reachable" defined here is fundamentally different from the density reachability defined in DBSCAN [55]. It denotes the topological relationship between cells, and is therefore symmetric. For each element $I_{x,y}$, the largest connected subset in $I$ that is reachable to this element is called a connected component of $I$. If an urban crowd distribution $I$ contains only one single connected component, it is called a connected set $\mathbf{C}$ and its associated cells are denoted as $C$. Note that, by definition, a connected component is a collection of cells that are reachable to each other, whereas a connected set is a matrix (of size $m \times n$) recording the crowd value for all the cells. In the connected set, the value for the cells of the connected component is non-zero, and the value for all other cells is zero. In a common scenario, the urban crowd distribution $I$ will consist of $K$ ($\ge 2$) distinct connected sets $\mathcal{C} = \{\mathbf{C}_i : i = 1, 2, \cdots, K\}$, which are related through the operator $\circ$ that computes the Hadamard (entrywise) product of a set of input matrices [56]. We therefore distinguish these connected sets as individual crowd regions for morphological evolution analysis.
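The extraction of connected sets can be sketched with a breadth-first search over the Moore neighborhood. This is an illustrative implementation rather than the matrix formulation of the text; it assumes that a cell belongs to a crowd when its level is at least `mu`, and the function name `connected_sets` is hypothetical.

```python
from collections import deque


def connected_sets(grid, mu=1):
    """Split a crowd-level matrix into connected sets (8-connectivity).

    Each returned matrix has the size of grid, keeps the crowd values of one
    connected component, and is zero elsewhere.
    Assumption: cells with a level >= mu are treated as crowd cells.
    """
    m, n = len(grid), len(grid[0])
    seen = [[False] * n for _ in range(m)]
    regions = []
    for r0 in range(m):
        for c0 in range(n):
            if grid[r0][c0] < mu or seen[r0][c0]:
                continue
            region = [[0] * n for _ in range(m)]
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                region[r][c] = grid[r][c]
                # Visit the Moore neighborhood; the cell itself is already seen.
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < m and 0 <= cc < n
                                and not seen[rr][cc] and grid[rr][cc] >= mu):
                            seen[rr][cc] = True
                            queue.append((rr, cc))
            regions.append(region)
    return regions
```

Because the components are disjoint, the entrywise product of any two returned matrices is zero, and for `mu = 1` their entrywise sum reproduces the input matrix.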

Crowd Region
For a given connected set of crowds $\mathbf{C}^{t}$ at time $t$, it is defined as a crowd region $R^{t}$ in matrix terms. By definition, each crowd region is a set of connected cells within which the speed is low and the crowd rate is high. Therefore, we can obtain a set of crowd regions for the entire study area as $\mathcal{R}^{t} = \{R^{t}_{i} : i = 1, 2, \cdots, K\}$.

Spatio-Temporal Evolution
Given a target crowd region, we compare its morphology at the current time frame and the morphologies of its associated crowd regions at the next time frame to gain insights into its spatio-temporal evolution patterns. To achieve this goal, we developed an approach to identify the spatial coverage of a crowd region at the next time snapshot based on its spatial coverage at the current time snapshot. The union of the spatial coverages between two consecutive snapshots is named as the mask region, which enables us to track the morphology of each crowd region over time.

Mask Region
For a given crowd region $R^{t}$, its mask (i.e., the spatial coverage of the region) at the current time is defined with respect to $\mu$, the minimum value of target crowd levels. Then, the sets of masks $\mathcal{M}^{t}$, $\mathcal{M}^{t+1}$ for all crowd regions $\mathcal{R}^{t}$, $\mathcal{R}^{t+1}$ at time $t$, $t + 1$ are defined, where $M^{t}_{0} = M^{t+1}_{0} \equiv 0_{m \times n}$ is forced to be an empty mask. For each individual mask $M^{t}_{i}$ in the mask set $\mathcal{M}^{t}$, its corresponding connected masks at the next time frame $t + 1$ are denoted as $M^{t+1}_{i,*}$. In the same way, for each mask in $\mathcal{M}^{t+1}$, its corresponding connected masks at the previous time frame $t$ can be derived, and the relations hold for all conditions. Furthermore, we define the associated masks at the current time $t$ for the mask $M^{t}_{i}$ of a given crowd region $R^{t}_{i}$ as $M^{t}_{i,*}$. Note that the mask regions $M^{t}_{i,*}$ and $M^{t+1}_{i,*}$ enable us to identify the association of congested regions in the same timestamp as well as between two consecutive timestamps. With this association, the morphological changes of the urban crowds can be described.
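The association of masks between two consecutive time frames can be illustrated as follows. The sketch assumes that a mask is the binary matrix marking cells whose level is at least `mu`, and that two masks are connected when their Hadamard (entrywise) product is non-zero, i.e., when they share at least one cell. Indices are 0-based and no empty mask is stored; the function names are hypothetical.

```python
def mask(region, mu=1):
    """Binary mask of a crowd region: 1 where the level is >= mu, else 0."""
    return [[1 if value >= mu else 0 for value in row] for row in region]


def overlaps(a, b):
    """True if the entrywise product of two masks has a non-zero entry."""
    return any(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def associate(masks_t, masks_t1):
    """Link masks at time t with masks at time t + 1 (assumed overlap rule).

    Returns (forward, backward): forward[i] lists the masks at t + 1 that are
    connected to mask i at t; backward[j] lists the masks at t connected to
    mask j at t + 1.
    """
    forward = {i: [j for j, b in enumerate(masks_t1) if overlaps(a, b)]
               for i, a in enumerate(masks_t)}
    backward = {j: [i for i, a in enumerate(masks_t) if overlaps(a, b)]
                for j, b in enumerate(masks_t1)}
    return forward, backward
```

If two masks at time t both overlap a single mask at time t + 1, `backward[0]` contains both of them, which corresponds to a merging configuration.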

Crowd Morphology
The morphological change of a crowd region $R^{t}_{i}$ is determined by comparing the characteristics of its corresponding masks $M^{t}_{i}$, $M^{t}_{i,*}$, and $M^{t+1}_{i,*}$ in terms of cardinality, area, and centroid [57]. With regard to the mask sets, the cardinality $|M^{t}_{i,*}|$ denotes the number of non-empty elements (i.e., $M^{t}_{*,j} \neq 0_{m \times n}$) in $M^{t}_{i,*}$, and the cardinality $|M^{t+1}_{i,*}|$ denotes the number of non-empty elements in $M^{t+1}_{i,*}$. Considering that a crowd region is represented by a matrix, we adopt the raw moment to quantify its morphological attributes [58]. Mathematically, the area of a crowd region is given by the 0th-order raw moment of a mask matrix $M$ as

$$\mathrm{area}(M) = m_{00} = \sum_{x} \sum_{y} M_{x,y},$$

and its centroid is given by the 1st-order raw moments of the mask matrix as

$$(\bar{x}, \bar{y}) = \left( \frac{m_{10}}{m_{00}}, \frac{m_{01}}{m_{00}} \right), \quad m_{10} = \sum_{x} \sum_{y} x \, M_{x,y}, \quad m_{01} = \sum_{x} \sum_{y} y \, M_{x,y}.$$

Based on these principles, we obtain the area of each crowd region defined by $M^{t}$ and $M^{t+1}$ at time $t$ and $t + 1$ as $\mathrm{area}(M^{t}_{i})$ and $\mathrm{area}(M^{t+1}_{j})$ as well as their centroids. Thereafter, we build a decision tree implemented by Algorithm 1 and categorize the morphological changes of urban crowds into 11 distinct categories as {"Newly Occurring", "Disappearing", "Splitting and Merging", "Splitting", "Merging", "Stable", "Stable and Moving", "Shrinking", "Shrinking and Moving", "Growing", and "Growing and Moving"}. Descriptive characteristics for each category are listed in Table 1. The rationale behind the assignment is to compare the centroid and the area of associated crowd regions between two consecutive time slots. Note that under certain scenarios a crowd region may involve splitting, merging, growing, shrinking, and moving at the same time, and we define this as "Splitting and Merging" because there usually exist multiple ways to partition the region into sub-regions when quantifying their growing and shrinking patterns. In other words, it is impossible to refine those patterns into distinct evolution (sub-)categories. 
By doing so, each crowd region will be assigned into one out of the 11 predefined types of morphological changes during the given time period.
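The raw moments used for area and centroid can be computed directly from a binary mask. The sketch below uses column indices as x and row indices as y, which is a convention chosen for this illustration; the function names are hypothetical.

```python
def raw_moment(mask, p, q):
    """Raw moment m_pq = sum over cells of x**p * y**q * M[y][x],
    with x the column index and y the row index."""
    return sum((x ** p) * (y ** q) * value
               for y, row in enumerate(mask)
               for x, value in enumerate(row))


def area(mask):
    """Area of a mask: the 0th-order raw moment (number of covered cells)."""
    return raw_moment(mask, 0, 0)


def centroid(mask):
    """Centroid (x_bar, y_bar) from the 1st-order raw moments,
    or None for an empty mask."""
    m00 = raw_moment(mask, 0, 0)
    if m00 == 0:
        return None
    return (raw_moment(mask, 1, 0) / m00, raw_moment(mask, 0, 1) / m00)
```

For a 2 × 2 block of covered cells in columns 1 to 2 and rows 0 to 1, the area is 4 and the centroid is `(1.5, 0.5)`.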

Algorithm 1: Morphological analysis. (Decision tree with condition labels A, B, C, D, E, E', and F; its terminal branches return one of the 11 categories for a mask $M^{t}_{i}$, e.g., "Shrinking", "Shrinking and Moving", or "Growing and Moving".)
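One possible reading of such a decision tree is sketched below. It is not a reproduction of Algorithm 1: the branching order, the use of the two cardinalities to separate the first five categories, and the tolerance `tol` for deciding whether a centroid has moved are all assumptions made for illustration. The function name `classify_change` is hypothetical.

```python
import math


def classify_change(n_current, n_next, area_t=0, area_t1=0,
                    centroid_t=None, centroid_t1=None, tol=0.0):
    """Assign one of the 11 morphological categories (illustrative rules).

    n_current: number of associated non-empty masks at time t, counting the
               region itself (0 if the region does not exist at time t).
    n_next:    number of associated non-empty masks at time t + 1.
    area_*, centroid_*: area and centroid of the region at t and t + 1,
               only used in the one-to-one case.
    """
    if n_current == 0:
        return "Newly Occurring"
    if n_next == 0:
        return "Disappearing"
    if n_current > 1 and n_next > 1:
        return "Splitting and Merging"
    if n_next > 1:
        return "Splitting"
    if n_current > 1:
        return "Merging"

    # One-to-one association: compare area first, then centroid displacement.
    moved = (centroid_t is not None and centroid_t1 is not None
             and math.dist(centroid_t, centroid_t1) > tol)
    if area_t1 < area_t:
        base = "Shrinking"
    elif area_t1 > area_t:
        base = "Growing"
    else:
        base = "Stable"
    return base + " and Moving" if moved else base
```

For instance, a region associated with exactly one mask at each time frame, whose area drops from 4 to 2 cells while its centroid stays in place, is labelled "Shrinking" under these rules.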

Nested Crowd Evolution
Recall that a crowd region can contain sub-regions with different levels of crowdedness (refer to Section 3.1.2). We implement nested crowd evolution analysis at each level of crowd regions to gain a comprehensive description of the evolution patterns of urban crowd flows. The procedure is illustrated in detail by Algorithm 2. Simply put, the crowd regions with low levels of crowdedness are analyzed first, and then their core sub-regions (if any) with higher levels of crowdedness are further analyzed. Since the spatial coverage of slowed flows is larger than that of crowded flows, the resultant morphological evolutionary patterns are described hierarchically, which helps stakeholders narrow down the criticality of crowd regions from low to high priority.
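The nesting of crowd levels can be illustrated by thresholding the same crowd-level matrix at increasing values of the minimum target level. The sketch assumes integer levels 0 to N − 1 and processes the levels from low to high; it is not a reproduction of Algorithm 2, and the function names are hypothetical.

```python
def level_masks(grid, n_levels=3):
    """Binary masks of a crowd-level matrix for mu = 1, ..., n_levels - 1,
    ordered from the lowest to the highest level of crowdedness."""
    return {mu: [[1 if value >= mu else 0 for value in row] for row in grid]
            for mu in range(1, n_levels)}


def is_nested(inner, outer):
    """True if every cell covered by inner is also covered by outer."""
    return all(o >= i
               for row_i, row_o in zip(inner, outer)
               for i, o in zip(row_i, row_o))
```

For any matrix of integer levels, the mask at `mu = 2` is contained in the mask at `mu = 1`, so the analysis at the higher level only refines regions that were already found at the lower level.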

Synthetic Data
To show our proposed method's capacity to discover urban crowds, we first produced a temporal series of synthetic population distributions as illustrated in Figure 2. Note that synthetic data were generated based on specific conditions to ensure that the 11 predefined types of morphological changes existed at the same time. In addition, to simplify the simulation, we only considered two crowd levels in the synthetic data (i.e., free flow (in green) and crowded flow (in red)). In this sense, we decoupled and analyzed the morphological evolutionary patterns from a non-nested standpoint in this simulation scenario. Besides, the analysis of synthetic data can serve as an example of the aforementioned definitions and computation process in Section 3. To be concise, we illustrate the synthetic urban crowd distributions at time $t$ and $t + 1$. Under closer scrutiny, there are 10 separate crowd regions in the first time frame and 11 crowd regions in the next. Following the proposed methodology in Section 3.3, we obtain the associated mask regions for each crowd region. For instance, the mask regions associated with crowd region $M^{t}_{2}$ at time $t$ are $\{M^{t}_{1}, M^{t}_{2}\}$, which include the region $M^{t}_{2}$ itself. The mask regions associated with this crowd region at time $t + 1$ change to be $M^{t+1}_{1}$ and $M^{t+1}_{2}$. With these mask regions, we can assign each crowd region a morphological evolutionary pattern between the two given time frames according to the predefined decision tree.

Pattern Assignments
As mentioned above, there are 10 crowd regions at time $t$ and we have labelled them with different colors in Figure 3. After deriving the associated mask regions for each crowd region, we show their corresponding paths along the decision tree. For instance, the cell marked as $M^{t}_{0}$ is not crowded at the current time but emerges as crowded at the next time frame. It is thus assigned to "Newly Occurring". The crowd region marked as $M^{t}_{1}$ connects to the crowd region $M^{t}_{2}$ at the next time frame to constitute a new crowd region. It is therefore assigned to "Merging". The pattern assignments for the other crowd regions can be derived in the same way following this diagram, and we leave them to readers for the sake of brevity.

Case Study Area
Wuhan, the capital of Hubei province and the most populous city in Central China, lies in the eastern Jianghan Plain on the middle reaches of the Yangtze River at the intersection of the Yangtze and Han rivers. The city is a major transportation hub, with dozens of railways, roads, and expressways passing through the city and connecting to other major cities. Due to its key role in the national transportation network, Wuhan is well known as "China's Thoroughfare" by domestic sources, and sometimes referred to as "the Chicago of China" by foreign sources. It holds sub-provincial status, arises out of the conglomeration of three cities (i.e., Wuchang, Hankou, and Hanyang), and serves as the political, economic, financial, cultural, educational, and transportation center of central China.
The city of Wuhan was selected because this quick-emerging Asian megalopolis has been through very rapid growth stemming from a special economic initiative, with a population of approximately 10 million inhabitants in an area of 8594 square km, out of which approximately 600 square km are urban areas. Note that the study area for this part of the research is the major urban area of Wuhan, which is surrounded by the 3-Ring expressway and covers the majority of the population of the entire city. With regard to human mobility of this area, we collected the digital traces from 12,000 taxicabs in the city from 1 to 31 May 2014 for empirical analysis of the morphological evolutionary patterns of urban crowd flows. In detail, each GPS record contains the spatial location (in longitude and latitude), timestamp, operation status (as vacant or occupied), driving direction, and moving speed of a given taxicab. All those taxicabs worked continuously (with pick-ups and drop-offs) in the case study period (i.e., 31 days) and capture the daily traffic dynamics over the city's road network well. Note that although taxi data is only a small portion of road traffic, it has been widely applied to understand the overall patterns of transportation networks, in that unbiased traffic data are not readily available. Besides, although it is constrained by the road network, taxi mobility can be efficiently assessed via grid partitioning in the form of a data matrix [35], providing a good tool for evaluating our proposed matrix-computation-based framework.
Following the proposed methodology, the case study area was partitioned into 500 × 500 m grids (i.e., δ = 500 m) as illustrated in Figure 4. Then, the average speed and traffic volume (e.g., in-flow, out-flow, passing flow, staying flow, and flux) were calculated for each grid every 3 minutes (i.e., ∆t = 3 min). To differentiate the crowd levels into free flow (in green), slowed flow (in yellow), and crowded flow (in red), we adopted a speed threshold of 20 km/h, a crowd rate threshold of λ = 50%, and a volume threshold of κ = 10 taxicabs. Note that we conducted several trials with different parameters to derive crowded regions, and their overall spatial patterns were similar. There are clear temporal patterns as the crowd levels change over time. Here we selected five typical locations (i.e., Wuchang Rail Station, Optic Valley, Jianghan Road, Jiedaokou, and Qingnian Road) with distinct urban functionalities as examples to show the temporal patterns of urban crowdedness. In general, all the locations concentrated much more road traffic in the daytime than at nighttime. Meanwhile, the peak time period of the severity of crowdedness depended on the local urban environment. Wuchang Rail Station is a major railway station on the railway lines connecting Beijing-Guangzhou, Wuhan-Jiujiang, and Hankou-Danjiangkou. It is the largest transportation hub in Wuhan, with daily traffic of more than 80,000 passengers and 20,000 packages. As a result, it is the most congested area among the selected locations and is even crowded with many taxicabs in the middle of the night to deliver railway passengers to destinations within the city. In contrast, Optic Valley, as a rising central business district (CBD) with frequent road construction, is severely congested with road traffic between 09:00 and 23:00 h (i.e., roughly the working hours) but becomes completely empty at nighttime, when the traffic vanishes. 
Jianghan Road and Jiedaokou, as major commercial and business districts, have a great deal of road traffic during the commuting hours. As a major road located close to Hankou Rail Station, Qingnian Road is crowded during the commuting periods, and, similar to Wuchang Rail Station, gathers many taxicabs at nighttime. These empirical observations validate our proposed methodology for defining urban crowd levels.

Citywide Crowd Hotspots
At three-minute intervals, we identified the crowd regions (i.e., connected crowd cells) within the case study area. If a cell (or a set of cells) was crowded in many time slots (e.g., several hours of a day), we defined it as a crowd hotspot. At the citywide scale, we therefore calculated the ratio of crowdedness over time (i.e., the number of time slots in which a cell was crowded divided by the total number of time slots) for each cell; the results are shown in Figure 5. At the low level (i.e., µ ≥ 1), urban crowd flows were concentrated within several sub-regions, including the neighborhoods of Hankou Rail Station, Hankou CBD, Wuchang Old City, Wuchang Rail Station, Optic Valley, Hanyang CBD, and Wangjiawan Cross. At the high level (i.e., µ ≥ 2), urban crowd flows congregated heavily along several major roads, including Jiefang Avenue, Hanyang Avenue, Longgang Avenue, Wuluo Road, and Luoyu Road, as well as at the transportation hubs, including Hankou Rail Station and Wuchang Rail Station. Compared with the road traffic map published by the Wuhan Land Resource and Planning Bureau (http://www.whtpi.com/News/11.html), the crowd hotspots we found overlapped closely with the traffic hotspots derived from loop detectors, which count both private and public vehicles on the road network. These are thus confirmed to be critical spots where urban traffic should be carefully monitored and controlled within the case study city.
In the temporal dimension, citywide crowd regions were quite stable in their morphological shapes. As shown in Figure 6, among the 11 types of morphological changes, more than half of the spatial coverage of both the slowed flows and the crowded flows was found to be "Stable". We believe that there are two possible reasons for the stable crowd flows. First, the time interval for tracking urban flows was 3 min. Since urban crowd flows usually persist for longer than 3 min, they are detected as stable in consecutive intervals. Second, in practice, crowded regions are usually caused by bottlenecks of the road network as well as the spatial distribution of the urban population. These regions usually carry very stable crowded flows over time. Additionally, a relatively large proportion (approximately 10%) of crowd regions grows and shrinks in response to the daily rhythm of urban dynamics. In particular, many severely crowded regions (i.e., congestion cores in road traffic) appear and disappear at the bottlenecks of the underlying road network. This is consistent with our understanding of road traffic, in that congestion usually forms and spreads during commuting hours. From a nested morphological perspective, we further found that severely crowded regions (i.e., µ ≥ 2) existed in only a limited share (about 25%) of urban crowd flows (i.e., µ ≥ 1). Those severely crowded regions were stable in both space and time. This provides direct evidence that urban crowd flows are significantly spatially and temporally correlated.
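The nested relationship between slowed regions (µ ≥ 1) and congestion cores (µ ≥ 2) can be illustrated with a small sketch. It is not the implementation used in this study: it assumes that each time slot is available as a matrix of integer crowd levels, that "connected" means 4-neighbor adjacency, and the names `label_regions` and `share_with_core` as well as the toy grid are hypothetical.

```python
from collections import deque


def label_regions(levels, threshold):
    """Label 4-connected regions of cells whose crowd level is >= threshold.

    Returns (labels, count); labels holds region ids, with 0 meaning "no region".
    The 4-neighbor rule is an assumption made for this sketch.
    """
    rows, cols = len(levels), len(levels[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if levels[r][c] < threshold or labels[r][c]:
                continue
            # Start a new region and flood-fill it breadth-first.
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                i, j = queue.popleft()
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < rows and 0 <= nj < cols
                            and levels[ni][nj] >= threshold
                            and not labels[ni][nj]):
                        labels[ni][nj] = count
                        queue.append((ni, nj))
    return labels, count


def share_with_core(levels, low=1, high=2):
    """Share of low-level regions that contain at least one high-level cell."""
    labels, count = label_regions(levels, low)
    if count == 0:
        return 0.0
    with_core = {labels[r][c]
                 for r in range(len(levels))
                 for c in range(len(levels[0]))
                 if levels[r][c] >= high}
    return len(with_core) / count


# Toy crowd-level matrix for one time slot (hypothetical values).
grid = [
    [1, 1, 0, 0, 1],
    [1, 2, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [1, 0, 0, 1, 0],
]
labels, n_regions = label_regions(grid, 1)
core_share = share_with_core(grid)
```

In this toy grid, four slowed regions are found and only one of them contains a congestion core. Averaging such a share over all time slots would give a citywide figure comparable to the one reported above.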
Based on the spatial distribution and the statistical characteristics of the derived urban crowd flows from taxi mobility data, we argue that citywide crowd hotspots are concentrated at a few critical locations and are recurrent in both spatial and temporal dimensions. These empirical observations validate our proposed methodology for deriving urban crowd regions and quantifying their morphological changing patterns.
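As a hedged illustration of the crowdedness ratio behind the hotspot maps, the following sketch counts, for each cell, the share of time slots in which its crowd level reaches a given threshold. The input layout (a list of per-slot level matrices) and the function name `crowdedness_ratio` are assumptions for this example, and the data are made up.

```python
def crowdedness_ratio(level_series, threshold):
    """Fraction of time slots in which each cell's crowd level is >= threshold."""
    n_slots = len(level_series)
    rows, cols = len(level_series[0]), len(level_series[0][0])
    counts = [[0] * cols for _ in range(rows)]
    for levels in level_series:
        for r in range(rows):
            for c in range(cols):
                if levels[r][c] >= threshold:
                    counts[r][c] += 1
    return [[n / n_slots for n in row] for row in counts]


# Four hypothetical time slots on a 2 x 2 grid of crowd levels.
series = [
    [[0, 1], [2, 0]],
    [[1, 1], [2, 0]],
    [[0, 2], [1, 0]],
    [[0, 1], [2, 0]],
]
low = crowdedness_ratio(series, 1)   # ratio at the low level (µ >= 1)
high = crowdedness_ratio(series, 2)  # ratio at the high level (µ >= 2)
```

Cells whose ratio exceeds a chosen cut-off would then be flagged as hotspots. The cut-off itself is application-dependent and is not fixed by this sketch.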

Morphological Evolutionary Patterns
From the perspective of morphological evolution, we further investigated how individual crowd regions changed their shapes over time. For readability, we adopted a network visualization to reveal the morphological evolutionary patterns of urban crowd flows. As illustrated in Figure 7, the primary daily transition patterns for individual slowed crowd regions were "Stable", "Growing and Moving", "Shrinking and Moving", "New Occurring", and "Disappearing". This is remarkably consistent with the statistics for the citywide crowd hotspots as mentioned above. Under closer scrutiny, we found that the morphological evolutionary patterns of individual slowed crowd regions differed between the middle of the night (i.e., 00:00 to 03:00 h), morning rush hours (i.e., 06:00 to 09:00 h), afternoon rush hours (i.e., 12:00 to 15:00 h), and evening rush hours (i.e., 18:00 to 21:00 h). In particular, during the morning rush hours, the dominant morphological evolutionary pattern of slowed crowd regions was "Growing and Moving", which is a consequence of the sudden increase in road traffic between homes and workplaces. We likewise examined the morphological evolutionary patterns of the congested crowd regions. As the statistics over the citywide hotspots indicate, the severely crowded regions were relatively limited in number and merged and disappeared over time. This pattern held for all congested crowd regions. In addition, the morphological evolutionary patterns of the congested crowd regions seemed to be more stable than those of the slowed crowd regions, regardless of whether it was the middle of the night or the morning, afternoon, or evening rush hours. These empirical observations validate our proposed methodology for delineating the morphological evolutionary patterns of urban crowd flows.
Figure 7. Temporal transitions between the morphological evolutionary patterns for the slowed flows (top) and the crowded flows (bottom) at distinct time scales, that is, middle of the night, morning rush hours, afternoon rush hours, and evening rush hours. Note that the node size denotes the relative frequency of each pattern, the link width denotes the transition probability between two nodes, and the link color denotes the originating node.
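The node sizes and link widths in Figure 7 correspond to relative pattern frequencies and pattern-to-pattern probabilities. A minimal sketch of how such statistics could be tallied is given below. It assumes that each crowd region has already been reduced to a time-ordered sequence of pattern labels; the function name `pattern_statistics` and the two example sequences are hypothetical, not data from this study.

```python
from collections import Counter, defaultdict


def pattern_statistics(sequences):
    """Return (relative frequency per pattern, transition probabilities).

    sequences: one time-ordered list of pattern labels per crowd region.
    Transition probabilities are row-normalized counts of consecutive pairs.
    """
    freq = Counter()
    pair_counts = defaultdict(Counter)
    for seq in sequences:
        freq.update(seq)
        for src, dst in zip(seq, seq[1:]):
            pair_counts[src][dst] += 1
    total = sum(freq.values())
    rel_freq = {pattern: n / total for pattern, n in freq.items()}
    transitions = {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in pair_counts.items()
    }
    return rel_freq, transitions


# Hypothetical label sequences for two crowd regions.
sequences = [
    ["New Occurring", "Stable", "Stable", "Growing and Moving", "Stable"],
    ["Stable", "Shrinking and Moving", "Disappearing"],
]
rel_freq, transitions = pattern_statistics(sequences)
```

Restricting the input sequences to one time window (e.g., the morning rush hours) would yield one network per window, mirroring the panels of the figure.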
To show the relation between the morphological evolutionary pattern and the local urban environment, we further zoom into five typical locations associated with distinct socio-economic functions, including Wuchang Rail Station, Optic Valley, Jianghan Road, Jiedaokou, and Qingnian Road (see Figure 8). Recall that Wuchang Rail Station was crowded with taxicabs at nighttime to transfer railway passengers; the morphology of the corresponding crowd region was quite "Stable" in that taxicabs usually move slowly in a queue following the transportation policy. During commuting hours, the morphology of the crowd region grew and shrank following the pulse of daily urban dynamics. The patterns in Optic Valley were distinctively different. As one of the most congested spots due to ongoing road construction, this crowd region was "Stable" in the daytime due to the heavy traffic flow passing through this area for business and commercial activities. The crowds around Jianghan Road were highly dynamic during rush hours, in particular immediately after working hours (i.e., 19:00 to 22:00 h). Interestingly, as a pedestrian street, this crowd region seldom gathered enough taxicabs to form a stable congestion core, probably due to the specific transportation policy in this area. The crowd region around Jiedaokou was generally "Stable" during morning, afternoon, and evening rush hours. At noon, the congested crowd core merged and disappeared along with the increasing road traffic passing through this area. The morphological evolutionary patterns of the crowd region at Qingnian Road showed characteristics of both a commercial area like Jiedaokou and a transportation hub like Wuchang Rail Station. Its congestion core appeared and remained "Stable" during morning, afternoon, and evening rush hours. Considering that Hankou Rail Station is located inside this region, there was a "Stable" crowd region as well as a "Stable" congestion core during the nighttime.
These empirical observations imply that the daily rhythm of urban life is a major impact factor that determines the overall morphological evolutionary patterns of urban crowd flows, as expected.

Discussion and Conclusions
Understanding the spatio-temporal characteristics of urban crowd flows is of great importance to traffic management and public safety, and is very challenging because it is affected by many complex factors, including spatial dependencies, temporal dependencies, and external conditions. Our proposed method for modeling urban crowds' morphological evolutionary patterns was validated for its ability to define urban crowd levels, derive urban crowd regions, quantify their morphological changes, and delineate the morphological evolutionary patterns with both synthetic and real-world data scenarios. In particular, we note several merits of our proposed framework in its generalization ability. As it is raster-oriented, it requires no support from spatial relation operations or spatio-temporal databases. The morphological changes can be easily traced by matrix algebra and can be clearly visualized in space and time. This will enable us to identify and understand the gathering process of urban crowd flows in an informative and intuitive manner. Moreover, the proposed framework will provide an important input for spatio-temporal phenomena modeling that heavily relies on raster-based models (e.g., the popular cellular automaton analysis). With regard to applications, the proposed framework enabled us to detect hotspots towards which potential traffic of moving objects is gathering. Supplemented with urban road networks, traffic congestion becomes traceable for transport network optimization by considering when and where the congested areas form and disappear. Beyond road traffic, the proposed framework is applicable to diverse human mobility activities (e.g., the crowdedness of pedestrians or animals), and thus could provide valuable insights into commercial facility allocation, the risk of trampling events, and many other urban and non-urban problems.
Based on the proposed methodology, abnormal evolution patterns as well as emergent crowd incidents with different severity will also become detectable.
Although the results are promising, we also note several limitations and further directions in our research work. Urban crowd flows are affected by temporal dependencies and external conditions, which result in significant short-term variations. Specifically, the crowd rate is parameter-dependent and might vary in different urban environments. In this sense, the sensitivity of the morphological evolutionary patterns of urban crowd flows to different environmental conditions should be tested. Our proposed methodology can be easily applied to other cities, and the parameters can be determined according to the research context (e.g., spatio-temporal characteristics of the urban transportation network). From a methodological perspective, the cell size of the grid should be determined by the application at hand. The cell size should be chosen based on the spatial scale of the crowd phenomenon of interest and, probably, in a heuristic manner. A large cell size usually results in morphological characteristics spanning large spatial areas, whereas a small cell size enables us to zoom into crowded regions concentrated at specific spots. Urban crowd flows are also long-term recurrent in terms of morphological changes, which might be modeled and predicted by a deep-learning framework in the future. Besides, we have to admit that urban crowd flows are mainly distributed over urban road networks. For more accurate monitoring of urban crowd flows at finer spatial granularity, the urban mobility data should be map-matched onto the underlying road network under common circumstances. Consequently, crowd density should be adjusted with an appropriate mechanism by adopting an adaptive crowd rate based on the road density within cells. Meanwhile, the evolution patterns of urban crowd flows on road networks should be delineated and modeled in the constrained geographical space, which will be a nontrivial task to accomplish.
Another strategy could be to adopt small grids for densely populated areas and large grids for sparsely populated areas. We believe that the quad-tree might be a good strategy for adaptive spatial partitioning and accelerating the computation process. We look forward to generalizing our calculation process for the quad-tree structure in future work.
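To make the idea of adaptive partitioning concrete, the sketch below shows one common point-quadtree scheme: a square cell is split into four children whenever it holds more than a given number of points, until a minimum cell size is reached. This is only an illustration of the general technique, not the generalized calculation process envisaged for future work; the function name `quadtree`, the parameters `max_points` and `min_size`, and the sample points are all hypothetical.

```python
def quadtree(points, x0, y0, size, max_points=2, min_size=1.0):
    """Return leaf cells as (x0, y0, size, point_count) tuples.

    A square cell is split into four equal children while it contains more
    than max_points points and its children would not be smaller than min_size.
    """
    inside = [(x, y) for x, y in points
              if x0 <= x < x0 + size and y0 <= y < y0 + size]
    if len(inside) <= max_points or size / 2 < min_size:
        return [(x0, y0, size, len(inside))]
    half = size / 2
    leaves = []
    for dx in (0, half):
        for dy in (0, half):
            leaves.extend(
                quadtree(inside, x0 + dx, y0 + dy, half, max_points, min_size)
            )
    return leaves


# Hypothetical positions: a dense cluster near the origin and one isolated point.
points = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5), (6.0, 6.0)]
leaves = quadtree(points, 0.0, 0.0, 8.0)
```

With these inputs, the dense cluster ends up in cells of size 1 while the sparse remainder of the 8 × 8 extent is covered by cells of size 2 and 4, which is the behavior described above: small grids for dense areas and large grids for sparse ones.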
Finally, data representativeness (e.g., sampling bias, data sparseness, and positioning accuracy) is also a critical factor for reliable urban crowd flow analysis [59]. Considering that taxi data are a biased sample of the real population distribution, the derived congestion regions might deviate from the real urban crowd flow patterns. For instance, the crowdedness associated with taxicabs significantly underestimated the real mobility of urban inhabitants at Jianghan Road in our case study. With regard to the robustness of the proposed indices, if the input data are continuously recorded, there is no uncertainty in the calculation. In practice, however, mobility data are collected at discrete timestamps. We usually interpolate those discrete points to obtain a continuous trajectory or set the time unit to be greater than the sampling interval. Even so, the uncertainty cannot be eliminated. Data fusion across different types of flows (e.g., mobile phone data, social media data, transportation data) might be an efficient solution in our future work.
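The interpolation step mentioned above can be sketched as follows. The example assumes simple linear interpolation between consecutive position fixes, which is one possible choice and is not specified in the text; the function name `interpolate_track`, the `(t, x, y)` tuple layout, and the sample fixes are hypothetical.

```python
def interpolate_track(fixes, step):
    """Linearly resample a track at a fixed time step.

    fixes: at least two (t, x, y) tuples with strictly increasing t (seconds).
    Returns (t, x, y) samples from the first fix up to the last fix.
    """
    resampled = []
    t = fixes[0][0]
    i = 0
    while t <= fixes[-1][0]:
        # Advance to the segment [fixes[i], fixes[i + 1]] that contains t.
        while fixes[i + 1][0] < t:
            i += 1
        (t0, x0, y0), (t1, x1, y1) = fixes[i], fixes[i + 1]
        w = (t - t0) / (t1 - t0)
        resampled.append((t, x0 + w * (x1 - x0), y0 + w * (y1 - y0)))
        t += step
    return resampled


# Hypothetical fixes recorded at irregular intervals (seconds, metres).
fixes = [(0, 0.0, 0.0), (60, 60.0, 0.0), (180, 60.0, 120.0)]
track = interpolate_track(fixes, 30)
```

The interpolated positions between fixes are estimates rather than observations, so the residual uncertainty noted above remains, and it grows with the gap between consecutive fixes.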