Understanding Spatiotemporal Patterns of Human Convergence and Divergence Using Mobile Phone Location Data

Abstract: Investigating human mobility patterns can help researchers and agencies understand the driving forces of human movement, with potential benefits for urban planning and traffic management. Recent advances in location-aware technologies have provided many new data sources (e.g., mobile phone and social media data) for studying human space-time behavioral regularity. Although existing studies have utilized these new datasets to characterize human mobility patterns from various aspects, such as predicting human mobility and monitoring urban dynamics, few studies have focused on human convergence and divergence patterns within a city. This study aims to explore human spatial convergence and divergence and their evolution over time using large-scale mobile phone location data. Using a dataset from Shenzhen, China, we developed a method to identify spatiotemporal patterns of human convergence and divergence. Eight distinct patterns were extracted, and the spatial distributions of these patterns are discussed in the context of urban functional regions. Thus, this study investigates urban human convergence and divergence patterns and their relationships with the urban functional environment, which is helpful for urban policy development, urban planning and traffic management.


Introduction
Cities comprise flows of information, goods and people. Among these urban flows, human movements are critical components that drive the pulses of cities. Examining people flows and their spatiotemporal dynamics has always been an important task for a wide range of disciplines, e.g., GIScience, transportation and epidemiology. Traditionally, our ability to capture timely and spatially detailed human mobility data has been constrained by available resources and data collection techniques [1][2][3]. However, recent advances in location-aware technologies have produced new data sources, e.g., mobile phones, smart cards and social media, that detail the movements of people in their daily lives. Consequently, studies have addressed various research challenges related to urban vitality [3][4][5], mobility prediction [6,7] and transportation modelling [8,9]. These studies have enhanced our understanding of human mobility patterns in urban contexts. In this study, we build on this research and focus on analyzing spatiotemporal patterns of human convergence and divergence in cities.
Convergence to a location means that the number of people flowing into the location is larger than the number of people leaving it. Conversely, divergence from a location means that the number of people leaving the location is larger than the number of incoming people. An understanding of how people flows converge and diverge in space and time in cities, as well as their relationships with urban land use, can provide insight into urban dynamics and potentially benefit urban planning and public transportation management. Therefore, the main research questions of this study are as follows:
1. What spatiotemporal patterns of human convergence and divergence exist in the daily urban context?
2. What types of urban land use are generally associated with these patterns?
To address these two questions, this study uses a large-scale mobile phone dataset collected in Shenzhen, China, on a weekday to investigate spatiotemporal patterns of human convergence and divergence. Unlike call detail records (CDRs), which only capture individual footprints during actual communication [10,11], the mobile phone dataset used in this study tracks individuals regularly over time (approximately once every hour) at the cell phone tower level, which enables us to investigate human convergence and divergence patterns with relatively fine and regular spatiotemporal resolution. The identified patterns reflect the essential characteristics of human travel at different locations within the city and have implications for transportation planning, emergency response and epidemic control.

Literature Review
The development of information and communication technologies has profound implications for human sociology and physical mobility. It also makes it possible to collect large sets of georeferenced data from location-based devices, such as mobile phones, which creates new opportunities for understanding human mobility patterns and their relationship with urban functional environments [12][13][14].
Human mobility is closely related to urban transport and planning and is an important research topic in urban studies. For example, an individual's home and workplace can be identified from mobile phone data, and origin-destination flow matrices can be constructed to investigate commuting patterns [15][16][17]. Real-time traffic speeds and travel times can be measured using a cellular phone-based system [18]. In addition, real-time urban dynamics can be captured using mobile phone data to monitor human spatiotemporal distributions and provide insight into the real-time intensity of human activities in different urban areas [4][19][20][21]. Human mobility hotspots and dense areas can be detected by analyzing the trajectories and densities of cell phone users in urban environments [22][23][24][25].
Guo et al. [26] extracted pick-up and drop-off details from taxi trajectory data and proposed a hierarchical clustering method to map human flows with similar origins and destinations. Human mobility source-sink areas can also be identified based on temporal variations in pick-up and drop-off locations [27]. Mobility networks can also be built from human movements to reflect the spatial interactions between different urban areas, and communities (i.e., areas with close connections) can be detected and used to evaluate and optimize urban planning [28,29].
There is a strong relationship between human mobility and the functional environment [27,30]. The spatial distribution of different urban functional regions (e.g., residential, industrial or commercial) determines the locations of human activities, such as living, working, shopping and leisure. The spatial separation of these functional regions and the demands of human activities lead to human flows in urban space. Functional differences associated with different types of land use appear as different human mobility patterns. Thus, land use information can be used to estimate travel demands in different urban areas (i.e., a land use-transport interaction model) [31]. The temporal population variation reflects the underlying function of the location. Thus, some studies have built temporal feature vectors for human activities at the grid cell level using human sensing data and applied machine learning methods to classify those vectors and infer urban land use information [32][33][34]. The classification accuracy decreases as the heterogeneity of land use increases, but additional information (e.g., spatial interaction patterns and points of interest) can be incorporated to identify different functional regions and improve the accuracy [35,36].
These studies demonstrate the powerful potential of emerging big data in research regarding human mobility patterns and the relationships between human mobility patterns and the urban functional environment. This study adds to this knowledge base by investigating the spatiotemporal patterns of human convergence and divergence in a city environment.

Study Area and Dataset
The study area for this research is Shenzhen, which is located in southern China. Shenzhen has experienced rapid development associated with reform policies over the past 30 years, and the area has attracted a large number of immigrant workers seeking job opportunities. The total area of Shenzhen is approximately 1996 square kilometers, and the population is more than 15 million, giving it the highest population density among Chinese cities [37].
The mobile phone location dataset used in this study was collected by a mobile phone company that holds approximately 60% of the mobile phone market in Shenzhen. It covers 16 million mobile phone users over a single workday and records the location of the cell phone tower to which each phone connects approximately every hour. Thus, each cell phone has 24 records each day, each containing a user ID, the recording time and the longitude and latitude of the cell phone tower. The user ID was encrypted for privacy protection before the dataset was released for research purposes. Table 1 shows an example of an individual user's mobile phone records for a day. In total, 5940 cell phone towers (CPTs) with unique Tower ID numbers were extracted from the dataset. Figure 1 shows the spatial kernel density of the cell phone towers.
The other dataset used in this study comprised urban functional region data, which were generated from the comprehensive plan of Shenzhen city (2010-2020) [38]. This dataset includes ten functional region types: administrative (government agencies), commercial, industrial, residential, education, transport, tourism (scenic places and parks), sports, water and other (including agricultural land, shrubs, bare land, etc.). Figure 2 shows the spatial distribution of urban functional regions.

Methodology
The method used to identify the spatiotemporal patterns of human convergence and divergence included three main steps. First, we extracted the netflow from human space-time trajectories in each time slot to indicate human convergence and divergence. Then, we classified the netflow into ten classes according to quantile rules and categorized each grid cell to represent the human convergence and divergence intensity. Finally, a time series matrix was constructed based on the netflow classes, and the grid cells were grouped into clusters according to their temporal patterns.

Extracting Indicators of Human Convergence and Divergence
Using a concept of time geography [39], we constructed the space-time trajectory of each cell phone by connecting location records in chronological order. As shown in Figure 3, the cell phone trajectory can be represented as follows:
$$Tr = \{p_1(x_1, y_1, t_1, Id_1), \ldots, p_i(x_i, y_i, t_i, Id_i), \ldots, p_n(x_n, y_n, t_n, Id_n)\} \quad (1)$$
where xi, yi and Idi represent the longitude, latitude and Tower ID of record point pi, respectively, and ti represents the time when the point update occurred. For adjacent space-time points with different record locations, we can extract a movement from cell phone tower Idi to Idi+1 over the time period ti to ti+1.

Table 1 shows that the location records were updated approximately every hour, e.g., the first point was recorded between 00:00 and 01:00 and the second between 01:00 and 02:00. A movement can therefore be extracted between 00:00 and 02:00, and the time window 00:00-02:00 is considered time slot T1. Thus, we can extract one movement for every two adjacent hours, and the day can be divided into 23 time slots, with Tj denoting the time window (j − 1):00-(j + 1):00.
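The movement extraction described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: it assumes each record has already been reduced to a (user ID, hour index, Tower ID) tuple with one record per hourly window, and the function name `extract_movements` and the sample records are hypothetical.

```python
from collections import defaultdict

def extract_movements(records):
    """Return (origin_tower, destination_tower, time_slot) tuples.

    records: iterable of (user_id, hour, tower_id), where hour h (0-23)
    means the record was logged in the window h:00-(h+1):00 (assumed format).
    Time slot Tj covers (j-1):00-(j+1):00, so records from hours h and h+1
    fall in slot j = h + 1.
    """
    by_user = defaultdict(list)
    for user_id, hour, tower_id in records:
        by_user[user_id].append((hour, tower_id))

    movements = []
    for points in by_user.values():
        points.sort()  # chronological order
        for (h0, id0), (h1, id1) in zip(points, points[1:]):
            # Keep only adjacent hours whose tower changed.
            if h1 == h0 + 1 and id0 != id1:
                movements.append((id0, id1, h0 + 1))
    return movements

# Hypothetical records for two users.
sample = [
    ("u1", 0, "A"), ("u1", 1, "B"), ("u1", 2, "B"),
    ("u2", 0, "C"), ("u2", 1, "C"), ("u2", 2, "A"),
]
movements = extract_movements(sample)
print(movements)  # [('A', 'B', 1), ('C', 'A', 2)]
```

User u1 changes tower between the first two hourly windows, which yields one movement in T1; user u2 changes tower between the second and third windows, which yields one movement in T2.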
One issue is that there may be signal switches between CPTs, which may be incorrectly interpreted as movements, particularly in areas with high tower densities [40,41]. In the early stage of this study, we adopted Thiessen polygons to represent the service area of each cell phone tower. However, we found that some cell phone towers are located very close to each other. Overall, 396 cell towers are very close to nearby towers, and the distance between towers can be less than 10 m. For example, two cell towers may be located in the same high-rise building. Such closely spaced towers can cause frequent signal jumps between them. We therefore chose to use regular grid cells to aggregate very close cell phone towers, thereby reducing the influence of signal switches. We divided the city using grid sizes from 100 m × 100 m to 2 km × 2 km with an increment of 100 m and found that 500 m × 500 m grid cells containing cell phone towers accounted for 90.2% of the major human activity areas, which was much larger than the percentage for grid cells smaller than 500 m × 500 m. In addition, we found that movements within grid cells increase linearly, and movements between grid cells decrease linearly, with grid size. The 500 m × 500 m grid cells ignored approximately 16% of movements. Although 600 m × 600 m grid cells cover 98% of major human activity areas, they ignored approximately 20% of movements. Therefore, we chose grid cells of 500 m × 500 m as the analysis unit. This resolution provided a relatively fine scale for studying human mobility. Grid cells not containing a CPT were excluded because human movements could not be calculated between grid cells without cell phone towers. In total, 2801 grid cells were used as basic analysis units, and each was tagged with a unique Grid ID.
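As a rough illustration of how closely spaced towers collapse into one analysis unit, the sketch below maps tower coordinates to 500 m × 500 m grid cells. It assumes the tower positions have already been projected from longitude/latitude to planar coordinates in metres (that projection step is omitted), and the names `grid_id`, `towers_to_grid` and the sample towers are hypothetical.

```python
import math

CELL_SIZE_M = 500  # grid resolution chosen in the text

def grid_id(x_m, y_m, origin=(0.0, 0.0), cell_size=CELL_SIZE_M):
    """Map projected coordinates (metres, assumed) to a (column, row) index."""
    col = math.floor((x_m - origin[0]) / cell_size)
    row = math.floor((y_m - origin[1]) / cell_size)
    return col, row

def towers_to_grid(towers):
    """towers: dict tower_id -> (x_m, y_m). Returns tower_id -> grid index."""
    return {tid: grid_id(x, y) for tid, (x, y) in towers.items()}

# Hypothetical towers: A and B are about 8 m apart and share a cell.
towers = {"A": (1204.0, 730.0), "B": (1210.0, 735.0), "C": (2600.0, 90.0)}
tower_cell = towers_to_grid(towers)
print(tower_cell)  # {'A': (2, 1), 'B': (2, 1), 'C': (5, 0)}
```

Because A and B receive the same grid index, a signal switch between them would no longer register as a movement once movements are aggregated to grid cells.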
We filtered movements between CPTs to generate movements between grid cells by ignoring movements for which the origin and destination CPTs were in the same grid cell. Thus, we extracted a grid cell-based flow matrix (p, q, fpq, Tj), where p and q are the origin and destination Grid IDs, respectively, fpq represents the number of people moving from p to q, and Tj represents the time slot. For each grid cell p, the inflow and outflow during a time slot are computed as follows.
$$\mathrm{inflow}_p(T_j) = \sum_{q \neq p} f_{qp}(T_j) \quad (2)$$
$$\mathrm{outflow}_p(T_j) = \sum_{q \neq p} f_{pq}(T_j) \quad (3)$$
Additionally, the netflow of the grid cell is computed as follows.
$$\mathrm{netflow}_p(T_j) = \mathrm{inflow}_p(T_j) - \mathrm{outflow}_p(T_j) \quad (4)$$
Netflow was used as an indicator of human convergence and divergence in a grid cell during time slot Tj. Compared to the call activity of CDRs, which reflects activity intensity, netflow reflects the difference between inflow and outflow, which indicates the change in the number of people in a cell during a time slot [42]. A positive netflow indicates that the number of people in the grid cell increased during the time slot, i.e., convergence, and a negative netflow indicates a decreasing number of people, i.e., divergence.
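The netflow indicator can be illustrated with a short Python sketch. It assumes the flow matrix is available as a list of (p, q, fpq, slot) tuples; the function name `netflow_by_cell` and the sample flows are hypothetical and serve only to show the bookkeeping.

```python
from collections import defaultdict

def netflow_by_cell(flows):
    """flows: iterable of (p, q, f_pq, slot) tuples (assumed format).

    Returns {(cell, slot): inflow - outflow}.
    """
    net = defaultdict(int)
    for p, q, f_pq, slot in flows:
        if p == q:
            continue  # movements inside one grid cell are ignored
        net[(q, slot)] += f_pq  # inflow to the destination cell
        net[(p, slot)] -= f_pq  # outflow from the origin cell
    return dict(net)

# Hypothetical flows during time slot 8.
flows = [
    ("g1", "g2", 120, 8),
    ("g3", "g2", 40, 8),
    ("g2", "g1", 30, 8),
    ("g2", "g2", 99, 8),  # same-cell movement, skipped
]
net = netflow_by_cell(flows)
print(net[("g2", 8)], net[("g1", 8)], net[("g3", 8)])  # 130 -90 -40
```

In this toy example g2 converges (positive netflow) while g1 and g3 diverge. Because every movement leaves one cell and enters another, the netflows within a time slot sum to zero over the cells included.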

Classification of Human Convergence and Divergence Using Quantile Rules
This study examined human convergence and divergence and their varying intensities over a day. Let ni,j denote the netflow of grid cell i during time slot Tj. We aggregated the netflow values from all time slots and then grouped them into different classes. The netflow set N = {ni,j} of the whole study region included 2801 × 23 = 64,423 values, with the distribution shown in Figure 4a. Most netflow values (95.4%) were between −1000 and 1000, which indicates that few locations have extremely large netflows and that the city can be considered a relatively homogeneous system.
ISPRS Int. J. Geo-Inf. 2016, 5, 177
Netflow was then sorted in ascending order and grouped into ten classes by quantiles, producing the quantile vector Q = [q1, q2, ..., q9], where q1, q2, ..., q9 represent the netflow values of the nine break points at quantiles 10%, 20%, ..., 90%, respectively (Figure 4b). In this paper, we generated the quantile vector of break points Q = [−317, −128, −53, −18, −1, 15, 51, 122, 314]. We use Q to classify each ni,j of N into different groups and assign it a level label to represent the intensity of convergence or divergence, as shown in Table 2. The stronger the convergence or divergence, the larger the absolute level value assigned. In Classes 5 and 6, convergence and divergence are relatively small, and we consider both at the same level of 0. After classification, we generate the corresponding set L = {li,j}, which indicates the intensity of human mobility of grid cell i in time slot Tj.

[Table 2. Columns: Class, Classification, Level (l), Status.]
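The quantile-based classification can be sketched as follows. The break points Q are those reported in the text. Two points are assumptions made only for this illustration: the level assigned to each class (the text states only that Classes 5 and 6 share level 0 and that stronger flows receive larger absolute levels), and the rule that a value equal to a break point falls into the lower class. The name `classify` is hypothetical.

```python
from bisect import bisect_left

# Break points reported in the text (deciles of all netflow values).
Q = [-317, -128, -53, -18, -1, 15, 51, 122, 314]

# Assumed level for classes 1..10; only "Classes 5 and 6 -> 0" is stated.
LEVELS = [-4, -3, -2, -1, 0, 0, 1, 2, 3, 4]

def classify(netflow):
    """Return (class, level) for one netflow value.

    Assumption: a value equal to a break point belongs to the lower class.
    """
    cls = bisect_left(Q, netflow) + 1
    return cls, LEVELS[cls - 1]

for n in (-500, -20, 0, 16, 400):
    print(n, classify(n))
# -500 (1, -4)
# -20 (4, -1)
# 0 (6, 0)
# 16 (7, 1)
# 400 (10, 4)
```

On raw data, the nine break points themselves could be obtained with the standard library call `statistics.quantiles(values, n=10)`, although its interpolation method may not match the one used in the study.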

Cluster Analysis of the Temporal Patterns of Human Convergence and Divergence
We transformed L into a time series matrix, V, to extract the spatiotemporal patterns of human convergence and divergence:
$$V = \begin{bmatrix} V_1 \\ V_2 \\ \vdots \\ V_{2801} \end{bmatrix} = \begin{bmatrix} l_{1,1} & l_{1,2} & \cdots & l_{1,23} \\ l_{2,1} & l_{2,2} & \cdots & l_{2,23} \\ \vdots & \vdots & \ddots & \vdots \\ l_{2801,1} & l_{2801,2} & \cdots & l_{2801,23} \end{bmatrix} \quad (5)$$
where Vi represents the i-th row of the matrix, which indicates the variation in grid cell i over the day. There are 2801 rows in the matrix. Lj represents the j-th column of the matrix, which indicates the levels of the 2801 grid cells at time slot Tj, so there are 23 columns. Table 3 provides examples of the matrix. The temporal characteristics of V incorporate the human mobility spatiotemporal dynamics of different areas of the city. For example, residential and commercial regions or workplaces located downtown or on the outskirts of the city may have different temporal patterns.
In the cluster analysis, our main goal is to extract grid cells with similar variations in human mobility level, so we focus on clustering the rows of the matrix. As shown in Equation (6), the similarity between any two rows is calculated based on the Euclidean distance.
$$d(V_a, V_b) = \sqrt{\sum_{j=1}^{23} (l_{a,j} - l_{b,j})^2} \quad (6)$$
An X-means clustering algorithm was adopted to cluster the time series matrix according to temporal characteristics. This algorithm is an improved method based on k-means that automatically determines the number of clusters using the Bayesian information criterion, which overcomes the drawback of k-means in choosing the number of clusters. It also accelerates the computation by using a kd-tree to handle the massive number of records [43]. Additionally, it is an unsupervised clustering method that is suitable for multidimensional datasets. The well-known data mining tool WEKA was employed to execute the X-means algorithm [44]. Eight clusters were extracted from V and denoted as C1, C2, ..., C8. The cluster analysis identified grid cells with similar human convergence and divergence variation patterns, and we discuss the characteristics of each cluster in Section 5.2.
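To make the row-wise clustering concrete, the sketch below groups level rows by Euclidean distance using plain k-means with a fixed number of clusters. This is a simplified stand-in, not the X-means procedure run in WEKA: X-means additionally selects the number of clusters with the Bayesian information criterion, which is omitted here. The toy matrix uses 4 time slots instead of 23, and the names `euclidean` and `kmeans` are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two level rows of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(rows, k, iterations=20):
    """Plain k-means; the first k rows seed the centroids (deterministic)."""
    centroids = [list(r) for r in rows[:k]]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: euclidean(r, centroids[c]))
            for r in rows
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy level matrix: rows are grid cells, columns are time slots.
V = [
    [-4, -3, 3, 4],   # diverges early, converges late
    [4, 3, -3, -4],   # converges early, diverges late
    [-3, -4, 4, 3],
    [3, 4, -4, -3],
    [-4, -4, 4, 4],
]
labels = kmeans(V, k=2)
print(labels)  # [0, 1, 0, 1, 0]
```

Rows with the same daily rhythm end up in the same cluster, which is the property the analysis relies on when it groups grid cells by their temporal patterns.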

Convergence and Divergence in Each Time Slot
Figure 5 shows human convergence and divergence for selected time slots. Areas where people converged and diverged in different time slots are clearly distinguishable. Changes in human mobility intensity can also be observed. The level of most grid cells is close to zero at midnight (T3), aside from a few areas in the urban centers. As dawn arrives, human mobility increases due to the morning commuting peak (T8) and then declines as people start their work (T10). The mobility intensity in some locations increases at noon (T12) due to activities related to lunch, especially in the northern regions of the city. Then, it decreases again during the afternoon work hour (T15) to a level below that of the morning work hour (T10). The evening commute (T18) displays the opposite trend to T8, with most grid cells exhibiting high convergence during T18 as people flow into locations that exhibited divergence at T8, and this state can last until the evening hour (T21). These patterns represent a typical urban workday dynamic that is related to human activity patterns, and they demonstrate the potential of mobile phone data for studying human mobility. These data can be used to understand aggregate mobility patterns on more detailed spatial and temporal scales.

Temporal Patterns of Human Convergence and Divergence
Figure 6 illustrates the temporal patterns of the average values of each cluster. Distinct temporal characteristics can be observed between the clusters.
Grid cells in C1 exhibit high-intensity human convergence during most time slots, while C8 cells display divergence during most of the day, except during the morning commute (T6-T8), when the cells display high-intensity convergence. Grid cells in C2 show convergence from T6-T18 followed by high-intensity divergence from T19 until midnight (T23).
C3 and C4 have similar mobility patterns, with divergence mainly occurring from T6-T10 and convergence after T17. The major difference between these clusters is that the mobility intensity in C4 is significantly higher than that in C3. C3 also exhibits a clear convergence-divergence pattern from T11-T14.
Cluster C5 shows a distinct convergence pattern during the morning and evening commutes, each lasting approximately two time slots, and divergence in the remaining time slots of the day.
C7 shows a human mobility pattern opposite to that of C3, with convergence mainly occurring from T7-T9 and divergence after T17.
Compared to other clusters, there is no apparent temporal pattern in the grid cells of C6, and the mobility intensity is generally low.
The spatial distributions and mobility intensities of these human convergence and divergence patterns are associated with the spatial distribution of different land use types (e.g., residential, industrial, commercial, etc.) and the socioeconomic features of the geographical contexts [4,45,46].

Figure 7 shows the spatial distribution of C1 and C8. It is counterintuitive that some areas continue to converge (C1) or diverge (C8) during most time slots (Figure 6). Most grid cells in these clusters are along the main roads of Shenzhen, and the average percentages of transportation land use per grid cell in the two clusters are 15.2% and 18.4%, respectively, which are higher than the values in the other clusters (Table 4). C1 cells tend to be on the boundary between industrial and residential regions, with industrial and residential land use accounting for 31.3% and 30.3%, respectively, of all land use in the cells (Table 4). C8 cells are mainly distributed along roads in industrial and downtown regions, and industrial and residential land use accounts for 41.7% and 16.5%, respectively, of land use in the cells. Thus, a large number of people flow into these regions during the morning commute (T7 and T8). The regions include some important intra-urban traffic junctions, as well as several inter-urban transportation hubs connected to nearby cities, e.g., several high-speed intersections, two railway stations and Futian Port (which connects to Hong Kong). Therefore, it is likely that the human mobility patterns in C1 and C8 are related to urban transportation. A possible explanation for the continuous convergence and divergence is that our dataset does not include interactions with nearby cities and neglects outflow from the city and inflow from other cities through these grid cells; thus, there is continuous positive or negative netflow during the day. This indicates that these areas may be main hubs that are closely connected to regions outside the city. This observation provides a reference for urban planners to locate and optimize urban public bus transit, so that people can transfer easily at these places. In summary, C1 and C8 cells tend to be located along main urban roads.
Figure 8 shows the spatial distributions of grid cells in clusters C2 and C5. C2 grid cells are located in the main commercial and industrial regions of the city, i.e., concentrated job locations that attract many people during the morning commute. The average commercial land use in this cluster is 11.6%, which is the maximum among all clusters (Table 4). The commercial regions also include many shopping malls, restaurants, financial institutions and recreational venues (bars, karaoke, entertainment, etc.). Therefore, these locations also attract numerous people for shopping, meals, entertainment and other activities during the daytime, with high-intensity divergence after T19. Grid cells in C5 are mainly located near small business districts and workplaces inside residential regions, and the commercial, industrial and residential land uses are 3.4%, 31.1% and 40.1% in this cluster, respectively (Table 4). Land use in residential regions is mixed and includes shopping malls, restaurants and recreational venues. Therefore, human mobility in these locations does not exhibit a consistent pattern, and the human mobility intensity is low. For example, these locations attract people for work during the morning, while people living in residential regions simultaneously diverge to workplaces. Thus, convergence and divergence both occur during the morning commute time (T6-T9). The convergence and divergence pattern in C2 is likely to occur in main urban commercial regions, whereas it tends to occur near business districts and workplaces within residential regions in C5.
Figure 9 shows the spatial distributions of clusters C3 and C4. Grid cells in both clusters are mainly located in urban residential regions. The cells in C3 are mainly located in the northern part of the city, while the cells in C4 are located in the southern part of the city. As shown in Table 4, residential land is dominant in C3 and C4, accounting for 50.4% and 67.6% of land use in the clusters, respectively. As discussed in Section 5.2, there are also some human mobility differences between the clusters. For example, divergence lasts longer in C4 than in C3 during the morning (Figure 6). The cluster differences may be caused by differences between economic development and human mobility space in the northern and southern parts of the region. The southern region is the core of the urban business district in Shenzhen, and the economy in the southern region is more developed than that of the northern region. The southern population density is also higher than that in the northern region. The more developed economy and high population density may be the underlying reasons for the cluster pattern differences. In addition, many immigrant workers live in the northern part of Shenzhen, and they tend to live near their workplaces to save commuting time [47]. This short commute distance also makes it convenient for them to return home at noon for lunch or to take short breaks for activities, which may also contribute to the convergence-divergence pattern differences between T11 and T14 (Figure 6). Thus, the cells in C3 and C4 are likely located in urban residential regions, with C3 mainly located in the northern part of the city and C4 generally located in the southern part.
Figure 10 shows the spatial distribution of C7. The grid cells in this cluster are mainly scattered across urban industrial regions. As shown in Table 4, the percentage of industrial land in this cluster is 58.4%, which is the dominant land use; thus, a large number of people converge in these areas to engage in work during the morning commute and then diverge from these areas to return home or travel to other locations when they finish their daily work. Thus, the human convergence and divergence pattern in C7 contrasts with that in C3, although human mobility in both clusters shows typical daily travel patterns related to work. Therefore, the human mobility pattern in C7 is likely associated with urban industrial regions.
Based on the spatial distribution, grid cells in C6 are not confined to a specific functional area but are scattered across different regions of Shenzhen (Figure 11), including urban administrative, education, sports and tourism regions. People have the freedom to choose when they arrive at and leave these regions; thus, no consistent temporal patterns are formed in the regions. The difference between residential land (27.9%) and industrial land (28.8%) is also small (Table 4). Many grid cells in this cluster are located on the border of residential and industrial regions, so it is possible that a mixture of patterns occurs in these grid cells, e.g., during the morning commute, a grid cell containing industrial and residential land use would attract people to work, but people living in the grid cell may leave for work, resulting in an overall low netflow intensity. Some grid cells are also located in suburban areas with very low population densities, which may be another reason for the low intensity of human mobility.
The clusters identified in this study provide insight into the human dynamics at different locations in the city and the potential land use characteristics associated with these different human mobility patterns. For example, C1 and C8 are likely located along main urban roads, whereas C2 tends to be located in urban commercial regions. In residential-dominant regions, a geographical difference in human mobility can be identified between the northern and the southern parts of Shenzhen. Although the study area and dataset are different, our findings are similar to those of a study that explored the interdependence between land use and traffic patterns using GPS-enabled taxi data in Shanghai [27]. In addition, these human mobility patterns are closely related to socioeconomic development and human activity areas [47]. These findings provide preliminary knowledge about human convergence and divergence patterns in urban areas based on different land use information.
This knowledge can help urban planners and policy makers to improve the efficiency of urban operations. Additionally, it can be used as input in Markov or training models to predict real-time urban traffic flows [31,48,49]. For example, when a new residential area is planned, human mobility patterns can be predicted based on its economic characteristics, thereby providing initial knowledge regarding the temporal travel demands of local residents. In addition, the findings can be used as a reference to estimate human convergence and divergence patterns using urban land use data in other cities without human tracking data. Conversely, urban land use information can be inferred based on these human mobility patterns [32,33]. In addition, based on the temporal convergence and divergence patterns of human mobility in different urban regions, managers can optimize urban public bicycle dock locations or real-time bicycle schedules in convergent and divergent areas to maintain a balance between supply and demand [50]. Similarly, taxi companies can allocate taxis in locations with high human convergence and divergence activities at specific times of a day [51]. Therefore, these findings can be used to improve urban public transport efficiency, which helps promote intelligent urban mobility [52,53].

Conclusions
The emergence of new location-aware data sources (e.g., mobile phone data) has provided opportunities and challenges associated with understanding human activities in the urban context (e.g., real-time monitoring of urban dynamics, human mobility patterns, etc.). This article explores the spatiotemporal patterns of human convergence and divergence using a large-scale mobile phone location dataset from Shenzhen, China. From the location sequences of individual cell phone trajectories, we derived two measures (inflow and outflow) at the grid cell level (500 m × 500 m) to represent the numbers of incoming and outgoing trips at different locations in the city at different times of the day. Using the difference between inflow and outflow, we generated a time series for each grid cell, which reflects the direction and intensity of people flows and describes the temporal patterns of human convergence and divergence. Then, a clustering algorithm was employed to categorize distinct human convergence and divergence types within the city. We then investigated the spatial distributions of grid cells in different categories and examined how the identified patterns were associated with particular urban functional region types. This yielded additional insight into the relationships between people flows and the functional environment.
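The netflow measure summarized above (inflow minus outflow per grid cell and time slot) can be sketched in a few lines. This is an illustrative sketch only: it assumes that trips have already been reduced to (origin cell, destination cell, time slot) triples and that the day is divided into 24 hourly slots; the function name and the toy trips are hypothetical.

```python
from collections import defaultdict

N_SLOTS = 24  # assumption: one slot per hour, T0..T23

def netflow_series(trips, n_slots=N_SLOTS):
    """trips: iterable of (origin_cell, dest_cell, slot) triples.

    Returns {cell: [inflow - outflow for each time slot]}.
    Positive values indicate convergence, negative values divergence.
    """
    series = defaultdict(lambda: [0] * n_slots)
    for origin, dest, slot in trips:
        if origin == dest:
            continue  # staying in the same cell is not a movement
        series[dest][slot] += 1    # incoming trip counts toward inflow
        series[origin][slot] -= 1  # outgoing trip counts toward outflow
    return dict(series)

# Hypothetical toy trips between cells "A", "B" and "C".
toy_trips = [("A", "B", 8), ("A", "B", 8), ("B", "A", 18), ("C", "C", 8)]
flows = netflow_series(toy_trips)
```

Each cell's list is the time series that would then be passed to a clustering step. Because every counted trip adds one to a destination and subtracts one from an origin, the netflows of all cells sum to zero in each slot when no trips cross the study-area boundary.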
Eight distinct spatiotemporal clusters were identified, and the spatial distributions of these patterns were discussed based on the urban functional areas. Grid cells in clusters C1 and C8 were likely located along main urban roads in transportation-dominant regions (e.g., intra- and inter-urban traffic hubs); C2 and C5 were generally located in commercial-dominant urban regions; C3 and C4 were mainly located in residential-dominant regions; C7 was typically located in industrial-dominant regions; and C6 was scattered in different functional regions throughout the city. There was also a geographical (north-south) difference in human convergence and divergence in urban residential regions, and this difference mimicked the pattern of urban socioeconomic development. Distinct human convergent and divergent activities occurred at noon in northern residential and industrial regions, which may be due to low human mobility in those areas. These findings enhance our knowledge of human mobility in different urban functional regions and provide a reference for policy makers to improve policy effectiveness.
This study has some limitations. One main limitation is the potential impact of the modifiable areal unit problem (MAUP). Signal switches are a source of inherent bias in mobile phone data, and they may affect studies of human mobility patterns. The sample interval of the mobile phone data used in this study is approximately one hour, so we cannot accurately identify signal switches between cell phone towers. Most current studies employ Voronoi tessellations to represent the service areas of cell phone towers. However, there are many extremely close cell phone towers (separated by less than 10 m) in the study area (e.g., there are several cell phone towers in one office building in the urban center), so Voronoi tessellation does not prevent signal switching between these close cell phone towers. This study adopted 500 m × 500 m grid cells to divide the city and aggregate close cell phone towers to reduce the influence of signal switches between these cell phone towers. However, it is difficult to address the problem completely because the exact service area of a cell phone tower is uncertain. In addition, we excluded grid cells that did not contain cell phone towers because it is not feasible to calculate human movements between grid cells without cell phone towers. This may exclude some human activity areas. Although these movements were ignored, the analysis results provide useful information for understanding aggregate human mobility patterns in an urban functional context. Future studies can further analyze spatial interpolation differences between Voronoi tessellations and grid cells. Another limitation is that the dataset only covers one workday; thus, we were unable to investigate differences in weekly and seasonal patterns of human mobility. This study proposes a method for extracting daily spatiotemporal patterns of human convergence and divergence. The proposed method can be employed to extract human mobility patterns from long-term data, which is helpful for comparing human mobility on different days.
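The aggregation of nearby towers into 500 m × 500 m grid cells discussed above can be illustrated with a short sketch. It assumes that tower coordinates have already been projected to metres and that the grid origin is at (0, 0); the projection, the origin, the function names and the toy towers are assumptions for this example and are not taken from the study.

```python
import math
from collections import defaultdict

CELL_SIZE_M = 500.0  # grid resolution used in the study

def cell_index(x_m, y_m, origin_x=0.0, origin_y=0.0, size=CELL_SIZE_M):
    """Return the (column, row) of the grid cell containing a projected point."""
    return (
        math.floor((x_m - origin_x) / size),
        math.floor((y_m - origin_y) / size),
    )

def group_towers_by_cell(towers):
    """towers: iterable of (tower_id, x_m, y_m). Returns {cell: [tower_id, ...]}."""
    cells = defaultdict(list)
    for tower_id, x_m, y_m in towers:
        cells[cell_index(x_m, y_m)].append(tower_id)
    return dict(cells)

# Hypothetical towers: t1 and t2 are less than 10 m apart, t3 is farther away.
toy_towers = [("t1", 120.0, 430.0), ("t2", 125.0, 436.0), ("t3", 980.0, 40.0)]
tower_cells = group_towers_by_cell(toy_towers)
```

In this toy case t1 and t2 fall into the same cell, so a signal switch between them would not be counted as a movement between cells. Two close towers that happen to straddle a cell boundary would still be separated, which is one reason the problem cannot be removed completely.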
In future research, we will employ the identified patterns to optimize urban transportation and planning. For example, the urban public transport system could be optimized (i.e., the locations of bus stops or timetables of bus lines) based on the identified human mobility patterns. We will also further examine the relationship between human flow matrices and land use to provide a better understanding of spatial interactions among different land use types. We believe that these analyses will deepen our knowledge of human activities in the urban context and provide many benefits to the development of urban systems.

Figure 1. Spatial kernel density of the cell phone towers (CPTs).

Figure 2. Spatial distribution of urban functional regions.

Figure 3. Space-time trajectory of an individual cell phone record.

Figure 4. (a) Distribution of set N (bin width = 100); (b) sorting and break points of set N.

Figure 5. Human convergence and dispersion in selected time slots. Spatial distribution of human convergence and divergence during time slot (a) T3; (b) T8; (c) T10; (d) T12; (e) T15; (f) T18; (g) T21.

Figure 6. Clustering patterns of human convergence and divergence.

Table 1. Example of an individual's cell phone records during a day.
The sign *** masks the minutes of a longitude or latitude, and the sign ****** masks the last six digits of a user ID, for privacy protection.

Here, $x_i$, $y_i$ and $Id_i$ represent the longitude, latitude and TowerID of record point $p_i$, respectively, and $t_i$ represents the time when the point update occurred. For adjacent space-time points with different record locations, we can extract a movement from cell phone tower $Id_i$ to $Id_{i+1}$ over the time period $t_i$ to $t_{i+1}$:
$$[\,p_i(x_i, y_i, t_i, Id_i),\ p_{i+1}(x_{i+1}, y_{i+1}, t_{i+1}, Id_{i+1})\,], \quad Id_i \neq Id_{i+1} \tag{2}$$

Table 3. Examples of the matrix.

Table 4. The distribution of land use in each cluster (%). Com, commercial land; Ind, industrial land; Res, residential land; Tra, transport land; Adm, administrative land; Edu, education land; Tou, tourism land; Spo, sport land; Wat, water land; Oth, other land.