A Visual Analytics Approach for Extracting Spatio-Temporal Urban Mobility Information from Mobile Network Traffic

In this paper we present a visual analytics approach for deriving spatio-temporal patterns of collective human mobility from a vast mobile network traffic data set. More than 88 million movements between pairs of radio cells (so-called handovers) served as a proxy for more than two months of mobility within four urban test areas in Northern Italy. In contrast to previous work, our approach relies entirely on visualization and mapping techniques, implemented in several software applications. We purposefully avoid statistical or probabilistic modeling and nonetheless reveal characteristic and exceptional mobility patterns. The results show, for example, surprising similarities and symmetries amongst the total mobility and people flows between the test areas. Moreover, the exceptional patterns detected can be associated with real-world events such as soccer matches. We conclude that the visual analytics approach presented can shed new light on large-scale collective urban mobility behavior and thus help to better understand the “pulse” of dynamic urban systems.


Introduction
People play a central role in urban systems, and their behavior naturally depends on a city's structure [1,2]. Furthermore, the organization of cities, i.e., their functional configuration, as well as their inherent dynamics exhibit a certain universal similarity across different nations and times; cities are thus, to some degree, scaled versions of each other [2,3]. As general human mobility patterns improve the basic understanding of urban dynamics (cf. [4]), the detailed knowledge of typical spatio-temporal mobility patterns and "anomalies" can provide additional insights into the functional configuration of urban environments and, therefore, significantly enhance the spatial and temporal awareness of decision makers for a variety of purposes, for instance event management (public concerts, soccer matches, etc.).
This research aims to better understand the typical spatio-temporal patterns of collective human mobility at the operational scale of a city and its close periphery. Furthermore, we follow [5,6] and compare these patterns among the four largest urban environments in the Friuli Venetia Giulia region (Northern Italy), which are Trieste, Udine, Pordenone, and Gorizia. In this way we are able to reveal similarities and differences in the cities' functional configuration in terms of mobility. In contrast to previous work, we rely exclusively on visual components in our analysis approach and purposefully avoid any mathematical, statistical, or probabilistic modeling. The analysis is based on user-generated mobile network traffic data, which constitute a very broad sample across society and are assumed to be representative of collective human behavior.
Within this context, our research question is: can characteristic and exceptional spatio-temporal patterns of collective human mobility be derived from large volumes of user-generated mobile network traffic using visual analytics tools? The following sub-questions arise: can the characteristic patterns be used to describe the functional configuration of a city in terms of mobility? Can the exceptional patterns be associated with real-world events?

Related Work
Today, thanks to the digital traces that people leave behind, voluntarily or not, when interacting with digital systems such as communication networks or social media platforms, the research on human behavior patterns has reached a new dimension: the plethora of "social-sensor" data [4] from such ubiquitous systems reflects the collective and individual human behavior in remarkable spatial and temporal detail [7,8]. The user-generated traffic in mobile communication networks, which is probably the broadest and typically least biased [5] sample of a digital system in day-to-day use, is a frequently used data set for the analysis of human mobility. The spatial resolution of such data can vary significantly (~250 m up to 5,000 m, cf. [9,10,11]) and is determined by the density of the network's antennas and the network traffic intensity per antenna [10]. The data's spatial resolution and accuracy are thus typically higher in densely populated urban environments than in rural areas [12]. Fairly detailed human mobility patterns can nonetheless be reconstructed from such communication data sets (cf. [13]).
A variety of human mobility studies focus on individual urban environments. Such studies show, for instance, that human interactions significantly correlate with administrative areas [14], that individual daily mobility patterns correlate with the distance to attractive geographic areas [15], or that the trajectories of individuals in combination with other geographic preferences such as land use can be used to characterize human activity and consequently better understand people's travel demands [1]. Furthermore, in [16], the recurring patterns of people's mobility show that the regularity of mass movements correlates with social and environmental indicators. With respect to public events, the analysis of collective human movement [17] shows that people living close to the event are more attracted than those living further away. To explore human movement dynamics in metropolitan areas, the tool developed by Martino et al. [7] allows for the investigation of both individual and aggregated cell phone traces at different spatial as well as temporal scales. On a very detailed individual level, Yuan et al. [11] show that individual mobile phone usage data correlate significantly with the user's travel behavior in terms of scale, shape, and randomness of their traces.
Several studies compare human mobility between urban environments. For instance, statistically significant differences in human mobility patterns regarding the average distance traveled, the individual's area of influence, and the geographic sparseness of the social network can be observed between countries at a developing and at an advanced economic level [5]. Major cities such as Los Angeles and New York City (both USA), which are certainly at an advanced economic level, also show different mobility patterns [6] in terms of daily and maximal travel distance.
However, studies that investigate the characteristic spatio-temporal patterns of collective human mobility from a more dynamic perspective and offer a comparison between different urban environments are very rare. One very recent example, based on a 9-day time series of individual mobile phone data, shows similarities and dissimilarities of the "dynamic mobility patterns" at the radio cell level within the same city [18]. Although this study shows an interesting approach to measuring the similarity of the cells' average weekday/weekend temporal signatures, it does not allow for a comparison of the mobility patterns between different urban environments. Furthermore, many investigations of human mobility, including the studies mentioned above, make use of various visualization techniques in order to effectively communicate the studies' outcomes, but neglect the intrinsic analytical power of such techniques. That is, visual analytics tools are rarely used to explore data and, thereby, reveal meaningful results.
Visual analytics, as defined by [19], aims to lay the conceptual and methodological foundation for the development of supportive visualization tools in a mainly explorative context. This is particularly true for vast data sets such as user-generated mobile network traffic. Whereas cartography primarily focuses on spatial data properties, typical visual analytics tools consider several more dimensions, above all time [20]. Besides, visual analytics is an interactive and iterative process by definition [21]. The visual analysis of spatio-temporal data can follow two complementary logics [22], namely, focusing on a certain location or region over time, or focusing on a specific time or time interval and investigating space. The main benefit of explorative visual analytics is the possibility of viewing the data from multiple perspectives and at different scales simultaneously, whilst never losing the overview. Visual analytics facilitates highly interactive data exploration, data analysis, and presentation; hence it is used for inductive rather than deductive reasoning.
In summary, analyses and comparisons of human mobility patterns in and among urban environments have so far largely disregarded the patterns' intrinsic spatio-temporal characteristics. Also, due to the predominance of mathematical, statistical, or probabilistic modeling, the use of visual analytics to efficiently extract meaningful results has not yet been sufficiently addressed.

Study Areas
The study areas were the four largest urban environments within the Friuli Venetia Giulia region in Northern Italy: Gorizia (~35,000 inhabitants), Pordenone (~50,000), Trieste (~205,000) and Udine (~100,000). Each city is the capital of one of the four provinces of the same name; thus the study areas have similar administrative functions. The official 2001 administrative data sets were used, firstly, to distinguish urban from non-urban areas, and, secondly, to take into account the spatial distribution of the population density (Figure 1). In order to consider the close periphery as well, we calculated 1 km buffers for each of the officially declared urban areas. For simplicity and visibility purposes these buffers were not clipped at national borders (in the case of Gorizia) or at coasts (Trieste as a major seaport of the region combines seaside and landside mobility).

Mobile Network Traffic Data Used
Addressing the representativeness of the data used for urban mobility analysis, the fully anonymized and aggregated user-generated mobile network traffic was provided by an Italian telecom operator with a market share of 34.2%. In the Friuli Venetia Giulia region, the mobile penetration rate, i.e., the ratio between the number of mobile phones and the number of people in percent, is about 155% (both proportions are from the year 2010 [23]). Therefore, more than one third of the regional population's mobile communication activity was assumed to be reflected in the user-generated mobile network traffic used herein. Switched-off mobile phones were not captured, but this effect was assumed to be randomly distributed spatially as well as temporally; thus it did not bias the spatio-temporal analysis of the collective mobility patterns (however, it would bias the analysis of individual mobility).
The first and the second step were designed to ensure valid and consistent mobile network traffic data. The raw telecom data files were processed in a Java application, developed to transfer the data into a spatially enabled database. As a result of those first two steps, more than 45 million data entries with more than 88 million handovers remained for the exploratory visual data analysis. This database was then integrated into the visual analytics software tool's inherent and proprietary data structure to ensure the best possible analysis performance.
In the exploratory visual data analysis (Figure 3, step 3 and step 4) we iteratively applied the two steps to maximize the extraction of mobility information in an interactive manner. In order to account for regular as well as irregular mobility behavior, we put emphasis on two types of analysis. The analysis of characteristic patterns, which is described in detail in section four, was dedicated to uncovering the typical spatio-temporal patterns of the collective human mobility per day of the week. These patterns were then compared among the four urban test areas in order to address similarities and differences in their functional configuration with respect to mobility. In the detection of exceptional events, which is described in detail in section five, we distinguished visually noticeable "outliers" from the "background pattern" in a highly intuitive way. The outliers' spatio-temporal properties were then associated with real-world exceptional events, e.g., concerts.

Software Tools Used
The software tools used included: Tableau Software 7 for visual data analysis, visualizations, and, in combination with SPSS Statistics 17, for data integrity and validity checks; ESRI ArcGIS 10 Desktop SP4 for spatial data analyses and visualizations; Java for raw data processing; PostgreSQL 9.1 and PostGIS 1.5 for (spatial) data management.

Characteristic Temporal Mobility Patterns
The analysis of temporal mobility patterns herein comprises the following common characteristics: the subject of analysis is the handovers' cell-link, i.e., the direct link between the origin cell and the destination cell (refer to the blue arrow in Figure 2), not the cells' location; to reduce the influence of outliers on the average mobility per cell-link we calculated the median of the corresponding handovers; to preserve the original spatio-temporal characteristics inherent in the handover data, the data have not been normalized by, e.g., population or overall network traffic, so the data set remains unbiased; for the purpose of comparing the absolute mobility patterns, the scaling of the axes in the respective diagrams is equal for the four urban environments. The patterns shown in Figures 4 and 5 are complementary and provide two different views of the characteristic temporal mobility for each of the four urban study areas per day of the week.
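The median-based aggregation per cell-link mentioned above can be illustrated with a minimal sketch. The record layout (origin cell, destination cell, weekday, hour, handover count) and all numbers are hypothetical and only serve to show why the median is less sensitive to a single exceptional day than the mean; this is not the processing chain used in the study.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical records: (origin_cell, destination_cell, weekday, hour, handovers).
# Each row stands for one observed day; the values are invented for illustration.
records = [
    ("A", "B", "Mon", 8, 120),
    ("A", "B", "Mon", 8, 130),
    ("A", "B", "Mon", 8, 900),  # one exceptional day, e.g., a public event
    ("B", "C", "Mon", 8, 40),
    ("B", "C", "Mon", 8, 44),
]


def aggregate_per_cell_link(rows, aggregate):
    """Group handover counts by (cell-link, weekday, hour) and aggregate them."""
    groups = defaultdict(list)
    for origin, destination, weekday, hour, count in rows:
        groups[(origin, destination, weekday, hour)].append(count)
    return {key: aggregate(values) for key, values in groups.items()}


typical = aggregate_per_cell_link(records, median)
average = aggregate_per_cell_link(records, mean)

for key in typical:
    print(key, "median:", typical[key], "mean:", round(average[key], 1))
```

For the cell-link A to B the single exceptional day pulls the mean far above the two ordinary days, whereas the median stays at an ordinary value.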
Figure 4 shows two patterns per study area: the upper patterns show the median sum of all handovers per hour and cell-link, i.e., the cell-link's hourly total mobility, and the lower patterns show the absolute median difference of origin-to-destination and destination-to-origin handovers per [...]
To allow for a better comparison of both similarity and symmetry, the total mobility and the absolute net migration flow have been normalized between 0 (=minimum) and 1 (=maximum) for each area separately. All patterns show that the maximal total mobility is reached on Tuesday, closely followed by Wednesday; the minimal total mobility is clearly on Sunday. However, the absolute net migration flows (Figure 7, bottom) start to diverge on Wednesday and converge again on Sunday (Figure 7, bottom left). In fact, the smallest city, Gorizia, shows the comparatively highest mobility activity on Friday and Saturday, which is confirmed as an asymmetric mobility behavior in Figure 7 (bottom right, indicated with two arrows).
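The per-area normalization between 0 (minimum) and 1 (maximum) can be sketched as a plain min-max rescaling. The weekly totals below are invented placeholders (Monday to Sunday), chosen only to show that two areas with very different absolute volumes can yield the same normalized shape; the function name is likewise an assumption.

```python
def min_max_normalize(values):
    """Rescale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant series has no spread; map everything to 0 by convention.
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]


# Hypothetical total mobility per weekday (Mon..Sun), not taken from the study.
weekly_totals = {
    "Gorizia": [10, 14, 13, 12, 12, 9, 6],
    "Trieste": [100, 140, 130, 120, 120, 90, 60],
}

# Normalize each area separately, as described in the text.
normalized = {area: min_max_normalize(values) for area, values in weekly_totals.items()}

for area, values in normalized.items():
    print(area, values)
```

Because each area is rescaled on its own, the comparison focuses on the shape of the weekly curve rather than on the absolute number of handovers.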

Discussion of Characteristic Mobility Patterns
When comparing the characteristic mobility patterns among the four study areas, some functional configurations of both the urban environment and the mobile network can be derived. For instance, according to Figures 4 and 5, high mobility in Gorizia, Trieste, and Udine is concentrated at up to three cell-links. In Pordenone, however, the three mobility gateways identified in Figure 5 (P1, P2, and P3) are hardly identifiable within the temporal signatures shown in Figure 4. Pordenone thus shows, in contrast to the other three urban environments, a temporal mobility pattern with a relatively consistent amplitude among all cell-links. The specific geographic location of these mobility hotspots and mobility gateways (Figure 6) correlates with locations of high population density (Figure 1). From a mobile network point of view, this potentially indicates a better load-balancing of antennas in the area of Pordenone as compared to Udine, Trieste, or especially Gorizia (refer to Figure 4: G1, G2). Furthermore, some other factors can be identified that possibly explain the aforementioned locations. For instance, the most likely reason for the location of the distinct mobility gateway U1 (Figure 6) is a major street that crosses the city's ring road in the North of Udine.
The spatial patterns of collective human mobility (Figure 6) reveal that the density of mobility and the population density (Figure 1) show similar spatial distribution patterns, which is in agreement with [24]. Furthermore, different structures arise when comparing the overall shape and orientation of the individual mobility density among the other cities. For instance, Pordenone and Udine seem to follow a centralized rather than a decentralized city model (Figure 6). In contrast, Gorizia shows two "mobility centers", one in the North and one in the South, but both are relatively far away from the two mobility gateways G1 and G2. Gorizia can thus be seen as a rather "decentralized" city in terms of mobility.

Detection of Exceptional Mobility Patterns
In order to visually detect significant variations within the cell-links' absolute net migration flow, which are potentially caused by real-world events, we show a minimum of two reference days for comparison purposes. For the intuitive recognition of the overall magnitude of the absolute net migration flow we used stacked bar charts instead of the previously used line charts and visualized the flow per cell-link in 15-min time intervals per day (each color in the time series plot refers to a cell-link).
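The data preparation behind such a stacked bar chart can be sketched as follows. This is an illustration under stated assumptions, not the tool chain used in the study: the event layout (timestamp, cell-link identifier, direction) is hypothetical, and the absolute net migration flow per interval is assumed here to be the absolute value of the signed sum of handovers in both directions.

```python
from collections import defaultdict
from datetime import datetime


def quarter_hour_slot(timestamp):
    """Index of the 15-min interval within the day (0..95)."""
    return timestamp.hour * 4 + timestamp.minute // 15


def absolute_net_flow_per_slot(events):
    """Return {slot: {cell_link: absolute net flow}} for one day of events.

    Each event is (timestamp, cell_link, direction) where direction is +1 for
    origin-to-destination and -1 for destination-to-origin (an assumption).
    """
    signed = defaultdict(lambda: defaultdict(int))
    for timestamp, cell_link, direction in events:
        signed[quarter_hour_slot(timestamp)][cell_link] += direction
    return {
        slot: {link: abs(total) for link, total in links.items()}
        for slot, links in signed.items()
    }


# Hypothetical handover events for two cell-links.
events = [
    (datetime(2009, 7, 23, 14, 5), "L1", +1),
    (datetime(2009, 7, 23, 14, 10), "L1", +1),
    (datetime(2009, 7, 23, 14, 12), "L1", -1),
    (datetime(2009, 7, 23, 14, 20), "L1", +1),
    (datetime(2009, 7, 23, 14, 7), "L2", -1),
    (datetime(2009, 7, 23, 14, 9), "L2", -1),
]

stacks = absolute_net_flow_per_slot(events)
# The height of each stacked bar is the sum over all cell-links in that slot.
bar_heights = {slot: sum(links.values()) for slot, links in stacks.items()}
print(stacks)
print(bar_heights)
```

Each inner dictionary corresponds to one bar, with one colored segment per cell-link.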

Concert
Figure 8 shows the absolute net migration flow from 22 to 24 July 2009. Per day, over 30,000 successful handovers are visualized. In contrast to the relatively regular rag-rug-like pattern on 22 and 24 July, one cell-link (marked with an orange arrow) notably stands out on 23 July. This particular cell-link was located right at the main entrance of Udine's soccer stadium, the Stadio Friuli (Figure 8 right, orange point in the aerial photo; the smaller detailed image shows that the cell-link represented as a point is in fact a line). Whereas only 1,635 successful handovers were recorded for 22 July, this number rose to 14,238 on the following day, thus indicating a big event in the stadium. Consulting additional information resources revealed that a Bruce Springsteen concert was the reason for the specific pattern on 23 July in Figure 8.
The temporal pattern reflects a typical mobility behavior pattern of the mass: fans started to arrive at the stadium in the early afternoon; during the concert the number of handovers decreased before the traffic increased rapidly around midnight, when the concert ended.
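The jump in handovers at the stadium cell-link reported above can be put into numbers with a small sketch. In the study the outlier was identified visually; the ratio to a reference day shown here is only an illustrative complement, and the function name is an assumption. Only the two daily counts stated in the text (1,635 and 14,238) are used.

```python
from statistics import median


def ratio_to_reference(event_day_count, reference_day_counts):
    """Ratio of the event-day handovers to the median of the reference days."""
    return event_day_count / median(reference_day_counts)


# Daily successful handovers at the stadium cell-link, as stated in the text.
reference_days = [1635]   # 22 July 2009
event_day = 14238         # 23 July 2009

factor = ratio_to_reference(event_day, reference_days)
print(f"Handovers on the event day were about {factor:.1f} times the reference level")
```

With more than one reference day available, the median keeps a single unusual reference day from distorting the comparison.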

Soccer Match
Figure 9 shows an outstanding cell-link in the evening hours of 23 August; it is the same cell-link as highlighted in Figure 8. In this case the temporal pattern of that cell-link is visually somewhat different from the previous example as there are three distinguishable peaks and two dips. The detailed view of these evening hours (Figure 9, right side) reveals some specific characteristics: the absolute net migration flow at this cell-link rises before it decreases for three time slices, increases again for one 15-min step, and decreases again for three time intervals before the activity of that cell-link "disappears" in the background pattern.
This pattern perfectly correlates with the characteristics of a soccer match (in this case it was a match of Udinese versus AC Milan in Italy's first soccer league). This specific pattern is easily distinguishable from the rest and most likely unique to soccer matches. Furthermore, the absolute net migration flow can potentially serve as a proxy for the number of people who were present at this specific location. According to the match report, slightly more than 15,000 people officially attended the match. The absolute net migration flow for this cell-link and this day is 3,926. The ratio between 3,926 and 15,000 is 0.262. This ratio is also comparable with other soccer matches: on 23 August 2009 Udinese played against AC Milan with 21,000 people in the stadium and an absolute net migration flow of 5,758, i.e., a ratio of 0.274. On 27 September 2009 Udinese played against FC Genoa with slightly over 15,000 people in the stadium and an absolute net migration flow of 3,458, i.e., a ratio of 0.231.
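The ratios quoted above follow directly from dividing the absolute net migration flow by the official attendance. The short sketch below reproduces this arithmetic with the figures from the text; attendances given as "slightly more than 15,000" are approximated as 15,000, as in the text, and the labels are merely descriptive.

```python
# (label, absolute net migration flow, approximate official attendance)
matches = [
    ("match shown in Figure 9", 3926, 15000),
    ("Udinese vs. AC Milan, 23 August 2009", 5758, 21000),
    ("Udinese vs. FC Genoa, 27 September 2009", 3458, 15000),
]

# Ratio of net flow to attendance, rounded to three decimals as in the text.
ratios = {label: round(flow / attendance, 3) for label, flow, attendance in matches}

for label, ratio in ratios.items():
    print(f"{label}: {ratio}")
```

The three values lie within a narrow band, which is what motivates the suggestion that the flow could serve as a rough proxy for attendance.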

Weekend
The following example demonstrates that visually detectable outliers were not necessarily connected to an official or public event.Instead, the outliers can signpost increased overall mobility in certain areas at a given time interval and thus potentially serve as indicators for, e.g., attractiveness of shopping centers or pubs, difficulties in traffic flow, etc.
Figure 11 shows a pattern for a weekend indicating that areas with an average activity during the week exhibited increased mobile network traffic on Saturday night. As the map in the figure shows, the four most frequented cell-links are geographically very close to each other; this may indicate some extraordinary nightlife activity.

Discussion of Exceptional Events
The example of the Bruce Springsteen concert (Figure 8) gives a good impression of how big real-world events are traceable in the temporal and spatial dimensions. A simple time histogram reveals the underlying process and allows for insights as to when and where masses move.
The mobility patterns during a soccer match (Figure 9) are potentially important for further applications, above all real-time monitoring of events: on the one hand, exceptional events can be classified into known (e.g., soccer matches) and unknown patterns (indicating, e.g., spontaneous gatherings); on the other hand, the absolute net migration flow indicates the potential for a rough estimation of the number of people on the move.
The example of the parade (Figure 10) shows how collective mass movements can be detected in fully anonymized and highly aggregated mobile network data without any tracking function. The magnitude, the location, and the temporal development of such an event can be deduced from the combination of a stacked bar chart and the corresponding geographic map.
The spatial and temporal coincidence of outliers in the absolute net migration flow, as shown in Figure 11, allows for reasoning about the functional configuration of the affected area (e.g., a popular bar for socializing, some difficulties at a specific road intersection, etc.).

Conclusions
In this paper we visually examined the spatio-temporal patterns of collective human mobility derived from user-generated mobile network traffic data. In order to answer the research question we developed a four-step analysis approach, from raw data processing to urban mobility information visualization, and searched for characteristic and exceptional mobility behavior. Based on the example of Gorizia, Pordenone, Trieste, and Udine we extracted characteristic mobility patterns and compared them among the four test areas. We showed the characteristic spatio-temporal configuration of the mobility gateways and mobility hot-spots in terms of total mobility and absolute net migration flows. As a matter of fact, we revealed surprising similarities and symmetries of the mobility among the urban test areas. This seems to be a logical conclusion from ([3], p. 7205), namely "[…] that cities are self-similar organizations, indicating a universality of human social dynamics […]". Furthermore, we detected several exceptional mobility patterns that could be associated with specific real-world events. The spatio-temporal mobility pattern of the Bruce Springsteen concert is similar to the pattern of the Madonna concert [7], but they can hardly be directly compared on a purely visual basis.
We conclude that the analysis approach presented, which relies solely on the intrinsic power of different visualization techniques, is an effective and intuitive way to derive both characteristic and exceptional mobility patterns from vast mobile network data. The results enable a better understanding of both the large-scale collective urban mobility behavior and the "pulse" of dynamic urban systems. However, the findings are purely descriptive and based on "only" one data set, i.e., no information about the underlying causalities can be disclosed without additional information.
Potential application areas of the visual analytics approach presented include, for instance, large-scale near real-time monitoring of urban mobility, development of more effective mobility control mechanisms with respect to traffic flow management, or the intelligent "live" detection of exceptional events with respect to emergency management. Referring to the catastrophe at the "Love Parade" 2010 in Duisburg, Germany, such information could, if at hand in real time, be of crucial importance to ensure people's safety and security at big events, and thus save lives.

Figure 1. Case study areas: four urban environments in the Friuli Venetia Giulia region including their close periphery.

Figure 8. Bruce Springsteen concert in Stadio Friuli, Udine, on Thursday, 23 July 2009; the outstanding orange pattern refers to the stadium's location as shown in the aerial photo to the right (the white lines indicate other cell-links).

Figure 10 shows how a "moving event", in this case a parade, is reflected in the absolute net migration flow. The pattern on 13 September 2009 shows five cell-links with comparably high flows at different time intervals and different locations, thus indicating a collective movement. This spatio-temporal pattern correlates with the program of the parade [25].

Figure 10. Celebration of the 60th anniversary of mountain infantry in Udine on Sunday, 13 September 2009 (the color of the arrows corresponds with the color in the map).

Figure 11. Unknown event in the Udine city center on Saturday night, 19 September 2009 (the color of the arrows corresponds with the color in the map).