Urban mobility is one of the challenges that cities have to face. Characterizing urban mobility is essential for designing more effective urban policies, such as those concerning the transport network. Conventional methods, such as censuses or surveys based on questionnaires or interviews, cannot describe the complexity of population mobility, since they capture only an isolated snapshot of the phenomenon. Consequently, these methods are unable to detect the spatial–temporal dynamics of mobility occurring in large cities.
The large penetration of mobile phones, together with the ability to locate users by analyzing telecommunication traffic, makes it possible to use this technology to study population mobility at high spatial and temporal resolution. Compared with conventional data, population data derived from mobile phone traffic offer new opportunities for a better understanding of urban usages in time and space [1
Some studies [2
] first introduced the use of mobile phone data for urban population analysis. A few investigations [4
] presented comprehensive reviews on the different spatial studies using mobile phone data, presenting the progress that has been achieved and the potential of mobile phone data. Steenbruggen et al. [4
] presented a comprehensive review and a typology of spatial studies on mobile phone data. Wang et al. [5
] reviewed the existing travel behavior studies that have used mobile phone data, presenting the progress achieved and their potential in this field of research. Calabrese et al. [6
] reviewed the use of mobile phone data for urban sensing, outlining the data that can be collected from telecommunication networks as well as their strengths and weaknesses.
Several authors [7
] carried out studies on the mobility of urban populations by means of mobile phone data using different methods, which vary with data types and research purposes.
Many studies refer to the identification of either travel, activity or mobility patterns in different countries. Ahas et al. [7
] developed a methodology for measuring time use patterns of urban life (or social time) finding differences in social time patterns across the cities of Harbin, Paris, and Tallinn. Sevtsuk and Ratti [8
], by applying FFT and a multilevel regression model of Erlang data, were able to distinguish the hourly, daily, and weekly activity distribution patterns in the city of Rome. Secchi et al. [14
] identified subregions of the metropolitan area of Milan sharing a similar pattern along time by means of a nonparametric method. Widhalm et al. [13
] modeled the dependencies between activity type, trip scheduling, and land use types in the cities of Vienna and Boston, via a Relational Markov Network. The Far East areas were often studied for mobility patterns. Fang et al. [17
] and Yang et al. [18
] assessed the stability of human convergence and divergence patterns in the city of Shenzhen, China, using a spatiotemporal model. Yuan et al. [20
] found the Dynamic Time Warping algorithm effective in exploring the similarity/dissimilarity of urban mobility patterns in a masked city of China. Human mobility patterns in the city of Shenzhen, China were also studied by Sun et al. [22
] and Xu et al. [23
] by applying Principal Component Analysis on mobile phone data and a hierarchical clustering algorithm on aggregated human mobility patterns respectively. Lee et al. [24
] analyzed and compared urban activity and mobility patterns from mobile phone records across 10 cities in Korea, presenting the internal and external mobility of phone users and calculating urban attractiveness. Jiang et al. [25
] quantified spatial distributions of travel patterns by residents in different parts of the city of Singapore. Fan et al. [26
] proposed an approach of human trajectory gridding reconstruction based on mobile phone location data to estimate the urban crowd flux in the city of Beijing, China.
Trips and ranges of travel of people were also studied by several authors [10
]. Trasarti et al. [28
] used mobile phone data to extract interconnections between different areas of the city of Paris from highly correlated temporal variations of local population densities. Understanding the city dynamics through GIS visualization or real-time urban monitoring system, both based on mobile phone data, was approached by Demissie et al. [9
] and Calabrese et al. [27
]. At a larger spatial scale, Deville et al. [16
] demonstrated for Portugal and France, how spatial–temporal explicit estimations of population densities can be produced at national scales. Other studies [31
] faced the problem of classifying or identifying land use or important living places by using mobile phone data, while Gabrielli et al. [33
] classified mobile phone users into behavioral categories by means of their call habits. Mobile phone data were also used to track populations during big social and entertainment events. In particular, Traag et al. [34
] and Furletti et al. [29
] used such data to detect social events and to develop a framework for deciding who attended an event. Pucci et al. [1
] analyzed mobile phone data to study important business and sport events occurring in the city of Milan, Italy.
Population mobility affects different fields such as the environment, the economy, and health. The latter is particularly relevant to travel because of road accidents. According to the Italian National Statistical Institute, approximately 175,000 accidents occurred in Italy in 2017, causing 3378 deaths and injuring approximately 246,000 people. In the same year, the National Institute for Insurance against Accidents at Work (INAIL) registered ~91,000 work-related accidents involving transportation (14% of total accidents), ~71,000 (11%) of which occurred while commuting between home and workplace. Most of these accidents occurred in metropolitan areas. Traveling time, home–workplace distance, type of transportation, and job type are examples of key parameters for assessing the risk of these accidents. When vehicles are used for work, as in parcel delivery jobs, the risk of accident increases with exposure time. To mitigate the social and health impacts, it is important to obtain information on the driving mechanisms of these phenomena, which depend on the urban structure, the population's mobility demand, the location of workplaces with respect to residential areas, and the availability of the transport system. The economic structure, with its availability of jobs and employees, plays a fundamental role in driving population mobility, which might depend on the type of activity and its location with respect to residential areas.
Based on the above motivations, it is important to carry out a study that starts from an urban mobility analysis and links it with information about the type of population involved, the location of workplaces, and the types of economic activity. The metropolitan area of Rome, Italy, was chosen as the study area because it is strongly affected by population mobility driven by work-related reasons. The city lacks a detailed assessment of the mobility phenomenon and its characteristics, which could be used to plan interventions aimed at reducing work-related population mobility and, consequently, the impact of road accidents. In particular, there are open questions about the size of this phenomenon; its spatial, temporal, and demographic characteristics; and its driving mechanisms with respect to the city’s economic structure.
This paper aims to answer some of the above questions through a mobility study based on mobile phone traffic data. It presents results on the classification of Rome’s urban area in terms of mobility time patterns, on the modifications of the background population demography induced by urban mobility, and on the connection with the local economic structure and its demand for jobs. With respect to the above literature, this paper can be positioned among a number of studies [1
] which analyzed aggregated mobile phone data to get insights into the spatiotemporal distribution of urban population. Whenever possible in the paper, a comparison with results obtained in those studies is presented.
2.1. Description of Studied Area
The study is focused on the metropolitan area of Rome, Italy. Rome is the largest Italian city, with a population of ~2.5 million inhabitants in a 1290 km2 area as of the 2011 Italian Census. The majority of the population lives within the large urban area, but the city also includes suburban communities. For work-related reasons, Rome attracts a large number of people living in its outskirts or in other provinces. The main direction of this commuting phenomenon is toward the city’s center, where business and service activities are located. Rome also attracts many tourists, who also contribute to the baseline mobility. In addition, the economic structure of Rome is rather complex, as it includes production units in other semicentral areas, which also generate intra-urban mobility.
2.2. Population Data Derived by Mobile Phone Traffic
Population data were provided within the framework of the TIM BIGDATA Challenge 2015. The data relate to subscribers of TIM, an Italian mobile phone operator whose market penetration is 32% at the national level. As a consequence, the data capture only a sample of the actual population. Two types of data were used in this study: the presence population dataset and the demographic dataset. They were provided by the mobile company and used by the authors as is; the authors did not take part in processing the raw mobile traffic data to produce either dataset.
The presence population dataset provides the number of persons located in the studied area at an aggregated level. It is based on all types of mobile phone communications (e.g., calls, text messages, and Internet). Privacy is guaranteed by design, and tracking of individuals was not possible with this dataset. The methodology used to derive population presence data from mobile phone traffic is described elsewhere [35
] and only summarized herein. Basically, for each serving Base Transceiver Station (BTS) belonging to the network, a coverage area is defined by taking into account its technical characteristics, orography, and characteristics of the host building. If a user performs a telecommunication action (e.g., phone call, text message, or internet connection) while connected to a BTS, his/her presence is distributed over the coverage area associated with that BTS. The coverage area was gridded to obtain maps of the spatial distribution of the population by summing the number of users estimated to be in each cell of the grid at a certain time. Users are associated with cells depending on where they performed their last telecommunication action. Therefore, if a user is recorded in cell A at time t = t0, performs an action in cell B at t = t1, and finally performs an action in cell C at t = t2, his/her presence will be recorded in cells A, B, and C sequentially. The above procedure maintains the position of an individual up to the next phone action, preventing the loss of individuals due to infrequent mobile phone use.
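The "last known position" rule described above can be illustrated with a minimal sketch. It is not the operator's actual processing chain: it assumes, for simplicity, that each action is assigned to a single grid cell rather than being distributed over a BTS coverage area, and the names `presence_by_slot`, `actions`, and `n_slots` are hypothetical.

```python
from collections import Counter

def presence_by_slot(actions, n_slots):
    """Count users per cell in each time slot, keeping each user in the
    cell of his/her last recorded action until a new action occurs.

    actions: list of (user_id, slot_index, cell_id) tuples (hypothetical format).
    Returns a list of Counter objects, one per time slot.
    """
    # Group actions by time slot; a later action in the same slot overrides an earlier one.
    by_slot = {}
    for user, slot, cell in sorted(actions, key=lambda a: a[1]):
        by_slot.setdefault(slot, {})[user] = cell

    last_cell = {}  # user_id -> most recent cell
    counts = []
    for slot in range(n_slots):
        last_cell.update(by_slot.get(slot, {}))    # move users who acted in this slot
        counts.append(Counter(last_cell.values())) # everyone else stays where last seen
    return counts

# Invented example: user u1 acts in cells A, B, C at slots 0, 2, 4; u2 acts once in A.
actions = [("u1", 0, "A"), ("u1", 2, "B"), ("u1", 4, "C"), ("u2", 1, "A")]
counts = presence_by_slot(actions, n_slots=5)
```

In this toy example, u1 is still counted in cell B at slot 3 even though no action occurs then, which is the behavior that prevents infrequent users from dropping out of the counts.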
The demographic dataset provides the number of TIM users by age range, at an aggregated level. The data were based on Call Detail Records (CDRs) generated by outgoing calls. This dataset provided the number of persons belonging to each age range (<18; 18–30; 31–40; 41–50; 51–60; >60; unknown).
The two datasets differ substantially in their ability to track population density. Unlike in the presence dataset, the number of detected persons in the demographic dataset is missing or very low during late night-time, owing to the lack of sufficient CDR events. Furthermore, the two datasets cannot be considered either connected or related and have to be analyzed separately. To guarantee a sufficient amount of data, the analysis of the demographic dataset was restricted to daytime (8–21 local time). In addition, the class of unknown age was not considered, as it could not be linked to census data.
The presence and demographic datasets provided gridded aggregated data at high time and space resolution for the studied area. Figure 1 (top and bottom) shows the grids where population data were provided for the whole area and for the main metropolitan area of Rome, respectively. The grid covers an area of 114 × 114 km2 with 927 grid cells of different sizes. A higher spatial resolution is used (from 0.26 × 0.34 up to 1.0 × 1.3 km2) within the urban area, while a coarser one is applied outside (from 4 × 5 up to 16 × 20 km2). The data at each cell are provided by the mobile phone company at a time resolution of 15 min for the period 1 March to 30 April 2015. Data were first averaged on an hourly basis over the whole period, and then selected for weekdays for further analysis.
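The temporal aggregation described above can be sketched as follows. This is only an illustration under stated assumptions: the record format `(timestamp, cell_id, count)` and the function name `weekday_hourly_profile` are hypothetical, and the sketch filters weekdays and averages by hour of day in a single pass, which gives the same result as averaging to hourly values first when every hour contains the same number of 15 min samples.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def weekday_hourly_profile(records):
    """Mean population per cell and hour of day, using weekdays only.

    records: iterable of (datetime, cell_id, count) at 15 min resolution
             (hypothetical format).
    Returns {cell_id: [24 values]}, with None for hours without data.
    """
    sums = defaultdict(lambda: [0.0] * 24)
    nums = defaultdict(lambda: [0] * 24)
    for ts, cell, count in records:
        if ts.weekday() >= 5:  # skip Saturday (5) and Sunday (6)
            continue
        sums[cell][ts.hour] += count
        nums[cell][ts.hour] += 1
    return {
        cell: [s / n if n else None for s, n in zip(sums[cell], nums[cell])]
        for cell in sums
    }

# Invented example for one cell, 08:00-08:45 on three consecutive days.
quarter = timedelta(minutes=15)
sunday = datetime(2015, 3, 1, 8, 0)   # 1 March 2015 was a Sunday
monday = datetime(2015, 3, 2, 8, 0)
tuesday = datetime(2015, 3, 3, 8, 0)
records = (
    [(sunday + i * quarter, "c1", 1000) for i in range(4)]            # ignored (weekend)
    + [(monday + i * quarter, "c1", v) for i, v in enumerate([10, 20, 30, 40])]
    + [(tuesday + i * quarter, "c1", 50) for i in range(4)]
)
profile = weekday_hourly_profile(records)
```

The result is one 24-value daily profile per grid cell, which is the kind of input that hourly time-pattern analyses operate on.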
2.3. Census Statistical Data
Data from the last Italian census survey, carried out by the Italian National Statistical Institute in 2011, were also used in this study for validation of results and their connection with the local economic structure. It includes information about demography, number of residential and production, commercial and services buildings, and mobility of population. This latter is described by parameters such as exit time, travel distance, and duration [36
]. In addition, data about the amount of population moving daily in and out from their residence are provided at census section level. A general census of industry and services was also carried out [37
], which provided, at census section level, the number of production units and employees by economic activity class, according to the ATECO classification. This kind of census survey involves a sample of 260,000 large and small/medium enterprises, 470,000 nonprofit organizations, and 13,000 public institutions. According to census data, the city of Rome accounts for most of the studied region's totals: 367,906 production units (90%) and 1,500,000 employees (66%). Data were grouped into 19 economic macrocategories. Table S1 of the Supplementary Material summarizes the number of units and employees by macrocategory for the Lazio region. Wholesale and retail commerce is the most frequent type of production unit (23%), followed by professional, scientific, and technical services (16%). However, in terms of number of employees, the former accounts for 13% and the latter for ~7%. In general, service activities are by far the most frequent economic activity in the studied area. In addition, the number of buildings used for production, commercial, and service purposes was provided at census section level. The census data about the economic structure were used to relate urban mobility to job demand.
As for demography, data were provided at census section level in 15 age classes [38
]. To match these data with those derived from mobile phones, they were reclassified into the closest mobile phone user age classes. Specifically, the following census age classes were used: 15–19, 20–29, 30–39, 40–49, 50–59, and >60. The lowest age class was limited to 15 and above, as younger subjects were considered unlikely to own a mobile phone.
To support the mobility study, complementary data of socioeconomic position of resident population were used. The socioeconomic position index was developed by Cesaroni et al. [39
] using census information that represented various dimensions of deprivation: education, occupation, housing tenure, family composition, and immigration. By using a factor analysis technique, they obtained a 5-level composite index (high, medium high, medium, medium low, low) which summarizes the socioeconomic status of the population living in Rome.
For further analysis, the census information was mapped onto the TIM grid using GIS techniques. Census numerical data were first joined to the section geometry, obtaining a spatial layer with census data as attributes. Sections were then spatially intersected with the TIM grid, obtaining a new layer where sections are split along the TIM cell borders. When appropriate, the density per km2 of each census parameter was used as the invariant value to be assigned, during processing, to all subsections belonging to the same section. Densities were then transformed back using the area of each subsection, to obtain absolute values of each parameter that are proportional to the original one and weighted by subsection size. Lastly, for each census parameter, the sum of the values of all subsections falling in a TIM cell was assigned to that cell.
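The density-based reallocation described above is a form of area-weighted interpolation. A minimal sketch of the arithmetic is given below; it assumes the geometric intersection has already been computed in a GIS and is available as a list of pieces `(section_id, cell_id, area_km2)`, and that each section lies entirely within the grid, so that its area equals the sum of its pieces. The function name `census_to_grid` and the data layout are hypothetical.

```python
def census_to_grid(section_values, pieces):
    """Reallocate census section totals to grid cells by area weighting.

    section_values: {section_id: absolute value of a census parameter}
    pieces: list of (section_id, cell_id, area_km2) subsections obtained by
            intersecting sections with the grid (hypothetical format).
    Returns {cell_id: estimated absolute value}.
    """
    # Section area, reconstructed as the sum of its subsections (assumption).
    section_area = {}
    for sec, _cell, area in pieces:
        section_area[sec] = section_area.get(sec, 0.0) + area

    cell_values = {}
    for sec, cell, area in pieces:
        density = section_values[sec] / section_area[sec]  # invariant per section
        # Transform the density back to an absolute value for this subsection
        # and accumulate it in the grid cell it falls into.
        cell_values[cell] = cell_values.get(cell, 0.0) + density * area
    return cell_values

# Invented example: section S1 straddles cells g1 and g2, S2 lies inside g2.
section_values = {"S1": 1000.0, "S2": 400.0}
pieces = [("S1", "g1", 3.0), ("S1", "g2", 1.0), ("S2", "g2", 2.0)]
grid = census_to_grid(section_values, pieces)
```

By construction the totals are preserved: the values assigned to the grid cells sum to the original census total. This approach is appropriate for count-like parameters but not for ratios or categorical indices.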
The above results point out the significance of population mobility in large metropolitan areas and its connection with the workforce.
Mobile phone traffic, used as a proxy of population presence, is able to track the location of people with high spatial–temporal detail, providing variations of population that are largely due to mobility. Time patterns of population were found to be consistent with other studies carried out in Rome [3
], Milan [30
], Korean cities [24
], and Paris [28
] exhibiting common time features related with daily activity rhythms.
The mobility factor (NPM) introduced in this work allowed the mobility phenomena to be identified and assessed. Hourly time profiles of the mobility factor were calculated and classified into seven clusters. Results were then validated with population, mobility, and building-type census data. A former study [3
] applied a cluster analysis to six time intervals of Erlang data (a measure of network bandwidth usage) collected in the city of Rome, identifying eight clusters. Although that study revealed an overall structure of the city linked to the type of human activities, it could not connect this structure with cell signatures. Our work improved on those results by obtaining a mobility map of the Rome metropolitan area, assigning mean clustered mobility time patterns to the cells and linking them to the specific ongoing economic activities and workforce. Being based on aggregated cell-level population data, this study could not address complex analyses such as the identification of travel, activity, or mobility patterns, the identification of important living places or social events, or the classification of mobile phone users into behavioral categories, as those analyses rely on processing individual mobile phone cards or GPS data not available in this dataset. Consequently, the goal of this paper is not to present a new analysis/technology or to carry out a mobility pattern analysis. Rather, the study takes advantage of open mobile phone data to carry out a conventional mobility study using time variations of population density as a proxy of mobility, classifying the study domain by daily mobility patterns and linking it with other unconventional data like dynamic demography and workforce/economic sector census data.
Dynamic demographic data by age class made it possible to obtain information on the type of population involved in this mobility and to gain insight into the related composition of the population. Although the mean age class population distributions obtained from mobile phone data are consistent with those of census data, a large spread of variation (±15%) is detected for each age class. Consequently, the mobility of the urban population does affect its composition. The cell-level comparison of these mobile-derived age distributions with census data made it possible to assess the variations in age distribution due to mobility. On a scale between 0 and 1, the maximum difference between the mobile-based and census cumulative distribution functions (CDFs) was 0.15 on average, with peaks up to 0.68. The areas with the highest values of the NPM mobility factor were found to have the greatest variations in age distribution, mainly produced by the working-age classes. Therefore, possible intervention measures should be applied in these areas. These demographic results are new and contribute to the analysis of Rome’s urban mobility.
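The maximum difference between two CDFs mentioned above is analogous to the Kolmogorov–Smirnov distance, computed here over discrete age classes. The following sketch shows the calculation for a single grid cell; the function name `max_cdf_difference` and the counts are invented for illustration and are not taken from the study's data.

```python
from itertools import accumulate

def max_cdf_difference(mobile_counts, census_counts):
    """Maximum absolute difference between two empirical CDFs defined over
    the same ordered age classes. The result lies between 0 and 1."""
    def cdf(counts):
        total = sum(counts)
        return [c / total for c in accumulate(counts)]

    return max(abs(m - c) for m, c in zip(cdf(mobile_counts), cdf(census_counts)))

# Invented counts for six ordered age classes in one grid cell.
mobile = [5, 25, 30, 20, 10, 10]    # daytime population seen through mobile data
census = [10, 15, 20, 20, 15, 20]   # resident population from the census
d = max_cdf_difference(mobile, census)  # 0.15 for these invented counts
```

A value near 0 indicates that the daytime age structure of a cell matches its resident population, while larger values indicate that mobility substantially changes who is present in the cell.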
This study found that the number of employees per 100 inhabitants has a log-linear relationship with the total early morning mobility and a strong spatial association with the time patterns of the NPM mobility factor derived by cluster analysis. Although some general results were expected, the analysis allowed the phenomenon of mobility in this large metropolis to be quantified and characterized. A former study [8
] analyzed Erlang data collected in the city of Rome over seven weeks. It found that different hours of the day affect the mobile activity levels of diverse areas of the city differently, and that differences in activity patterns among areas can partially be attributed to differences in their demographics, establishments, and built environment. This study improves on these former results by providing demographic information and by quantifying the contribution of employees per economic activity to mobility. Spatial consistencies with census data about the population moving daily within or out of the municipality of residence, and about both the socioeconomic index and the buildings used for production, commercial, and service purposes, were also found. The combined information on mobility, employees, and economic sector could be used to assess the effect on mobility of relocating offices belonging to the economic macrocategories found to contribute strongly to mobility, such as those of clusters 4 and 6 (high density business & services areas and touristic & commercial areas). Mobile phone data have already been used to relate mobility to socioeconomic position. Blumenstock [40
] reviewed how data can be useful to measure wealth and poverty in developing countries. Pappalardo et al. [41
] used mobility measures and social measures extracted from mobile phone data to estimate indicators for socioeconomic development and well-being in France. Marchetti et al. [42
] showed how big data has the potential to mirror aspects of well-being and other socioeconomic phenomena. They suggested three ways to use big data together with small area estimation techniques: creating local indicators and comparing them with those obtained with small area estimation methods; using big data sources to generate new covariates; and using survey data to check and remove the self-selection bias of the indicator values obtained using big data. The present study partially relates to the first and third of these. It creates a local mobility indicator (the cell-based NPM) and uses survey data (population moving daily, socioeconomic position index, and number of employees per economic sector) to check spatial consistency and validate results.
A few limitations of this study have been identified. Firstly, the short period of analysis (two months) does not permit conclusive evaluations of the seasonal variability of population mobility. Secondly, for individuals with a low rate of phone activity, position tracking could lack accuracy. Demographic data might underrepresent some age classes, particularly those who are not frequent callers, such as elderly people. In addition, because population data were not available at the individual level, we could not detect population variations arising from exchanges of equal numbers of people moving to and from connected grid cells (intercell mobility); such exchanges are expected to produce no net effect on the population count of the cells involved. The lack of individual data does not allow a deeper analysis of travel, activity, or mobility patterns such as those conducted in many mobility studies [17
]. Being based on offline analysis of presence data derived from mobile phone communications, this study does not provide tools for real-time applications such as those able to monitor the population and its composition, support emergency analysis involving the population, or deliver information services to public transportation systems. Nevertheless, the study represents the first insight into the actual time-dependent distribution of people in the complex reality of the city of Rome.
Exploiting aggregated data available as open data for the first time in Italy, the study provides, for the city of Rome and its Province, an accurate and in-depth understanding of population mobility in the metropolitan area, including its timing, spatial distribution, demographic effects, and driving mechanisms in terms of workforce contributions from different macroeconomic categories. These results could support the identification of effective mobility measures aimed at reducing work-related commuting between home and workplace and, consequently, the risk of road accidents and related injuries. These accidents occur mainly because of the limited use of public transport for individual commuting; hence, mobility-driven urban planning measures can help reduce the social and health costs caused by home–work commuting and work-related road accidents. The timing of mobility at cell level could support, for example, the tuning of public transportation plans to provide services where they are needed. The workforce contributions from specific economic sectors to the classified mobility patterns could provide data to evaluate the effectiveness of relocating selected economic activities to reduce work-related mobility. Finally, the demographic results provide information about the people involved in the mobility phenomena, making it possible to correctly design and target specific communication campaigns or marketing measures.