Modeling Population Spatial-Temporal Distribution Using Taxis Origin and Destination Data

Knowledge of population distribution at the finest possible spatial-temporal resolution is essential for urban infrastructure design, policy-making, and urban planning, particularly under hazardous circumstances. This study investigated the spatial-temporal modeling of the population distribution of the case-study area. The numbers of generated and absorbed trips were first calculated from taxi pick-up and drop-off location data, and the census population was then allocated to each neighborhood. Finally, the spatial-temporal distribution of the population was calculated using the developed model. To evaluate the model, a regression analysis was performed between the census population and the predicted population for the period from 21:00 to 23:00. The numbers of generated and absorbed trips showed different spatial distributions at different hours of the day, and the spatial pattern of the population distribution during the day differed from that during the night. The coefficient of determination of the regression ($R^2$) was 0.9998, and the mean squared error was 10.78. The regression analysis showed that the model works well for the nighttime population at the neighborhood level, which suggests that the proposed model is also suitable for the daytime population.


Introduction
Cities and settlements are now often built in places that are exposed to all types of natural and man-made hazards. Cities are among the most complex man-made structures, they face unprecedented population growth, and they have expanded onto hazardous locations [1]. With the expansion of population and urbanization, the vulnerability to hazards, life-threatening events, and financial risks has increased. Today, risk management has replaced post-crisis management as the way to manage the crises caused by all types of potential hazards. According to the 2005 Hyogo framework, the purpose of risk management is to identify the risk before it occurs, to implement coping strategies, and to consider possible plans to reduce the effects in the event of an accident. Any energy factor that has the potential to harm a person can be considered a hazard factor. Therefore, humans are the most important element involved during a hazard. A hazard is the possibility of injury due to the occurrence of danger. Understanding the hazards, accepting them, and consequently taking preventative measures to reduce casualties requires an awareness of population distribution [2,3]. Knowledge of population distribution at the best spatial and temporal resolution is essential for urban infrastructure design, policy-making, and urban planning under risky circumstances. Spatial-temporal population information can be used in traffic management, environmental protection, migration, risk assessment, crisis management, and urban sustainability [4][5][6][7][8]. Key human system variables, such as demographics and population estimates, play an important role in urban sustainability. Population change affects the environment and, consequently, urban sustainability.
The population change pattern can be used to modify the supply and demand pattern between the transportation and land use system and the population in order to achieve sustainable urban development. By identifying the areas that are densely populated during the day, we can develop the public transport network in those areas and develop land uses in sparsely populated areas. Many studies have been conducted on population distribution modeling. Ara [9] examined the effect of population distribution on earthquake damage evaluations in Bangladesh using statistical data, a field survey, and a questionnaire. Zijun and Lili [10] addressed the spatial-temporal modeling of disaster-prone populations based on land use. For this purpose, statistical data, the Monte Carlo method, and the Modified Areal Weighting (MAW) method were used. They modeled static and active populations in the morning and at night, and their results showed that the population distribution differs between the day and the night. Renner et al. [11] addressed spatial-temporal population modeling in order to improve risk assessment in Bolzano. They used statistical data and a field survey, and they modeled the resident population for night and day time intervals. They illustrated the dynamic population distribution in Bolzano and concluded that this information is very useful for risk assessment in tourist areas.
These studies modeled the spatial-temporal distribution of the population using statistical methods. Even though the spatial and temporal distributions of populations can be obtained using statistical methods, these methods are not always sufficient to capture microscopic travel behavior, and the data needed for this type of modeling is sometimes not available.
Understanding the spatial-temporal structure of modern cities has been made possible by the discovery of population movement patterns [12]. The travel behavior of individuals can be analyzed using new sources of evidence. The vast volumes of data obtained from smartphones, metros, buses, and social media have helped to explain the patterns of human travel, urban dynamics, and spatial experiences [13][14][15][16]. Global Positioning System (GPS) data can identify people's movement behavior due to its unique features. Therefore, GPS data provides a better understanding of population movement behavior over time and space [17]. Yuan and Le Noc [18] examined urban movement using GPS data. The distance and the direction of travel are determined by the pick-ups and the drop-offs. The number of pick-up locations and the average trip extent reflect the size and the scale of the mobility of different urban regions. The entropy index was used to determine the direction of travel, and the hourly mobility pattern similarity was calculated using the Dynamic Time Warping (DTW) algorithm. Gong et al. [19] used taxi data to discover travel patterns. Shen et al. [20] examined the pattern of population movement and studied the number of generated and absorbed trips using taxi data. In the studies mentioned above, taxi pick-up and drop-off data is widely used to identify human movement patterns rather than for population estimation and modeling. Nevertheless, the big data of taxi origins and destinations can be used in many spatial analyses.
Urban planning and management require accurate spatial information at successive points in time. Planners and decision-makers obtain the information they need about the current state of development by monitoring these changes. A Geospatial Information System (GIS) is an essential tool for analyzing these changes scientifically, and it provides the required management protocol. Therefore, GIS has attracted the attention of researchers in recent years, and due to the spatial nature of population distribution, it is widely used in spatial-temporal population modeling. The main purpose of this study is to develop a model of hourly population variations based on taxi pick-up and drop-off data. Taxi origin and destination data was selected for population modeling because it is low cost, highly accurate, quick to collect, and available in real time. The model calculates the hourly population of Bojnourd city, Iran. In this paper, the numbers of generated and absorbed trips in the regions are first calculated using the taxi pick-up and drop-off data. The census population is then assigned to the neighborhoods, and the spatial-temporal modeling of the population is conducted using the obtained relationships.
This paper first gives a brief introduction to the necessity of the research, together with its applications and a literature review. Next, the study area is introduced. After that, the research method, the results, the discussion, and the conclusions are presented.

Study Area and Data
Bojnourd is situated in Northeastern Iran and is the capital of Northern Khorasan province. Bojnourd city, which covers an area of 36 square kilometers, is one of the major cities in the country. Of the city's area, 27.5% is used for residential purposes, 0.9% for commercial purposes, and 1.8% for urban facilities and equipment. Nearly 24% of the city consists of other land, and another 13.3% has non-urban land uses. According to the latest population and housing census (2016), the city had a population of 228,931 in 67,335 households, comprising 114,441 men and 114,490 women. Bojnourd city is a center for all types of activities, with various administrative and industrial services. The street network of Bojnourd city takes the form of a semi-regular grid, which has made movement within the city easier. The city is bounded on the north and the southwest by faults and on the north and the east by rivers. Bojnourd has been prone to natural disasters for many years due to its geographical location, and it is one of the most sensitive areas in the country. The historical record of earthquakes in the study area shows that the city of Bojnourd was the epicenter of earthquakes until 1922, and according to historical evidence, these earthquakes caused widespread destruction and many casualties. This clearly explains the need for spatial-temporal modeling of the population.
The data used in this study includes population data, spatial data, and the locations of taxi pick-ups and drop-offs. The population data and the spatial data were obtained from the Iranian Statistical Center. The pick-up and drop-off data was collected from 137 taxis between 1 November 2018 and 26 December 2018. The data has a temporal resolution of 20 s and a spatial resolution of 5 m. The data was downloaded from the Hamsi application, an online taxi request service that belongs to the Sako Company in Bojnourd city. The taxi data contains 20,008,969 records, stored as raw points from which the passengers' pick-up and drop-off points are derived. Each record includes the taxi id, speed, position, time, direction, and status, which indicates whether or not the taxi is carrying a passenger. Table 1 shows an example record of the taxi data. Figure 1 shows the location of Bojnourd city in Iran, together with the spatial distribution of the population and of the pick-up and drop-off points.
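As an illustration of how pick-up and drop-off points might be derived from such raw records, the following minimal sketch scans each taxi's time-ordered records for changes in the status field. The record layout, the field names, and the boolean encoding of the status are assumptions made for this example; they are not taken from the Hamsi data format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TaxiRecord:
    # Hypothetical record layout; the real field names and units may differ.
    taxi_id: str
    timestamp: int   # seconds since an arbitrary start (assumed)
    lon: float
    lat: float
    occupied: bool   # assumed encoding of the "status" field


def extract_events(
    records: List[TaxiRecord],
) -> Tuple[List[TaxiRecord], List[TaxiRecord]]:
    """Return (pick_ups, drop_offs) detected from status transitions.

    A change from vacant to occupied is treated as a pick-up and a change
    from occupied to vacant as a drop-off (an assumption of this sketch).
    """
    pick_ups: List[TaxiRecord] = []
    drop_offs: List[TaxiRecord] = []
    # Order by taxi, then by time, so consecutive pairs belong together.
    ordered = sorted(records, key=lambda r: (r.taxi_id, r.timestamp))
    for prev, curr in zip(ordered, ordered[1:]):
        if prev.taxi_id != curr.taxi_id:
            continue  # never compare records of two different taxis
        if not prev.occupied and curr.occupied:
            pick_ups.append(curr)
        elif prev.occupied and not curr.occupied:
            drop_offs.append(curr)
    return pick_ups, drop_offs
```

Counting the resulting points per neighborhood and per hour would then give the generated (NPU) and absorbed (NDO) trips used later in the paper.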

Methodology
The purpose of the present study is to model the spatial-temporal distribution of the population using taxi pick-up and drop-off location data from Bojnourd. The numbers of generated and absorbed trips in each neighborhood are estimated from the taxi GPS data. Next, the daytime population is calculated using Equation (1). The numbers of people entering and leaving the study area are calculated using Equations (2) and (3) [21].
In Equations (1)-(3), $D_P$ denotes the population of the area during the day, $C_P$ is the statistical (census) population of the region, and $P_d$ denotes the population entering the area, which is calculated from the number of drop-off locations, also called the absorbed trips (NDO). $P_p$ is the population leaving the area, which is calculated from the number of pick-up locations in the region, also called the generated trips (NPU). $\alpha$ is the percentage of the trips made by the citizens of Bojnourd city using taxis; its value in this study is 18.08%. The hourly population of the region can also be obtained using Equation (4) [21].
In Equation (4), $P_{Li}$ is the population of neighborhood $i$ during period $L$, $P_{Oi}$ is the census population obtained from the Statistics Center of Iran according to the latest population and housing census (2016), and $P$ is the sum of the populations of the regions. Figure 2 illustrates the steps used in the research.
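The definitions above suggest one plausible reading of Equations (1)-(3): the daytime population equals the census population plus the people entering minus the people leaving, where the taxi counts are scaled up by the taxi share of trips, i.e. $D_P = C_P + P_d - P_p$ with $P_d = NDO/\alpha$ and $P_p = NPU/\alpha$. The sketch below implements that reading only. It is an assumption inferred from the symbol definitions, not a reproduction of the equations in [21], and it does not attempt Equation (4).

```python
ALPHA = 0.1808  # share of trips made by taxi, as reported in the text


def day_population(census_pop: float, n_drop_offs: float,
                   n_pick_ups: float, alpha: float = ALPHA) -> float:
    """Assumed form of Equations (1)-(3): D_P = C_P + NDO/alpha - NPU/alpha."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be a fraction in (0, 1]")
    p_in = n_drop_offs / alpha   # P_d: people entering (assumed scaling)
    p_out = n_pick_ups / alpha   # P_p: people leaving (assumed scaling)
    return census_pop + p_in - p_out
```

Under this reading, a neighborhood with equal numbers of drop-offs and pick-ups keeps its census population, and each net taxi arrival stands for roughly 1/0.1808, or about 5.5, people.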

Results and Discussion
In this research, GPS data from Bojnourd's taxis was used for the spatial-temporal modeling of the urban population. After processing the data, the hourly numbers of generated trips (the number of pick-up locations) and absorbed trips (the number of drop-off locations) were obtained. Figure 3 depicts the number of pick-up and drop-off locations per hour. Both the generated trips and the absorbed trips have a high frequency from 4:00 to 18:00, followed by a period of reduction from 18:00 to 4:00. The numbers of generated and absorbed trips increase between 8:00 and 9:00, when working hours begin and passengers leave for the workplace, and between 14:00 and 15:00, when the working day ends and travelers return home. The output of this step is called the travel demand estimation model, the Origin-Destination (OD) matrix, or the trip distribution matrix. The rows of this matrix represent the origins, and the columns represent the destinations. The values of the matrix show the number of trips made from each origin to each destination. Using the OD matrices, it is possible to predict the urban movement pattern and the population of the region. The spatial-temporal distribution of the population of Bojnourd can be calculated based on Equations (1)-(4). Figure 4 shows the hourly population changes of ten randomly selected central neighborhoods of the study area. The entire population of the city declined rapidly from 7:00 and continued on a downward trajectory until 13:00. The population increased rapidly between 13:00 and 15:00, approaching the night population, and decreased again between 15:00 and 19:00. Table 2 shows the total night population, the maximum day population, and the day to night population ratio for the study area. The maximum day to night population ratio is 1.048.
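The OD matrix described above can be sketched as a simple counting structure. The example below assumes that each trip has already been reduced to a pair of neighborhood labels (origin, destination); the labels and function names are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Iterable, Tuple


def build_od_matrix(trips: Iterable[Tuple[str, str]]) -> Counter:
    """Count trips per (origin, destination) pair; a sparse OD matrix."""
    od = Counter()
    for origin, destination in trips:
        od[(origin, destination)] += 1
    return od


def generated_and_absorbed(od: Counter) -> Tuple[Counter, Counter]:
    """Row sums give generated trips; column sums give absorbed trips."""
    generated, absorbed = Counter(), Counter()
    for (origin, destination), n in od.items():
        generated[origin] += n        # trips leaving the origin
        absorbed[destination] += n    # trips arriving at the destination
    return generated, absorbed
```

Building one such matrix per hour, or per two-hour interval, would yield the time-dependent generated and absorbed trip counts that the population model takes as input.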
Out of 75 neighborhoods, 68 had a day to night population ratio larger than 1. The Dabaghkhane and Bagh Haj Rahman neighborhoods, which have educational land uses, residential land uses, green spaces, and sport land uses, have large day to night population ratios of 1.022 and 1.048. The ratios are close to one for the Hosseini Masoum, Ferdowsi, Koi Behdari, Shahrak Shahed, Madrese Ferdowsi, Mosala, Jafarabad, Nirogah, Koi Police, and 17 Shahrivar shoaled neighborhoods because their day and night populations are balanced. The Nader, Bargh, and Ahmadabad neighborhoods have low day to night population ratios of 0.993 and 0.994. The city of Bojnourd can be divided into four categories: high during the day and night (high-high), high during the day and low during the night (high-low), low during the day and high during the night (low-high), and low during the day and night (low-low). Figure 5 shows the spatial distribution of the day to night population ratio of Bojnourd. It shows that high population ratios are concentrated in the Shahrak Alghadir, Dabaghkhane, Bagh Aziz, Bagh Motahari, and Taher Gholam neighborhoods, with ratios of 1.039, 1.021, 1.015, 1.012, and 1.011, respectively. A higher day to night population ratio in a given area can indicate a greater proportion of commercial land use than of residential land use.
Figure 5. Spatial distribution of the day to night population ratio.
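The ratio and the four-way grouping can be sketched as follows. The text does not state how "high" and "low" are separated, so the thresholds in this example are placeholders that the caller must supply (for instance, a city-wide median); they are an assumption of the sketch.

```python
def day_night_ratio(day_pop: float, night_pop: float) -> float:
    """Ratio of the maximum day population to the night population."""
    if night_pop <= 0:
        raise ValueError("night population must be positive")
    return day_pop / night_pop


def classify(day_pop: float, night_pop: float,
             day_threshold: float, night_threshold: float) -> str:
    """Label a neighborhood as high-high, high-low, low-high, or low-low.

    The thresholds are hypothetical inputs; the source does not define them.
    """
    day = "high" if day_pop >= day_threshold else "low"
    night = "high" if night_pop >= night_threshold else "low"
    return f"{day}-{night}"
```

For example, a neighborhood with a day population of 5000 and a night population of 1000, compared against thresholds of 3000 for both periods, would fall in the high-low group.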
Figure 6 shows the spatial-temporal maps of the study population for five time intervals (03:00 to 05:00, 07:00 to 09:00, 11:00 to 13:00, 17:00 to 19:00, and 21:00 to 23:00), which are based on the hourly changes in the number of generated and absorbed trips. During the day, especially between 7:00 and 9:00, the population spread markedly from the center to the outskirts. In the evening, the city's population was dispersed as citizens returned to residential regions. In the downtown areas, there are no obvious differences between the distributions of the day population and the night population, and the ratio of the day population to the night population for these areas ranges from 0.93 to 1.007. Figure 5 shows that there are several important dense population centers during the day, which are mostly concentrated in the commercial, tourism, and industrial areas within the downtown, east, west, and south areas of the city. Similarly, there are dense day centers in the suburbs. Figure 5 also shows that the daytime population distribution extends from the center to the suburbs, which is largely due to the concentration of commercial centers, farmland, administrative centers, and offices. There are also some areas of high day and night population density in the central region, which are mostly due to service centers and residential areas.
An analysis of the hourly population changes for the residential, commercial, educational, medical, and tourism neighborhoods is shown in Figure 7. The hourly changes give the population in two-hour intervals for these land uses in Bojnourd and reflect the movement of the population between them from day to night. The population of all the neighborhoods shows a decline over time. In residential neighborhoods, the population decreases during the early hours of the day as people leave home for work and education. In the evening, the population of residential neighborhoods shows an upward trend, the inverse of the daytime pattern. Business neighborhoods have high population fluctuations due to the movement of people during the day. The population in business neighborhoods drops between 1:00 and 9:00 and grows from 9:00 to 13:00 due to business hours. Population growth is observed in educational facilities during the day and from 5:00 to 7:00 due to the arrival of students and staff. After that, there is a drop in population at night. Moreover, due to the closure of schools, there is a decrease in population from 1:00 onwards. The population of tourist sites also fluctuates considerably throughout the day, with a peak at night because a large number of people visit these places to rest. The population of medical facilities increases in the morning due to the arrival of staff and patients, and changes smoothly during the rest of the day.
To verify the validity and the appropriateness of the model, a regression analysis was performed between the measured values (the census population residing at night) and the predicted values (the population calculated for 21:00 to 23:00 using Equation (4)). Figure 8 plots the census population, obtained from the Statistics Center of Iran according to the latest population and housing census (2016), against the estimated night population from 21:00 to 23:00. The regression analysis showed that the model performs well for the neighborhood night population. The coefficient of determination for the model ($R^2$) is 0.9998, and the mean squared error is 10.78. The estimated total night population is only 0.4% lower than the census population.
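The two evaluation measures can be computed as in the following minimal sketch. It assumes that $R^2$ is the squared Pearson correlation of a simple linear regression between the census and predicted values, and that the mean squared error is taken directly over the differences between them; the text does not specify either definition, so both are assumptions. The input values in the usage note are invented for illustration and are not the study's data.

```python
from typing import Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """R^2 of a simple linear regression of y on x (squared correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


def mean_squared_error(observed: Sequence[float],
                       predicted: Sequence[float]) -> float:
    """Mean of the squared differences between observed and predicted."""
    return sum((o - p) ** 2
               for o, p in zip(observed, predicted)) / len(observed)
```

With made-up neighborhood values such as census = [1000, 2000, 3000] and predicted = [998, 2001, 2999], the mean squared error is 2.0 and $R^2$ is very close to 1, which is the kind of agreement the regression in Figure 8 reports.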

Conclusions and Recommendations
With the rapid growth of the urban population, urban sustainability is threatened, and any disruption can prevent it from being realized. Today, sustainability has become the first priority in the laws that govern urban planning. Therefore, the study of urban population distribution patterns by planners, experts, and officials can help realize the idea of sustainable urban development to some extent. This study examined the spatial-temporal distribution of the population of Bojnourd city based on taxi pick-up and drop-off location data. The results of this research can be widely used in studies related to spatial-temporal analysis, such as urban transport infrastructure design, urban policies and planning, and risk assessment. In this study, we attempted to model the population distribution using location-based data. The calculated numbers of generated and absorbed trips indicated an abundance of daytime taxi trips, which shows that people take taxis more during the day than at night. The results of modeling the spatial-temporal distribution of the population in the research of Ma et al. [21] and Renner et al. [11] indicated that the day and night populations are not the same. Moreover, the population increased during the day from the central areas toward the suburbs. In the present study, the population of the central areas declined sharply after 7:00 and increased again between 21:00 and 23:00. The spatial pattern of the population distribution during the day was different from that during the night. The daytime population distribution expanded from the central part of the city to the suburbs due to the urban structure and the expansion of commercial centers, offices, agricultural land, and industrial land.
Between the day and the night, Bojnourd's population moved from low-high to high-low areas, which indicated travel between residential and work areas. The population moves from the suburbs to the central areas during the night due to the existence of residential areas in the center. The suburbs had a high day and night population density due to the existence of parks, stadiums, business centers, gardens, and residential areas. There was also industrial congestion in the suburbs. This type of industrial congestion is beneficial to the growth of the regional economy as well as to the dispersal of the urban population into the suburbs. The hourly population variation for residential, commercial, educational, medical, and tourism neighborhoods was highly heterogeneous and fluctuating, which could be due to the scattered locations of the tourist, commercial, educational, and medical neighborhoods. Moreover, a regression analysis between the census population and the predicted population for the night time from 21:00 to 23:00 was used to evaluate the model. The coefficient of determination for the model ($R^2$) was 0.9998, and the mean squared error was 10.78. The total night population estimated from 21:00 to 23:00 was only 0.01% lower than the census population, which could be due to positioning errors. The regression analysis showed that the model works well for the night population at the neighborhood level, which suggests that it is also suitable for the daytime population. A population distribution with a precise time scale can be a valuable source of information, and it is equally important for disaster risk assessment. Moreover, having population distribution information of the highest quality in space and time can reduce vulnerabilities. Based on the results of this study, the relevant officials should take the necessary measures and controls as soon as possible.
For this purpose, we propose solutions such as increasing public transportation in densely populated places, widening streets in densely populated places to facilitate rescue operations, controlling land uses, raising public awareness, and applying control methods.
GIS plays an important role in the spatial-temporal modeling of populations by enabling spatial analysis. Taxi data has great potential because of its enormous volume, which can be used to calculate the numbers of generated and absorbed trips in an area and to perform spatial analyses such as spatial-temporal population modeling. The method relies on large amounts of data, such as taxi data, and it will not work with small amounts of data. It should also be suitable for metropolitan areas because of their large volume of taxi trips. In future research, population modeling at the block level could be effective for disaster preparedness and management.