Intraday Variation Mapping of Population Age Structure via Urban-Functional-Region-Based Scaling

Abstract: The spatial distribution of the population is uneven for various reasons, such as urban-rural differences and differences in geographical conditions. As a basic element of the natural structure of the population, age structure also varies considerably across the world. Accurate, spatiotemporally explicit maps of population age structure are crucial for estimating the population at risk, analyzing population mobility patterns, and calculating health and development indicators. Over the past decades, many population maps based on administrative units and grids have been produced. However, these maps lack information on how the population distribution changes within a day and on the age structure of the population. Urban functional regions (UFRs) are closely related to population mobility patterns and can therefore provide information about intraday population variation. Focusing on the area within the Beijing Fifth Ring Road, the political and economic center of Beijing, we show how temporal scaling factors, obtained by analyzing population survey sampling data and population dasymetric maps in different categories of UFRs, can be used to map the intraday variation of elderly individuals and children. The population dasymetric maps were generated on the basis of covariates related to population. In this article, 50 covariates were calculated from remote sensing data and geospatial data. However, not all covariates are associated with population distribution. To improve the accuracy of the dasymetric maps and reduce the cost of mapping, it is necessary to select an optimal covariate subset for the dasymetric models of the elderly and children.
The random forest recursive feature elimination (RF-RFE) algorithm was used to select the optimal covariate subset for each age group and to build the population dasymetric models; it screened out optimal subsets of 38 and 26 covariates for the dasymetric models of the elderly and children, respectively. A UFR identification method combining point of interest (POI) data and OpenStreetMap (OSM) road network data is also introduced in this article. The overall accuracy of the UFR identification results was 70.97%. Intraday variation maps of population age structure on weekdays and weekends were produced for the area within the Beijing Fifth Ring Road. Accuracy evaluation based on sampling data showed that the overall accuracy was relatively high: R² for each time period was higher than 0.5, and the root mean square error (RMSE) was less than 0.05. On weekdays in particular, R² for each time period was higher than 0.61, and the RMSE was less than 0.02.


Introduction
Accurate population maps represent the spatial patterns of population distribution and can be used in regional planning and development [1][2][3][4], ecological environment and network companies have begun to generate user profiles that contain behavioral information, attribute information, and activity tracks of each user. Real-time or near-real-time population distribution maps and attribute information have been produced on the basis of these data [43]. However, these data are difficult to access and costly to produce, which restricts their use. Therefore, the ultimate objective of this article was to investigate the feasibility of using publicly available datasets to map the intraday variation of population age structure with high accuracy and low computational cost, as well as to analyze the activity patterns of people of different ages on the basis of these maps.
The remainder of this paper is organized as follows. The study area and datasets used in this article are introduced in Section 2. The methods used for intraday variation mapping of population age structure are described in Section 3. The results of each step are shown in Section 4. Section 5 discusses the advantages and limitations of this work. Finally, Section 6 summarizes the findings and explains how the work could be extended in the future.

Study Area
Beijing, the capital of China, has a large population and frequent population movement. Commuting trips, which are the most important form of population activity in urban areas, are characterized by strong temporal and spatial regularity and rigidity [44]. According to data provided by the Beijing Transport Institute, there are 23 million daily commuting trips on average, the average commuting distance is 12.4 km, and the average commuting time is 56 minutes. With the rapid progress of urbanization, Beijing has gradually formed an approximately radial and concentric spatial structure defined by ring roads [45]. There are 6 existing ring roads in Beijing, of which the Fifth Ring Road is a circular highway with a total length of 98.58 km. The area inside the Fifth Ring Road (Figure 1) covers 667 square kilometers and includes Xicheng District, Dongcheng District, Chaoyang District, Haidian District, and Fengtai District, as well as most of Shijingshan District and the northern part of Daxing District. As Beijing's core area, the area within the Fifth Ring Road has the highest population density and traffic flow [46]. According to the distribution of the permanent population in Beijing, jointly released in 2015 by the Beijing Municipal Bureau of Statistics and the Beijing Survey Team of the National Bureau of Statistics, the area within the Fifth Ring Road accounts for only 4.07% of the area of Beijing but 49% of its permanent population. Because the area encompassed by the Fifth Ring Road has a large population and regular population mobility, it is well suited to intraday variation mapping of population age structure. Therefore, we chose this area as the study area.

Remotely Sensed Data
Four kinds of remotely sensed data were used in this article, which were used to generate covariates in the population dasymetric model of elderly individuals and children.
The Visible Infrared Imaging Radiometer Suite (VIIRS) (https://ncc.nesdis.noaa.gov/VIIRS/ accessed on 18 February 2021) onboard the Suomi National Polar-orbiting Partnership [47] spacecraft can produce a suite of average radiance composite images using nighttime light data from the VIIRS Day/Night Band (DNB) [48]. These data can identify weak light sources, which can be used to study the atmosphere, surface processes, and human activities. The VIIRS products have a spatial resolution of 740 m and are produced monthly and annually.
The land use and land cover data were obtained from the MODIS Land Cover Type Product (MCD12Q1) (https://lpdaac.usgs.gov/ accessed on 18 February 2021), which supplies global land cover maps at annual time steps and 500 m spatial resolution from 2001 to present. On the basis of the International Geosphere-Biosphere Program (IGBP) classification scheme, we divided the MCD12Q1 products into 17 land cover types. We selected categories related to population distribution and generated covariates used in the dasymetric model.
Sentinel-2 is a wide-swath, multispectral, and high-resolution Earth observation mission from the European Space Agency's Copernicus Program. The mission supports a broad range of services and applications, such as agricultural monitoring, emergency management, land cover classification, and water quality [49]. The spatial resolution of each band of Sentinel-2 is shown in Table 1. Sentinel-2 was used to calculate some spectral indexes in this article.
Two types of geospatial data were used in this article, namely, OpenStreetMap (OSM) data and point of interest (POI) data.
OSM (https://www.openstreetmap.org/ accessed on 18 February 2021) is a collaborative project to create a free editable map of the world. As the most popular volunteered geographic information (VGI) project, OSM allows users to extract and upload data from handheld GPS devices, aerial photographs, other free content, or even local knowledge alone. As a result, OSM data are crowdsourced and updated quickly. OSM data are also highly accurate, and many studies have shown that their level of detail and coverage is even better than that of some proprietary maps of certain countries or regions [50,51]. Because OSM has been shown to have high positional accuracy in urban areas [52], it can be used in urban functional region (UFR) identification [53] and population mapping [27]. The OSM dataset was used in this article to identify the UFRs and to generate covariates for population dasymetric mapping.
The POI data were retrieved from NavInfo, the leading locational big data provider in China. POI data are a type of geospatial big data containing information such as names, coordinates, and categories. In recent years, many studies have demonstrated the usefulness of POI data in identifying UFRs [54][55][56][57][58][59] and population mapping [17,27,60]. In this article, POI data were also used in both UFR identification and population dasymetric mapping. A total of 230,329 records of POIs were obtained in the study area.

Demographic Data
The 2018 demographic data for Beijing were obtained from the China Statistical Yearbook published by the National Bureau of Statistics of China; these were the latest population data released at the time of writing. The demographic data from the 16 districts and counties of Beijing include the number of people in 3 age groups at the end of the year: 65 years old or older, 14 years old or younger, and 15 to 64 years old. On the basis of these data, we calculated the proportions of elderly individuals and children and used them as the dependent variables in the permanent resident population dasymetric model.

Population Survey Sampling Data
Population surveys were conducted from 13 to 29 August 2020. A total of 22 sampling points within the Beijing Fifth Ring Road (Figure 2) were deliberately selected so that they were spread evenly across the UFRs, ensuring that each category of UFR could be sampled. The identification of UFRs is explained in detail below in Section 3.2.1. At each sampling point, we conducted 6 population surveys: in the morning, afternoon, and evening on both weekdays and weekends. Two 5-minute demographic sampling videos were taken for each survey, with a 5-minute interval between the 2 videos. The videos were shot with a GoPro Hero 7 Black action camera, whose HyperSmooth and SuperView functions provide smoother video with a wider field of view. We used these 2 functions to shoot videos at 2.7K resolution in order to capture population information as accurately as possible. From these videos, we obtained the proportion of the population in different age groups at each sampling point. We treated these data as the most accurate available estimate of the population proportions at each location in different time periods.

Methods
This section introduces the methods (Figure 3) used in this article. The purpose of this process was to generate intraday variation population maps of different age groups. According to common international practice, the population is usually divided into 3 groups according to age, namely, 0-14 years old for children, 15-64 years old for youth and adults, and 65 years and older for elderly individuals [61][62][63]. This article aimed to construct proportion maps and determine the mobility patterns of 2 age groups (children and elderly individuals). The overall process can be divided into 2 parts, namely, dasymetric mapping of the population age structure and intraday variation mapping of the population age structure, which transform the population data from demographic data to spatial data and then from spatial data to spatiotemporal data. In the first part, the random forest-recursive feature elimination (RF-RFE) algorithm was used to select the optimal subset of covariates and perform dasymetric population mapping. In the second part, UFRs were identified using POI records and OSM data, and the temporal scaling factors were calculated by combining them with the population sampling data. The temporal scaling factors were then used to generate intraday variation maps of elderly individuals and children.

Dasymetric Mapping of Population Age Structure
The accuracy of the dasymetric model depends on the covariates it uses. Many factors, such as socioeconomic, physical (topographic, climatic, and environmental), and political factors [64,65], can affect (directly or indirectly) the distribution of the population [39]. Therefore, we calculated covariates related to these factors. However, these covariates need to be filtered in order to simplify the model and eliminate interfering factors. The dasymetric mapping of the population age structure therefore consists of 2 parts: optimal subset selection of covariates and dasymetric modeling.

Covariates Calculation
During the past few decades, the rapid development of remote sensing satellites and the popularization of smartphones have transformed the way we obtain ground information. The factors related to the population distribution introduced above can be obtained cheaply and effectively through remotely sensed data and geospatial big data. Remotely sensed data can describe factors that include nighttime light [66]; land use and land cover [64]; topography and landforms [67]; and remote sensing spectral indexes that evaluate vegetation, urbanization, and water bodies [33]. Geospatial data can reflect the topological relationship of spatial data. On the basis of these data, we calculated the following covariates related to population distribution.
Nighttime light data have been found to be strongly associated with population distribution in many studies [68,69]. They have been widely used in mapping the urbanization process [70,71] and in mapping population age structure [39]. Thus, nighttime light data can be used as a basic covariate for estimating the distribution of the population age structure in urban areas. In this study, the VIIRS nighttime light composites for 2018 were obtained and preprocessed to filter out lights from fires, boats, the aurora, and other temporal lights.
Land cover information is usually selected to redistribute the aggregated census to improve the accuracy of gridded population data [64]. Land use and land cover data have a higher spatial resolution than census data, which can effectively improve the accuracy of population spatial modeling [72]. In particular, the built-up class has the closest relationship with the population distribution [27]. The urban and built-up land cover classes were extracted to generate the covariate of distance to built-up lands. We used the "Euclidean Distance" tool in ArcMap to generate raster data of the distance from the built-up area.
Topography is closely related to population distribution and even population change [73]. To analyze the relationship between topographical factors and the population's age structure, we used SRTM data to generate two covariates, elevation and slope, which reflect the study area's topography. The "Slope" tool in ArcMap was used to generate the slope raster from the SRTM data.
Remote sensing spectral indexes can effectively measure and monitor ground features and have been shown to be related to population distribution [38,39]. Three indexes, the enhanced vegetation index (EVI), normalized difference built-up index (NDBI), and normalized difference water index (NDWI), were selected in this article. They are commonly used to highlight vegetation, built-up areas, and water bodies, respectively. All 3 indexes were calculated from the Sentinel-2 MSI: Multispectral Instrument, Level-2A product. We used the Google Earth Engine platform to calculate the average of these 3 indexes for the whole of 2018.
The road network, river network, and water body data obtained from OSM were used in this article to identify UFRs and generate covariates in the population dasymetric model. Using the level information in the road network data, we calculated 2 covariates for each road category, namely, road density and distance to road. River network density and distance to the nearest water body were also used to analyze the impact of water bodies on the distribution of population age structure.
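The three spectral indexes named above can be sketched in a few lines. This is a minimal illustration rather than the authors' Google Earth Engine workflow: it assumes the widely used EVI coefficients (2.5, 6, 7.5, 1), the NIR/SWIR form of NDBI, and the green/NIR form of NDWI, and it assumes the inputs are surface reflectances scaled to 0-1. The function names and the sample reflectance values are hypothetical.

```python
import math


def evi(nir, red, blue):
    """Enhanced vegetation index with the commonly used coefficients (assumed here)."""
    denominator = nir + 6.0 * red - 7.5 * blue + 1.0
    return 2.5 * (nir - red) / denominator if denominator else math.nan


def normalized_difference(a, b):
    """Generic normalized difference (a - b) / (a + b), NaN when undefined."""
    return (a - b) / (a + b) if (a + b) else math.nan


def ndbi(swir, nir):
    """Normalized difference built-up index: high over built-up surfaces."""
    return normalized_difference(swir, nir)


def ndwi(green, nir):
    """Normalized difference water index (green/NIR form, an assumption)."""
    return normalized_difference(green, nir)


def annual_mean(values):
    """Average an index over all valid observations of one pixel in a year."""
    valid = [v for v in values if not math.isnan(v)]
    return sum(valid) / len(valid) if valid else math.nan


# Hypothetical reflectances for one vegetated pixel.
pixel_evi = evi(nir=0.40, red=0.10, blue=0.05)
pixel_ndbi = ndbi(swir=0.30, nir=0.20)
pixel_ndwi = ndwi(green=0.10, nir=0.40)
```

Averaging the per-date values with `annual_mean` mirrors the yearly mean described in the text; cloud masking is omitted from this sketch.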
Introducing POI data into dasymetric modeling has been shown to reflect the population distribution effectively [17,27,60]. To make full use of the relationship between POI records and population distribution, we also considered POI density and distance to POI for each category of POI in this study.
All the covariates introduced above were resampled to generate a 100 m resolution raster for dasymetric mapping, and these covariates are shown in Table 2.
The dasymetric model has proven to be an effective method to generate gridded population maps. In this article, a new dasymetric model was established to generate fine-grained population maps of different age groups using the random forest (RF) model [74]. On the basis of the data presented above, we produced 50 covariates in this study. However, not all covariates can be used in dasymetric mapping. Therefore, to simplify the dasymetric model, remove the interference variables, and improve the model's accuracy, we used the recursive feature elimination (RFE) [75] algorithm to select the optimal covariates subset used in the dasymetric model.
The RF model is a nonparametric model that has been widely used in classification and regression problems, and many scholars have used it for dasymetric mapping to reflect the natural distribution of human populations [25,76,77]. RF models grow a "forest" of numerous decision trees, and each tree is constructed using a random subset of the independent covariates and a random sample of the training dataset. The final result is determined by all the trees in the "forest". The random sampling of the training data is known as bootstrap sampling. Data not selected in the bootstrap process are called out-of-bag (OOB) data, and the OOB error estimate can be used in place of an error estimate from a separate test set. The RF model has few parameters to tune and can be applied efficiently to large datasets, which makes it well suited to population dasymetric mapping.
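The bootstrap and OOB idea described above can be illustrated with a small, self-contained sketch. It is not the randomForest implementation: each "tree" is replaced by a stand-in predictor that simply returns the mean of its in-bag responses, so that the bookkeeping of in-bag and OOB samples stays visible. The function names and the 16 response values are hypothetical.

```python
import math
import random


def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; return in-bag and OOB indices."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    oob = [i for i in range(n_samples) if i not in drawn]
    return in_bag, oob


def oob_rmse(y, n_trees=200, seed=0):
    """OOB error of an ensemble of stand-in 'trees' (in-bag mean predictors)."""
    rng = random.Random(seed)
    sums = [0.0] * len(y)
    counts = [0] * len(y)
    for _ in range(n_trees):
        in_bag, oob = bootstrap_split(len(y), rng)
        prediction = sum(y[i] for i in in_bag) / len(in_bag)  # stand-in for a tree
        for i in oob:  # a sample is only scored by trees that never saw it
            sums[i] += prediction
            counts[i] += 1
    errors = [(sums[i] / counts[i] - y[i]) ** 2
              for i in range(len(y)) if counts[i] > 0]
    if not errors:
        raise ValueError("no sample was ever out of bag")
    return math.sqrt(sum(errors) / len(errors))


# Hypothetical proportions for 16 units (not the study's data).
proportions = [0.10, 0.12, 0.09, 0.15, 0.11, 0.13, 0.14, 0.08,
               0.16, 0.10, 0.12, 0.11, 0.09, 0.13, 0.15, 0.12]
in_bag_example, oob_example = bootstrap_split(len(proportions), random.Random(1))
oob_error = oob_rmse(proportions)
```

Because every sample is scored only by the trees whose bootstrap sample excluded it, the OOB error behaves like a built-in hold-out estimate, which is why no separate test set is needed.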
RFE is a feature selection method that fits a model and removes the weakest features until the specified number of features is reached. It optimizes the model through multiple iterations, continuously removing features and rebuilding it on the remaining features. The importance of features is measured, and the less relevant features are removed at each iteration [78]. The RFE algorithm provides good performance with moderate computational efforts in finding the subset of features with the minimum possible generalization error [75,78].
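The elimination loop described above can be sketched generically. This is an illustration of the RFE idea, not caret's `rfe` implementation: the importance and scoring functions are supplied by the caller, and the toy functions below are hypothetical stand-ins for the random forest importance and the cross-validated RMSE. The covariate names and weights are invented for the example.

```python
def recursive_feature_elimination(features, importance_fn, score_fn):
    """Drop the weakest feature one at a time and keep the best-scoring subset.

    importance_fn(subset) -> {feature: importance}; score_fn(subset) -> error
    (lower is better). Returns the best subset and the score of every size.
    """
    remaining = list(features)
    history = {}
    while remaining:
        history[len(remaining)] = (score_fn(remaining), list(remaining))
        if len(remaining) == 1:
            break
        importance = importance_fn(remaining)
        weakest = min(remaining, key=lambda name: importance[name])
        remaining.remove(weakest)
    best_size = min(history, key=lambda size: history[size][0])
    return history[best_size][1], history


# Hypothetical "true" relevance of four covariates.
WEIGHTS = {"nighttime_light": 0.9, "poi_density": 0.7, "slope": 0.05, "noise": 0.0}


def toy_importance(subset):
    return {name: WEIGHTS[name] for name in subset}


def toy_error(subset):
    # Error grows with the relevance left out and with weak covariates kept in.
    missed = sum(w for name, w in WEIGHTS.items() if name not in subset)
    clutter = 0.1 * sum(1 for name in subset if WEIGHTS[name] < 0.1)
    return missed + clutter


best_subset, subset_scores = recursive_feature_elimination(
    list(WEIGHTS), toy_importance, toy_error)
```

In this toy setting the two weak covariates are removed first and the two-covariate subset has the lowest error, which is the same selection logic as choosing the subset size with the minimum RMSE.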
In this study, the RF model used the population proportions of elderly individuals and children as the response variables and the mean value of each covariate selected by the RFE algorithm as the independent variables. A log transformation was applied to each variable to generate a more regular and evenly distributed map [25]. Subset selection, model estimation, and prediction were performed in R 3.6.1, a free software environment for statistical computing and graphics [79]. Two packages, randomForest [80] and caret [81], were used to generate the population dasymetric map.
The random forest algorithm has two parameters that need to be tuned. After repeated training experiments, we found that a 500-tree forest with 4 covariates for each tree yielded a stable, minimal OOB prediction error.
The RFE algorithm in the caret package was used to select the optimal subset of covariates. Because we produced a total of 50 covariates, we obtained the best subset of each size from 1 to 50 covariates and selected the optimal subset among them. Since this article used the random forest model for dasymetric modeling, we used rfFuncs in the rfeControl settings of the RFE algorithm.
The root mean square error (RMSE) was used to select the optimal subset, and the formula is as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(f_i - r_i\right)^2}$$
where $f_i$ represents the estimated value of county $i$; $r_i$ represents the true value obtained from demographic data; and $N$ represents the number of counties, which was 16.
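As a small worked example of the RMSE criterion, the sketch below computes the error between estimated and reference proportions. The function name and the three sample values are hypothetical and are not taken from the study's data.

```python
import math


def rmse(estimated, observed):
    """Root mean square error between paired estimates and reference values."""
    if len(estimated) != len(observed) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(f - r) ** 2 for f, r in zip(estimated, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical county-level proportions: model estimates versus yearbook values.
example_rmse = rmse([0.10, 0.12, 0.15], [0.11, 0.12, 0.13])
```

Here the squared errors are 0.0001, 0, and 0.0004, so the RMSE is the square root of their mean, roughly 0.0129.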
The log-transformed variables were used in the population dasymetric model. To obtain the real population age structure maps, we back-transformed the result generated by the population dasymetric model to predict the population proportion for each pixel.

Urban Functional Region Identification
The identification of UFRs consists of 2 steps: identifying land-use parcels and determining the UFR category of each land-use parcel. Studies have shown that segmenting cities with multilevel road networks yields satisfactory regional boundaries [55,59,82,83]. POI data have also been shown to play an important role in land-use classification on the basis of human activities [54,56]. This paper proposes a UFR identification method based on fine-scale parcel boundary data generated from OSM road network data and POI data.
A method proposed by Liu and Long [82] was adopted to identify land-use parcels. The OSM roads were first extended by 100 m to address the disconnection of the road network. The dangling roads and independent sections of the road network were then removed because they cannot be connected to nearby roads. Finally, the OSM roads were divided into 3 types according to the principal tags, and buffers of different distances were set for different types of roads. On the basis of the investigation of road width in the study area and the national standard in the Code for Design of Urban Road Traffic Facility issued by the Ministry of Housing and Urban-Rural Development of the People's Republic of China, we generated buffers of 40, 20, and 10 m for Level 1 roads (highways and major roads), Level 2 roads (secondary roads), and Level 3 roads (third roads and fourth roads), respectively, in order to build an independent road space. The land-use parcels were built by removing the road space from the study area.
After the land-use parcels were identified, the POI data were used to determine each land-use parcel's UFR category. There are 17 categories of POI data used in this article: catering, residential community, wholesale and retail, automobile sales and service, financial services, educational services, health and social security, sports and leisure, communal facilities, commercial facilities and services, resident services, corporation enterprises, transportation and storage, scientific research and technical services, agriculture, forestry, animal husbandry, and fisheries. Because the POI data are used to identify UFRs, and the UFRs in turn serve population age structure mapping, the POI records had to be reclassified. All the POI records were merged into 4 categories: open space, industry and commerce, public service, and residential. These 4 reclassified categories clearly reflect the population mobility pattern and can serve spatiotemporal population age structure mapping [42]. The kernel density of these 4 categories of POIs was calculated to determine the land-use parcels' UFR category. Kernel density estimation is based on the first law of geography: the closer a location is to the core element, the greater the density value it obtains, which reflects spatial heterogeneity and the attenuation of center strength with distance [84]. The method makes full use of the original data, is less affected by subjective factors, and produces a gradually varying surface that reveals detailed characteristics, including the influence of each POI on neighboring locations.
Therefore, each land-use parcel's category was judged by the kernel density of POIs. This process consists of 2 steps. First, we calculated the kernel density of the 4 POI categories and of all POIs in each land-use parcel; if the density of a certain POI category exceeded 50% of the density of all POIs, we assigned that category to the parcel. Second, mixed parcels without a dominant POI category required further processing. Near-convex-hull analysis (NCHA) was used to reclassify the POIs in these land-use parcels [46]. For example, industrial and commercial POI records such as stores are often located in parks, which interferes with the identification of open space UFRs. To eliminate this kind of interference, such industrial and commercial POIs should be reclassified as open space POIs through NCHA. The specific operation is as follows. The convex hull (the minimum polygon boundary) of the most numerous POI category in each parcel was calculated. Then, the other POIs inside this convex hull were reclassified into this category, and the first step was repeated to identify the category of these parcels. Finally, null values were assigned to parcels that were still unidentifiable.
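The two-step rule above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' GIS workflow: POI counts stand in for kernel density, the density of all POIs is taken to be the sum over categories, and the hull test uses a plain convex hull rather than the full NCHA procedure. All function names, category labels, and coordinates are hypothetical.

```python
from collections import Counter


def dominant_category(density_by_category, threshold=0.5):
    """Return the category whose share exceeds the threshold, else None."""
    total = sum(density_by_category.values())
    if total <= 0:
        return None
    category, value = max(density_by_category.items(), key=lambda kv: kv[1])
    return category if value / total > threshold else None


def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside_hull(hull, p):
    """True if p lies inside or on a counter-clockwise convex polygon."""
    if len(hull) < 3:
        return False
    return all(cross(hull[i], hull[(i + 1) % len(hull)], p) >= 0
               for i in range(len(hull)))


def reclassify_by_hull(pois):
    """Relabel POIs that fall inside the hull of the most numerous category."""
    top = Counter(c for _, _, c in pois).most_common(1)[0][0]
    hull = convex_hull([(x, y) for x, y, c in pois if c == top])
    return [(x, y, top if inside_hull(hull, (x, y)) else c) for x, y, c in pois]


def classify_parcel(pois):
    """Step 1, then hull reclassification and step 1 again for mixed parcels."""
    counts = Counter(c for _, _, c in pois)  # counts stand in for kernel density
    category = dominant_category(counts)
    if category is None:
        category = dominant_category(Counter(c for _, _, c in reclassify_by_hull(pois)))
    return category


# Hypothetical park parcel: shops sit inside the hull of the open-space POIs.
park = [(0, 0, "open_space"), (10, 0, "open_space"), (10, 10, "open_space"),
        (0, 10, "open_space"), (5, 5, "industry_commerce"),
        (4, 6, "industry_commerce"), (6, 4, "industry_commerce"),
        (20, 20, "residential"), (21, 20, "residential")]
park_category = classify_parcel(park)
```

In the example, open space holds only 4 of 9 POIs, so the first pass finds no dominant category; after the three shops inside the hull are relabeled, open space holds 7 of 9 and the parcel is classified as open space.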

Temporal Scaling Factor Calculation
The population age structure map can be generated on the basis of the urban functional map, the population dasymetric maps of different age groups, and the population survey sampling data. This process consists of two steps. The first step is to calculate the temporal scaling factors (TSFs) of different age groups in different UFRs at different time periods. The second step is to calculate the intraday variation population age structure map on the basis of the temporal scaling factors.
Human mobility patterns have been shown to exhibit strong spatiotemporal regularity, and these regularities can be used to reveal the relationship between different UFRs of the city [85,86]. We therefore assumed that population mobility patterns are consistent within the same category of UFR, so that the TSFs of the same UFR category are also the same. To calculate these TSFs, we divided the 22 sampling points introduced above into training data (17 points) and validation data (5 points). We calculated the TSFs for each age group in each time period and each category of UFR from the population dasymetric map and the training data. The calculation uses the following quantities: TSF represents the scaling factor of a certain group of people in a certain category of UFR in a certain time period, p_j represents the measured proportion at the j-th sampling point in this kind of UFR, and P_j represents the value at the same point in the map generated by dasymetric mapping. The variable num_j represents the population counted at the j-th sampling point in this kind of UFR, and num represents the total population counted in the same kind of UFR in the same time period.
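One reading of the quantities defined above is a population-weighted mean of the ratio between the surveyed proportion and the mapped proportion, that is, TSF = Σ_j (p_j / P_j) · (num_j / num). The sketch below implements that reading; the exact form of the published equation is an assumption here, as is taking num to be the sum of num_j over the sampling points. The function name and sample values are hypothetical.

```python
def temporal_scaling_factor(samples):
    """Weighted mean of p_j / P_j with weights num_j / num (assumed form).

    samples: list of (p_j, P_j, num_j) for one age group, one UFR category,
    and one time period.
    """
    total = sum(num_j for _, _, num_j in samples)
    if total <= 0:
        raise ValueError("total surveyed population must be positive")
    return sum((p_j / P_j) * (num_j / total) for p_j, P_j, num_j in samples)


# Hypothetical training points in one UFR category and time period:
# (surveyed proportion, dasymetric map proportion, people counted).
example_tsf = temporal_scaling_factor([(0.30, 0.20, 50), (0.10, 0.20, 150)])
```

With these values the ratios are 1.5 and 0.5 and the weights are 0.25 and 0.75, giving a factor of 0.75, so the busier sampling point pulls the factor toward its own ratio.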

Dasymetric Maps of Population Age Structure
Optimal subset covariate selection was driven by the RFE algorithm using county-level demographic data and the average value of the 50 covariates in each county. We obtained the best subset of each size from 1 to 50 covariates for the dasymetric models of the proportions of elderly individuals and children.
The indicator of each optimal subset is shown in Figure 4. RMSE was used as the indicator to select the optimal subset for the population dasymetric model, and the subset with the smallest RMSE was taken as the optimal subset. Optimal subsets containing 38 and 26 covariates were selected for the population dasymetric models of elderly individuals and children, respectively. An RF-based dasymetric model was constructed with these covariates as described in Section 3.2.2 and was used to downscale the proportion of each age group in each county. The fine-gridded population dasymetric maps at a 100 m spatial scale showed visually satisfactory results (Figure 5). Figure 5a shows the dasymetric map of the proportion of elderly individuals, while Figure 5b shows the dasymetric map of the proportion of children. The accuracy of the RF-based dasymetric model was evaluated using R². The R² values of the population maps of elderly individuals and children were 98.14% and 91.54%, respectively. This indicates that the dasymetric model using the optimal covariate subsets selected by RFE represents the population distribution well.

Urban Functional Region
The UFRs within the Fifth Ring Road identified as described in Section 3.2.1 are shown in Figure 6. The actual boundaries and categories of UFRs were manually interpreted within a selected validation area of the study area (Haidian District within the Fifth Ring Road), with the assistance of a field survey, high-resolution remote sensing images, digital maps, and POI data. The actual UFRs were compared with the classification result for accuracy evaluation.
The confusion matrix for the UFR identification result is shown in Table 3. The identification result achieved an overall accuracy of 70.97%. Of the four UFR categories, the open space and public facilities categories had the highest identification accuracy, with both the user's and producer's accuracies above 72%. However, the identification accuracies of the industry and commerce facilities category and the residential category were relatively low, especially for industry and commerce facilities, where the user's and producer's accuracies were only just over 50%. This was mainly due to the relatively wide distribution of POIs in the industry and commerce facilities category, which are easily mixed with other POI categories and cause confusion. For example, open space UFRs that contain tourist attractions also have restaurants, hotels, and other commercial facilities nearby, which strongly affects their identification accuracy.
Figure 7a displays the correctly identified UFRs, while Figure 7b,c displays the UFR identification result in this article and the visual interpretation result of UFRs, respectively. As shown in Figure 7, the open space, residential, and public facilities categories are relatively concentrated. They are distributed in the northwest, south, and east of the validation area. However, the industry and commercial facilities category is relatively dispersed across the whole region and interweaves with the other three UFR categories, which further supports the above explanation for its poor identification accuracy. This indicates that it is difficult to recognize industry and commercial facilities UFRs using POI records alone. They are easily mixed with other UFR categories and reduce the accuracy of the overall UFR identification result. Therefore, it is necessary to introduce other information, such as spectral and texture features and landscape metrics, in order to improve the identification accuracy of these mixed regions. The specific process will be discussed in detail in the conclusions and discussion.
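The accuracy measures used above (overall, user's, and producer's accuracy) follow directly from a confusion matrix. The sketch below shows the arithmetic on a hypothetical 4 x 4 matrix; the counts are invented for illustration and are not the values in Table 3. Rows are assumed to hold the identified category and columns the reference category.

```python
def accuracy_metrics(matrix, labels):
    """Overall, user's (row-wise), and producer's (column-wise) accuracy."""
    size = len(labels)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(size))
    users, producers = {}, {}
    for i, label in enumerate(labels):
        row_total = sum(matrix[i])
        column_total = sum(matrix[r][i] for r in range(size))
        users[label] = matrix[i][i] / row_total if row_total else float("nan")
        producers[label] = matrix[i][i] / column_total if column_total else float("nan")
    return correct / total, users, producers


LABELS = ["open_space", "industry_commerce", "public", "residential"]
# Hypothetical counts: rows = identified category, columns = reference category.
CONFUSION = [
    [8, 1, 1, 0],
    [1, 5, 2, 2],
    [0, 1, 8, 1],
    [1, 2, 0, 7],
]
overall, users_accuracy, producers_accuracy = accuracy_metrics(CONFUSION, LABELS)
```

User's accuracy answers how often a parcel labeled with a category truly belongs to it, while producer's accuracy answers how often a true parcel of that category was found; a dispersed category such as industry and commerce tends to score low on both.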
There is an inherent limitation in land-use parcels generated using road network data. The parcel boundaries derived from the OSM road data differed from those obtained by manual interpretation, and many small parcels were not recognized. This is usually associated with under-segmentation. In reality, land-use parcels are segmented not only by road networks but also by walls, fences, or trees, or by no physical boundary at all [59]. Segmentation using only road network data merges some small parcels into larger ones, forming mixed UFRs and even causing the UFR category to be misidentified.

Temporal Scaling Factors of Different UFRs
The temporal scaling factors of each UFR category on weekdays and weekends were calculated, and the results are shown in Figure 8. In addition to being used in the intraday variation mapping of elderly individuals and children, these factors can also be used to analyze mobility patterns and distribution preferences. On the basis of these temporal scaling factors, we can draw the following three conclusions. First, elderly individuals and children had different activity preferences and distribution characteristics. The proportion of elderly individuals was usually higher in residential UFRs, while the proportion of children was higher in open space UFRs. This was mainly due to the characteristics of the two groups. Elderly individuals usually need less physical activity; hence, they tend to stay near their homes. In contrast, children usually need more physical activity, and thus they often go to parks, playgrounds, and other open space UFRs.
Second, the distribution of each age group followed certain patterns across the time periods of the day. Except for residential UFRs on weekdays and public facilities UFRs on weekends, the time period with the highest proportion of elderly individuals was between 8:00 and 12:00. For children, apart from residential UFRs on weekdays and public facilities and residential UFRs on weekends, the time period with the highest proportion was usually between 12:00 and 16:00. This shows that elderly individuals were more willing to be active in the morning, while children were more willing to be active in the afternoon.
Third, the distribution of the different groups followed certain patterns between weekdays and weekends. Compared with weekdays, the proportion of elderly individuals in residential UFRs on weekends was significantly reduced, and the temporal scaling factors dropped by 0.23. On the other hand, the temporal scaling factor of the public facilities UFRs increased by 0.69, 0.35, and 0.35 in the morning, afternoon, and evening, respectively.

Intraday Variation Maps of Population Age Structure
Using the population proportion maps of elderly individuals and children calculated in Section 4.1 and the temporal scaling factors calculated in Section 4.3, we obtained fine-resolution intraday variation population maps of elderly individuals and children for weekdays (Figure 9a) and weekends (Figure 9b). As shown in these maps, the proportion of the elderly population was higher than that of children in all time periods in residential and public facilities UFRs. In comparison, the proportion of children was higher than that of elderly individuals in almost all open space UFRs at all times. In industry and commerce facilities UFRs, except for weekday and weekend mornings, the proportion of children was higher than that of the elderly. This confirms, to a certain extent, the regularity of the activities of the elderly and children. Children prefer outdoor sports and entertainment, and they gather more in open space and industry and commerce facilities UFRs. Elderly individuals, on the other hand, are more willing to participate in public activities, and they gather more in residential and public facilities UFRs.
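The combination step described above can be sketched in a few lines of Python. This is only an illustration under a loudly stated assumption: that a temporal scaling factor acts as a multiplier on a grid cell's baseline proportion, looked up by the cell's UFR category and the time period. The exact formulation is the one given in the article's methods; the function name `scale_proportions`, the grid values, and the factor values below are all hypothetical.

```python
def scale_proportions(base, ufr, factors, period):
    """Scale each cell's baseline proportion by the factor of its UFR category.

    base    : 2D list of baseline proportions (e.g. share of elderly per cell)
    ufr     : 2D list of UFR category names, same shape as base
    factors : {category: {period: factor}}  (assumed multiplicative)
    period  : time period key, e.g. "morning"
    """
    scaled = []
    for base_row, ufr_row in zip(base, ufr):
        scaled.append(
            [p * factors[cat][period] for p, cat in zip(base_row, ufr_row)]
        )
    return scaled


# Hypothetical 2 x 2 grid and factors, for illustration only.
base = [[0.10, 0.20],
        [0.15, 0.05]]
ufr = [["residential", "open space"],
       ["public facilities", "residential"]]
factors = {
    "residential": {"morning": 1.2, "evening": 1.0},
    "open space": {"morning": 0.8, "evening": 0.6},
    "public facilities": {"morning": 1.1, "evening": 0.9},
}

morning_map = scale_proportions(base, ufr, factors, "morning")
for row in morning_map:
    print([round(v, 3) for v in row])
```

Keeping the factors in a per-category, per-period lookup means one baseline map can be reused for every time period and for weekdays and weekends alike, which is the practical appeal of a scaling-factor approach.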
The accuracy of these intraday variation maps of population age structure was evaluated on the basis of the five validation sampling points introduced in Section 2.2.4. We used R² and RMSE to evaluate the accuracy of the result (Figure 10). The accuracy of the intraday variation maps was higher on weekdays, with R² for elderly individuals and children in all time periods above 0.6 and RMSE below 0.02. The accuracy of the maps on weekends was relatively low, with R² only above 0.5 and RMSE below 0.05. A possible reason is that middle-aged people are more active on weekends and account for a relatively high proportion of the population, which interferes with the intraday variation mapping of the elderly and children. In addition, after comparing the prediction accuracy across periods of the day, we found that the prediction accuracy was highest in the evening. This may be because people's mobility is lower in the evening than during the day [27,87]; low mobility makes the population distribution relatively fixed, which helps improve the accuracy of population distribution estimation.
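As a reminder of what the two indicators measure, the sketch below computes RMSE and R² for one time period from paired observed and mapped proportions. It assumes R² is the coefficient of determination, 1 − SS_res/SS_tot; the article may instead report the squared correlation of a fitted regression line, which can give a different value. The five pairs of values are hypothetical and are not the article's validation data.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical proportions at five sampling points for one time period.
observed = [0.10, 0.15, 0.20, 0.25, 0.30]
predicted = [0.12, 0.14, 0.21, 0.24, 0.29]

print(f"RMSE = {rmse(observed, predicted):.4f}")
print(f"R^2  = {r_squared(observed, predicted):.3f}")
```

With only five points per period, both indicators are sensitive to a single outlying sample, which is worth keeping in mind when comparing weekday and weekend values.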

[Figure panels: Morning, Afternoon, Evening; Elderly individuals]

Discussion
The first advantage of this article is that we used UFR data and introduced the concept of a temporal scaling factor to produce fine-resolution intraday variation maps of population age structure, mitigating the modifiable areal unit problem (MAUP) in population maps. We first achieved high-accuracy UFR identification using road network data and POI data. Then, we proposed a dasymetric model based on the RF-RFE algorithm for the elderly and children. Finally, we calculated the temporal scaling factors using the population survey data and generated the intraday variation maps of population age structure. The second advantage of this article is that these maps allow the activity patterns of the population to be analyzed.
However, our study still had limitations and uncertainties, such as the low accuracy of the intraday variation population maps for weekend mornings and afternoons. To address them, future research can consider the following aspects in order to improve the accuracy of the population proportion maps.
The first aspect is to further improve the identification accuracy of UFRs. Since the temporal scaling factors were calculated on the basis of UFR category, more accurate UFR identification may improve the final accuracy of the spatiotemporal population maps. This article used only OSM road network and POI data to identify UFRs. Although the UFR identification results were reasonably good, with the overall accuracy reaching 70.97% (Table 3), there were still some shortcomings. For example, the identification accuracy of industry and commerce facilities UFRs was low and needs to be improved. The most direct way to improve the identification accuracy of UFRs is to introduce other data related to the division of UFRs, such as high-resolution remote sensing images and landscape metrics. High-resolution remote sensing images provide many spectral and texture attributes that have already been used in UFR identification [88,89]. Landscape metrics have also been found to be very important indicators for differentiating urban land uses [59,90,91]. Therefore, in the next step, we will introduce these data to improve UFR identification accuracy.
The second aspect is the city's complexity: the temporal scaling factors in this article did not fully reflect the real situation. We divided the records from the validation sampling points into four groups according to UFR category and found that the records were least accurate in the industry and commerce facilities regions (Figure 11). This may have been due to the complexity of these regions themselves. It is necessary to subdivide them further, for example by separating industrial regions from commercial regions, to obtain more UFR categories and make the results more realistic. In the future, more categories of UFRs will be identified to improve the accuracy of intraday variation mapping of population age structure.
The third aspect is sampling error. Since the population is mobile and unevenly distributed, abnormal values may occur during population sampling, such as an elderly tour group being counted when sampling at a tourist attraction, which interferes with the sampling results. To address this problem, the number of sampling points and the number of samples taken at each point can be increased to reduce the sampling error as much as possible.
The last aspect is the inherent drawback of parcels generated from road network data. In this article, we assumed that all parcels were separated by roads. In reality, parcels can be divided by walls, vegetation, rivers, or even no physical boundary at all. Therefore, future research should consider these elements comprehensively and establish a multi-element land-use parcel identification method to improve the identification accuracy of land-use parcels.

Conclusions
The objectives set out in the introduction were achieved in this study. We realized the intraday variation mapping of population age structure and showed that the mobility of the urban population follows certain patterns: people of different age groups tend to gather in different categories of UFRs at different times of day.
This article made the first attempt to apply UFR data to analyze the spatial and temporal distribution of elderly individuals and children. The results of this article can accurately display information on the distribution and activity patterns of population of different age groups, which can be directly applied to assess the aging of the population in different regions and to urban management. The research method in this article also provides ideas for calculating the proportion of other populations with different attributes (e.g., gender or income levels) in the future. The fine-resolution intraday variation maps of population age structure in this article can be further used in many other studies. By combining the results of this article and population maps such as WorldPop and LandScan, we were able to obtain the distribution data of elderly individuals and children in different time periods. These data will play a vital role in the risk assessment of disasters, public health management, and many other aspects.

Data Availability Statement: Publicly available datasets were analyzed in this study. These data can be found here: https://ncc.nesdis.noaa.gov/VIIRS/, https://lpdaac.usgs.gov/, https://www2.jpl.nasa.gov/srtm/, https://scihub.copernicus.eu, https://www.openstreetmap.org/, http://nj.tjj.beijing.gov.cn (accessed on 18 February 2021).