Associations of Spatial Aggregation between Neighborhood Facilities and the Population of Age Groups Based on Points-of-Interest Data

Abstract: By actively adapting urban planning to identified social needs, residential areas tend to be more people-oriented, fairer, resource-saving, and sustainable. The emergence of big data has provided new opportunities for the planning of residential urban areas. Since the quantity and age-appropriateness of neighborhood facilities are important criteria when developing the ideal neighborhood, this study investigated the associations of the number of neighborhood facilities and the age groups within those neighborhoods by using the Wuhan metropolitan area in China as a case study and by applying a Geodetector and regression analysis to points-of-interest data. In terms of age groups, the neighborhood facilities of kindergartens, pharmacies, and bus stations were found to be highly associated with population size, regardless of the age difference. It was also found that convenience stores were closely related to the adult population, and that convenience stores, community hospitals or clinics, and vegetable markets or fresh supermarkets were associated with the elderly population. Facilities without significant correlations were equally important, but it was found that there was no statistical correlation between the number of facilities and the distribution of the population. The weak association of key educational resources and medical resources with the population indicates a concentrated distribution of educational resources and medical resources, and the latent insufficiency of schools, community hospitals, or clinics at some neighborhoods. It concludes that planning of neighborhood facilities for residential areas in Wuhan requires optimization in terms of matching the provision of facilities with population size and social structure. Furthermore, more efforts should be put into supplementing important facilities and building differentiated residential area programs based on age structure.


Introduction
The theory and practice of urban planning in Europe and North America have been developing for more than one hundred years [1], throughout which time there has been a recognition of residential areas as the basic unit of planning and that the daily behavior of urban residents, the use of labor markets, and urban facilities are the reasons that make a city what it is [2,3]. However, starting in the 1950s, some Asian countries gradually introduced the concept of the pedestrian-scale neighborhood [4]. Under the challenges of urban development, China has been rethinking its urbanization [5], and residential area planning has been an important part of urban planning in the new era.
Neighborhood facilities within walking distance would largely meet the basic needs of education, health care, and food, while other social activities that could be easily reached by effective public transport would achieve an orderly utilization of facilities and improve the livability of the residential area [6][7][8][9][10][11][12][13][14][15]. Therefore, accessibility, spatial distribution, and quantity of facilities affect the residential attraction of the neighborhood and population mobility. For instance, children's educational environment has become a new driving force to shape the urban form, and the driving force of children's access to quality schools is even stronger than that of employment demand [16]. In the past two decades, people have been exploring the spatial equity of public facilities allocation, focusing on solving the mismatch between population needs and accessible facilities [17]. An imbalance of accessibility is a major defect of urban public services [18,19]. In order to improve the accessibility of neighborhood facilities, the timeframe for accessibility of facilities is proposed according to car ownership, land use policies, and people's acceptance of walking [20]. The Chinese government advocates 15-min life circle neighborhoods to provide citizens with basic public services by walking and ultimately to realize a comprehensive social organization combining public transport [6,15]. Residential area planning, beginning with accurate population predictions, helps neighborhoods allocate resources and services, and aids the development of neighborhood plans, planning policies, and priorities [21,22]. The location of small-size neighborhood facilities is easily adjustable, but there should be an adequate number of facilities within 15-min walking distance to match the population [23]. 
Research into the associations between the number of facilities and the population in residential areas is concerned with whether there is a quantitative basis for, and how to specify, the population-based indices of facilities (per thousand persons or by population size) in residential area planning.
Furthermore, residential area planning based on the population structure would help to improve the efficiency of facilities, save social resources, and embody social care. The human-scale urban spatial form needs to be directly connected to those who participate in and use it [24]. According to the theory of "Homo Urbanicus", different groups pursue different opportunities for spatial contact under a given population size [25]. Planners should identify what kind of human settlements attract what kind of "Homo Urbanicus" and operate these variables through planning measures [25]. Fainstein believed that the goal of urban planning is to consciously create a just city [26]. Henri Lefebvre defined space as a social construction, advocated that all groups should have a "right to the city", and in the 1990s proposed three main approaches to urban justice: communicative rationality, recognition of diversity, and spatial justice [27]. The connotation of the just city is extremely rich, and it is difficult to anticipate in the early stages of planning, but the allocation of public service facilities should be directed towards social justice and spatial optimization [28]. Although the social attributes of residents include family income, education, gender, age, and marital status [29][30][31], the different demands for public facilities among age groups have become a prominent topic in research on livable cities. More attention has been paid to the influence of the neighborhood environment on the livability of cities for the elderly, and many planners advocate the creation of communities that care for the elderly [32,33]. Since shopping and services, transportation and pedestrian facilities, neighborhood attractions, and public transportation affect elderly people's activities, it is important to design safe and accessible neighborhoods that are friendly to the elderly [34,35].
The development of public space, neighborhoods, and cities conducive to the residence of children is also an important embodiment of social equity and social care [36]. We should integrate children's needs and sense of belonging and enable them to actively participate in all stages from comprehensive research through to design-planning and implementation [37]. Meanwhile, since the aging of China is accelerating [38,39], it is especially necessary for Chinese cities to discuss the needs of age groups and the fairness of facilities distribution [40].
The scale and perspective of neighborhood research determine how far the results can serve as a reference in residential area planning [41]. The common feature of urban facilities is their aggregation in the central urban area [42,43], and some scientific problems are worth considering when applying the concept of urban development to planning practice. Though compactness and aggregation of urban centers are considered the optimal response to the disordered expansion of urban development, it needs to be determined whether there are insufficient facilities in the central urban area or whether the scale effect of a small number of high-efficiency facilities would meet the needs of the city. These questions are important for relevant government departments when regulating the internal resources of a city and arranging the layout of urban facilities according to population and urban functions on a fine scale [44]. Previously, researchers used questionnaires to help select the social environmental factors that affect activity patterns [45]. The relatively small sample sizes limited the generalization of the findings to a larger picture. These days, however, human-oriented data resources can be generated from big data, such as points-of-interest (POI) data on an online map, cellphone big data, global positioning system data of taxis, and social media data. These have provided new research paradigms for urban residential area research and planning due to their high updating frequency and large sample sizes [46,47]. To date, the application of big data methods to investigating the associations of spatial aggregation between neighborhood facilities and the population of age groups is still lacking.
Thus, to fill this gap, this research took the Wuhan metropolitan area of central China as a case study to: (1) examine the quantitative associations of neighborhood facilities and population; (2) estimate whether there is an age difference in the associations; and (3) offer references on the layout and number of facilities according to the population size for residential area planning. In this paper, twelve neighborhood facilities were selected based on corresponding planning standards and identified by referring to POI in Wuhan on Amap. Geodetector [48] and regression models were employed to investigate the associations between population and neighborhood facility provisions and the differences in the needs among various age groups. The remainder of the paper is organized as follows: section two describes the methodology, including the data collection and analysis methods; the results of associations according to Geodetector and regression analysis are presented in the third section; the findings from the viewpoint of the spatial character of neighborhood facilities and urban planning practice are discussed in the fourth section; and the last section provides the study's conclusions.

Case Selection: Wuhan Metropolitan Area
Wuhan is the capital city of Hubei province in central China [49]. Since the urban built-up areas of Wuhan are concentrated in Wuhan's metropolitan area, the scope of the research is the Wuhan metropolitan area, as shown in Figure 1. In 2017 the population of the Wuhan metropolitan area was approximately 9.087 million, including 0.954 million children, 7.187 million adults, and 0.946 million elderly. The dependency ratio of the population reached 20.91%, the employment ratio reached 79.09%, and the aging coefficient reached 10.71%. When the dependency ratio is less than 50%, the population is said to be in a demographic dividend period; and an aging coefficient of more than 7% means that the country or region has an aging society. Therefore, the Wuhan metropolitan area was considered to be in a demographic dividend period, but with a relatively high degree of aging.
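The indicators above can be checked with a few lines of arithmetic. The sketch below is only illustrative and assumes that each indicator is a share of the total population (dependents over total, adults over total, elderly over total); the text does not state the definitions, and the variable names are hypothetical. Under that assumption the rounded population figures reproduce the reported dependency and employment ratios, while the elderly share comes out near 10.4%, somewhat below the reported aging coefficient of 10.71%, which may rest on a base or definition not given here.

```python
# Population of the Wuhan metropolitan area in 2017, in millions (from the text).
children = 0.954   # ages 0-14
adults = 7.187     # ages 15-64
elderly = 0.946    # ages 65 and over
total = children + adults + elderly

# Assumption: every indicator is expressed as a share of the total population.
dependency_ratio = (children + elderly) / total
employment_ratio = adults / total
elderly_share = elderly / total

print(f"total population: {total:.3f} million")
print(f"dependency ratio: {dependency_ratio:.2%}")
print(f"employment ratio: {employment_ratio:.2%}")
print(f"elderly share:    {elderly_share:.2%}")

# Thresholds quoted in the text.
in_demographic_dividend = dependency_ratio < 0.50
aging_society = elderly_share > 0.07
```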
In this study, the population was divided into three age groups according to common international criteria: children from 0 to 14 years old, adults from 15 to 64 years old, and the elderly aged 65 and over. The population density of the age groups is shown in Figure 1. The population distribution in the Wuhan metropolitan area is uneven and concentrated in the central urban area. In particular, the old town of Hankou has concentrated neighborhoods with a population density of over 100,000 people/km². Across age groups, the elderly population is close in size to the children population, but the high-density neighborhoods of children are scattered. High-density neighborhoods of the elderly population are concentrated in the old town of Hankou, and the density of the elderly population in some neighborhoods is as high as 20,000 people/km². The overall spatial pattern of the adult population is relatively consistent with the distribution of the total population, and the high-density neighborhoods of over 100,000 people/km² are concentrated in the old town of Hankou. There are age differences in population concentrations, so it is necessary to explore whether there are age differences in the associations between population and the number of facilities. The logic framework of this study is shown in Figure 2. Based on the theories of urban geography, urban planning, and statistics, it is feasible to explore the associations between the population of age groups and neighborhood facilities by using Geodetector and regression analysis. The findings would help urban residential area planning to account quantitatively for age differences and for the spatial aggregation of facilities.

The Selection of Neighborhood Facilities
In China's "Standard for urban residential area planning and design", 5-min, 10-min, and 15-min pedestrian-scale neighborhoods contain neighborhood facilities for public management and public services, commercial services, municipal public services, transportation stations, and block services. The standard highlights the service radius and walking time of residents, and is a new attempt compared with the previous planning standard of demographic indicators. Most of the municipal public facilities have specialized planning, and some facilities are generally insufficient or lacking, such as community canteens and senior citizen care centers. Therefore, 12 facilities closely related to the daily life of residents, including education, medical treatment, health, and transportation were selected as shown in Table 1. Social media data is a direct response to human activity and the spatial form of a city and is widely used when evaluating the urban scale and spatial structure [50,51]. Within social media data, POI is widely used in the study of aggregation character and accessibility of facilities, such as schools or hospitals [52][53][54]. POI is therefore suitable to reflect the spatial match between the number of facilities and population size within walking distances of neighborhoods. Facilities were extracted from POI on

Association Analysis of Neighborhood Facilities and Population
To study the associations between population and the number of facilities on a macro scale and to apply them to residential area planning, the geography-based Geodetector and regression analysis were selected.

Geodetector
In the Geodetector, factor detection compares the sum of the within-stratum variances of the explained variable Y under the stratification of factor X with the total variance of the whole region to explain the spatial differentiation of attribute Y, which is measured by the q value [48]. The range of q is 0 to 1, with a larger q value indicating stronger explanatory power of factor X on attribute Y. Interaction detection assesses whether the explanatory power of two factors X1 and X2 on the explained variable Y increases or decreases when they act together, by comparing the q values of each factor alone, q(X1) and q(X2), with the q value of their interaction, q(X1∩X2) [48]. Lan et al. used the Geodetector to study the response of housing prices to education, medical care, culture, business, and leisure facilities and found that the needs for medical care and education caused the difference in housing prices, and that an increase in the types of facilities would enhance the effects [55]. Therefore, the Geodetector was considered suitable for the study of the associations between the population and the number of neighborhood facilities.
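As a rough illustration of the two detectors, the sketch below computes q in the form commonly given in the Geodetector literature, q = 1 - Σ_h(N_h σ_h²) / (N σ²), where h indexes the strata of a factor. It is not the Geodetector software itself: the function names, the toy population values, and the pharmacy (PHA) and bus station (BS) classes are all hypothetical, and the facility counts are assumed to have been binned into classes beforehand.

```python
from collections import defaultdict
from statistics import pvariance


def q_statistic(y, strata):
    """Factor detector: q = 1 - sum_h(N_h * var_h) / (N * var)."""
    groups = defaultdict(list)
    for value, label in zip(y, strata):
        groups[label].append(value)
    total_ss = len(y) * pvariance(y)
    within_ss = sum(len(g) * pvariance(g) for g in groups.values())
    return 1.0 - within_ss / total_ss


def interaction_q(y, strata_1, strata_2):
    """Interaction detector: q for the strata formed by overlaying two factors."""
    return q_statistic(y, list(zip(strata_1, strata_2)))


# Hypothetical data: population (thousands) in eight neighborhoods and two
# facility counts already binned into classes. None of this is from the study.
population = [2.0, 3.0, 2.5, 8.0, 9.0, 8.5, 5.0, 6.0]
pha_class = ["low", "low", "low", "high", "high", "high", "low", "high"]
bs_class = ["few", "few", "many", "many", "many", "few", "many", "few"]

q_pha = q_statistic(population, pha_class)
q_bs = q_statistic(population, bs_class)
q_both = interaction_q(population, pha_class, bs_class)

print(f"q(PHA) = {q_pha:.3f}, q(BS) = {q_bs:.3f}, q(PHA and BS) = {q_both:.3f}")
```

In this toy case the interaction q exceeds both single-factor q values, which the Geodetector literature reads as the two factors enhancing each other.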

Regression Analysis
Before the regression analysis, correlation analysis was conducted among the total population, the age group populations, and the number of neighborhood facilities. A correlation coefficient above 0.5 was treated as a strong correlation, between 0.3 and 0.5 as a moderate correlation, between 0.1 and 0.3 as a weak correlation, and below 0.1 as no correlation. For facilities that were strongly correlated with the population, a linear regression model based on the ordinary least squares (OLS) method was established. After the mean-based prediction of linear regression, quantile regression was used to supplement OLS, because the distributions of population and facilities have sharp peaks (outliers) and thick tails. Quantile regression describes the whole conditional distribution of the explained variable, and its estimation is more robust to outliers. Meanwhile, functional relationships at high quantile levels help determine whether there is a shortage of facilities or a scale effect. Quantile regression uses the minimum absolute deviation to estimate the relationship at the conditional quantiles of the variables.
The key to the solution of the minimum absolute deviation lies in the design of the loss function; for the calculation, please refer to the literature [56].
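To make the role of the loss function concrete, the following sketch uses the check (pinball) loss that is standard in the quantile regression literature; whether [56] uses exactly this formulation is an assumption. The facility counts are hypothetical, and the model is reduced to a single constant prediction so that the effect of the quantile level tau is easy to see.

```python
def pinball_loss(residual, tau):
    """Check loss: tau * u for u >= 0 and (tau - 1) * u for u < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_loss(y, prediction, tau):
    """Sum of check losses for one constant prediction."""
    return sum(pinball_loss(value - prediction, tau) for value in y)


# Hypothetical facility counts in seven neighborhoods, with one outlier.
counts = [1, 2, 2, 3, 4, 5, 30]

# Search the observed values for the constant that minimizes the loss.
best_median = min(counts, key=lambda c: total_loss(counts, c, 0.5))
best_upper = min(counts, key=lambda c: total_loss(counts, c, 0.9))

print("tau = 0.5 minimizer:", best_median)
print("tau = 0.9 minimizer:", best_upper)
```

With tau = 0.5 the loss is half the sum of absolute deviations, so the minimizer is the sample median and the outlier has little influence; a higher tau penalizes under-prediction more heavily and moves the fit toward the upper tail.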

Associations Identified by Geodetector
The total population of all ages and the children, adult, and elderly populations were each taken as the explained variable, and the numbers of neighborhood facilities as the explanatory variables in the Geodetector. The results of factor detection and interaction detection are shown in Table 2, Table A1, Table A2, and Table A3. [...] weak when used to try and explain the population, the combination of CS and CH/CL helps to explain the population distribution. It also means that the two interacting variables are likely to coexist and are important facilities that cannot be replaced in residential areas. The interaction has a more significant impact on the distribution of the elderly population. On the one hand, the elderly population has a larger demand for PHA, CH/CL, and VM/FRS; on the other hand, it may be that the elderly are concentrated in neighborhoods that are habitable.

Results of Regression Analysis
According to the correlation coefficients shown in Table 3, the facilities related to the total population with the correlation coefficient higher than 0.5 [...] Table 4, Table A4, Table A5, and Table A6. For mean regression, facility types that are numerous have a high intercept, and facility types that are few have a low intercept. Examples of this include CS with an intercept of 12.101 and KIN with an intercept of 1.241 (Table 4). According to the statistics of the facilities in different age groups, the intercepts are close, but the coefficient B varies considerably. In Table 4, 25% of the neighborhoods with a high population have a large number of CS and VM/FRS, while 10% of the neighborhoods with a high population have a large number of BS. In Table A4, 10% of the neighborhoods with a large children population have a large number of BS. In Table A5, 25% of the neighborhoods with a large adult population have a large number of CS, and 10% of the neighborhoods have a large number of BS. In Table A6, 25% of the neighborhoods with a large elderly population have a large number of CH/CL, CS, and VM/FRS; 10% of the neighborhoods have a large number of BS. It is worth noting that since the neighborhoods are ranked by the size of the children or elderly population, the percentages for different age groups may not refer to the same group of neighborhoods.

Associations of Facilities and Age Groups and Policy Implications
In this study, we used a Geodetector and regression analysis to find associations between the population and the number of facilities based on POI on a large scale. The types and priorities of neighborhood facilities associated with age groups differ between the Geodetector and the regression analysis, but it is clear that KIN, PHA, and BS are closely related to all three age groups and should therefore be the most basic neighborhood facilities. The adult population's association with CS and the elderly population's associations with CS, CH/CL, and VM/FRS indicate that there are age differences in demand and that mature neighborhoods need to be more inclusive. This implies that, beyond following the general planning standards, the demographic characteristics of a neighborhood, particularly its age structure, should also be emphasized during the planning process. According to the regression analysis, the statistics change sensitively in the relationships between the children population and facilities and between the elderly population and facilities, which indicates that the two age groups have specific public service needs and that their demand may be underrepresented if only the total population is considered in residential planning. This reflects the significance of the current research trend toward elderly- and child-friendly cities and neighborhoods. As vulnerable and sensitive age groups, the elderly and children should be given more focus when local public services are improved on the government agenda.
Given people's diverse and qualitative demands, the 12 types of facilities chosen for this study represent the basic needs of the public in China, which may differ in other countries or studies. For example, a study of Australian retirement villages found that the most numerous facilities were neighborhood centers, libraries, barbecue facilities, hairdressers/salons, and snooker/billiard tables [57], which is very different from the needs of the elderly in China. The numbers of JHS, PRS, SAN, GYM, BO, and RTRS facilities are weakly associated with the population, but these facilities are also necessary infrastructure for ensuring the quality of the neighborhood. The reason for the weak associations is that these facilities are centrally clustered, and the large number of low values prevents a regular pattern across the whole area. In particular, rail transit stations are laid out at equal intervals along the road network, a peculiarity that prevents the number of stations from keeping a positive association with the population distribution. Next, the quantile regression shows that only a small number of neighborhoods enjoy a large number of related facilities. This means that the aggregation of population and that of related facilities are synchronous, and there is no lack of facilities or scale effect in the central urban area. Meanwhile, neighborhoods at the low quantiles are more vulnerable to inadequate facilities. In contrast, some facilities do not have significant concentrations at each quantile level. This indicates that densely populated neighborhoods are vulnerable to shortages. Thus, the more populated a neighborhood is, the higher the priority that policy making should give to filling its shortage of facilities.
Besides, among education, medical treatment, health, and transportation, the weak correlation and statistical changes between educational facilities and population deserve attention. Given the constant growth of the population in urban Wuhan, this evidence implies that educational facilities are the most likely to be in short supply as rapid urbanization continues. Therefore, the government should not neglect supplementing educational resources to secure adequate school quotas. Lastly, the interaction detection suggests that the more types of facilities are available, the more attractive the area will be for the population. This again indicates that the diversity of provided facilities plays a key role in improving the attractiveness of a district. In this regard, relevant guidelines and standards should be sustained or even enhanced to improve a city's livability.

Limitations and Future Research
Although the research adopted diverse methods and data sources, it has several limitations, which to some extent confine the application and generalization of the findings. Firstly, the magnitude of the 12 facilities is quite different, but this does not affect the research results. The results reveal a need to make up for the deficiency in the number of facilities by building differentiated neighborhoods according to the age groups. Although there are an increasing number of elderly people in China and an inadequate distribution of appropriate urban neighborhood services [58], it is not enough to consider age differences alone. Secondly, the relevant results did not provide evidence supporting the hypothesis that the greater the number of facilities available, the better the livability of the city. Instead, livability depends on the features of services and the ability to achieve effective social resource allocation. Thirdly, to create a pedestrian-scale neighborhood for all residents, a comprehensive questionnaire survey identifying more facilities that reflect the population's social nature should be conducted, with the aim of linking current neighborhood features to the rules and standards of neighborhood design. Fourthly, in terms of data sources, since the POI used in this research are static, dynamic social activities and the use of facilities could be better captured by combining more social media data that respond to urban vitality and planning policies. In addition, the number of facilities in neighborhood buffer zones was included in this study, though whether they were accessible or not was not taken into account. This needs to be considered together with road network accessibility in future related research. Lastly, the differences in the spatial distribution of associations of neighborhood facilities and population could be captured by geographically weighted regression, which can be conducted in future studies.

Conclusions
For sustainable urban planning of residential areas and for improving overall livability, it is advocated that facilities for residents' daily life should be allocated within an acceptable walking distance. In this sense, residential area planning requires a sufficient quantity of neighborhood facilities and services to suit the diverse demands of the local population structure and size. To reveal the quantitative associations between neighborhood facilities within walking distance and the population of age groups, and whether there are age differences, this study adopted a Geodetector and regression analysis to investigate the spatial features and the associations of the facilities and population in the Wuhan metropolitan area of China based on POI data. It was found that kindergartens, pharmacies, and bus stations have a significant correlation with the population regardless of the age difference and should be a neighborhood's basic supporting facilities. Since the number of facility types associated with the population increased from the children to the adult and the elderly age groups, a differentiated design should be considered according to the age structure of the residents in the process of neighborhood planning. It was also found that although population aggregation is more likely to drive facility aggregation, facilities without significant correlations turned out to be statistically inconsistent with the distribution of the population. In particular, the weak association of educational resources and medical resources with the population means that there is a centralized distribution of educational resources and medical resources, while some neighborhoods may have insufficient schools, community hospitals, or clinics. In general, the planning of neighborhood facilities for residential areas in Wuhan tends to be optimized in terms of matching the provision of facilities with the population size and social structure. However, the current residential planning system is lagging behind the demand for facilities.
This study advocates keeping the allocation of neighborhood facilities flexible for addressing the uncertainty of urban development and population demand. The study contributes to the body of knowledge related to population-facilities distribution in China and provides a reference for conducting similar studies in other comparable conurbations.

Appendix B
Table A4. Mean regression and quantile regression parameters of the children population and facilities.
Note: *** indicates a significance level of p < 0.001; ** indicates p < 0.01; * indicates p < 0.05.