Article

High-Spatiotemporal-Resolution Population Distribution Estimation Based on the Strong and Weak Perception of Population Activity Patterns

State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, Wuhan 430079, China
* Author to whom correspondence should be addressed.
ISPRS Int. J. Geo-Inf. 2026, 15(1), 34; https://doi.org/10.3390/ijgi15010034
Submission received: 9 October 2025 / Revised: 23 December 2025 / Accepted: 4 January 2026 / Published: 8 January 2026

Abstract

Population activity drives urban development, and high-spatiotemporal-resolution population distribution provides critical insights for refined urban management and social services. However, mixed population activity patterns and spatial heterogeneity make it difficult to estimate population distribution at high temporal and high spatial resolution simultaneously. Therefore, we propose the High-Spatiotemporal-Resolution Population Distribution Estimation Based on the Strong and Weak Perception of Population Activity Patterns (SWPP-HSTPE) method to estimate hourly population distribution at the building scale. During the weak-perception period, we construct a Modified Dual-Environment Feature Fusion model using building features within small-scale grids to estimate stable nighttime populations. During the strong-perception period, we first incorporate the activity characteristics of weakly perceived activity populations (minors and older people). Then, the Self-Organizing Map algorithm and spatial environment functional purity are used to decompose the mixed patterns of strongly perceived activity populations (young and middle-aged people) and to extract fundamental patterns, which are combined with building types for population calculation. The results demonstrate that the SWPP-HSTPE method achieves high-spatiotemporal-resolution population distribution estimation. During the weak-perception period, the estimated population correlated strongly with actual household counts (r = 0.72) and outperformed WorldPop and GHS-POP by 0.157 and 0.133, respectively. During the strong-perception period, the SWPP-HSTPE model achieved a correlation with hourly population estimates that was approximately 4% higher than that of the baseline model, while reducing estimation errors by nearly 2%. By jointly accounting for temporal dynamics and population activity patterns, this study provides valuable data support and methodological insights for fine-grained urban management.

1. Introduction

As a fundamental component of urban environments, the population is the primary source of urban vitality. The spatiotemporal distribution and dynamics of populations significantly influence various aspects of urban life, including the economy and the environment [1]. The rapid acceleration of urbanization has brought about significant challenges, including environmental pollution and ecological degradation [2,3], issues related to heat environments [4], and challenges in urban planning and land management [5]. This has been accompanied by an explosive increase in urban populations and rapid shifts in population distribution, leading to a range of urban problems, including traffic congestion, severe aging, and increased risks of infectious diseases [6,7]. These challenges underscore the urgent need for refined urban management and monitoring. Acquiring, modeling, and analyzing population distribution information across different spatiotemporal scales is crucial for cities’ healthy and sustainable development. At the spatial level, large-scale population data, such as that at the administrative or street scale, is valuable for macro-level urban planning and long-term development. Conversely, small-scale data, such as building-level population information, is essential for detailed urban management tasks. At the temporal level, dynamic population data reflect continuous population changes across different spatial scales. When the temporal interval is long, as in annual population distribution data, it can support studies of population migration patterns, the evolution of urban forms, and functional changes. On the other hand, short-term data, such as hourly population distribution, is critical for more precise urban management applications, including traffic anomaly detection, infectious disease prevention, and disaster emergency response. Traditional urban population distribution data are primarily obtained through survey-based statistical methods. 
For instance, governments across countries typically rely on national censuses to collect population distribution data within administrative boundaries. However, this approach has several drawbacks, including high labor and material costs, long census cycles, large spatial granularity, and reliance on administrative units for data presentation [8]. While existing datasets created by institutions or researchers [9,10,11] have been widely used, they still face the challenge of achieving high temporal and spatial resolution simultaneously. Therefore, fine-grained, high-resolution spatiotemporal population distribution estimation remains a pressing issue in urban geography and is of significant importance for the healthy development and fine-grained management of cities. In recent years, research on estimating high-spatiotemporal-resolution population distributions has increasingly converged around three principal directions. The first involves integrating multi-source datasets—such as building footprints, land-use information, nighttime lights, and mobile-sensing data—to simultaneously enhance both spatial and temporal resolution [12,13,14,15]. The second focuses on identifying intra-day human–land activity patterns from aggregated observations and leveraging these patterns for dynamic population allocation [14,16]. The third emphasizes achieving methodological interpretability, accounting for scale effects, and ensuring reproducibility while supporting finer spatial units, such as at the building level [17,18,19,20,21,22].
Numerous scholars have conducted studies on high-spatiotemporal-resolution population distribution estimation. This research categorizes these studies based on the types of data used and the modeling methods employed. There are three categories of data types: population distribution estimation based on land-use data, on new sensor data, and on multi-source data fusion. Due to its relatively low spatiotemporal resolution, land-use data is better suited for estimating population distribution at broader time scales, such as annually or diurnally [23]. New sensor data, such as transit card records and mobile phone locations, are commonly used to estimate population distribution at finer granularity. For example, Ma et al. [24] used subway smart card data to estimate hourly population distribution at the community scale in six central areas of Beijing. Feng et al. [25] proposed a bivariate model integrating mobile phone data to estimate multi-scale dynamic population distribution from traffic analysis zones to county-level administrative districts. However, due to inconsistencies in data collection frequency, processing methods, spatial resolution, and biases in data sampling, some studies [26] have introduced multi-source data fusion modeling approaches to mitigate the impact of sensor data instability on estimation results. In terms of modeling methods, three categories can be identified: weighted interpolation [27], statistical regression [17], and artificial intelligence (AI) modeling. Bergroth et al. [28] applied weighted interpolation to estimate hourly population distribution within 250 m grids across the Helsinki metropolitan area, combining mobile data with auxiliary data such as land cover, building information, and time-use surveys. Deville et al. [29] employed a log-linear model to establish a functional relationship between nighttime mobile phone user density and census population density at the street level. 
They used it to estimate daytime and nighttime population distributions at the commune scale. AI modeling, leveraging machine learning or deep learning algorithms, is increasingly popular due to its ability to capture non-linear relationships. For instance, Chen et al. [18,19,20,21] employed artificial neural networks to analyze the population’s temporal dependencies and spatial correlations based on mobile phone inflow and outflow data, achieving real-time population distribution modeling at grid scales ranging from 500 m to 2000 m in urban areas. Current research on dynamic population distribution modeling primarily uses statistical properties of spatiotemporal data or employs AI algorithms to autonomously learn hidden spatiotemporal patterns in population distribution data, thereby estimating overall population dynamics. However, these approaches often overlook the spatiotemporal differences among populations engaged in different real-life activities, and therefore improve only the temporal or only the spatial granularity of population distribution estimation. This limitation has garnered increasing attention within the academic community. For instance, recent studies have begun to address this issue by explicitly modeling the heterogeneity of geolocation data through decoupling “behavior” from “population”, thereby correcting biases in population distribution estimation [30]. If we can identify the fundamental patterns of population activity and their associated spatiotemporal characteristics, and integrate them with multi-source data to estimate populations with different activity attributes, we can reduce the biases introduced by big data. This approach would allow us to improve the accuracy of dynamic population distribution estimates while balancing both temporal and spatial resolution.
The varying attributes of population activities introduce complexity into activity patterns, characterized by distinct temporal and spatial behaviors. Research on population activity patterns primarily relies on individual-level tracking data [31,32,33,34,35,36] and aggregated group data. For large-scale individual mobile phone trajectory data [37,38,39,40,41,42], methods such as hidden Markov models, probabilistic models, rule-based algorithms, and machine learning approaches are commonly employed to infer population activity patterns [43]. However, these methods often require additional auxiliary data to provide prior knowledge related to activities. To reduce dependence on such auxiliary data, Liu et al. [44] proposed a semi-supervised learning method that constructs user activity chains by semantically annotating user activity purposes. This method successfully inferred seven urban population activity patterns: home, work, shopping and dining, leisure and entertainment, healthcare, and exercise. Although individual tracking data can uncover more detailed population activity patterns, access to this type of data is often restricted due to privacy concerns, and the processing methods and models involved are typically complex, limiting their applicability and timeliness. In contrast, sequence snapshot data, which aggregates and anonymizes individual location information, only records the number of active users within a given spatial and temporal range. This approach has higher data availability and broader application scenarios. For instance, specific mobile applications generate location request data that reflect actual spatiotemporal population distribution, such as Tencent’s real-time user density data and Baidu heat maps. Methods for analyzing such data include sampling techniques [12], clustering methods [45], and spectral mixture analysis [16]. For example, Zhao et al. 
[13] used urban functional zones and the Baidu heat map data to obtain time-series data population counts for each zone from 7:00 to 24:00 and to create population activity distribution curves. Shi et al. [14], using Tencent’s real-time user density data, applied spectral mixture analysis to identify different types of population activity distribution patterns. Chen et al. [15] integrated Tencent’s real-time user density data with landscape features to determine the spatiotemporal distribution of population activities in urban parks. However, a key issue with this data type is the scale effect introduced during aggregation, which can result in mixed patterns within the observed population activity patterns. Therefore, the ability to effectively extract fundamental temporal characteristics of population activities from sequence snapshot data containing mixed patterns and to recognize spatiotemporal distribution differences in population activity patterns across spatial environments is crucial for unlocking the potential of these data to perceive population activity patterns.
Despite notable progress in data sources and methodological approaches, existing studies still share three fundamental limitations. First, products derived from remote sensing or land-use data typically offer high spatial resolution but struggle to capture hour-level dynamics; in contrast, mobile-phone or internet-based sensing provides superior temporal resolution but often suffers from sampling bias, coarse spatial support, and stringent constraints related to privacy and data accessibility [12,13,14,15,18,19,20,21,24,25,26,28,29]. Second, when sequential snapshot data are aggregated into grid units, “mixed patterns” can easily emerge, leading to clustering results with highly similar temporal signatures and ambiguous semantic distinctions, thereby hindering subsequent decomposition and allocation processes [14,16]. Third, Dasymetric mapping and related allocation methods, when not coupled with physical constraints such as building type and opening status, are prone to mismatches between activity patterns and actual places. Moreover, the downscaling process from communities to grids and further to individual buildings entails scale effects and risks of ecological fallacy [17,27,46].
To address these limitations, this study proposed a method called High-Spatiotemporal-Resolution Population Distribution Estimation Based on the Strong and Weak Perception of Population Activity Patterns (SWPP-HSTPE). The key terms used in this study are defined and explained as follows:
  • Sequence Snapshot Data: Data that reflects the state of a system at specific time points, arranged in a sequence by time. Each data snapshot is a static observation, such as the population distribution at a given moment [47].
  • Strong/Weak-Perception Period: The temporal perception of population activity is the basis and prerequisite for understanding population activity patterns. This study accounts for bias in the sequence snapshot data commonly used for population distribution estimation [44]: perception intensity drops during the early morning hours, when most people rest, and rises during the day, when they become active. The day is therefore divided into a weak-perception period (00:00–08:00) and a strong-perception period (08:00–24:00).
  • Strongly/Weakly Perceived Activity Populations: Due to sampling biases in spatiotemporal data sources across users, this study classifies the population into strongly perceived activity populations (ages 18–59) and weakly perceived activity populations, including minors (ages 0–17) and older people (ages 60+).
  • Dual Environment of Buildings: This refers to the dual environment of buildings proposed in previous research [22], encompassing both the building’s internal and external environments. The internal environmental boundary is defined by the building’s physical outline and serves as the fundamental independent unit for accommodating the population. The building’s outline can more accurately reflect the impact of internal features on population distribution. The external environmental boundary is defined by the boundary of the surrounding Traffic Analysis Zone (TAZ), which aligns more closely with the actual urban built form.
  • Spatial Environment Functional Purity (SEFP): This refers to the proportion of usable area occupied by a specific type of building within a geographic analysis unit. For example, if educational office buildings occupy the largest share of usable area within a 200 m grid, the SEFP of that grid for educational office function will be highest. This study uses SEFP to extract basic population activity patterns.
Initially, we integrated relatively static data, including building characteristics, POI kernel density features, and nighttime light data within small-scale grids, together with Baidu Heatmap data for the period most correlated with the resident population. This integration calibrated the population estimation model based on Dual-Environment Feature Fusion (DEFF), yielding the Modified Dual-Environment Feature Fusion (MDEFF) model. Based on the temporal perception of population activity, we used the MDEFF model to estimate the nighttime population distribution at the building level during the weak-perception period (00:00–08:00). For the strong-perception period (08:00–24:00), population distribution at the building scale was estimated based on population attributes and activity patterns. First, we estimated the distribution of the weakly perceived activity populations (minors and older people) based on their activity time, range, and frequency. Then, we extracted fundamental activity patterns for the strongly perceived activity populations (young and middle-aged people) using the Self-Organizing Map (SOM) algorithm and spatial environment functional purity to decompose the mixed temporal activity patterns. The Dasymetric mapping process was constrained by the proportions of these fundamental activity patterns and the mapping relationship between building types and activity patterns, thereby yielding the building-level population distribution during the strong-perception period. Finally, we obtained hourly population estimates at the building scale by combining the population distributions from different perception periods and activity populations.
By organically integrating nighttime “stable-scale” population estimation, daytime “pattern learning,” and Dasymetric mapping constrained by building types and openness conditions, this study established an interpretable and reproducible workflow that linked population activity patterns to dynamic population estimates at the building scale. Wuhan’s Jianghan and Wuchang districts provide an appropriate testbed for this study because they represent rapidly urbanizing core areas with high population intensity and distinct functional structures, making them well-suited for evaluating high-resolution population distribution methods.

2. Methods

This study proposed the SWPP-HSTPE method, designed to estimate population distribution at the building scale, broken down by hour and differentiated by various perception periods and activity populations. The SWPP-HSTPE method consisted of four main steps: (1) Population distribution estimation during the weak-perception period based on the MDEFF model. (2) Population distribution estimation for weakly perceived activity populations during the strong-perception period. (3) Population distribution estimation for strongly perceived activity populations during the strong-perception period. (4) Aggregation of population estimates across various perception periods and activity populations to generate an hourly population distribution at the building scale. The overall workflow is illustrated in Figure 1.

2.1. Population Distribution Estimation During the Weak-Perception Period Based on the MDEFF Model

At the building scale, which serves as the spatial fine-grain unit for population distribution studies, heterogeneity, discontinuity, and irregularity can lead to scale-related issues when integrating multi-source geographic data. Our previous study proposed a building-scale population estimation method based on Dual-Environment Feature Fusion (DEFF) [22] to address these challenges. The process begins by constructing the internal and external physical environment boundaries of buildings based on the road-based urban structure (as shown in Figure 2), providing an appropriate spatial scale for the fusion of multi-source geographic data. Next, considering the impact of a building’s inherent characteristics on population accommodation, the method identifies building functions based on the dual environments and public perception, and associates building floor and area characteristics to calculate the internal environment’s population capacity. Simultaneously, acknowledging that the external environment of a building reflects, to some extent, the population’s willingness to reside, the method proposes a Data Quality-based Technique for Order of Preference by Similarity to Ideal Solution (DQW-TOPSIS) to calculate population corrections for external environments. Four features—nighttime light intensity, residential point-of-interest (POI) kernel density, weekday mobile location data density, and weekend mobile location data density—are selected to describe the building’s external environment. A comprehensive ranking of the external environment is performed by assessing the proximity of the evaluated environmental feature values to the ideal values, with weights assigned to different attribute features based on their data quality. The detailed principles and processes of this method are described in the previous study [22]; due to space limitations, they are not elaborated here. 
Finally, a linear regression model is constructed by combining both internal and external environmental feature information, generating static population distribution results at the building scale. Although this approach achieved high spatial resolution in population distribution estimation, it did not account for the dynamic nature of population activity patterns, thereby limiting its temporal resolution. Additionally, due to data limitations, the dual-environment features were aggregated at the community unit level for model fitting, which somewhat reduced the regression model’s accuracy. Accordingly, this study adopts fine-scale grids (e.g., 200 m) as the geographic units for computing dual environmental features of buildings. Within each community, a “pseudo-ground-truth” grid layer is generated using population weights corresponding to the census-correlated time period, and this layer is then back-projected to the building scale. Based on this, the original DEFF model is refined to develop the MDEFF model. This modification reduces scale-aggregation errors and enables dual-environment features to operate effectively within the local neighborhood of each building, thereby improving the accuracy and stability of steady-state population estimation during the weak-perception period.
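As a rough illustration of the external-environment ranking described above, the following sketch implements standard weighted TOPSIS for benefit-type criteria. It is not the DQW-TOPSIS procedure of [22]: the data-quality weighting is not reproduced, and the function name, feature values, and weights are hypothetical.

```python
import numpy as np


def topsis_closeness(features, weights):
    """Standard weighted TOPSIS closeness score (all criteria treated as benefits).

    features: array of shape (n_environments, n_criteria).
    weights:  one weight per criterion. In DQW-TOPSIS these would come from
              data quality; here they are simply supplied by the caller.
    """
    x = np.asarray(features, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                          # normalise weights to sum to 1

    norms = np.linalg.norm(x, axis=0)        # vector normalisation per criterion
    norms[norms == 0] = 1.0                  # avoid dividing by zero for empty columns
    v = (x / norms) * w                      # weighted normalised decision matrix

    ideal_best = v.max(axis=0)               # ideal solution
    ideal_worst = v.min(axis=0)              # anti-ideal solution
    d_best = np.linalg.norm(v - ideal_best, axis=1)
    d_worst = np.linalg.norm(v - ideal_worst, axis=1)

    denom = d_best + d_worst
    safe = np.where(denom == 0, 1.0, denom)  # all rows identical -> neutral score
    return np.where(denom == 0, 0.5, d_worst / safe)


# Hypothetical external environments described by four features:
# nighttime light, residential POI density, weekday and weekend location density.
example_features = [[10, 5, 8, 7], [2, 1, 1, 1], [6, 3, 4, 4]]
example_weights = [0.4, 0.2, 0.2, 0.2]
scores = topsis_closeness(example_features, example_weights)
```

An environment that is best on every feature scores 1, one that is worst on every feature scores 0, and all others fall in between.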
First, hourly grid-scale population sampling data (e.g., Baidu heatmap data) were aggregated to the community scale. We then selected the period $t_r$ whose community-level counts had the highest Pearson correlation coefficient with the resident population. Using the grid weights from this period, we downscaled the community population to obtain the simulated actual population $s\_pop_n$ of $grid_n$ for the weak-perception period:
$$s\_pop_n = \frac{\omega_n}{\sum_{i=1}^{m} \omega_i} \times pop_c \tag{1}$$
In Equation (1), $pop_c$ represents the resident population at the community scale, $\omega_n$ denotes the weight of $grid_n$ within the community during the selected period $t_r$, and $\sum_{i=1}^{m} \omega_i$ represents the sum of weights for all $m$ grids within the community during $t_r$. Building-level dual-environment features within each grid were then used to fit a regression model for population estimation (Equation (2)), which was subsequently employed to calculate the building-scale population during the weak-perception period, $d\_pop_{n,k}$:
$$s\_pop_n = a \times \sum_{k \in grid_n} E\_in_k + b \times \sum_{k \in grid_n} E\_out_k \tag{2}$$
In Equation (2), $E\_in_k$ represents the internal population capacity of the $k$th building within $grid_n$, $E\_out_k$ indicates the external environment’s population adjustment factor for the $k$th building within $grid_n$, both sums run over the buildings in $grid_n$, and the coefficients $a$ and $b$ are the regression parameters. Once $a$ and $b$ were determined, the population during the weak-perception period at the building scale, $d\_pop_{n,k}$, could be calculated from the internal capacity and external adjustment values of each building (Equation (3)).
$$d\_pop_{n,k} = a \times E\_in_k + b \times E\_out_k \tag{3}$$
The weak-perception period population at the building scale, $d\_pop_{n,k}$, represents the estimated stable nighttime population distribution. This study used it as the initial population distribution state $pop_{a,n,k}$ for estimating the dynamic daytime population distribution, since daytime population activities develop from this initial state. To validate the building-scale population distribution during the weak-perception period, we calculated the correlation between $d\_pop_{n,k}$ and the number of residential units in the neighborhood. This correlation was then compared with those derived from the WorldPop and GHS-POP datasets.
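The three steps of Equations (1)–(3) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors’ implementation: the function names and toy numbers are hypothetical, the regression is fitted by ordinary least squares without an intercept because Equation (2) shows none, and negative estimates are not clipped.

```python
import numpy as np


def downscale_to_grids(pop_c, grid_weights):
    """Equation (1): split a community population across its grids by weight."""
    w = np.asarray(grid_weights, dtype=float)
    return pop_c * w / w.sum()


def fit_coefficients(e_in_sum, e_out_sum, s_pop):
    """Equation (2): least-squares fit of s_pop = a * sum(E_in) + b * sum(E_out).

    Each input holds one value per grid; no intercept term is fitted
    (an assumption based on the form of Equation (2)).
    """
    design = np.column_stack([e_in_sum, e_out_sum]).astype(float)
    target = np.asarray(s_pop, dtype=float)
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return float(coef[0]), float(coef[1])


def building_population(a, b, e_in, e_out):
    """Equation (3): building-level population from the fitted coefficients."""
    return a * np.asarray(e_in, dtype=float) + b * np.asarray(e_out, dtype=float)


# Toy community of 1000 residents with three grids (hypothetical weights).
s_pop = downscale_to_grids(1000, [1, 3, 6])

# Hypothetical per-grid sums of internal capacity and external adjustment.
a, b = fit_coefficients([40, 130, 250], [10, 20, 50], s_pop)

# Apply the fitted coefficients to two hypothetical buildings.
d_pop = building_population(a, b, e_in=[10, 30], e_out=[4, 6])
```

Because the grid-level targets sum to the community total by construction, the downscaling step preserves the census count; the building-level estimates do so only approximately, to the extent that the regression fits.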

2.2. Weakly Perceived Activity Populations Distribution Estimation During the Strong-Perception Period Based on Population Activity Attributes

To estimate the distribution of weakly perceived activity populations, this study accounts for the fact that populations across age groups typically exhibit distinct spatial distributions. Based on the characteristics of population activity patterns, the population was classified into seven age groups (C0–C6), and the proportion of each age group, $pop_{PC}$, was calculated using census data (Table 1). Based on the semantic recognition of the internal environment from the DEFF model [48], we classified buildings into three categories: residential buildings ($Bui_r$), educational office buildings ($Bui_e$), and commercial service buildings ($Bui_c$). The daytime activities of age groups C0–C4 (minors) are relatively simple, typically involving home and school activities, characterized by low frequency and short distances. Since the “nearby enrollment” policy governs school admission during China’s compulsory education phase, distance is the primary factor to consider [49]. Therefore, groups C0–C4 were assigned to the school nearest to their residence during school hours. During non-school hours, they were allocated to $Bui_r$ to represent their presence at home. For group C6 (older people), since their daily travel is predominantly non-commuting, characterized by high frequency, short duration, and short distance [50], they were primarily assigned to $Bui_r$, reflecting their home-based activities.
Based on actual population activity patterns, the population distribution during the strong-perception period should evolve from that during the weak-perception period. Therefore, the estimated nighttime steady-state population distribution at the building scale, obtained from the MDEFF model, was used as the initial population distribution state $pop_{a,n,k}$. Using the population proportions $pop_{PC}$ shown in Table 1, the weakly perceived activity populations within each building, $pop_{C,n,k}$, were then calculated according to Equation (4):
$$pop_{C,n,k} = pop_{a,n,k} \times pop_{PC} \tag{4}$$
In this equation, $pop_{a,n,k}$ represents the estimated population distribution during the weak-perception period for the $k$th building within $grid_n$, $pop_{PC}$ denotes the population proportion for each group, and $pop_{C,n,k}$ indicates the number of people of each group in the $k$th building within $grid_n$.
Finally, by considering the activity attributes of the weakly perceived activity populations outlined in Table 2, the populations present during the activity time $t_C$ were assigned to the corresponding school based on the distribution rules mentioned earlier. At other times, the populations were allocated to the nearest residential building $Bui_r$, resulting in the hourly distribution of the weakly perceived activity populations during the strong-perception period at the building scale, $pop_{w,n,k}^{t}$.
Population activities in the strong-perception period build upon the stable nighttime distribution. By accounting for the activity times and actual patterns of each group of weakly perceived activity populations, we calculated the weakly perceived activity populations during the strong-perception period. This calculation was part of the hourly building-scale population distribution estimation for daytime activities.
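The allocation rule for weakly perceived activity populations can be sketched as follows. The age-group proportions and school hours below are placeholders, since the actual values come from Tables 1 and 2; the function name and building records are hypothetical, and the sketch assumes that people who are not at school stay in their own residential building.

```python
from collections import defaultdict


def hourly_weak_population(buildings, proportions, school_groups,
                           school_hours, hours=range(8, 24)):
    """Hourly building-level counts of weakly perceived activity populations.

    buildings:     list of dicts with keys "id", "pop_night", "nearest_school".
    proportions:   {age_group: share of the nighttime population}, i.e. the
                   multiplier in Equation (4); placeholder values, not Table 1.
    school_groups: age groups that attend school during school_hours.
    school_hours:  (start_hour, end_hour), end exclusive; placeholder, not Table 2.
    """
    start, end = school_hours
    result = {h: defaultdict(float) for h in hours}
    for b in buildings:
        for group, share in proportions.items():
            count = b["pop_night"] * share            # Equation (4)
            for h in hours:
                at_school = group in school_groups and start <= h < end
                target = b["nearest_school"] if at_school else b["id"]
                result[h][target] += count
    return result


# Hypothetical residential building with 100 nighttime residents.
demo_buildings = [{"id": "R1", "pop_night": 100.0, "nearest_school": "S1"}]
demo = hourly_weak_population(
    demo_buildings,
    proportions={"C1": 0.1, "C6": 0.2},   # placeholder shares
    school_groups={"C1"},
    school_hours=(8, 17),                 # placeholder schedule
)
```

In this toy case the school-age group moves to the school building during the assumed school hours and returns home afterwards, while the older group stays in the residential building throughout.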

2.3. Strongly Perceived Activity Populations Distribution Estimation During the Strong-Perception Period by Integrating Population Activity Patterns and Spatial Environment Characteristics

Given the complexity of population activity patterns, using flow data to detect them is essential for obtaining accurate, high-spatiotemporal-resolution population distribution information. Sequence snapshot data aggregates and anonymizes individual location information, recording only the number of active users within specific spatial and temporal ranges. This data type is highly accessible and applicable in various scenarios. However, a key challenge with such data is the scale effect introduced during aggregation, which results in mixed activity patterns. To address this issue, our study proposed a method that leveraged the SOM algorithm and the functional purity of the spatial environment—effectively separating fundamental temporal features of population activities from mixed patterns in sequence snapshot data. This method enabled the perception of population activity patterns at the building level. Furthermore, by introducing constraints that relate population activity patterns to building types, we refined the high-spatiotemporal-resolution population distribution estimates for the strongly perceived activity populations during the strong-perception period.

2.3.1. Population Activity Patterns Extraction Based on the SOM Algorithm and Spatial Environment Functional Purity

The Self-Organizing Map (SOM) is an unsupervised learning algorithm [51] widely applied in clustering analysis due to its simplicity and effectiveness. It reduces the dimensionality of high-dimensional time series data and facilitates its visualization: by organizing the input data on a two-dimensional grid, SOM maps high-dimensional data into a lower-dimensional space while preserving the topological relationships among data points. This allows similar samples within the dataset to be placed adjacent to one another on the map, enabling practical cluster analysis. This study used SOM to cluster temporal feature vectors of population activity patterns, and then identified the initial population activity patterns using long-term population heatmap data at the grid scale. The overall workflow is illustrated in Figure 3. The theoretical foundation and detailed procedures of the SOM-based clustering approach have been thoroughly described in our previous work [22] and are not repeated here due to space constraints. However, unsupervised clustering methods like SOM consider only overall changes in population activity patterns within geographic units, leading to mixed clustering results that hinder the extraction of fundamental population activity patterns and their temporal characteristics. To address this issue, our study proposed a novel approach that integrated the spatial environment of small-scale grids to filter the clustering results. This method aimed to reduce the degree of pattern mixing in population activity sequences. First, the usable area (UA, unit: m²) of each building unit was computed to examine the relationship between building types and population activity patterns (Equation (5)).
$$UA = s \times h \tag{5}$$
where $s$ and $h$ represent, respectively, the effective floor area and the number of floors of the building.
To reduce disturbances caused by mixed patterns and to extract functionally distinct baseline activity curves, we define the Spatial Environment Functional Purity (SEFP) at the grid-scale, as expressed in Equation (6):
$$SEFP_{n,j} = \frac{\sum_{k} UA_{n,k,j}}{\sum_{k,j} UA_{n,k,j}}, \quad k \in f \tag{6}$$
where $SEFP_{n,j}$ denotes the spatial environment functional purity of population activity pattern $j$ within $grid_n$. The term $UA_{n,k,j}$ represents the usable area of building $k$ associated with activity pattern $j$ in $grid_n$. The set $f$ denotes all buildings within $grid_n$ associated with activity pattern $j$.
After obtaining the SOM clustering results, we select, across all clusters, the one with the highest SEFP (i.e., the highest proportion of a specific building type) as the baseline temporal signature for each population activity pattern. SEFP quantitatively represents the “functional control layer” through usable area proportions and can be regarded as an extension of the control layer in Dasymetric mapping to the task of selecting temporal endmembers. Choosing the cluster with the highest functional proportion as the baseline activity curve is effectively equivalent to extracting “endmembers” from mixed signals, thereby reducing collinearity and semantic ambiguity among temporal curves and providing a clear functional foundation for subsequent decomposition and allocation. Given that these fundamental activity patterns are expected to differ significantly at the building scale, we further analyzed their spatiotemporal characteristics by incorporating empirical knowledge of typical population activity schedules [14,16]. Based on this analysis, the corresponding population activity patterns were inferred, thereby enabling the perception of population activity modes and laying a foundation for subsequent high-spatiotemporal-resolution population distribution estimation.
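The selection step built on Equations (5) and (6) can be illustrated with a minimal Python sketch. It assumes, for simplicity, that the buildings of all grids in a SOM cluster are pooled before the area shares are computed; the cluster identifiers, pattern labels, and building values are hypothetical toy data rather than values from this study.

```python
from collections import defaultdict


def usable_area(floor_area_m2, floors):
    """Equation (5): usable area = effective floor area x number of floors."""
    return floor_area_m2 * floors


def sefp(buildings):
    """Equation (6): share of usable area held by each activity pattern.

    `buildings` is a list of (pattern, floor_area_m2, floors) tuples.
    Returns {pattern: share}; an empty dict if there is no usable area.
    """
    area_by_pattern = defaultdict(float)
    for pattern, floor_area, floors in buildings:
        area_by_pattern[pattern] += usable_area(floor_area, floors)
    total = sum(area_by_pattern.values())
    if total == 0:
        return {}
    return {p: a / total for p, a in area_by_pattern.items()}


def purest_cluster(clusters, pattern):
    """Return the cluster id whose SEFP for `pattern` is highest."""
    return max(clusters, key=lambda c: sefp(clusters[c]).get(pattern, 0.0))


# Hypothetical clusters: pooled buildings as (pattern, floor area, floors).
clusters = {
    "k_a": [("residential", 500.0, 10), ("commercial", 200.0, 5)],
    "k_b": [("residential", 100.0, 2), ("commercial", 400.0, 6)],
}

print(sefp(clusters["k_a"]))
print(purest_cluster(clusters, "residential"))
print(purest_cluster(clusters, "commercial"))
```

In this toy case the residential share of `k_a` is 5000/6000, so `k_a` would supply the residential baseline curve and `k_b` the commercial one.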

2.3.2. Strongly Perceived Activity Populations Distribution Estimation During the Strong-Perception Period Based on Population Activity Pattern Constraints

The Dasymetric mapping method [46] is a widely used approach for generating surface-based population representations, facilitating the transformation of census population data from administrative units to finer scales, such as building and grid levels. This process combines area weighting with empirical extraction techniques, making it popular for estimating high-spatiotemporal-resolution population distributions due to its simplicity and high accuracy. However, applying this method directly to calculate population at the building scale from grid-scale population data would yield the same proportional weights for different population activity patterns across various building types, thereby losing the valuable detail of population activity pattern perception. To address this issue, we proposed constraints on the Dasymetric mapping process based on population activity patterns. Firstly, we used a mixed-pixel linear decomposition method to extract the weights $\omega_n^j$ of various fundamental population activity patterns within each grid. This allowed the calculation of the population for each activity pattern $pop_{s,n,j}^{t}$ within the grid, which was the first constraint. The mapping process from grid-scale to building-scale population data involved transitioning from two-dimensional to three-dimensional space, necessitating consideration of both the area and height of the mapped units. Furthermore, building types within a grid influenced population activity patterns, with a one-to-one correspondence between building types and activity patterns (Table 3). This correspondence served as the second constraint. The “type-mapping + open/closed” constraints applied in the allocation stage operationalize two fundamental principles at the control level: activity–place matching and reachability–availability.
Specifically, home activity is mapped to residential buildings, work activity to educational and office buildings, and social activity to commercial and service facilities; moreover, no allocation is performed during periods when a facility is closed. These constraints prevent the population associated with a given activity pattern from being assigned to incompatible or unavailable places at the building scale, thereby reducing spurious hotspots and cold spots. By incorporating these constraints, we could achieve high-spatiotemporal-resolution population distribution estimation based on the perceptions of population activity patterns.
(1) Mixed Pixel Linear Decomposition Constraint
Initially, assuming that the population activity sampling values (e.g., Baidu heatmap values) were uniformly sampled across different grids, we could use Equation (7) to upsample the grid population from these population activity sampling values:
$$pop_{s,n}^{t} = \frac{\omega_{n}^{t}}{\sum_{n}^{N} \omega_{n}^{t}} \times pop_{s} \tag{7}$$
In Equation (7), $t$ represents an hourly interval within the strong-perception period (08:00–24:00), $pop_s$ denotes the number of the strongly perceived activity populations (young and middle-aged people) in the study area, $\omega_n^t$ is the sampling value weight of the $n$th grid $grid_n$ during $t$, and $N$ is the total number of grids in the study area. After upsampling, we obtained the strongly perceived activity populations $pop_{s,n}^{t}$ for $grid_n$ during $t$. Additionally, we calculated the dynamic population proportion $\omega_{n,j}^{t}$ by population activity patterns, as shown in Equation (8):
$$\omega_{n,j}^{t} = \frac{\omega_{n}^{j} \times PA_{j}^{t}}{\sum_{j=1}^{J} \omega_{n}^{j} \times PA_{j}^{t}} \tag{8}$$
In Equation (8), $\omega_{n,j}^{t}$ represents the weight of the $j$th population activity pattern in $grid_n$ during $t$ when distributing the total population, $\omega_n^j$ denotes the weight of the $j$th population activity pattern in $grid_n$, $PA_j^t$ signifies the temporal feature value of the $j$th population activity pattern during $t$, and $J$ denotes the number of population activity patterns in $grid_n$. Here $t \in [8, 23]$, representing the period from 8:00 a.m. to 12:00 a.m. at an hourly scale. After obtaining the dynamic population proportion, we further calculated the population count belonging to the $j$th population activity pattern in $grid_n$ during $t$, denoted as $pop_{s,n,j}^{t}$, based on the raw total number of strongly perceived activity populations $pop_{s,n}^{t}$ in $grid_n$ during $t$. As shown in Equation (9), this count served as the first constraint for estimating the population distribution of strongly perceived activity populations during the strong-perception period.
$$pop_{s,n,j}^{t} = \omega_{n,j}^{t} \times pop_{s,n}^{t} \tag{9}$$
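The chain from Equation (7) to Equation (9) can be traced with a small Python sketch. The heat values, static pattern weights, and temporal feature values below are invented for illustration only, and the function names are hypothetical; the zero-denominator fallbacks are a choice of this sketch, not something specified in the text.

```python
def upsample_grid_population(heat_values, total_population):
    """Equation (7): split the study-area total by each grid's heat share."""
    heat_sum = sum(heat_values)
    if heat_sum == 0:
        return [0.0 for _ in heat_values]
    return [total_population * h / heat_sum for h in heat_values]


def dynamic_proportions(static_weights, pattern_values):
    """Equation (8): pattern weights at hour t, renormalised to sum to 1."""
    products = [w * pa for w, pa in zip(static_weights, pattern_values)]
    denominator = sum(products)
    if denominator == 0:
        return [0.0 for _ in products]
    return [p / denominator for p in products]


def pattern_populations(grid_population, proportions):
    """Equation (9): population of each activity pattern in one grid."""
    return [grid_population * w for w in proportions]


# Hypothetical inputs for one hour t.
heat_t = [30.0, 10.0, 60.0]          # heat values of three grids
grid_pops = upsample_grid_population(heat_t, total_population=1000.0)

weights_grid0 = [0.5, 0.3, 0.2]      # static weights: home, work, social
pa_t = [0.4, 0.9, 0.8]               # temporal feature values at hour t
props = dynamic_proportions(weights_grid0, pa_t)
pops = pattern_populations(grid_pops[0], props)

print(grid_pops)
print(props)
print(pops)
```

Because the proportions from Equation (8) sum to one, the pattern populations in a grid always add back up to that grid's total from Equation (7).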
(2) Constraint of Correspondence Between Building Types and Population Activity Patterns
Firstly, considering that buildings can be either open or closed, we represented the open/close status of the $k$th building in $grid_n$ as a time-ordered vector $isopen_{n,k}$ with a length of 24. A value of 1 indicates that the building is open during that period, while 0 signifies that it is closed. For some buildings, open/close status information can be directly obtained from mapping services. For others, the open/close status was supplemented based on the land-use type of the building (Table 4). Subsequently, we used Equation (10) to calculate the spatial capacity of the $k$th building in $grid_n$ to accommodate the population during $t$, specifically related to the $j$th population activity pattern.
$$C_{n,k,j}^{t} = isopen_{n,k}[t] \times s_{n,k,j} \times h_{n,k,j}, \quad k \in f \tag{10}$$
In Equation (10), $isopen_{n,k}[t]$ denotes the open status of the $k$th building in $grid_n$ during $t$.
Using $C_{n,k,j}^{t}$, we further calculated the population count $pop_{s,n,k,j}^{t}$ for the $j$th population activity pattern in the $k$th building within $grid_n$, as expressed in Equation (11).
$$pop_{s,n,k,j}^{t} = \frac{C_{n,k,j}^{t}}{\sum_{k=1}^{f} C_{n,k,j}^{t}} \times pop_{s,n,j}^{t} \tag{11}$$
Then the number of strongly perceived activity populations $pop_{s,n,k}^{t}$ in the $k$th building in $grid_n$ during $t$ is:
$$pop_{s,n,k}^{t} = \sum_{j=1}^{J} pop_{s,n,k,j}^{t} \tag{12}$$
where $J$ represents the number of population activity patterns in the $k$th building, and its value range is [0, 3].
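A minimal Python sketch of the second constraint follows, covering the capacity term of Equation (10) and the capacity-weighted allocation of Equation (11). The dictionary keys, opening hours, and building dimensions are hypothetical. The sketch also assumes that when every compatible building is closed, nothing is allocated for that pattern in that hour.

```python
def capacity(is_open, hour, floor_area_m2, floors):
    """Equation (10): open flag (0 or 1) x floor area x number of floors."""
    return is_open[hour] * floor_area_m2 * floors


def allocate_to_buildings(pattern_population, buildings, hour):
    """Equation (11): split one pattern's grid population by capacity share.

    `buildings` holds only the buildings whose type matches the pattern.
    """
    caps = [
        capacity(b["is_open"], hour, b["floor_area_m2"], b["floors"])
        for b in buildings
    ]
    total = sum(caps)
    if total == 0:
        # Assumption of this sketch: no open building, so no allocation.
        return [0.0] * len(buildings)
    return [pattern_population * c / total for c in caps]


def building_total(per_pattern_counts):
    """Equation (12): sum the counts of all patterns hosted by a building."""
    return sum(per_pattern_counts)


# Hypothetical buildings compatible with one activity pattern.
office_hours = [1 if 8 <= h < 18 else 0 for h in range(24)]
always_open = [1] * 24
buildings = [
    {"is_open": office_hours, "floor_area_m2": 400.0, "floors": 10},
    {"is_open": always_open, "floor_area_m2": 200.0, "floors": 5},
]

at_10 = allocate_to_buildings(100.0, buildings, hour=10)
at_20 = allocate_to_buildings(100.0, buildings, hour=20)
print(at_10)
print(at_20)
```

At 10:00 both toy buildings are open and share the 100 people in proportion to usable area; at 20:00 the first building is closed, so the whole count goes to the second.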
During the strong-perception period, the population engages in daytime activities based on their stable nighttime distribution. Using the extracted basic population activity patterns and the two constraints proposed in this study, we calculated the strongly perceived activity populations $pop_{s,n,k}^{t}$ during this period. This was part of estimating the hourly building-scale population distribution during the day. By adding the weakly perceived activity populations $pop_{w,n,k}^{t}$, we obtained the daytime hourly building-scale population distribution $e\_pop_{n,k}^{t}$ during the strong-perception period (Equation (13)). By further incorporating the population during the weak-perception period $d\_pop_{n,k}$, we achieved the final estimation of the hourly building-scale population distribution with high spatiotemporal resolution $bh\_pop_{n,k}^{t}$ (Equation (14)).
$$e\_pop_{n,k}^{t} = pop_{w,n,k}^{t} + pop_{s,n,k}^{t} \tag{13}$$
$$bh\_pop_{n,k}^{t} = e\_pop_{n,k}^{t} + d\_pop_{n,k} \tag{14}$$
This study proposed the SWPP-HSTPE method to estimate hourly population distribution at the building scale. First of all, based on the perception of population activity time, we calculated the dual-environmental features of buildings within small grid units and constructed the MDEFF model to estimate the stable nighttime population distribution during the weak-perception period (00:00–08:00). For the strong-perception period (08:00–24:00), we estimated the building-scale population distribution by considering population attributes and activity patterns. Initially, we estimated the distribution for weakly perceived activity populations (minors and older people) based on activity time, range, and frequency. For strongly perceived activity populations (young and middle-aged people), we extracted the basic population activity patterns using the SOM algorithm and spatial environmental function purity to decompose the mixed temporal features of population activities. We then constrained the Dasymetric mapping process by the proportion of basic population activity patterns and the mapping relationship between building types and activity patterns, achieving the estimation of the population distribution for strongly perceived activity populations during the strong-perception period (08:00–24:00). Finally, by integrating the population distributions across different perception periods and activity populations, we obtained the hourly building-scale population distribution.

3. Study Area and Data Sources

3.1. Study Area

Wuhan, located in Hubei Province, is a naturally multi-centered city, with each district developing its unique characteristics and directions based on local conditions. Among them, Jianghan and Wuchang Districts are core areas located on either side of the Yangtze River’s central axis in Wuhan, forming the city’s central urban area (Figure 4). Jianghan District covers an area of 28.29 square kilometers and has a permanent population of 647,900 as of 2022. The district’s GDP reached 151.2 billion RMB in 2022, ranking first among Wuhan’s districts in economic density, population density, and output per unit area. Jianghan District is known for its thriving commercial, trade, and financial sectors, and it has also developed a cultural tourism industry based on its unique historical and cultural heritage. Wuchang District, the core area south of the Yangtze River in Wuhan, is renowned for its rich history, beautiful natural environment, abundant tourism resources, and strong educational and research capabilities. As of now, Wuchang District has achieved 100% urbanization, covering an area of 107.76 square kilometers, with a permanent population of 1.2705 million in 2022 and a relatively high population density. As typical representative areas of rapid urbanization in Wuhan, both Jianghan District and Wuchang District exhibit high population density and complex population dynamics. Additionally, due to their distinct functional roles and economic backgrounds, the population activity patterns in these two districts show significant differences. Therefore, focusing on Jianghan District and Wuchang District as the study areas is advantageous for conducting refined population distribution research. The findings will provide a theoretical foundation and practical guidance for achieving intelligent urban resource allocation and enhancing refined urban management, including emergency disaster response.

3.2. Data Sources

3.2.1. Data Introduction

The data used in this study includes various types, such as foundational geographic data, social perception data, and related validation data, as detailed in Table 5.
Regarding foundational geographic data, the Seventh National Population Census of Wuhan (“Seventh Census”) provides detailed records of the population size, density, and age distribution across districts, subdistricts, and communities in Wuhan as of the end of 2020. Typically, census data on the resident population represents the population during the nighttime rest period and is therefore used as the statistical ground truth for population distribution during the weak-perception period. Building data, which records the actual area and number of floors of each building, offers the smallest unit of analysis for population distribution. As a typical unit for urban function classification, land-use data provides rich environmental information for identifying building functions. Road network data, which contains crucial information such as road classifications, speed limits, and road names, provides the spatial structure necessary to construct the building environment.
Regarding social perception data, Baidu heatmap, a big data application with hundreds of millions of users, calculates population heat values for different times and regions based on users’ location information when they access Baidu products (such as Baidu Maps, Baidu Search, Baidu Music, and Baidu Translate). The heat value therefore reflects the frequency with which users trigger Baidu products, and it is a commonly used type of sequence-snapshot data [52]. The values are aggregated and anonymised on the platform from multi-source location request data, and the output is a spatiotemporal density index that does not contain individual trajectories or user identifiers. These data accurately reflect urban activity hotspots and densely populated areas, making them ideal for modeling population activity patterns and distribution during the strong-perception period in this study. POI data provide potential population distribution information for static population estimation. Compared to POI data, area of interest (AOI) data offers more precise spatial and temporal information. Mobile phone location data can reflect high-spatiotemporal-resolution population distribution information, but its small data volume and lower timeliness limit its application. Therefore, in this study, mobile phone location data collected during the weak-perception period were used only as one of the environmental indicators in the population distribution estimation model. The Luojia-01 nighttime light data, with a spatial resolution of 130 m, is commonly used to reflect urban nighttime activities and development status; it was applied here to estimate population distribution during the weak-perception period.
The choice of 2018 Luojia-01 nighttime light data, rather than data from other years or other sources (such as the National Polar-orbiting Partnership–Visible Infrared Imaging Radiometer Suite (NPP-VIIRS) or the Defense Meteorological Satellite Program–Operational Linescan System (DMSP-OLS)), was based on a comprehensive assessment of data strengths and limitations as well as interannual stability. The most prominent advantage of Luojia-01 lies in its high spatial resolution of 130 m, which is substantially finer than that of NPP-VIIRS and DMSP-OLS (approximately 500 m). However, due to constraints related to satellite lifespan and operational stability, only the 2018 Luojia-01 nighttime light data are currently available. For building-level population estimation, higher spatial resolution is critical, as it allows a more detailed representation of intra-urban variations in light intensity across different functional parcels, thereby effectively reducing signal distortion caused by spatial mixing. To verify the validity of using 2018 data to represent the spatial pattern in 2023, we applied the autoencoder-based cross-sensor calibration approach proposed by Chen et al. [53]. We conducted a pixel-level Pearson correlation analysis between the 2018 and 2023 NPP-VIIRS-like annual composite images for the study area. The results show a high correlation coefficient of 0.8925 between the two years, indicating that the relative spatial pattern of nighttime light intensity remained highly stable over this period. This strong spatial consistency provides robust empirical support for prioritizing the higher-resolution 2018 Luojia-01 nighttime light data, which better serve the objectives of this study, despite the temporal mismatch.
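For readers who wish to reproduce this kind of stability check, a pixel-level Pearson correlation can be sketched in a few lines of Python. The sketch assumes two co-registered rasters that have already been flattened into equally long lists of pixel values; the sample values are invented and do not come from the nighttime light composites used here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences of pixel values."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant image")
    return cov / (norm_x * norm_y)


# Hypothetical flattened pixel values for two years.
pixels_year_a = [1.0, 2.0, 3.0, 4.0]
pixels_year_b = [2.0, 4.0, 6.0, 8.0]
print(pearson_r(pixels_year_a, pixels_year_b))
```

Because the coefficient is invariant to linear rescaling of either image, it measures agreement in relative spatial pattern rather than in absolute brightness.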
Regarding validation data, the WorldPop dataset has a spatial resolution of 100 m and is primarily generated from nighttime lights and land-use data, offering high spatial resolution and population-fitting accuracy [9]. The GHS-POP dataset has a spatial resolution of 250 m and is generated from high-resolution satellite imagery, census data, and other socioeconomic information. The LandScan dataset has a spatial resolution of 1000 m and is part of the global population project developed by the Oak Ridge National Laboratory in the United States [10]. It utilizes an innovative approach combining Geographic Information Systems (GIS) and remote sensing to develop population allocation algorithms that adapt to varying data conditions and regional characteristics. These three datasets are widely used in studies of urban population spatiotemporal distribution and provide support for evaluating the overall effectiveness of the experiments in this study. Anjuke’s community data primarily includes the number of households in each community and is used to validate the population distribution estimates during the weak-perception period. Street-level employed population data is used to validate the population distribution estimates during the strong-perception period. To provide more direct and reliable validation of the estimation results during the strong-perception period, this study also incorporates hourly population data at approximately 200 m grid resolution from Baidu Huiyan. Although this dataset originates from the same source as the Baidu Heatmap data used for model construction, it has undergone extensive official post-processing by Baidu: multi-source demographic statistics are used to calibrate and reallocate the raw activity signals, yielding outputs that can be regarded as a high-credibility proxy for ground truth.
In this study, we use this dataset as an independent external reference to quantitatively evaluate and compare, at the grid-scale, the performance of the proposed main model (SWPP-HSTPE) with the baseline model, thereby enabling rigorous horizontal (cross-model) validation.

3.2.2. Temporal Coordination of Multi-Year Data

To avoid biases introduced by cross-year datasets and to ensure reproducibility, we adopt the following temporal harmonization strategy:
(1) Census-based calibration.
The 2020 census population is used as the baseline ground truth for nighttime steady-state calibration during the weak-perception period. Building data and land-use data provide slowly varying physical constraints that define spatial carrying capacity. The 2023 Baidu heatmap is used solely to learn the temporal shapes of activity patterns rather than to fit the 2020 population magnitudes.
(2) Stability assumptions.
Building stocks and dominant urban functions in the study areas (Jianghan and Wuchang) remained broadly stable from 2018 to 2023. Functional spaces reflected by POIs and AOIs changed only gradually at the interannual scale. Luojia-01 nighttime lights serve as a proxy for nighttime steady-state activity and development intensity.
(3) Scale-free normalization.
Source datasets from different years are normalized at the grid level, and Baidu heatmap data are normalized across hourly sequences. This ensures that only relative shapes are learned, minimizing the influence of year-to-year differences and variations in overall population scale.
(4) No leakage.
Nighttime baselines (2020) and daytime pattern learning (2023) are decoupled in their objectives: the former supports steady-state calibration during the weak-perception period, while the latter provides temporal weights during the strong-perception period. No numerical information from 2023 is used to calibrate or “backfill” the 2020 totals.
(5) Robustness basis.
Existing studies have demonstrated that, within urban core areas, VIIRS and higher-resolution nighttime light products (including Luojia-01) exhibit strong consistency in spatial patterns and brightness distributions [54]. Building on this evidence, our study further mitigates cross-year discrepancies by applying feature standardization and learning only the relative temporal shapes of activity patterns.

4. Experimental Results

4.1. Results of Population Activity Patterns Perception

4.1.1. Results of Population Activity Patterns Extraction Based on the SOM Algorithm and Spatial Environment Functional Purity

We clustered the Baidu heatmap data using the SOM clustering algorithm for weekdays (Monday to Friday). The clustering performance was evaluated using the Mean Quantization Error (MQE). The MQE gradually decreased through multiple iterations, indicating that the SOM model increasingly improved in data mapping. As shown in Figure 5, the MQE decreases progressively as the number of neuron nodes within the SOM network plane increases. It becomes stable when X = Y = 5, yielding 25 distinct temporal patterns of population activity at the grid-scale on weekdays. Beyond this point, further increasing the grid size yields only marginal gains. When the number of neurons is smaller than 5 × 5 (e.g., 4 × 4), the MQE is substantially higher, indicating a clear risk of underfitting. In contrast, once the neuron number reaches or exceeds 5 × 5, the rate of MQE reduction markedly slows: the difference between 5 × 5 and 6 × 6 is far smaller than that between 4 × 4 and 5 × 5. This pattern indicates that model performance becomes relatively insensitive to the exact neuron count near a 5 × 5 configuration. Balancing estimation error against model complexity, we therefore selected a 5 × 5 grid as the optimal SOM structure. Under this configuration, the SOM was trained with the following hyperparameter settings: 1000 training iterations; linear initialization based on principal component analysis (PCA) to accelerate convergence; a linearly decaying learning rate, decreasing from an initial value of 0.5 to approximately 0.01 at the end of training; and a Gaussian neighborhood function, with the neighborhood radius linearly shrinking from covering most of the SOM grid to including only the winning neuron and its immediate neighbors. In the 5 × 5 SOM configuration, the model was independently trained with 10 random seeds. 
The resulting MQE shows a mean value of 0.1333 with a standard deviation of 0.0045, indicating that the clustering results are insensitive to random initialization and thus robust.
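To make the training settings and the MQE criterion concrete, the following pure-Python sketch trains a small SOM and reports its MQE. It is an illustration under simplifying assumptions rather than the implementation used in this study: the weights are initialized randomly instead of by PCA-based linear initialization, the neighborhood radius decays linearly between two values chosen for this sketch (`sigma0` and `sigma_end`), and the 24-hour input curves are synthetic.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the neuron whose weight vector is closest to sample x."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))


def mean_quantization_error(weights, data):
    """MQE: mean distance between each sample and its best-matching neuron."""
    total = sum(math.dist(weights[best_matching_unit(weights, x)], x) for x in data)
    return total / len(data)


def train_som(data, rows=5, cols=5, iterations=1000,
              lr0=0.5, lr_end=0.01, seed=0):
    """Online SOM training with linearly decaying learning rate and radius."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Assumption: random initialisation (the text uses PCA-based linear init).
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    sigma0, sigma_end = max(rows, cols) / 2.0, 1.0  # assumed radius schedule
    for it in range(iterations):
        frac = it / max(iterations - 1, 1)
        lr = lr0 + (lr_end - lr0) * frac
        sigma = sigma0 + (sigma_end - sigma0) * frac
        x = data[rng.randrange(len(data))]
        bmu_row, bmu_col = divmod(best_matching_unit(weights, x), cols)
        for idx, w in enumerate(weights):
            row, col = divmod(idx, cols)
            grid_dist_sq = (row - bmu_row) ** 2 + (col - bmu_col) ** 2
            # Gaussian neighbourhood centred on the winning neuron.
            influence = math.exp(-grid_dist_sq / (2.0 * sigma ** 2))
            for d in range(dim):
                w[d] += lr * influence * (x[d] - w[d])
    return weights


# Synthetic 24-hour curves: a daytime-heavy and a nighttime-heavy profile.
noise = random.Random(1)
day = [1.0 if 8 <= h < 18 else 0.2 for h in range(24)]
night = [0.3 if 8 <= h < 18 else 0.9 for h in range(24)]


def noisy(curve):
    return [min(1.0, max(0.0, v + noise.uniform(-0.05, 0.05))) for v in curve]


data = [noisy(day) for _ in range(20)] + [noisy(night) for _ in range(20)]
som = train_som(data)
print(mean_quantization_error(som, data))
```

Repeating the call with different `rows`, `cols`, and `seed` values gives the kind of MQE comparison described above for choosing the map size and checking sensitivity to initialization.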
These 25 temporal patterns were then averaged and normalized, yielding the 25 population-activity temporal feature curves shown in Figure 6, labelled k1–k25. Notably, these curves exhibit temporal similarities, suggesting that differences in population activity patterns over time are not pronounced. This indicates that some level of pattern mixing may still exist within the extracted 25 population activity patterns.
To further explore the spatial distribution characteristics of population activity patterns, we statistically analyzed the usable area of buildings. Since POI data lacks height information, we analyzed population activity pattern clusters based on proportions of different building types by usable area. As shown in Figure 7, the usable area proportions of $Bui_r$ and $Bui_c$ are relatively high, while $Bui_e$ has a lower usable area proportion. This pattern directly relates to the commercial and residential planning in Jianghan and Wuchang Districts. Combining this with the analysis in Figure 6, it becomes apparent that clusters with similar proportions of building functions exhibit more similar temporal feature curves for population activity patterns. This indicates that differences in the spatiotemporal distribution of population activity patterns are less pronounced when the proportions of building functions are similar. For instance, the clusters k15, k20, and k25 have the following proportions of usable area: 84:6:10, 88:5:7, and 82:9:9, respectively. These clusters show the highest proportion of residential land-use within buildings, with minimal commercial and educational functions. In contrast, clusters k16 and k21 have proportions of 17:8:75 and 14:5:81, respectively, indicating the highest commercial service function within buildings. Clusters k3 and k5 have proportions of 63:14:23 and 65:10:25, respectively, indicating a high proportion of residential land-use and considerable educational office functions. The normalized grid heat curves and mean curves for these three groups of clusters, as shown in Figure 8, exhibit similar trends within each group. This points to a strong correlation between urban building functions and population activity patterns: the more similar the building-function composition, the more similar the population activity patterns.
Therefore, it can be reasonably inferred that as the spatial environment becomes increasingly homogeneous (i.e., as the Spatial Environment Functional Purity, SEFP, increases), the extracted population activity patterns in such environments more closely resemble basic functional curves.
Based on the analysis in Figure 7 of spatial environment functional purity (SEFP), it is evident that k21 exhibits the highest proportion of $Bui_c$ usable area, indicating that its spatial environment has the highest $SEFP$ of $PA_c$. Similarly, k20, with a $Bui_r$ usable area proportion as high as 95%, demonstrates the highest $SEFP$ of $PA_r$. For k4, the proportion of $Bui_e$ usable area is highest, suggesting that its spatial environment has the highest $SEFP$ of $PA_e$. Therefore, using $SEFP$ as a basis, the temporal characteristics of the population activity patterns for k21, k20, and k4 can be identified as the fundamental patterns corresponding to urban commercial services, residential functions, and educational office functions, respectively.
Figure 9 illustrates the distinct temporal characteristics of the three fundamental population activity patterns. A detailed analysis of these patterns reveals the following trends: (1) 0:00–7:00: All three activity patterns show low activity intensity during the night, with k20 having the highest intensity, followed by k21 and then k4. (2) 7:00–17:00: The activity intensity of k20 remains at about 50% of its evening level, with a slight peak around noon. Meanwhile, the activity intensity of k4 and k21 increases at 7:00, with k21 rising slightly faster than k4. By 9:00, their intensities are nearly identical, but k4 gradually surpasses k21. The activity intensity of k21 stabilizes with peaks around 12:00 and 18:00, while k4 peaks at 10:00 and 16:00. Additionally, from 12:00 to 15:00, the activity intensities follow the k21 > k4 > k20 pattern. (3) After 17:00, both k4 and k21 show a declining trend in activity intensity, while k20 gradually increases, reaching its daily peak at 22:00 before declining. By 23:00, the intensity ranking becomes k20 > k21 > k4. Based on these observations, it is evident that k20 corresponds to the home population activity pattern ($PA_r$), characterized by a “stable during the daytime, rising at nighttime” trend. k4 represents the work population activity pattern ($PA_w$), with a noticeable double-peak at 10:00–11:00 and 16:00–17:00, where daytime activity intensity is generally higher than nighttime activity intensity. Finally, k21 represents the social population activity pattern ($PA_s$), where daytime activity remains high, with peaks slightly later than $PA_w$, and nighttime activity is higher than $PA_w$ but lower than $PA_r$.

4.1.2. Validation of Population Activity Patterns Based on Mixed Decomposition Linear Model

This study applied a hybrid decomposition linear model to decompose the temporal characteristic curves of 25 distinct population activity patterns and the three fundamental population activity patterns. The effectiveness of the decomposition is evaluated using the distribution of overall $R^2$ and $RMSE$. Figure 10 illustrates the distribution of $R^2$ and $RMSE$ across 3725 grid cells for the linear decomposition results of different population activity patterns during weekdays. When conducting the hybrid linear decomposition based on all 25 population activity patterns, the median $R^2$ is 0.804, and the median $RMSE$ is 0.130. The hybrid linear decomposition based on the three fundamental population activity patterns yields a median $R^2$ of 0.743 and a median $RMSE$ of 0.149. The slight difference between the two sets of results indicates that the fit and predictive accuracy of the population activity models derived from the 25 and three fundamental patterns are similar and consistently high. This finding demonstrates the effectiveness of the three fundamental population activity patterns in representing the spatiotemporal characteristics of population activities.
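The two goodness-of-fit measures can be computed per grid cell as in the following Python sketch. It assumes the pattern weights of a grid have already been estimated by the decomposition and only evaluates how well the weighted sum of baseline curves reproduces the observed curve; the four-hour curves and weights are invented for illustration.

```python
import math


def mix(weights, endmembers):
    """Weighted sum of baseline (endmember) curves, hour by hour."""
    hours = len(endmembers[0])
    return [sum(w * e[h] for w, e in zip(weights, endmembers)) for h in range(hours)]


def r2_and_rmse(observed, predicted):
    """Coefficient of determination and root-mean-square error of one fit."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return r2, rmse


# Hypothetical baseline curves (home, work, social) over four hours.
endmembers = [
    [0.2, 0.4, 0.6, 1.0],
    [0.1, 0.9, 0.8, 0.2],
    [0.3, 0.7, 1.0, 0.6],
]
weights = [0.5, 0.3, 0.2]

predicted = mix(weights, endmembers)
observed = list(predicted)
observed[0] += 0.1  # perturb one hour so the fit is no longer exact

r2, rmse = r2_and_rmse(observed, predicted)
print(r2, rmse)
```

Collecting these two values over all grid cells and taking their medians gives summary statistics of the kind reported above.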

4.2. Results of High-Spatiotemporal-Resolution Population Distribution Estimation

4.2.1. Results of Overall Population Distribution Estimation in the Study Area

The analysis of the population proportions associated with different activity patterns during various periods within the study area is presented in Table 6. The period $t$ denotes the interval from $t$ to $t$ + 1 h. For example, period 8 refers to 8:00–9:00. The analysis reveals that from 8:00 onward, the proportion of the population engaged in work activities ($pop_w$) gradually increases from 8.65%, reaching dual peaks around 11:00–12:00 and 16:00–17:00, approaching 30%. After 18:00, the proportion begins to decline, aligning with the typical work and rest periods of the working population. The proportion of the population engaged in social activities ($pop_s$) stabilizes at around 40% beginning at 10:00. It only starts to decrease after 19:00. The higher proportion of the social population during the day compared to the work population is attributed to two main factors. On the one hand, the Jianghan and Wuchang districts host numerous renowned attractions and well-developed tourism infrastructure, leading many people to engage in tourism activities even on weekdays. On the other hand, the robust economies of these districts attract many people for shopping, entertainment, and other social activities. The proportion of the population engaged in home activities ($pop_r$) exceeds 90% during the weak-perception period but drops to about 35% during the strong-perception period. This inverse relationship with $pop_w$ and $pop_s$ aligns with typical human behavior, characterized by resting at night and being active during the day. The main reason the proportion of the home population during the daytime strong-perception period remains high is that this study considers only home behavior among older people. The aging population in Jianghan and Wuchang, where the population of older people approaches 20%, further supports this finding.
Because population proportions are similar between adjacent periods, Figure 11 displays building-scale population distribution maps for the weak-perception period (00:00–08:00) and four strong-perception periods at 4 h intervals (10:00–11:00, 14:00–15:00, 18:00–19:00, 22:00–23:00) to illustrate the overall changes more clearly. Figure 11a,e show similar population distributions, as both periods have the highest proportion of $pop_r$. Starting in the morning, $PA_s$ and $PA_w$ activities begin, with concentration hotspots emerging in some areas by 11:00. Notably, region A, a commercial area centered around the Jianghan Road pedestrian street, and region B, an office area centered around Hongshan Plaza, continue to show high population activity into the evening. Population activity patterns then shift to $pop_r$, with increased population in residential areas near Nanhu Street, such as region C, after 19:00, leading to a more stable overall population distribution by night. Figure 11 illustrates that the building-scale hourly population distribution estimation from this study effectively captures both the fine-scale dynamics of the population and its spatial aggregation. From 10:00 onward, the $PA_s$ and $PA_w$ populations begin to concentrate, particularly in large commercial and office areas, and remain concentrated until around 19:00. Afterwards, the concentration dissipates, and the population distribution stabilizes.

4.2.2. Results of Population Distribution Estimation for Typical Regions

Further quantitative observations of population activity patterns were conducted in sample regions A, B, and C. Figure 12A1 depicts region A, centered around the Jianghan Road commercial district. This area, which combines strong commercial functions with transportation functions owing to the nearby Jianghan Road subway station, shows the highest proportion of $pop_s$, with peak hours at 9:00 and 19:00, consistent with its role as a transportation hub and commercial hotspot (Figure 12A2). Figure 12B1 illustrates a district primarily composed of government office buildings, such as the Hubei Provincial Railway Bureau. In this region, the proportion of $pop_w$ is the highest, followed by $pop_s$, with a minimal proportion of $pop_r$, indicating a clear work-oriented pattern (Figure 12B2). Figure 12C1 shows the built environment near the Xinxinjiayuan community in Baishazhou. This residential area, with some life service facilities, exhibits a pattern in which $pop_r$ decreases during the strong-perception period and rises during the weak-perception period. The analysis of these three sample regions reaffirms the inherent correlation between a region’s diverse functions and its dynamic population structure. Specifically, areas with commercial functions show stronger social activity patterns, with bimodal variation. In buildings designated for educational or office use, the work activity pattern dominates, followed by social activity, with home activity least prevalent. Conversely, home activity patterns are predominant in residential areas, with a noticeable reduction in the daytime strong-perception period compared to the nighttime weak-perception period.

4.3. Validation of the Results of High-Spatiotemporal-Resolution Population Distribution Estimation

4.3.1. Validation of the Results of Population Distribution Estimation During the Weak-Perception Period

To ensure consistency in the validation scale, the building-level estimates for the weak-perception period were aggregated to the residential-compound level for comparison. Specifically, only residential-use buildings were included, and the spatial linkage between buildings and residential compounds was established using the “point-in-polygon” test between each building’s centroid and the compound AOI. Thus, the unit of analysis for validation is the residential compound rather than individual buildings, aligning the evaluation with the spatial unit of external household statistics and reducing biases introduced by mixed-use buildings.
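The aggregation step described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the record fields (`centroid`, `use`, `pop`), the compound identifiers, and the toy coordinates are all hypothetical, and a plain ray-casting test stands in for whatever GIS routine was actually used.

```python
from collections import defaultdict


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon,
    given as a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Count crossings of a horizontal ray cast to the right of the point.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def aggregate_to_compounds(buildings, compounds):
    """Sum residential building estimates per compound AOI.
    Buildings whose centroid falls in no compound are skipped."""
    totals = defaultdict(float)
    for b in buildings:
        if b["use"] != "residential":
            continue  # only residential-use buildings are validated
        cx, cy = b["centroid"]
        for compound_id, polygon in compounds.items():
            if point_in_polygon(cx, cy, polygon):
                totals[compound_id] += b["pop"]
                break
    return dict(totals)


# Hypothetical toy data: two square compounds and five buildings.
compounds = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(20, 0), (30, 0), (30, 10), (20, 10)],
}
buildings = [
    {"centroid": (2, 3), "use": "residential", "pop": 120.0},
    {"centroid": (8, 8), "use": "residential", "pop": 80.0},
    {"centroid": (5, 5), "use": "commercial", "pop": 40.0},
    {"centroid": (25, 5), "use": "residential", "pop": 60.0},
    {"centroid": (15, 5), "use": "residential", "pop": 30.0},
]
compound_pop = aggregate_to_compounds(buildings, compounds)
```

In this toy case the commercial building and the residential building lying outside both AOIs are excluded, so the compound totals reflect residential buildings only.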
For nighttime validation, the 2024 Anjuke household counts for residential compounds were used as an external reference. Household counts are highly correlated with the nighttime resident population in urban residential areas and therefore serve as an appropriate proxy for assessing consistency. To avoid systematic biases arising from variations in average household size or occupancy rates, the household counts were not converted directly into population estimates. Instead, we used Pearson’s correlation coefficient at the compound scale to evaluate consistency, and results were stratified by year of construction to partially control for differences in occupancy rates. Occupancy rates vary by year of construction, but this does not affect the evaluation based on correlation coefficients. We calculated Pearson correlation coefficients between the actual number of households in each residential area and three sets of population estimates: those derived from the MDEFF model during the weak-perception period, the WorldPop dataset, and the GHS-POP dataset. The results are shown in Figure 13.
The correlation coefficient between the population estimated by the MDEFF model and the actual number of households is 0.72. This value is 0.157 higher than that of the WorldPop dataset and 0.133 higher than that of the GHS-POP dataset. The WorldPop dataset exhibits significant underestimation in some buildings. This underestimation is likely due to its top-down approach, which allocates a portion of the population to non-residential buildings within built-up areas, thereby reducing the population assigned to actual residential buildings. In contrast, the MDEFF model accounts for both the building’s internal and its surrounding environments, leading to more accurate population distribution estimates.
Given that residential occupancy rates vary by year of construction, it is unrealistic to expect 100% occupancy, which could lead to discrepancies between the actual population and the number of households. To eliminate the influence of building age on the validation process, we selected 249 residential areas built before 2014 and 164 areas built before 2004. We then calculated the correlation between the number of households and the estimated population during the weak-perception period for each construction period. The results are presented in Table 7.
As shown in Table 7, for all three datasets the correlation with the number of households is lower for residential areas built before 2024 than for the older subsets. This suggests that the average occupancy rate in more recently constructed areas may be lower, leading to discrepancies between the number of households and the actual population and thereby affecting the accuracy of the validation results. Specifically, the correlation between the MDEFF population estimates and household counts is 0.720, which is lower than the correlations for areas built before 2014 (0.759) and before 2004 (0.807). For the MDEFF model, older residential areas therefore tend to have higher correlations. Similarly, the correlation for the WorldPop dataset is 0.563, lower than that for areas built before 2014 (0.613) and before 2004 (0.606). The GHS-POP dataset shows a correlation of 0.587, which is lower than that for areas built before 2014 (0.624) and before 2004 (0.600). Interestingly, the WorldPop and GHS-POP datasets exhibit higher correlations for areas built before 2014 than for those built before 2004, likely due to differences in the algorithms and empirical data used to create them. Across different construction years, the MDEFF model consistently outperforms the WorldPop and GHS-POP datasets, underscoring its advantage in estimating population distribution during the weak-perception period.
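The stratified comparison can be sketched as below. The record fields (`built`, `households`, `est_pop`) and the toy values are hypothetical and do not reproduce Table 7; the sketch only shows how a Pearson coefficient is recomputed on subsets defined by a construction-year cutoff.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def stratified_r(records, cutoffs):
    """Correlation between household counts and estimated population
    for compounds built before each cutoff year."""
    result = {}
    for cutoff in cutoffs:
        subset = [r for r in records if r["built"] < cutoff]
        result[cutoff] = pearson(
            [r["households"] for r in subset],
            [r["est_pop"] for r in subset],
        )
    return result


# Hypothetical compounds: newer ones deviate more from the linear trend,
# mimicking lower occupancy in recently built areas.
records = [
    {"built": 1998, "households": 100, "est_pop": 300.0},
    {"built": 2001, "households": 200, "est_pop": 600.0},
    {"built": 2003, "households": 300, "est_pop": 900.0},
    {"built": 2010, "households": 400, "est_pop": 1000.0},
    {"built": 2020, "households": 500, "est_pop": 900.0},
]
r_by_cutoff = stratified_r(records, cutoffs=[2004, 2014, 2024])
```

Because each cutoff defines a nested subset, the coefficients are not independent of one another; they are best read as a sensitivity check on occupancy effects rather than as separate tests.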

4.3.2. Validation of the Results of Population Distribution Estimation During the Strong-Perception Period

(1)
Comparative Validation Based on the Ablation Model
To quantitatively assess the incremental value of the proposed pattern–place matching constraint, we constructed an ablation model for comparison. This ablation model shares the same activity-pattern decomposition pipeline as the main model (SWPP-HSTPE): specifically, it uses SOM and SEFP to extract three fundamental activity patterns from population sampling data and, after linear decomposition, derives the numbers of p o p r , p o p w , and p o p s within each grid. The key difference lies in the final Dasymetric mapping stage. In the ablation model, the building-type–activity-pattern matching constraint is removed, and the decomposed activity-specific populations within each grid are allocated to all buildings in that grid solely in proportion to their usable area.
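The difference between the two allocation rules can be illustrated with a short sketch. The pattern names, building-type labels, and the matching table `MATCH` below are hypothetical placeholders chosen for illustration; the actual pattern-to-building-type mapping is defined in the method section and may differ.

```python
def allocate(grid_pops, buildings, allowed_types=None):
    """Distribute each activity-specific population of one grid across its
    buildings in proportion to usable area.

    allowed_types maps a pattern to the set of building types that may
    receive it (main-model style). With allowed_types=None every building
    is eligible (ablation style). If no building is eligible for a pattern,
    that population stays unassigned in this sketch.
    """
    result = {b["id"]: {} for b in buildings}
    for pattern, pop in grid_pops.items():
        eligible = [
            b for b in buildings
            if allowed_types is None or b["type"] in allowed_types[pattern]
        ]
        eligible_ids = {b["id"] for b in eligible}
        total_area = sum(b["area"] for b in eligible)
        for b in buildings:
            if b["id"] in eligible_ids and total_area > 0:
                result[b["id"]][pattern] = pop * b["area"] / total_area
            else:
                result[b["id"]][pattern] = 0.0
    return result


# Hypothetical grid with three buildings (usable area in square metres).
buildings = [
    {"id": 1, "type": "residential", "area": 6000.0},
    {"id": 2, "type": "office", "area": 3000.0},
    {"id": 3, "type": "commercial", "area": 1000.0},
]
grid_pops = {"work": 100.0, "home": 50.0, "social": 20.0}

# Hypothetical matching table (an assumption, not taken from the paper).
MATCH = {"work": {"office"}, "home": {"residential"}, "social": {"commercial"}}

ablation = allocate(grid_pops, buildings)            # area only
constrained = allocate(grid_pops, buildings, MATCH)  # type-constrained
```

In the area-only run, the large residential building receives 60 of the 100 workers simply because it has the most usable area, whereas the constrained run places all of them in the office building. This is the kind of misallocation the ablation comparison is designed to expose.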
Street-level employment population data are used as an external reference to validate the activity-pattern-specific populations estimated by both the main and ablation models. We aggregate the building-scale results of each activity pattern from both models to the street level and compute Pearson correlation coefficients with observed street-level employment populations. The results are reported in Table 8.
All correlation coefficients presented in Table 8 are significant at the 0.05 level. The correlations of the estimated $pop_w$, $pop_s$, and $pop_r$ with the actual employed population are 0.862, 0.624, and 0.571, respectively. That $pop_w$ shows the highest correlation with the actual employed population supports the accuracy of this estimation. Notably, the correlation between the $pop_w$ estimated by the main model and the observed employment population (r = 0.862) is substantially higher than that of the ablation model (r = 0.743). This result further demonstrates that introducing the pattern–place matching constraint enables the main model to allocate $pop_w$ more accurately to $Bui_e$, thereby avoiding the misallocation to $Bui_r$ or $Bui_c$ that occurs in the ablation model. Consequently, the resulting spatial patterns are more consistent with real-world conditions. These findings provide strong evidence for the reliability of population distribution estimates based on perceptions of population activity patterns.
(2)
Comparative Validation of Area-Weighted Direct Allocation Baseline Model
To provide a more robust horizontal validation of the results for the strong-perception period, and to quantitatively assess the incremental value of the key methodological components in our strong-perception population estimation—such as activity-pattern decomposition and building-type constraints—we introduce an independent external reference dataset, namely the Baidu Huiyan hourly population grid product at 180 m resolution. We further construct an Area-Weighted Direct Allocation Baseline Model for comparison. The design of this baseline follows a widely adopted paradigm in population spatialization research: when prior knowledge is unavailable, population is allocated in a top-down manner based on available dynamic proxy data (here, population heat values) and physical capacity (building usable area). By comparing its outcomes with those of the proposed SWPP-HSTPE framework, we can clearly evaluate the effectiveness of our approach in activity-pattern identification and the “pattern–place” constrained allocation mechanism.
The baseline model is implemented as follows.
For each hour $t$, the total population of the study area, $pop_t$, serves as the control total. Based on the proportion of the grid-level sampling data $sample_{n,t}$ (e.g., population heat intensity) to the total sampling value $sample_t$, the total population is allocated to each grid cell, yielding $pop_{n,t}$, as expressed in Equation (15).
$$pop_{n,t} = \frac{sample_{n,t}}{sample_{t}} \times pop_{t} \tag{15}$$
The population assigned to each grid, $pop_{n,t}$, is further redistributed to the individual buildings $k$ within that grid. The allocation weights depend exclusively on building usable area: the population assigned to each building is proportional to its usable area relative to the total building usable area within the grid, as shown in Equation (16).
$$pop_{n,k,t} = \frac{s_{n,k} \times h_{n,k}}{\sum_{k} s_{n,k} \times h_{n,k}} \times pop_{n,t} \tag{16}$$
In Equation (16), $pop_{n,k,t}$ denotes the estimated population of the $k$-th building within $grid_n$ at hour $t$. The terms $s_{n,k}$ and $h_{n,k}$ represent the effective floor area and the number of stories of the building, respectively; their product yields the total building usable area.
It is important to note that this baseline model does not decompose population activity patterns or incorporate constraints related to building type or operational status. It therefore represents a purely sampling- and capacity-based allocation logic driven solely by population proxy data and physical building usable area.
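A minimal sketch of the two baseline steps, Equations (15) and (16), is given below. The function name, the dictionary layout, and the toy numbers are illustrative assumptions; grids without buildings simply receive no building-level allocation here, which is a choice of this sketch.

```python
def baseline_allocate(pop_t, grid_samples, grid_buildings):
    """Area-weighted direct allocation for one hour t.

    pop_t          : control total for the study area
    grid_samples   : {grid n: sample_{n,t}}
    grid_buildings : {grid n: {building k: (s_{n,k}, h_{n,k})}}
    Returns {(n, k): pop_{n,k,t}}.
    """
    sample_t = sum(grid_samples.values())
    allocation = {}
    for n, sample_nt in grid_samples.items():
        # Equation (15): share of the control total assigned to grid n.
        pop_nt = sample_nt / sample_t * pop_t
        # Usable area of each building: floor area times number of stories.
        usable = {k: s * h for k, (s, h) in grid_buildings.get(n, {}).items()}
        total_usable = sum(usable.values())
        for k, area in usable.items():
            # Equation (16): split the grid population by usable-area share.
            allocation[(n, k)] = area / total_usable * pop_nt
    return allocation


# Hypothetical inputs: two grids, three buildings.
grid_samples = {"g1": 30.0, "g2": 10.0}
grid_buildings = {
    "g1": {"a": (100.0, 2), "b": (200.0, 3)},  # usable areas 200 and 600
    "g2": {"c": (50.0, 4)},                    # usable area 200
}
baseline = baseline_allocate(1000.0, grid_samples, grid_buildings)
```

With these inputs, grid g1 receives 750 people and g2 receives 250; inside g1 the split follows the 200:600 usable-area ratio, and the control total is preserved.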
We aggregated the building-level estimates from both the proposed SWPP-HSTPE main model and the baseline model to the ~180 m grid resolution of Baidu Huiyan. We computed Pearson’s correlation coefficient (r), mean absolute error (MAE), and root mean square error (RMSE) against the “ground truth.” The comparative results are presented in Table 9 and Figure 14.
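For reference, the three evaluation metrics can be computed as in the sketch below. The input lists are hypothetical grid-level values for a single hour and are not taken from Table 9.

```python
import math


def evaluate(estimated, reference):
    """Return Pearson's r, MAE, and RMSE between two equal-length lists."""
    n = len(estimated)
    mean_e = sum(estimated) / n
    mean_r = sum(reference) / n
    cov = sum((e - mean_e) * (r - mean_r) for e, r in zip(estimated, reference))
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    pearson_r = cov / math.sqrt(var_e * var_r)
    mae = sum(abs(e - r) for e, r in zip(estimated, reference)) / n
    rmse = math.sqrt(sum((e - r) ** 2 for e, r in zip(estimated, reference)) / n)
    return pearson_r, mae, rmse


# Hypothetical grid populations for one hour.
estimated = [100.0, 200.0, 300.0, 400.0]
reference = [110.0, 190.0, 320.0, 380.0]
r_value, mae_value, rmse_value = evaluate(estimated, reference)
```

MAE and RMSE are expressed in persons per grid cell, so their magnitudes depend on the aggregation resolution, whereas r is scale-free. This is one reason to report all three together.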
The findings show that during most strong-perception periods (10:00–23:00), the SWPP-HSTPE model outperforms the baseline model across all evaluation metrics. In terms of correlation, the main model achieves higher Pearson coefficients for most daytime hours, with a notable improvement during the afternoon activity peak at 16:00, where the correlation reaches 0.359 compared with 0.321 for the baseline. Regarding error metrics, the main model attains an average MAE of 395.2 and an average RMSE of 698.8, representing reductions of approximately 1.7% and 1.9%, respectively, relative to the baseline (average MAE: 402.1; average RMSE: 712.3). Although the baseline shows marginally higher correlation during a few morning hours (08:00–09:00), its error levels remain consistently larger than those of the main model.
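The reported percentage reductions follow directly from the average errors quoted above, as this short check shows (the helper name is illustrative):

```python
def relative_reduction(baseline_error, model_error):
    """Percentage by which the model error falls below the baseline error."""
    return (baseline_error - model_error) / baseline_error * 100.0


# Average errors quoted in the text for the baseline and the main model.
mae_reduction = relative_reduction(402.1, 395.2)
rmse_reduction = relative_reduction(712.3, 698.8)
```

Rounded to one decimal place, these evaluate to the 1.7% (MAE) and 1.9% (RMSE) reductions stated in the text.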
Overall, the SWPP-HSTPE model demonstrates superior stability and accuracy throughout the day, particularly in the afternoon and evening hours when activity patterns become more complex. These quantitative results confirm that incorporating SEFP-based activity pattern extraction and building-type-constrained allocation substantially improves estimation accuracy, outperforming simple physical-capacity-based allocation approaches.
(3)
Validation Analysis of Hot and Cold Spots of Population Distribution at Peak Periods of Various Activity Patterns
This study employs the Hot Spot Analysis (Getis–Ord Gi*) tool in ArcGIS 10.8 to identify hot spots and cold spots at the building level. Spatial relationships are conceptualized using a fixed distance band (FIXED_DISTANCE_BAND), with the distance threshold automatically determined by the tool based on dataset characteristics to ensure that each feature has at least one neighbor. During computation, the spatial weights matrix is row-standardized. To control for potential false positives arising from repeated significance testing across multiple hours (i.e., the multiple testing problem), we apply the Benjamini–Hochberg False Discovery Rate (FDR) correction to the p-values computed for each hour. The Getis–Ord Gi* statistics are calculated separately for the peak periods of the three population activity patterns. The hot spots and cold spots shown in Figure 15, Figure 16 and Figure 17 are statistically significant after FDR correction.
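For readers without access to ArcGIS, the following sketch shows the standard Getis–Ord Gi* z-score with binary fixed-distance-band weights, followed by a generic Benjamini–Hochberg step. It is not the ArcGIS implementation: the band is passed in by hand rather than chosen automatically, the toy grid and values are hypothetical, and binary weights are used because the Gi* z-score is unchanged when a row of weights is rescaled by a constant (as in row standardization).

```python
import math


def gi_star(points, values, band):
    """Getis-Ord Gi* z-scores using binary weights within a fixed distance
    band; each feature counts as its own neighbour."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for xi, yi in points:
        w = [1.0 if math.hypot(xi - xj, yi - yj) <= band else 0.0
             for xj, yj in points]
        sum_w = sum(w)
        sum_w2 = sum(wi * wi for wi in w)
        numerator = sum(wi * v for wi, v in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        scores.append(numerator / denominator if denominator > 0 else 0.0)
    return scores


def two_sided_p(z):
    """Two-sided p-value of a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the test survives BH-FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            largest_rank = rank
    significant = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= largest_rank:
            significant[i] = True
    return significant


# Hypothetical 5 x 5 lattice of building centroids with a dense corner.
points = [(x, y) for x in range(5) for y in range(5)]
values = [10.0 if (x <= 1 and y <= 1) else 1.0 for x, y in points]
z_scores = gi_star(points, values, band=1.0)
p_values = [two_sided_p(z) for z in z_scores]
significant = benjamini_hochberg(p_values, alpha=0.05)
```

The sketch applies the correction to the set of feature-level p-values for one hour. How p-values are pooled across hours in the study is not specified in enough detail to reproduce here, so that part should be read as an assumption.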
Figure 15 presents the spatial distribution of hot and cold spots at the building scale during the peak period of $PA_w$. The population hotspots during this period are mainly concentrated in regions W1 to W6, which correspond to the Jingshan Light Machinery Industrial Park, Hankou Railway Station, Tongji Hospital affiliated with Huazhong University of Science and Technology, Wuhan Conservatory of Music, Chuhe Han Street, and the teaching area of Wuhan University. These areas play a significant role in the education, healthcare, and office services of Jianghan and Wuchang districts, making them key work population clusters. Notably, no significant cold spots are observed, indicating a relatively large and evenly distributed $pop_w$ within the study area. This suggests that the spatial distribution of $pop_w$ is fairly balanced across the regions.
Figure 16 illustrates the distribution of hot and cold spots for the $PA_s$ activity pattern during its peak period. The hot and cold spots differ markedly in their spatial distribution. Hotspots are predominantly located in areas S1–S9. These include regions around Hankou Railway Station, Wansongyuan Food Street, Heng Long Plaza, Jianghan Road Pedestrian Street, Shuian International Commercial Street, Qunxingcheng, and Yuejiazui Subway Station. These areas serve as primary spaces for social activities, including shopping, entertainment, leisure, and tourism. Area S9 extends from Chuhe Han Street, passing through Hongshan Square, Zhongnan Road, and Baotong Temple to the Street Corner. This corridor features bustling commercial facilities and numerous bus and subway stations, and acts as a crucial transportation hub for social activities in Jianghan and Wuchang districts. Conversely, the northern and southern parts of Wuchang District exhibit relatively low $PA_s$, leading to cold spots in areas S10–S13. Area S10, which is undergoing redevelopment, has limited social activity and thus appears as a $PA_s$ cold spot. The cold spot in area S11 results from the demolition of old residential neighborhoods as part of urban renewal plans. Area S12, primarily Yangyuan Street, is also affected by housing demolition and urban transformation, which hinder social activities. Area S13, in Nanhu Street, is predominantly residential and exhibits slightly lower $PA_s$ than other areas in Jianghan and Wuchang, forming a $PA_s$ cold spot. Overall, commercial and service facilities are evenly distributed across Jianghan and Wuchang districts. However, the southern part of Wuchang and the northern part of Jianghan exhibit lower levels of $PA_s$. Future urban planning should focus on enhancing social infrastructure in these areas to better meet the social and shopping needs within the city’s fifteen-minute living circle.
Figure 17 illustrates the hot and cold spots of population distribution for $PA_r$ during its peak period. The spatial distribution of hot and cold spots in this period shows notable variation. Hotspots are primarily located in regions H1–H7, including the densely populated Hanxing Street in Jianghan District, the migrant-populated Hu’anli urban village, the Electric Power Community, Nanyang Apartments, the Xudong Area, Zhongnan Road Street, and Nanhu Street. These areas predominantly feature residential buildings, resulting in higher $G_i^*$ indices for the $PA_r$ activity pattern and thereby highlighting significant hotspot areas. Conversely, regions H8–H13 reveal concentrated cold spots, including the Jingshan Light Machinery Industrial Park, the ongoing redevelopment of Dadong Community, Huadi Street, Yangyuan Street, Hongshan Square, and the teaching area of Wuhan University. Except for the redevelopment areas H10–H12, these cold spots overlap with the population distribution hotspots of $PA_w$ and $PA_s$, which aligns with actual population activity trends.
The $G_i^*$ analysis, which is grounded in the building-scale population distribution estimates for the strong-perception period, accurately reflects the spatial distribution of hot and cold spots for each population activity pattern. In summary, the SWPP-HSTPE method can effectively estimate high-resolution spatiotemporal distributions for various perception periods and activity populations.

5. Discussion

This study separates nighttime steady-state population distribution from daytime dynamics. During the nighttime (weak-perception period), we use the 2020 census residential population as the baseline, summarize internal building capacity (floor area × number of floors) and surrounding environmental characteristics within a 200 m grid, and regress these features to the building scale. During the daytime (strong-perception period), the model learns only the relative temporal patterns from the 2023 population activity data, without introducing cross-year scale information. During the weak-perception period, compared with products that rely solely on nighttime lights or grid-level weighted allocation (e.g., WorldPop, GHS-POP), MDEFF redistributes under dual constraints: “internal capacity + external environment.” This design better explains the correlation between nighttime estimates and actual household counts (r = 0.72) and reduces errors caused by allocating population to non-residential buildings. During the strong-perception period, compared with daytime approaches that only perform mixture decomposition [14], our study introduces prior knowledge of spatial functional purity via SEFP, thereby reducing ambiguity in pattern extraction. Compared with end-to-end deep learning methods that require extensive labelled data [21], our framework is less dependent on annotation, offers stronger interpretability, and is more readily transferable across different urban contexts.
To systematically evaluate the proposed methodology, we further introduced independent Baidu Huiyan hourly population data at approximately 180 m resolution as an external reference. We constructed a general area-weighted direct allocation baseline model for comparison. The validation results (Section 4.3.2) show that the SWPP-HSTPE main model outperforms the baseline on all three evaluation metrics during most strong-perception hours. On average, the main model shows a higher correlation with the Baidu Huiyan “ground truth” while achieving lower MAE and RMSE. The fundamental reason for this improvement lies in the design logic: the baseline model relies solely on physical capacity and performs undifferentiated allocation, leading to substantial misallocation during working hours (especially the overassignment of population to large residential buildings) and resulting in spatial patterns that deviate markedly from the actual urban functional structure. In contrast, the framework proposed in this study (SEFP-based fundamental activity curve filtering → building-type mapping → Dasymetric mapping) achieves semantic matching between “activity” and “place.” This ensures that the population engaged in working, social, or other activities is allocated to functionally appropriate buildings, making the estimated spatial distribution more consistent with real-world activity patterns.
Regarding temporal segmentation, the division of weak/strong-perception period in this study is modeled and validated using weekday data, without explicitly accounting for seasonal or weekend/holiday variations. Prior research has shown that activity schedules in winter versus summer, as well as weekend/holiday mobility patterns, exhibit systematic differences from weekdays. Recent research has made significant strides by constructing heterogeneous multi-scale Population Analysis Unit interaction networks. These frameworks explicitly model the evolutionary relationships of nodes across different time windows, offering a more refined approach for capturing the temporal dynamics and scale interactions of population activity patterns [55]. Within our framework, future work can—without altering the core steps—apply the same SOM + SEFP procedure separately to weekday, weekend/holiday, or seasonal datasets to independently extract their respective basic activity curves. During decomposition and allocation, share conservation (i.e., the sum of mode weights equals 1) can be enforced, and simple temporal smoothing can be applied to hourly sequences to obtain stable hour-by-hour estimates. With respect to interactions among activity patterns, the present study adopts a simplified linear-additive, mass-conserving assumption, without explicitly modeling linkage effects, such as increased social activity after school/work. Future extensions may introduce transition relationships among patterns to capture these interactions, while retaining the interpretability provided by building-type mapping and open/closed state constraints.
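The share-conservation and smoothing steps mentioned above could look like the following sketch. The clipping of negative weights, the three-hour centred moving average, and the toy weights are assumptions made for illustration; the text specifies only that shares sum to 1 and that simple temporal smoothing is applied.

```python
def normalize_shares(weights):
    """Clip negative pattern weights to zero and rescale them to sum to 1."""
    clipped = {k: max(v, 0.0) for k, v in weights.items()}
    total = sum(clipped.values())
    if total == 0:
        # Degenerate case: fall back to equal shares (an assumption).
        return {k: 1.0 / len(clipped) for k in clipped}
    return {k: v / total for k, v in clipped.items()}


def moving_average(series, window=3):
    """Centred moving average; the window is truncated at both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical raw decomposition weights for four consecutive hours.
raw = [
    {"home": 0.50, "work": 0.22, "social": 0.33},
    {"home": 0.40, "work": 0.31, "social": 0.35},
    {"home": 0.36, "work": -0.02, "social": 0.41},
    {"home": 0.35, "work": 0.30, "social": 0.40},
]
shares = [normalize_shares(hour) for hour in raw]
patterns = ("home", "work", "social")
smoothed = {p: moving_average([hour[p] for hour in shares]) for p in patterns}
hourly_sums = [sum(smoothed[p][i] for p in patterns) for i in range(len(raw))]
```

Because the same linear filter is applied to every pattern, the smoothed shares still sum to 1 in each hour, so conservation does not need to be re-imposed after smoothing.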
Regarding the modifiable areal unit problem (MAUP) and the treatment of empty grid cells, very small grids increase the proportion of empty cells and intensify spatial fragmentation, thereby amplifying statistical noise, whereas very large grids dilute micro-level heterogeneity and introduce over-smoothing. The empirically observed empty-cell rates are 0.3644 at 100 m, 0.2833 at 200 m, and 0.2613 at 300 m. Compared with 100 m, the 200 m grid reduces the empty-cell rate by 8.11 percentage points, whereas increasing the grid size to 300 m yields only marginal improvement (an additional reduction of 2.20 percentage points) while weakening the fine-scale representation of the built environment. We therefore adopt 200 m as a balanced grid size, which also matches the spatial support of the population activity data. Downscaling from communities to grids is implemented through proportional redistribution based on the relative intensity of intra-community population activity, rather than uniform allocation, thereby reducing the risk of ecological fallacy. Grids without building capacity are excluded from nighttime regression fitting and from the Getis–Ord Gi* statistic; during the strong-perception period, population is assigned only when the grid contains buildings of relevant types that are open during that hour. These design choices suggest that major hot–cold spot patterns are primarily determined by functional spatial structures and are largely insensitive to reasonable variations in grid resolution.
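The grid-size comparison and the proportional downscaling rule can be checked with a few lines. The empty-cell rates are those quoted above; the community total, the grid activity values, and the uniform fallback for communities with no recorded activity are hypothetical.

```python
# Empty-cell rates reported for the three candidate grid sizes (metres).
empty_rate = {100: 0.3644, 200: 0.2833, 300: 0.2613}

# Reductions in percentage points when coarsening the grid.
drop_100_to_200 = (empty_rate[100] - empty_rate[200]) * 100.0
drop_200_to_300 = (empty_rate[200] - empty_rate[300]) * 100.0


def downscale(community_pop, grid_activity):
    """Split a community total across its grids in proportion to activity
    intensity. If no activity is recorded, fall back to a uniform split
    (an assumption of this sketch)."""
    total = sum(grid_activity.values())
    if total == 0:
        share = community_pop / len(grid_activity)
        return {g: share for g in grid_activity}
    return {g: community_pop * a / total for g, a in grid_activity.items()}


# Hypothetical community of 1000 residents covering three grids.
grid_pop = downscale(1000.0, {"g1": 30.0, "g2": 10.0, "g3": 0.0})
```

Under uniform allocation each grid would receive about 333 people, including the grid with no observed activity. The proportional rule assigns it none, which is the behaviour the text relies on to limit ecological-fallacy effects.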
The building-level, hourly population distribution generated in this study directly supports a wide range of urban governance tasks. For public service provision and transportation management, differences between midday and evening peaks can be used to evaluate service pressure on medical stations, transit hubs, and commercial pedestrian zones, enabling time-of-day optimization of service schedules. For order maintenance and tourism management, population aggregation can be separated into weekdays, weekends, and holidays to support time- and space-specific crowd regulation. The framework also provides a unified data foundation for population geography, transportation engineering, urban planning, and public health, allowing quantification of the temporal coupling among residence, employment, and consumption, and the identification of high-risk crowding windows. More importantly, the high spatiotemporal-resolution population distribution data produced in this study serves as critical input for deepening our understanding of complex urban systems. For instance, such data can form the basis for constructing "Function-associated Population Mobility Networks" [56], which are used to dynamically quantify the service scopes of urban functions and their driving effects on population mobility. This effectively bridges the gap between static/quasi-dynamic population distribution estimation and the analysis of dynamic population mobility mechanisms. In addition, it is noteworthy that the population data by age cohort (e.g., the elderly population) generated within this research framework can serve as a critical foundation for analysis in numerous refined urban management scenarios. For instance, such data could support cutting-edge assessments like the spatial equilibrium model of elderly care facilities with high spatiotemporal sensitivity (SEM-HSTS), enabling precise characterization of the spatiotemporal heterogeneity in urban resource allocation [57]. 
Without requiring additional data, the method can be extended to short-term inference and scenario evaluation. On one hand, separate versions of basic activity curves can be established for weekdays, weekends/holidays, and seasonal periods to generate multi-scenario hourly estimates. On the other hand, under planning scenarios—such as the addition of commercial/office space, adjustments to operating hours, significant events, or construction-related closures—population redistribution can be simulated by modifying building-type mappings or opening-hour parameters. Furthermore, hourly population distributions can be integrated with indicators such as employment density, rent/land prices, public transit supply, industrial structure, and commercial vitality to identify functional mismatches and service gaps, thereby supporting site selection, off-peak scheduling, and pedestrian-friendly redesign. Building on this, if combined with advanced centrality quantification methods that account for both mobility features and network topology—such as the k-dis-weight-shell algorithm [58]—it becomes possible to more precisely assess the relative importance of different locations within the urban population mobility network. Consequently, this integration offers more dynamic and network-aware decision support for transportation planning, facility siting, and epidemic management. All of these applications rely on an interpretable pipeline—census-based steady-state calibration, activity-pattern extraction, and building-type/opening-state constraints—that ensures strong transferability and reusability.

6. Conclusions

This study proposes the High-Spatiotemporal-Resolution Population Distribution Estimation Based on the Strong and Weak Perception of Population Activity Patterns (SWPP-HSTPE) model. The model estimates hourly building-scale population distributions for different perception periods and activity patterns, using the Wuchang and Jianghan districts in Wuhan as case studies. The main conclusions are as follows:
1. The correlation coefficient between the population estimated by the MDEFF model and the actual household counts is 0.72. This is 0.157 higher than that obtained from the WorldPop dataset and 0.133 higher than that from the GHS-POP dataset, demonstrating the superior performance of the MDEFF model in estimating population distribution during the weak-perception period.
2. Three basic population activity patterns were identified from population activity data using the SOM algorithm and spatial environment functional purity. Linear decomposition of the temporal characteristics of these patterns yielded median $R^2$ and RMSE values of 0.743 and 0.149, respectively. This validates the effectiveness of population activity patterns in capturing spatiotemporal characteristics and establishes a basis for high-spatiotemporal-resolution population distribution estimation.
3. During most hours of the strong-perception period, the SWPP-HSTPE main model outperforms the area-based baseline across all evaluation metrics (correlation, MAE, and RMSE), demonstrating its superior ability to capture daytime dynamic population patterns. Analysis of the $G_i^*$ index based on population distribution estimates during the strong-perception period accurately reflects the spatial aggregation of population hot and cold spots during the peak periods of the various activity patterns. This confirms that the SWPP-HSTPE method can provide precise, high-spatiotemporal-resolution estimates of population distribution across different activity patterns.
Although the empirical results effectively demonstrate the accuracy and validity of the SWPP-HSTPE model, several limitations remain:
  • The current temporal segmentation remains relatively coarse. The weak-perception period has not yet been subdivided into finer activity patterns, nor does the present framework incorporate seasonal or weekday–weekend/holiday variations. As data quality improves, future work may further refine the characterization of nighttime activities.
  • The linear regression and SOM approaches employed in this study are still limited in capturing complex temporal dynamics. Subsequent research could explore more advanced and efficient models, while strengthening cross-city and cross-year generalization and improving uncertainty assessment.
  • This study has not yet applied the high-spatiotemporal-resolution population distribution dataset to real urban planning scenarios. Future efforts could establish deeper integration with public service provision, transport operations, and public health management to develop transferable and reusable application frameworks.
  • Due to data constraints, the method may be less applicable in rural or peripheral areas. Future research may incorporate rule-based reasoning and adaptive modeling to reduce dependence on dense data sources.
As technology and data quality advance, the application of high-spatiotemporal-resolution population distribution information will likely expand, offering valuable support and guidance for urban management and societal development.

Author Contributions

Conceptualization, Rui Li, Guangyu Liu and Hongyan Li; methodology, Rui Li, Guangyu Liu, Hongyan Li and Jin Xia; software, Guangyu Liu and Hongyan Li; validation, Rui Li, Guangyu Liu and Hongyan Li; formal analysis, Rui Li and Guangyu Liu; investigation, Guangyu Liu; resources, Rui Li and Guangyu Liu; data curation, Rui Li and Guangyu Liu; writing—original draft preparation, Rui Li, Guangyu Liu and Hongyan Li; writing—review & editing, Rui Li and Hongyan Li; visualization, Guangyu Liu and Hongyan Li; supervision, Rui Li and Jin Xia; project administration, Rui Li; funding acquisition, Rui Li. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China (Grant Nos. U20A2091 and 41930107).

Data Availability Statement

The data presented in this study are available upon request from the corresponding author. The data are not publicly available due to restrictions imposed by privacy approval and informed consent agreements regarding study participants.

Acknowledgments

The authors thank the Wuhan Geomatic Institute for its support of this work.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
SWPP-HSTPE: High-Spatiotemporal-Resolution Population Distribution Estimation Based on the Strong and Weak Perception of Population Activity Patterns
SOM: Self-Organizing Map
AI: artificial intelligence
SEFP: Spatial Environment Functional Purity
DEFF: Dual-Environment Feature Fusion
MDEFF: Modified Dual-Environment Feature Fusion
DQW-TOPSIS: Data Quality-based Technique for Order of Preference by Similarity to Ideal Solution
POI: point of interest
AOI: area of interest
GIS: Geographic Information Systems
MQE: Mean Quantization Error

References

  1. Zhou, C.; Sun, J.; Su, F.; Yang, X.; Pei, T.; Ge, Y.; Yang, Y.; Zhang, A.; Liao, X.; Lu, F.; et al. Geographic Information Science Development and Technological Application. Acta Geogr. Sin. 2020, 75, 2593–2609.
  2. Xiao, Z.; Shen, Z. The temporal and spatial evolution of population & industrial agglomeration and environmental pollution and the relevance analysis. J. Arid. Land Resour. Environ. 2019, 33, 1–8.
  3. Zhang, M.; Tan, S.; Zhang, C.; Han, S.; Zou, S.; Chen, E. Assessing the Impact of Fractional Vegetation Cover on Urban Thermal Environment: A Case Study of Hangzhou, China. Sustain. Cities Soc. 2023, 96, 104663.
  4. Yang, J.; Zhan, Y.; Xiao, X.; Xia, J.C.; Sun, W.; Li, X. Investigating the Diversity of Land Surface Temperature Characteristics in Different Scale Cities Based on Local Climate Zones. Urban Clim. 2020, 34, 100700.
  5. Zhang, M.; Zhang, C.; Kafy, A.-A.; Tan, S. Simulating the Relationship between Land Use/Cover Change and Urban Thermal Environment Using Machine Learning Algorithms in Wuhan City, China. Land 2021, 11, 14.
  6. Xia, J.; Zhou, Y.; Li, Z.; Li, F.; Yue, Y.; Cheng, T.; Li, Q. COVID-19 risk assessment driven by urban spatiotemporal big data: A case study of Guangdong-Hong Kong-Macao Greater Bay Area. Cehui Xuebao/Acta Geod. et Cartogr. Sin. 2020, 49, 671–680.
  7. Li, R.; Wang, J.; Wang, S.; Wu, H. Prediction of Network Public Opinion Features in Urban Planning Based on Geographical Case-Based Reasoning. Int. J. Digit. Earth 2022, 15, 890–910.
  8. Jiang, D.; Yang, X.; Wang, N.; Liu, H. Study on spatial distribution of population based on remote sensing and GIS. Adv. Earth Sci. 2002, 17, 734.
  9. Wu, S.; Qiu, X.; Wang, L. Population Estimation Methods in GIS and Remote Sensing: A Review. GIScience Remote Sens. 2005, 42, 80–96.
  10. Dobson, J.E.; Bright, E.A.; Coleman, P.R.; Durfee, R.C.; Worley, B.A. LandScan: A global population database for estimating populations at risk. Photogramm. Eng. Remote Sens. 2000, 66, 849–857.
  11. Martin, D.; Cockings, S.; Leung, S. Developing a Flexible Framework for Spatiotemporal Population Modeling. Ann. Assoc. Am. Geogr. 2015, 105, 754–772.
  12. Yao, Y.; Liu, X.; Li, X.; Zhang, J.; Liang, Z.; Mai, K.; Zhang, Y. Mapping Fine-Scale Population Distributions at the Building Level by Integrating Multisource Geospatial Big Data. Int. J. Geogr. Inf. Sci. 2017, 31, 1220–1244.
  13. Zhao, X.; Zhou, Y.; Chen, W.; Li, X.; Li, X.; Li, D. Mapping Hourly Population Dynamics Using Remotely Sensed and Geospatial Data: A Case Study in Beijing, China. GIScience Remote Sens. 2021, 58, 717–732.
  14. Shi, Q.; Zhuo, L.; Tao, H.; Li, Q. Mining Hourly Population Dynamics by Activity Type Based on Decomposition of Sequential Snapshot Data. Int. J. Digit. Earth 2022, 15, 1395–1416.
  15. Chen, Y.; Liu, X.; Gao, W.; Wang, R.Y.; Li, Y.; Tu, W. Emerging Social Media Data on Measuring Urban Park Use. Urban For. Urban Green. 2018, 31, 130–141.
  16. Wu, L.; Cheng, X.; Kang, C.; Zhu, D.; Huang, Z.; Liu, Y. A Framework for Mixed-Use Decomposition Based on Temporal Activity Signatures Extracted from Big Geo-Data. Int. J. Digit. Earth 2020, 13, 708–726.
  17. Liu, Z.; Ma, T.; Du, Y.; Pei, T.; Yi, J.; Peng, H. Mapping Hourly Dynamics of Urban Population Using Trajectories Reconstructed from Mobile Phone Records. Trans. GIS 2018, 22, 494–513.
  18. Khodabandelou, G.; Gauthier, V.; Fiore, M.; El-Yacoubi, M.A. Estimation of Static and Dynamic Urban Populations with Mobile Network Metadata. IEEE Trans. Mob. Comput. 2019, 18, 2034–2047.
  19. Khodabandelou, G.; Gauthier, V.; El-Yacoubi, M.; Fiore, M. Population Estimation from Mobile Network Traffic Metadata. In Proceedings of the 2016 IEEE 17th International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM), Coimbra, Portugal, 21–24 June 2016; IEEE: Coimbra, Portugal, 2016; pp. 1–9.
  20. Zong, Z.; Feng, J.; Liu, K.; Shi, H.; Li, Y. DeepDPM: Dynamic Population Mapping via Deep Neural Network. AAAI 2019, 33, 1294–1301.
  21. Chen, J.; Pei, T.; Shaw, S.-L.; Lu, F.; Li, M.; Cheng, S.; Liu, X.; Zhang, H. Fine-Grained Prediction of Urban Population Using Mobile Phone Location Data. Int. J. Geogr. Inf. Sci. 2018, 32, 1770–1786.
  22. Liu, G.; Li, R.; Xia, J.; Liu, Z.; Cai, J.; Wu, H.; Peng, M. Dual-Environment Feature Fusion-Based Method for Estimating Building-Scale Population Distributions. Geo-Spat. Inf. Sci. 2024, 27, 1943–1958.
  23. Qi, W.; Li, Y.; Liu, S.; Gao, X.; Zhao, M. Estimation of Urban Population at Daytime and Nighttime and Analyses of Their Spatial Pattern: A Case Study of Haidian District, Beijing. Acta Geogr. Sin. 2013, 68, 1344–1356.
  24. Ma, Y.; Xu, W.; Zhao, X.; Li, Y. Modeling the Hourly Distribution of Population at a High Spatiotemporal Resolution Using Subway Smart Card Data: A Case Study in the Central Area of Beijing. ISPRS Int. J. Geo-Inf. 2017, 6, 128.
  25. Feng, J.; Li, Y.; Xu, F.; Jin, D. A Bimodal Model to Estimate Dynamic Metropolitan Population by Mobile Phone Data. Sensors 2018, 18, 3431.
  26. Liu, S.; Long, Y.; Zhang, L.; Liu, H. Semantic Enhancement of Human Urban Activity Chain Construction Using Mobile Phone Signaling Data. ISPRS Int. J. Geo-Inf. 2021, 10, 545.
  27. Fan, Y.-D.; Shi, P.-J.; Gu, Z.-H.; Li, X.-B. A Method of Data Gridding from Administration Cell to Gridding Cell. Geogr. Sci. 2004, 24, 105–108.
  28. Bergroth, C.; Järv, O.; Tenkanen, H.; Manninen, M.; Toivonen, T. A 24-Hour Population Distribution Dataset Based on Mobile Phone Data from Helsinki Metropolitan Area, Finland. Sci. Data 2022, 9, 39.
  29. Deville, P.; Linard, C.; Martin, S.; Gilbert, M.; Stevens, F.R.; Gaughan, A.E.; Blondel, V.D.; Tatem, A.J. Dynamic Population Mapping Using Mobile Phone Data. Proc. Natl. Acad. Sci. USA 2014, 111, 15888–15893.
  30. Xia, J.; Li, R.; Yang, X.; Wang, J.; Zou, N. High-Resolution Population Distribution Prediction Considering the Spatiotemporal Heterogeneity of Geolocated Behavior. Int. J. Geogr. Inf. Sci. 2025, 1–33.
  31. Calabrese, F.; Diao, M.; Di Lorenzo, G.; Ferreira, J.; Ratti, C. Understanding Individual Mobility Patterns from Urban Sensing Data: A Mobile Phone Trace Example. Transp. Res. Part C Emerg. Technol. 2013, 26, 301–313.
  32. Huang, A.; Levinson, D. Axis of Travel: Modeling Non-Work Destination Choice with GPS Data. Transp. Res. Part C Emerg. Technol. 2015, 58, 208–223.
  33. Cao, G.; Wang, S.; Hwang, M.; Padmanabhan, A.; Zhang, Z.; Soltani, K. A Scalable Framework for Spatiotemporal Analysis of Location-Based Social Media Data. Comput. Environ. Urban Syst. 2015, 51, 70–82.
  34. Patel, N.N.; Stevens, F.R.; Huang, Z.; Gaughan, A.E.; Elyazar, I.; Tatem, A.J. Improving Large Area Population Mapping Using Geotweet Densities. Trans. GIS 2017, 21, 317–331.
  35. Tsou, M.-H.; Zhang, H.; Nara, A.; Han, S.Y. Estimating Hourly Population Distribution Change at High Spatiotemporal Resolution in Urban Areas Using Geo-Tagged Tweets, Land Use Data, and Dasymetric Maps. arXiv 2018, arXiv:1810.06554.
  36. Sun, L.; Axhausen, K.W. Understanding Urban Mobility Patterns with a Probabilistic Tensor Factorization Framework. Transp. Res. Part B Methodol. 2016, 91, 511–524.
  37. Nemeškal, J.; Ouředníček, M.; Pospíšilová, L. Temporality of Urban Space: Daily Rhythms of a Typical Week Day in the Prague Metropolitan Area. J. Maps 2020, 16, 30–39.
  38. Nitsche, P.; Widhalm, P.; Breuss, S.; Brändle, N.; Maurer, P. Supporting Large-Scale Travel Surveys with Smartphones – A Practical Approach. Transp. Res. Part C Emerg. Technol. 2014, 43, 212–221.
  39. Kontokosta, C.E.; Johnson, N. Urban Phenology: Toward a Real-Time Census of the City Using Wi-Fi Data. Comput. Environ. Urban Syst. 2017, 64, 144–153.
  40. Kubíček, P.; Konečný, M.; Stachoň, Z.; Shen, J.; Herman, L.; Řezník, T.; Staněk, K.; Štampach, R.; Leitgeb, Š. Population Distribution Modelling at Fine Spatio-Temporal Scale Based on Mobile Phone Data. Int. J. Digit. Earth 2019, 12, 1319–1340.
  41. Alexander, L.; Jiang, S.; Murga, M.; González, M.C. Origin–Destination Trips by Purpose and Time of Day Inferred from Mobile Phone Data. Transp. Res. Part C Emerg. Technol. 2015, 58, 240–250.
  42. Liu, Z.; Li, R.; Cai, J.; Hu, Q.; Wu, H. Mobility Difference Index: A Quantitative Method for Detecting Human Mobility Difference. GIScience Remote Sens. 2024, 61, 2301274.
  43. Wang, Z.; Yue, Y.; He, B.; Nie, K.; Tu, W.; Du, Q.; Li, Q. A Bayesian Spatio-Temporal Model to Analyzing the Stability of Patterns of Population Distribution in an Urban Space Using Mobile Phone Data. Int. J. Geogr. Inf. Sci. 2021, 35, 116–134.
  44. Liu, S. Analysis and Research on the Spatio-Temporal Pattern of Urban Crowd Activities Based on Mobile Signaling Data. Ph.D. Thesis, Nanjing Normal University, Nanjing, China, 2022.
  45. Chen, Y.; Liu, X.; Li, X.; Liu, X.; Yao, Y.; Hu, G.; Xu, X.; Pei, F. Delineating Urban Functional Areas with Building-Level Social Media Data: A Dynamic Time Warping (DTW) Distance Based k-Medoids Method. Landsc. Urban Plan. 2017, 160, 48–60.
  46. Cici, B.; Gjoka, M.; Markopoulou, A.; Butts, C.T. On the Decomposition of Cell Phone Activity Patterns and Their Connection with Urban Ecology. In Proceedings of the 16th ACM International Symposium on Mobile Ad Hoc Networking and Computing, Hangzhou, China, 22–25 June 2015; ACM: Hangzhou, China, 2015; pp. 317–326.
  47. Li, D.; Wang, S.; Li, D. Spatial Data Mining; Springer: Berlin/Heidelberg, Germany, 2015.
  48. Lin, L.; Lin, G.; Yan, X.; Chen, A.; Yang, Z. Spatialization Models of Census Data: A Review. J. Subtrop. Resour. Environ. 2010, 5, 10–16.
  49. Dai, T.; Zhao, Y.; Liao, C. Maximum Spatial Equality of Educational Opportunity Based on the Proximity-Random System: A Case Study of Xicheng District, Beijing. Econ. Geogr. 2019, 39, 7.
  50. Huang, J.; Zhang, R.; Hu, G. A Research of the Elderly’s Daily Life Circle Based on Spatial-Temporal Behaviors – Analysis of Place Recognition and Spatial Features. Urban Plan. Forum 2019, 3, 87–95.
  51. Vesanto, J.; Alhoniemi, E. Clustering of the Self-Organizing Map. IEEE Trans. Neural Netw. 2000, 11, 586–600.
  52. Li, Z. Research on Demand of Emergency Supplies for Flood Under the Cooperation of Baidu Heat Map and Multi-Layer Perceptron Neural Network. Master’s Thesis, Wuhan University of Technology, Wuhan, China, 2019.
  53. Chen, Z.; Yu, B.; Yang, C.; Zhou, Y.; Yao, S.; Qian, X.; Wang, C.; Wu, B.; Wu, J. An Extended Time Series (2000–2018) of Global NPP-VIIRS-like Nighttime Light Data from a Cross-Sensor Calibration. Earth Syst. Sci. Data 2021, 13, 889–906.
  54. Zheng, Q.; Seto, K.C.; Zhou, Y.; You, S.; Weng, Q. Nighttime Light Remote Sensing for Urban Applications: Progress, Challenges, and Prospects. ISPRS J. Photogramm. Remote Sens. 2023, 202, 125–141.
  55. Yang, X.; Li, R.; Xia, J.; Wang, J.; Li, H.; Zou, N. HMS-PAU-IN: A Heterogeneous Multi-Scale Spatiotemporal Interaction Network Model for Population Analysis Units. Int. J. Appl. Earth Obs. Geoinf. 2025, 140, 104565.
  56. Liu, X.; Li, R.; Cai, J.; Li, B.; Li, Y. Quantifying Urban Function Accessibility and Its Effect on Population Mobility Based on Function-Associated Population Mobility Network. Int. J. Appl. Earth Obs. Geoinf. 2024, 135, 104273.
  57. Li, H.; Li, R.; Cai, J.; Wang, S. The Spatial Equilibrium Model of Elderly Care Facilities with High Spatiotemporal Sensitivity and Its Economic Associations Study. ISPRS Int. J. Geo-Inf. 2024, 13, 268.
  58. Cai, J.; Li, R.; Liu, Z.; Liu, X.; Wu, H. Quantifying Spatial Interaction Centrality in Urban Population Mobility: A Mobility Feature- and Network Topology-Based Locational Measure. Sustain. Cities Soc. 2024, 114, 105769.
Figure 1. High-Spatiotemporal-Resolution Population Distribution Estimation Based on the Strong and Weak Perception of Population Activity Patterns (SWPP-HSTPE) method flow chart.
Figure 2. Schematic diagram of dual-environment construction of buildings [22].
Figure 3. The Self-Organizing Map (SOM) algorithm flow chart.
Figure 4. Overview of the study area: Jianghan and Wuchang Districts in Wuhan. (a) Location of Wuhan in Hubei Province. (b) Location of Jianghan and Wuchang Districts in Wuhan. (c) Building units in the study area.
Figure 5. Mean Quantization Error (MQE) versus cluster number.
Figure 6. Normalized clustering results of the heat values’ time series characteristic curve.
Figure 7. Stacked bar chart of usable area percentage of various types of buildings for different population activity patterns.
Figure 8. Clustering results of grid-scale heat values and their mean curves for buildings with similar spatial functional usable area ratios. (a) Population activity pattern curves with similar home spatial environment functions. (b) Population activity pattern curves with similar social spatial environment functions. (c) Population activity pattern curves with similar work spatial environment functions.
Figure 9. Time series characteristic curves of basic population activity patterns.
Figure 10. Error box plot of the linear decomposition results of different population activity patterns. (a) $R^2$ distribution of linear decomposition results of different population activity patterns. (b) RMSE distribution of linear decomposition results of different population activity patterns.
Figure 11. Hourly building-scale population distribution estimates for the study area.
Figure 12. Street maps and population changes in different activity patterns of the sample areas. (A1) Street map of the Jianghan Road business district. (B1) Street map of Hongshan Square. (C1) Street map of Xinxinjiayuan, Baishazhou. (A2) Population changes in the Jianghan Road business district. (B2) Population changes in Hongshan Square. (C2) Population changes in Xinxinjiayuan, Baishazhou.
Figure 13. Comparison of population estimation accuracy during the weak-perception period across various data sets. (a) Accuracy of population estimation based on the MDEFF model. (b) Accuracy of population estimation based on the WorldPop dataset. (c) Accuracy of population estimation based on the GHS-POP dataset.
Figure 14. Results of different indices for the baseline model and High-Spatiotemporal-Resolution Population Distribution Estimation Based on the Strong and Weak Perception of Population Activity Patterns (SWPP-HSTPE).
Figure 15. Population distribution heat map at the building scale during the peak of $PA_w$ (16:00–17:00).
Figure 16. Population distribution heat map at the building scale during the peak of $PA_s$ (13:00–14:00).
Figure 17. Population distribution heat map at the building scale during the peak of $PA_r$ (0:00–8:00).
Table 1. Population proportion of each age group in Wuhan.

| Age (Years) | Type Alias $C$ | Population Proportion $popP_C$ (%) |
|---|---|---|
| 0–2 | C0 | 1.85 |
| 3–5 | C1 | 2.77 |
| 6–11 | C2 | 5.04 |
| 12–14 | C3 | 1.84 |
| 15–17 | C4 | 1.54 |
| 18–59 | C5 | 69.72 |
| Over 60 | C6 | 17.23 |
Table 2. All-day activity attributes of each group of weakly perceived activity populations.

| Type Alias $C$ | Activity Time $t_C$ | All-Day Activity Attributes |
|---|---|---|
| C0 | None | Stay at home all day |
| C1 | 8:00–16:30 | Go to kindergarten during $t_C$, and stay at home the rest of the time |
| C2 | 8:00–18:00 | Go to primary school during $t_C$, and stay at home the rest of the time |
| C3 | 8:00–18:00 | Go to junior high school during $t_C$, and stay at home the rest of the time |
| C4 | 7:30–21:00 | Go to high school during $t_C$, and stay at home the rest of the time |
| C6 | None | Stay at home all day |
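As a worked illustration of how Tables 1 and 2 combine, the sketch below computes, for a given hour, the share of the total population that belongs to a weakly perceived group and is away from home at school. Counting a group as away for any hour that overlaps its activity window is a simplifying assumption made here, and the function and variable names are hypothetical rather than taken from the study.

```python
# Population share (%) of each weakly perceived group, from Table 1.
SHARE = {"C0": 1.85, "C1": 2.77, "C2": 5.04, "C3": 1.84, "C4": 1.54, "C6": 17.23}

# Activity windows in decimal hours, from Table 2 (None = at home all day).
WINDOW = {
    "C0": None,
    "C1": (8.0, 16.5),
    "C2": (8.0, 18.0),
    "C3": (8.0, 18.0),
    "C4": (7.5, 21.0),
    "C6": None,
}

def away_share(hour):
    """Share (%) of total population in weakly perceived groups at school
    at some point during the interval [hour, hour + 1)."""
    total = 0.0
    for group, window in WINDOW.items():
        if window is None:
            continue
        start, end = window
        # Assumption: any overlap between the hour and the window counts as away.
        if start < hour + 1 and hour < end:
            total += SHARE[group]
    return total

def home_share(hour):
    """Share (%) of total population in weakly perceived groups at home."""
    return sum(SHARE.values()) - away_share(hour)

for h in (3, 7, 10, 17, 20):
    print(h, round(away_share(h), 2), round(home_share(h), 2))
```

Under these assumptions, the weakly perceived groups together account for 30.27% of the population, and the at-home part of that share is what varies by hour.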
Table 3. Correspondence between building types and population activity patterns.

| Building Type | Population Activity Pattern | Population |
|---|---|---|
| Residential building ($Bui_r$) | Home activity ($PA_r$) | Population engaged in home activities ($pop_r$) |
| Educational office building ($Bui_e$) | Work activity ($PA_w$) | Population engaged in work activities ($pop_w$) |
| Commercial service building ($Bui_c$) | Social activity ($PA_s$) | Population engaged in social activities ($pop_s$) |
Table 4. Open period mapping table of land-use type.

| Land-Use Type | $isopen$ |
|---|---|
| Transportation service station land | [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1] |
| Parks and green spaces | [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1] |
| Square land | [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1] |
| Public utility land | [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1] |
| Logistics and warehousing land | [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1] |
| Port terminal land | [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1] |
| Rail transit land | [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1] |
| Urban and rural road land | [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1] |
| Urban residential land | [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1] |
| Land for science, education, culture, and health | [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0] |
| Industrial land | [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0] |
| Land for press and publication | [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0] |
| Commercial service facilities land | [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0] |
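The 24-element $isopen$ vectors in Table 4 can be generated from opening hours instead of being typed out by hand. The sketch below assumes that position $h$ in a vector corresponds to hour $h$ of the day, and uses a hypothetical helper `open_mask(start, end)` that marks hours `start` through `end - 1` as open. The opening hours passed to it are read off Table 4, and only three land-use types are shown.

```python
def open_mask(start=0, end=24):
    """Return a 24-element list with 1 for hours start..end-1 and 0 otherwise."""
    if not 0 <= start <= end <= 24:
        raise ValueError("expected 0 <= start <= end <= 24")
    return [1 if start <= h < end else 0 for h in range(24)]

# Subset of Table 4, expressed as opening hours (assumed index = hour of day).
IS_OPEN = {
    "Urban residential land": open_mask(),                       # open all day
    "Industrial land": open_mask(8, 18),                         # 08:00 to 18:00
    "Commercial service facilities land": open_mask(10, 22),     # 10:00 to 22:00
}

def is_open(land_use, hour):
    """True if the given land-use type is marked open during the given hour."""
    return bool(IS_OPEN[land_use][hour])

print(IS_OPEN["Commercial service facilities land"])
print(is_open("Industrial land", 7), is_open("Industrial land", 8))
```

Storing the start and end hours keeps the table compact and makes it harder to miscount entries in a 24-element list.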
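Table 5. Data name and source.

| Data Type | Data Name | Data Source | Year |
|---|---|---|---|
| Foundational geographic data | Population data | Government Statistics | 2020 |
| Foundational geographic data | Building data | Government Statistics | 2021 |
| Foundational geographic data | Land-use data | Government Statistics | 2020 |
| Foundational geographic data | Road network | Government Statistics | 2018 |
| Social perception data | Baidu heatmap | Huiyan of Baidu Maps | 2023 |
| Social perception data | Night light data of Luojia-01 | Hubei Data and Application Center of High Resolution Earth Observation System | 2018 |
| Social perception data | Point of interest | Amap | 2022 |
| Social perception data | Area of interest | Amap | 2023 |
| Validation data | WorldPop | https://hub.worldpop.org/ | 2020 |
| Validation data | GHS-POP | https://ghsl.jrc.ec.europa.eu/ | 2015 |
| Validation data | LandScan | https://landscan.ornl.gov/ | 2022 |
| Validation data | Anjuke’s community data | https://m.anjuke.com/bj/ | 2024 |
| Validation data | Baidu’s population data | Huiyan of Baidu Maps | 2023 |
| Validation data | Employed population data | Government Statistics | 2020 |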
Table 6. Changes in the proportion of the population with different activity patterns in the study area.

| Time ($t$) | Proportion of $pop_r$ | Proportion of $pop_w$ | Proportion of $pop_s$ |
|---|---|---|---|
| 0–7 | 92.27% | 0.46% | 7.27% |
| 8 | 73.95% | 8.65% | 17.40% |
| 9 | 58.64% | 17.55% | 23.81% |
| 10 | 34.26% | 25.84% | 39.91% |
| 11 | 32.66% | 27.82% | 39.52% |
| 12 | 33.21% | 24.64% | 42.15% |
| 13 | 34.15% | 22.88% | 42.98% |
| 14 | 33.67% | 25.15% | 41.18% |
| 15 | 32.01% | 27.28% | 40.71% |
| 16 | 31.77% | 28.36% | 39.87% |
| 17 | 32.96% | 28.19% | 38.85% |
| 18 | 34.02% | 26.49% | 39.49% |
| 19 | 44.69% | 12.20% | 43.11% |
| 20 | 55.18% | 11.24% | 33.58% |
| 21 | 67.61% | 0.94% | 31.45% |
| 22 | 71.35% | 0.75% | 27.90% |
| 23 | 87.41% | 0.51% | 12.08% |
Table 7. Correlation between the estimated results of various population distributions during the weak-perception period of communities built in different years and the number of households in the community.

| Years the Community Was Built | MDEFF | WorldPop | GHS-POP |
|---|---|---|---|
| (2014, 2024] | 0.720 | 0.563 | 0.587 |
| (2004, 2014] | 0.759 | 0.613 | 0.624 |
| Before 2004 | 0.807 | 0.606 | 0.600 |
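The margins of 0.157 and 0.133 quoted for the MDEFF model coincide numerically with the (2014, 2024] row of Table 7. A short calculation, with the table values entered by hand and hypothetical names throughout, shows the corresponding margin for every construction-period band:

```python
# Correlations from Table 7, keyed by construction-period band.
TABLE7 = {
    "(2014, 2024]": {"MDEFF": 0.720, "WorldPop": 0.563, "GHS-POP": 0.587},
    "(2004, 2014]": {"MDEFF": 0.759, "WorldPop": 0.613, "GHS-POP": 0.624},
    "Before 2004": {"MDEFF": 0.807, "WorldPop": 0.606, "GHS-POP": 0.600},
}

def gaps(row):
    """Difference between the MDEFF correlation and each reference dataset."""
    return {
        name: round(row["MDEFF"] - value, 3)
        for name, value in row.items()
        if name != "MDEFF"
    }

for band, row in TABLE7.items():
    print(band, gaps(row))
```

In these figures the margin over both reference datasets is largest for communities built before 2004.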
Table 8. Correlation between the main model and the ablation model estimation results and the employed population.

| Population with Different Activity Patterns | SWPP-HSTPE | Ablation Model |
|---|---|---|
| $pop_w$ | 0.862 | 0.743 |
| $pop_s$ | 0.624 | 0.590 |
| $pop_r$ | 0.571 | 0.552 |
Table 9. Comparison of different metrics between the baseline model and High-Spatiotemporal-Resolution Population Distribution Estimation Based on the Strong and Weak Perception of Population Activity Patterns (SWPP-HSTPE).

| Hour | Pearson (Baseline) | Pearson (SWPP-HSTPE) | MAE (Baseline) | MAE (SWPP-HSTPE) | RMSE (Baseline) | RMSE (SWPP-HSTPE) |
|---|---|---|---|---|---|---|
| 8 | 0.3274 | 0.2867 | 382.7017 | 379.0218 | 671.2818 | 670.0618 |
| 9 | 0.3299 | 0.2993 | 373.5400 | 370.9798 | 661.8940 | 661.3043 |
| 10 | 0.3239 | 0.3320 | 366.5802 | 359.5478 | 686.4726 | 676.9503 |
| 11 | 0.3353 | 0.3622 | 366.9983 | 358.8352 | 675.4161 | 660.6282 |
| 12 | 0.3086 | 0.3250 | 388.3766 | 381.4980 | 720.7615 | 710.2570 |
| 13 | 0.3089 | 0.3399 | 398.5180 | 389.7664 | 715.0393 | 701.1763 |
| 14 | 0.3224 | 0.3344 | 398.1758 | 391.0668 | 717.3915 | 706.8538 |
| 15 | 0.3249 | 0.3423 | 397.3794 | 389.2459 | 716.3840 | 704.3898 |
| 16 | 0.3215 | 0.3595 | 399.3398 | 391.1488 | 707.6874 | 692.0621 |
| 17 | 0.3282 | 0.3492 | 377.9474 | 370.3824 | 669.8795 | 658.6959 |
| 18 | 0.3176 | 0.3470 | 385.3712 | 377.0137 | 701.6371 | 687.4497 |
| 19 | 0.3082 | 0.3296 | 402.1709 | 395.9690 | 698.2332 | 688.5163 |
| 20 | 0.2922 | 0.3150 | 410.6992 | 402.4604 | 704.6128 | 694.4084 |
| 21 | 0.2735 | 0.2896 | 439.9255 | 431.7153 | 749.9794 | 740.5431 |
| 22 | 0.2419 | 0.2666 | 481.3857 | 471.6276 | 822.2811 | 810.9434 |
| 23 | 0.2316 | 0.2326 | 541.5071 | 533.5981 | 933.8147 | 926.0437 |

MAE: Mean Absolute Error; RMSE: Root Mean Square Error.
Bold values indicate the superior performance of the SWPP-HSTPE model.
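To relate Table 9 to the summary figures reported for the strong-perception period (a correlation roughly 4% higher and errors nearly 2% lower than the baseline), the sketch below averages each metric over the 16 listed hours and compares the means. Averaging over hours and then taking the relative change is an assumption made here; the aggregation actually used in the study is not stated in this section. Variable and function names are hypothetical.

```python
# Rows of Table 9:
# (hour, pearson_base, pearson_swpp, mae_base, mae_swpp, rmse_base, rmse_swpp)
TABLE9 = [
    (8, 0.3274, 0.2867, 382.7017, 379.0218, 671.2818, 670.0618),
    (9, 0.3299, 0.2993, 373.5400, 370.9798, 661.8940, 661.3043),
    (10, 0.3239, 0.3320, 366.5802, 359.5478, 686.4726, 676.9503),
    (11, 0.3353, 0.3622, 366.9983, 358.8352, 675.4161, 660.6282),
    (12, 0.3086, 0.3250, 388.3766, 381.4980, 720.7615, 710.2570),
    (13, 0.3089, 0.3399, 398.5180, 389.7664, 715.0393, 701.1763),
    (14, 0.3224, 0.3344, 398.1758, 391.0668, 717.3915, 706.8538),
    (15, 0.3249, 0.3423, 397.3794, 389.2459, 716.3840, 704.3898),
    (16, 0.3215, 0.3595, 399.3398, 391.1488, 707.6874, 692.0621),
    (17, 0.3282, 0.3492, 377.9474, 370.3824, 669.8795, 658.6959),
    (18, 0.3176, 0.3470, 385.3712, 377.0137, 701.6371, 687.4497),
    (19, 0.3082, 0.3296, 402.1709, 395.9690, 698.2332, 688.5163),
    (20, 0.2922, 0.3150, 410.6992, 402.4604, 704.6128, 694.4084),
    (21, 0.2735, 0.2896, 439.9255, 431.7153, 749.9794, 740.5431),
    (22, 0.2419, 0.2666, 481.3857, 471.6276, 822.2811, 810.9434),
    (23, 0.2316, 0.2326, 541.5071, 533.5981, 933.8147, 926.0437),
]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def relative_change(base_col, swpp_col):
    """Relative change of the hourly mean, SWPP-HSTPE versus baseline."""
    base = mean(row[base_col] for row in TABLE9)
    swpp = mean(row[swpp_col] for row in TABLE9)
    return (swpp - base) / base

pearson_change = relative_change(1, 2)
mae_change = relative_change(3, 4)
rmse_change = relative_change(5, 6)

# Hours in which the baseline has the higher correlation.
baseline_better_hours = [row[0] for row in TABLE9 if row[2] < row[1]]

print(f"Pearson: {pearson_change:+.2%}  MAE: {mae_change:+.2%}  RMSE: {rmse_change:+.2%}")
print("Baseline correlation higher at hours:", baseline_better_hours)
```

The per-hour comparison in the last step is useful alongside the averages, because an overall gain can hide individual hours in which the baseline correlates better.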
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Li, R.; Liu, G.; Li, H.; Xia, J. High-Spatiotemporal-Resolution Population Distribution Estimation Based on the Strong and Weak Perception of Population Activity Patterns. ISPRS Int. J. Geo-Inf. 2026, 15, 34. https://doi.org/10.3390/ijgi15010034


Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.