Population Mapping with Multisensor Remote Sensing Images and Point-Of-Interest Data

Fine-resolution population distribution mapping is necessary for many purposes, which cannot be met by aggregated census data due to privacy. Many approaches utilize ancillary data that are related to population density, such as nighttime light imagery and land use, to redistribute the population from census to finer-scale units. However, most of the ancillary data used in the previous studies of population modeling are environmental data, which can only provide a limited capacity to aid population redistribution. Social sensing data with geographic information, such as point-of-interest (POI), are emerging as a new type of ancillary data for urban studies. This study, as a nascent attempt, combined POI and multisensor remote sensing data into new ancillary data to aid population redistribution from census to grid cells at a resolution of 250 m in Zhejiang, China. The accuracy of the results was assessed by comparing them with WorldPop. Results showed that our approach redistributed the population with fewer errors than WorldPop, especially at the extremes of population density. The approach developed in this study—incorporating POI with multisensor remotely sensed data in redistributing the population onto finer-scale spatial units—possessed considerable potential in the era of big data, where a substantial volume of social sensing data is increasingly being collected and becoming available.


Introduction
High-resolution population distribution data are essential in addressing a wide range of critical issues, such as vulnerability assessment [1,2], urban planning [3,4], emergency management [5], and public health [6,7]. In most countries worldwide, commonly available census information on population number and composition is aggregated over administrative units, such as provinces, counties, townships, census tracts, and block groups. The usefulness of these census data is limited by the spatial heterogeneity of population distribution within administrative units [8]. Meanwhile, both the availability and quality of environmental data are increasing. This mismatch between the development of demographic and socioeconomic data and that of natural science data, especially at fine levels of granularity, has hindered the advancement of decision making in many aspects, such as resource allocation [9] and disease prevention [10], and, more broadly, the integration of natural and social sciences [11]. Therefore, the development of efficient methods for accurately modeling fine-scale population distribution is urgently needed.
A number of approaches have been developed to disaggregate census population data, the most reliable population data source worldwide, onto fine-scale grids, with the value of each grid cell representing the population number within that cell. Examples include areal weighting interpolation [12], pycnophylactic interpolation [13], and dasymetric mapping [11,14], which outperforms the other approaches by utilizing high-quality ancillary data to redistribute the population over space [15]. Remote sensing data, such as land cover data, have been widely used in dasymetric mapping approaches as ancillary information on where people may live [11,16].
Since the late 1990s, satellite-derived nighttime light (NTL) data have been proven to be a reliable proxy for population distribution [17,18]. The NTL dataset from the US Air Force Defense Meteorological Satellite Program Operational Linescan System (DMSP/OLS) is a widely used product for dasymetric mapping [19–22]. Despite its benefits, the DMSP/OLS dataset has several limitations, such as its single spectral band, coarse spatial resolution (2.7 km), saturation in urban centers, and blooming effect [23–25]. For example, due to the blooming effect, the lit areas shown in the DMSP/OLS dataset are generally larger than actual urban areas [26,27]. Several studies have sought to overcome these limitations, for example by combining NTL and land cover data to improve the representation of population distribution [25,28–30]. Using such data fusion approaches, a pixel-based elevation-adjusted human settlement index (EAHSI) has been produced on the basis of NTL, the enhanced vegetation index (EVI), and a digital elevation model (DEM). The EAHSI was used as ancillary data to generate a population density map with a spatial resolution of 250 m for Zhejiang province, China [31]. However, some industrial areas with a high EAHSI value are actually less populated than expected, which has led to a considerable degree of population overestimation.
Social sensing data, which are becoming extremely popular in the era of big data, could be a potential solution to improve the accuracy of population products generated purely on the basis of environmental ancillary data. For example, point-of-interest (POI) data are among the most commonly used social sensing datasets in urban studies. Each POI with geographic coordinates generally represents a functional feature of the built environment. Certain POI types associated with more human activities may indicate better livability and higher population density than other types [32]. Recently, POI data have been used as ancillary data to enhance population estimation over relatively small areas [3,4]. However, these data have not been combined with other data sources for high-resolution population modeling.
Here, we incorporated POIs with multisource remote sensing data to further improve the accuracy of population modeling. The resulting population dataset was compared with a widely used global population product. This study contributes to the field of high-resolution population modeling by utilizing an innovative combination of remote sensing and social sensing data to refine population distribution. With both types of sensing data increasingly becoming available, the approach proposed in this study could lead to the development of better predictive tools for population estimation.

Study Area
The study area was Zhejiang, which is located on the southeastern coast of China (south of Shanghai), with a total land area of approximately 101,800 km² and a long (6484 km) coastline in the eastern part (Figure 1a). With approximately 54.4 million permanent residents at the end of 2010 (the latest census year in China), Zhejiang is the 10th most populated province and has the 4th largest gross domestic product in China. Hills and mountains cover 70.4% of the land in Zhejiang, with only 23.2% of the land covered by plains and basins. The majority of the population in Zhejiang resides in the northern plains and eastern coastal areas (Figure 1b). The hierarchy of the administrative units in Zhejiang from coarse to fine includes 11 cities, 90 counties, and 1520 townships.

Data Sources and Preprocessing
The seven datasets used in this study were all produced in 2010 and obtained from different sources (Table 1). The township-level (equivalent to level 4 of the Global Administrative Unit Layer defined by the Food and Agriculture Organization) population data and administrative unit boundaries were obtained and combined as the original population data source. For accuracy assessment, the Zhejiang part of a global gridded population dataset, that is, WorldPop with a spatial resolution of 100 m [33], was used. The POI data, comprising 386,178 POIs located in Zhejiang and falling within 20 categories (Table 2), were obtained from Baidu Map Services (http://map.baidu.com), which is the largest and most widely used web map service provider in China [3].
The Moderate Resolution Imaging Spectroradiometer EVI products (MOD13Q1) at a spatial resolution of 250 m, which were available every 16 days in 2010, were downloaded from the US Geological Survey. Compared with the normalized difference vegetation index, which is a well-known conventional vegetation index, EVI is more responsive to canopy structural variations [34]; therefore, it is likely to avoid saturation in the southern and western areas of Zhejiang, where vegetation is extremely dense. To remove cloud effects, the annual maximum EVI (EVImax) was produced for each grid cell by applying raster math calculations to the 23 EVI images of the year [31]. The NTL data for 2010 were obtained from a DMSP/OLS stable light image composite at a spatial resolution of 1 km, which is produced by the National Oceanic and Atmospheric Administration's National Centers for Environmental Information. The digital number (DN) values in an NTL image, varying from 0 to 63, represent the average brightness of NTL in 2010, except for 63, which was assigned to saturated pixels. The Advanced Spaceborne Thermal Emission and Reflection Radiometer Global DEM version 2 at a spatial resolution of 30 m was downloaded from the US Land Processes Distributed Active Archive Center. Both the NTL and DEM data were resampled to 250 m through bilinear interpolation to spatially match the EVI data. All three remote sensing images were reprojected to the Albers Conical Equal Area Projection and then clipped by the Zhejiang boundary.
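The annual maximum composite described above can be illustrated with a minimal sketch. It assumes that each 16-day EVI image has already been read into a small grid of numbers and that cloud-masked cells are marked with None; the function name `annual_max_composite` and the toy values are hypothetical and are not taken from the study's processing chain.

```python
from typing import List, Optional

# A grid is a list of rows; None marks a missing or cloud-masked cell (assumption).
Grid = List[List[Optional[float]]]

def annual_max_composite(images: List[Grid]) -> Grid:
    """Return the per-cell maximum over a stack of equally sized EVI grids."""
    rows, cols = len(images[0]), len(images[0][0])
    composite: Grid = []
    for r in range(rows):
        out_row: List[Optional[float]] = []
        for c in range(cols):
            # Keep only the valid observations of this cell across all dates.
            valid = [img[r][c] for img in images if img[r][c] is not None]
            out_row.append(max(valid) if valid else None)
        composite.append(out_row)
    return composite

# Three toy 2 x 2 grids stand in for the 23 images of a year.
stack = [
    [[0.20, 0.55], [None, 0.10]],
    [[0.35, 0.40], [0.60, None]],
    [[0.30, 0.70], [0.45, None]],
]
evi_max = annual_max_composite(stack)
```

Taking the maximum favors the clearest observation of each cell, because clouds generally depress vegetation index values.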

Methodology
The objective of our study was to spatially disaggregate census data at the township level into each pixel to produce a population distribution map with a fine spatial resolution (i.e., 250 m × 250 m). We adopted an improved linear regression-based method that combined the multisource remote sensing images and POIs. The major steps of this improved method are shown in the flowchart (Figure 2).

Generating an EAHSI Image
An EAHSI image covering Zhejiang for the year 2010, with a spatial resolution of 250 m, was generated from the EVImax value and the resampled NTL and DEM layers, following the formulation given in a previous study [31]. In that formulation, e is approximately equal to 2.71828, and NTLnor is the normalized DN value of the NTL image, calculated as NTLnor = (NTL − NTLmin) / (NTLmax − NTLmin), where NTLmax and NTLmin are the maximum and minimum NTL values in the study area, respectively. Detailed information about EAHSI can be found in that study [31].
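The normalization of the NTL layer can be sketched as a min-max rescaling, which is the form that the definitions of NTLmax and NTLmin suggest. The function name `normalize_ntl` and the toy DN values are hypothetical; the EAHSI formula itself is not reproduced here and should be taken from [31].

```python
from typing import List

def normalize_ntl(dn: List[List[float]]) -> List[List[float]]:
    """Rescale NTL digital numbers to the range 0..1 using the study-area min and max."""
    flat = [value for row in dn for value in row]
    ntl_min, ntl_max = min(flat), max(flat)
    if ntl_max == ntl_min:
        # A constant image carries no contrast; return zeros instead of dividing by zero.
        return [[0.0 for _ in row] for row in dn]
    span = ntl_max - ntl_min
    return [[(value - ntl_min) / span for value in row] for row in dn]

# Toy 2 x 2 image using the 0..63 DN range of the DMSP/OLS stable light product.
ntl = [[0, 21], [42, 63]]
ntl_nor = normalize_ntl(ntl)
```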

Generating a POI Density Layer
Spearman's correlation analysis was adopted to examine the relationship between the number of POIs in each category and the population at the township level (Table 2). Kernel density estimation (KDE) is a well-established method for analyzing the first-order properties of a point event distribution [35–37] and identifying hot spots [38–40]. KDE was used to convert each category of discrete POIs into a smooth and continuous density surface. Previous studies showed that the statistical results are not significantly affected by the choice of the kernel function; hence, bandwidth is the main parameter for KDE [35,36]. The planar KDE with a quartic kernel function, which is one of the most commonly used functions [41], was implemented in this study. The township-level boundary map was used to summarize the POI density of each category to level 4 and to fit the linear relationship between the summed POI density and the population counts. We tested bandwidths ranging from 500 m to 8000 m at an interval of 100 m. The correlation coefficients were relatively high and fluctuated only slightly when the bandwidth was between 2000 m and 5000 m. The POI densities of most categories had the largest correlation coefficients with the population at a bandwidth of 3000 m. Thus, we selected 3000 m as the bandwidth of the KDEs for all POI categories.
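A planar KDE with a quartic kernel can be sketched as follows. The sketch assumes projected coordinates in metres and uses the common two-dimensional quartic form, in which each POI within the bandwidth h contributes 3 / (π h²) × (1 − d²/h²)²; the function name `quartic_kde` and the sample coordinates are hypothetical, and the study may have used a GIS tool with a slightly different scaling.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]

def quartic_kde(points: List[Point], cell_centers: List[Point], bandwidth: float) -> List[float]:
    """Quartic-kernel density (points per square metre) at each cell centre."""
    h2 = bandwidth * bandwidth
    scale = 3.0 / (math.pi * h2)  # normalizes the kernel so it integrates to 1
    density = []
    for cx, cy in cell_centers:
        total = 0.0
        for px, py in points:
            d2 = (cx - px) ** 2 + (cy - py) ** 2
            if d2 < h2:  # POIs beyond the bandwidth contribute nothing
                total += scale * (1.0 - d2 / h2) ** 2
        density.append(total)
    return density

# Two nearby POIs and one distant POI, evaluated at two cell centres with h = 3000 m.
pois = [(1000.0, 1000.0), (1500.0, 1200.0), (9000.0, 9000.0)]
centers = [(1000.0, 1000.0), (5000.0, 5000.0)]
dens = quartic_kde(pois, centers, bandwidth=3000.0)
```

The double loop is quadratic in cost and is only meant to show the computation; a production workflow over several hundred thousand POIs would normally rely on a spatial index or a GIS kernel density tool.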
Principal component analysis, which is one of the most commonly used dimension-reducing techniques that can reduce a large number of correlated variables to a small number of uncorrelated ones [42], was adopted to combine multiple POI kernel density maps into one composite POI density layer.
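The combination of several density layers into a first principal component can be sketched as below. The sketch assumes covariance-based PCA on mean-centered layers (the text does not state whether the layers were standardized) and substitutes power iteration for a full eigendecomposition to keep the example dependency-free; `first_principal_component` and the toy layers are hypothetical.

```python
from typing import List, Tuple

def first_principal_component(
    layers: List[List[float]], iterations: int = 200
) -> Tuple[List[float], List[float], float]:
    """Return (scores per cell, loadings per layer, explained-variance ratio) of PC1."""
    k, n = len(layers), len(layers[0])
    means = [sum(layer) / n for layer in layers]
    centered = [[v - m for v in layer] for layer, m in zip(layers, means)]
    # k x k covariance matrix between the layers (assumes they are not all constant).
    cov = [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / (n - 1) for j in range(k)]
        for i in range(k)
    ]
    # Power iteration converges to the eigenvector of the largest eigenvalue.
    vec = [1.0] * k
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = sum(x * x for x in nxt) ** 0.5
        vec = [x / norm for x in nxt]
    eigenvalue = sum(vec[i] * sum(cov[i][j] * vec[j] for j in range(k)) for i in range(k))
    explained = eigenvalue / sum(cov[i][i] for i in range(k))
    scores = [sum(vec[i] * centered[i][c] for i in range(k)) for c in range(n)]
    return scores, vec, explained

# Three toy density layers (one value per grid cell) that rise together.
layers = [
    [0.1, 0.4, 0.9, 1.6],
    [0.2, 0.5, 1.1, 1.9],
    [0.0, 0.3, 0.8, 1.7],
]
scores, loadings, explained = first_principal_component(layers)
```

When the layers are strongly correlated, as POI categories tend to be, the first component carries most of the variance, which is the situation in which keeping only that component is reasonable.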

Mapping Population
Given that both POI density and EAHSI values linearly correlate with the population count at the township level (graphical abstract), a multiple linear regression model (Equation (1)) was built on these two variables, where POP represents the estimated population count at the township level, and the coefficients "a" and "b" are the average values obtained from 10 repeated trials of 10-fold cross-validation. In each trial, the township census data were randomly divided into 10 groups; the census data of nine groups were used to train the model, and the census data of the remaining group were used to evaluate it. The trials were repeated 10 times to obtain stable coefficients "a" and "b". To show the advantage of fusing POIs for population estimation, we also conducted the same cross-validation for a linear regression model built using only EAHSI. Table 3 summarizes the 100 resulting model fits for EAHSI and POI-EAHSI. Finally, the gridded EAHSI and POI layers were used to disaggregate POP at the township level to predict pixel-level population counts.
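The cross-validation and disaggregation steps can be sketched under explicit assumptions. The model form POP ≈ a × POI + b × EAHSI without an intercept is a guess at the shape of the regression, not the paper's stated equation, and the proportional allocation in `disaggregate` is a common dasymetric convention rather than a confirmed detail of the study. All function names and the synthetic data are hypothetical, and the held-out evaluation of each fold is omitted for brevity.

```python
import random
from typing import List, Sequence, Tuple

def fit_no_intercept(x1: Sequence[float], x2: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Least-squares a, b for y ~ a*x1 + b*x2, solved from the 2 x 2 normal equations."""
    s11 = sum(u * u for u in x1)
    s22 = sum(v * v for v in x2)
    s12 = sum(u * v for u, v in zip(x1, x2))
    s1y = sum(u * t for u, t in zip(x1, y))
    s2y = sum(v * t for v, t in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (s1y * s22 - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

def repeated_kfold_coefficients(poi, eahsi, pop, k: int = 10, repeats: int = 10, seed: int = 0):
    """Average the coefficients fitted on the training part of every fold of every repeat."""
    rng = random.Random(seed)
    idx = list(range(len(pop)))
    fits = []
    for _ in range(repeats):
        rng.shuffle(idx)
        for fold in (idx[i::k] for i in range(k)):
            held_out = set(fold)
            train = [i for i in idx if i not in held_out]
            fits.append(fit_no_intercept(
                [poi[i] for i in train], [eahsi[i] for i in train], [pop[i] for i in train]))
    a = sum(f[0] for f in fits) / len(fits)
    b = sum(f[1] for f in fits) / len(fits)
    return a, b

def disaggregate(total: float, poi_cells: List[float], eahsi_cells: List[float], a: float, b: float) -> List[float]:
    """Split one township total across its cells in proportion to a*POI + b*EAHSI (assumption)."""
    weights = [max(a * p + b * e, 0.0) for p, e in zip(poi_cells, eahsi_cells)]
    weight_sum = sum(weights)
    if weight_sum == 0.0:
        return [0.0] * len(weights)
    return [total * w / weight_sum for w in weights]

# Synthetic townships generated from known coefficients, so the fit can be checked.
rng = random.Random(1)
poi = [rng.uniform(0.0, 10.0) for _ in range(60)]
eahsi = [rng.uniform(0.0, 10.0) for _ in range(60)]
pop = [50.0 * p + 25.0 * e for p, e in zip(poi, eahsi)]
a, b = repeated_kfold_coefficients(poi, eahsi, pop)
cells = disaggregate(1000.0, [0.0, 1.0, 3.0], [1.0, 1.0, 2.0], a, b)
```

Proportional allocation preserves the census total of each township, which is the defining property of dasymetric mapping.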

Accuracy Assessment
To show that the use of POI data can significantly increase the accuracy of population mapping, we compared our population map produced from POI and EAHSI (referred to as the POI-EAHSI population map) with the map produced from EAHSI only and with the WorldPop gridded population product. For accuracy assessment, the out-of-sample predictions of the POI-EAHSI and EAHSI models were averaged, the WorldPop population data were aggregated by township, and all three were then compared with the census data. Summary statistics, including the root mean square error (RMSE), the RMSE divided by the mean township population count (%RMSE), the mean absolute error, and the mean relative error (MRE), were calculated for the three methods.
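The four summary statistics can be computed as in the following sketch. It assumes that MRE is the mean of the absolute errors divided by the census counts, which is one common definition; `accuracy_summary` and the toy township counts are hypothetical.

```python
import math
from typing import Dict, Sequence

def accuracy_summary(estimated: Sequence[float], census: Sequence[float]) -> Dict[str, float]:
    """RMSE, %RMSE, MAE, and MRE of township estimates against census counts."""
    n = len(census)
    errors = [e - c for e, c in zip(estimated, census)]
    rmse = math.sqrt(sum(d * d for d in errors) / n)
    mae = sum(abs(d) for d in errors) / n
    # Relative error per township; census counts are assumed to be positive.
    mre = sum(abs(d) / c for d, c in zip(errors, census)) / n
    mean_census = sum(census) / n
    return {"rmse": rmse, "pct_rmse": 100.0 * rmse / mean_census, "mae": mae, "mre": mre}

census_counts = [100.0, 200.0, 300.0, 400.0]
estimates = [110.0, 190.0, 330.0, 360.0]
summary = accuracy_summary(estimates, census_counts)
```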

Population Density
Table 2 shows that all categories of POIs were positively related to population counts. We combined the 20 kernel POI density maps into one composite POI density layer. Only the first principal component image (Figure 3) was used because its contribution rate to the cumulative sum of squares reached 88.07%. Table 3 shows the summary of 10 repeated trials of 10-fold cross-validation, which indicated that the multiple linear regression model based on fused POIs and EAHSI estimated population distribution credibly, with a mean coefficient of determination (R²) of 0.78, whereas the mean R² obtained using only EAHSI was 0.55. The %RMSE and MRE obtained by incorporating POIs were also significantly smaller than those obtained using only EAHSI.
According to Equation (1) with the a and b values of 52.61 and 25.61, respectively, the gridded population map of Zhejiang for 2010 at a spatial resolution of 250 m was generated. Most of the population lived in the urban agglomerations around Hangzhou Bay, the Wenzhou-Taizhou coastal region, and the Jinhua-Quzhou basin. The Hangzhou and Ningbo regions in northern Zhejiang were the most heavily populated regions (Figure 4). The spatial distribution of the predicted population for Zhejiang was generally consistent with the results of a previous study [31]. However, the population distribution in the present study revealed apparent spatial heterogeneity and rich information in urban centers due to the incorporation of POIs. The population density map can be widely used in numerous activities, such as demographic studies, decision making, spatial planning, and emergency response in Zhejiang.

Accuracy Assessment
Figure 5 shows the relationship between estimated and census population counts at the township level for Zhejiang. Each point represents the estimated and actual population counts within a township unit. The relationship between the predicted gridded estimates and the census population totals was substantially more linear for the POI-EAHSI method than for the WorldPop method. The POI-EAHSI method also showed a higher correlation between estimated and census values (R² = 0.88) than the WorldPop dataset (R² = 0.79). Significant decreases in MRE (30.46%) and RMSE (1.78) confirmed the improved performance of our proposed method.
We compared the distribution of the residuals of population estimation by the POI-EAHSI (Figure 6a) and EAHSI methods (Figure 6b). The population residual was calculated by subtracting the census data from the out-of-sample predicted population. A negative residual implied that the predicted value was an underestimation, and a positive residual indicated an overestimation. The same color bar was used for all maps so that the error distributions could be compared easily. In general, EAHSI caused population overestimation in most regions of Zhejiang (Figure 6b). The POI-EAHSI method significantly decreased the errors and improved the model precision over the entire province, especially in the northern part of Zhejiang (Figure 6a). In the southeast coastal regions of Zhejiang, significant population underestimation was observed (Figure 6a,b). Extensive land reclamation in the coastal areas of Taizhou and Wenzhou for real estate development caused a mismatch between satellite images and administrative boundaries, thereby influencing the results of zonal statistics. The long coastline and the numerous islands in the southeastern coastal regions resulted in the discarding of pixels in zonal statistics, thereby contributing to population underestimation. Compared with WorldPop data, the improvement of the POI-EAHSI method was also apparent across most township units in Zhejiang (Figure 6a,c). A previous study suggested that the WorldPop mainland China dataset has high accuracy [43]. Therefore, we further compared the POI-EAHSI results with the WorldPop dataset.
Figure 7 shows the model fit between the predicted population density of each township unit and the original census population density at the same census unit level for 2010, for the POI-EAHSI results and the WorldPop dataset. According to the census population density, all the township units in Zhejiang were classified into three groups, namely, top 20%, medium 60%, and low 20% (red, green, and blue dots, respectively, in Figure 7). There was a good fit at medium population densities for both POI-EAHSI and WorldPop, with a similar explained variance (R² = 0.72 vs. R² = 0.65). However, there were larger errors at extreme population densities (Figure 7). At high population density, an underestimation of the original census data was observed, whereas significant overestimation was observed at extremely low population density, especially for WorldPop. These types of errors were also observed in previous dasymetric modeling studies [44–46]. However, POI-EAHSI showed significantly higher accuracy than WorldPop in both tails of population density, especially for the low tail (R² = 0.57 vs. R² = 0.15).

Figure 7. Scatterplots between the predicted population density on a log10-log10 scale at the township unit and the original census population density at the same unit level. Red points are township units in the top 20% of population density, and blue points are units in the low 20% tail. The comparison of the validation unit counts divided by unit area (population density) on an ln-ln scale with those estimated from maps produced using county census units.
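The three-group comparison of township densities can be illustrated with a short sketch. It assumes that the groups are cut at the 20th and 80th percentiles of census density and that R² is the squared Pearson correlation of log10 densities; both choices are assumptions about how the figure was produced, and the function names and sample densities are hypothetical.

```python
import math
from typing import Dict, List, Sequence

def split_by_density(census_density: Sequence[float]) -> Dict[str, List[int]]:
    """Indices of the lowest 20%, middle 60%, and highest 20% of units by census density."""
    order = sorted(range(len(census_density)), key=lambda i: census_density[i])
    tail = len(order) // 5
    return {"low": order[:tail], "medium": order[tail:len(order) - tail], "high": order[len(order) - tail:]}

def r_squared_log10(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Squared Pearson correlation between log10 predicted and log10 observed densities."""
    x = [math.log10(v) for v in observed]
    y = [math.log10(v) for v in predicted]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

# Ten toy township densities (persons per square kilometre) and matching predictions.
census_density = [5.0, 12.0, 30.0, 80.0, 150.0, 400.0, 900.0, 2500.0, 6000.0, 15000.0]
predicted_density = [9.0, 15.0, 28.0, 95.0, 140.0, 380.0, 1000.0, 2300.0, 5000.0, 11000.0]
groups = split_by_density(census_density)
r2_medium = r_squared_log10(
    [predicted_density[i] for i in groups["medium"]],
    [census_density[i] for i in groups["medium"]],
)
```

Computing the fit separately for each group makes errors in the tails visible that a single province-wide R² would average away.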

Discussion
Spatially accurate data on human population distributions are vital for many applied and theoretical studies. Dasymetric mapping techniques using NTL data as the ancillary information have been increasingly used to disaggregate census population to a finer spatial level. However, the uncertainties in the relationships between NTL and human population distribution should be recognized. NTL emissions depend on affluence, culture, and economic structure [17,47]. In many cities of developed countries, commercial advertising, sports facilities, and security lighting often represent additional sources of NTL emissions. Therefore, NTL brightness does not directly or consistently reflect population distribution. A number of experiments demonstrated that remote sensing data, such as land use and NTL data, cannot be used to conduct accurate population estimation at a fine scale, especially in a complex urban environment [4]. The underestimation in high-population-density areas and the overestimation in low-population-density areas due to spatial nonstationarity is a frequently recurring problem in dasymetric mapping studies [44–46]. The derivation of global parameters in this method imposes an averaging effect on the disaggregation of the population that masks the intrinsic heterogeneity in population distribution characteristics [46].
Recently, social sensing data have proved to be useful in population estimation. Previous studies used mobile phone data [48], Twitter [49], or OpenStreetMap data [50] to improve population mapping. However, volunteered geographic information (VGI)-based data such as OpenStreetMap data in China are far from complete [51]. Mobile phone data are difficult to obtain for a large study area. Twitter cannot be used in China. A few studies have started to use POIs to estimate population distributions at a fine spatial resolution over small areas, such as a single city [4] or urban districts [3]. This study built a population model to disaggregate census data and obtained a high-precision population map at a fine spatial resolution of 250 m by fusing multisource remote sensing data and POIs, using Zhejiang, China as a case study. KDE and principal component analysis were used to generate a POI density map, which is closely related to human daily life and population distribution in urbanized areas. The results showed that POIs can be considered useful ancillary data for population estimation even at the regional scale. Compared with the WorldPop global population dataset, the method in this study, which fuses information from multisource remote sensing data and POI data, can better reveal the actual population distribution at a fine scale, especially in urbanized areas.
A human settlement index based on NTL and a vegetation index can effectively map human settlements [52] and impervious surfaces [53], but distinguishing commercial, residential, and industrial areas with it is difficult. POIs can supply supplementary information to identify urban functional zones [54–57]. Compared with EAHSI, POIs that are mainly located in urban areas and are highly related to human daily life can represent an area with high population density and exclude industrial regions [58]. In addition, POI data have a simple data structure compared with other multidimensional data. Therefore, POIs can be easily used to refine population estimation, especially in urbanized areas. The incorporation of POIs decreased the weight of commercial and industrial areas, which improved the population prediction.
The quality and the appropriateness of the ancillary data used influenced the accuracy of the population estimation. One of the uncertainties of our method is the quality of POI data because POI descriptions are generally provided by volunteers, and inaccurate descriptions are likely to occur. However, our POI data were obtained from a commercial navigation database, were collected by trained persons, and were subject to strict inspection. In addition, these POI data are used in the Baidu Map and navigation app. Therefore, the positional and thematic accuracy of Baidu POI data is reliable. Most POIs concentrate in urban areas, which most likely limits the improvement in population estimation offered by our method to urban regions. In rural areas and urban fringe areas, many POIs are unreported, and POI density is relatively low. Therefore, POI data may not be an effective measure of population density in non-urban areas. Moreover, the correlations between some categories of POI and population density may vary among cities, since urban fabric patterns vary across regions. In this regard, reproductions of this study in other geographic areas or countries need to investigate the spatial patterns of POIs that reflect population distribution. Finally, although POIs can identify the footprints of human activities, they cannot provide the extent of these activities. The lack of information on the volume of buildings may lead to population underestimation or overestimation [44].

Conclusions
Social sensing data, such as POI, directly reflect human activities and contain rich information on place semantics, and they significantly complement traditional remote sensing data in the context of population estimation. Because social sensing and remote sensing data capture different aspects of human activities, integrating these two types of data is a promising research topic. Our approach used the information from POIs and multisource remote sensing data to characterize the population distribution in detail and thereby improve population estimation. The POI-EAHSI model, which incorporates POI data, overcame the systematic overestimation and underestimation reported in previous studies and produced the most accurate results among the compared estimates, especially at the extremes of population density. This paper provides a new approach for the rapid and accurate estimation of the human population at the regional scale. The integrated approach can potentially accommodate more remote sensing data and new types of social sensing big data to estimate population in more flexible ways in the future, such as age-specific population estimation [59]. The value of multisource social sensing data in population estimation will be explored in future studies to further improve the accuracy of population mapping.
Remote Sens. 2019, 10, x FOR PEER REVIEW
The administrative units in Zhejiang, from coarse to fine, include 11 cities, 90 counties, and 1520 townships.

Figure 2. Flowchart of disaggregating census population data into the 250 m grid cells.

Figure 3. Kernel density map of the first principal component image.

Figure 5. Scatterplots between the census population counts and (a) POI-EAHSI population estimates, (b) EAHSI-derived population estimates, and (c) WorldPop population at the township level for Zhejiang in 2010.

Figure 7. Scatterplots, on a log10-log10 scale, between the predicted population density at the township level and the original census population density at the same unit level. Red points are township units in the top 20% of population density, and blue points are units in the bottom 20%. The comparison of the validation unit counts divided by unit area (population density) on an ln-ln scale with those estimated from maps produced using county census units.

Table 1. List of datasets (all produced in 2010) used in this study.

Table 2. Spearman's correlation coefficients between each point-of-interest (POI) category and census population at the township level (ranked in descending order).
All correlation coefficients were significant at the 0.01 level.
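
As an illustration of how a coefficient of the kind reported in Table 2 can be computed, the following minimal sketch calculates Spearman's rank correlation using only the Python standard library. It is not the authors' code: the helper names `average_ranks` and `spearman_rho` and the five township values are hypothetical, and significance testing is omitted.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)  # undefined (division by zero) if either input is constant


# Hypothetical POI counts of one category and census totals for five townships.
poi_counts = [12, 85, 40, 3, 230]
census_population = [4100, 30500, 9800, 2500, 51000]
rho = spearman_rho(poi_counts, census_population)
print(f"Spearman's rho = {rho:.3f}")
```

Because the coefficient depends only on ranks, it captures any monotonic association between POI counts and population, not just a linear one.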