New Perspectives for Mapping Global Population Distribution Using World Settlement Footprint Products

: In the production of gridded population maps, remotely sensed, human settlement datasets rank among the most important geographical factors to estimate population densities and distributions at regional and global scales. Within this context, the German Aerospace Centre (DLR) has developed a new suite of global layers, which accurately describe the built-up environment and its characteristics at high spatial resolution: (i) the World Settlement Footprint 2015 layer (WSF-2015), a binary settlement mask; and (ii) the experimental World Settlement Footprint Density 2015 layer (WSF-2015-Density), representing the percentage of impervious surface. This research systematically compares the e ﬀ ectiveness of both layers for producing population distribution maps through a dasymetric mapping approach in nine low-, middle-, and highly urbanised countries. Results indicate that the WSF-2015-Density layer can produce population distribution maps with higher qualitative and quantitative accuracies in comparison to the already established binary approach, especially in those countries where a good percentage of building structures have been identiﬁed within the rural areas. Moreover, our results suggest that population distribution accuracies could substantially improve through the dynamic preselection of the input layers and the correct parameterisation of the Settlement Size Complexity (SSC) index.


Introduction
According to the latest revision of the United Nations (UN), World Population Prospects, the world's population is projected to grow from 7.7 billion in 2019 to 10.9 billion in 2100 [1]. Considered part of the four global demographic "megatrends", population growth next to population ageing, migration and urbanization, is an important indicator for economic, social and environmental development [2]. For this reason, accurate knowledge of the size, location, and distribution of the human population is fundamental for successfully achieving a sustainable future. An effective monitoring of global population change, allows implementing efficient government policies to allocate financial resources, plan interventions and quantify populations at risk. 3 of 24 uses the Global Human Settlement Layer built-up grids (GHSL-BUILT) [37], while the HRSL uses a binary mask of areas identified as human-made buildings extracted from very high-resolution satellite imagery. While this modelling approach is less complex and allows global transferability [33], the population mapping accuracies of these products largely depend on the complete identification of building structures and are affected by omission and commission errors [13].
In this context, the German Aerospace Center (DLR) has developed dedicated global layers and related analysis tools that describe the built-up environment and its characteristics with high accuracy and high spatial resolution. The first includes the Global Urban Footprint (GUF) dataset, which was released in 2016 [38]. The GUF was produced based on an operational framework that automatically processed and analysed over 180,000 TerraSAR-X/TanDEM-X radar images collected during 2011-2013. It provides a global human settlement map at 12 m resolution [39], which up to now, has been employed by more than 500 institutions for a broad scope of applications [40], including studies focused on population disaggregation [12,41,42]. Currently, the DLR is working on a suite of follow-on products-the World Settlement Footprint (WSF)-with an extended semantic depth, based on the joint analysis of Landsat 8 and Sentinel 1 optical and radar imagery [43]. The first two releases of this new suite will include: (i) a binary settlement mask named WSF-2015 outlining settlements globally at 10-m resolution; and (ii) the experimental WSF-2015-Density layer. which estimates the percent of impervious surface for the pixels labelled as settlement in the WSF-2015 [43].
Impervious surfaces are primarily associated with streets, sidewalks and building structures. They can be defined as surfaces consisting of materials such as asphalt, concrete or stone that seal the soil surface, eliminating water infiltration [44]. Impervious surfaces extracted using different remote sensing methodologies have been examined in a small number of population distribution studies [45,46]. In these studies, the authors have demonstrated that impervious surfaces are highly correlated to population counts, making them good predictors of population distribution. Nevertheless, these studies have only focused on limited areas, thus leading to results and methodologies that are not globally transferable. In the same way, in producing population distribution maps based on settlement extent products, Reed et al. [34] showed that an initial version of the WSF-2015 layer was capable of producing population distribution maps with predictive accuracies higher than the GHSL layer and relatively close to the HRSL layer, employing different population distribution methods. However, while currently the HRSL layer is available only for a limited number of countries, the novel WSF-2015 and the experimental WSF-2015-Density layers have the potential to become the ideal covariates to support population disaggregation methods and to produce global population distribution datasets with improved accuracy and higher spatial resolution than those currently available.
Following this premise, the main goal of this research was to examine the suitability of the WSF-2015 and the-thus far experimental-WSF-2015-Density layers as input covariates for the development of a new global population distribution dataset. Population distribution maps were produced using a dasymetric mapping approach in combination with the finest population census data available at global scale at the time of writing. We specifically focused on the systematic cross-comparison between the performance of the binary and the impervious layer, to investigate if quality and accuracy improvements in population disaggregation can be achieved with the WSF-2015-Density layer, compared to the already established binary approach that has been employed by other population datasets and their baseline settlement layers.
Through a comprehensive quantitative assessment, we evaluated the mapping performance of each covariate layer, addressing the influence of: (i) the spatial resolution of the input census data; (ii) the quality of the input covariate layers; and (iii) the spatial distribution of the built-up environment, on the final results.
The corresponding analyses were conducted for nine representative countries of different size and different levels of urbanisation and population aggregation.

WSF-2015 Layer
The WSF-2015 is a novel layer outlining the extent of human settlements globally at 10-m resolution. The dataset has been derived by jointly exploiting multi-temporal Sentinel-1 Synthetic Aperture Radar and Landsat-8 optical satellite imagery collected during 2014-2015, of which~107,000 and~217,000 scenes have been processed, respectively.
The basic underlying hypothesis is that the dynamics of human settlements over time are sensibly different with respect to those of all other non-settlement classes. Accordingly, for all the scenes available for the given target region, key temporal statistics (e.g., temporal maximum, minimum, mean, variance, etc.) have been concurrently computed for: (i) the original Sentinel-1 backscattering; and (ii) different spectral indices extracted from the Landsat-8 data after removing clouds and cloud shadows. Next, candidate training samples for the settlement and non-settlement class have been extracted on the basis of specific thresholds determined-based on extensive empirical analysis-for some of the resulting temporal features. Classification is then performed separately for the opticaland radar-based features by means of Support Vector Machines (SVMs) and, finally, the two outputs are properly combined. The WSF-2015 exhibited high accuracy and reliability, outperforming all other existing similar global layers. Specifically, this has been quantitatively demonstrated through an extensive validation exercise performed in collaboration with Google where 900,000 reference samples have been labelled by crowd-sourcing photointerpretation for a collection of 50 globally distributed test sites of 1 × 1 lat/lon degree size. The layer is currently available for online browsing on the ESA Urban-TEP platform; furthermore, a comprehensive description of the classification system and the validation results is provided in [43].

WSF-2015-Density Layer
The WSF-2015-Density is one of the first experimental developments of the WSF product and service portfolio, aiming to enhance the semantic and thematic scope of the WSF-2015; in particular, the layer describes the percent impervious surface (PIS) within areas categorised as settlements in the WSF-2015. Effectively mapping the PIS is of high importance to assess-among others-the risk of urban floods, the urban heat island phenomenon as well as the reduction of ecological productivity. Furthermore, it is generally considered as an effective proxy for the housing density, thus making it particularly suitable for supporting spatial population distribution [45][46][47]. The current processing methodology follows the approach originally described by Marconcini et al. [48] and is based on the assumption that a strong inverse relation exists between vegetation and impervious surfaces (i.e., the higher the presence of vegetation is, the lower the corresponding imperviousness is). Accordingly, the core idea is to compute and analyse for each pixel the temporal maximum of the Normalised Difference Vegetation Index (NDVI), which depicts the status at the peak of the phenological cycle. To this purpose, the NDVI available from the TimeScan dataset [40,49] has been used, which has been derived globally from Landsat-8 scenes acquired during 2014-2015. Figure 1 shows a subset of the WSF-2015-Density layer for Toluca state in Mexico. Values range between 0 and 100, with red and green tones highlighting high and low PIS, respectively.

Input Census Data
For this research, population census data for nine low-, middle-and highly urbanised countries [50] located in four different macro-regions of the world were collected to analyse how the differences in the level of spatial granularity of the available administrative boundaries and the variability in the morphology of built-up landscapes influence the accuracy of each covariate layer. To achieve these objectives, countries were selected on basis of the availability of population census data at different spatial aggregation levels. In other words, countries were selected only if the census data allowed for the spatial aggregation of the administrative boundaries up to four administrative levels.
The Center of International Earth Science Information Network (CIESIN) provided geographic administrative boundaries and corresponding population counts for Cambodia, Côte d'Ivoire, England, France, Germany, Malawi, Mexico, and Vietnam. CIESIN population data were selected for this research, as it has been used in the production of other population dataset such as GPWv4, GHS-POP, WorldPop and the HRSL. CIESIN collected census data at the highest spatial detail available from the results of the 2010 round of Population and Housing Censuses, which occurred between 2005 and 2014. CIESIN data include two types of population estimates: census-based and UN-adjusted, both estimated for the years 2000, 2005, 2010, 2015 and 2020. Initial population estimates were derived for each administrative unit by means of an exponential model fitted on at least two census counts for each country [17]. However, to allow for global comparisons, CIESIN adjusted the census counts to the target year of 2010, which were then then interpolated and extrapolated to produce the UN-adjusted estimates with the objective to correct for over-or under estimations [6,7]. The 2015 UN-adjusted estimates were used in this research.
For Myanmar, population data were collected from the Ministry of Immigration and Population in reference to the Population and Housing Census of 2014 [51] and was joined with publicly available geographic administrative boundaries [52]. The population data were released on May 2015 and the original population counts were used in this research.
For each country, administrative boundaries and population counts were aggregated at four levels of spatial resolution using attribute information stored within the data. Table 1 shows the total population for 2015 for each country as well as the official administrative unit nomenclature at each

Input Census Data
For this research, population census data for nine low-, middle-and highly urbanised countries [50] located in four different macro-regions of the world were collected to analyse how the differences in the level of spatial granularity of the available administrative boundaries and the variability in the morphology of built-up landscapes influence the accuracy of each covariate layer. To achieve these objectives, countries were selected on basis of the availability of population census data at different spatial aggregation levels. In other words, countries were selected only if the census data allowed for the spatial aggregation of the administrative boundaries up to four administrative levels.
The Center of International Earth Science Information Network (CIESIN) provided geographic administrative boundaries and corresponding population counts for Cambodia, Côte d'Ivoire, England, France, Germany, Malawi, Mexico, and Vietnam. CIESIN population data were selected for this research, as it has been used in the production of other population dataset such as GPWv4, GHS-POP, WorldPop and the HRSL. CIESIN collected census data at the highest spatial detail available from the results of the 2010 round of Population and Housing Censuses, which occurred between 2005 and 2014. CIESIN data include two types of population estimates: census-based and UN-adjusted, both estimated for the years 2000, 2005, 2010, 2015 and 2020. Initial population estimates were derived for each administrative unit by means of an exponential model fitted on at least two census counts for each country [17]. However, to allow for global comparisons, CIESIN adjusted the census counts to the target year of 2010, which were then then interpolated and extrapolated to produce the UN-adjusted estimates with the objective to correct for over-or under estimations [6,7]. The 2015 UN-adjusted estimates were used in this research.
For Myanmar, population data were collected from the Ministry of Immigration and Population in reference to the Population and Housing Census of 2014 [51] and was joined with publicly available geographic administrative boundaries [52]. The population data were released on May 2015 and the original population counts were used in this research.
For each country, administrative boundaries and population counts were aggregated at four levels of spatial resolution using attribute information stored within the data. Table 1 shows the total population for 2015 for each country as well as the official administrative unit nomenclature at each spatial aggregation level, the number of administrative units, the average area and the average spatial resolution (ASR). The ASR is calculated as the square root of each country total area divided by the number of administrative units, representing the effective resolution units within each country [3].

Population Distribution: Dasymetric Mapping Approach
Population distribution maps for 2015 were generated for each country at each administrative unit level using a dasymetric mapping approach, where population census data from administrative boundaries (source zones) are disaggregated into smaller areal units of fixed spatial resolution (target zones). The size of the target zones is normally defined by the pixel resolution of the different ancillary datasets employed to restrict and refine the distribution of the population within each administrative unit [53]. The estimated population per grid cell is defined in Equation (1): where Pop t is the population of the target zone, Pop s is the population of the source zone, A t represents the area of the target zone and W p is the weight of a grid cell within the target zone. With this modelling approach, population counts are maintained (volume-preserving property) at each original input source zone. In this research, two types of dasymetric mapping techniques were used. The first method is the traditional binary approach, which relies on the WSF-2015 layer to assign a weighting factor of 1 to built-up pixels and a 0 for non-built-up pixels. The second method uses the WSF-2015-Density layer to assign a weighting factor that ranges from 0 to 100, estimating the PIS for the pixels classified as settlement in the WSF-2015.

Quantitative Accuracy Assessment
As stated by Bai et al. [54] "quantifying the accuracy of population distribution maps has been recognized as a critical and challenging task". Determining the spatial and quantitative uncertainties of population distribution products is fundamental yet very difficult due to the lack of independent and compatible reference data [10]. Nevertheless, through well-established accuracy methods, it is possible to assess the effectiveness of new models (disaggregation methods and/or covariate layers) and investigate if higher population distribution accuracies can be reached in comparison to previous approaches. For this research, the accuracy of the two covariate layers was assessed by computing the difference between the estimated population counts extracted from maps produced using coarser administrative units (input units) and the actual population counts of the finest administrative units (validation units). This accuracy method has been widely employed in previous research [13,17,34,42,55]; however, it still presents some limitations, as high-resolution boundaries and population data (e.g., enumeration area level) are not publicly available for all countries.
For this reason, to gain a more comprehensive and detailed understanding of the mapping capabilities of each covariate layer, the final population distribution maps were evaluated following a series of thorough quantitative analyses performed at the validation unit level and the input level of the administrative units. The analysis at the validation unit level was divided in two parts. In the first part, an overall accuracy assessment was carried out to examine the influence of the spatial resolution of the input census data on the results. Here, population distribution maps were produced using three spatial aggregation levels of the administrative boundaries as input units (Analyses I-III in Table 2). For each analysis, four main descriptive statistics were calculated to measure the overall accuracies of each layer. These metrics are briefly described in Table 3 and include: the MAE (Mean Absolute Error), the MAPE (Mean Absolute Percentage Error), the Root Mean Square Error (RMSE) and the coefficient of determination (R 2 ). Table 3. Descriptive statistics for overall accuracy assessment at the validation unit level for Analyses I-III.

Metric
Description MAE is the mean absolute error at each level of analysis (i), calculated as the average of the sum of the absolute differences between the estimated population (PE vu ) and the actual population (P VU ) at each validation unit.
MAPE is the mean absolute percentage error at each level of analysis (i), calculated as the MAE i divided by the average population of each country.
RMSE is the root mean square error at each level of analysis (i), calculated as the square root of the mean of the sum of squares of the differences between the estimated population at (PE vu ) and the actual population (P VU ) at each validation unit.

R 2
Defined as the coefficient of determination at each level of analysis, derived from classical linear least square modelling with constant intercept at 0. It is also defined as the square of the Pearson correlation coefficient, to measure the variation between the estimated population and the actual population of all validation units. Readers can refer to [56] for detailed calculations.
The second part of the analysis was carried out only for the population maps produced using the finest input units (Analysis I in Table 2). Here, similar to the methodology and classification presented by Bai et al. [54], the Relative Estimation Error (REE) metric was used to identify the amount and distribution of error produced by each covariate layer. The REE for each validation unit was calculated as: where PE VU is the estimated population of the validation unit and P VU is the actual population of the validation unit. Using the REE VU , validation units were grouped and classified into different REE ranges (Table 4). Table 4. REE classification [54].

REE Ranges Description
Accurately estimated (25%, 50%] Overestimated (50%, ≥100%] Greatly overestimated From this classification, two sub-analyses were conducted for each country. First, for a better understanding of the error distribution associated with each covariate layer, we calculated the percentage of each country's total population that fell within each error range. Second, for each country, we calculated the average actual population and average number of settlements pixels for the validation units that fell within each error range. This last analysis was done to identify if there is any Sustainability 2019, 11, 6056 9 of 24 relationship between the amount of population that needs to be distributed (P VU ) and the number of available settlement pixels, and if the ratio between these two parameters can explain the REE values reported in the validation units.
Finally, as the reported accuracy at the validation unit level is only a reflection of the capability of each input covariate layer to correctly allocate population counts at the input unit level, a series of analyses were carried out at the input unit level, focusing only on Analysis I (Table 2). First, we used the RMSE metric as a summary of the error within each original input unit, following the methodology presented by Mennis and Hultgren [57]. RMSE was calculated as the square root of the mean of the sum of squares of the difference between the actual population counts and estimated population counts of all validation units within an input unit: where P VU is the actual population at validation unit, PE VU is the estimated population at validation unit and n is the number of validation units within an input unit. To compare the effectiveness of the covariate layers, input units were grouped according to the layer that produced the lowest RMSE values and for each group the percentage of each country's total population was calculated. Second, on basis of these results, we undertook a series of analyses to identify and describe the regions where one layer outperformed the other. For this analysis, we derived the Settlement Size Coverage (SSC) index, which classifies each input unit according to (i) the number of small-, mediumand large-settlement objects that can be found within each unit; and (ii) the proportion of each input unit's total area that is covered by these settlements objects. To calculate the SSC index, settlement objects were created, where each object is composed of connected settlement pixels via at least one pixel edge or corner (8-neighbourhood), as described by Esch et al. [58]. The SSC index within each given input unit was derived as: SSC IU = #settlement pixels #settl. objects * Sum of the area settl. objects Total area of input unit * Area of largest settl. objects Mean area of settl. objects (7) where high SSC IU values indicate dense built-up environments and low SSC IU values indicate sparse built-up environments. To allow country cross-comparisons, we normalised the SSC index values from 0 to 10 and divided it into three classes, as shown in Table 5. Thresholds were visually derived and evaluated against all available countries. For each SSC class, we calculated the average RMSE produced by each layer.

SSC Index Class Description
Low (>0-1) Small size settlements and low coverage of the total area of the input units Mix of small and medium size settlements and medium coverage of the total area of the input units High [1. [8][9][10] Mix of medium and large size settlements with high coverage of the total area of the input units

Visual Assessment of the Population Distribution Maps
The WSF-2015-Density and the WSF-2015 layers were used to produce population distribution maps for each country at each spatial aggregation level of the administrative units, representing the estimated night-time population (population counted at place of domicile) as the number of people per grid cell for the year 2015. The final spatial resolution of the population distribution maps equals the spatial resolution of the input covariates (~10-m at the equator).
Because the volume of results (72 population distribution maps) is too large to present here in full, we focused on one representative country to visually inspect the thematic differences between the maps produced using the WSF-2015-Density and the WSF-2015 layers before turning to the quantitative analyses of all the maps. Figure 2 shows the final population distribution maps produced using the finest administrative units for Germany (enumeration areas), depicting the local metropolitan areas of Berlin and Munich. Note that, for the finest administrative units, these two areas have been modelled using a single census unit were local differences between the binary and the weighted disaggregation approaches are rather clear. As illustrated in Figure 2, population disaggregation based on the WSF-2015 layer produces homogeneous population counts within each administrative unit in comparison to the WSF-2015-Density layer, which offers more spatial heterogeneity. As a result of the proportional allocation produced by the binary layer, it is possible to observe abrupt changes from high to low population counts between neighbouring administrative units. The transitions are considerably smoother when using the WSF-2015-Density layer, due to the weight given by the percent of impervious surface, which rarely changes abruptly at the boundaries of the administrative units.

Analyses at the Validation Unit Level
A summary of the accuracy assessment results using the WSF-2015-Density and WSF-2015 layers is presented in Table 6. Results show that, for each layer and each country, the highest R 2 , the lowest MAE, the lowest MAPE and the lowest RMSE values are reached using the finest administrative input units (Analysis I, Table 2). Furthermore, from one level of spatial aggregation to the next, the values for the R 2 decrease, while the MAE, MAPE and RMSE values increase.
From the RMSE and MAE metrics, it can be seen that, for Analysis I, for most countries, errors remain below the size of the average population using any of the two covariate layers. While for all countries the MAE values remain below the average population size for Analyses I-III, RMSE exceeds this threshold in Analysis II in Germany and France and in Analysis III in Mexico and As illustrated in Figure 2, population disaggregation based on the WSF-2015 layer produces homogeneous population counts within each administrative unit in comparison to the WSF-2015-Density layer, which offers more spatial heterogeneity. As a result of the proportional allocation produced by the binary layer, it is possible to observe abrupt changes from high to low population counts between neighbouring administrative units. The transitions are considerably smoother when using the WSF-2015-Density layer, due to the weight given by the percent of impervious surface, which rarely changes abruptly at the boundaries of the administrative units.

Analyses at the Validation Unit Level
A summary of the accuracy assessment results using the WSF-2015-Density and WSF-2015 layers is presented in Table 6. Results show that, for each layer and each country, the highest R 2 , the lowest MAE, the lowest MAPE and the lowest RMSE values are reached using the finest administrative input units (Analysis I, Table 2). Furthermore, from one level of spatial aggregation to the next, the values for the R 2 decrease, while the MAE, MAPE and RMSE values increase.
From the RMSE and MAE metrics, it can be seen that, for Analysis I, for most countries, errors remain below the size of the average population using any of the two covariate layers. While for all countries the MAE values remain below the average population size for Analyses I-III, RMSE exceeds this threshold in Analysis II in Germany and France and in Analysis III in Mexico and Myanmar. Additionally, the difference between the RMSE and the MAE values tends to increase as the spatial detail of the input units decreases, with significant higher differences in countries such as Côte d'Ivoire, France, Myanmar and Vietnam. In case of the three latter, the large differences can be explained by the large variances between the errors of the validation units within each country.
Comparing the results between the WSF-2015-Density and the WSF-2015 layers, it can be seen that, for Cambodia and Malawi, the best overall accuracies are reported using the WSF-2015 layer at all levels of aggregation. For the rest of the countries, the WSF-2015-Density layer performs better at all levels of aggregation, except for Mexico and Myanmar where there is a transition between layers in Analysis III.
Focusing only on the population distribution maps produced using the finest input units (Analysis I, Table 2), further analyses were performed at the validation unit level. First, to understand the amount of error produced by each input covariate layer, we calculated the percentage of each country's total population that fell within each REE range. Classifying the REE values in different error ranges (Table 4), we calculated the percentage of each country's total population that fell within each REE range for each covariate layer, as shown in Figure 3 and Table 7.
The percentage bar charts in Figure 3 show that, for each country, both covariate layers distribute approximately the same amount of population with comparable accuracies. From here, it can be seen that, for all countries, the largest percentage of the population was "accurately estimated" with estimation errors ranging from −25% to 25% for both covariate layers. For Côte d'Ivoire, Germany, England and Myanmar, this represents more than 50% of the total population; for France, Cambodia and Vietnam, between 40% and 50% of the total population; and, for Malawi and Mexico, between 30% and 40% of the total population. Moreover, for the majority of the countries, the second largest percentage of the population was either "underestimated" or "overestimated" (from ±25 to ±50%). For all countries, less than 15% of the total population was underestimated, while, for most countries, except Germany and Myanmar, from 15% to 25% of the total population was overestimated. Finally, the smallest percentage of the population for all countries was "greatly underestimated" or "greatly overestimated" (≥50% or ≤−50%), with Malawi reporting an average of~30% of the total population within these ranges, followed by Mexico with~25%, and France and Vietnam with~17%. Focusing only on the population distribution maps produced using the finest input units (Analysis I, Table 2), further analyses were performed at the validation unit level. First, to understand the amount of error produced by each input covariate layer, we calculated the percentage of each country's total population that fell within each REE range. Classifying the REE values in different error ranges (Table 4), we calculated the percentage of each country's total population that fell within each REE range for each covariate layer, as shown in Figure 3 and Table 7.
The percentage bar charts in Figure 3 show that, for each country, both covariate layers distribute approximately the same amount of population with comparable accuracies. From here, it can be seen that, for all countries, the largest percentage of the population was "accurately estimated" with estimation errors ranging from −25% to 25% for both covariate layers. For Côte d'Ivoire, Germany, England and Myanmar, this represents more than 50% of the total population; for France, Cambodia and Vietnam, between 40% and 50% of the total population; and, for Malawi and Mexico, between 30% and 40% of the total population. Moreover, for the majority of the countries, the second largest percentage of the population was either "underestimated" or "overestimated" (from ±25 to ±50%). For all countries, less than 15% of the total population was underestimated, while, for most countries, except Germany and Myanmar, from 15% to 25% of the total population was overestimated. Finally, the smallest percentage of the population for all countries was "greatly underestimated" or "greatly overestimated" (≥50% or ≤−50%), with Malawi reporting an average of ~30% of the total population within these ranges, followed by Mexico with ~25%, and France and Vietnam with ~17%.   To identify if there is any significant relationship between the actual population to distribute in a particular validation unit and the number of available settlement pixels, we calculated the average actual population and the average number of settlement pixels for the validation units that fell within each REE range. Figure 4a shows the ratio between these two parameters for each REE range, where the general tendency indicates that, for most countries, errors of underestimation are mainly reported in validation units where a relatively low number of settlement pixels were identified in comparison to the average actual population reported for those validation units. In other words, errors of underestimation tend to increase as the ratio between the population and the number of settlement pixels increases. On the other hand, for most countries, errors of overestimation tend to increase as the ratio between the average actual population and the number of settlement pixels decreases, indicating that a large number of settlement pixels have been detected in relation to the average actual population reported for those validations units. To identify if there is any significant relationship between the actual population to distribute in a particular validation unit and the number of available settlement pixels, we calculated the average actual population and the average number of settlement pixels for the validation units that fell within each REE range. Figure 4a shows the ratio between these two parameters for each REE range, where the general tendency indicates that, for most countries, errors of underestimation are mainly reported in validation units where a relatively low number of settlement pixels were identified in comparison to the average actual population reported for those validation units. In other words, errors of underestimation tend to increase as the ratio between the population and the number of settlement pixels increases. On the other hand, for most countries, errors of overestimation tend to increase as the ratio between the average actual population and the number of settlement pixels decreases, indicating that a large number of settlement pixels have been detected in relation to the average actual population reported for those validations units.  For a better understanding of the error distribution, the percentage of validation units that reported similar ratios and fell within each REE was quantified for each country. From the percentage bar charts in Figure 4b, it is possible to observe that, for countries such as Cambodia, Mexico, Malawi and Vietnam, more than 30% of the validations units reported errors of underestimation (from −100% to −25%), with Mexico, Malawi and Vietnam reporting ~20% of the validation units "greatly underestimated" (from −100% to −50%). In the same way, France reported the largest percentage of the validation units (~41%) with errors of overestimation (from 25% to ≥100%), followed by Mexico (~30%), Malawi and Germany (~25%). Here, Mexico reported the largest percentage of validation units "greatly overestimated", with ~20% of the validation units with REE larger than 100%. For a better understanding of the error distribution, the percentage of validation units that reported similar ratios and fell within each REE was quantified for each country. From the percentage bar charts in Figure 4b, it is possible to observe that, for countries such as Cambodia, Mexico, Malawi and Vietnam, more than 30% of the validations units reported errors of underestimation (from −100% to −25%), with Mexico, Malawi and Vietnam reporting~20% of the validation units "greatly underestimated" (from −100% to −50%). In the same way, France reported the largest percentage of the validation units (~41%) with errors of overestimation (from 25% to ≥100%), followed by Mexico (~30%), Malawi and Germany (~25%). Here, Mexico reported the largest percentage of validation units "greatly overestimated", with 20% of the validation units with REE larger than 100%.

Analyses at the Input Unit Level
To evaluate the actual performance of each covariate layer, results at the validation unit level were used to calculate the RMSE IU metric of the original input census units used for population disaggregation according to Equation (6). Input units were grouped according to the input covariate layer that produced the lowest RMSE IU values and for each group the percentage of each country's total population was calculated. Figure 5 illustrates the percentage bar charts for each country. As one can notice, for Germany, France and Mexico, the predominance of the WSF-2015-Density is clear, distributing more than 75% of each country's total population with overall lower RMSE values in comparison to the WSF-2015 layer. On the other hand, for Cambodia and Malawi, the WSF-2015 layer performs better, distributing more than 75% of the population more accurately compared to WSF-2015-Density layer. In the rest of the countries (i.e., Côte d' Ivoire, England, Myanmar and Vietnam), both layers perform equally, with the WSF-2015-Density layer distributing a slightly larger amount of the population better than the WSF-2015 layer.

Analyses at the Input Unit Level
To evaluate the actual performance of each covariate layer, results at the validation unit level were used to calculate the RMSEIU metric of the original input census units used for population disaggregation according to Equation (6). Input units were grouped according to the input covariate layer that produced the lowest RMSEIU values and for each group the percentage of each country's total population was calculated. Figure 5 illustrates the percentage bar charts for each country. As one can notice, for Germany, France and Mexico, the predominance of the WSF-2015-Density is clear, distributing more than 75% of each country's total population with overall lower RMSE values in comparison to the WSF-2015 layer. On the other hand, for Cambodia and Malawi, the WSF-2015 layer performs better, distributing more than 75% of the population more accurately compared to WSF-2015-Density layer. In the rest of the countries (i.e., Côte d' Ivoire, England, Myanmar and Vietnam), both layers perform equally, with the WSF-2015-Density layer distributing a slightly larger amount of the population better than the WSF-2015 layer. To identify the regions where each covariate layer produced higher accuracies, the input units of each country were classified according to the SSC index (Equation (7), Table 5). The map in Figure  6 illustrates the results of this classification for Côte d' Ivoire. Here, most of the input units fell within the "low" SSC class, which is characterised by small size settlement objects that cover a low percentage of each input unit's total area. A few input units fell within the "medium" SSC class, characterised by a mix of medium and small size settlements objects, and only two input units fell within the "high" SSC class, characterised by large size settlement objects that cover a large extent of each input unit's total area. For Côte d' Ivoire, some of the most populated cities are located within the "high" and "medium" input units, such as Abidjan, Bouake, Korhogo and Divo. To identify the regions where each covariate layer produced higher accuracies, the input units of each country were classified according to the SSC index (Equation (7), Table 5). The map in Figure 6 illustrates the results of this classification for Côte d' Ivoire. Here, most of the input units fell within the "low" SSC class, which is characterised by small size settlement objects that cover a low percentage of each input unit's total area. A few input units fell within the "medium" SSC class, characterised by a mix of medium and small size settlements objects, and only two input units fell within the "high" SSC class, characterised by large size settlement objects that cover a large extent of each input unit's total area. For Côte d' Ivoire, some of the most populated cities are located within the "high" and "medium" input units, such as Abidjan, Bouake, Korhogo and Divo. Following this classification, the same analysis was carried out for each country. Figure 7 shows the percentage of each country's total area (pie charts) and corresponding population (boxes) derived from the input units according to the SSC index. For Côte d' Ivoire, Cambodia, Mexico, Myanmar and Malawi, the largest percentage of the total area fell within the "low" SSC class. For all these countries, more than 70% of the population is located within these areas, except for Mexico, where the majority of the population (54.79%) is located within areas belonging to the "high" SSC class. For Germany, England, France and Vietnam, Following this classification, the same analysis was carried out for each country. Figure 7 shows the percentage of each country's total area (pie charts) and corresponding population (boxes) derived from the input units according to the SSC index. Following this classification, the same analysis was carried out for each country. Figure 7 shows the percentage of each country's total area (pie charts) and corresponding population (boxes) derived from the input units according to the SSC index. For Côte d' Ivoire, Cambodia, Mexico, Myanmar and Malawi, the largest percentage of the total area fell within the "low" SSC class. For all these countries, more than 70% of the population is located within these areas, except for Mexico, where the majority of the population (54.79%) is located within areas belonging to the "high" SSC class. For Germany, England, France and Vietnam, For Côte d' Ivoire, Cambodia, Mexico, Myanmar and Malawi, the largest percentage of the total area fell within the "low" SSC class. For all these countries, more than 70% of the population is located within these areas, except for Mexico, where the majority of the population (54.79%) is located within areas belonging to the "high" SSC class. For Germany, England, France and Vietnam, the largest percentage of the total area fell within the "high" SSC class, where more than 80% of the population is located. For most countries, the second largest percentage of the area fell within the "medium" SSC class. In these areas, the second largest percentage of the population is located, however it does not exceed more than 17% of the total population.
For each SSC class, we computed the average RMSE error produced by each covariate layer and the percentage difference between the two layers ( Table 8). The results indicate that, for all countries, the WSF-2015-Density layer performed better in regions that fell within the "high" SSC class, with improvements ranging from 1.12% to 31.20% over the WSF-2015 layer. For regions within the "low" or "medium" SSC classes, the behaviour of the covariate layers is more variable among the countries. For Germany, England, France and Myanmar, the WSF-2015-Density layer performed better for regions within the "low" SSC class, with improvements ranging from 4.36% to 22.40%, while. for Côte d' Ivoire, Cambodia, Malawi and Vietnam, the WSF-2015 layer performed better with improvements ranging from 2.12% to 9.82%. For the regions within the "medium" SSC class, the WSF-2015-Density layer performed better in Germany, England, France, Malawi and Vietnam, with improvements ranging from 6.62% to 21.03%, as opposed to Côte d' Ivoire, Cambodia, Mexico and Myanmar, where the WSF-2015 layer performed better with improvements ranging from 6.69% to 30%.

Discussion
In the above sections, we present a set of comprehensive analyses to compare the relative accuracies of population distribution maps produced using the WSF-2015 and the experimental WSF-2015-Density layers. The first analysis consisted of an overall accuracy assessment carried at the validation unit level, where metrics such as MAE, MAPE, RMSE and R 2 ( Table 3) were used to evaluate maps produced using three spatial aggregation levels of the administrative units ( Table 2). The results presented in Table 6 show that, for all countries and both covariate layers, the highest accuracy values were reported for population maps produced using the finest input census units (Analysis I, Table 2), with accuracies decreasing from one level of spatial aggregation to the next. These results are directly in line with previous findings [17,35,55], and confirm the premise that higher accuracies in population mapping can be achieved with improvements in the resolution of the input census data. In the same way, from a comparative point of view, the overall accuracy results showed that, for the majority of the countries, except Cambodia and Malawi, the WSF-2015-Density layer performed better than the WSF-2015.
When interpreting and comparing the overall accuracy results between countries and between covariate layers, there are, however, a set of considerations that need to be taken into account. First, it is important to understand, that regardless of the input covariate layer used for population disaggregation, high accuracies can be reached, when the number and ASR of the of the administrative units used for validation are similar to those of the administrative units used as input data (Table 2). This can be seen, for example, by examining the results of Analysis I for Côte d'Ivoire, Myanmar and Vietnam ( Table 6).
The fact that these countries reported relatively good accuracy results is more likely to be due to the small difference between the number of administrative units used as input and validation units (407, 225 and 625, respectively) and the small ratio between their ASR (2.16, 2.11 and 3.30, respectively). These results are linked to the scale effect of the modifiable areal unit problem (MAUP), where the correlation between variables increases as the areal unit size becomes similar [59].
A second consideration to keep in mind is to avoid the use of the R 2 metric as unique statistical indicator to report the accuracy of population distribution models. Previous research has demonstrated that the lack of variability in the data influences the coefficient of determination [60]. For example, for England, where significantly low R 2 values were obtained in comparison to the MAE, MAPE and RMSE metrics, these can be related to the fact that the original census data reports similar population counts for a large number of the administrate units used for validation. This can be seen in the boxplots of Figure 8, where the reported actual population counts of the validation units of England are constrained within a small range of values. This small variability in the data, according to Goodwin et al. [60], results in a poor correlation between the estimated population counts and actual population counts as exemplified in the scatter plots of Figure 9. Here, it is possible to observe an amorphous or non-structured appearance of the data points for England in comparison to France, which results in a poor correlation, signalised by the almost horizontal trend-line. Myanmar and Vietnam ( Table 6). The fact that these countries reported relatively good accuracy results is more likely to be due to the small difference between the number of administrative units used as input and validation units (407, 225 and 625, respectively) and the small ratio between their ASR (2.16, 2.11 and 3.30, respectively). These results are linked to the scale effect of the modifiable areal unit problem (MAUP), where the correlation between variables increases as the areal unit size becomes similar [59].
A second consideration to keep in mind is to avoid the use of the R 2 metric as unique statistical indicator to report the accuracy of population distribution models. Previous research has demonstrated that the lack of variability in the data influences the coefficient of determination [60]. For example, for England, where significantly low R 2 values were obtained in comparison to the MAE, MAPE and RMSE metrics, these can be related to the fact that the original census data reports similar population counts for a large number of the administrate units used for validation. This can be seen in the boxplots of Figure 8, where the reported actual population counts of the validation units of England are constrained within a small range of values. This small variability in the data, according to Goodwin et al. [60], results in a poor correlation between the estimated population counts and actual population counts as exemplified in the scatter plots of Figure 9. Here, it is possible to observe an amorphous or non-structured appearance of the data points for England in comparison to France, which results in a poor correlation, signalised by the almost horizontal trend-line.  The aforementioned findings indicate that the use of single statistics metrics can be misleading and that population distribution maps can report high accuracy results independently of the quality Myanmar and Vietnam ( Table 6). The fact that these countries reported relatively good accuracy results is more likely to be due to the small difference between the number of administrative units used as input and validation units (407, 225 and 625, respectively) and the small ratio between their ASR (2.16, 2.11 and 3.30, respectively). These results are linked to the scale effect of the modifiable areal unit problem (MAUP), where the correlation between variables increases as the areal unit size becomes similar [59].
A second consideration to keep in mind is to avoid the use of the R 2 metric as unique statistical indicator to report the accuracy of population distribution models. Previous research has demonstrated that the lack of variability in the data influences the coefficient of determination [60]. For example, for England, where significantly low R 2 values were obtained in comparison to the MAE, MAPE and RMSE metrics, these can be related to the fact that the original census data reports similar population counts for a large number of the administrate units used for validation. This can be seen in the boxplots of Figure 8, where the reported actual population counts of the validation units of England are constrained within a small range of values. This small variability in the data, according to Goodwin et al. [60], results in a poor correlation between the estimated population counts and actual population counts as exemplified in the scatter plots of Figure 9. Here, it is possible to observe an amorphous or non-structured appearance of the data points for England in comparison to France, which results in a poor correlation, signalised by the almost horizontal trend-line.  The aforementioned findings indicate that the use of single statistics metrics can be misleading and that population distribution maps can report high accuracy results independently of the quality The aforementioned findings indicate that the use of single statistics metrics can be misleading and that population distribution maps can report high accuracy results independently of the quality of the underlying covariate layers used for population disaggregation. Therefore, it is important to emphasise, not only that full dissemination of the data used for modelling and validation is essential when reporting accuracy results [3,6,35,54], but also that, to evaluate the real effectiveness of the covariate layers, it is necessary to undertake more in-depth analyses using complementary metrics.
In this research, with the use of the REE statistical metric (Equation (5)), it was possible to evaluate the amount and distribution of error generated by each covariate layer ( Figure 3 and Table 7), and identify the areas where large errors of underestimation and large errors overestimation can be expected (Figure 4). Our results show that both layers perform similarly, distributing approximately the same percentage of each country's total population with the same REE values. For all countries, the largest percentage of the population has been estimated with errors ranging from −25% to 25%, which in previous research has been considered as "accurately estimated" [54]. Nevertheless, only in Côte d' Ivoire, Germany, England and Myanmar this represent more than 50% of the total population, which indicates that, for the rest of the countries, a significant percentage of the total population was distributed with larger errors of underestimation and errors overestimation.
We attribute these errors to the quality (completeness) of the covariate layers and to the fact that they do not take into account information on the land or building use. On the one hand, our findings indicate that errors of underestimation are reported in validation units where not enough settlement pixels have been found for population disaggregation. These errors increase as the ratio between the actual population and the number of settlement pixels increases (Figure 4a). This means, for example, that, in countries where a large percentage of the population and validation units were "greatly underestimated" (Table 7 and Figure 4b), such as France, Cambodia, Mexico and Malawi, this can be explained by the large amount of validation units where zero or very few settlement pixels have been identified ( Figure 10). Therefore, despite the fact that the thematic accuracy of the WSF-2015 layer clearly outperforms any of the currently existing global human settlements masks [43], it is clear the data still show limitations with respect to a complete detection of all building structures. This can be explained by the spatial resolution of the Sentinel-1 and Landsat imagery used as input data, which restricts the identification of building structures, especially in regions where the settlement pattern is characterised by wide-spread single houses or very small hamlets. of the underlying covariate layers used for population disaggregation. Therefore, it is important to emphasise, not only that full dissemination of the data used for modelling and validation is essential when reporting accuracy results [3,6,35,54], but also that, to evaluate the real effectiveness of the covariate layers, it is necessary to undertake more in-depth analyses using complementary metrics.
In this research, with the use of the REE statistical metric (Equation (5)), it was possible to evaluate the amount and distribution of error generated by each covariate layer ( Figure 3 and Table  7), and identify the areas where large errors of underestimation and large errors overestimation can be expected (Figure 4). Our results show that both layers perform similarly, distributing approximately the same percentage of each country's total population with the same REE values. For all countries, the largest percentage of the population has been estimated with errors ranging from −25% to 25%, which in previous research has been considered as "accurately estimated" [54]. Nevertheless, only in Côte d' Ivoire, Germany, England and Myanmar this represent more than 50% of the total population, which indicates that, for the rest of the countries, a significant percentage of the total population was distributed with larger errors of underestimation and errors overestimation.
We attribute these errors to the quality (completeness) of the covariate layers and to the fact that they do not take into account information on the land or building use. On the one hand, our findings indicate that errors of underestimation are reported in validation units where not enough settlement pixels have been found for population disaggregation. These errors increase as the ratio between the actual population and the number of settlement pixels increases (Figure 4a). This means, for example, that, in countries where a large percentage of the population and validation units were "greatly underestimated" (Table 7 and Figure 4b), such as France, Cambodia, Mexico and Malawi, this can be explained by the large amount of validation units where zero or very few settlement pixels have been identified ( Figure 10). Therefore, despite the fact that the thematic accuracy of the WSF-2015 layer clearly outperforms any of the currently existing global human settlements masks [43], it is clear the data still show limitations with respect to a complete detection of all building structures. This can be explained by the spatial resolution of the Sentinel-1 and Landsat imagery used as input data, which restricts the identification of building structures, especially in regions where the settlement pattern is characterised by wide-spread single houses or very small hamlets. On the other hand, errors of overestimation are reported in validation units where a large number of settlement pixels have been reported in comparison to the amount of actual population, and that they increase as the ration between these two parameters decreases (Figure 4a). After a visual analysis of VHR satellite imagery, we found that large errors of overestimation are mainly reported in validation units where seaports and industrial complexes exist. Figure 11 shows an example of the population distribution results for an input unit in England with this particular built-up environment. The red line represents the geographical boundary of the input unit used for population disaggregation and the blue lines represent the geographical boundaries of the On the other hand, errors of overestimation are reported in validation units where a large number of settlement pixels have been reported in comparison to the amount of actual population, and that they increase as the ration between these two parameters decreases (Figure 4a). After a visual analysis of VHR satellite imagery, we found that large errors of overestimation are mainly reported in validation units where seaports and industrial complexes exist. Figure 11 shows an example of the population distribution results for an input unit in England with this particular built-up environment. The red line represents the geographical boundary of the input unit used for population disaggregation and the blue lines represent the geographical boundaries of the validation units. Here, it is possible to observe industrial areas in the southern parts of the input unit. These areas capture a large amount of the population counts comparable to high-density residential areas, reporting large errors of overestimation in the validation units. In the selected validation unit, for example the WSF-2015-Density layer reported a higher REE (186.56%) in comparison to the WSF-2015 (154.49%). This does not mean, however, that in every validation unit where this built-up environment exists the binary layer will perform better than the impervious layer. Depending on the extent and geographical boundaries of the input units, industrial or port areas can be mixed with residential areas, influencing the performance of each layer. More detailed information on and discussion of this aspect is provided at the end of this section in the context of the SSC index. validation units. Here, it is possible to observe industrial areas in the southern parts of the input unit. These areas capture a large amount of the population counts comparable to high-density residential areas, reporting large errors of overestimation in the validation units. In the selected validation unit, for example the WSF-2015-Density layer reported a higher REE (186.56%) in comparison to the WSF-2015 (154.49%). This does not mean, however, that in every validation unit where this built-up environment exists the binary layer will perform better than the impervious layer. Depending on the extent and geographical boundaries of the input units, industrial or port areas can be mixed with residential areas, influencing the performance of each layer. More detailed information on and discussion of this aspect is provided at the end of this section in the context of the SSC index. Similar accuracy limitations have been reported in the production of the GHS-POP and the HRSL population datasets [10,17]. Even when several local studies have demonstrated that information on the building use has the potential to improve population distribution results [61,62], this remains a major source of limitation in the production of global population datasets, as it is not possible to derive detailed semantic information on the building use through remote sensing methodologies. Population datasets such as LandScan and WorldPop integrate land use and land cover covariates to improve their results; however, as mentioned above, this introduces global transferability limitations and applicability restrictions.
For this reason, in this study, we began to analyse the relationship between the inherent characteristics of the underlying built-up environment and the performance of each covariate layer, as an alternative approach that could be used to minimise the errors introduced by the quality and lack of functional characterisation of the input covariate layers. Here, we introduced the SSC index as a globally transferable metric to categorise the input units in terms of the size and coverage of the underlying settlement objects. Our results clearly indicate that WSF-2015-Density layer distributes population with higher accuracies in regions with high SSC index values, reaching improvements Similar accuracy limitations have been reported in the production of the GHS-POP and the HRSL population datasets [10,17]. Even when several local studies have demonstrated that information on the building use has the potential to improve population distribution results [61,62], this remains a major source of limitation in the production of global population datasets, as it is not possible to derive detailed semantic information on the building use through remote sensing methodologies. Population datasets such as LandScan and WorldPop integrate land use and land cover covariates to improve their results; however, as mentioned above, this introduces global transferability limitations and applicability restrictions.
For this reason, in this study, we began to analyse the relationship between the inherent characteristics of the underlying built-up environment and the performance of each covariate layer, as an alternative approach that could be used to minimise the errors introduced by the quality and lack of functional characterisation of the input covariate layers. Here, we introduced the SSC index as a globally transferable metric to categorise the input units in terms of the size and coverage of the underlying settlement objects. Our results clearly indicate that WSF-2015-Density layer distributes population with higher accuracies in regions with high SSC index values, reaching improvements up to~30% over the WSF-2015 layer (Table 8). For regions with low and medium SSC index values, the performance of each covariate layer varies from country to country. Figure 12 shows the distribution of the SSC index values and the mean SSC index value for the "low" and "medium" SSC classes for each country. up to ~30% over the WSF-2015 layer (Table 8). For regions with low and medium SSC index values, the performance of each covariate layer varies from country to country. Figure 12 shows the distribution of the SSC index values and the mean SSC index value for the "low" and "medium" SSC classes for each country. Focusing on the distribution of the "low" SSC class, countries where the WSF-2015 reported in average less RMSE values are also the countries where more than half of the input units reported SSC index values lower than 0.40. In other words, the SSC index values fell below the mean of the "low" SSC class that ranges from >0 to 1. For the "medium" SSC class, the distribution of the SSC index values among countries is relatively similar. The mixture of medium to highly populated cities and rural areas within these input units represent challenging modelling regions where further analyses are required to identify the particular circumstances where one layer outperforms the other.
Nevertheless, it is important to notice that the WSF-2015-Density layer performed better in all three classes of the SSC index for countries such as Germany, France and England, hence suggesting that the overall performance is largely driven by an accurate identification of building structures within the rural areas of each country. In this context, it is expected that limitations derived from the current underestimation of smaller settlements and isolated buildings can be overcome by the future integration of Sentinel-2 data in the production of future WSF datasets, due to its increased spatial resolution [40].
As a final note, it is important to mention that, even when the population distribution maps presented in this research have been produced using the most frequently employed population census data, the difficulties in the acquisition of the finest census data, the challenges in integrating census data with spatial boundaries and the uncertainties of population estimates based on statistical projections, are additional sources of errors and uncertainty limiting the accuracy of the population distribution models. Therefore, as stated by Doxsey-Whitfield et al. [6], acquiring up-to-date global population census data at the highest spatial detail possible should remain a priority for improving global population mapping.

Conclusions
The presented study focused on the cross-comparison of population distribution maps produced using the WSF-2015 and the experimental WSF-2015-Density layers. The main objective was to investigate if higher accuracies in population distribution mapping can be achieved using additional information on the build-up environment, such as the percentage of impervious surface, in comparison to the already established binary approach employed by other population datasets and their baseline settlement layers. Focusing on the distribution of the "low" SSC class, countries where the WSF-2015 reported in average less RMSE values are also the countries where more than half of the input units reported SSC index values lower than 0.40. In other words, the SSC index values fell below the mean of the "low" SSC class that ranges from >0 to 1. For the "medium" SSC class, the distribution of the SSC index values among countries is relatively similar. The mixture of medium to highly populated cities and rural areas within these input units represent challenging modelling regions where further analyses are required to identify the particular circumstances where one layer outperforms the other.
Nevertheless, it is important to notice that the WSF-2015-Density layer performed better in all three classes of the SSC index for countries such as Germany, France and England, hence suggesting that the overall performance is largely driven by an accurate identification of building structures within the rural areas of each country. In this context, it is expected that limitations derived from the current underestimation of smaller settlements and isolated buildings can be overcome by the future integration of Sentinel-2 data in the production of future WSF datasets, due to its increased spatial resolution [40].
As a final note, it is important to mention that, even when the population distribution maps presented in this research have been produced using the most frequently employed population census data, the difficulties in the acquisition of the finest census data, the challenges in integrating census data with spatial boundaries and the uncertainties of population estimates based on statistical projections, are additional sources of errors and uncertainty limiting the accuracy of the population distribution models. Therefore, as stated by Doxsey-Whitfield et al. [6], acquiring up-to-date global population census data at the highest spatial detail possible should remain a priority for improving global population mapping.

Conclusions
The presented study focused on the cross-comparison of population distribution maps produced using the WSF-2015 and the experimental WSF-2015-Density layers. The main objective was to investigate if higher accuracies in population distribution mapping can be achieved using additional information on the build-up environment, such as the percentage of impervious surface, in comparison to the already established binary approach employed by other population datasets and their baseline settlement layers.
The results of the quantitative assessment showed that the overall accuracies between both covariate layers are comparably similar, with the best accuracy results reported for population distribution maps produced using the finest input census data. Our results indicate that, while both layers distribute the largest percentage of each country's total population with estimation errors ranging from −25% to 25%, remaining limitations derived from: (i) the incomplete identification of settlement pixels; and (ii) the lack of information on the building use, still introduce large errors of underestimation and errors overestimation in a considerable percentage of the population.
Notwithstanding these limitations, from a comparative point of view, our results have shown that population distribution maps produced on basis of the WSF-2015-Density layer provide a more realistic representation of the spatial distribution of the population, as the heterogeneous allocation of population counts prevents the appearance of artificial patterns between neighbouring administrative census units. Furthermore, it has been demonstrated that the WSF-2015-Density layer produces higher accuracies in high-density built-up environments and is capable to improve the estimation accuracies of the WSF-2015 layer up to~30%, especially in those countries where a good percentage of building structures have been identified within the rural areas. The fact that the WSF-2015-Density layer is derived from remote sensing approaches that do not require a priori knowledge of the land cover makes it a strong suitable proxy capable to improve global population distribution methodologies, and, as it is not based on local relationships, it has no applicability restrictions in comparison to other existing products. Moreover, it provides global coverage and can be straightforward updated allowing time agreement with census population data, enabling the production of a consistent global population distribution dataset with higher accuracy and spatial resolution than those currently available.
One of the strengths of our study is the implementation of the SSC index, used to investigate the correlation between the built-up environment and the performance of each covariate layer. Our results suggest that higher accuracies in population disaggregation could be achieved with the correct preselection of the input covariate at the input unit level; however, to implement this preselection, additional research is still necessary, as the SSC index cannot provide a complete distinction between the covariate layers in areas with middle SSC index values.
However, in the light of these highly promising results, future research will focus on the validation and open release of the WSF-2015-Density layer, expanding the accuracy assessment of population mapping to other regions of the world, with special focus on arid and semi-arid areas, and comparing the results against other existing global population distribution datasets. Within this outlook, deeper research on the SSC index will also be included, to develop a methodology that can help minimise the inherent distribution errors derived from the quality and functional characterisation of the input covariates, as well as in the production of a new global population dataset.