Implications for Tracking SDG Indicator Metrics with Gridded Population Data

Abstract: Achieving the seventeen United Nations Sustainable Development Goals (SDGs) requires accurate, consistent, and accessible population data. Yet many low- and middle-income countries lack reliable or recent census data at the sufficiently fine spatial scales needed to monitor SDG progress. While the increasing abundance of Earth observation-derived gridded population products provides analysis-ready population estimates, end users lack clear use criteria to track SDG indicators. In fact, recent comparisons identify wide variation across gridded population products. Here we present three case studies to illuminate how gridded population datasets compare in measuring and monitoring SDGs to advance the "fitness for use" guidance. Our focus is on SDG 11.5, which aims to reduce the number of people impacted by disasters. We use five gridded population datasets to measure and map hazard exposure for three case studies: the 2015 earthquake in Nepal; Cyclone Idai in Mozambique, Malawi, and Zimbabwe (MMZ) in 2019; and flash flood susceptibility in Ecuador. First, we map and quantify geographic patterns of agreement/disagreement across gridded population products for Nepal, MMZ, and Ecuador, including delineating urban and rural population estimates. Second, we quantify the populations exposed to each hazard. Across hazards and geographic contexts, there were marked differences in population estimates across the gridded population datasets. As such, it is key that researchers, practitioners, and end users utilize multiple gridded population datasets (an ensemble approach) to capture uncertainty and/or provide range estimates when using gridded population products to track SDG indicators. To this end, we made available code and globally comprehensive datasets that allow for the intercomparison of gridded population products.


Introduction
The United Nations Sustainable Development Goals (SDGs) aim to end global poverty by 2030 and ensure a sustainable future [1]. To accomplish this, the SDGs outline a set of seventeen interlinked and shared objectives to improve economic and health outcomes in low- and middle-income countries (LMICs) while simultaneously reducing environmental sets may influence the methods and results employed to monitor and evaluate progress towards SDGs.
Here we present three case studies to illuminate how gridded population datasets compare in measuring and monitoring SDGs and to advance the "fitness for use" guidance. Our focus is on SDG Target 11.5: "By 2030, significantly reduce the number of deaths and the number of people affected and substantially decrease the direct economic losses relative to global gross domestic product caused by disasters, including water-related disasters, with a focus on protecting the poor and people in vulnerable situations." In the context of SDG Target 11.5, we compare how five gridded population data products (Table 1) measure the population exposed to the 2015 earthquake in Nepal and Cyclone Idai in Mozambique, Malawi, and Zimbabwe (henceforth referred to as MMZ) in 2019, as well as populations susceptible to flash floods in Ecuador. By focusing on a range of geographic and country-specific contexts across several hazards, our results provide insights into the ways in which the construction of different gridded population products across geographies affects the resulting calculation of the potentially affected populations.
Furthermore, we explore how gridded population products can be applied to other global frameworks, such as the Sendai Framework for Disaster Risk Reduction targets [26]. Specifically, to achieve the first four targets (a-d), disaster risk monitoring requires accurate estimates of impacts on people and/or property [27]. Our work contributes to the Sendai Framework Global Target B: "Substantially reduce the number of affected people globally by 2030, aiming to lower the average global figure per 100,000 between 2020-2030 compared with 2005-2015." Beyond the importance of gridded data for calculating these indicators, however, we note that accurate population estimates are also vital to emergency management and humanitarian agencies in the post-disaster response phase, when assessments of the number of people affected directly translate to supplies and disaster finance being prioritized (or deprioritized) across spatial units of interest [28]. Thus, the work presented here contributes to an understanding of the potential for operational uses of various gridded population products.
We chose five gridded population products (Table 1) due to their availability at the time of analysis and their global coverage. Our analysis focuses on the comparison of these different gridded products and why estimates are similar or dissimilar in different socio-economic and hazard contexts. Given the lack of "ground truth" micro census population estimates for the regions compared, we do not assess the accuracy of the gridded population products themselves. However, the analysis provided does inform end users of the potential pros and cons of using these datasets in the context of measuring SDG 11.5.
We have two interrelated objectives. First, we map and quantify geographic patterns of agreement/disagreement across gridded population products for Nepal, MMZ, and Ecuador, including delineating urban from rural population estimates. Several methodologies have been used to compare products [21][22][23][24][25]. For the initial step, we count the number of rasters that agree on whether a given pixel is inhabited. Next, we assess pixel-level variation across the five gridded population products by plotting the minimum pixel values against pixel ranges to identify outliers and showcase the contrast in pixel-level measurements. Then, for each gridded product, we examine transects through the primary urban centers impacted by the hazards to both visually and quantitatively demonstrate the variability in population estimates by product across the urban-rural continuum [29].
The second objective aims to situate our first objective within the context of measuring, monitoring, and mapping SDG 11.5. For each dataset, we estimate the total number of people, stratified by urban and rural populations, exposed to each hazard. For Nepal, we compare estimates across seismic intensity levels during the 2015 earthquake. For Cyclone Idai, we compare estimates of the population inundated by water detected by the Sentinel-1 EO platform and of the population exposed to wind speed zones. Lastly, in Ecuador, we quantify and map populations living in zones across levels of susceptibility to flash floods. We emphasize that our results do not quantify error or validate population estimates across gridded population products. We note that the urban land cover designation we employ (Section 2.1) is for comparative purposes only. Our analysis does not assess how gridded population products measure urban populations and urban boundaries (for further detail see [25,29]). All code and data used in this analysis are open source and freely available for other scholars and practitioners to develop their own use cases. This includes global raster datasets that allow for the intercomparison of gridded population products.

Population Data
We focus on three geographies of interest (Figure 1) to explore how the gridded population products measure populations related to SDG 11.5 across a range of geographic contexts: Nepal; the region of Mozambique, Malawi, and Zimbabwe (MMZ); and Ecuador. Nepal, a relatively small country, is landlocked between China's Tibet Autonomous Region and India, and is very mountainous. As hazards do not respect political boundaries, we present MMZ to measure exposure in a cross-border use case. Indeed, population exposure to Cyclone Idai spanned from Mozambique's low-elevation coastal zones, to Malawian settlements near Lake Malawi, to the relatively high-elevation settlements in Zimbabwe. Ecuador presents both a mountainous and a coastal geography to examine hazards, as well as higher levels of economic development compared to the other two geographies of interest. We intentionally chose countries and study areas that coincided with hazard events that matched the gridded population product dates and span LMIC contexts.
The five gridded population products we use are GPW-15, GHS-15, LS-15, WP-16, and WPE-16 (Table 1) [34]. For a complete description of how each gridded population product is produced, see [20,25], as well as the PopGrid Data Collaborative [35].
Aside from GPW-15, the gridded population datasets used in this study rely on relationships between EO data and human settlement patterns to disaggregate the finest available administrative unit level population data into pixels (Table 1). A higher administrative level corresponds to a finer resolution administrative unit. Additionally, only LS-15 focuses on daytime, or ambient, population, whereas the other products aim to capture nighttime residential population [35]. Last, aside from GHS-15, which disaggregates administrative unit-level population only within pixels identified as containing built settlements (constrained), all other products disaggregate administrative unit-level population over all land pixels globally (unconstrained). All products we use are at 1 km spatial resolution. We use GPW-15 as a baseline for comparison, as it is the underlying population data for both GHS-15 and WP-16. Finally, we include United Nations population estimates for each geography of interest for 2015 ( [39]. Official estimates state that the earthquake killed more than 8000 people, injured 21,000 more people, and displaced at least 2 million people in total [40]. Some 600,000 homes were destroyed, with another quarter million damaged [41]. The government estimated that reconstruction costs would surpass $7 billion, or a third of Nepal's GDP in the prior fiscal year [42]. Data and information on the earthquake were obtained from the US Geological Survey (USGS) [39]. USGS ShakeMap shapefiles were used to estimate earthquake impacts by "Instrumental Intensity", which is a proxy for Modified Mercalli Intensity (a qualitative index that cannot strictly be determined by instruments).

Cyclone Idai
Cyclone Idai made landfall near Beira, Mozambique, on 14 March 2019, with sustained winds exceeding 120 km/h. By 16 March, the storm had tracked across Southern Mozambique into Zimbabwe. Flooding was observed throughout Malawi, Mozambique, and Zimbabwe, directly impacting 1.85 million people across MMZ [43]. The immediate financial requirement of the response was estimated to be nearly $300 million [43].
To measure maximum flood extent, we use a 90 m raster available from the World Food Program that captures maximum flood extent as of 21 March 2019 [44]. The raster is derived from Sentinel-1 data obtained from 12 to 21 March 2019 and the ARC Flood Extent Depiction Model (AFED) detecting non-persistent water (7 to 16 March and 20 March 2019). We resampled the 90 m flood raster to 1 km and reprojected it to match GPW-15. Data on wind speeds were downloaded from the Global Disaster Alert and Coordination System [45], with shapefiles delineating zones impacted by 60, 90, and 120 km/h wind speed thresholds.

Ecuador Flash Flood Susceptibility
In Ecuador, as well as on a global scale, flash floods are one of the most deadly types of flood with distinct spatiotemporal physical and impact-related characteristics [46][47][48][49]. Early warning systems exist for floods in many countries; however, they are rarely linked to resilience programming that can decrease risk of a flash flood disaster. While a long time series of impact data for flash floods (and any type of floods) does not openly exist in Ecuador [50], financial estimates of flood impact in the country can be acquired in some instances, with a reported US$238 million in flood impact in 2012 [51].
To represent the susceptibility for flash flooding at the catchment scale in Ecuador, a new vector dataset is derived from geophysical and non-geophysical data [52]. The susceptibility layer was built using a principal component analysis (PCA) derived weighted mean [53] of geographic indicators known to drive the flash flood potential of a catchment, related to geomorphology, drainage systems, and surface characteristics [54][55][56]. Geographic indicators such as slope, curvature, stream order, area of contributing sources, density of drainage, land cover, and sand content [57][58][59][60] were attributed to each catchment, using the predefined level 12 watershed units of HydroSHEDS [61], developed by the Conservation Science Program of World Wildlife Fund (WWF). The resulting flash flood susceptibility composite layer is normalized and reclassified into an equal count discrete flash flood susceptibility index from 1 to 10, low to high susceptibility, respectively, and represents the relative ranking of Ecuador catchments according to their increased susceptibility to generate flash flooding in the case of heavy rain.
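The two steps described above, a PCA-derived weighted mean followed by an equal-count reclassification, can be illustrated with a minimal sketch. It is not the procedure of [52,53]: the weighting rule (absolute loadings on the first principal component, rescaled to sum to one) is an assumption made here for illustration, and the function names `pca_weighted_score` and `equal_count_index` are hypothetical.

```python
import numpy as np


def pca_weighted_score(indicators: np.ndarray) -> np.ndarray:
    """Return one composite score per catchment.

    indicators: array of shape (n_catchments, n_indicators).
    """
    # Standardize each indicator so that units do not drive the result.
    z = (indicators - indicators.mean(axis=0)) / indicators.std(axis=0)
    # PCA via singular value decomposition; rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    # ASSUMPTION: weights are the absolute first-component loadings,
    # rescaled to sum to one. The source does not state the weighting rule.
    loadings = np.abs(vt[0])
    weights = loadings / loadings.sum()
    return z @ weights


def equal_count_index(score: np.ndarray, n_classes: int = 10) -> np.ndarray:
    """Reclassify scores into classes 1..n_classes holding (nearly) equal counts."""
    # A double argsort converts scores to ranks 0..n-1 (ties broken by position).
    ranks = np.argsort(np.argsort(score, kind="stable"), kind="stable")
    return ranks * n_classes // score.size + 1
```

Because the classes are rank-based, the index expresses only the relative ordering of catchments, which is consistent with the relative-ranking interpretation given above.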

Urban/Rural Data
To identify urban versus rural population estimates across the five gridded population products, we use an urban-rural binary land cover classification derived from MODIS data, the MODIS global urban extent product (MGUP) [62]. This dataset is available from 2002 to 2018 at 500 m spatial resolution. We resampled the 2015 MGUP data to 1 km and reprojected it to match GPW-15. We employ MGUP as a relatively independent estimation of where urban settlements exist, as other MODIS products are an input in three of the five gridded population products (Table 1). In addition, we recognize that MGUP is one among many datasets that delineate urban from rural land cover globally and that binary urban/rural categorizations have well-known limitations [29]. As such, the MGUP urban/rural designation we employ is, to a degree, arbitrarily defined, with the intent of better understanding underlying population distribution methods rather than making a statement on what population is urban and what is rural.
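As an illustration of what resampling a 500 m binary layer to 1 km involves, the sketch below aggregates 2 x 2 blocks of pixels. The aggregation rule is an assumption: the text does not state which resampling method was used, so a coarse pixel is labeled urban here when at least half of its fine pixels are urban. The function name `aggregate_binary` and the `min_share` parameter are hypothetical, and reprojection is omitted.

```python
import numpy as np


def aggregate_binary(urban_fine: np.ndarray, factor: int = 2,
                     min_share: float = 0.5) -> np.ndarray:
    """Aggregate a binary urban (1) / rural (0) grid by an integer factor."""
    rows, cols = urban_fine.shape
    # Trim edge rows/columns that do not fill a complete block.
    rows -= rows % factor
    cols -= cols % factor
    blocks = urban_fine[:rows, :cols].reshape(
        rows // factor, factor, cols // factor, factor
    )
    # Share of urban fine pixels inside each coarse pixel.
    share = blocks.mean(axis=(1, 3))
    # ASSUMPTION: a coarse pixel is urban if the share reaches min_share.
    return (share >= min_share).astype(np.uint8)
```

Changing `min_share` shifts the urban/rural split, which is one reason the designation is treated as comparative only.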

Raster Processing & Analysis
The five global gridded population rasters and the MGUP urban/rural land cover raster were spatially co-registered (see Supplemental Information for details) and clipped using the GADM level 0 administrative units for Nepal, MMZ, and Ecuador (excluding islands). Across all five gridded population datasets, for the three study areas we map uninhabited pixels, calculate maximum and minimum population (as well as the range (maximum-minimum) at the pixel level), and measure urban and rural population estimates according to the MGUP urban/rural classification (Table 2). To identify outliers and highlight the variation in pixel-level estimates, we plotted pairwise pixel minimum population estimates against pixel ranges (Figure 2). We then examine 7 km by 61 km transects through three urban areas (Kathmandu for Nepal, Beira for MMZ, and Quito for Ecuador) to visually and quantitatively demonstrate how the products' population estimates compare at the pixel level across the urban-rural continuum.
To compare estimates of populations impacted by the three hazards under study, we sum the populations by hazard criteria for the five gridded population products, separating urban and rural estimates. For the 2015 earthquake in Nepal, we sum the population exposed by the USGS ShakeMap "Instrumental Intensity" contour polygons. For Cyclone Idai, we sum the population exposed by wind speed buffer and flooded area. Finally, for flash floods in Ecuador, we sum the population exposed by the susceptibility index layer.
Figure 2. Values correspond to the number of rasters in agreement that a pixel is inhabited. White shows that all five gridded population datasets agree that a pixel is uninhabited. The higher agreement for Nepal (a) and Malawi (b) is a result of the higher number of administrative input units. Note that the spatial scale of each panel is different.
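The hazard exposure sums described above amount to a zonal sum of population stratified by the urban/rural mask. The sketch below assumes the hazard polygons have already been rasterized to the population grid as integer zone codes, with zero meaning "outside the hazard"; that encoding and the function names are hypothetical. The second function reports the spread across products for each zone.

```python
import numpy as np


def exposure_by_zone(population: np.ndarray, zones: np.ndarray,
                     urban: np.ndarray, outside: int = 0) -> dict:
    """Sum population per hazard zone, split into urban and rural."""
    results = {}
    for zone in np.unique(zones):
        if zone == outside:
            continue  # pixels not exposed to the hazard
        in_zone = zones == zone
        urban_pop = float(population[in_zone & (urban == 1)].sum())
        rural_pop = float(population[in_zone & (urban == 0)].sum())
        results[int(zone)] = {
            "urban": urban_pop,
            "rural": rural_pop,
            "total": urban_pop + rural_pop,
        }
    return results


def ensemble_range(products: dict, zones: np.ndarray, urban: np.ndarray) -> dict:
    """Lowest and highest total exposure per zone across several products."""
    per_product = [exposure_by_zone(pop, zones, urban) for pop in products.values()]
    zone_ids = per_product[0].keys()
    return {
        z: (min(p[z]["total"] for p in per_product),
            max(p[z]["total"] for p in per_product))
        for z in zone_ids
    }
```

Reporting the `(low, high)` pair for each zone, rather than a single product's total, is one simple way to convey the between-product uncertainty.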

Pixel-Level Comparisons
Four broad patterns emerged when comparing how the five gridded population datasets allocate populations in Nepal, MMZ, and Ecuador. First, we found widespread pixel-level variation in agreement across gridded population products of whether or not a given pixel is inhabited. Broadly, GHS-15 and WPE-16 identify a smaller proportion of inhabited pixels regardless of geography. We found that Nepal had the highest proportion of agreement, with all five gridded population products agreeing that 76% of pixels are either inhabited or uninhabited (Figure 2a). Only 67% for MMZ (Figure 2b) and 62% for Ecuador (Figure 2c) had full agreement by all five products. WP-16, LS-15, and GPW-15 tended to distribute population to a far greater number of pixels, unlike GHS-15 and WPE-16 (Table 2). For example, for MMZ, 85% of pixels in WPE-16 and 89% of pixels in GHS-15 were uninhabited. In contrast, WP-16, LS-15, and GPW-15 identified that only 11%, 14%, and 8% of pixels in MMZ are uninhabited, respectively.
Second, we documented extreme pixel-level population estimation disparities across gridded population products (Figure 3) and identified outliers. In a pairwise comparison of the minimum pixel values with the range identified across all five gridded population products, for Nepal and MMZ, 27 rural pixels with minimums of 0-1000 people had ranges that exceed 50,000 people. Rural outliers in Ecuador do not have the same magnitude as in Nepal or MMZ. Yet we still identified 8 rural pixels with minimum values of 0-1000 that have a range that exceeds 25,000 people in Ecuador. In the most extreme example, one pixel on the border of Nepal and India was estimated by GHS-15 to have nearly 120,000 residents (Figure 3a, Table 2). GPW-15 and WP-16 allocate 2392 and 730 people, respectively, to the same pixel, and the other two products identify fewer than 125 people. A visual inspection of high-resolution WorldView imagery from Google Earth reveals that the pixel mostly corresponds to a river with sand bars. For context, the United Nations Statistical Commission recently released standards identifying pixels within urban cores as having a population density of at least 1500 people per km² [63].
Sustainability 2021, 13, 7329
Third, we found a clear pattern that WPE-16 total population estimates greatly exceeded the other four datasets. For instance, in MMZ, we found that WPE-16 exceeds the total population measured by the other gridded population datasets by 6-11 million people (Table 2), and exceeds UN population estimates for all three geographies of interest as well. As another example, while GHS-15 tended to prioritize allocating population to urban settlements compared to the other gridded population datasets, WPE-16 identified more urban residents in Nepal, by as much as 500,000 people, than the other four gridded population products.
Fourth, GHS-15 allocated a greater share of the total population to urban areas (as defined by the MGUP dataset) than the other four products for all three regions under study. For example, in Ecuador, GHS-15 estimated that 51% of the total population is urban. LS-15 was ranked second for Ecuador, allocating 47.9% of the total population to urban areas, followed by WPE-16 with 43.67% of the total population estimated as urban. In MMZ, GHS-15 again led in terms of the share of total population allocated to urban areas, again followed by LS-15. In Nepal, WP-16 and GPW-15, respectively, followed GHS-15 in terms of the share of total population allocated to urban areas. Nonetheless, all five gridded population products underestimated the total urban population for all three geographies of interest when compared to 2018 United Nations World Urbanization Prospects estimates [36]. Indeed, even GHS-15 estimated nearly 50% fewer urban residents in MMZ than official UN counts.
Because the MGUP rural-urban binary designation does not capture how population density varies across the rural-urban continuum [29], Figure 4 illustrates how each gridded population product allocates population moving away from major urban centers. For the Kathmandu transect (Figure 4a), there is far closer agreement in populations among the products compared to Beira (Figure 4b) and Quito (Figure 4c). GPW-15, LS-15, and WP-16 capture less dense populations to the west of Beira (Figure 4b), as well as to the east of Quito (Figure 4c). In contrast, WPE-16 and GHS-15 do not capture these rural populations near Beira and Quito. This finding reinforces the preference of GHS-15 to allocate population to urban pixels.
Figure 5 presents the comparison of population estimates exposed to seismic intensity across the five gridded population products, stratified by MGUP-identified urban and rural settlements in Nepal. For the total population exposed to an intensity greater than 7, the difference between the highest and lowest populations estimated to be exposed by the products was more than 1 million people. WP-15 estimated the maximum number of people exposed at 9.85 million people, whereas GHS-15 found 8.64 million people. The products furthermore measured a wide range in both the total number and the proportion of urban populations exposed to an intensity greater than 7. On the low end, WPE-16 categorized 22% (2.11 million people) of the population exposed to an intensity greater than 7 as urban, whereas, on the high end, WP-16 identified 33% (3.33 million people) of the total population exposed to an intensity greater than 7 as urban. As such, the elevated number of urban residents exposed according to WP-16 paralleled the previous finding that WP-16 identified more urban residents in Nepal compared to the other gridded population products (Table 2).
For intensities less than 7 (Figure 5), we found a broad range of population estimates across the gridded population products. For example, for an intensity between 5 and 6, WPE-16 measured more than 7.73 million people exposed, yet GHS-15 found only 6.46 million people exposed. Generally, the gridded products were largely in agreement for these lower intensities that the vast majority of people impacted lived in rural areas, although LS-15 still identified nearly 500,000 urban residents exposed to an intensity between 5 and 6, and 200,000 urban residents exposed to an intensity between 4 and 5.

Cyclone Idai Exposure in MMZ
Generally, for Cyclone Idai, estimates of populations exposed to high wind speeds and living in flood-inundated areas in MMZ varied across the five gridded population data products, both in total and when divided into MGUP-identified urban and rural areas (Figure 6). Wind speeds of 60 km/h, though the least severe of the wind categories, had the greatest variation. For instance, WPE-16 measured 7.41 million people (80% rural), the most impacted by wind speeds of 60 km/h. GPW-15 identified only 7.01 million people (88% rural) exposed. Estimates of populations exposed to wind speeds of 120 km/h and flood-inundated areas, the most damaging hazards, also showed substantial variation. Again, WPE-16 ranked first, with 2.39 million people (83% rural) exposed to wind speeds of 120 km/h. The other four gridded products had similar estimates of the total population exposed to wind speeds of 120 km/h, which ranged from 1.89 to 1.95 million people, though GHS-15 and LS-15 identified a greater proportion of urban populations impacted compared to GPW-15 and WP-16. Similarly, estimates of populations living in flood-inundated areas ranged from WPE-16 identifying 817,000 people (88% rural) to GPW-15 identifying 1.28 million people (99% rural). Only for wind speeds of 90 km/h were the products relatively consistent: rural populations were almost exclusively exposed, with a high-end estimate of 1.56 million people by WPE-16 and a low-end estimate of 1.46 million people by WP-16.

Flash Flood Susceptibility in Ecuador
We found two main trends in the comparative analysis of population datasets and the estimation of the rural and urban share within each susceptibility decile (Figure 7). First, for most deciles, WPE-16 identified more population compared to the other four gridded population data products. For example, for the 10th decile, which was the most populated, WPE-16 estimated 2.96 million people, while on the low end, GPW-15 identified 2.47 million people. Second, across all susceptibility deciles, GPW-15 estimated a greater share of rural population compared to the other four gridded population datasets. This is demonstrated by clear differences in rural/urban proportions in decile 3, whereby GPW-15 estimated almost all population as rural, compared to the under 50% estimation of rural population identified using the other products. GHS-15 and LS-15, on the other hand, allocated a greater share of population to urban areas. Again, using the 10th decile as an example, which indicates the areas with the highest flash flood susceptibility, GHS-15 and LS-15 allocate 72% and 69% of the total population to urban areas, respectively. WPE-16 (62% urban) and WP-16 (56% urban) tend to fall between the preference of GHS-15 and LS-15 for urban areas and GPW-15's (32% urban) preference for rural areas.

Discussion
Gridded population data provide valuable population counts and density estimates for regions in the world where census data are lacking, coarse-scaled, or outdated [13]. The comparable and consistent means by which individual gridded data products are created ensures simple integration with other geospatial data products for use in measuring and monitoring various SDG indicators. However, using SDG 11.5 and a hazard context for three different geographies, we demonstrate that gridded population estimates can vary widely depending on the product of choice. Given the broad geographic contexts of our three case studies, our results suggest that gridded population products will similarly vary across many low- and middle-income countries (LMICs).
While variation in gridded population datasets has been documented by previous studies [21][22][23][24], many widely cited hazards studies (e.g., [64,65]), recent media narratives [66], and United Nations reports [67] continue to employ a single gridded population dataset without justification. In all of these cases, the authors neglect to acknowledge the wide variation in pixel-level population estimates. Indeed, we note that a recent review of estimates of population exposure to sea-level rise or living in low-elevation coastal zones [68] identified multiple global studies published since 2016 that use gridded population products. None of these studies employed multiple gridded population products in their risk estimates. Single-source population estimates, if framed in the context of hazard risk reduction related to SDG 11.5, are thus presented as facts to decision makers. This has broad implications for the allocation of scarce resources. If deployed in the immediate aftermath of a natural disaster, it could also affect humanitarian response and allocation of disaster relief.
Take two examples. First, we found that GHS-15 and LS-15 tend to allocate a greater share of population to the MODIS global urban extent product (MGUP)-designated urban areas compared to the other three products. While the MGUP rural-urban delineation is specific to the context of MODIS built-environment detection [62], and it is not the only criterion to identify urban settlement locations with EO data [25,29], this finding suggests that GHS-15 and LS-15 also prioritize allocating population to where the built environment is detected by Earth observation (EO)-derived spatial correlates. WP-16 has also been shown to prioritize the built environment [69], but our results indicate that WP-16 allocates population more evenly across land cover classes. Second, WPE-16 estimates a far greater population in Nepal, MMZ, and Ecuador compared to both UN estimates (Table 1) [1] and the other gridded population products. Should decision makers be presented with hazard risk reduction solutions or disaster preparedness scenarios based on GHS-15, they may implement policies that overly support urban populations. Communities not easily identified by the spectral signature of the EO-based data product would then be neglected. Similarly, relying only on WPE-16, in which population estimates are largely driven by leveraging the Landsat archive, may overestimate populations exposed to or impacted by a hazard, leading to the over-allocation of resources compared to policies developed with the other gridded population products. As such, efforts to monitor SDG indicators like SDG 11.5, which depend on detailed population data, can vary as a function of the gridded population product employed.
It is important to note that tracking any SDG indicator will also depend on the granularity of hazard data used in association with gridded population data. Indeed, in the context of SDG 11.5 the granularity of hazard data will affect estimates. For example, when we zoomed in on major urban centers from the three case studies presented (Figure 4, Figure 8), significant pixel-level population ranges are identified that may not actually affect the total population measured over larger geographic areas. The USGS earthquake instrument intensity data (Figure 8a) is at a much coarser granularity over Kathmandu than the EO-observed flood inundated area around Beira, Mozambique (Figure 8b), or the flash flood susceptibility layer for Quito (Figure 8c). Yet, unlike the more standardized analysis-ready hazard datasets we presented here, there is not a widely accepted set of criteria for deciding which gridded population dataset should be used to measure exposure to a given hazard.
While recent "fitness for use" guidelines provide key information for researchers, practitioners, and decision makers [5,20], our results suggest that the single use of a gridded population product should be avoided in tracking SDG indicators. These guidelines emphasize that spatial scale, reliability, and granularity of underlying census data, the population under study, and geography must be considered in data selection. Yet the variation across gridded population products we found for a range of hazards across geographic contexts signals that the single use of a product has real-world financial consequences. For instance, using our results from Cyclone Idai and a basic disaster emergency cost of $112 per person [70], relief costs solely for those in flood inundated areas in MMZ range from US$92 million using the WPE-16 estimate of 817,000 people to US$143 million using the GHS-15 estimate of 1.28 million people. Both estimates are below the official estimate of 1.85 million people impacted by Idai and the immediate financial estimate of US$300 million [43]. As such, we caution against the single use of a gridded population product. We reaffirm the need for the validation of existing products and urge future producers of gridded population products to provide error estimations.
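The relief-cost range quoted above can be reproduced with a few lines of arithmetic. The sketch below uses only the figures given in the text; the variable and function names are illustrative.

```python
# Figures quoted in the text: per-person basic disaster emergency cost [70]
# and the lowest and highest exposed-population estimates for MMZ.
COST_PER_PERSON_USD = 112

exposed = {
    "WPE-16": 817_000,
    "GHS-15": 1_280_000,
}


def relief_cost_usd(people: int, cost_per_person: int = COST_PER_PERSON_USD) -> int:
    """Total relief cost in US dollars for a given exposed population."""
    return people * cost_per_person


costs = {name: relief_cost_usd(n) for name, n in exposed.items()}
spread = max(costs.values()) - min(costs.values())

for name, cost in costs.items():
    print(f"{name}: US${cost / 1e6:.1f} million")
print(f"Spread between products: US${spread / 1e6:.1f} million")
```

The two totals come to roughly US$91.5 million and US$143.4 million, consistent with the rounded values in the text, so the choice of product alone shifts the estimate by about US$52 million.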
Different gridded population modeling approaches and EO input data can result in varying population estimates per grid cell [20]. However, the finer the spatial resolution of input administrative units (Table 2) associated with population counts, the more similar the output population values per pixel will be across products regardless of the disaggregation approach. This is clear from Figure 4a, which shows much less variation in population estimates across the urban-rural gradient around Kathmandu. Generally speaking, finer administrative units tend to concentrate in higher population density areas. Administrative units that are larger, and coarser in spatial extent, tend to associate with lower population density areas, and are affected more by the different disaggregation approaches and their underlying assumptions. The result is greater variation in population estimates per pixel within the same administrative unit. We thus encourage national census agencies to release data with associated boundary files for the highest resolution units possible.
Our results further highlight differences between constrained and unconstrained gridded population products. Constrained approaches disaggregate population counts linked to administrative units only within pixels identified as "settled." In doing so, there may be a tendency to overestimate the number of people "distributed" within high-density (urban) settings. Our results indicate that this is the case with GHS-15. In contrast, the unconstrained approach will disaggregate population counts into any pixel identified as "land," which may overestimate the number of people allocated within low-density, i.e., more rural, settings. In the case of GPW-15, the overestimation in low-density (rural) settings is expected to be even more pronounced than in the other unconstrained products given that the disaggregated population totals are evenly redistributed within all pixels identified as "land" and there is no additional ancillary data used in the model.
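The contrast between constrained and unconstrained disaggregation can be sketched on a toy administrative unit. The example assumes a one-dimensional strip of pixels and an even spread over eligible pixels in both cases. That matches the GPW-15 behavior described above, but it is a simplification for constrained products, which typically weight pixels using ancillary data. The pixel masks and census count are hypothetical.

```python
from typing import List


def disaggregate(total: float, eligible: List[bool]) -> List[float]:
    """Spread `total` evenly over the pixels flagged True in `eligible`."""
    n = sum(eligible)
    if n == 0:
        raise ValueError("no eligible pixels in this administrative unit")
    share = total / n
    return [share if ok else 0.0 for ok in eligible]


# Hypothetical unit: 8 land pixels, 2 of which are detected as settled.
is_land = [True] * 8
is_settled = [False, False, True, True, False, False, False, False]
census_count = 1000.0

# Unconstrained: every land pixel receives population.
unconstrained = disaggregate(census_count, is_land)
# Constrained: only pixels flagged as settled receive population.
constrained = disaggregate(census_count, is_settled)

print("unconstrained:", unconstrained)
print("constrained:  ", constrained)
```

Both outputs sum to the same census count, yet the per-pixel values differ by a factor of four in the settled pixels and the constrained version leaves six pixels empty. A hazard footprint covering only part of the unit would therefore return very different exposure totals under the two approaches.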
While we recommend using multiple products in hazard analysis, if selecting a single gridded population product, it is important to identify not only if the gridded population product is constrained in some way, but also how it is constrained. Underlying data sources of constrained products may have validation or uncertainty estimates that may vary by region or time point. For input settlement products that represent the constrained "built" areas in which population counts are disaggregated, there is a general expectation that those gridded products will have a more accurate representation of population distribution. However, the final gridded population product will be greatly impacted by the presence of omission and commission errors in the input settlement dataset, with small/isolated rural settlements potentially being more difficult to detect and certain types of land cover (e.g., rock outcrops or sandy soils) potentially being misclassified as settlement areas [71].
On the other hand, unconstrained approaches use a dasymetric approach to disaggregate input administrative population values within all pixels identified as "land." This will introduce error in the final gridded output for those areas that are actually uninhabited. There will be some trade-off with a minimization of the presence of omission and commission errors in whatever input settlement data is used in the modeling due to the influence of other ancillary covariates representing factors correlated with population density and presence. There is error in all products, but some basic understanding of how the gridded data products are produced can help identify which product makes the most sense for a given application. Reed et al. (2018) [72] demonstrate the relative robustness of the unconstrained WorldPop dataset compared to the constrained HRSL dataset. The similarities in error metrics for these products emphasize the importance of considering omission/commission errors in the input built datasets and subsequent allocation of population values in the final product.
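For readers less familiar with the terms, omission and commission errors in a settlement layer can be computed from a pixel-by-pixel comparison against reference data. The sketch below uses the standard remote sensing definitions (omission = missed settled pixels / all truly settled pixels; commission = falsely detected pixels / all detected pixels). The function name and the toy masks are hypothetical and do not come from the cited studies.

```python
from typing import List, Tuple


def omission_commission(detected: List[bool], truth: List[bool]) -> Tuple[float, float]:
    """Return (omission error, commission error) for a binary settlement mask."""
    if len(detected) != len(truth):
        raise ValueError("detected and truth masks must have the same length")
    pairs = list(zip(detected, truth))
    tp = sum(d and t for d, t in pairs)        # settled and detected
    fn = sum((not d) and t for d, t in pairs)  # settled but missed
    fp = sum(d and (not t) for d, t in pairs)  # detected but not settled
    omission = fn / (tp + fn) if (tp + fn) else 0.0
    commission = fp / (tp + fp) if (tp + fp) else 0.0
    return omission, commission


# Toy masks: one small settlement is missed and one bare-soil pixel
# is wrongly flagged as settled.
truth = [True, True, True, True, False, False, False, False]
detected = [True, True, True, False, True, False, False, False]

omission, commission = omission_commission(detected, truth)
print(f"omission error: {omission:.2f}, commission error: {commission:.2f}")
```

In a constrained product, every omitted pixel receives no population and every committed pixel receives population it should not, so both rates propagate directly into the final grid.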
Finally, it should be noted that most of these gridded population datasets represent a residential population, or where people are most likely to be when at home. LandScan is an exception and represents the ambient population, or the average population distribution over 24 h. That type of data product is potentially useful for assessing exposure as it represents not only the static census-based residence information but also the ambient nature of population movement over a 24 h period.

Conclusions
Vector-based administrative-level population data often fails to disaggregate population at the spatial scales requisite to identify where people actually live on the planet. This type of data can fail to provide useful information for the delivery of services required to achieve the SDGs. As such, our findings reinforce the many advantages of using gridded population products to track SDG indicators. This is especially important in the context of measuring, monitoring, and mapping SDG 11.5. Indeed, reducing exposure to hazards requires accurate population estimates, and for many LMICs, Earth observation-derived gridded population products are the best available data. Likewise, gridded population products can provide crucial information in post-disaster contexts. The case studies we showcase here reinforce the broad utility of these products and advance our understanding of "fitness for use" for both SDG 11.5 and the 73 SDG indicators that require accurate, comparable, and timely high-resolution population estimates.
Nonetheless, we highlight that for some geographical regions (and/or hazards), population estimates will vary depending on the choice of gridded population product.
Despite the variation we identify across gridded population datasets, we emphasize that uncertainty is not per se a limitation in employing gridded population datasets to track the SDGs. Without externally derived validation data or producer error metrics, it remains difficult to provide definitive recommendations in terms of what product to use, where and for what type of hazard. As such, we recommend further resources be dedicated to micro census data collection and encourage producers to quantify uncertainty in future gridded population products.
Most importantly, we recommend that researchers, practitioners, and decision makers acknowledge the inherent uncertainty when using these products. Along with leveraging the "fitness for use" guidelines [5,20], a key step to doing so is to perform a sensitivity analysis [23] and/or present a range of estimates from multiple gridded population products using an ensemble approach. To this end, we have provided the code used in the analysis and made available a global raster dataset that allows for the intercomparison of gridded population products. Furthermore, researchers and practitioners who develop tools for decision makers to track the SDGs should incorporate multiple gridded population datasets. Decision makers can thus develop policies and allocate resources informed by information that captures some of the inherent uncertainties of leveraging EO data to measure human populations across the planet. Indeed, single-use population estimates could open the door for bad actors to select the gridded population product that maximizes progress towards achieving an SDG. Lastly, we emphasize that the development of indicators, including the use of datasets like gridded population products, should be a collective effort between users and producers across decision-making levels [73].
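A minimal sketch of the ensemble approach follows, assuming all products have already been resampled to a common grid and flattened to lists aligned with a boolean hazard mask. The pixel values are invented for illustration and the function names are hypothetical; this is not the released analysis code.

```python
from statistics import mean
from typing import Dict, List


def exposed_population(pop: List[float], hazard: List[bool]) -> float:
    """Sum population over pixels where the hazard mask is True."""
    if len(pop) != len(hazard):
        raise ValueError("population grid and hazard mask must align")
    return sum(p for p, h in zip(pop, hazard) if h)


def ensemble_totals(grids: Dict[str, List[float]], hazard: List[bool]) -> Dict[str, float]:
    """Exposed population per product, keyed by product name."""
    return {name: exposed_population(grid, hazard) for name, grid in grids.items()}


# Invented four-pixel grids, one per product, on a shared grid.
grids = {
    "GHS-15": [0.0, 0.0, 900.0, 300.0],
    "GPW-15": [250.0, 250.0, 250.0, 250.0],
    "LS-15": [50.0, 100.0, 700.0, 250.0],
    "WP-16": [100.0, 150.0, 500.0, 300.0],
    "WPE-16": [80.0, 120.0, 800.0, 400.0],
}
hazard = [False, True, True, False]  # pixels inside the hazard footprint

totals = ensemble_totals(grids, hazard)
low, high, avg = min(totals.values()), max(totals.values()), mean(totals.values())
print(f"exposed population: {low:.0f} to {high:.0f} (ensemble mean {avg:.0f})")
```

Reporting the minimum, maximum, and mean together, rather than a single product's total, gives decision makers a direct view of how much the exposure estimate depends on the choice of population grid.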