Estimation of Gridded Population and GDP Scenarios with Spatially Explicit Statistical Downscaling

: This study downscales the population and gross domestic product (GDP) scenarios given under Shared Socioeconomic Pathways (SSPs) into 0.5-degree grids. Our downscale approach has the following features. (i) It explicitly considers spatial and socioeconomic interactions among cities, (ii) it utilizes auxiliary variables, including road network and land cover, (iii) it endogenously estimates the inﬂuence from each factor by a model ensemble approach, and (iv) it allows us to control urban shrinkage/dispersion depending on SSPs. It is conﬁrmed that our downscaling results are consistent with scenario assumptions (e.g., concentration in SSP1 and dispersion in SSP3). Besides, while existing grid-level scenarios tend to have overly-smoothed population distributions in nonurban areas, ours does not suffer from the problem, and captures the difference in urban and nonurban areas in a more reasonable manner. Our gridded dataset, including population counts and gross productivities by 0.5 degree grids by 10 years, are available from http://www.cger.nies.go.jp/gcp/population-and-gdp.html.


Introduction
Socioeconomic scenarios are needed to project carbon dioxide (CO 2 ) emissions, disaster risks, and other factors affecting sustainability from a long-term perspective. The Intergovernmental Panel on Climate Change (IPCC) published Shared Socioeconomic Pathways (SSPs) [1,2] that describe future socioeconomic conditions under various scenarios, including SSP1-3. SSP1 makes relatively good progress toward sustainability under an open and globalized world. SSP2 is a middle-of-the-road scenario assuming that the typical trends in the last decades will continue, and in SSP3, the world is closed and fragmented into regions, but it fails to achieve sustainability.
While the SSPs are devised in terms of country scenarios, finer scenarios (e.g., scenarios in terms of 0.5-degree grids) are required to analyze regional/city-level sustainability and resiliency. A number of studies have downscaled country-level socioeconomic scenarios into finer spatial units. Gaffin et al. (2004) [3] is an initial work which estimated the gridded population and GDP of the world. Unfortunately, based on [4], the authors' approach has the following shortcomings: implausibly high growth rates, discontinuity of the projection algorithm before and after 2050, and assumption of independence between population and GDP. Studies [4,5] developed new algorithms to downscale population and GDP to address these limitations. Bengtsson et al. (2006) [6] estimated the gridded urban and nonurban population projection for 1990 to 2100. While the above-mentioned studies rely on trend extrapolation (e.g., GDP extrapolation assuming a constant growth rate), Hachadoorian et al. (2011) [7] Figure 1. Procedure for population and gross domestic product (GDP) downscaling. Variables by countries, cities, and grids are coloured by green, yellow, and red, respectively. The black arrows represent the downscaling procedure while the blue arrows represent subprocessing to consider auxiliary variables. As this figure shows, urban population is downscaled from countries to cities to grids, while nonurban population is downscaled from countries to grids. GDP is downscaled from countries to grids by utilizing downscaled populations.  Notes: 1 Settlement Points, v1 (http://sedac.ciesin.columbia.edu/data/set/grump-v1-settlement-points; [19]) of Global Rural-Urban Mapping Project (GRUMP), SEDAC (Socioeconomic Data and Applications Center; http://sedac.ciesin.columbia.edu/). 2 Global maps of urban extent from satellite data (https://nelson.wisc.edu/sage/data-and-models/schneider.php), which is estimated from MODIS (MODerate resolution Imaging Spectroradiometer; https://modis.gsfc.nasa.gov/). See [20] for further details. 3 Natural Earth (http://www.naturalearthdata.com/). 4 CoW (The Correlates of War project; http://www.correlatesofwar.org/) Hereafter, the city population model, the urban expansion/shrinkage model, and the downscaling model will be explained in Sections 2.2 and 2.3, Section 2.4, and Section 2.5, respectively. For further details about these models, see Appendix A.

City Growth Model: Estimation with Current Data
This section estimates the impacts of local spatial interactions, global economic interactions, and auxiliary variables {Road dense, Airport dist, Ocean dist} on city population change between 1995 and Figure 1. Procedure for population and gross domestic product (GDP) downscaling. Variables by countries, cities, and grids are coloured by green, yellow, and red, respectively. The black arrows represent the downscaling procedure while the blue arrows represent subprocessing to consider auxiliary variables. As this figure shows, urban population is downscaled from countries to cities to grids, while nonurban population is downscaled from countries to grids. GDP is downscaled from countries to grids by utilizing downscaled populations.  Hereafter, the city population model, the urban expansion/shrinkage model, and the downscaling model will be explained in Sections 2.2-2.5, respectively. For further details about these models, see Appendix A.

City Growth Model: Estimation with Current Data
This section estimates the impacts of local spatial interactions, global economic interactions, and auxiliary variables {Road dense, Airport dist, Ocean dist} on city population change between 1995 and 2000 (source: GRUMP Settlement Point dataset version 1; see Table 1) by fitting a city growth model. A distance-decay function is used to describe the spatial interactions, whereas the trade amount among cities, which is estimated from Trade amount (see Table 1), is used to describe the global economic interactions.
The results suggest that population increases rapidly in cities with dense road network and good access to airports. These results are intuitively consistent. Also, city growth in inland areas tends to be faster than that in coastal cities. This might be because coastal cities are already matured, and their populations are more stable than those of inland cities. It is estimated that both (local) spatial interaction and (global) economic interaction accelerate population increases (or mitigates population decrease). See Appendix A for further details about the city growth model and the estimation results.

Overview
Since SSP1-3 concerns globalization, business as usual (BAU), and fragmentation scenarios, respectively, different levels of socioeconomic interactions are assumed in each scenario. Specifically, we assume that the intensity of the economic interaction doubles by 2100 in comparison with 2000 in SSP1, stays constant in SSP2, and halves in SSP3. In each scenario, the intensity of the economic interaction between 2010 and 2100 is linearly interpolated. In other words, we assume a constant growth of the interaction network connectivity over the years. See Appendix A for further details.
Under these assumptions, city populations in 2005, 2010, . . . 2100 are estimated by sequentially applying the city growth model (see Section 2.2), which projects the 5-year-after populations.

Projection of Urban Area
Projected city populations are used to project urban expansion/shrinkage. The influence of projected city populations on urban area in 2000 is modeled by Equations (1) and (2): Urban area g,2000 = a + q g,2000 (r)b + ε g, 2000 (1) where ε g,2000 denotes disturbance. Urban area g,2000 is the urban area in the g-th grid in 2000 (see Table 1). q g,2000 (r) represents the urbanization potential, where p c,2000 is the population in the c-th city in 2000, d c,g is the arc distance between the c-th city and the center of the g-th grid. a, b, and r are parameters. This model describes urbanization due to city population increase, and urban shrinkage due to city population decrease. The a, b, and r parameters are estimated by maximizing the adjusted R 2 of Equation (1). The estimate of r is 16.4, which implies that the distance at which 95% of the influence from city population change disappears is 49.2 (= 16.4 × 3) km. r = 16.4 is assumed for SSP2. On the other hand, r = 8.2 (= 0.5 × 16.4) is assumed for SSP1 to model compact urban growth, while r = 32.8 (= 2.0 × 16.4) is assumed in SSP3 to model dispersed growth. equal or less than the area of each grid. Thus, each grid can have both urban and agricultural areas. In our downscaling, projected urban and agricultural areas are used as baseline variables, which will be explained in the next section.

Downscale Approach
Following Shiogama et al. (2011) [21], which suggest the robustness of an ensemble learningbased downscaling, sub-downscaling models are integrated by an ensemble learning technique. Each submodel distributes population or GDP in accordance with distribution weights, which are defined by (baseline variable) × (control variable). Baseline variables capture the difference in urban expansion/shrinkage assumed in each scenario whereas control variables capture the influence from auxiliary variables. These variables are given in Table 2.
Our urban population downscaling applies three baseline variables and four control variables. Thus, 12 submodels distribute urban populations proportionally to (baseline variable) × (control variable). Likewise, the nonurban population downscaling has 12 submodels, while the GDP downscaling has 16 submodels. In each case, downscaling is done by a weighted average of the submodels, where the weights are estimated by applying the gradient boosting (Freidman, 2002), which is an ensemble learning method.
Note that, while city population is projected by setting 2000 as the base year, the gradientboosting-based downscaling is conducted for each year independently without setting any base year. A temporal smoothing is performed to the downscaling results to assure a gradual change of gridded estimates (see Appendix A). After all, distributions of populations and gross productivities in each country gradually change across years depending on the gradient boosting result, whereas total populations and GDPs in each country change following assumptions in SSPs by country.

Parameter Estimation Result
As discussed, weights of each submodel, which equal the weights for each auxiliary variable, are estimated by the gradient boosting. The results suggest that urban potential explains 55% (SSP1), 54%, (SSP2), and 48% (SSP3) of urban population distributions and 69%, 68%, and 64% of nonurban

Downscale Approach
Following Shiogama et al. (2011) [21], which suggest the robustness of an ensemble learning-based downscaling, sub-downscaling models are integrated by an ensemble learning technique. Each submodel distributes population or GDP in accordance with distribution weights, which are defined by (baseline variable) × (control variable). Baseline variables capture the difference in urban expansion/shrinkage assumed in each scenario whereas control variables capture the influence from auxiliary variables. These variables are given in Table 2. Our urban population downscaling applies three baseline variables and four control variables. Thus, 12 submodels distribute urban populations proportionally to (baseline variable) × (control variable). Likewise, the nonurban population downscaling has 12 submodels, while the GDP downscaling has 16 submodels. In each case, downscaling is done by a weighted average of the submodels, where the weights are estimated by applying the gradient boosting (Freidman, 2002), which is an ensemble learning method.
Note that, while city population is projected by setting 2000 as the base year, the gradient-boosting-based downscaling is conducted for each year independently without setting any base year. A temporal smoothing is performed to the downscaling results to assure a gradual change of gridded estimates (see Appendix A). After all, distributions of populations and gross productivities in each country gradually change across years depending on the gradient boosting result, whereas total populations and GDPs in each country change following assumptions in SSPs by country.

Parameter Estimation Result
As discussed, weights of each submodel, which equal the weights for each auxiliary variable, are estimated by the gradient boosting. The results suggest that urban potential explains 55% (SSP1), 54%, (SSP2), and 48% (SSP3) of urban population distributions and 69%, 68%, and 64% of nonurban populations. Regarding urban population downscaling, distance to the ocean has the biggest contribution (SSP1: 38%, SSP2: 47%, SSP3: 46%). Because many of megacities are near the ocean, the result is intuitively reasonable. Concerning nonurban population, distance to principal road has the largest contribution. It is suggested that nonurban population grows along principal roads. The contribution of principal roads is 48 % that is significant in SSP1. The percentage is calculated by aggregating shares of a g,t,k = (baseline variables) × (control variables) whose control variables equal Road (i.e., 48% = 3% + 3% + 41%; see Table A2). It might be because cities strongly interact in SSP1, and small cities emerge in between these cities. On the other hand, ocean is more important than principal road in SSP3.
Distribution of gross productivity, which is estimated by the GDP downscaling, depends on a wider variety of auxiliary variables than population distributions. In SSP1, (Urban pop × Constant) is estimated the most influential (18%), while (Urban pop × Airport dist) is the second most influential (14%). Based on the result, city growth and its interaction with airport encourage economic growth in SSP1. By contrast, (Urban potential × Road) and (Urban pot × Airport dist) have a strong impact in SSP3 with contributions of about 17%. The result is interpretable that dispersed urbanization in SSP3 yields dispersed economic growth along road network and nearby airports. In short, SSP1 and SSP3 result in compact and dispersed economic growth, respectively, and SSP2 lies in between them. See Table A2 in Appendix A for the full estimation results. Figure 3 plots the estimated population distributions in 2080 under SSP1-3. Compared with SSP3, SSP1, and SSP2 show higher population density around megacities, including London, Paris, and New York (NY). By contrast, SSP3 has higher and dispersed population density in Africa and West-Middle Asia. Thus, the populations in SSP1 are concentrated while those in SSP3 are dispersed. The concentration and dispersed patterns are thought to be due to the spatial range parameter r that is set in Section 2.4 following scenario assumptions. It is verified that these parameters are useful to control urban expansion/shrinkage following scenario assumptions. Figure 4 displays the distributions of gross productivity in 2080. The results in SSP1 and SSP2 are relatively similar; both show considerable economic productivity around mega cities (e.g., London and NY). By contrast, economic productivity is small and dispersed in SSP3.

Downscaling Result
To compare compactness/dispersion quantitatively, population densities in the grids, whose distances to the nearest city are between 0 and 10 km, 10 and 20 km, . . . 190 and 200 km are averaged respectively, and plotted in Figure 5 (left). For comparison, the evaluated values are standardized so that the sum becomes 1. This figure confirms that populations are concentrated in SSP1, moderate in SSP2, and dispersed in SSP3. The same is true for gridded gross productivities (see Figure 5 (right)). This figure also suggests that gross productivities are more concentrated in nearby cities than populations. Figure 6 displays the results of the GDP downscaling in Europe and South-West Asia. In Europe, economic productivity around major cities (e.g., London and Paris) changes significantly depending on SSPs. In South-West Asia, compared with SSP1-2, SSP3 shows lower productivity in urban areas whereas higher productivity in nonurban areas. In other words, SSP3 results in dispersed economic growth. Considering such differences among SSPs would be important for analyzing future climate risks on socioeconomic activities.   Finally, we evaluate the validity of our downscaling by comparing our population estimates using Gridded Population of the World in 2000 (GPW Version3; source: SEDAC), which is another gridded population database created by aggregating/proportionally distributing administrative data.  by History Database of the Global Environment (HYDE; [22]). The results again confirm that our estimates also have a similar tendency to the HYDE data. The local R2 values in countries (a), (b), and (c) are 0.84, 0.82, and 0.77, respectively, whereas the global R2 value equals 0.81. It is verified that our estimate, which replicates more than 80 percent of the variation in the GPW and HYDE estimates, is at least likely.   by History Database of the Global Environment (HYDE; [22]). The results again confirm that our estimates also have a similar tendency to the HYDE data. The local R2 values in countries (a), (b), and (c) are 0.84, 0.82, and 0.77, respectively, whereas the global R2 value equals 0.81. It is verified that our estimate, which replicates more than 80 percent of the variation in the GPW and HYDE estimates, is at least likely.    [10]. Estimates of [10] tend to be overly smoothed (e.g., populations are uniformly distributed in desert areas in Saudi Arabia). It might be because the authors apply a gravity-based approach, which ignores auxiliary variables. In our results, such over smoothing is not conceivable. It is verified that consideration of auxiliary variables is also needed to avoid oversmoothing.
Finally, we evaluate the validity of our downscaling by comparing our population estimates using Gridded Population of the World in 2000 (GPW Version3; source: SEDAC), which is another gridded population database created by aggregating/proportionally distributing administrative data.  Figure 9 compares our estimates in 2010 with the population count estimates provided by History Database of the Global Environment (HYDE; [22]). The results again confirm that our estimates also have a similar tendency to the HYDE data. The local R2 values in countries (a), (b), and (c) are 0.84, 0.82, and 0.77, respectively, whereas the global R2 value equals 0.81. It is verified that our estimate, which replicates more than 80 percent of the variation in the GPW and HYDE estimates, is at least likely.

Concluding Remarks
This study downscales SSP scenarios into 0.5-degree grids, using a model to consider spatial and economic interactions among cities and an ensemble learning technique to utilize multiple auxiliary variables accurately. The downscaling result suggests that SSP1, which refers to the sustainable scenario, yields a compact population distribution relative to SSP3, which denotes the fragmentation scenario. The results also show that GDP growth in major metropolitan areas changes significantly depending on the scenarios. These results are intuitively consistent. The consideration of such differences is critical to the estimation of grid level CO2 emissions, disaster risks, energy demand, and

Concluding Remarks
This study downscales SSP scenarios into 0.5-degree grids, using a model to consider spatial and economic interactions among cities and an ensemble learning technique to utilize multiple auxiliary variables accurately. The downscaling result suggests that SSP1, which refers to the sustainable scenario, yields a compact population distribution relative to SSP3, which denotes the fragmentation scenario. The results also show that GDP growth in major metropolitan areas changes significantly depending on the scenarios. These results are intuitively consistent. The consideration of such differences is critical to the estimation of grid level CO2 emissions, disaster risks, energy demand, and other variables determining future sustainability and resiliency.

Concluding Remarks
This study downscales SSP scenarios into 0.5-degree grids, using a model to consider spatial and economic interactions among cities and an ensemble learning technique to utilize multiple auxiliary variables accurately. The downscaling result suggests that SSP1, which refers to the sustainable scenario, yields a compact population distribution relative to SSP3, which denotes the fragmentation scenario. The results also show that GDP growth in major metropolitan areas changes significantly depending on the scenarios. These results are intuitively consistent. The consideration of such differences is critical to the estimation of grid level CO2 emissions, disaster risks, energy demand, and other variables determining future sustainability and resiliency.

Concluding Remarks
This study downscales SSP scenarios into 0.5-degree grids, using a model to consider spatial and economic interactions among cities and an ensemble learning technique to utilize multiple auxiliary variables accurately. The downscaling result suggests that SSP1, which refers to the sustainable scenario, yields a compact population distribution relative to SSP3, which denotes the fragmentation scenario. The results also show that GDP growth in major metropolitan areas changes significantly depending on the scenarios. These results are intuitively consistent. The consideration of such differences is critical to the estimation of grid level CO 2 emissions, disaster risks, energy demand, and other variables determining future sustainability and resiliency.
Nonetheless, various other important issues require further study. First, spatially finer auxiliary data is needed to sophisticate our downscaling approach. For example, microscale urban data, such as industrial structure, detailed road network, and traffic volume, are required to describe urban phenomena such as industrial agglomeration, growth of transportation networks, and birth of new cities, which we could not consider. Since consideration of these factors can increase the uncertainty of downscaling, it is crucial to employ a robust estimation approach, such as ensemble learning (applied in this paper) or Bayesian estimation (as done by [23] for population projection).
Second, downscaling to finer grids is required. Although 0.5-degree grids are sufficient to evaluate socioeconomic activities in each region, these grids are not sufficient to quantify urban form, i.e., compact and disperse. Finer grids, such as 1-km grids, are required to analyze the impact of urban form on climate change mitigation and adaption. High-resolution auxiliary variables would be needed to achieve it.
Third, consideration of longer-term trend of urban expansion, population and economic growth is needed. Fortunately, historical data of gridded population, production, and so on, are now available at the HYDE database [22] Use of this database would especially be valuable to improve the accuracy of long-term projections.
Forth, it is important to discuss how to use our estimates for city-level economic policy-making. For example, our estimates, which reveal local emission intensity, are potentially useful to optimize carbon taxation, green bonding, and other mitigation policies for individual cities. Our estimates will also be useful to estimate local exposure to flood, heat, and other disasters; the estimated exposures will be useful to consider local adaptation policy, for example, through subsidy for encouraging people to move from high risk areas to safer areas. Related to policy-making, the project titled World Urban Database and Access Portal Tools (WUDAPT: http://www.wudapt.org/) is an interesting activity. The project aims to (i) collect data describing urban forms and functions (e.g., land cover, building structure, and building allocations), (ii) utilize the data to classify urban areas into 17 Local Climate Zones (LCZs) [24], and (iii) design universal policies for each of the LCZs toward improving climate resilience. While LCZs classify urban areas based on their influence on the ambient local climate and distributions of population and gross productivity are key factors determining CO 2 emissions and amount of wasted heat. To combine our downscaled populations and GDPs with LCZs might be an interesting topic to devise appropriate policies.
Our downscaling results are available from "Global dataset of gridded population and GDP scenarios", which is provided by the Global Carbon Project, National Institute of Environmental Studies (http://www.cger.nies.go.jp/gcp/population-and-gdp.html). This dataset summarizes population and GDP scenarios in 0.5 × 0.5 degree grids between 1980 and 2100 by 10 years. The gridded data between 2020 and 2100 are estimated by downscaling country-level SSP1-3 scenarios (SSP database: https://secure.iiasa.ac.at/web-apps/ene/SspDb/dsd?Action=htmlpage&page=about) as explained in this manuscript, whereas those in 1980-2010 are estimated by applying the same downscaling method to actual populations and GDPs by country (source: IMF data; http://www.imf.org/data).

Conflicts of Interest:
The authors declare no conflict of interest.

A.1. Projection of Urban Population and Urban Expansion
City Growth Model: Model The 5-year population changes of 67,934 cities (source; SEDAC Settlement Point dataset; see Table 1 and Figure A1) are estimated using the following spatial econometric model.
p c,t is the population of city c in year t. p t (log) and ∆p t (log) are N × 1 vectors whose c-th elements are log(p c,t ) and log(p c,t /p c,t-5 ), respectively. X t is an N × K matrix of explanatory variables, ε t is an N × 1 vector of disturbance with variance σ 2 , 0 is an N × 1 vector of zeros, I is an N × N identity matrix, α is a coefficient (scalar), and β is a K × 1 coefficient vector.
pc,t is the population of city c in year t. pt (log) and Δpt (log) are N × 1 vectors whose c-th elements are log(pc,t) and log(pc,t /pc,t-5), respectively. Xt is an N × K matrix of explanatory variables, εt is an N × 1 vector of disturbance with variance σ 2 , 0 is an N × 1 vector of zeros, I is an N × N identity matrix, α is a coefficient (scalar), and β is a K × 1 coefficient vector. Following the literature on spatial econometrics, Wgeo, W e1 , and W e2 are given by rowstandardizing (i.e., row sums are scaled to one) W 0 geo, W 0 e1, and W 0 e2, which describe connectivity among cities. W 0 geo is a spatial connectivity matrix whose (c, c')-th element is exp(-dc,c' /h), where dc,c' is the arc distance between cities c and c', and r is a range parameter. For instance, if h = 100 km, 95% of the spill over effects disappear within 300 km (=3 ×100km; [25]). In other words, a large h implies global spill over from cities whereas a small h implies local spill over. W 0 e1 and W 0 e2 describe economic connectivity. Since we could not find any data on economic connectivity among cities, we approximated it with Eq.(A2), which represents an estimate of trade amount between cities c and c': where PC is the population of the country, including the c-th city, and TC,C' is the amount of trade between countries C and C' (source: CoW data set; see Table 1). Equation (A2) simply distributes the amount of trade, TC,C', in proportion to city populations. The (c, c')-th element of W 0 e1 is given by ̂ , ′ if cities c and c' are in different countries (i.e., C ≠ C'), and 0 otherwise. By contrast, the (c, c')-th elements of W 0 e2 are given by ̂ , ′ if these cities are in the same country (i.e., C = C'), and 0 otherwise. Finally, We1 and We2 describe international and national economic connectivity, respectively.
If ρgeo is positive, population growth in a city increases the populations in its neighboring cities. When ρe1 and/or ρe2 is positive, population growth in a city increases the populations in foreign cities with strong economic connectivity. Intuitively speaking, ρgeo and ρe2 capture local interactions, and ρe1 captures global interactions.
In short, our city growth model projects 5-year-population-change considering attributes of the cities, local spatial interactions among neighboring cities, and global interactions among cities with strong economic connectivity.

City Growth Model: Estimation
We used the data of city populations (1990, 1995, and 2000)  Following the literature on spatial econometrics, W geo , W e1 , and W e2 are given by row-standardizing (i.e., row sums are scaled to one) W 0 geo, W 0 e1 , and W 0 e2 , which describe connectivity among cities. W 0 geo is a spatial connectivity matrix whose (c, c')-th element is exp(-d c,c' /h), where d c,c' is the arc distance between cities c and c', and r is a range parameter. For instance, if h = 100 km, 95% of the spill over effects disappear within 300 km (=3 × 100 km; [25]). In other words, a large h implies global spill over from cities whereas a small h implies local spill over. W 0 e1 and W 0 e2 describe economic connectivity. Since we could not find any data on economic connectivity among cities, we approximated it with Equation (A2), which represents an estimate of trade amount between cities c and c':t where P C is the population of the country, including the c-th city, and T C,C' is the amount of trade between countries C and C' (source: CoW data set; see Table 1). Equation (A2) simply distributes the amount of trade, T C,C' , in proportion to city populations. The (c, c')-th element of W 0 e1 is given bŷ t c,c if cities c and c' are in different countries (i.e., C = C'), and 0 otherwise. By contrast, the (c, c')-th elements of W 0 e2 are given byt c,c if these cities are in the same country (i.e., C = C'), and 0 otherwise. Finally, W e1 and W e2 describe international and national economic connectivity, respectively. If ρ geo is positive, population growth in a city increases the populations in its neighboring cities. When ρ e1 and/or ρ e2 is positive, population growth in a city increases the populations in foreign cities with strong economic connectivity. Intuitively speaking, ρ geo and ρ e2 capture local interactions, and ρ e1 captures global interactions.
In short, our city growth model projects 5-year-population-change considering attributes of the cities, local spatial interactions among neighboring cities, and global interactions among cities with strong economic connectivity.

City Growth Model: Estimation
We used the data of city populations (1990, 1995, and 2000) provided by GRUMP, and estimated Equation (A1) while assuming t = 1995. In other words, (population in 2005)/(population in 2000) is projected from (population in 2000)/(population in 1995). The spatial 2-step least squares (2SLS; [26]) is used for the estimation. Specifically, to estimate r in W geo , 2SLS is iterated while varying r values, and the optimal r value, which maximizes the adjusted R2, is identified.) The explanatory variables are road density (Road dens), distance to the nearest airport (Airport dist), and distance to the nearest ocean (Ocean dist; see Table 1), whose coefficients are denoted by β road , β ocean , and β airport , respectively. Table A1 summarizes the estimated parameters. The table suggests that population increases rapidly in areas with dense road network and good access to airports, although the latter is statistically insignificant. These results are intuitively consistent. The positive sign of β ocean suggests that city growth in inland areas is faster than that in coastal cities. This might be because coastal cities are already matured, and their populations are more stable than those of inland cities.
Regarding parameters describing interactions, β geo has a statistically significant positive effect, whereas β e2 does not. Thus, geographic proximity is a significant factor determining local-scale city interactions. On the other hand, β e1 , which quantifies global-scale interactions, is statistically significant. It is suggested that consideration of both local and global-scale interactions is important in city growth modeling.
The quasi-adjusted R 2 for the population change in 5 years, ∆p t+5 , is 0.401, which is not very accurate. However, the value of R 2 for the population after 5 years, p t+5 , is 0.998. Since we focus on the latter, the accuracy of the model is sufficient.
While we used the 2SLS method, which is computationally efficient because of large samples, a Bayesian approach is also available to estimate the model Equation (A1) (see e.g., [27]). The Bayesian estimation, which explicitly considers uncertainty in model parameters, would be an important future task, to quantify uncertainty in our socioeconomic scenarios.

City Growth Model: Application for City Population Projections
Since SSP1-3 represents globalization, BAU, and fragmentation scenarios, respectively, different levels of international interactions are assumed in each scenario. Specifically, we assume that ρ e1 doubles by 2100 in comparison with 2000 in SSP1, ρ e1 is constant in SSP2, and ρ e1 becomes half the value of 2000 by 2100 in SSP3. In each scenario, the values for ρ e1 between 2000 and 2100 are linearly interpolated.

Projection of Urban Potentials
Increase/decrease of city population encourages/discourages urbanization in the neighboring areas. Thus, this study evaluates urbanization potential using Equation (A2), which equals Equation (2) when t = 2000: wherep c,t is the city population in year t, which is projected as explained just above, and d c,g is the arc distance between the c-th city and the center of the g-th grid. The potential q g,t (r) increases nearby cities with large population. Although r is a range parameter just like h in W 0 geo , r represents the range of spill over around each city, whereas h (= 209 km; see Table A1) represents the range of spill over across cities. Thus, r must be smaller than h. Considering the consistency with the subsequent urban area projection in Section 2.4, r is given by a value maximizing the explanatory power of urban potential, q g,t (r'), on urban expansion. In other words, r is estimated by maximizing the adjusted R-squares (adj-R 2 ) of the following model, Equation (2)

Projection of Urban Area
This section projects urban extent based on estimated urbanization potentials (see Figure 2). The 5-year change of urban area in each grid is projected by Equation (A4), which is derived from Equation (1): We also project the expansion of nonurban residential areas due to the potentials. This study assumes that nonurban residential areas are proportional to Agri area (see Table 1), and the 5-year change is estimated by the following model: The parameters in Equation (A4) for 2000 are estimated by the adjusted-R 2 maximization of Equation (1) whose Urban Area g,2000 is replaced with Agri Area g,2000 (Equation (A5) is obtained from Equation (4) after the replacement). The estimated values arer A = 12.1 andb A q = 0.129. While b q A = 0.129 is assumed across scenarios, r A values in SSP1-3 are given by 6.05, 12.1, and 24.2, respectively, just like r. Urban areas and agricultural areas are projected by applying Equations (A4) and (A5) sequentially. In each sequence, if (Urban Area g,t+5 + Agri Area g,t+5 ) exceeds the area of the grid, Agri area g,t+5 is reduced. Urban Area g,2000 and Agri Area g,2000 are used as baseline areas. Thus, each grid can have both urban and agricultural areas.
The next section applies the estimated urban and nonurban areas as weights for proportional distribution. In the distribution, the range parameters, h, r, and r A control the share of populations and gross productivity nearby cities. For instance, if r is very small as in SSP1, most people and gross productivity are concentrated nearby cities. As such, the proportional distribution can describe both urban expansion and shrinkage depending on the range parameter values. Similarly, r A controls the nonurban population distribution. In case of SSP1, the small r A concentrates nonurban populations into grids with greater Agri Area with greater potentials. The populations are dispersed in SSP3 whose r A value is large.

A.2. Downscale Approach
We downscale the urban and nonurban populations and GDPs utilizing projected city populations, urbanization potentials, urban areas, and other auxiliary variables summarized in Table 1.
To date, numerous downscale methods have been proposed in quantitative geography, geostatistics, and other fields. The accurateness of the dasymetric mapping, which simply distributes populations in proportion to auxiliary variables, has been remarked upon in many comparative studies (e.g., [28,29]). We use Equation (A6), which modifies the dasymetric mapping model to consider differences in scenarios (Square root is used because distribution weights are defined by the product of two weight variables.): f (a g,t,k ) = a ssp g,t a g,t,k ∑ g∈C a ssp g,t a g,t,k where Y C,t is population or GDP in country C including the g-th grid in year t. a ssp g is a baseline variable to control urban expansion/shrinkage assumed in each scenario. Urban area g,t , Agri area g,t , and UAgri area g,t (=Urban area g,t +Agri area g,t ; see Table 2), which are projected under each SSP, are used to downscale urban population, nonurban population, and GDP, respectively. a g,t,k is a control variable capturing influence from auxiliary variables, where k is the index of control variables. We are not sure which auxiliary variables are appropriate for a g,t,k . Hence, this study downscales population/gross productivity in g-th grid at year t, y g,t , using a weighted average of dasymetric mapping models, which is formulated as followŝ where ω k,t measures the importance of the k-th submodel, f (a g,t,k ). The following country level model is obtained by aggregating the grid-level model presented by Equation (A8).
Y C(g),t = ∑ g∈C(g) K ∑ k=1 ω k,t f (a g,t,k ) (A8) ω k,t in the downscale model Equation (A7) is estimated by gradient boosting, which is an ensemble learning technique, for Equation (A8). As explained in Section 2.5, the gradient boosting takes a weighted ensemble mean of 12 submodels in the urban and nonurban population downscaling, while 16 submodels exist in the GDP downscaling. Meanwhile, our ensemble learning means averaging of the submodels based on the weights optimized by the gradient boosting. Roughly speaking, the gradient boosting optimizes the weights, ω k,t : (i) the weights for the submodels are equally set by ω k,t = 1/K; (ii) residuals are evaluated using Equation (A8); (iii) samples (e.g., Y C(g),t values) are weighted according to the size of the residuals; (iv) the ω k,t values are updated so that model accuracy is improved for samples with larger weights (i.e., larger residuals in step (ii)); and (v) steps (ii), (iii), and (iv) are iterated until convergence. The gradient boosting procedure is known to be robust even if the submodels are collinear.
The gradient boosting is performed for every target year. To assure the gradual change of the weights across years, the ω k,t value is replaced with − ω k,t = (ω k,t−1 + ω k,t + ω k,t+1 )/3, which is their temporal moving average. Finally, the submodels in year t is averaged by the gradient boosting first, and the resulting models at time t−1, t, and t+1 are temporally averaged subsequently. Table A2 summarizes estimated ω k,t parameters in 2080. Section 3.1 discusses the parameter estimates.