Mapping Open Data and Big Data to Address Climate Resilience of Urban Informal Settlements in Sub-Saharan Africa

Abstract: This perspective paper highlights the potentials, limitations, and combinations of openly available Earth observation (EO) data and big data in the context of environmental research in urban areas. The aim is to build the resilience of informal settlements to climate change impacts. In particular, it highlights the types, categories, and spatial and temporal scales of publicly available big data. The benefits of publicly available big data become clear when looking at issues such as the development and quality of life in informal settlements within and around major African cities. Sub-Saharan African (SSA) cities are among the fastest growing urban areas in the world. However, they lack spatial information to guide urban planning towards climate-adapted cities and fair living conditions for disadvantaged residents, who mostly reside in informal settlements. Therefore, this study collected key information on freely available data for urban areas, such as data on land cover, land use, environmental hazards and pressures, and demographic and socio-economic indicators. These data serve as a vital resource for the success of many related local studies, such as the transdisciplinary research project “DREAMS-Developing REsilient African cities and their urban environMent facing the provision of essential urban SDGs”. In the era of exponential growth of big data analytics, especially for geospatial data, the utility of these datasets in SSA is hampered by their disparate nature and by the lack of a comprehensive overview of where and how to access them. This paper aims to provide transparency in this regard as well as a resource to access such datasets. Although the limitations of such big data are also discussed, their usefulness in assessing environmental hazards and human exposure, especially to climate change impacts, is emphasised.


Urban Expansion in African Countries and Its Influence on Informal Settlements
Sub-Saharan African (SSA) cities are among the fastest growing urban areas in the world. According to United Nations (UN) estimates, more than 50% of the African population will live in cities by 2030. To put this in perspective, the population is expected to double from almost 1.2 billion in 2021 to more than 2.5 billion in 2050 [1]. This continued growth puts immense pressure on major urban areas in SSA to provide basic services such as permanent housing, water and sanitation, jobs, learning opportunities and access to health care for their residents. As a downside of this growth, the UN [2] warns of a proliferation of informal settlements within and around major cities, which creates environmental hazards for poor urban residents. It is therefore imperative that efforts be made to stabilise and, if possible, improve the resilience and quality of life in urban areas, especially in informal settlement zones.

The Value of Open and Big Data for Fostering Information Processes
In rapidly evolving data-driven economies, as in most African countries, the role of openly available multisource Earth observation (EO) and other big data should not be underestimated when building urban resilience and seeking solutions to urgent environmental pressures related to urbanisation and population growth, as well as to problems of resource accessibility and utilisation. Recent research [3-5] shows how resilience is enhanced through big data analyses, which improve the speed and effectiveness of linking disaster information and systemic responses. Geospatial data are made available through various access points, ranging from crowdsourcing to traditionally established sources. To this end, modern infrastructures provide timely data access and easily accessible data sources, which in turn facilitate data analysis and support reliable evidence-based decision-making in SSA.
A starting point for this type of decision-making is the provision of freely available data, i.e., open data, on land cover, land use, environmental hazards, and pressures, as well as on demographic and socio-economic indicators of urban areas. Since the responsible authorities, interest groups and organisations usually do not have this information, access to open and freely available data is significant. This is because in most parts of SSA, critical landscape and environmental information is often sparse, and at worst non-existent, and certainly not suitable for sophisticated and complex analyses [6,7]. In particular, the problem of data availability is a major impediment to assessing environmental resilience in the region. This increases the need for easily accessible open EO and big data to address urban issues in SSA.
This perspective paper focuses on the provision of context-specific open and big data sources, access options, usage guidelines and dependencies for the organisation of different sources in any big data ventures in SSA. We show which data are available where, how to access them, and which dimensions of resilience could be considered when using the different data sources. Beyond that, we explore the barriers to be anticipated in the use of these open data sources and suggest which data sources could present opportunities for developing urban resilience against climate-related environmental hazards such as flooding. These narratives and guidelines are provided within the broader context of the Belmont Forum-funded DREAMS project.

The DREAMS Project
African cities and urban agglomerations face significant challenges in achieving the UN Sustainable Development Goals (SDGs). Even though many SSA countries have official planning systems, the trajectory of urban development is often out of governmental control, especially in informal settlements of major cities where growth is largely driven by informal social networks [8,9]. These informal settlements account for 570 million slum dwellers, about 238 million of whom (representing 23%) live in critical habitats in or around major cities in SSA [2,10,11]. They often develop when rural migrants and original inhabitants of areas undergoing rapid urban growth resort to cheap accommodation in sub-standard and unplanned settlements to support subsistence livelihoods [12,13]. The transdisciplinary research project 'DREAMS-Developing REsilient African cities and their urban environMent facing the provision of essential urban SDGs' seeks to create new sustainability pathways for African cities through participatory scenario modelling, impact assessment and integrated strategic planning. DREAMS is a Belmont Forum-sponsored project that anticipates the future development of African cities with regard to key driving forces and their social-ecological influences, as reflected in six SDGs: SDG 3, 5, 6, 11, 13 and 15 (see https://www.eli-web.com/, accessed on 4 November 2022). A key component of the project was the transdisciplinary and cross-continental partnership (SDG 17) of researchers from West Africa (Ghana), East Africa (Uganda), Southern Africa (Republic of South Africa), Germany and the United States with municipal planning authorities, NGOs and traditional leaders in the target cities.
The DREAMS project also combines expertise from the natural and social sciences which brings in experience on urban planning processes and methods that embrace Charrettes as a key participatory planning approach for addressing local hotspots where the cooperation between communities and planners is urgently needed [14]. The DREAMS project primarily focuses on existing informal settlements or those under development in three major cities in SSA. These informal settlements are usually located in areas susceptible to climate change impacts such as increased flood intensity. Through the DREAMS project activities, we sought to develop new sustainability pathways increasing the resilience of these informal settlements against such environmental pressures that affect many growing African cities. Data and information derived from these strategies are intended to be bundled and documented as learning blocks for weaker informal systems, while significant inputs for slum upgrades will be communicated to governments for action.

Charrette Approach
The DREAMS project integrates the Charrette process, which is essential in risk-sensitive planning. It consists of intensive and time-constrained public participatory activities that are usually organised as a sequence of different types of meetings, exchanges, local discussions and syntheses (Figure 1) [14]. Such a process usually stretches over 9-12 months. An effective Charrette process is premised on essential preliminary work, including problem definition, the identification of stakeholders, and the identification of information and physical logistics. Notably, poor or obsolete data can lead to poor Charrette design and poor planning outcomes. The Charrette process in the DREAMS project was constrained by the project timeline. For example, the Charrette for target cities in Ghana occurred over a 2-month period and was also affected by the lack of relevant information logistics, including updated land use change and environmental data. Hence, publicly and freely available open and big data are vital to understanding urban landscape dynamics with a view to planning for urban growth and resilience, especially in informal settlements.

Technical Considerations
The use of open data and big data has become more widespread in recent years. The open data concept was established in preparation for the International Geophysical Year of 1957-1958. It focuses on the free availability of data and on the resulting advantages of big data, which stem from exponential growth in the amount of EO data and increasing computational power [15]. Both exponential data growth and increasing computational power have become driving forces in most operational sectors in developed and developing countries [16]. [17,18]. These available geodata often cover the four dimensions of big data: large volume, recurrence (temporal resolution), variety and veracity. New approaches to geospatial data analysis are needed because resolutions are increasing: finer spatial and spectral scales make more detail visible, while a higher temporal resolution enables inter-annual and intra-annual evaluations. Due to this rapid development, centralised cloud computing has become more important for bundling computational power and avoiding the duplication of computationally intensive tasks. Over the years, several articles and research projects (e.g., [3,5,19-21]) have focused on specific aspects of big data interventions without necessarily considering the critical aspects of access, processing methods and use for specific interventions that address urban resilience.

Benefits of Global EO Data for Urban Research
Global EO data offer various opportunities for scientists and public institutions to use the rapidly growing data infrastructure to improve their strategies for building resilient cities. To address the problem of missing geo-information, satellite-based remote sensing (RS) is often used for EO (see Table 1). Thematic data from RS sensors are commonly exploited not only to map land cover at different spatial scales, but also to monitor landscape change at varying temporal scales. Substantial effort has been committed to using EO data to monitor derived land uses and environmental stressors. A review by Hirschmugl et al. [22] revealed the availability of a wide variety of methods for the detection of forest degradation, for example, in Cameroon and Congo, by using optical EO data. Despite this advancement, similar applications of EO in urban areas are complicated by the high spatial heterogeneity and close proximity of urban features, mostly urban morphology and infrastructure. At the same time, urban areas undergo more rapid landscape change than other land-cover/land-use types. For this reason, satellite-based EO information is used in various inter- and transdisciplinary fields of urban research, such as environmental management, morphological analysis and population estimation, among others [6,23].
Publicly available geodata play a crucial role in developing resilient cities, especially in regions with smaller urban planning budgets. These datasets can be classified in different ways, for instance by their format, extent and coverage, a brief description, and the relevant categories inherent in the dataset. We categorise important openly available global and continental scale datasets into four groups: (i) land cover (Table 1), (ii) urban land use (Table 2), (iii) environmental pressures (Table 3) and (iv) demographic information (Table 4). Global-scale land-cover data are derived from satellite images and contain information such as vegetation or soil types at international scales. At this scale, users can obtain these datasets across national and international boundaries. Urban land-use data focus on the structure and functions of cities and settlements and map infrastructure information such as building types. Environmental pressure data are indispensable for determining the potential environmental challenges of an urban area and for quantifying vulnerabilities such as flood risk, heat islands or light pollution. Each dataset has a value in itself, but its usability increases when the information is linked or integrated with other datasets. To visualise datasets of the different subjects, each dataset is illustrated by an exemplary map of one of the DREAMS cities (Table 2).

The dynamic and complex socio-ecological problems of urban areas demand integrated approaches. This implies that social or environmental challenges are not addressed as sectoral issues but are tailored across sectors to the complexity of the urban areas under consideration. Environmental pressures are highest amongst poorer urban residents living in informal settlements that are exposed to environmental risks such as flooding. This requires that EO data be analysed in a coupled and more holistic way. For example, to understand how urban drainage systems could accommodate flow volumes across different landscapes within the urban system, EO data on land cover and land use (Table 1) can be coupled with soil characteristics, a digital elevation model and river flow data. Flood frequency analyses based on these coupled data can then quantify the impacts of extreme flooding events on informal settlements, which are often inhabited by poor urban dwellers or squatters. This is very important for the development of flood vulnerability or risk indices, which are needed for effective flood mitigation strategies and planning to combat the impacts of climate change in SSA.
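As a minimal sketch of the flood frequency step mentioned above, the following Python snippet fits a Gumbel (EV1) distribution to annual maximum river flows by the method of moments and estimates the flows for selected return periods. The Gumbel model is a commonly used choice assumed here for illustration; the paper does not prescribe a particular distribution, and the flow values and function names are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant


def fit_gumbel(annual_maxima):
    """Method-of-moments estimates of the Gumbel location (mu) and scale (beta)."""
    mean = statistics.mean(annual_maxima)
    std = statistics.stdev(annual_maxima)  # sample standard deviation
    beta = std * math.sqrt(6) / math.pi
    mu = mean - EULER_GAMMA * beta
    return mu, beta


def return_level(mu, beta, return_period_years):
    """Flow that is exceeded on average once every `return_period_years` years."""
    non_exceedance = 1.0 - 1.0 / return_period_years
    return mu - beta * math.log(-math.log(non_exceedance))


# Hypothetical annual maximum flows in m^3/s, for illustration only.
flows = [112.0, 98.5, 143.2, 120.7, 165.9, 101.3, 131.8, 154.4, 109.6, 127.1]

mu, beta = fit_gumbel(flows)
levels = {T: return_level(mu, beta, T) for T in (2, 10, 100)}
for T, q in levels.items():
    print(f"{T:>3}-year flood: {q:.1f} m^3/s")
```

In a coupled analysis, such estimated flows would typically feed a hydraulic or terrain-based model that uses the elevation, soil and land-cover layers to delineate the areas likely to be inundated.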
An important benefit of the use of publicly available data is that spatial information can easily be complemented by demographic information, or linked to socio-demographic indicators of urban resilience. It is important to explore connections across different information types and consider the spatio-temporal dimensions to understand urban environmental risks and threat exposures to urban dwellers in informal settlements. Another benefit of exploiting EO data is that the same piece of information can be stored for each city, for different cities or for different areas within a given city. This is useful for comparing development patterns and dynamics, and for comparing the different degrees of environmental exposure of poorer dwellers in different locations to identify and prioritise well-adapted solutions and climate change mitigation options.
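To illustrate how spatial and demographic information might be linked and compared across cities, the sketch below computes the share of the informal-settlement population living in flood-prone cells. It assumes three co-registered grids per city (population counts, an informal-settlement mask and a flood-hazard mask); the city names, grid values and function name are hypothetical.

```python
def exposed_population_share(population, informal, flood_prone):
    """Share of the informal-settlement population located in flood-prone cells."""
    total = 0.0
    exposed = 0.0
    for pop_row, inf_row, flood_row in zip(population, informal, flood_prone):
        for pop, is_informal, is_flood_prone in zip(pop_row, inf_row, flood_row):
            if is_informal:
                total += pop
                if is_flood_prone:
                    exposed += pop
    return exposed / total if total else 0.0


# Hypothetical 2 x 2 grids per city: (population, informal mask, flood mask).
cities = {
    "City A": ([[100, 200], [50, 150]], [[1, 1], [0, 1]], [[1, 0], [1, 1]]),
    "City B": ([[80, 120], [300, 100]], [[0, 1], [1, 0]], [[1, 1], [0, 0]]),
}

shares = {name: exposed_population_share(*grids) for name, grids in cities.items()}

# Rank cities from most to least exposed using the same indicator.
ranked = sorted(shares, key=shares.get, reverse=True)
for name in ranked:
    print(f"{name}: {shares[name]:.1%} of informal-settlement residents exposed")
```

Because the indicator is computed in the same way for every city, the resulting values can be compared directly, provided the input grids share the same resolution and reference year.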
EO data are important to make comparisons between cities of different countries and cities within a country. The same indicators can be compiled from open and big data and integrated with other geo-located data to deepen our understanding of either similar environmental challenges across cities or contrasting situations between them. In this way, analysts and decision-makers can gain better insights into the management of environmental hazards and the effectiveness of risk assessments by evaluating the same spatial information under different scenarios. This again fosters communication between urban planners and stakeholders across different cities and countries in SSA. For example, such environmental risk data could provide useful "information logistics" for the planning and execution of the DREAMS project's Charrette activities in Ghana.
EO datasets can be compiled to propose new land-cover mapping tailored to a specific research question or to newly arising environmental threats. Such challenges can include climate change impacts, biodiversity conservation, and ecosystem functions and services occurring at different scales, from the local to the international/cross-border levels. The use of EO data also makes urban and spatial planning more effective: planners can react faster or proactively build resilience strategies to mitigate the impacts of environmental risks and protect poor urban dwellers in places such as informal settlements.

Barriers in EO Application
The use of EO data reaches its limits where high-resolution data are needed but cannot be provided by publicly available data archives, or where data with a much higher temporal resolution would be needed to monitor current problems, such as sudden flash floods or varying intensities of urban heat islands and their impact on local people [24]. The coupling of datasets from different sources can also be hindered by different spatial resolutions, which require the aggregation of the higher-resolution data and thus lead to the loss of vital details. Inconsistent data categories can be a further hindrance. In this case, the datasets must first be harmonised (i.e., their categories matched) in order to generate a thematic understanding of specific urban dynamics and enable the creation of a detailed map of vulnerable areas, such as informal settlements. Such settlements are often located in ecologically sensitive landscapes, such as floodplains [25,26]. Metadata on the categories (e.g., land cover) are therefore particularly important for carrying out status analyses. The aforementioned limitations can be addressed by resampling the data, by data fusion, or by different kinds of downscaling methods [27-31]. This is common for soil and elevation data when building and optimising spatial models to make predictions about urban resilience.
Another limitation concerns the documentation of data accuracies. Such documentation may not easily be found in the metadata, or may not exist at all. This affects the accuracy of the resulting mapping and derived products. Furthermore, the lack of standardisation of the metadata nomenclature hinders cooperation among national, transnational and regional institutions, and makes it difficult to use the data for specific urban interventions or research agendas. Therefore, global or continental institutions that are able to produce, thematically process and provide EO data are responsible for this data harmonisation. A lack of standardisation may limit the opportunity for cross-data adoption in the world of big data [31]. Evaluation products are important for estimating the quality of the mapping methods and related scores. In other cases, the actual temporal and spatial resolution of the available geospatial data may not be compatible with specific environmental risk assessments and research questions. This often results in disjointed data management processes and affects the veracity of geospatial data analysis results.
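The harmonisation and aggregation steps described above can be sketched as follows. This is a simplified illustration under stated assumptions, not a method taken from the cited studies: the class codes, the shared legend and the small grids are hypothetical, and majority aggregation is only one of several possible resampling rules.

```python
from collections import Counter

# Hypothetical class codes of two land-cover products, mapped to one shared legend.
PRODUCT_A_LEGEND = {1: "built-up", 2: "vegetation", 3: "water", 4: "bare"}
PRODUCT_B_LEGEND = {"urban": "built-up", "forest": "vegetation",
                    "grass": "vegetation", "river": "water", "soil": "bare"}


def harmonise(grid, legend):
    """Translate product-specific class codes into the shared legend."""
    return [[legend[code] for code in row] for row in grid]


def aggregate_majority(grid, factor):
    """Coarsen a grid by keeping the most frequent class in each factor x factor block.

    Ties are not handled specially in this sketch.
    """
    rows, cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [grid[i][j]
                     for i in range(r, min(r + factor, rows))
                     for j in range(c, min(c + factor, cols))]
            coarse_row.append(Counter(block).most_common(1)[0][0])
        coarse.append(coarse_row)
    return coarse


# Product A: finer grid (4 x 4 cells); product B: coarser grid (2 x 2 cells).
fine_a = [[1, 1, 2, 2],
          [1, 3, 2, 2],
          [4, 4, 2, 1],
          [4, 4, 1, 1]]
coarse_b = [["urban", "forest"],
            ["soil", "grass"]]

a_common = aggregate_majority(harmonise(fine_a, PRODUCT_A_LEGEND), factor=2)
b_common = harmonise(coarse_b, PRODUCT_B_LEGEND)

# Cell-by-cell agreement between the two harmonised products.
pairs = [(a, b) for row_a, row_b in zip(a_common, b_common)
         for a, b in zip(row_a, row_b)]
agreement = sum(a == b for a, b in pairs) / len(pairs)
print(a_common, b_common, agreement)
```

In this toy example the single water cell of the finer grid disappears after aggregation, which is the kind of detail loss the text warns about.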

Conclusions
Cities in SSA are faced with immense socio-ecological problems due to the projected increases in the rural-urban population shift by 2030. Nowhere is this more pronounced than in informal settlements within urban areas, or on the periphery of cities. The DREAMS project is based on selected cities in Ghana, Uganda and South Africa, and uses a transdisciplinary approach to develop insights to help build resilient and liveable communities, especially in the informal settlements. This involves research on land use changes, planning Charrette workshops and scenario modelling. The success of the DREAMS project is intimately tied to readily available and reliable information, especially Earth observation (EO) data or geodata. In the era of exponential increases in open and big data analytics, especially geospatial information, their usefulness in SSA has been hampered by the disparate nature of these datasets, with no comprehensive overview of where and how to access these datasets. The aim of this contribution is to create transparency regarding these matters, and we have created an online resource (Open Data and Big Data Available for Urban Resilience Assessment in SSA) to serve this purpose.
In the previous sections, we described the four categories of open and big data regarding their sources, access options, and some usage guidelines for both analysts and practitioners in SSA. We highlighted the available datasets and how to access them, in addition to the important resilience dimensions to be considered when using the different data sources to assess environmental risks and conditions. This is critical given the barriers associated with EO data, such as disparate spatial and temporal resolutions, missing accuracy documentation and a lack of data standardisation. Care must also be taken to limit the propagation of minor data management errors in the initial phase of analysis tasks, as errors can increase exponentially over the cycle of geospatial data analyses and severely affect the reliability of the results and derivations of EO data. Therefore, users who undertake spatially explicit urban environmental research need to pay attention to the metadata of the multisource EO and other big data. When using open data, uncertainties arising from different formats at different spatial and temporal scales must be adequately taken into account. Nonetheless, the integration of different types of open data and big data, such as the geospatial and socio-economic data presented here, is of great benefit.
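The caution about error propagation can be illustrated with a deliberately simplified model, which is an assumption made here and is not taken from the paper: if each processing step independently preserves a given fraction of correctly labelled cells, these fractions multiply across the workflow. The step names and accuracy values below are hypothetical.

```python
def compounded_accuracy(step_accuracies):
    """Overall accuracy if every step independently preserves its share of correct cells."""
    overall = 1.0
    for accuracy in step_accuracies:
        overall *= accuracy
    return overall


# Hypothetical per-step accuracies of a four-step geospatial workflow.
steps = {
    "source land-cover product": 0.95,
    "reprojection and resampling": 0.90,
    "category harmonisation": 0.90,
    "overlay with hazard layer": 0.85,
}

overall = compounded_accuracy(steps.values())
print(f"Overall accuracy under the independence assumption: {overall:.3f}")
```

Under this assumption, even moderate per-step errors reduce the reliability of the final product noticeably, which is why documented accuracies in the metadata matter for every input layer.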