Slums from Space — 15 Years of Slum Mapping Using Remote Sensing

The body of scientific literature on slum mapping employing remote sensing methods has increased since the availability of more very-high-resolution (VHR) sensors. This improves the ability to produce information for pro-poor policy development and to build methods capable of supporting systematic global slum monitoring required for international policy development such as the Sustainable Development Goals. This review provides an overview of slum mapping-related remote sensing publications over the period of 2000–2015 regarding four dimensions: contextual factors, physical slum characteristics, data and requirements, and slum extraction methods. The review has shown the following results. First, our contextual knowledge on the diversity of slums across the globe is limited, and slum dynamics are not well captured. Second, a more systematic exploration of physical slum characteristics is required for the development of robust image-based proxies. Third, although the latest commercial sensor technologies provide image data of less than 0.5 m spatial resolution, thereby improving object recognition in slums, the complex and diverse morphology of slums makes extraction through standard methods difficult. Fourth, successful approaches show diversity in terms of extracted information levels (area or object based), implemented indicator sets (single or large sets) and methods employed (e.g., object-based image analysis (OBIA) or machine learning). In the context of a global slum inventory, texture-based methods show good robustness across cities and imagery. Machine-learning algorithms have the highest reported accuracies and allow working with large indicator sets in a computationally efficient manner, while the upscaling of pixel-level information requires further research. For local slum mapping, OBIA approaches show good capabilities of extracting both area- and object-based information. Ultimately, establishing a more systematic relationship between higher-level image elements and slum characteristics is essential to train algorithms able to analyze variations in slum morphologies to facilitate global slum monitoring.

Morphological features typical for slum areas (adapted from [41,54]).

This review analyzes the diversity of remote sensing (RS) studies of the past 15 years that deal with the challenge of extracting slums. It is based on a systematic literature search, performed in December 2015, using several search engines (Web of Science, Science Direct, SpringerLink Journals, Taylor & Francis and Scopus) and covers the keywords "slums," "informal," "unplanned," "squatter," "precarious," "spontaneous," "illegal," "deprived," "irregular" or "substandard settlement/area," "self-help housing," "shantytown," "favela" or "bidonville" and "mapping" or "remote sensing." The review covers journal publications, book sections and conference publications that could be retrieved either via the employed search engines or websites of the main RS conferences. Only English-language papers were selected, and very similar publications by the same authors (e.g., journal and conference publication) were counted only once. In total, 87 key publications ([3–7,10,23,25,31–33,38–41,43–46]) were identified. A temporal analysis of the number of publications shows an increasing trend (Figure 1), having a high linear correlation with the number of satellite launches (r² = 0.75). Satellite launches were derived from the following websites (including only the earth observation satellites with a spatial resolution of 5 m and less): In the mid-2000s, when more VHR satellites became available, the number of related publications increased. The same occurred for the period after 2010.
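The reported correlation between publication counts and satellite launches can be illustrated with a minimal sketch of how a squared Pearson correlation coefficient is computed from two yearly series. The function name `r_squared` and the yearly counts are assumptions made for illustration only; they are not the data behind Figure 1 and will not reproduce the reported value of 0.75.

```python
def r_squared(x, y):
    """Squared Pearson correlation coefficient of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length with at least two values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return sxy ** 2 / (sxx * syy)


# Made-up yearly counts for illustration (hypothetical, not from the review)
launches_per_year = [1, 2, 2, 4, 3, 6]
publications_per_year = [2, 3, 5, 8, 7, 11]

print(f"r^2 = {r_squared(launches_per_year, publications_per_year):.2f}")
```

A high value only indicates that the two series move together over time; it does not by itself show that sensor availability caused the growth in publications.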
The analytical framework (Figure 2) for analyzing the retrieved publications, inspired by the outcome of the expert meeting on slum mapping [36], forms the skeleton for this review. Contextual factors include aspects such as geographic location and climate, the topography of the city, the location within the city including proximity to services, and general socioeconomic and political factors (e.g., land governance). For example, slum dwellers often trade off accessibility to livelihood opportunities with locations exposed to hazards. Physical slum characteristics are often an expression of the slum-development processes, i.e., from low density at their infancy stage to high-density mature slums, sometimes also including increasing building size and height. For example, slums can have multiple incrementally constructed floors [2]. Patterns of roads, building layouts and general site characteristics define the growth potential of a specific settlement. When mapping slums, physical slum characteristics need to be well understood for translating them into image-based proxies. The data and requirements of slum-mapping studies relate to imagery and ancillary data and the level (scale) of analysis, e.g., extraction of dwelling units (objects) versus delineation of settlements (areas). Thus the scale varies from small objects (e.g., slum buildings that can be below 20 m²) to large settlements of several hectares [8,82]. Furthermore, the required spatial, spectral and temporal resolutions for slum mapping need to be specified. These requirements are closely linked to extraction methods.
Across studies, a multiplicity of extraction methods for slum mapping have been employed, from classical visual image interpretation to OBIA or machine learning, or a combination of methods, with the main methodological challenge of translating a relevant set of slum characteristics into robust indicators (e.g., developing a slum ontology) for image-based slum mapping [23] that would ultimately allow for a global slum inventory.
Remote Sens. 2016, 8, 455

Contextual Factors
Context matters for slum mapping. We first provide an overview of the terminological differences regarding settlements with poor living conditions as they affect the choice and definition of indicators. We also summarize the purposes of slum-mapping studies linked to the socioeconomic and political context. The final section gives an overview of geographic locations mapped by slum studies linking to variation in climate and topography.

Terminological Differences
The nomenclature of slum settlements varies with the connotations attached to them [41,122]. To some extent, these terms reflect the different views on such settlements. Terms such as "informal," "illegal" or "squatter," for instance, focus on the land rights (tenure status) [24], whereas "unplanned" relates to the planning context [41]. "Spontaneous" or "irregular" emphasizes the growth dynamics [123], whereas "deprived," "shantytown" and "sub-standard" are associated with poor physical and socioeconomic conditions [54]. The recent revival in popularity of the rather political term "slum" [122] is largely linked to the Habitat Agenda and the related development goals [27,124]. The analysis of the retrieved publications with respect to these terms in combination with RS methods (see Table 2) identified "informal settlement/area" (47%) and "slum" (29%) as the most commonly used terms in the RS community; some researchers use both terms interchangeably (6%). Less frequently used terms refer to the physical condition (e.g., "deprived/sub-standard"), focus on a specific issue (e.g., "refugee camps") or relate to a specific national context (e.g., "migrant housing" or "urban villages" in China). Terms such as "squatter" or "unplanned," which were common in the 1970s–1980s planning literature, are no longer commonly used. That "informal settlement/area" is the most frequently used term in the RS literature is somewhat problematic, as it denotes the legal (tenure) status of an area, which cannot be directly extracted from imagery. A change in tenure status does not necessarily affect the physical characteristics. In this review, we use the term "slum" to refer to urban areas with poor living conditions, as this term explicitly expresses physical characteristics such as high densities or irregular patterns, indicators that can be derived by means of RS methods.
Here, an ontological framework (e.g., developed by [5,7]) "provides a comprehensive description of spatial characteristics and their relationships to represent and characterize slums in an image" ([31], p. 155). Such an ontology framework (split into three phases: specification, conceptualization and implementation [5]) provides a clear conceptual foundation for developing robust image-based indicators, facilitating global knowledge acquisition and comparisons for the development of a global slum inventory.

Table 2. Frequency of publications using a specific term (within the reviewed remote sensing publications).

Purposes of Slum Mapping Using Remote Sensing
Our second contextual topic concerns the different purposes of RS-based slum-mapping studies. The review has identified three key geographical questions (where, when and what?) as the main objectives of studies. Often, researchers aim at the provision of basic information on "where" the slums are located within the urban fabric and what their areal extent is. Such information helps compensate for the non-availability of socioeconomic information (e.g., income levels) in many cities of the Global South [31]. Besides its importance for urban development [68], the where question is also relevant within a humanitarian context, for which several studies [67,76,109] developed methods to map refugee camps (e.g., under Copernicus) [125].
While development dynamics of slums at the city scale are of particular interest for local planning and decision support [10,126], only a few studies have focused on temporal slum dynamics (when) (e.g., [10]). This could be related to challenges in extracting these dynamics, in particular in terms of data availability and obtaining local knowledge. Examples of studies on dynamics are the analysis of the process of forced mass evictions in Harare (Zimbabwe) [104], the investigation of built-up changes for large slum settlements such as Kibera-Nairobi [3] or the exploration of development dynamics of slums in Delhi, showing stagnation in the center versus growth in the periphery [127]. Such multi-temporal information can feed simulation models on the growth of slum areas, generating policy-relevant information on future growth scenarios [48,128–132].
Several publications have focused on what-related issues, such as the number of slum inhabitants, since slums are not well covered in many census statistics (with high uncertainty about the number of inhabitants) [133]. Moreover, RS-based population estimates allow a more detailed spatial and temporal disaggregation [133,134]. However, population estimates of slums can vary [4], as illustrated for the case of Kibera (Nairobi) [3], where estimates differed by half a million people depending on the sample data used. For the slums in Hyderabad, India, Kit et al. [134] computed slightly lower image-based population figures than the figures reported by the census. Furthermore, relying on physical proxies for population estimations can lead to errors for areas that have not yet been fully occupied, e.g., new developments in outskirts [70]. Other what-related issues deal with boundaries and effectiveness of policies for health campaigns [84,135], allocation of public services and protection of environmentally sensitive areas [68] or spatial planning and policy formulation [69]. These efforts are related to the fact that local planning authorities often lack elementary information on slums, which "has led to a deficit in policy for these areas, as without quality map data, it is often difficult to plan effectively for these areas" ([69], p. 390), leading to ad hoc plans that do not consider the specific locational context. For example, for one settlement in Johannesburg, Gunter [69] mapped 10,000 more dwellings using Google Earth (GE) images than reported in the government estimates. Such discrepancies are problematic for policy development and monitoring and may point to conceptual differences in what constitutes a slum dwelling. Spatial information on slums can support local governments in better determining the demand for basic services and other relevant amenities [136] and in monitoring slums via RS-based proxies of "human deprivation or well-being" ([137], p. 68).
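The sensitivity of image-based population estimates to sample data and to occupancy, as discussed in this section, can be illustrated with a simple sketch. It assumes a basic proxy model (mapped roof area times a sampled person density times an occupancy share); this model, the function name `estimate_population` and all numbers are hypothetical and are not taken from any of the cited studies.

```python
def estimate_population(roof_area_m2, persons_per_m2, occupancy=1.0):
    """Rough population estimate from mapped roof area.

    roof_area_m2   -- total roof area extracted from imagery (m^2)
    persons_per_m2 -- person density derived from a field sample (assumed)
    occupancy      -- share of mapped buildings that are inhabited (0..1)
    """
    if roof_area_m2 < 0 or persons_per_m2 < 0:
        raise ValueError("area and density must be non-negative")
    if not 0.0 <= occupancy <= 1.0:
        raise ValueError("occupancy must lie between 0 and 1")
    return roof_area_m2 * persons_per_m2 * occupancy


# Hypothetical settlement: same mapped roof area, two different sample densities
roof_area = 500_000.0
for density in (0.2, 0.4):
    print(f"density {density}: {estimate_population(roof_area, density):,.0f} people")

# A partly occupied new development lowers the estimate accordingly
print(f"{estimate_population(roof_area, 0.4, occupancy=0.6):,.0f} people")
```

Because the estimate scales linearly with the sampled density, a sample that is not representative of the whole settlement carries its error directly into the total, and ignoring occupancy overestimates the population of areas still under development.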

Geographic Locations, Climate and Topography
Given the aim of identifying relevant issues for developing a global slum inventory, we analyze the geographic distribution of RS-based information on slums by mapping the case study locations found in English-language publications on top of a population density map (Figure 3). In the figure, "slum cities" are grouped into locations where object-level information (roofs or roads), area-based slum maps, or both were extracted. Object-level information is mainly available in sub-Saharan Africa (SSA). As expected, there is a spatial relationship between areas of high urban population densities in the Global South and the location of case studies. The highest concentrations are found in South-East Asia and SSA (East and South). Some clusters also exist in North and West Africa and South/Central America. Examples are even found in the Global North, dealing with the monitoring of informal development, e.g., in Greece [153] and the US [119]. The cities covered span (sub)tropical, Mediterranean, arid and continental climates, and range from low-lying areas with rather flat terrain (e.g., Dhaka) to high-lying cities (e.g., La Paz) with steep slopes. Still, many urban regions with very dynamic urban and slum developments are not well covered in English-language publications, e.g., areas in the Caribbean, West and Central Africa or in South-East Asia. Also, areas in Europe might become a future focus, considering the recent erection of refugee camps or examples of deprived Roma settlements in European countries [144]. Many of the regions not covered belong to the least developed countries with large income inequalities and/or unstable political conditions, e.g., Liberia, Congo or Myanmar. In such countries, access to ground-truth or reference data might be even more of a problem. Moreover, many studies are about methodological developments and do not create exhaustive citywide slum maps, illustrating that we are still far from a global slum inventory.

Table 3. Application domains of remote sensing-based information on the morphology and temporal dynamics of slums.

The majority of the reviewed publications are authored by academic researchers, both from universities in the Global North (48%) and Global South (21%) or combinations thereof (6%). Fewer publications stem from research centers (including national RS agencies) in the Global North (8%) and Global South (5%), and one from a commercial image provider (1%). Moreover, there is some cooperation between research centers and universities in the Global North (5%) and South (2%), also across South and North (2%), or with an NGO (2%). The majority of English-language publications from the Global North link to the global slum debate. South–North and South–South cooperation are of particular relevance for knowledge exchange and transfer, for bridging technology gaps and for further expanding our knowledge to more cities, including very unstable regions like Sudan [125].

Physical Characteristics of Slum Areas
VHR imagery provides a detailed representation of the physical elements of a landscape, capturing physical characteristics of slums. This section conceptualizes these characteristics derived from imagery and considers their diversity.

Characterization of Slum Areas
The definition of what constitutes a slum is complex. Variations exist between global, regional and local slum definitions [154] that can result in large differences in mapped slum areas [64]. Many publications adopted the global UN-Habitat definition of slums (e.g., [4,43,155]), which consists of five well-established indicators: secure tenure, adequate access to safe water, access to acceptable forms of sanitation, overcrowding, and the durability of housing considering both the quality of the structures as well as site conditions in terms of hazards. For instance, based on the work of Weeks et al. [43], Duque et al. [61] used these indicators (i.e., wall material, overcrowding, access to piped water, sanitation connection to sewers, and ownership) to build a slum index based on census data for the city of Medellin (Colombia). A comparison of this index with image-based information on land cover, structural and texture-based features showed that the image-based information could explain 59% of the variability in the slum index. A major problem in employing the UN-Habitat definition in RS-based studies is that only the indicator "durability of housing conditions" has a direct link to information extracted from imagery, namely location aspects (such as location on steep slopes or along major drainage channels [112]) and compliance with building codes measured via density, distance or roofing material [2]. Slums do not have "easily distinguishable spectral signatures" ([45], p. 661), as roofing material may vary within slums (e.g., plastic, iron, concrete, tin, asbestos), between different slums and globally between cities. For the example of Accra (Ghana), Engstrom et al. [64] concluded that when using the UN-Habitat definition, most of the city is classified as slums, while an image-based identification matched the local delineation of slums much better. These examples indicate that global slum definitions need to be adjusted to the local context.
However, most researchers fail to start with a local characterization of the slum morphology and the development of related image-based proxies. Table 4 presents an overview of physical characterizations of slums found in the literature, split into five major dimensions: building geometry, density, arrangement (pattern/road), roofing material, and site characteristics. The most frequently used characteristics are small roof sizes, high density, and irregular patterns (visible by irregular and narrow streets combined with heterogeneous building orientation). Densities in Asian cities tend to be higher than in SSA cities (Asia ~80% and SSA ~60%) [5,41]. However, also in SSA, centrally located slums, such as those in Nairobi, Kenya, with an estimated roof cover of 50%–60%, have high densities [156]. For roofing material and physical site characteristics, there are large differences between cities across different geographic regions and even between slums. For instance, in Dehradun, slums are characterized by tone differences due to different roofing materials (e.g., plastic, wood etc.) [75], but in Guangzhou [97] or Ahmedabad [5], spectrally similar roofing material characterizes slums. Regarding physical site characteristics, there is also no general agreement; however, slums are often located in areas that are not suitable for construction (e.g., on a flood plain, steep slope or other hazardous locations) [157]. To conceptualize such physical characteristics, Kohli et al. [5] developed a slum ontology (Figure 4), based on Hofmann et al. [7], which consists of three spatial levels (object, settlement and environs). For each level, indicators identify specific physical slum characteristics. Yet the ontology requires a local adaptation, as not all indicators are relevant for a specific local slum identification [158]. Thus slums are different from non-slum areas, but are not homogeneous [4].
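The comparison between a census-based slum index and image-based information described earlier in this section can be sketched as follows. The sketch assumes an unweighted mean of five deprivation shares as the index and a single image feature in a simple linear regression; both are simplifications chosen for illustration and are not necessarily the procedure used in the cited studies. The function names, indicator keys and all values are hypothetical.

```python
def slum_index(shares):
    """Unweighted mean of deprivation shares (each in 0..1) for one spatial unit."""
    if not shares:
        raise ValueError("at least one indicator is required")
    return sum(shares.values()) / len(shares)


def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length with at least two values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("both series must vary")
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    a = mean_y - b * mean_x
    # Share of the index variability reproduced by the fitted line
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, 1.0 - ss_res / syy


# Hypothetical census shares per spatial unit (not real data)
units = [
    {"poor_walls": 0.7, "overcrowding": 0.6, "no_piped_water": 0.5,
     "no_sewer": 0.8, "no_ownership": 0.4},
    {"poor_walls": 0.4, "overcrowding": 0.5, "no_piped_water": 0.2,
     "no_sewer": 0.5, "no_ownership": 0.3},
    {"poor_walls": 0.2, "overcrowding": 0.2, "no_piped_water": 0.1,
     "no_sewer": 0.2, "no_ownership": 0.3},
    {"poor_walls": 0.0, "overcrowding": 0.1, "no_piped_water": 0.0,
     "no_sewer": 0.1, "no_ownership": 0.1},
]
# Hypothetical image-based texture feature for the same units
texture = [0.82, 0.55, 0.30, 0.15]

index = [slum_index(u) for u in units]
intercept, slope, r2 = fit_line(texture, index)
print(f"R^2 = {r2:.2f}")
```

With several image features, the same idea extends to a multiple regression, in which case the share of explained variability is read from that model instead.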

The Diversity of Slums
Besides the commonalities of slums in terms of physical characteristics, we also explore the heterogeneity of slums. Already in 1962, Charles Stokes differentiated between "slums of hope" and "slums of despair" [159]. Slums vary between and within cities, and also internally, in terms of sub-standard living conditions [122,160]. Therefore, recently, some researchers have been exploring different slum typologies based on building sizes, density, pattern or location (Table 5). However, slums are often not the worst-off areas in terms of socioeconomic conditions [31,114,161]. Thus, such typologies also include fuzzy classes (e.g., semi-formal), reflecting the dilemma that some areas are formal but are physically and/or socioeconomically similar to slums, e.g., high-density resettlement colonies in Delhi [54]. Conversely, areas such as historic cores can have morphological characteristics that align with slums without being slums on the ground. The established typologies (Table 5) range from two to five categories. The main factor that influenced authors to develop such typologies is the diversity on the ground, e.g., very deprived areas versus areas that have an insecure tenure status but are better off in terms of building characteristics. Some differences are visible in imagery and may assist in a semi-automatic slum identification. However, none of the reviewed studies established a (semi-)automatic image-classification approach to extract slum typologies.

Data Availability and Spatial Requirements
The complexity of physical slum characteristics requires advanced sensor systems for mapping purposes. This section focuses on available imagery data and the spatial requirements in terms of spatial resolution and extent (settlement to urban region level) of reviewed studies.

Our Remote Eyes: Available Sensors
The successful launch of Ikonos-2 on 24 September 1999 heralded a new era of urban RS. The increased availability of high- and very-high-resolution imagery produced by sensors such as Ikonos, QuickBird and WorldView has provided a new and rich data repository for urban research in general and for slum-related research in particular, as it allows for a more detailed spatial analysis [162]. Here, very-high-resolution (VHR) sensors have a spatial resolution of the panchromatic (PAN) band of 1 m and below, while high-resolution (HR) sensors have spatial resolutions between 1 and 5 m. Besides commercial VHR imagery, since 2005, GE has provided universal web-based access to VHR imagery, although not providing the original spectral bands, which limits potential analysis.
An increasing number of multi-spectral (MS) and panchromatic (PAN) VHR sensors has become available (see Figure 5). For instance, since August 2014, the first commercial satellite with a spatial resolution of 0.31 m (PAN) and 1.24 m (MS) allows an improved object-level analysis. While the first sensors were launched by countries in the Global North, there is an increasing number of launches of (V)HR sensors by countries in the Global South (such as NigeriaSat). Also, China has launched a large number of (V)HR sensors; however, access to data from outside China is an issue. Besides optical systems, synthetic aperture radar (SAR) systems are gaining an increasing role in extracting information on slums, especially since the availability of (V)HR systems, e.g., PALSAR:

An increasing number of multi-spectral (MS) and panchromatic (PAN) VHR sensors has become available (see Figure 5). For instance, since August 2014, the first commercial satellite with a spatial resolution of 0.31 m (PAN) and 1.24 m (MS) allows an improved object-level analysis. While the first sensors were launched by countries in the Global North, there is an increasing number of launches of (V)HR sensors by countries in the Global South (such as NigeriaSat). Also, China has launched a large number of (V)HR sensors; however, access to data from outside China is an issue. Besides optical systems, synthetic aperture radar (SAR) systems are gaining an increasing role in extracting information on slums, especially since the availability of (V)HR systems, e.g., PALSAR:   Analyzing the imagery used in the reviewed studies ( Figure 5), we identify QuickBird, launched in 2001 with a spatial resolution of 0.61 and 2.44 m (PAN and MS) and a revisit time of 3 days, as the most frequently used sensor (33%). The revisit time does not equal repetition rate, e.g., WV-3 needs 4.5 days until capturing a scene with the same geometric characteristics (20˝off-nadir or less at exactly the same position). While, images taken <1 day might have different geometric characteristics, e.g., causing problems for multi-temporal image comparison. The second most frequently used sensor is Ikonos (11%), with a spatial resolution of 1 and 4 m and the same revisit time. This is followed by SPOT (mainly SPOT-5) (9%), with a slightly lower spatial resolution (SPOT-5: Pan: 2.5/5 m and MS: 10 m) and revisit time of 5 days; Landsat (9%) and aerial photos/imagery (8%). The latter have been an important spatial information source in mapping and analyzing (e.g., [163,164]), in monitoring growth processes (e.g., [73]) and in extracting buildings in slums (e.g., [102]). 
The main advantages of aerial photographs are that archives often cover long time series and have very high spatial resolutions (in the cm range). Apart from (V)HR imagery, some studies employed moderate-resolution imagery (e.g., Landsat, Terra ASTER), for example to analyze vegetation cover in slums [110], which is often a good proxy for deprivation [165].

Spatial Requirements of Slum-Mapping Studies
Spectrally, most of the imagery has 2-3 VIS bands and 1-2 IR bands. The availability of more VHR sensors with more spectral bands (e.g., WorldView-2 with 8 bands) producing images of improved spatial resolution raises the question of what is an optimal or minimum spatial resolution for slum mapping. In this respect, Jacobsen and Büyüksalih [166] determined the required GSD (ground sampling distance) for building objects to be 2 m and for footpaths 1-2 m, while for minor roads 5 m was considered sufficient. However, detailed building object information requires a GSD below 0.5 m and sufficient contrast between buildings and their surroundings [167]. Moreover, this may vary between urban environments. For instance, in cities with a high clustering of buildings, such as many Asian cities, a resolution of 2 m does not allow the extraction of roof objects [41]. Furthermore, according to Pesaresi and Ehrlich ([168], p. 45), when "assuming a typical minimal built-up element in a settlement, having a size of 10 × 10 m, we need at least 0.5 m." Many slum buildings are, however, considerably smaller than 100 m². Moreover, roof surfaces are frequently not homogeneous; for instance, when using a VHR sensor, the majority of the roof pixels will be "mixed pixels" (due to different materials/shadow/illumination) [168]. Consequently, not only the high density of roofs but also the heterogeneity of roof surfaces seriously limits the automatic extraction of roof objects, so that manual editing is subsequently required to produce reliable information [169]. There is as yet no systematic study that analyzes the impact of different spatial and spectral resolutions on the accuracy of extracting object-level information in slums. It is also notable that most studies on roof [55,73,87,91] or road extraction [95] are from African cities (see also Figure 3), where coverage densities and clustering of roofs are in general somewhat lower than in Asian cities [5].
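The relation between GSD and object size discussed above can be illustrated with simple arithmetic. The following minimal sketch estimates how many pixels fall on a square object for a given GSD. The function name and the 4 × 4 m dwelling size are illustrative assumptions; only the 10 × 10 m element and the 0.5 m GSD are taken from the quotation of [168].

```python
def pixels_per_object(side_m: float, gsd_m: float) -> float:
    """Approximate number of pixels covering a square object.

    side_m: edge length of the object in metres (illustrative input).
    gsd_m:  ground sampling distance of the sensor in metres.
    """
    if side_m <= 0 or gsd_m <= 0:
        raise ValueError("side_m and gsd_m must be positive")
    return (side_m / gsd_m) ** 2


# 10 x 10 m built-up element at 0.5 m GSD (values quoted from [168])
print(pixels_per_object(10.0, 0.5))  # 400.0
# Assumed 4 x 4 m slum dwelling at the same GSD
print(pixels_per_object(4.0, 0.5))   # 64.0
# The same assumed dwelling at 2 m GSD
print(pixels_per_object(4.0, 2.0))   # 4.0
```

Under these assumptions, a small dwelling imaged at 2 m GSD is covered by only a handful of pixels, most of which would be mixed with neighboring roofs or shadow, which is consistent with the difficulty of roof extraction in dense settlements reported above.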
Considering the high costs of commercial VHR imagery and the required processing resources, many studies have focused on methodological advances and therefore covered only small spatial extents, e.g., subsets of scenes (34%), settlements (24%) or administrative units (9%) (Table 6). Methods developed for one scene segment are not necessarily transferable to other scenes [105]. However, more than one-quarter performed the analysis for an entire city (28%) or at the urban region scale (5%). The city and urban regional scales are important stepping stones for building a global slum inventory. A further stepping stone towards a global slum inventory is a recent pilot study to map the slums of an entire country (South Africa) [77].

Slum-Mapping Approaches
Among the reviewed studies, multiple methods have been used to map slums. This section focuses on the most promising methods with respect to extracted information level (objects or areas) and achieved accuracies. In general, the level of analysis depends on the spatial resolution of available imagery, the specific urban morphology and the information requirements.

Methods Employed for Slum Mapping
In order to explore the discursive context of slum-mapping efforts, we analyze the actual information that is extracted in the reviewed case studies (Table 7 rows). The majority (55%) of studies identify entire slum areas [74,82,100], and fewer studies aim at extracting objects in slums (15%), i.e., roofs [40,87] or roads [95]. The extraction of object-level information depends largely on the relation between (available) data sources and morphological characteristics of the study area, meaning that roof or road extraction works well when objects have clearly visible spacing and contrast in the imagery. The more classical focus on extracting land use/cover information is addressed by 17% of the publications (e.g., [96,101]). Within this category, a recent research stream aims at mapping built-up areas using, for example, texture measures. Here, the gray-level co-occurrence matrix (GLCM) is commonly used (e.g., [62,170,171]), which is also the basis for the "anisotropic rotation-invariant built-up presence index" (Pantex) [172]. Finally, a limited number of studies develop methodologies to analyze the link between image-based and socioeconomic indicators (6%) (e.g., [173]) or the diversity of slums (7%) (e.g., [85]). Since the expert meeting on slum mapping in 2008 [36], more methods and cases have been explored, expanding the global knowledge repository of slum characteristics and their variability. Brito and Quintanilha [174] stated that in recent years many methods have been based on feature extraction, but that there is no clear agreement on the most successful method(s), and that the majority of studies rely on optical data. Imagery with sub-meter resolution still poses many unresolved technical challenges for the characterization of slums, such as mixed pixels or the obliqueness of images.
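To make the GLCM-based texture measures mentioned above more concrete, the following is a minimal sketch in pure Python that builds a normalized co-occurrence matrix for one pixel offset and derives two common texture statistics (contrast and entropy). The function names, the single offset, and the toy patches are assumptions for illustration; this is not the implementation used in the cited studies or in Pantex, which combine several offsets and further processing steps.

```python
import math
from collections import Counter


def glcm(image, dx=1, dy=0, symmetric=True):
    """Normalized gray-level co-occurrence matrix as {(i, j): probability}.

    image: 2D list of integer gray levels (toy input).
    dx, dy: pixel offset between the two pixels of a pair (assumed defaults).
    """
    rows, cols = len(image), len(image[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[(a, b)] += 1
                if symmetric:
                    # Count the pair in both directions.
                    counts[(b, a)] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("image too small for the chosen offset")
    return {pair: n / total for pair, n in counts.items()}


def contrast(p):
    """GLCM contrast: large when neighboring gray levels differ strongly."""
    return sum(prob * (i - j) ** 2 for (i, j), prob in p.items())


def entropy(p):
    """GLCM entropy in bits: large when co-occurrences are spread out."""
    return sum(prob * math.log2(1.0 / prob) for prob in p.values())


smooth = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]    # homogeneous toy patch
checker = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]   # fine-grained toy patch
for name, patch in (("smooth", smooth), ("checker", checker)):
    p = glcm(patch)
    print(name, contrast(p), entropy(p))
```

In this toy case the homogeneous patch yields zero contrast and entropy, while the fine-grained patch yields higher values, which is the kind of difference texture-based approaches exploit to separate densely built-up areas from more regular surroundings.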
Thus, there is "a strong need of new approaches for automatic image understanding on remote sensing data bridging the gap between visual and automatic image interpretation" ([175], p. 3). In this respect, complex (visual) interpretation elements (e.g., height, shadow, pattern and site) (Figure 6, [41,176,177]) also need to be more systematically explored [177].
Figure 6. Complexity of image interpretation elements (adapted from [41,176,177]).
Already in 1998, Mason and Fraser [178] specified three main characteristics of an effective system to map/monitor slums: low cost (data acquisition and processing), semi-automated processes (fast and reliable results) and simple usage by low-skilled operators (standard software). The analysis of the employed methods in the reviewed slum publications shows that most studies used commercial and rather expensive imagery. Only very few studies used free data sources such as GE images, mostly for visual image interpretation (e.g., [69,84,119]), visualization of slums [90] or in combination with commercial imagery [72,80], whereas Praptono et al. [98] used GE images to automatically detect slums employing a Gabor filter and GLCM with a promising accuracy of 74%. Many of the methods used commercial software solutions and, to some extent, open-source software. Nevertheless, neither is easy for non-RS experts to operate.
Overall, the methods to extract slums are rather diverse (Table 7 columns). The most frequently used method in the last 15 years was OBIA (32%), also referred to as GEOBIA [179]. For OBIA, the transferability [82] or robustness [71] of rules and indicators is a critical issue; in this respect, texture- or morphology-based methods, which account for 16% of the studies, perform better [82]. Significantly, Hofmann et al. [180] stressed that a systematic adaptation of segmentation parameters is crucial to transfer rules from one image to another. Several studies focused on the optimization of scale parameters [181], where the Estimation of Scale Parameters (ESP) tool allows optimizing the scale based on patterns in the data [182].
Apart from OBIA, visual image interpretation (17%) and standard pixel-based image classification (13%) were employed. However, standard pixel-based classification methods are not well suited to analyzing a complex urban environment with high spectral diversity, very small and clustered objects and diverse morphological characteristics. Therefore, many researchers used machine-learning algorithms (14%) such as neural networks [59], random forest (RF) or support vector machines (SVM) [72]. Machine-learning approaches are information-driven approaches that allow for repetitive learning from a large and rich set of training data [94]. However, those approaches are mainly pixel-based methods, which are "not very effective in high-resolution urban image classification" procedures ([116], p. 869). Therefore, a larger spatial context of many neighboring pixels needs to be included, for example through multiple-instance learning [116] or Markov random fields [46]. Given that neighborhoods or wards are relevant spatial units of policy and decision-making processes, the issue of aggregation is important, e.g., via segments (e.g., [41,88]), regular grids (e.g., [104]) or non-overlapping blocks [173].
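The aggregation of pixel-level results to regular grids mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the per-pixel labels are a hypothetical binary classification output (1 = slum, 0 = other), and the cell size and the 50% threshold are arbitrary choices, not values reported in the cited studies.

```python
def aggregate_to_grid(labels, cell, threshold=0.5):
    """Aggregate per-pixel slum flags (0/1) to square grid cells.

    labels:    2D list of 0/1 values (hypothetical per-pixel classification).
    cell:      grid cell size in pixels (assumed parameter).
    threshold: minimum share of slum pixels for a cell to be flagged (assumed).
    Returns (fractions, flags): two 2D lists with one entry per grid cell.
    """
    if cell <= 0:
        raise ValueError("cell must be positive")
    rows, cols = len(labels), len(labels[0])
    fractions, flags = [], []
    for r0 in range(0, rows, cell):
        frac_row, flag_row = [], []
        for c0 in range(0, cols, cell):
            # Cells at the image edge may be smaller than cell x cell pixels.
            block = [labels[r][c]
                     for r in range(r0, min(r0 + cell, rows))
                     for c in range(c0, min(c0 + cell, cols))]
            share = sum(block) / len(block)
            frac_row.append(share)
            flag_row.append(share >= threshold)
        fractions.append(frac_row)
        flags.append(flag_row)
    return fractions, flags


pixels = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
fractions, flags = aggregate_to_grid(pixels, cell=2)
print(fractions)  # [[0.75, 0.0], [0.0, 1.0]]
print(flags)      # [[True, False], [False, True]]
```

Keeping the slum-pixel share per cell, rather than only the thresholded flag, leaves room to choose a different threshold or aggregation unit later, for instance when results are reported per ward.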
Crossing the main foci and methods (Table 7), OBIA appears to be the most common method for extracting both slum areas and objects in slums. Although rather labor intensive, visual interpretation is still used for slum identification, producing reliable results when carried out by skilled interpreters; however, texture/morphology and machine-learning methods are increasingly being used.

Accuracy Levels and Employed Methods
The last dimension of the analysis deals with the performance of indicators and methods, measured by accuracy levels. Across the studies, there is much diversity with respect to these levels. For instance, Ella et al. [63] compared various texture features (e.g., local binary pattern (LBP), GLCM, lacunarity) by training a support vector machine. While LBP achieved the highest accuracy of 98%, GLCM had an accuracy of 94%. Among single-indicator approaches, lacunarity was identified as having a high utility for extracting slums (e.g., [45,57]); however, lacunarity cannot identify small slum pockets as it requires a rather large window size [10]. Verzosa and Gonzalez [118] suggested entropy for monitoring uncontrolled sprawl, while the morphology of slums can be described by spatial metrics [41,83] with reported accuracies of not more than 70%. Besides the use of single indicators or small sets of indicators, several studies used large sets of indicators. For example, Owen and Wong [6] performed a systematic comparison between indicators to distinguish formal and slum areas using 24 spectral, accessibility, texture, scale-based and morphological indicators. The results showed that the best indicators were entropy of roads, vegetation patch size, and vegetation patch compactness. Similarly, Graesser et al. [46] focused on the development of consistent predictors for formal and slum areas by a decision tree using GLCM, lacunarity, histogram gradients, linear feature distribution, line support regions, vegetation indices, and textons (texture patches). Their results showed that texton features were most robust for all included cities (i.e., Caracas, Kabul, Kandahar, and La Paz), achieving a maximum accuracy of 92% [43]. Thus, a fully automatic system for mapping slums with 100% accuracy is not in sight. However, reported accuracy levels show promising developments for semi-automatic methods.
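As an illustration of why lacunarity depends on the window (box) size, the following minimal sketch computes lacunarity with the common gliding-box formulation, i.e., the ratio of the second moment of the box masses to the squared mean box mass. The binary toy images and the function name are assumptions; the reviewed studies may use other lacunarity variants (e.g., for grayscale images) and much larger windows.

```python
def lacunarity(binary, box):
    """Gliding-box lacunarity of a binary image for one box size.

    binary: 2D list of 0/1 values (toy input, 1 = built-up pixel).
    box:    edge length of the gliding box in pixels.
    Returns E[M^2] / E[M]^2, where M is the number of 1-pixels per box.
    """
    rows, cols = len(binary), len(binary[0])
    if box <= 0 or box > rows or box > cols:
        raise ValueError("box must fit inside the image")
    masses = []
    for r in range(rows - box + 1):
        for c in range(cols - box + 1):
            # Box mass: number of occupied pixels inside the current box.
            masses.append(sum(binary[r + i][c + j]
                              for i in range(box) for j in range(box)))
    n = len(masses)
    mean = sum(masses) / n
    if mean == 0:
        raise ValueError("lacunarity is undefined for an empty image")
    second_moment = sum(m * m for m in masses) / n
    return second_moment / (mean * mean)


uniform = [[1] * 4 for _ in range(4)]   # evenly filled toy image
clustered = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]                                        # one compact cluster, large gaps
print(lacunarity(uniform, 2))            # 1.0
print(lacunarity(clustered, 2) > 1.0)    # True
```

Because the statistic is computed over all boxes inside a window, a stable estimate needs a window that contains many boxes, which is in line with the limitation for small slum pockets noted above.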
Apart from comparing the capacity of indicators, the performance of methods is evaluated. In general, advanced approaches (such as mathematical morphology analysis) perform better than standard classification approaches [67]. To evaluate the performance of methods, we compare the accuracy of all reviewed slum-mapping publications (Figure 7). The highest mean accuracy is obtained by machine-learning approaches, but texture- and statistics-based approaches also show promising results, while the variance of the performance of OBIA is rather large. The cases of lower accuracies of OBIA are often related to very complex urban environments such as Indian cities, where slum areas are very diverse and often have spatial characteristics similar to those of formal areas. Thus, the obtained accuracy levels depend not only on the methodology, but also on the urban morphology and on how well slum characteristics are captured by image-based proxies. To address this, Shekhar [105] proposed an OBIA procedure that first identifies formal areas; the remaining built-up areas are then classified as slums, achieving an overall accuracy of 87%.
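For readers less familiar with how such accuracy figures are typically derived, the following minimal sketch computes the overall accuracy and the per-class producer's accuracy from a confusion matrix. The matrix values are invented for illustration and do not come from any reviewed study; the convention that rows hold reference classes and columns hold mapped classes is also an assumption, as conventions differ between publications.

```python
def overall_accuracy(cm):
    """Share of correctly classified samples (diagonal / total)."""
    total = sum(sum(row) for row in cm)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def producers_accuracy(cm):
    """Per-class share of reference samples mapped correctly.

    Assumes rows = reference classes, columns = mapped classes.
    """
    return [cm[i][i] / sum(cm[i]) for i in range(len(cm))]


# Invented example: rows/columns ordered as (slum, formal)
cm = [
    [40, 10],   # reference slum: 40 mapped as slum, 10 as formal
    [5, 45],    # reference formal: 5 mapped as slum, 45 as formal
]
print(overall_accuracy(cm))    # 0.85
print(producers_accuracy(cm))  # [0.8, 0.9]
```

Because a single overall figure can hide how well the slum class itself is captured, per-class measures are worth reporting alongside it when comparing methods across cities.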
In conclusion, machine-learning methods seem to be more successful when aiming at extracting slum areas at the city scale, whereas OBIA was found to work well for the extraction of objects (e.g., roofs, roads) at the settlement level when the urban morphology, combined with imagery of sufficient resolution, allowed their extraction. Both methods can be combined, e.g., by using image segmentation together with machine-learning approaches [183], which brings together the advantages of both.

Challenges and Promising Aspects for a Global Slum Monitoring System
Based on the results presented in the previous sections, the most promising aspects of the reviewed studies (in terms of context, physical characteristics, data and requirements, and methodologies) for developing a global slum inventory are explored.

Access to Image Data and Contextual Factors
Our geographic "ground" knowledge on slums is limited to a few urban regions. Therefore, Owen and Wong [38] recommended more systematic comparisons between different slum settlements, which very few studies have carried out (e.g., [46]). It would be important to compare the performance of indicators, methods, and image data for different urban contexts across the globe to obtain an overview of robust indicators, methods, and required data. A major initiative in this respect is under way at the Oak Ridge National Laboratory, where researchers are working on "a computationally efficient and automated framework that is capable of detecting new settlements (especially slums) across the globe" ([117], p. 1425). To promote large-scale slum-mapping programs, clear guidelines and continuous political support are necessary, as shown by the challenges of implementing past slum-mapping programs, for instance in Indian cities (RAY; the vision of a slum-free India) [184].
A major bottleneck (besides image costs) is image availability due to frequent cloud cover in tropical cities. However, with the massive increase in VHR sensors, we can expect an improvement in image availability. To further overcome this problem, more (V)HR SAR sensors are available (e.g., TerraSAR-X) that penetrate clouds and are suitable for texture analysis of slums [60,89]. Also, LIDAR data have enormous potential for object extraction with their capability of extracting building heights [185], a relevant indicator for slum mapping [4]. Furthermore, drones (UAVs) are able to fly below clouds to capture settlement details [186], but covering entire cities would be computationally challenging, in addition to raising inevitable privacy issues. In addition, other image data sources have potential, e.g., night-light images. Unfortunately, the resolution of sensors like OLS is too low to map detailed inner-urban night-light variations. On a metropolitan scale, however, researchers have successfully correlated poverty rates with observed night-time lights (e.g., [187]). Alternative image sources for global slum mapping are, for instance, GE images that allow working with VHR imagery free of charge (democratizing data access), where texture-based image analysis showed promising accuracies [98]. To increase the classification accuracy, several studies have proposed the use of auxiliary data, such as a DSM for built-up or roof extraction [153] or VGI (volunteered geographic information) [69].

Systematic Conceptualization of Slums: Methods and Slum Characteristics
Regarding the transferability and robustness of OBIA-based methods (across different data and locations), locally developed rule sets have their limitations, compared to the better performance of texture-based or machine-learning algorithms [188]. Furthermore, the latter have the capacity to deal with multilayer inputs of spectral, texture or spatial-physical indicators (e.g., [46]). However, machine-learning algorithms (both parametric and non-parametric) mostly employ per-pixel classifiers. Only a few examples extracted area-based layers (e.g., via segments) [188], which would be more relevant for slum mapping towards informing pro-poor policies. Thus, besides setting up a well-structured conceptual framework in the form of a slum ontology [71] and developing a consistent framework for assessing the transferability and robustness of slum extraction methods [23], the advantages of both methods (OBIA and machine learning) need to be combined.
One major dilemma when assessing the performance of slum-mapping methods is access to reference data. In general, studies use ground truth data (collected in the field) (e.g., [45]), expert delineations (e.g., [82]), or available municipal data sets (e.g., [183]). All data sets have the inherent dilemma of "what is a slum," as slum definitions vary between and within countries and even within a city, and different institutions can have different slum definitions and therefore different slum maps (e.g., in Jakarta [189]). These uncertainties have a negative impact on classification accuracies [158] and reduce the comparability of the performance of different slum-mapping methods. In Sections 5 and 7, the role of systematically selected image-based proxies (e.g., in the form of a slum ontology) for methodological advances in slum mapping has been stressed. Therefore, a more systematic exploration of potential proxies to describe differences within slums and between slums and formal areas is needed. A first starting point is the relation between classical visual image interpretation elements [116] and physical characteristics of slums. This would allow developing systematic rules for OBIA or training machine-learning approaches similar to how human interpreters recognize slums. An overview of proxies is presented in Table 8.
Table 8. Overview of proxies.
Non-primary interpretation elements, employed proxies: roof density [4]; vegetation density [120]; line segment heterogeneity [78]; settlement form and distance to river [82]; accessibility and slope [6]; border length and distances to hazardous areas [105]; building distance and height [4]; building orientation [72]; shadow [71]; patch density pattern [41]; line orientation and distribution [46]; entropy of objects (e.g., roads) [6]; LBP [63]; vegetation pattern [54]; aggregation [41]; Fourier transformation [190]; vegetation patch compactness [6].
Non-primary interpretation elements, to be explored: shadow variation; shadow density; connectivity of roads/footpaths.
Primary interpretation elements (color/tone), employed proxies (two entries marked NA): roof colors/material [82]; NDVI [72]; road material [96]; soil index [104]; V-I-S model [43,135].
Primary interpretation elements (color/tone), to be explored: pattern of roof material; variation of roof material; land cover variation.
In the literature, many interpretation elements have been employed, from simple (primary) ones based on color to complex (tertiary and higher) ones based on pattern or site. However, many potentially interesting combinations have not been systematically explored (Table 8), e.g., homogeneity of roofing material at the settlement level. Building height has only been used in an explorative study [4]. Other site/association-related proxies have only been used in a few OBIA studies (e.g., [78,82]). Promising proxies relate to distance to livelihood opportunities and services, road features (e.g., available from OSM) or general line features and their orientation and pattern. For example, we know that lines in slums are generally much more irregular and heterogeneous in orientation compared to formal settlements [78]. Furthermore, the analysis of heterogeneity (versus homogeneity) and patterns of density, roof materials, shadow, vegetation, and other land cover types could provide promising proxies. Thus, a more systematic inclusion of all levels of interpretation elements will be important to improve slum extraction approaches. To avoid the high dimensionality of large proxy sets, automated feature selection will be the way forward [191], allowing the selection of the most relevant features (indicators) while excluding redundant ones.
Presently, we know too little about the global diversity of slums and how to capture this diversity by robust image-based proxies. Thus, a more systematic exploration of potential proxies is required for the development of a global slum inventory. Besides the establishment of a global slum inventory, regular monitoring systems at local level are necessary for the detection of changes but also as an instrument to monitor policy implementation or to protect slum dwellers against illegal evictions.

Conclusions
In this review, we identified the variety of methodological advances in slum-mapping studies that are relevant for developing a global slum inventory. The reviewed literature shows that the current geographic knowledge on slum characteristics is rather limited. This knowledge needs to be extended to cover the main urban regions in the Global South, especially where urban growth rates and poverty levels are high. Specifically, more comparative studies on proxies are needed across the globe, using a systematic depiction of established slum characteristics (i.e., building geometry, density, pattern, roof material and site characteristics) versus image (interpretation) elements for the development of robust image-based proxies. This also requires a clear conceptual frame to assess their transferability and robustness. The same is required for methodologies. On the basis of the reported accuracies and the ability to process larger data and indicator sets, the most promising methods for a global slum inventory use machine-learning approaches. Several important recommendations for future methodological developments are: (1) include better contextual properties (larger neighborhoods); (2) avoid pixel-based approaches; (3) employ scalable aggregation levels that allow the mapping of smaller slum pockets as well as larger slum areas; (4) include more complex interpretation elements (site and association) and proxies based on ancillary data; and (5) examine the impact of different sensor characteristics on classification accuracies.
For OBIA aimed at extracting object-level information on roofs or roads, which is often required for counting dwellings or estimating population, the availability of VHR imagery in the range of 30 cm might open up new avenues, in particular for very high-density slums in Asian cities. For the further development of OBIA approaches that are fit for city or settlement-level information extraction, the important recommendations and considerations are: (1) select suitable sensors for the local context; (2) translate systematic slum characterization into robust and transferable rule sets; (3) include readily available ancillary data in the classification process; and (4) link image-derived products with socioeconomic data.
Locally feasible and quick monitoring approaches could rely on either OBIA or machine learning, but single-indicator approaches (e.g., GLCM or lacunarity) also have the potential to quickly capture the location and extent of slum areas in support of pro-poor policy implementation. Therefore, capturing the local slum morphology with the most suitable indicator(s) transferable to imagery of different sensors or different years is crucial. Nevertheless, a global slum inventory must acknowledge the diversity of slums within and between cities. Therefore, besides the mapping of slums, the identification of contextual slum typologies is an important research direction; such information will allow the combination of image-based information with socioeconomic characteristics, which may ultimately lead to a better targeting of pro-poor policy interventions. Finally, the gap in information and data access between the Global South and North needs to be bridged by making data and tools globally accessible to local actors, with appropriate attention to capacity building to ensure proper understanding and application.