WorldView-2 Data for Hierarchical Object-Based Urban Land Cover Classification in Kigali: Integrating Rule-Based Approach with Urban Density and Greenness Indices

Abstract: The emergence of high-resolution satellite data, such as WorldView-2, has opened the opportunity for urban land cover mapping at fine resolution. However, it is not straightforward to map detailed urban land cover and to detect deprived urban areas, such as informal settlements, in complex urban environments based merely on high-resolution spectral features. Thus, approaches integrating hierarchical segmentation and rule-based classification strategies can play a crucial role in producing high-quality urban land cover maps. This study aims to evaluate the potential of WorldView-2 high-resolution multispectral and panchromatic imagery for detailed urban land cover classification in Kigali, Rwanda, a complex urban area characterized by a subtropical highland climate. A multi-stage object-based classification was performed using support vector machines (SVM) and a rule-based approach to derive 12 land cover classes with the input of WorldView-2 spectral bands, spectral indices, gray level co-occurrence matrix (GLCM) texture measures and a digital terrain model (DTM). In the initial classification, confusion existed among the informal settlements and the high- and low-density built-up areas, as well as between the upland and lowland agriculture. To improve the classification accuracy, a framework based on a geometric ruleset and two newly defined indices (urban density and greenness density indices) was developed. The novel framework resulted in an overall classification accuracy of 85.36% with a kappa coefficient of 0.82. The confusion between high- and low-density built-up areas decreased significantly, while informal settlements were successfully extracted with producer's and user's accuracies of 77% and 90%, respectively. The integration of an object-based SVM classification of WorldView-2 feature sets and DTM with the geometric ruleset and the urban density and greenness indices resulted in better class separability and thus higher classification accuracies in complex urban environments.


Introduction
Cities are considered important areas for economic opportunities and an engine for a country's development [1,2]. Nevertheless, accelerated urbanization can lead not only to deterioration of the quality of life but also to environmental degradation, especially in developing countries with few coping strategies [3][4][5]. Cropland conversion, land use competition and wetlands alteration are
Previous studies investigated various methods for improving image classification accuracy, including: (i) the use of a knowledge-based and rule-based approach [19,23,24]; (ii) integrating original spectral bands with texture feature measures [10,25,26,27]; (iii) combining original image bands with transformed bands using principal component analysis (PCA) and/or intensity-hue-saturation (IHS) [28,29]; (iv) classifying generated segments using hybrid approaches [30]; and (v) the fusion of multispectral images with synthetic aperture radar (SAR) images [19,24]. Other studies proposed the application of edge detection algorithms [31,32] for accurately depicting linear features such as road networks, coupling landscape metrics and transfer learning as a framework for urban dynamics monitoring [33], or performing multi-scale hierarchical classification in high spectral dimensional feature space [34,35]. Machine learning methods have been further applied, specifically for assessing the potential of high-resolution data in producing accurate urban land cover information. For example, Novack et al. [36] compared the performance of WV-2 and QuickBird-2 imagery in detecting urban structures in the complex urban environments of São Paulo, Brazil. By comparing different supervised learning algorithms in high dimensional feature space, they found that the additional spectral bands of WorldView-2 improved the classification accuracy of spectrally similar urban objects, especially those lacking geometrical and contextual patterns.
Deep learning approaches, such as convolutional neural networks (CNNs) [37], were also found promising in separating urban objects with high spectral variability. High-resolution sensors with multiple collection sequences over the same illuminated scene are also believed to provide data that can improve the classification in complex environments, such as urban areas [38,39]. For instance, classifying WV-2 data from multiple sequential collections at a variety of observation angles over Atlanta, Georgia resulted in a 13% kappa improvement compared to a single-overflight acquisition [40].
The use of high-resolution imagery and object-based approaches is well established in various urban applications, such as slum detection [10,25,32,41,42], urban poverty analysis [43], road network and building detection [44][45][46][47][48][49][50], and mapping and monitoring urban ecosystem services [51], to name a few. For example, Pu et al. [52] tested the performance of object-based and pixel-based methods applied to high-resolution IKONOS data for a detailed urban land cover classification in Tampa Bay, Florida, USA. Their findings revealed that the object-based approach outperformed the pixel-based method. Furberg and Ban [53] assessed the spatio-temporal urban land cover change in Stockholm, Sweden between 1986 and 2006 using an object-based and rule-based strategy with multi-spectral SPOT images and successfully performed change detection by considering seven land cover classes, including high- and low-density built-up areas, mixed forest and open land, industrial areas and water. The purpose of their study was to investigate the spatio-temporal dynamics of the landscape composition and configuration in the study area using landscape metrics. Using QuickBird multispectral data, Ban et al. [19] developed an object-based and knowledge-based approach for urban land cover classification in Toronto, Canada. They found that the object-based and rule-based approach was effective for identifying 16 land cover classes, achieving an overall classification accuracy of 87.9% (kappa: 0.868). In the global south, detailed urban land cover studies were carried out based on high-resolution data and object-based approaches as well. Turlapaty et al. [50] detected buildings and estimated their height in Rio de Janeiro, Brazil using a hybrid approach, where a template matching algorithm was coupled with a support vector machine (SVM) classifier.
They demonstrated that multi-angular high-resolution images, such as WorldView-2, and height information were worthwhile for detecting urban built-up structures and for the 3D visualization of urban forms. Kuffer et al. [54] combined spectral information extracted from QuickBird and IKONOS imagery with spatial metrics to quantify the morphological differences in planned and unplanned urban areas, focusing on New Delhi, India and Dar es Salaam, Tanzania. Based on the homogeneous urban patches derived from segmented high-resolution images, spatial metrics and multi-criteria evaluation, the accuracy of their urban settlement index was confirmed at more than 70%. Kohli et al. [25] developed an object-oriented and rule-based method for slum detection using QuickBird data over Pune, India. Their method consisted of integrating expert knowledge with a hierarchical classification resulting in slum detection. Kuffer et al. [10] tested the utility of the gray level co-occurrence matrix (GLCM) variance to classify slums and planned areas using very high-resolution data over Mumbai and Ahmedabad, India and Kigali, Rwanda respectively. Kit and Lüdeke [32] illustrated the usefulness of combining the Canny and line-segments-detection algorithms for slum detection in the Indian megacity of Hyderabad. Despite the limited spatial coverage, mapping and analysing informal and planned settlements in urban environments using high-resolution data and object-based approaches has been carried out in different areas of Sub-Saharan Africa, such as Voi Township, Kenya [55], Kibera ward in Nairobi, Kenya [56], Kisumu in Western Kenya [57] and Cape Town, South Africa [58], to name a few. The satellite-based detection of deprived urban areas, such as slums and informal settlements, and of environmentally sensitive areas, such as wetlands, is still a challenging topic, especially in rapidly urbanizing hotspots of global south nations.
Remote Sens. 2019, 11, 2128
Meanwhile, the production of geospatial data and information about urban spatial patterns and growth in those nations is urgently needed for monitoring the implementation of Goal 11 of the Sustainable Development Goals (SDGs), which advocates making cities inclusive, safe, resilient and sustainable [59].
This research aims to evaluate the combination of the object-based SVM classification of high-resolution satellite imagery, the derived rules of geometric features, and the urban density and greenness density indices for improving land cover classifications in complex urban environments. The proposed framework can be a method to rapidly produce land cover maps when land cover class separability is still problematic. Its application can also speed up the production of detailed urban land cover information, which is highly needed in rapidly urbanizing cities of the global south, particularly in Sub-Saharan Africa. Despite previous research efforts, very few studies have aimed at producing high-quality urban land cover maps, including slum area extraction, in Sub-Saharan Africa. In particular, few studies based on near real-time information extraction using high-resolution satellite data exist on Kigali, Rwanda. To the best of the authors' knowledge, there are only the generalized study on slum detection by Kuffer et al. [10] and the building footprint extraction based on Pléiades multispectral bands and elevation information by [60]. Therefore, the evaluation of the use of very high-resolution images for detailed urban land cover mapping in Kigali, Rwanda is considered an added value to the existing methods for urban information production in Africa. This paper is structured as follows. Section 1 introduces the background and rationale for the research and the objectives. Section 2 presents the study area and data used in this research. The methodology is described in Section 3, followed by the presentation of the results and discussion in Sections 4 and 5, respectively. Section 6 provides the main study outlook and draws conclusions.

Study Area and Data Description
Kigali is Rwanda's capital and largest city with an estimated 730 km² metropolitan area and a population of more than 1 million [61]. The area shown in the WorldView-2 image covers the city's central and eastern parts (see Figure 1). As a fast-growing and rapidly changing city, there is an increasing demand for land in peri-urban areas for housing and emerging secondary and tertiary activities. Parcel subdivisions, expropriation and peri-urban land-use conversion are continuously taking place. The demolition of existing buildings, the construction of high-rise buildings in the inner city, the densification and renewal of road networks, and the expansion of built-up areas are among the most prominent urban developments. Therefore, tracing the ongoing urban (re)development is paramount for supporting land management and sustainable urban planning. Since the post-genocide era (the Genocide against the Tutsi took place in Rwanda in 1994 and at least one million people lost their lives), population growth in Kigali has been rapid. Between two consecutive censuses, Kigali's city population quadrupled from an estimated 200,000 in 2002 to 1.135 million in 2012 [61], and the impervious surface has been sprawling ever since.
A cloud-free high-resolution WorldView-2 (WV-2) image acquired on 17 May 2016 was used in this study. The WV-2 satellite is a high spatial resolution space-borne sensor launched in 2009 with eight multispectral bands ranging from the blue to the near-infrared parts of the electromagnetic spectrum and one panchromatic band (450-800 nm) [62]. In addition to the blue (450-510 nm), green (510-580 nm), red (630-690 nm) and near-infrared-1 (770-895 nm) bands, WV-2 has the additional coastal blue (400-450 nm), yellow (585-625 nm), red-edge (705-745 nm) and near-infrared-2 (860-1040 nm) bands.
The imagery was first orthorectified using the satellite orbital model and a digital terrain model (DTM) at 10 m resolution, then projected in the Universal Transverse Mercator-36 South Zone with World Geodetic System 1984. The DTM was produced by the Department of Lands and Mapping of Rwanda Natural Resource Authority (RNRA) using stereo-restitution based on 25 cm spatial resolution ortho-rectified aerial photos. Before its use, the DTM was reprojected from a customized local projection system, Transverse Mercator 2005, to the same projection as the WV-2 imagery. The DTM was mainly used to derive the elevation and slope information.

Conceptualizing Land Cover Classification Scheme
Based on the information needed to support sustainable land management and urban planning, twelve land cover classes were proposed in the classification scheme: high-density built-up area (HDB), low-density built-up area (LDB), informal settlements (IS), paved roads (PR), unpaved roads (UPR), urban green space (UGS), upland and lowland agriculture, forest, bare land, wetland and water, as illustrated in Table 1. The designation of the proposed land cover classes was preceded by conceptualizing the hierarchical land cover scheme that facilitated the implementation of the multi-stage land cover classification workflow. The built-up area was conceptually considered as a super-class, which can later be split into sub-classes (i.e., high- and low-density built-up areas). The informal settlements were captured in the scheme as a sub-class nested in high-density built-up areas. Likewise, agriculture was proposed as a super-class, i.e., at an aggregate level, and was subsequently subdivided into two sub-classes: upland and lowland agriculture. The remaining land cover classes, such as paved and unpaved roads, forests, urban green spaces, bare land, wetland and water, were considered at the higher level of the proposed hierarchy and consequently refined using spectral and geometric rules.

Training and Validation Data Collection
One of the key aspects of the supervised methodology was the collection of the training samples. In this study, the training samples were collected at the segment level (Level 1, see step 1 in Figure 2), as all pixels in each segment belong to one land cover class, and the proportion of the area occupied by each land cover class (see Table 2) was considered to determine the number of objects representing each class. The number of training objects for each class ranged from 12 to 181 and the total was 438 (see Table 2). Regarding the validation data, individual points were randomly selected across the study area for each class. To guarantee the independence of the validation data from the training data, the validation points that were inside the training objects were excluded. To ensure the quality of the validation samples, each of the 2925 validation points was manually labeled by cross-checking the high-resolution QuickBird images on Google Earth with the WV-2 imagery. The number of validation points for each land cover class is reported in Table 2. As the lowland and highland agriculture classes were extracted using a rule-based strategy, the training objects were selected for the combined agriculture class. However, the validation samples for the highland and lowland agriculture classes, 208 and 91 respectively, were collected separately.
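The exclusion of validation points that fall inside training objects can be sketched as a point-in-polygon filter. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the representation of training objects as plain vertex lists, and the ray-casting test are all hypothetical choices made for the example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Polygon = Sequence[Point]  # vertices in order; the last vertex is not repeated


def point_in_polygon(pt: Point, poly: Polygon) -> bool:
    """Ray-casting test. Points exactly on the boundary are not treated specially."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Does a horizontal ray starting at pt cross the edge (x1, y1)-(x2, y2)?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def independent_validation_points(
    points: Sequence[Point], training_objects: Sequence[Polygon]
) -> List[Point]:
    """Keep only the validation points that fall outside every training object."""
    return [
        p
        for p in points
        if not any(point_in_polygon(p, poly) for poly in training_objects)
    ]


# Hypothetical example: one square training object and three candidate points.
training = [[(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]]
candidates = [(5.0, 5.0), (15.0, 5.0), (-1.0, -1.0)]
kept = independent_validation_points(candidates, training)
```

In practice a GIS library would handle holes, multipart objects and spatial indexing; the sketch only shows the logic of the independence check.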

Methodology
In this research, a hierarchical object-based classification workflow involving multi-stage classifications and a rule-based strategy was implemented. It consisted of five major processing steps (see Figure 2). The first step consists of generating image objects through a segmentation process. The second step involves land cover classification using the SVM classifier. In the third step, a geometric ruleset for the objects was used for a first refinement of the land cover map. In the fourth step, a binary road network map was generated based on the refined land cover map and was used to segment the WV-2 imagery to obtain objects representing the different city blocks. These bigger objects were used for computing the newly defined indices, the urban density index (UDI) and the greenness density index (GDI), which were used to refine the final land cover map. The fifth step is a customized feature extraction workflow to delineate the informal settlements. A detailed description of all processing steps is reported in Sections 3.1-3.5.

Step 1: Image Segmentation
The multi-resolution segmentation algorithm implemented in Definiens eCognition version 9.1.2 was used to perform a first level image segmentation (Level 1). The goal of this step is to aggregate neighbouring pixels with similar spectral properties into objects/segments (e.g., buildings, roads, agricultural fields). The multi-resolution segmentation uses a region growing and merging algorithm based on similar spectral grouping [63]. Studies such as [58,63] found that multi-resolution segmentation was a suitable algorithm for generating meaningful segments adapted to the spatial pattern of land cover distribution. The level 1 segmentation was performed on the WV-2 visible (blue, green, red and yellow), near-infrared-1 (NIR1), near-infrared-2 (NIR2) and panchromatic bands combined with the DTM. All bands were given the same weight (1) except NIR1 and NIR2, which were weighted twice as heavily to maximize the distinction among different vegetated zones such as the forest, open land and UGS. The segmentation parameters were empirically tested and deemed satisfactory in producing meaningful segments. A scale parameter (SP) of 60 was empirically found suitable for generating segments corresponding to the spatial configuration of land cover spectral grouping. With regard to the composition of the homogeneity criterion, both the shape and compactness were fixed to 0.5. Figure 3 shows the patterns and shapes of the generated level 1 segments. It was observed that the segments for HDB and informal settlements were characterized by irregular shapes. The same irregularity was also observed in land cover classes occupying large and continuous spaces such as the forest, agricultural land and UGS.
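To make the role of the reported parameters concrete, the sketch below stores the Level 1 settings and combines colour and shape heterogeneity in the way multi-resolution segmentation is commonly described. It is an illustration only: eCognition is proprietary and its internals may differ, the class and function names are hypothetical, and the comparison of the merge cost with the squared scale parameter is an assumption taken from the commonly cited formulation rather than from this paper.

```python
from dataclasses import dataclass, field
from typing import Dict


def _level1_weights() -> Dict[str, float]:
    # Layer weights reported in the text: 1 everywhere, 2 for NIR1 and NIR2.
    return {"blue": 1.0, "green": 1.0, "red": 1.0, "yellow": 1.0,
            "NIR1": 2.0, "NIR2": 2.0, "pan": 1.0, "DTM": 1.0}


@dataclass(frozen=True)
class SegmentationSettings:
    scale: float = 60.0        # scale parameter (SP) used for Level 1
    shape: float = 0.5         # weight of shape versus colour
    compactness: float = 0.5   # weight of compactness versus smoothness
    band_weights: Dict[str, float] = field(default_factory=_level1_weights)


def normalized_weights(s: SegmentationSettings) -> Dict[str, float]:
    """Share of each layer in the weighted colour heterogeneity."""
    total = sum(s.band_weights.values())
    return {band: w / total for band, w in s.band_weights.items()}


def fusion_cost(s: SegmentationSettings, h_color: float,
                h_compact: float, h_smooth: float) -> float:
    """Combine heterogeneity changes for a candidate merge (assumed formulation)."""
    h_shape = s.compactness * h_compact + (1.0 - s.compactness) * h_smooth
    return (1.0 - s.shape) * h_color + s.shape * h_shape


def merge_allowed(s: SegmentationSettings, cost: float) -> bool:
    # Assumption: a merge is accepted while the cost stays below scale squared.
    return cost < s.scale ** 2


level1 = SegmentationSettings()
shares = normalized_weights(level1)          # NIR1 and NIR2 each get 2/10
cost = fusion_cost(level1, h_color=100.0, h_compact=40.0, h_smooth=20.0)
```

With shape and compactness both at 0.5, colour and shape contribute equally to the cost, and compactness and smoothness contribute equally to the shape term.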

Step 2: SVM Object-Based Image Classification
In this step, the segments were classified into ten land cover classes using an SVM based on different spectral and geometric features. The features were selected to increase the class separability. The resulting input features for the SVM classification included the multispectral and panchromatic bands of WV-2 combined with geometric and texture features. The mean value and standard deviation of each multispectral band were used as inputs to the SVM classifier. The SVM radial basis function (RBF) kernel was used. The regularization parameter (or C parameter) and the gamma of the RBF kernel were left at their default values (2 and 0, respectively). These SVM settings were chosen as past studies showed that an SVM with the RBF kernel yields good class separation and minimizes misclassification for land cover classifications in complex environments [64,65]. Moreover, various geometric features were included in the feature space, namely each object's area, width, length, asymmetry and compactness, while the rectangular fit was selected as the shape feature to improve class separability. The gray-level co-occurrence matrix (GLCM) texture features chosen were the mean, entropy and standard deviation. The co-occurrence joint probabilities for the GLCM feature values were calculated in all directions (0°, 45°, 90° and 135°). The GLCM texture measures were computed for each image object. Table 3 reports the selected features for the SVM classification.
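The three GLCM measures named above can be illustrated with a small pure-Python sketch. It assumes a symmetric co-occurrence matrix, a pixel distance of 1 and grey levels that are already quantized to small integers; these choices, like the function names, are assumptions for the example and may differ from how eCognition computes the features per image object.

```python
import math
from collections import Counter
from typing import Dict, Sequence, Tuple

# (row offset, column offset) for each direction at a distance of one pixel.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm_probabilities(
    patch: Sequence[Sequence[int]], angles: Sequence[int] = (0, 45, 90, 135)
) -> Dict[Tuple[int, int], float]:
    """Symmetric, normalized co-occurrence probabilities pooled over all angles."""
    counts: Counter = Counter()
    rows, cols = len(patch), len(patch[0])
    for angle in angles:
        dr, dc = OFFSETS[angle]
        for r in range(rows):
            for c in range(cols):
                r2, c2 = r + dr, c + dc
                if 0 <= r2 < rows and 0 <= c2 < cols:
                    i, j = patch[r][c], patch[r2][c2]
                    counts[(i, j)] += 1
                    counts[(j, i)] += 1  # count both orders to make it symmetric
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def glcm_features(p: Dict[Tuple[int, int], float]) -> Dict[str, float]:
    """GLCM mean, standard deviation and entropy from joint probabilities."""
    mean = sum(i * pij for (i, _), pij in p.items())
    variance = sum((i - mean) ** 2 * pij for (i, _), pij in p.items())
    entropy = -sum(pij * math.log(pij) for pij in p.values() if pij > 0)
    return {"mean": mean, "std": math.sqrt(variance), "entropy": entropy}


# Hypothetical 2 x 2 patches: a checkerboard and a perfectly flat surface.
checker = glcm_features(glcm_probabilities([[0, 1], [1, 0]]))
flat = glcm_features(glcm_probabilities([[3, 3], [3, 3]]))
```

A flat patch yields zero standard deviation and zero entropy, whereas heterogeneous patches give higher values, which is the property that texture-based separation relies on.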

Step 3: Hierarchical Classification Refinement with Ruleset
Several tests highlighted that the object-based SVM classification was not accurate enough to produce high quality land cover maps. For instance, HDB and LDB were confused with each other, while informal settlements were confused with HDB. The upland and lowland agriculture classes were difficult to separate due to their spectral similarity. Therefore, an extended framework was proposed to improve the classification (see Figure 4). The framework involves a number of processing steps for the classification refinement including the development and application of an object-based ruleset and the generation of separate layers used for filtering the results.
The rules were constructed based on the attributes of the objects and features in Definiens eCognition version 9.1.2. The selected features included the statistics of the WV-2 bands and each object's geometry and shape, such as the asymmetry, rectangular fit and compactness. The lowland agriculture class was delineated based on the slope and elevation criteria of the DTM (i.e., slope ≤ 5% and elevation ≤ 1500 m). The proposed criteria were empirically tested and judged suitable for delineating valleys that are candidate areas for lowland agriculture. Using the above criteria, the agriculture class was split into two new classes: lowland agriculture and highland agriculture. Similarly, rules were developed to reduce confusion among several other classes. Table 4 summarizes the proposed ruleset that was applied to refine the land cover map produced by the SVM classification (Step 2).
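The slope and elevation rule for splitting the agriculture class can be written as a short function. The thresholds are the ones stated in the text; the function name, the label strings and the use of per-object mean slope and mean elevation are assumptions for illustration, since the text does not say which object statistic was tested.

```python
def split_agriculture(
    label: str,
    mean_slope_pct: float,
    mean_elevation_m: float,
    max_slope_pct: float = 5.0,
    max_elevation_m: float = 1500.0,
) -> str:
    """Split the combined agriculture class using the DTM-derived criteria."""
    if label != "agriculture":
        return label  # the rule only applies to the combined agriculture class
    if mean_slope_pct <= max_slope_pct and mean_elevation_m <= max_elevation_m:
        return "lowland agriculture"
    return "highland agriculture"


# Hypothetical objects: (SVM label, mean slope in %, mean elevation in m).
objects = [
    ("agriculture", 3.0, 1420.0),
    ("agriculture", 12.0, 1480.0),
    ("agriculture", 4.0, 1650.0),
    ("forest", 2.0, 1400.0),
]
refined = [split_agriculture(*obj) for obj in objects]
```

Both conditions must hold for an object to be labelled lowland agriculture, so flat fields on plateaus above 1500 m remain in the highland class.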

Step 4: Urban Density and Greenness Indices Computation
The goal of this processing step was to use the refined land cover map produced in Step 3 to generate both the urban density index (UDI) and greenness density index (GDI) layers to further reduce the confusion between some land cover classes. The computation of these indices was preceded by extracting the road networks (considering both paved and unpaved roads). A raster layer composed of the extracted road networks was created for generating level 2 segments by applying the multi-resolution segmentation algorithm, as illustrated in Figure 4. The Level 2 segmentation was performed on the road network binary image extracted from the refined SVM classification (Step 3). The scale parameter was increased from 60, used in the Level 1 segmentation in Step 1, to 100, whilst the shape and compactness were kept at 0.5. The purpose of this Level 2 segmentation was to generate segments representing building blocks, which were used for computing both UDI and GDI.
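The idea of deriving blocks from a binary road mask can be illustrated with connected-component labelling. Note that this is a deliberately simpler stand-in for the multi-resolution segmentation used in the study, not a reproduction of it; the function name, the 4-neighbour connectivity and the toy mask are assumptions.

```python
from collections import deque
from typing import List, Sequence


def label_blocks(road_mask: Sequence[Sequence[int]]) -> List[List[int]]:
    """Label connected non-road regions (4-connectivity); road pixels stay 0."""
    rows, cols = len(road_mask), len(road_mask[0])
    labels = [[0] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if road_mask[r][c] or labels[r][c]:
                continue  # road pixel, or already assigned to a block
            next_id += 1
            labels[r][c] = next_id
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of the current block
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not road_mask[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = next_id
                        queue.append((ny, nx))
    return labels


# Hypothetical mask: 1 marks road pixels, 0 marks everything else.
mask = [
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
blocks = label_blocks(mask)
```

Each positive label corresponds to one area enclosed by roads, which is the kind of unit over which per-block indices can then be aggregated.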
The overview of the UDI computation is presented in Figure 5: (A) illustrates the SVM land cover classification; in (B), the road network binary mask was computed after extracting the road network; (C) portrays the segments generated from the road network and the masked non-road classes. Finally, the urban density index was computed based on the land cover proportional occurrence, as represented in (D). The road networks were considered appropriate for delineating urban blocks, such as built-up areas and paved surfaces. With values ranging from 0 (absence of urban structures) to 1 (100% urban structures), the UDI raster layer was produced to represent the amount of built-up structures in each segment. Figure 6 shows that the areas with high UDI values (i.e., close to 1) represent objects fully occupied by impervious surfaces (mainly road network and/or buildings). As the segments are progressively less occupied by built-up area and/or roads, the UDI values gradually decrease towards zero for areas fully occupied by other land cover classes. A threshold value of 0.6 was considered suitable for separating HDB and LDB, i.e., the segments with an index value above 0.6 were assigned to HDB and those below 0.6 were assigned to LDB.
To compute GDI, the same level 2 segments were used to determine the mean normalized difference vegetation index (NDVI) value for each segment. The GDI values range from 0, i.e., the absence of vegetation, to 0.7 for vegetation classes, such as the forest, UGS, agriculture and wetland. Figure 6 clearly shows the decrease of vegetation from rural areas to the urban core. The GDI values were also used to further improve the separation between HDB and LDB.
Figure 6. Urban density index (UDI) and greenness density index (GDI) maps. Both indices range from 0 to 1. The built-up area is characterized by low GDI and high UDI. Conversely, green structures are characterized by high GDI and low UDI.
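A compact way to read the two indices is as per-block aggregates: UDI as the share of urban pixels in a block and GDI as the block's mean NDVI. The sketch below follows that reading. Which classes count as urban (here built-up and road classes), the flat per-pixel input lists and the function names are assumptions for illustration and are not specified at this level of detail in the text.

```python
from collections import defaultdict
from typing import Dict, Sequence

# Assumption: built-up and road classes count as urban structures.
URBAN_CLASSES = frozenset({"HDB", "LDB", "PR", "UPR"})


def urban_density_index(block_ids: Sequence[int], labels: Sequence[str],
                        urban_classes=URBAN_CLASSES) -> Dict[int, float]:
    """Proportion of pixels per block that belong to an urban class (0 to 1)."""
    total: Dict[int, int] = defaultdict(int)
    urban: Dict[int, int] = defaultdict(int)
    for block, label in zip(block_ids, labels):
        total[block] += 1
        if label in urban_classes:
            urban[block] += 1
    return {block: urban[block] / total[block] for block in total}


def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index for one pixel."""
    denom = nir + red
    return (nir - red) / denom if denom else 0.0


def greenness_density_index(block_ids: Sequence[int], nir: Sequence[float],
                            red: Sequence[float]) -> Dict[int, float]:
    """Mean NDVI of the pixels in each block."""
    sums: Dict[int, float] = defaultdict(float)
    counts: Dict[int, int] = defaultdict(int)
    for block, n, r in zip(block_ids, nir, red):
        sums[block] += ndvi(n, r)
        counts[block] += 1
    return {block: sums[block] / counts[block] for block in sums}


# Hypothetical pixels: block id, class label, NIR and red values.
ids = [1, 1, 1, 1, 2, 2]
classes = ["HDB", "PR", "UGS", "HDB", "forest", "LDB"]
udi = urban_density_index(ids, classes)
gdi = greenness_density_index(ids, [1.0, 1.0, 1.0, 1.0, 3.0, 3.0],
                              [1.0, 1.0, 1.0, 1.0, 1.0, 1.0])
```

Here block 1 is three-quarters urban with no vegetation signal, while block 2 is half urban with a mean NDVI of 0.5, matching the inverse relationship between the two indices described for Figure 6.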
The threshold rules to refine the HDB and LDB classification using both UDI and GDI were implemented as follows. Three layers, L1, L2 and L3, were considered to represent the SVM and rule-based refined classification, GDI and UDI, respectively. The output layer, named L4, was then derived based on a combination of values in the input layers.
Rule 1: from HDB to LDB: If GDI ≥ 0.4 and UDI ≤ 0.6 then HDB → LDB
Rule 2: from LDB to HDB: If GDI < 0.1 and UDI > 0.85 then LDB → HDB
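The two rules translate directly into a small function applied element by element to the three input layers. The thresholds are those given in the rules; representing the layers as flat Python lists and the names used here are assumptions for illustration.

```python
from typing import List, Sequence


def refine_density_class(label: str, gdi: float, udi: float) -> str:
    """Apply Rule 1 and Rule 2 to one element of the classification layer."""
    if label == "HDB" and gdi >= 0.4 and udi <= 0.6:
        return "LDB"  # Rule 1: green, sparsely built block
    if label == "LDB" and gdi < 0.1 and udi > 0.85:
        return "HDB"  # Rule 2: almost no vegetation, densely built block
    return label      # all other classes and cases are left unchanged


def derive_output_layer(l1: Sequence[str], l2: Sequence[float],
                        l3: Sequence[float]) -> List[str]:
    """Combine L1 (classification), L2 (GDI) and L3 (UDI) into L4."""
    return [refine_density_class(c, g, u) for c, g, u in zip(l1, l2, l3)]


# Hypothetical layer values for five elements.
l1 = ["HDB", "LDB", "HDB", "LDB", "forest"]
l2 = [0.45, 0.05, 0.20, 0.30, 0.65]   # GDI
l3 = [0.50, 0.90, 0.80, 0.40, 0.05]   # UDI
l4 = derive_output_layer(l1, l2, l3)
```

Because the two rules have non-overlapping conditions, the order in which they are checked does not affect the result.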

Figure 7 (caption fragment): The HDB class is seen as the super-class in which IS is nested. In zoomed map A, IS is highlighted in the WV-2 imagery, showing congested, small built-up areas and the absence of a road network. Zoomed map B illustrates HDB in the WV-2 imagery, with a morphology and spatial pattern characterized either by large housing structures or by densely built-up areas intersected by a road network.
GLCM texture measures have been identified as promising inputs for extracting congested urban structures, such as informal settlements and slums, from very high-resolution imagery [10,25,31]. In this study, the example-based feature extraction workflow implemented in ENVI 5.3 was used to extract informal settlements with an object-based SVM classification. The input features included all WV-2 bands and their GLCM texture features. The GLCM mean, variance, homogeneity, contrast and entropy were derived for each of the nine WV-2 bands. First, the masked WV-2 image was segmented using the following settings: edge detection was kept at the default scale level (0), whereas a scale value of 80 was selected empirically. The full lambda schedule algorithm was used for merging segments, as it achieves the best results by integrating neighbouring areas based on a combination of spectral and spatial information [68]. To train the SVM classifier, training samples for HDB and informal settlements were randomly selected. Ten objects were selected for training informal settlements, whereas HDB was represented by 25 objects.
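To make the five texture measures concrete, the sketch below computes a co-occurrence matrix and the GLCM mean, variance, homogeneity, contrast and entropy for a small quantised image patch. It is a plain NumPy illustration using common Haralick-style definitions; the exact formulas, window size and quantisation used by ENVI may differ, and the function names are assumptions.

```python
import numpy as np

def glcm(gray, levels, dx=1, dy=0):
    """Symmetric, normalised co-occurrence matrix for one offset (dx >= 0, dy >= 0).

    `gray` must be a 2-D integer array already quantised to 0 .. levels-1.
    """
    gray = np.asarray(gray)
    rows, cols = gray.shape
    ref = gray[:rows - dy, :cols - dx].ravel()   # reference pixels
    nbr = gray[dy:, dx:].ravel()                 # neighbours at the given offset
    counts = np.zeros((levels, levels), dtype=float)
    np.add.at(counts, (ref, nbr), 1.0)           # accumulate pair counts
    counts += counts.T                           # make the matrix symmetric
    return counts / counts.sum()

def glcm_features(p):
    """Return the five texture measures for a normalised GLCM `p`."""
    i, j = np.indices(p.shape)
    mean = (i * p).sum()
    nonzero = p[p > 0]
    return {
        "mean": mean,
        "variance": (((i - mean) ** 2) * p).sum(),
        "homogeneity": (p / (1.0 + (i - j) ** 2)).sum(),
        "contrast": (((i - j) ** 2) * p).sum(),
        "entropy": -(nonzero * np.log(nonzero)).sum(),
    }

# Toy patches: a fine-grained alternating pattern and a uniform surface.
checker = np.array([[0, 1], [1, 0]])
flat = np.full((3, 3), 2)
checker_features = glcm_features(glcm(checker, levels=2))
flat_features = glcm_features(glcm(flat, levels=4))
```

The alternating patch yields higher contrast and entropy and lower homogeneity than the uniform one, which is the kind of difference that separates congested informal settlements from regular HDB. In an object-based workflow such values would be aggregated per segment before training the classifier.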

Results
The integrated object-based and rule-based approach resulted in 12 land cover classes with an overall accuracy of 85.36% and a kappa coefficient of 0.8228. With a thematic layer representing valleys derived from the DTM, the lowland and upland agriculture classes were separated. As a result, the number of land cover classes increased from ten to eleven. The geometric rules and the UDI and GDI indices helped to refine the classification of several classes. For example, the confusion between HDB and LDB was reduced and the producer's accuracies reached 72.2% and 81.2%, respectively. The informal settlements were successfully depicted, with producer's and user's accuracies of 77% and 90.2%, respectively. This increased the number of land cover classes to twelve. Figure 8 presents the final classification map with 12 land cover classes.

The results in Table 5 show that all classes achieved a producer's accuracy of over 80% except HDB, informal settlements and bare land. Confusion still exists among HDB, LDB and informal settlements: 18.8% of the HDB validation points were classified as LDB, while 13.9% of the informal settlements validation points were classified as HDB. The small patches of UGS in the urban core and around the airport were misclassified as agricultural land, while scattered patches mapped as UGS in rural areas were in fact agriculture.
Seven out of 12 land cover classes achieved a user's accuracy of over 90%, while the user's accuracies were rather low for three classes: unpaved road (52.8%), LDB (55.1%) and UGS (60.7%). The high commission error from HDB to LDB arises because the boundary between HDB and LDB is difficult to draw in some circumstances. The application of filtering rules between unpaved road and bare land is sometimes limited, given that the two classes are spectrally highly correlated. Overall, the final land cover classification map matched reality quite well. Figure 9 illustrates selected excerpts of the classification results. Row (A) shows the input WV-2 image, whereas row (B) shows the final classification results.
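The accuracy measures reported here (overall accuracy, kappa coefficient, producer's and user's accuracies) all follow from a confusion matrix. The sketch below shows the standard formulas on a hypothetical two-class matrix; the numbers are invented for illustration and are not taken from Table 5, and the row/column convention (rows = reference, columns = map) is an assumption.

```python
import numpy as np

def accuracy_report(cm):
    """Compute accuracy measures from a confusion matrix.

    cm[i, j] = number of validation points of reference class i mapped as class j.
    Returns (overall accuracy, kappa, producer's accuracies, user's accuracies).
    """
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    diag = np.diag(cm)
    row_totals = cm.sum(axis=1)      # reference totals per class
    col_totals = cm.sum(axis=0)      # mapped totals per class
    overall = diag.sum() / n
    producers = diag / row_totals    # 1 - omission error
    users = diag / col_totals        # 1 - commission error
    expected = (row_totals * col_totals).sum() / n ** 2   # chance agreement
    kappa = (overall - expected) / (1.0 - expected)
    return overall, kappa, producers, users

# Hypothetical two-class example (not data from this study).
toy_cm = [[40, 10],
          [5, 45]]
overall, kappa, producers, users = accuracy_report(toy_cm)
```

For the toy matrix the overall accuracy is 0.85 and kappa is 0.70; a class can combine a high producer's accuracy with a low user's accuracy when many points of other classes are mapped into it, which is the pattern reported for LDB.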

Discussion
In this research, a high-resolution urban land cover map with 12 classes was produced using a multi-level, customized classification strategy based on WV-2 data. As inferred from previous studies, e.g., [69,70], and from our own experience with the proposed processing chain, extracting a detailed urban land cover map from high-resolution data such as WV-2 in complex urban environments using conventional object-based classification is prone to inaccuracies, owing to spectral variability within the same land cover class and spectral similarities among several land cover classes.
Despite its importance, the use of object-based and rule-based strategies for classification refinement is not sufficient to detect informal settlements. Figure 10 shows the spectral signatures of the 12 land cover classes, derived from the 2925 validation points, across the eight multispectral bands of the WV-2 imagery. Some land cover classes overlapped spectrally in all bands. This is the case for the lowland and upland agriculture. Overlap was also observed between bare land and unpaved roads. Similarly, the built-up classes (HDB, LDB and IS) were spectrally confused with one another in all bands, with only a slight separation in the near-infrared bands. Therefore, merely training an advanced classifier, such as SVM, is not enough to delineate land cover classes with good accuracy in complex urban environments.
The integrated approach combining a ruleset, density indices and texture features made the extraction of informal settlements possible. Spectrally, it is challenging to separate informal settlements from HDB. The GLCM texture features were helpful in detecting IS, with producer's and user's accuracies of 77% and 90%, respectively. The creation of an HDB mask helped to speed up the computation when applying the example-based feature extraction using SVM. Nevertheless, small HDB patches were also found interspersed among the informal settlement objects, owing to the presence of houses surrounded by small, isolated vegetation patches.
Effective use of high-resolution data for mapping an urban landscape at fine resolution needs to rely on integrated methods that take into account the spectral information content, the geometric properties of the urban structure, the rule-based approach and the spatial variability of urban and greenness density. The application of the above-mentioned method can lead to a highly accurate urban land cover map, but its implementation needs to follow a step-by-step methodology, such as the hierarchical classification workflow illustrated in this study. The proposed approach involves a number of multistage and rule-based classification strategies. At the high level of the hierarchy, the one-pass object-based SVM classification considers super-classes, which are disaggregated into sub-classes and refined at a lower level to capture the within-class spectral and spatial diversity (see Figure 4).
For instance, the built-up areas were first considered as a super-class that was divided into three sub-classes (i.e., HDB, LDB and IS) using a rule-based approach, the urban and greenness density indices, and the GLCM texture measures. Delineating the lowland agriculture that stretches along the valleys of the study area was judged important because its management plan differs from that of the neighbouring upland agriculture. Indeed, when planning the use of land located in valleys, its sensitivity to degradation and its ecological functions need to be taken into account. Therefore, it was considered worthwhile to split the agriculture class into two sub-classes, i.e., lowland and upland agriculture. Spectrally, the lowland and upland agriculture classes are identical. Their separation became possible after the lowland was extracted from topographic and slope data, combined with post-processing operations and a thematic object feature overlap. Previous studies combined high-resolution multispectral data with digital surface models (DSM) to distinguish upland from lowland cover types, such as mangrove forests or wetlands, e.g., [71,72]. In the present study, the valleys thematic layer was found useful in separating the lowland agriculture class from the neighbouring upland agricultural zones.
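The thematic overlap used to split the agriculture class can be sketched as follows. This is only an illustration of the idea: the class codes, the 50% overlap threshold and the function name are assumptions, and the valley mask is taken as already derived from the DTM.

```python
import numpy as np

# Hypothetical class codes.
AGRI, LOWLAND_AGRI, UPLAND_AGRI = 3, 31, 32

def split_agriculture(class_map, segments, valley_mask, min_overlap=0.5):
    """Relabel agriculture segments as lowland or upland by their overlap with valleys.

    `valley_mask` is a boolean raster (True inside valleys); `min_overlap` is an
    assumed fraction of a segment that must lie in a valley to count as lowland.
    """
    out = class_map.copy()
    for seg_id in np.unique(segments[class_map == AGRI]):
        in_seg = (segments == seg_id) & (class_map == AGRI)
        overlap = valley_mask[in_seg].mean()   # fraction of the segment inside valleys
        out[in_seg] = LOWLAND_AGRI if overlap >= min_overlap else UPLAND_AGRI
    return out

# Toy example: segment 1 lies mostly in a valley, segment 2 does not.
class_map = np.array([[3, 3, 3, 3], [3, 3, 1, 1]])
segments = np.array([[1, 1, 2, 2], [1, 1, 2, 2]])
valleys = np.array([[True, True, False, False], [True, False, False, False]])
refined = split_agriculture(class_map, segments, valleys)
```

Because the decision depends only on terrain position, two spectrally identical agriculture segments can end up in different sub-classes, while non-agriculture pixels are left untouched.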
The study showed that the synergy between a robust classifier, such as SVM, and the integration of a geometric ruleset and the proposed density indices (UDI and GDI) is a reliable way to improve urban land cover classification in complex urban environments. The findings of the present study concur with previous studies, e.g., [25,73–75], in which the rule-based approach using geometric features, texture measurements and spectral band thresholds was found useful for enhancing land cover classification. The features to include in the feature space comprise the bands' mean and standard deviation and, in particular, the geometric features related to the object's extent and shape, such as compactness, asymmetry, rectangular fit, area, width and length. The GLCM texture features proposed by [76] contribute to the production of optimal segments and to class separability when training an advanced classifier [75]. The improved classification results align with previous studies in which several authors, e.g., [10,22,69], emphasized the importance of texture and geometric features in improving land cover classification in complex environments. In particular, the shape and extent features of the objects (especially area, length, width and rectangular fit) were found valuable for refining misclassified buildings in planned areas with LDB. Length and width were found suitable for delineating unpaved road and bare land, which usually have similar spectral properties.
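The role of the extent and shape features can be illustrated with a simplified geometric rule. The sketch below derives area, length, width and a rectangular fit from the axis-aligned bounding box of an object mask, then uses elongation to separate a road-like object from a compact one. These are simplified stand-ins for the object features of dedicated software, and the elongation threshold of 4 is an assumption, not a value from the study.

```python
import numpy as np

def bbox_geometry(mask, pixel_size=1.0):
    """Simplified geometric features of one object from its axis-aligned bounding box."""
    rows, cols = np.nonzero(mask)
    height = (rows.max() - rows.min() + 1) * pixel_size
    breadth = (cols.max() - cols.min() + 1) * pixel_size
    area = rows.size * pixel_size ** 2
    length, width = max(height, breadth), min(height, breadth)
    return {
        "area": area,
        "length": length,
        "width": width,
        # Share of the bounding box covered by the object (1.0 for a full rectangle).
        "rectangular_fit": area / (length * width),
    }

def road_or_bare_land(mask, min_elongation=4.0):
    """Label an object as unpaved road if it is elongated, otherwise as bare land."""
    geometry = bbox_geometry(mask)
    elongated = geometry["length"] / geometry["width"] >= min_elongation
    return "unpaved road" if elongated else "bare land"

# Toy objects: a 2 x 10 strip and a 4 x 4 blob with one corner pixel missing.
strip = np.ones((2, 10), dtype=bool)
blob = np.ones((4, 4), dtype=bool)
blob[0, 0] = False
labels = [road_or_bare_land(strip), road_or_bare_land(blob)]
```

A bounding-box measure underestimates the elongation of diagonal or curved roads, so an operational ruleset would rely on orientation-independent length and width estimates.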
Furthermore, the multistage object-based classification was found to be a worthwhile framework for extracting informal settlements from high-resolution imagery. Informal settlements normally develop in high-density built-up areas on sites unsuitable for construction [66,67]. The proposed urban density and greenness indices helped to define the spatial patterns of urban morphology, such as slums and informal settlements. The results demonstrate that the proposed method can enhance both the land cover classification accuracy and the computational performance when extracting informal settlements, by treating HDB as a super-class in which informal settlements are nested as a sub-class. The relatively high producer's and user's accuracies (77% and 90.2%, respectively) for informal settlement detection are promising, and our results are in accordance with recent findings in slum detection research based on high-resolution multispectral data by [25]. In that study, slums were detected with 60% agreement after training only three land cover classes, namely slums, non-slums and others. The methodology for detecting slums and informal settlements can be enhanced by adding further features to the spectral information in the feature space, such as band statistics, image texture features and geometric object features. Meanwhile, the conceptualization of informal settlements can benefit from the information derived from high-resolution imagery using the integrated method, but a full ontology of informal settlements goes beyond image classification. Indeed, image classification and visual interpretation consider physical entities, whilst the definition of an informal settlement embraces other aspects beyond physical morphology, such as the legal aspects of land tenure, deprived living conditions and sub-standard housing [77,78].
As rule-based classifications remain consistent to a certain degree when applied to other areas [75,79], the proposed classification framework should be transferable to, and testable in, different study areas. The practical challenges in applying the proposed framework are mainly related to finding the optimum threshold values and customizing the methods for extracting particular land cover classes, which are not standardized across different test areas. Previous studies illustrated that the accuracy of land cover classification is highly dependent on the type of landscape composition [80] and on the characteristics of the training and testing sets [81,82]. Therefore, the cut-off values used for thresholding in the present study should be adapted, not only to the local context, but also to the variables taken into account in the classification processing chain.

Conclusions and Further Research
In the present study, high-resolution WorldView-2 data were evaluated for detailed urban land cover mapping in Kigali, an urbanization hotspot in Sub-Saharan Africa, using hierarchical object-based and rule-based classification strategies. The aim was to achieve an accurate land cover map that includes challenging classes such as informal settlements, roads, low-density built-up areas and lowland agriculture. The results showed that an object-based SVM classification coupled with an integrated rule-based approach and two newly defined indices (urban and greenness density) yielded a very good overall classification accuracy (85.36%, kappa coefficient: 0.8228). However, confusion persisted between several classes, such as the high- and low-density built-up areas, as well as between unpaved road and bare land, owing to their spectral similarities. The proposed framework, which integrates spectral statistics, geometric feature rulesets, and urban and greenness density indices, was found valuable for classification refinement. The informal settlements were successfully detected with high producer's and user's accuracies (77% and 90.2%, respectively) using the proposed method, which enhanced both the detection accuracy and the computational performance. The most challenging task in urban land cover classification based on high-resolution multispectral data was the delineation of the built-up classes. Indeed, the three built-up classes are often confused with one another because of their spectral similarity. However, the developed ruleset and the UDI and GDI indices, together with texture measures, proved effective in separating HDB, LDB and the informal settlements.
An important finding of this study is that an improved, detailed urban land cover classification based on high-resolution satellite data can be achieved by combining a set of features derived from the visible, near-infrared and panchromatic bands with a geometric ruleset and the urban and greenness density indices. Further research is planned to test the developed methodology in several cities in the Global South for land cover classification and informal settlements mapping.
Author Contributions: T.M. conducted the experiment, analysed the data and co-wrote the paper. A.N. conceived and designed the experiment on urban density and greenness indices computation and co-wrote the paper. Y.B. conceived the study on improving urban land cover classification using high-resolution data with an object-based approach, contributed to the pre-classification segmentation process, analysis of the results, and co-wrote the paper.