Detailed Urban Land Use Land Cover Classification at the Metropolitan Scale Using a Three-Layer Classification Scheme

Urban Land Use/Land Cover (LULC) information is essential for urban and environmental management. It is, however, very difficult to automatically extract detailed urban LULC information from remote sensing imagery, especially for a large urban area. Medium resolution imagery, such as Landsat Thematic Mapper (TM) data, cannot uncover detailed LULC information. Further, very high resolution (VHR) satellite imagery, such as IKONOS and QuickBird data, can only be applied to a small area, largely due to limited data availability and high computation cost. As a result, little research has been conducted to extract detailed urban LULC information for a large urban area. This study, therefore, developed a three-layer classification scheme for deriving detailed urban LULC information by integrating newly launched Chinese GF-1 (medium resolution) and GF-2 (very high resolution) satellite imagery and jointly incorporating geometry, texture, and spectral information through multi-resolution image segmentation and object-based image classification (OBIA). Homogeneous urban LULC types such as water bodies or large areas of vegetation could be derived from GF-1 imagery with 16 m and 8 m spatial resolutions, while heterogeneous urban LULC types such as industrial buildings, residential buildings, and roads could be extracted from GF-2 imagery with 3.2 m and 0.8 m spatial resolutions. The multi-resolution segmentation method and a random forest algorithm were employed to perform image segmentation and object-based image classification, respectively. An analysis of the results suggests that overall accuracies of 0.89 and 0.87 were achieved for the second and third level urban LULC classification maps, respectively. Therefore, the three-layer classification scheme has the potential to derive high accuracy urban LULC information by integrating medium and high-resolution remote sensing imagery.


Introduction
Land Use/Land Cover (LULC) is defined as the physical composition and characteristics (e.g., grass, forest, and impervious surfaces) or human-related activities (e.g., residential, commercial, and [...]
The remainder of this paper is organized as follows: Section 2 introduces the methodology including study area, data sources, data processing, classification scheme, and methods for accuracy assessment; Section 3 details the results; Section 4 provides discussion; and Section 5 concludes the paper.

Study Area
Changchun, the capital city of Jilin province, is located in Northeast China, covering a region from 124°18′ East to 127°05′ East in longitude and from 43°05′ North to 43°15′ North in latitude. It belongs to a temperate continental monsoon climate zone with an average temperature of 4.8 °C and an annual precipitation of 522 mm to 615 mm. Changchun consists of seven districts and three counties with a total administrative area of 20,604 km² and a population of 7.793 million, including 4.509 million registered citizens in Changchun city. The area enclosed by the expressway surrounding the urban region, with an area of 523.16 km², was selected as the study site in this research (Figure 1).

Figure 1.
The study area (standard false color composition from GF-1 satellite imagery).

Satellite Data
The Chinese GF-1 is the first satellite of the China High-resolution Earth Observation System. The GF-1 satellite was launched in April 2013 with two panchromatic/multi-spectral (P/MS) and four wide field view (WFV) cameras. GF-1 P/MS data have a spatial resolution of 2 m/8 m and a swath width of 60 km, while WFV data have a spatial resolution of 16 m and a swath width of 800 km with four spectral channels. These data are highly valuable sources for estimating fractional vegetation cover and building density and for monitoring suspended particulate matter over large areas [25][26][27]. The Chinese GF-2 was launched after one and a half years of successful operation of the GF-1. The GF-2 is the first satellite with a spatial resolution finer than one meter, which marks the entry of China's civil satellite enterprise into an era of sub-meter spatial resolution. The GF-2 is equipped with two fine-resolution cameras (0.8 m panchromatic and 3.2 m multi-spectral) and has a swath width of 45.7 km. It features fine spatial resolution, high positioning accuracy, and fast maneuverability [28]. The detailed sensor characteristics for GF-1 and GF-2 are presented in Table 1. GF-1 imagery acquired on June 22, 2015 and GF-2 imagery acquired on May 25, 2015 were collected to conduct this study. Because of the large extent of the study area, six swaths of imagery were employed, and their characteristics are listed in Table 2.

In Situ Data Collection
Based on the visual analysis and interpretation of GF-2 and GF-1 false color images (with bands four, three, and two as RGB for display) and prior knowledge of this region, we identified and labelled 21 urban LULC types (Table 3), including two types of water bodies, two types of vegetation, one type of farmland, two types of bare lands, two types of roads and squares, five types of industrial buildings, and seven types of residential buildings, as well as shadow. The choice of these 21 LULC types is based on the practices of the Ministry of Housing and Urban-Rural Development of China, as well as the capability of visual interpretation from the 0.8 m resolution GF-2 imagery. Shadow is not an urban LULC type, but it is listed here as it cannot be grouped into other types. More than 100 points for each LULC type, 2,732 points altogether, were manually collected from the image (Figure 2a). These samples were selected to ensure the coverage of all available urban land use/land cover types. While most of the samples were successfully identified using GF-1 and GF-2 images, a few of them remained unidentified. To address this issue, a field investigation was performed in Changchun city in June 2017 (Figure 2b), in the same season as that of the employed satellite imagery. A fieldwork route with 321 points was planned on the image and imported into two GPS receivers with a positioning accuracy of 0.2-0.5 m. The current LULC information at each site was checked and photos were taken at the same time. Based on the field work, the wrongly interpreted LULCs were corrected in the laboratory and used as training points. The field work also helped to interpret the testing points for accuracy assessment.
Sensors 2019, 19, 3120
Table 3. Three-level classification scheme for urban Land Use/Land Cover (LULC) extraction.

Table 3 (rows recoverable from the extracted layout): High-density white roofs/75; Playground/76 (playground with running track and rectangle-shape soccer field); intensive residential buildings with small size and grey color; Shadow (adjacent to high-rise or low-rise buildings) with dark color.

Data Processing
All of the images were transformed into the same spatial coordinate system, UTM zone 51N with the WGS 84 datum. Using the ASTGTM DEM dataset (http://datamirror.csdb.cn) and the rational polynomial coefficients (RPC) file [29], the multispectral and panchromatic images were ortho-rectified. The Gram-Schmidt spectral sharpening algorithm [30,31] was employed to fuse the ortho-rectified multispectral image (as the low resolution image) and the ortho-rectified panchromatic image (as the high resolution image), such that the fused multi-spectral imagery has a high spatial resolution of 0.8 m. The Gram-Schmidt spectral sharpening algorithm is a classic technique for image fusion, and relevant studies have shown its advantages over other methods [30,31]. In this research, the Gram-Schmidt algorithm embedded in ENVI, a commercial program, was adopted. As a result, all four bands of the GF-2 multi-spectral imagery were merged to generate the pansharpened imagery with four bands and 0.8 m resolution. As geometric differences exist between GF-1 and GF-2 images, GF-1 images were rectified with an RMS error of 0.02 GF-1 pixel by selecting ground control points (GCPs) from the fused GF-2 image. Finally, we obtained four levels of imagery with spatial resolutions of 16 m, 8 m, 3.2 m, and 0.8 m, respectively. The flowchart for image pre-processing is presented in Figure 3.
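
The fusion step can be illustrated with a minimal sketch of Gram-Schmidt-style component substitution. This is not ENVI's implementation. It assumes that the multispectral bands have already been resampled to the panchromatic grid, that the low-resolution panchromatic band can be simulated as the unweighted mean of the bands, and that images are passed as flattened pixel lists. The function name `gs_pansharpen` is hypothetical.

```python
from statistics import fmean, pstdev


def gs_pansharpen(ms_bands, pan):
    """Fuse multispectral bands with a panchromatic band.

    ms_bands: list of bands, each a flat list of floats on the pan grid.
    pan: flat list of panchromatic values with the same length.
    """
    n = len(pan)
    # Simulated low-resolution pan: unweighted band mean (an assumption).
    sim = [fmean([band[i] for band in ms_bands]) for i in range(n)]
    sim_mean, sim_std = fmean(sim), pstdev(sim)
    pan_mean, pan_std = fmean(pan), pstdev(pan)

    # Match the real pan to the mean and spread of the simulated pan.
    scale = sim_std / pan_std if pan_std > 0 else 1.0
    pan_adj = [(p - pan_mean) * scale + sim_mean for p in pan]

    # Spatial detail that the simulated pan lacks.
    detail = [pa - s for pa, s in zip(pan_adj, sim)]

    var_sim = sim_std ** 2
    fused = []
    for band in ms_bands:
        b_mean = fmean(band)
        cov = fmean([(b - b_mean) * (s - sim_mean) for b, s in zip(band, sim)])
        # Injection gain: covariance of the band with the simulated pan,
        # divided by the variance of the simulated pan.
        gain = cov / var_sim if var_sim > 0 else 0.0
        fused.append([b + gain * d for b, d in zip(band, detail)])
    return fused
```

When the panchromatic band carries no detail beyond a linear rescaling of the simulated band, the sketch returns the multispectral bands unchanged, which is the expected behaviour of a component substitution method.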

Classification Scheme
Finer resolution images can provide more detailed information, which might help city managers and planners extract meaningful LULC types. Based on visually interpreted spectral differences, field investigation, and possible applications to urban planning and management, a three-layer classification scheme of urban LULC types was developed and is illustrated in Table 3.
The design of this three-layer land cover scheme is mainly based on the possible uses of land cover in urban-related applications, like urban planning and management. For instance, a city undergoing rapid urbanization urgently needs green space planning because the high rate of population growth contributes to diminishing green space. Detailed land cover information about grass, trees, and shrubs should be identified before making a reasonable plan for green space, while land cover information at the level of vegetation is adequate for a project on urban landscape change analysis. In the three-layer classification scheme designed here, water/pervious, impervious, and shadow land covers belong to the first level of land cover types. Water, pervious, and impervious land covers are critical parameters for urban hydrology, ecology, or environmental studies. With the objective of extracting the visually recognizable land cover types from GF-1 and GF-2 imagery, the second and third levels of the urban LULC classification scheme were designed. In particular, buildings with different land uses present different spectral, textural, or geometric features, which makes it possible to separate the LULC types of buildings in as much detail as possible.
A planning agency official conducting a land use or land cover inventory may wish to map several different types of roofs rather than a single building class. Building roofs have been described not only as the final defining touch that gives a building, and the whole construction process, its aesthetic impression, but also as an expression and sign of a society's level of civilization [32,33]. In addition, building roofs, especially those with red or white colors, are usually used to identify industrial or logistical land use in regions surrounding central urban areas. Five types of industrial buildings, five types of residential buildings, as well as playgrounds were identified visually by geometric and spectral features. This process was carried out mainly by visual interpretation of images and then corrected by in situ investigation and validation. Some land cover types in the third layer belong to the same land cover or land use, for instance, the residential or industrial building types. Although they are nearly meaningless for urban planning or management projects, they help to extract the second layer of the urban LULC types with high accuracy and are beneficial for detecting roofing materials. In practice, particularly in China, buildings with the same use are likely to have the same or a similar color, which likely reflects planning practice in China.

Image Segmentation and Classification
For segmenting and classifying GF-1 and GF-2 imagery, an object-based approach was employed in this study. Object-based image analysis generally includes two main steps: segmentation and classification [34]. The first step in object-oriented image classification is to segment the image into different objects. Numerous image segmentation algorithms have been developed and applied in remote sensing image analysis [35]. Based on the multi-level classification scheme designed in this research, the multi-resolution segmentation (MRS) algorithm was selected as the segmentation method [36]. The MRS technique is a region-merging method. Its objective is to minimize the summed heterogeneity between adjacent pixels. Three user-defined segmentation parameters, namely scale, shape, and compactness, can have a significant effect on the classification accuracy as they control the dimension and size of segmented objects [37,38]. Scale, the most important parameter, specifies the size of the final segmented image object that corresponds to the maximum acceptable heterogeneity. Higher scale parameter values produce larger image objects and vice versa. The shape parameter varies between zero and one and determines the balance between radiometric homogeneity and object shape. Higher shape values yield image objects with optimal shape homogeneity, while lower shape values produce image objects with optimal radiometric homogeneity. Like the shape parameter, the compactness parameter varies between zero and one and controls the degree of object smoothing [39]. These three user-defined parameters are affected by the image spatial resolution and the sizes of the recognized ground objects [40].
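
How the three parameters interact can be sketched with the merge cost from the commonly cited Baatz and Schäpe formulation of multi-resolution segmentation. This is an illustrative assumption and not the eCognition implementation used in the study. The names `Segment`, `fusion_cost`, and `should_merge` are hypothetical, and the perimeter and bounding-box perimeter of the merged object are supplied by the caller.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import pstdev


@dataclass
class Segment:
    pixels: list            # one tuple of band values per pixel
    perimeter: float        # border length l, in pixel edges
    bbox_perimeter: float   # perimeter b of the bounding box


def _band_std(pixels):
    """Population standard deviation of each band over the segment."""
    return [pstdev([p[c] for p in pixels]) for c in range(len(pixels[0]))]


def fusion_cost(a, b, merged_perimeter, merged_bbox_perimeter,
                shape=0.1, compactness=0.5, band_weights=None):
    """Increase in heterogeneity caused by merging segments a and b."""
    merged = a.pixels + b.pixels
    n_a, n_b, n_m = len(a.pixels), len(b.pixels), len(merged)
    bands = len(merged[0])
    w = band_weights or [1.0] * bands
    sd_a, sd_b, sd_m = _band_std(a.pixels), _band_std(b.pixels), _band_std(merged)

    # Spectral (color) heterogeneity change.
    h_color = sum(w[c] * (n_m * sd_m[c] - (n_a * sd_a[c] + n_b * sd_b[c]))
                  for c in range(bands))
    # Compactness term: perimeter relative to the square root of the area.
    h_cmpct = (n_m * merged_perimeter / sqrt(n_m)
               - (n_a * a.perimeter / sqrt(n_a) + n_b * b.perimeter / sqrt(n_b)))
    # Smoothness term: perimeter relative to the bounding-box perimeter.
    h_smooth = (n_m * merged_perimeter / merged_bbox_perimeter
                - (n_a * a.perimeter / a.bbox_perimeter
                   + n_b * b.perimeter / b.bbox_perimeter))
    h_shape = compactness * h_cmpct + (1 - compactness) * h_smooth
    return (1 - shape) * h_color + shape * h_shape


def should_merge(cost, scale):
    """Merge only while the cost stays below the squared scale parameter."""
    return cost < scale ** 2
```

In this sketch a larger scale value tolerates a larger merge cost and therefore yields larger objects, while the shape value shifts the weight from spectral homogeneity toward shape homogeneity.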
Based on research by Drăguţ et al. (2010, 2014) [41,42], the optimal segmentation scale parameter was set to 100 for images with spatial resolutions of 16 m and 8 m, and to 50 and 25 for images with spatial resolutions of 3.2 m and 0.8 m, respectively. The detailed segmentation parameters can be found in Table 4.
Classifiers like decision tree (DT), random forest (RF), and support vector machine (SVM) have attracted great attention among the many object-oriented classification algorithms owing to their excellent classification performance [43,44]. For medium spatial resolution images, the performances of these three algorithms are similar [45]. For classifying object-based imagery with finer spatial resolution, inconsistent conclusions have been drawn due to the effect of various factors such as segmentation parameters [46][47][48]. By systematically analyzing the performance of various commonly used supervised classifiers under different conditions, Li et al. (2016) concluded that RF was the most suitable supervised classification method for object-based image analysis [49].
RF is an ensemble classification technique and a further development of DTs [50]. RF has advantages over DT due to its short training time, easy parameterization, and parameter stability, and has therefore attracted increasing attention in the scientific community. Unlike DT classifiers, RF runs iteratively with a random sample of the training points, and because of the law of large numbers, RF reduces the likelihood of over-fitting [51]. In addition, compared with other commonly used non-parametric classifiers such as SVM, RF is less sensitive to noise and is more efficient [52]. Due to these advantages, RF was employed in this research with urban LULC types as the dependent variable. The main input features were spectral features (the mean value of each individual band and the normalized difference vegetation index, NDVI), geometric information (shape index), and texture measures (homogeneity, angular second moment, contrast, and entropy) from the gray-level co-occurrence matrix (GLCM), calculated in a sliding window of 11 by 11 pixels [53,54]. Image segmentation, classification, and the accuracy assessment described in the following section were carried out in eCognition Developer 9.0.
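
The spectral and texture features named above can be sketched as follows. The formulas are the standard textbook definitions of NDVI and of the Haralick-style GLCM measures. The exact definitions, quantization, and offsets used by eCognition may differ, so the function names `ndvi` and `glcm_features`, the single horizontal offset, and the number of gray levels are assumptions for illustration only.

```python
from math import log


def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red values."""
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0


def glcm_features(window, levels, offset=(0, 1)):
    """Texture measures from a symmetric, normalized GLCM.

    window: 2-D list of integer gray levels in [0, levels),
            e.g. an 11 by 11 patch around a pixel.
    offset: (row, column) displacement between paired pixels.
    """
    dr, dc = offset
    rows, cols = len(window), len(window[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = window[r][c], window[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count both directions (symmetric GLCM)
    total = sum(map(sum, counts))
    feats = {"homogeneity": 0.0, "asm": 0.0, "contrast": 0.0, "entropy": 0.0}
    if total == 0:
        return feats
    for i in range(levels):
        for j in range(levels):
            p = counts[i][j] / total
            if p == 0:
                continue
            feats["homogeneity"] += p / (1 + (i - j) ** 2)
            feats["asm"] += p * p
            feats["contrast"] += p * (i - j) ** 2
            feats["entropy"] -= p * log(p)
    return feats
```

A feature vector for one image object could then combine the per-band means, the NDVI, a shape index, and these four texture values before being passed to a classifier.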

Urban LULC Type Extraction
A three-layer classification scheme was designed for this study to identify different urban LULC types from GF-1 and GF-2 imagery with different spatial resolutions. This multi-level classification can provide city planners with an approach for selecting appropriate LULC types by combining or separating the extracted land use/land covers. In contrast to LULC classes like buildings, squares, and gardens within residential regions, LULCs like water bodies, bare lands, and urban green lands cover large areas, such that they can be identified from GF-1 data with a 16 m spatial resolution. After masking out those extracted LULCs, the classes with small areas can be classified using finer resolution images. During each step of LULC classification, the minimum mapping unit (MMU), which is based on the MMU values from Globeland30 products, is employed to extract the LULC types (Table 4) [8]. Because of the spatial resolution differences among the images employed in this research, there are some inconsistencies in the boundaries of each type of LULC. To avoid slivers caused by these inconsistencies, we generated polygon vector files for regions whose LULC information had been extracted from GF-1 images, and those regions were not processed any further. Only regions outside these vector files were classified from GF-2 images. These vector files, along with raster files, were used as the input data for the next step of urban LULC extraction. In addition, shadow does not belong to any type of LULC, but it is widely present in VHR images, especially those covering urban areas or mountainous regions, so shadow is presented as an individual type. The combination of the information from GF-1 and GF-2 satellite images with different spatial resolutions for extracting each urban LULC type is illustrated in Figure 4.
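
The coarse-to-fine masking described above can be sketched on raster grids. The study itself works with polygon vector files, so the grid representation, the integer resolution ratio `factor`, and the function name `merge_levels` are simplifying assumptions for illustration.

```python
def merge_levels(coarse, fine, factor):
    """Combine a coarse label grid with a fine label grid.

    coarse: 2-D list of labels from the coarser image, with None where
            nothing was extracted at that level.
    fine:   2-D list of labels from the finer image, with `factor`
            times as many rows and columns as `coarse`.
    Cells already labelled at the coarse level are kept and not
    reclassified; only the remaining cells take the fine label.
    """
    merged = []
    for r, row in enumerate(fine):
        merged_row = []
        for c, fine_label in enumerate(row):
            coarse_label = coarse[r // factor][c // factor]
            merged_row.append(coarse_label if coarse_label is not None
                              else fine_label)
        merged.append(merged_row)
    return merged
```

Because the coarse result always takes precedence, boundary disagreements between the two resolutions cannot create sliver classes inside regions that were already extracted.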

Water Body
There are two types of water bodies identifiable from visual image interpretation: clean water and turbid water. They are classified individually and then merged together into the water body class. Because of the relatively large areas of water bodies, the LULC information for water bodies was classified from the GF-1 data with 16 m and 8 m spatial resolutions. Because the effect of shadows on the identification of water bodies was much stronger in images with an 8 m spatial resolution than in images with a 16 m spatial resolution, an overlay spatial analysis was employed to perform topological intersect operations between the water bodies extracted from images at each spatial resolution. The intersect operation is a logical "AND" operation: the resultant polygon is classified as a water body only if both inputs are classified as water bodies. The overlay results are considered to be the main part of the water bodies. The detailed workflow for extracting water bodies is shown in Figure 5.
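
The "AND" intersection can be sketched on boolean masks. The study performs the operation on polygons, so the raster masks, the fixed 2:1 ratio between the 16 m and 8 m grids, and the function name `intersect_water` are assumptions made for illustration.

```python
def intersect_water(mask16, mask8):
    """Logical AND of water masks at 16 m and 8 m resolution.

    mask16: 2-D list of booleans on the 16 m grid.
    mask8:  2-D list of booleans on the 8 m grid, with twice as many
            rows and columns as mask16.
    An 8 m cell is kept as water only if it is water at both resolutions.
    """
    return [
        [mask8[r][c] and mask16[r // 2][c // 2] for c in range(len(mask8[0]))]
        for r in range(len(mask8))
    ]
```

A cell flagged as water only in the 8 m mask, for example a building shadow, is dropped because the 16 m mask does not confirm it.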

Vegetation
Vegetation is mainly composed of trees, shrubs, and grass. The spectral information from grass is different from that of trees and shrubs, so they can be separately extracted and then merged together as vegetation. Large areas of vegetation such as urban gardens and parks can be extracted from GF-1 images with an 8 m spatial resolution, while small areas of vegetation in residential regions can be identified from GF-2 images with a 0.8 m spatial resolution.

Bare Lands
Most of the bare lands are located in regions surrounding urban areas and are mainly construction sites. In addition, places with piles of coal ash were classified as bare lands. This type of land cover is very difficult to identify by visual interpretation of the images, so it was further investigated through field work (Figure 6). Because bare lands are relatively large and are unlikely to be misclassified as other land covers, they were extracted from lower resolution images (e.g., GF-1 imagery with 8 m spatial resolution). The flow chart for extracting vegetation and bare lands is shown in Figure 7.

Farm Lands, Roads and Squares, and Buildings
For this research, GF-1 images were collected in summer, while GF-2 images were acquired in late spring. During late spring, many farm lands contain only bare soil with scattered crops, while trees, shrubs, and grass already carry green leaves. It is therefore feasible to distinguish farm lands from vegetation (e.g., trees, shrubs, and grass) using GF-2 data. Because most farm lands have regular geometric patterns and occupy large geographic areas, GF-2 data with a 3.2 m spatial resolution were employed to classify farm lands.
In addition to farm lands, roads were extracted from GF-2 images with a 3.2 m spatial resolution. Roads were characterized as linear features, and squares were characterized by very bright tones. Because it is difficult to differentiate between roads and buildings in the classified images, GPS trace data of the main roads were employed to identify roads by topological intersection with the classified LULC map. All objects classified as roads that intersected the GPS trace data were selected and regarded as the final main roads, while the others were categorized as buildings.
Changchun is a typical automobile city with many industrial buildings that are relatively large compared to residential buildings. GF-2 imagery with a 3.2 m spatial resolution was used to extract industrial buildings, while imagery with a 0.8 m spatial resolution was employed to extract residential buildings and small shadows (Table 4). The flowchart for extracting farm lands, roads and squares, and buildings is shown in Figure 8.
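The GPS-based separation of roads from buildings can be sketched as a simple set intersection. This is only an illustration under stated assumptions: each candidate road object is represented as a set of grid cells, the GPS trace has already been converted to the same cell coordinates, and the function name, object identifiers, and coordinates are all hypothetical. The study performs the equivalent operation as a topological intersection in a GIS.

```python
def split_roads_and_buildings(candidates, gps_cells):
    """Label each candidate object 'road' if it touches the GPS trace, else 'building'.

    candidates: dict mapping an object id to a set of (row, col) grid cells.
    gps_cells:  iterable of (row, col) cells visited by the GPS traces.
    """
    gps = set(gps_cells)
    labels = {}
    for object_id, cells in candidates.items():
        # Any shared cell counts as an intersection with the trace.
        labels[object_id] = "road" if gps.intersection(cells) else "building"
    return labels


# Hypothetical candidates produced by the road classification step.
candidates = {
    "obj_a": {(0, 0), (0, 1), (0, 2)},  # linear object crossed by the trace
    "obj_b": {(5, 5), (5, 6)},          # bright roof mistaken for a road
}
gps_trace = [(0, 1), (3, 3)]

labels = split_roads_and_buildings(candidates, gps_trace)
```

In practice a small buffer around the trace would probably be needed to absorb GPS positioning error; that refinement is left out here.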

Accuracy Assessment
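Stratified random sampling was employed to collect the testing points because it ensures that samples are included in each class. Pixels were used as the spatial unit in this work. All of the testing points were visually interpreted by overlaying them on the GF-2 false color composite image, or on the GF-1 image for areas without GF-2 coverage. A pixel-based error matrix was employed to compute the overall accuracy (O), user's accuracy (U), and producer's accuracy (P), and to quantitatively assess the accuracy of the urban LULC map [55,56]. In the error matrix, $p_{ij}$ represents the proportion of area of the population that has map class i and reference class j. The overall accuracy derived from an error matrix of q classes can be expressed as [57]:

$$O = \sum_{j=1}^{q} p_{jj} \tag{1}$$

User's accuracy of class i is:

$$U_i = \frac{p_{ii}}{p_{i\cdot}} \tag{2}$$

And producer's accuracy of class j is:

$$P_j = \frac{p_{jj}}{p_{\cdot j}} \tag{3}$$

where $p_{i\cdot} = \sum_{j} p_{ij}$ and $p_{\cdot j} = \sum_{i} p_{ij}$ are the row and column totals of the error matrix. The cell entries of the population error matrix and the parameters derived from it must be estimated from a sample. The sample-based estimator of $p_{ij}$ is denoted as $\hat{p}_{ij}$, and correspondingly, the error matrix should be reported in terms of these estimated area proportions $\hat{p}_{ij}$ instead of the sample counts $n_{ij}$. $\hat{p}_{ij}$ can be expressed as:

$$\hat{p}_{ij} = W_i \frac{n_{ij}}{n_{i\cdot}} \tag{4}$$

where $W_i$ is the proportion of area mapped as class i. Replacing $p_{ij}$ by $\hat{p}_{ij}$, we can calculate the overall, user's, and producer's accuracies.
Aside from the accuracy parameters, standard errors should be reported to indicate the sampling variability. For user's accuracy of map class i, the estimated variance is:

$$\hat{V}(\hat{U}_i) = \frac{\hat{U}_i (1 - \hat{U}_i)}{n_{i\cdot} - 1} \tag{5}$$

For producer's accuracy of reference class j = k, the estimated variance is:

$$\hat{V}(\hat{P}_j) = \frac{1}{\hat{N}_{\cdot j}^{2}} \left[ \frac{N_{j\cdot}^{2} (1 - \hat{P}_j)^{2} \hat{U}_j (1 - \hat{U}_j)}{n_{j\cdot} - 1} + \hat{P}_j^{2} \sum_{i \neq j} \frac{N_{i\cdot}^{2} \frac{n_{ij}}{n_{i\cdot}} \left(1 - \frac{n_{ij}}{n_{i\cdot}}\right)}{n_{i\cdot} - 1} \right] \tag{6}$$

where $\hat{N}_{\cdot j} = \sum_{i=1}^{q} \frac{N_{i\cdot}}{n_{i\cdot}} n_{ij}$ is the estimated marginal total number of pixels of reference class j, $N_{j\cdot}$ is the marginal total of map class j, and $n_{j\cdot}$ is the total number of sample units in map class j.
In addition, the error matrix provides the basis for estimating the areas of the classes. If the sampling design is simple random, systematic, or stratified random, an estimator of the proportion of area of class k is:

$$\hat{p}_{\cdot k} = \sum_{i=1}^{q} W_i \frac{n_{ik}}{n_{i\cdot}} \tag{7}$$

For the stratified estimator of the proportion of area (Equation (7)), the standard error is estimated by:

$$S(\hat{p}_{\cdot k}) = \sqrt{\sum_{i=1}^{q} W_i^{2} \frac{\frac{n_{ik}}{n_{i\cdot}} \left(1 - \frac{n_{ik}}{n_{i\cdot}}\right)}{n_{i\cdot} - 1}} = \sqrt{\sum_{i=1}^{q} \frac{W_i \hat{p}_{ik} - \hat{p}_{ik}^{2}}{n_{i\cdot} - 1}} \tag{8}$$

where $n_{ik}$ is the sample count at cell (i,k) in the error matrix, $W_i$ is the area proportion of map class i, $\hat{p}_{ik} = W_i \frac{n_{ik}}{n_{i\cdot}}$, and the summation is over the q classes. The estimated area of class k is $\hat{A}_k = A \times \hat{p}_{\cdot k}$, where A is the total map area.
The standard error of the estimated area is given by:

$$S(\hat{A}_k) = A \times S(\hat{p}_{\cdot k}) \tag{9}$$

An approximate 95% confidence interval is obtained as $\hat{A}_k \pm 1.96 \times S(\hat{A}_k)$.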
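The estimators described above can be sketched in a few lines of Python. The sketch assumes the stratified estimators of the cited good-practice guidance [57] (estimated area proportions as the mapped-area weight times the within-row sample share); the function names and the two-class toy matrix are hypothetical and are not data from this study.

```python
import math


def estimate_accuracy(counts, weights):
    """Return estimated area proportions and overall, user's, producer's accuracies.

    counts[i][j]: sample count with map class i and reference class j.
    weights[i]:   proportion of the map area assigned to class i.
    """
    q = len(counts)
    row_totals = [sum(row) for row in counts]
    # Estimated area proportion of each cell: W_i * n_ij / n_i.
    p_hat = [[weights[i] * counts[i][j] / row_totals[i] for j in range(q)]
             for i in range(q)]
    overall = sum(p_hat[j][j] for j in range(q))
    users = [p_hat[i][i] / sum(p_hat[i]) for i in range(q)]
    col_totals = [sum(p_hat[i][j] for i in range(q)) for j in range(q)]
    producers = [p_hat[j][j] / col_totals[j] for j in range(q)]
    return p_hat, overall, users, producers


def area_proportion_with_se(counts, weights, k):
    """Stratified estimate of the area proportion of class k and its standard error."""
    p_k, variance = 0.0, 0.0
    for w, row in zip(weights, counts):
        n_i = sum(row)
        share = row[k] / n_i
        p_k += w * share
        variance += w ** 2 * share * (1 - share) / (n_i - 1)
    return p_k, math.sqrt(variance)


# Hypothetical two-class error matrix (rows: map class, columns: reference class).
toy_counts = [[45, 5],
              [10, 40]]
toy_weights = [0.3, 0.7]

p_hat, overall, users, producers = estimate_accuracy(toy_counts, toy_weights)
p_k, se_k = area_proportion_with_se(toy_counts, toy_weights, k=0)
```

For this toy matrix the overall accuracy is 0.83 and the user's accuracies are 0.90 and 0.80. Multiplying `p_k` and `se_k` by the total map area gives the estimated class area and its standard error.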

Urban LULC Mapping
Based on the method described above, urban LULC maps for layers two and three of the classification scheme were obtained by applying the object-based RF classifiers and are presented in Figure 9. As can be discerned from Figure 9, the third layer of the urban LULC map (Figure 9a) presents more detailed urban land cover information than the second layer with eight urban LULC types (Figure 9b). Compared with the third layer, the second layer of the urban LULC map is more consistent with the general impression of the city: the central urban area is covered by residential buildings, green lands, and water bodies, while the surrounding regions are covered by industrial buildings, farm lands, and construction lands. For better visualization, a portion of the third layer of the urban LULC map was enlarged and compared with the GF-2 image (Figure 10). This comparison likewise indicates good consistency between the classified urban LULC types and the GF-2 image.

Quantitative Accuracy Assessment
Accuracy assessment determines the quality of a classified map from satellite images. In this paper, we performed the process of accuracy assessment according to the good practices for estimating accuracy recommended by Olofsson et al. (2014) [57].

Estimating Accuracy
Map accuracy was estimated with the "AcATaMa" plugin (version 18.11.21) installed in QGIS 3.4.4. A stratified random sampling approach was used, and sample sizes were calculated and allocated according to the mapped proportion of area of each class. Because the mapped areas of the classes are uneven, some classes with small areas were allocated only five or even fewer samples. To make the samples in each class more representative, we manually adjusted the allocation so that every class received a minimum of 10 samples. All of the testing points were labeled jointly by two different analysts. The resulting error matrices for urban LULC map layers two and three are presented in terms of sample counts in Tables 5 and A1, together with the user's accuracy and producer's accuracy for each class and the overall accuracy. Table 5 shows that the overall accuracy of the second layer of the urban LULC map reaches 0.89, which means that the second layer is more accurate than the third layer, whose overall accuracy is 0.87. Based on Equations (4)-(6), we can compute the estimated user's and producer's accuracies and their variances by using error matrices whose cell values are the estimated area proportions (Tables 6 and A2). The estimated user's and producer's accuracies with a 95% confidence interval for urban LULC map layers two and three are presented in Table 7. Although good overall accuracies were achieved for both layer two and layer three, some map classes, especially in layer three, have user's or producer's accuracies below 0.80, for instance bare lands T2 and residential bd2-bd4.
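The sample allocation described above (proportional to mapped area, with a floor of 10 samples per class) can be sketched as follows. This is not the AcATaMa implementation; the function name, the class names, the area proportions, and the total sample size are hypothetical and serve only to illustrate the adjustment.

```python
def allocate_samples(area_proportions, total_samples, minimum=10):
    """Allocate samples proportionally to mapped area, raising small classes to a floor."""
    allocation = {}
    for class_name, proportion in area_proportions.items():
        proportional = round(total_samples * proportion)
        # Classes that would receive too few samples are raised to the minimum.
        allocation[class_name] = max(minimum, proportional)
    return allocation


# Hypothetical mapped area proportions (they sum to 1).
mapped_share = {"water": 0.03, "vegetation": 0.27, "buildings": 0.70}

allocation = allocate_samples(mapped_share, total_samples=300)
```

Because small classes are topped up, the final sample size can exceed the requested total. In this toy case "water" is raised from 9 to 10 samples.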

Estimating Area and Uncertainty
Based on the estimated area proportions, we can estimate the area of each class according to the reference data. The error matrix in Table 6 is used here to illustrate how to estimate the area and its uncertainty. The estimated area of water body (Wb) is $\hat{A}_1 = \hat{p}_{\cdot 1} \times A_{tot}$ = 0.02613 × 523.1622 = 13.66806 km², where $A_{tot}$ is the total map area. The mapped area of water body, 14.09400 km², therefore exceeds the estimated area by 0.42594 km²; that is, the map overestimated the area of water body. The confidence interval for the area of each class can be estimated with the method described in Section 2.6. From Equation (8), $S(\hat{p}_{\cdot 1})$ = 0.00082, and the standard error of the estimated area of water body is $S(\hat{A}_1) = S(\hat{p}_{\cdot 1}) \times A_{tot}$ = 0.00082 × 523.1622 = 0.42900 km². The margin of error of the confidence interval is 1.96 × 0.429 = 0.84084 km². The estimated area of water body with a 95% confidence interval is therefore 13.66806 ± 0.84084 km², i.e., the lower and upper limits are 12.82722 km² and 14.50890 km², respectively. The area of each map class in layers two and three can be estimated in the same way. Table 8 presents the estimated areas of each class in urban LULC layers two and three, rounded to two decimal places, with a 95% confidence interval. These estimates allow urban planners to examine the uncertainty of each map class and to support decisions on the protection of particular land uses.
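The water body example can be re-traced numerically. The sketch below uses only the figures quoted in the paragraph above (estimated area, standard error of the proportion, total area, and mapped area); the variable names are illustrative. The standard error is not rounded before it is multiplied by 1.96, so the last digits differ slightly from the rounded values in the text.

```python
A_TOT = 523.1622        # total map area in km^2, as quoted in the text
AREA_HAT = 13.66806     # estimated water body area in km^2, as quoted
SE_PROPORTION = 0.00082 # standard error of the estimated proportion, as quoted
MAPPED_AREA = 14.09400  # mapped water body area in km^2, as quoted

# Standard error of the estimated area: S(A_hat) = A_tot * S(p_hat).
se_area = A_TOT * SE_PROPORTION

# Half-width of the approximate 95% confidence interval.
margin = 1.96 * se_area
lower, upper = AREA_HAT - margin, AREA_HAT + margin

# Positive value: the mapped area is larger than the reference-based estimate.
map_minus_estimate = MAPPED_AREA - AREA_HAT

print(f"S(A_hat) = {se_area:.5f} km^2, margin = {margin:.5f} km^2")
print(f"95% CI: [{lower:.5f}, {upper:.5f}] km^2")
print(f"mapped minus estimated: {map_minus_estimate:.5f} km^2")
```

The interval of roughly 12.83 to 14.51 km² contains the mapped area of 14.094 km², so the difference between the mapped and estimated areas lies within the sampling uncertainty.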

Discussion
Medium resolution satellite imagery lacks the ability to provide detailed urban LULC information. In contrast, sub-meter and meter-scale satellite imagery provides an essential data source for extracting detailed land cover information, especially for urban regions with highly heterogeneous manmade materials. VHR imagery, however, generally cannot cover an entire city and is computationally expensive to process. In this study, medium resolution imagery (GF-1) and very high resolution imagery (GF-2) were integrated to map metropolitan-scale urban LULC types by taking advantage of the multi-level resolution imagery from these two satellites. The acquisition of multi-level resolution imagery provides a unique opportunity to classify urban LULC types in a large urban area at different spatial scales. Considering that urban LULC types covering relatively large areas, such as water bodies, green lands, and bare lands, can be classified from GF-1 images, while small urban LULC types, such as industrial and residential buildings, can be extracted from GF-2 data, this work proposed a three-layer classification scheme. The scheme is flexible, allowing urban planners and policy makers to select the level of urban LULC detail that is most appropriate for their specific applications, including the spatio-temporal dynamics of urbanization and suburbanization, land cover or land use change, urban landscape change analysis, and ecological conservation [58][59][60]. Despite several issues, including the need for imagery from two or more satellites, relatively complex image processing techniques, and the difficulty of selecting appropriate urban LULC types, this scheme provides a practical approach to extracting metropolitan-scale urban LULC types at different levels of detail.
The design of the three-layer land cover scheme is mainly based on the visual separability of the urban LULC types, with the aim of achieving better accuracy. The accuracy assessment shows that many urban LULC types achieved high accuracies using the designed three-layer classification scheme. Although there is no definite accuracy threshold for applications of satellite-based urban LULC mapping, one goal of this study was to achieve the highest possible accuracy while minimizing fieldwork and post-processing, so that potential applications remain both cost-effective and operationally practical. All of the urban LULC types in the second and third layers of the urban LULC map are appropriate for use by urban planners or managers to perform dynamic urban expansion analysis, to estimate urban population density and other urban landscape characteristics, and to conduct urban environment-related projects [61][62][63]. This scheme was specifically designed for GF-1 and GF-2 satellite images. It may not perform well for other satellite images with different specifications, but it offers an approach to deriving LULC types at different levels of detail. Most of the urban LULC types in this scheme can help urban planners or local managers understand the current conditions of urban LULC.

Conclusions
An increasing number of satellites with finer spatial resolutions have been successfully launched around the world in the last two decades. While finer scale imagery allows for differentiating more subtle geometric differences in land cover types than coarse or medium spatial resolution images, problems such as the large volume of the datasets and the time required for segmentation and classification persist, especially for a large region such as a whole urban area. By integrating GF-1 imagery with meter-scale resolution and GF-2 imagery with sub-meter resolution, this study designed a three-layer classification scheme and mapped detailed urban LULC at different spatial resolutions. Conclusions from a case study in Changchun, the capital city of Jilin Province, can be drawn as follows: First, the proposed multi-level classification scheme is feasible for extracting urban LULC by combining medium resolution and VHR remote sensing imagery. Homogeneous urban LULC types such as water bodies, bare lands, or large areas of vegetation could be derived from GF-1 imagery with 16 m and 8 m spatial resolutions, while heterogeneous urban LULC types such as farm lands, industrial buildings, and roads and squares could be extracted from GF-2 imagery with a 3.2 m spatial resolution, and residential buildings, small patches of vegetation, and shadows could be extracted from GF-2 imagery with a 0.8 m spatial resolution.
Second, through image segmentation and object-based image classification, the detailed urban LULC maps at the second and third levels reached overall accuracies of 0.89 and 0.87, respectively, suggesting that the three-layer classification scheme has the potential to derive high accuracy urban LULC information.
Author Contributions: G.C. and H.R. processed the satellite images, collected the ground truth data, performed the accuracy assessment, and wrote the manuscript; L.Y. and N.Z. proposed the requirement, prepared and made available the GF-1 and GF-2 data, and discussed the classification scheme and the possible approaches to classifying the urban land covers; M.D. contributed to revising the manuscript; and C.W. contributed to overseeing the research design and revising the manuscript.

Table A1. Assessment of the land cover map in the third layer of the classification scheme.
Tt 21 9 15 24 19 26 23 28 15 17 17 15 12 15 41 13 20 13 45 13 15