Extraction of Urban Objects in Cloud Shadows on the basis of Fusion of Airborne LiDAR and Hyperspectral Data

Feature extraction in cloud shadows is a difficult problem in the field of optical remote sensing. The key to solving this problem is to improve the accuracy of classification algorithms by fusing multi-source remotely sensed data. Hyperspectral data have rich spectral information but suffer severely from cloud shadows, whereas light detection and ranging (LiDAR) data can be acquired from beneath clouds to provide accurate height information. In this study, fused airborne LiDAR and hyperspectral data were used to extract urban objects in cloud shadows using the following steps: (1) a series of LiDAR and hyperspectral metrics were extracted and selected; (2) cloud shadows were extracted; (3) the newly proposed approach, which combines a pixel-based support vector machine (SVM) and object-based classifiers, was used to extract urban objects in cloud shadows; (4) a pixel-based SVM classifier was used for the classification of the whole study area with the selected metrics; (5) a decision-fusion strategy was employed to get the final results for the whole study area; (6) accuracy assessment was conducted. Compared with the SVM classification results, the decision-fusion results of the combined SVM and object-based classifiers show that the overall classification accuracy is improved by 5.00% (from 87.30% to 92.30%). The experimental results confirm that the proposed method is very effective for urban object extraction in cloud shadows and can thus improve urban applications such as urban green land management, land use analysis, and impervious surface assessment.


Introduction
With the acceleration of urbanization, the problems of urban population expansion, resource shortage and traffic congestion are becoming increasingly serious. The trend in urban development is the increasing construction of smart cities according to appropriate spatial information, thereby improving the efficiency of urban management. The ability to acquire information about urban objects quickly and accurately is an important factor for the smooth development of smart cities. With the advantages of providing a synoptic view and rich spatial and spectral information, remote sensing can accurately obtain urban object information at various scales [1][2][3][4][5][6][7][8][9], and is becoming one of the most effective tools for urban object extraction [10].
Up to now, many scholars have carried out relevant studies on the extraction of urban objects with medium- and high-resolution optical images. With the accelerating process of urbanization, the types of urban objects are becoming increasingly complex, which makes urban object extraction more challenging. Examples of such challenges are as follows: (1) similar spectral characteristics are shared by many different urban land types, such as cement pavements, parking lots, roads, rooftops, sidewalks, and buildings, so remote sensing data from a single source cannot meet the needs of current remote sensing applications [11,12]; (2) high-resolution optical remote sensing images may encounter serious problems with shadows, such as building shadows and cloud shadows; (3) medium-resolution optical remote sensing images have the mixed pixel problem; (4) the acquisition of optical remote sensing images is easily affected by the weather, which leads to weak spectral information in shadow areas. To overcome these limitations of single-source remote sensing data, studies have been carried out on urban object extraction that is based on multi-source remote sensing data [7][8][9][13].
With the continuous development of sensor technology, we can obtain different remote sensing images more conveniently and quickly, and this provides a prerequisite for multi-source image data fusion. Airborne hyperspectral data can acquire hundreds of continuous spectral bands that can provide the precise spectral information of land objects [14]. High-density light detection and ranging (LiDAR) point cloud data can provide precise height information [15]. These two data sources are complementary to each other, so their combination is very helpful for classification that draws on both spectral and height information [16]. Up to now, many scholars have applied fused airborne LiDAR and hyperspectral data in various studies for a variety of purposes such as tree species classification [17][18][19][20][21], forest biomass estimation [22][23][24][25], and urban object extraction [26][27][28][29][30][31][32][33][34][35][36]. These studies have mainly concentrated on the following aspects: (1) Extraction of feature parameters, such as attribute profile parameters [26] and morphological attribute profile [27]. (2) Fusion methods, such as pixel-level [28][29][30][31][32], feature-level [27,30,33], and decision-level fusion [34][35][36][37]. Among them, pixel-level fusion mainly uses layer stacked feature parameters. Luo et al. [31] fused LiDAR and hyperspectral data by layer stacking, and they employed the maximum likelihood and support vector machine (SVM) classifiers to classify urban objects. Compared with the classification results of hyperspectral data alone, the overall accuracy of layer stacking fused hyperspectral and LiDAR data was improved by about 9.00%. Feature-level fusion is a method in which features are derived first, and comprehensive analysis and processing are then carried out. Man et al. [30] proposed new methods to fuse LiDAR and hyperspectral data in pixel-level and feature-level fusion, and they used object-based and SVM classifiers to extract urban land use information. In comparison with hyperspectral data alone, hyperspectral-LiDAR data fusion improved the overall accuracy by 6.80% (from 81.70% to 88.50%) when the SVM classifier was used. Meanwhile, compared with the SVM classifier alone, the combined SVM and object-based method improved the overall accuracy by 7.10% (from 87.60% to 94.70%). Zhang et al. [27] fused LiDAR and hyperspectral data in feature-level fusion, and improved the overall accuracy from 80.49% to 89.93% in comparison with hyperspectral data alone. Decision-level fusion classifies and identifies each image individually, and then derives the optimal decision results. Zhong et al. [36] fused LiDAR and hyperspectral data in decision-level fusion, and improved the overall accuracy from 87.80% to 90.80% when compared with hyperspectral data alone. Bigdeli et al. [34] applied a decision-fusion method based on a multiple support vector machine system for fusing hyperspectral and LiDAR data; the overall accuracy was improved from 88.00% to 91.00% in comparison with hyperspectral data alone. (3) Classification methods have included spectral angle mapping [28], random forest [32,38,39], maximum likelihood classification [28,29,31,33,36,40], support vector machine [29,31,34,36,38,39,40], and object-based classification [41,42]. In general, many studies have shown that LiDAR and hyperspectral data fusion could overcome some of the limitations of using a single data source for object extraction [43].
Although many studies have fused LiDAR and hyperspectral data for urban object extraction and obtained improved classification results [31], there are still some problems that need to be further investigated. For example, few studies have focused on urban object extraction in shadows by using hyperspectral and LiDAR data. Against the background of accelerated urbanization, it is essential to make full use of the advantages of multi-source remotely sensed data to improve the overall accuracy of urban object extraction and support smart cities and urban planning activities. While the pixel-based SVM classifier works well with high-dimensional data classification, it is difficult to achieve high classification accuracy using the pixel-based SVM classifier in shadow areas because of missing spectral information. Besides that, the advantage of LiDAR height information is not fully utilized. Therefore, there is a need to explore other methods for fusing LiDAR and hyperspectral data. Object-based classification is an evolving technology that is driven by the understanding of objects rather than pixels [44]. This study aims to extract urban objects in shadows using fused airborne LiDAR and hyperspectral data, and utilizes the advantages of LiDAR data in shadows to improve the overall accuracy of urban object extraction. We propose a new workflow to fuse hyperspectral and LiDAR data for the classification of remotely sensed scenes with cloud shadows. The proposed method comprises the following steps. Firstly, cloud shadow areas are extracted. Secondly, a multi-scale object-based classification method is used to classify cloud shadow areas. Thirdly, the whole study area is classified with pixel-based support vector machine algorithms. Finally, decision-level fusion is conducted to improve the overall classification accuracy of the whole study area, including the cloud shadow areas.
The remainder of this paper is organized as follows: Section 2 describes the study area and data; Section 3 describes the proposed workflow with a detailed description of the proposed method; Section 4 presents the results and discussions; and Section 5 draws the conclusions.

Study Site
The study area is in Houston in the southeast of Texas, USA (Figure 1), covering an area of approximately 4 km², extending from 95°19′13.56″W to 95°22′9.94″W and 29°43′0.96″N to 29°43′32.34″N.

Datasets
The datasets used in this study are provided by the 2013 IEEE GRSS Data Fusion Contest [43] (URL: http://www.grssieee.org/community/technical-commitees/data-fusion/) and include airborne hyperspectral imagery, LiDAR point cloud data, training data and validation data.

LiDAR Data
The LiDAR data used in the present study were acquired on 22 June 2012 between 14:37:55 and 15:38:10 UTC (Coordinated Universal Time). The sensor recorded five returns and intensity information at a platform altitude of 609.6 m, with an average point spacing of 0.74 m. In this study, the intensity of LiDAR data was not calibrated, and the atmospheric effects were not considered.

Hyperspectral Data
The hyperspectral imagery data were acquired on 23 June 2012 between 17:37:10 and 17:39:50 UTC. The sensor is CASI and its above-ground height is 1676.4 m. There are 144 spectral bands in the 380-1050 nm region. The hyperspectral imagery was calibrated to at-sensor spectral radiance units (SRUs), which are equivalent to units of µW cm⁻² sr⁻¹ nm⁻¹. The spectral and spatial resolutions were 4.8 nm and 2.5 m, respectively.

Training and Validation Data
In this study, 12 classes were identified: (1)

Methodology
The methodology flowchart consists of six major parts: (1) data preprocessing; (2) cloud shadow extraction; (3) extraction of urban objects from cloud shadow areas using a multi-scale object-based classifier; (4) extraction of urban objects from the whole study area using a pixel-based SVM classifier; (5) decision fusion of the classification results of shadow areas and the whole study area; (6) accuracy assessment for evaluating the performance of the proposed method.

Data Preprocessing
The LiDAR point clouds were processed into four raster datasets: a digital surface model (DSM), a digital elevation model (DEM), a normalized digital surface model (nDSM), and intensity imagery. The detailed steps are as follows: (1) Terrasolid software was used to filter the raw point cloud data into ground and non-ground points. (2) Inverse distance weighted interpolation (IDW) was used to interpolate the ground points to create the DEM. (3) The first returns of the LiDAR point cloud data were interpolated to create the DSM. (4) The nDSM was generated by subtracting the DEM from the DSM. Finally, the intensity values of the LiDAR point cloud data were interpolated by IDW in ArcGIS 10.2. The spatial resolution of the four raster datasets was 2.5 × 2.5 m.
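Steps (2) and (4) can be illustrated with a minimal pure-Python sketch. It is not the Terrasolid or ArcGIS implementation: the function names `idw` and `compute_ndsm`, the power of 2, and the clamping of negative heights to zero are assumptions made only for illustration.

```python
import math


def idw(points, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    points: iterable of (px, py, z) samples, e.g. LiDAR ground returns.
    power:  distance exponent (2.0 is an assumed, commonly used value).
    """
    num = 0.0
    den = 0.0
    for px, py, z in points:
        d = math.hypot(x - px, y - py)
        if d == 0.0:
            return z  # the query location coincides with a sample
        w = 1.0 / d ** power
        num += w * z
        den += w
    return num / den


def compute_ndsm(dsm, dem):
    """Cell-by-cell nDSM = DSM - DEM for two equally sized grids.

    Negative differences are clamped to 0 (an assumption, not stated
    in the text) so that the result reads as height above ground.
    """
    return [
        [max(s - g, 0.0) for s, g in zip(dsm_row, dem_row)]
        for dsm_row, dem_row in zip(dsm, dem)
    ]


if __name__ == "__main__":
    ground = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
    print(idw(ground, 1.0, 0.0))                     # midway between the two samples
    print(compute_ndsm([[12.0, 5.0]], [[10.0, 5.5]]))
```

In practice the interpolation would be evaluated at every 2.5 m cell centre to build the DEM grid before the subtraction.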
Since the hyperspectral imagery has 144 narrow spectral bands, and some of these bands are particularly affected by atmospheric effects, atmospheric correction is necessary during the preprocessing of hyperspectral data. In this study, atmospheric correction was applied to the hyperspectral data using the FLAASH model in ENVI 5.1. The atmospherically corrected hyperspectral imagery was then processed to generate the normalized difference vegetation index (NDVI). In order to avoid band redundancy [45][46][47], the 144 spectral bands of hyperspectral imagery were processed using the minimum noise fraction rotation (MNF) and principal component analysis (PCA) in ENVI 5.1. It has been demonstrated that the textural features derived from the gray-level co-occurrence matrix (GLCM) can significantly improve the classification accuracy of satellite images [48,49]. Therefore, the first band of PCA was used to generate the GLCM for texture analysis.
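As a reminder of how the NDVI is formed, the following sketch computes it from one near-infrared and one red reflectance value per pixel. Which of the 144 CASI bands serve as "red" and "NIR" is not stated here, so the inputs are simply assumed to be two pre-selected bands; returning 0 for an empty denominator is likewise an assumption.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    denom = nir + red
    if denom == 0:
        return 0.0  # assumed convention for pixels with no signal
    return (nir - red) / denom


def ndvi_image(nir_band, red_band):
    """Apply ndvi() cell by cell to two equally sized 2-D grids."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


if __name__ == "__main__":
    print(ndvi_image([[0.5, 0.2]], [[0.1, 0.2]]))
```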
The first 22 bands of MNF (MNF22), nDSM, intensity, GLCM homogeneity, GLCM dissimilarity, and GLCM entropy were selected for urban land use classification. The pixel size of the above parameters was 2.5 × 2.5 m. Figure 2 shows the flowchart of the whole study.
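The three GLCM texture measures named above can be sketched in pure Python as follows. This is an illustration under assumptions, not the ENVI implementation: the image is taken to be already quantized to integer gray levels, a single horizontal offset is used, the matrix is made symmetric, and no moving window is applied.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized gray-level co-occurrence matrix.

    image:  2-D list of integer gray levels in [0, levels).
    dx, dy: pixel offset defining the neighbour (assumed: one step right).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
                total += 2
    return [[v / total for v in row] for row in counts]


def texture_measures(p):
    """Return (homogeneity, dissimilarity, entropy) of a normalized GLCM."""
    homogeneity = dissimilarity = entropy = 0.0
    for i, row in enumerate(p):
        for j, v in enumerate(row):
            if v > 0:
                homogeneity += v / (1 + (i - j) ** 2)
                dissimilarity += v * abs(i - j)
                entropy -= v * math.log(v)
    return homogeneity, dissimilarity, entropy


if __name__ == "__main__":
    print(texture_measures(glcm([[0, 0, 1], [0, 1, 1]], levels=2)))
```

A uniform patch gives homogeneity 1 and zero dissimilarity and entropy, whereas a patch that alternates between gray levels scores lower homogeneity and higher dissimilarity.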

Shadow Area Extraction
Usually, shadows refer to areas in which the imaging rays are completely or partially obscured by obstacles. The pixel value of a shadow area is generally lower than that of the surrounding imaged area. The loss of spectral information of ground objects in shadow areas increases the difficulty of classification. To improve the accuracy of urban object extraction in shadow areas, it is necessary to investigate new methods.
In recent years, many scholars have proposed a variety of shadow detection methods, such as model-based and shadow attribute-based detection methods [41,50,51,52,53]. Because the cloud shadow area is very large in our study, and the derivation of the shadow area is not the primary goal of this work, a simple shadow detection method based on area attribute filters was employed [50]. The attribute filters are connected operators. On the basis of a given criterion, attribute filters operate on the connected components that compose an image. Each component of the image is evaluated by the criterion. An arbitrary attribute γ (e.g., area, volume, etc.) of component C is compared with a given reference value λ, which is the filter parameter. Taking γ(C) > λ as an example, if the criterion is verified, the regions remain unaffected; otherwise, they will be set to the gray level of a darker or brighter surrounding region, depending on the transformation used (i.e., thickening or thinning). In this study, by gradually increasing the threshold of the area attribute, progressively more bright objects were filtered out, leaving dark shadow areas. In this study area, two shadow areas were detected: one is large and dark, and the other is small. As we conducted decision fusion of the classification results, the small shadow area was found to be too small to conduct classification individually, let alone decision fusion of the final results. Therefore, only the large main shadow area was chosen for the study. Then, the shadow areas were binarized and used as masks for the subsequent object extraction. $M = \{m_{ij}\}$ denotes the cloud-shadow mask, with pixel values of $m_{ij} = 0$ in the cloud-shadow region and $m_{ij} = 1$ in the shadow-free region. Figure 3 shows the hyperspectral imagery in the shadow area in the true color display and false color display. The vegetation in the shadow area is more apparent in the false color display of the hyperspectral data.
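The area criterion γ(C) > λ and the mask convention can be illustrated with a simplified sketch. It is not the attribute filter of [50]: instead of thinning or thickening a gray-level image, it thresholds brightness, labels 4-connected dark components, and keeps only those whose area exceeds λ. The names `shadow_mask`, `fuse_labels`, `dark_threshold`, and `min_area` are hypothetical, and the per-pixel selection in `fuse_labels` is only one assumed way the mask could drive the later decision fusion.

```python
from collections import deque


def shadow_mask(brightness, dark_threshold, min_area):
    """Return M with 0 in retained shadow components and 1 elsewhere.

    A dark component C is kept only if area(C) > min_area, which mirrors
    the criterion gamma(C) > lambda with the area attribute.
    """
    rows, cols = len(brightness), len(brightness[0])
    mask = [[1] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0] or brightness[r0][c0] >= dark_threshold:
                continue
            # Breadth-first search over one 4-connected dark component.
            component = []
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                component.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc]
                            and brightness[nr][nc] < dark_threshold):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if len(component) > min_area:
                for r, c in component:
                    mask[r][c] = 0
    return mask


def fuse_labels(mask, shadow_labels, whole_area_labels):
    """Assumed fusion rule: shadow-area result where m_ij = 0, else the other."""
    return [
        [s if m == 0 else w for m, s, w in zip(m_row, s_row, w_row)]
        for m_row, s_row, w_row in zip(mask, shadow_labels, whole_area_labels)
    ]


if __name__ == "__main__":
    image = [[1, 1, 9, 1], [1, 9, 9, 9], [9, 9, 9, 9]]
    for row in shadow_mask(image, dark_threshold=5, min_area=1):
        print(row)
```

In the toy image the three-pixel dark region is retained as shadow, while the isolated dark pixel is discarded, analogous to dropping the small shadow area.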

Extraction of Urban Objects in Cloud Shadow Areas
Information extraction from shadow areas has been a difficult problem in the field of remote sensing. This study mainly utilizes the advantage of airborne LiDAR data and an object-based classification method to improve fusion classification accuracy.
Compared with traditional pixel-based classification methods, object-based classification is an evolving technology that can make full use of spatial, texture, spectral, and other information of the fused data, and it can also reduce the "salt and pepper effect". Firstly, a multi-resolution segmentation algorithm was used to segment the images at a certain scale level. Secondly, threshold segmentation classification was conducted to extract urban objects in shadow areas using attributes such as shape, length, and area. As the process is complicated, more detailed information is given in the following sections, which illustrate the images used in the segmentation and the parameters used in the classification. Figure 4 shows a detailed flowchart of the object-based classification in the shadow areas.

Image Segmentation
In object-based classification, segmentation aggregates pixels into objects according to their similarity [44]. As the accuracy of image segmentation significantly influences the classification accuracy [54], the process was performed using the multi-resolution segmentation algorithm (FNEA, fractal net evolution approach) in Trimble eCognition® Developer 9.0. In order to avoid the subjectivity of scale parameter selection and the time-consuming trial-and-error method, the estimation of scale parameter 2 (ESP2) tool was selected to obtain the optimal scale parameter [55][56][57][58]. As an automated tool for segmentation assessment, ESP2 can automatically identify suitable segmentation parameters (SPs) for multi-resolution segmentation on the basis of local variance across scales. The advantages of this strategy are that: (1) different layers have different weights; (2) the attribute (pixel value) and shape of objects are taken into account in the segmentation process; (3) the method is flexible and can make full use of the fused data. After image segmentation, different features were extracted from spectral images to classify the urban objects in the cloud shadows.

Classification Algorithms
(1) Extraction of Buildings
In the hyperspectral imagery, much of the spectral information of shadow areas was lost, so the extraction of objects in shadow areas was difficult. However, some features were still identifiable; for example, vegetation could be identified due to its strong reflectance in the near infrared band. Therefore, we first analyzed the distribution of trees using the normalized difference vegetation index (NDVI), and we then determined the threshold for separating vegetation and non-vegetation on the basis of the NDVI image, which provided a theoretical basis for the subsequent classification. Figure 5 shows the spectral reflectance curves of vegetation in shadow areas. Figure 6 (left) shows the NDVI imagery of the shadow areas. Then, vegetation samples were selected randomly to get their distribution in NDVI imagery, and Figure 6 (right) shows the distribution of vegetation samples in different intervals of NDVI imagery. According to the statistics of sample spectral characteristics, vegetation information in shadow areas was apparent, and the NDVI of vegetation in shadows was greater than 0.3. In addition, the LiDAR height data could assist in the extraction of objects in the shadows. Therefore, the fusion of hyperspectral and LiDAR data could produce better classification results in shadow areas.
In order to extract buildings, the nDSM data were first segmented into homogeneous regions using the multi-resolution segmentation method (MRS). The MRS algorithm was run using a shape parameter of 0.4 and a compactness parameter of 0.2. The optimal segmentation scale is 2 according to the calculation of ESP2. Here "scale" means the size of segmented objects, and "shape" and "compactness" are heterogeneity criteria used for merging neighboring objects. Firstly, nDSM was segmented, and then height and NDVI thresholds were used to separate non-ground objects from trees. Secondly, the extracted small objects were merged into large objects. Finally, the geometry parameters (e.g., area, length) were used to separate buildings from other non-ground urban objects. The rules for extracting buildings were set as follows:
(2) Extraction of trees
In shadow areas, urban objects with height information include buildings, trees, and highways. Therefore, trees were extracted using height information of the nDSM and NDVI. The nDSM data were first segmented into homogeneous regions. Then, the scale, shape, and compactness parameters were set as 2, 0.4, and 0.2, respectively. After the nDSM was segmented, the height and NDVI thresholds were used to separate trees from non-ground objects. Then, the extracted small objects were merged into large objects. Finally, the geometry parameters (e.g., area, length) were used to separate trees from other non-ground urban objects. The detailed rules were set as follows:

• Mean NDVI ≥ 0.4 and Mean nDSM ≥ 1 m;
• Merge the extracted segments;
(3) Extraction of grass
In shadow areas, the NDVI could be used to separate vegetation and non-vegetation. In addition, the height information of the nDSM could be used to separate trees and grass. Therefore, the rules for the extraction of grass were set as follows:

• Assign the non-extracted objects to unclassified.
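The tree rule stated earlier (mean NDVI ≥ 0.4 and mean nDSM ≥ 1 m), together with the fallback to unclassified, can be illustrated with a minimal sketch. The `Segment` class, its field names, and the sample values are hypothetical and only stand in for the per-object statistics produced by the segmentation software. The merging step and the geometry rules are not reproduced here.

```python
from dataclasses import dataclass
from typing import List

NDVI_MIN_TREE = 0.4    # mean NDVI threshold from the tree rule
HEIGHT_MIN_TREE = 1.0  # mean nDSM threshold (metres) from the tree rule


@dataclass
class Segment:
    """Hypothetical container for per-object statistics of one image segment."""
    segment_id: int
    mean_ndvi: float
    mean_ndsm: float  # mean height above ground, in metres
    label: str = "unclassified"  # objects not matched by a rule stay unclassified


def label_trees(segments: List[Segment]) -> List[Segment]:
    """Label a segment as 'tree' when both thresholds of the tree rule are met."""
    for seg in segments:
        if seg.mean_ndvi >= NDVI_MIN_TREE and seg.mean_ndsm >= HEIGHT_MIN_TREE:
            seg.label = "tree"
    return segments


# Illustrative values only: tall vegetation, low vegetation, tall non-vegetation.
segments = [Segment(1, 0.62, 7.5), Segment(2, 0.55, 0.2), Segment(3, 0.10, 12.0)]
labels = [s.label for s in label_trees(segments)]
```

Only the first segment satisfies both conditions. The low-vegetation and non-vegetation segments remain unclassified and would be handled by the later rules.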
(4) Extraction of highways. Because of their spectral similarity, it is difficult to separate highways, railways, and roads, especially in cloud shadow areas. Here, the unclassified areas were first used as a mask; then, the combined image (Intensity + NDVI + PCA3) was segmented into homogeneous regions using the multi-resolution segmentation (MRS) method. The MRS algorithm was run using a shape parameter of 0.1 and a compactness parameter of 0.5. The optimal segmentation scale was 60 according to the calculation of ESP2. Finally, the rules were set as follows:
(5) Extraction of railways and roads. Because of the spectral similarity and the cloud shadow effect, it is difficult to separate roads from railways. Using the same processing steps as in (4) above, the scale, shape, and compactness parameters were set to 5, 0.1, and 0.5, respectively. Finally, the rules for railway extraction were set as:

Extraction of Urban Objects in the Whole Study Area
As high-dimensional, multi-source data were used in this study, traditional parametric classifiers would have been inadequate. The support vector machine (SVM) classifier is a nonparametric algorithm that can produce better classification results with limited training samples [59]. SVM is a supervised machine learning method based on statistical learning theory [44,60]. The SVM classification method depends on finding a separating hyperplane that provides the best separation between two classes in a multi-dimensional feature space. In an $R^n$ classification situation, the hyperplane can be determined by the following expression. Here, $(y_i, x_i)$ is a training sample, $i = 1, 2, \ldots, n$, and $w$ is a vector perpendicular to the classification hyperplane.
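For reference, the standard hard-margin SVM formulation that matches the symbols used here can be written as follows. This is the textbook form and is given only as a sketch: the bias term $b$ and the label convention $y_i \in \{-1, +1\}$ are assumptions, and the notation may differ from the equations of the original article.

```latex
\begin{align}
  % Separating hyperplane in R^n
  w \cdot x + b &= 0 \\
  % Every training sample (y_i, x_i) lies on the correct side, outside the margin
  y_i \,(w \cdot x_i + b) &\ge 1, \qquad i = 1, 2, \ldots, n \\
  % Maximizing the margin 1/\|w\| is equivalent to minimizing \|w\|^2
  \min_{w,\,b} \; \tfrac{1}{2}\,\|w\|^2 \quad
  &\text{subject to} \quad y_i \,(w \cdot x_i + b) \ge 1
\end{align}
```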
The hyperplane should maximize the margin (Figure 7), where the margin is the distance from the hyperplane to the nearest sample. The larger the margin, the lower the classification error. In the corresponding minimization problem, $1/\|w\|$ is the distance from the nearest point to the hyperplane, and the classification plane that produces the lowest $\|w\|^2$ is the optimal hyperplane.
In SVM, the linear and radial basis function (RBF) kernels are frequently used [44,61]; the RBF kernel was used in this study. The input parameters of SVM in the ENVI 5.1 software include "gamma" (γ), "penalty parameter", "pyramid levels", and "classification probability threshold". Since SVM is sensitive to the selection of parameters, cross-validation was used to determine the optimal parameters for the SVM classifier in this study. All of the samples were divided equally into five parts. Each part in turn was used as the validation set, and the remaining parts were used as training samples. Finally, the average of the five classification accuracies was used as the performance index of the classifier. Using the MATLAB platform, the search range of the penalty parameter and gamma (γ) was set as 0.01-32,768. After cross-validation, the optimal penalty parameter (C) was 1024, the optimal gamma parameter (γ) was 0.045, and the cross-validation accuracy was 98.70%. Accordingly, the γ parameter was set to 0.045, the penalty parameter was set to 1024, the pyramid parameter was set to 0, and the classification probability threshold was set to 0.
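The five-fold procedure described above (split the samples into five parts, hold out each part once, and average the five accuracies) can be sketched as follows. This is a minimal, library-free illustration and not the MATLAB workflow used in the study: the function names are hypothetical, the toy data are invented, and a nearest-mean classifier stands in for the SVM so that the block runs with the standard library only. In a parameter search, `cross_validate` would be called once per candidate (C, γ) pair with an SVM training function.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]         # (feature vector, class label)
Predictor = Callable[[Sequence[float]], int]


def k_fold_indices(n_samples: int, k: int = 5) -> List[List[int]]:
    """Split indices 0..n_samples-1 into k interleaved, near-equal folds."""
    return [list(range(start, n_samples, k)) for start in range(k)]


def cross_validate(samples: List[Sample],
                   fit: Callable[[List[Sample]], Predictor],
                   k: int = 5) -> float:
    """Hold out each fold once and return the mean validation accuracy."""
    accuracies = []
    for held_out in k_fold_indices(len(samples), k):
        held = set(held_out)
        train = [s for i, s in enumerate(samples) if i not in held]
        valid = [samples[i] for i in held_out]
        predict = fit(train)
        correct = sum(1 for x, y in valid if predict(x) == y)
        accuracies.append(correct / len(valid))
    return sum(accuracies) / k


def fit_nearest_mean(train: List[Sample]) -> Predictor:
    """Stand-in classifier (NOT an SVM): assign the class with the closest mean."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[y] = counts.get(y, 0) + 1
    means = {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

    def predict(x: Sequence[float]) -> int:
        return min(means, key=lambda y: sum((a - b) ** 2 for a, b in zip(x, means[y])))

    return predict


# Invented, well-separated two-class data (20 samples, 10 per class).
toy = [((float(i % 5), 0.0), 0) for i in range(10)] + \
      [((float(i % 5), 10.0), 1) for i in range(10)]
mean_accuracy = cross_validate(toy, fit_nearest_mean, k=5)
```

The interleaved split keeps both classes present in every fold for this toy data set. Real samples would normally be shuffled or stratified before splitting.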

Decision Fusion
Image fusion is defined as the combination of two or more different images into a new image using a certain algorithm. On the basis of the stage at which fusion happens, image fusion can be divided into three levels: pixel-level, feature-level, and decision-level [62,63]. Decision-level fusion is performed by classifying and identifying each image individually and then combining the results to obtain the optimal decision. In this study, decision-level fusion was used to make full use of the rich spectral information of hyperspectral data and the advantages of airborne LiDAR data for urban object extraction in shadow areas. As expressed in Equation (4), the final classification map is obtained by the fusion of the two maps, $Map_{all}$ and $Map_{shadow}$.
In the above function, $Map_{fusion}$ denotes the final classification map after decision fusion; $Map_{all}$ denotes the classification result obtained using the SVM classifier and the training samples (Table 1); and $Map_{shadow}$ denotes the classification result in shadow areas obtained using the object-based classifier. The decision fusion process was conducted in ArcGIS 10.2 software.
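One plausible reading of this decision fusion is a per-pixel overlay, sketched below. The exact rule of Equation (4) is not reproduced here, so the logic is an assumption: inside the shadow mask the object-based label replaces the SVM label, and pixels that the object-based classifier left unclassified (represented by `None`) keep their SVM label. The function name, the grid representation, and the class names are hypothetical.

```python
from typing import List, Optional

LabelGrid = List[List[Optional[str]]]


def fuse_maps(map_all: LabelGrid,
              map_shadow: LabelGrid,
              shadow_mask: List[List[bool]]) -> LabelGrid:
    """Overlay shadow-area labels on the whole-area SVM map (assumed rule).

    map_all     : labels from the pixel-based SVM classifier, whole study area
    map_shadow  : labels from the object-based classifier (None = unclassified)
    shadow_mask : True where a pixel lies inside a cloud shadow
    """
    fused: LabelGrid = []
    for row_all, row_shadow, row_mask in zip(map_all, map_shadow, shadow_mask):
        fused_row = []
        for label_all, label_shadow, in_shadow in zip(row_all, row_shadow, row_mask):
            # Prefer the object-based label only inside shadows and only if one exists.
            use_shadow = in_shadow and label_shadow is not None
            fused_row.append(label_shadow if use_shadow else label_all)
        fused.append(fused_row)
    return fused


# Invented 2 x 2 example.
map_all = [["road", "grass"], ["building", "tree"]]
map_shadow = [[None, "tree"], ["highway", None]]
shadow_mask = [[False, True], [True, True]]
map_fusion = fuse_maps(map_all, map_shadow, shadow_mask)
```

In a GIS environment the same overlay would typically be expressed as a conditional raster operation on the two classified rasters and the shadow mask.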

The Decision-fusion Result of the Whole Study Area
The nDSM data are relatively consistent and stable across heterogeneous urban areas, so objects with height information can be extracted using the object-based classification method. The resulting classified images of the whole study area are shown in Figure 10. The decision-fusion result of the SVM and object-based (OB) classifiers is much better than that of the SVM classifier alone.

Accuracy Assessment of the Final Decision Fusion Result
To quantify the performance of the proposed method in shadow object extraction, the SVM classification result for the whole study area and the final decision-fusion result were compared. The accuracy assessments of the classification results are shown in Tables 2 and 3. Table 2 indicates that, compared with the SVM classifier alone, the fused SVM and OB classifier improved the overall accuracy by 5.00% (from 87.30% to 92.30%). The error statistics of the accuracy assessment for the 12 classes are listed in Table 3.
Because the kappa statistics of the two classification results were close, the significance of the difference was evaluated using McNemar's test. A total of 14,202 pixels were compared for the accuracy assessment. The non-diagonal cells in the error matrix, which represent incorrectly classified pixels, include 2868 pixels for the SVM classifier and 1473 pixels for the proposed method. In contrast, the diagonal totals, which represent correctly classified pixels, include 12,412 pixels for SVM and 13,109 pixels for the proposed method. A two-by-two contingency matrix was constructed from the correctly and incorrectly classified pixels and then evaluated using McNemar's test. The results indicate a chi-square test statistic of 1304.18, which exceeds the chi-square critical value of 11.07 (alpha = 0.05) [44]. Thus, the superiority of the proposed method over SVM is accepted. As mentioned earlier, the same training and validation samples were used for all of the classification experiments. Therefore, the proposed method (SVM + OB) significantly improves the classification accuracy, which also means that the proposed object-based classification method can effectively extract urban objects in shadow areas with fused hyperspectral and LiDAR data.
As seen in Table 3, all classes reach a high accuracy except the road, railway, highway, parking_lot, and building classes. Compared with the SVM classifier alone, the proposed method improves the AI of trees from 88.71% to 93.49%, the AI of highways from 6.76% to 75.72%, the AI of railways from 56.58% to 78.81%, the AI of grass from 93.38% to 98.17%, and the AI of buildings from 63.64% to 86.08%. In other words, using the proposed method to extract shadow area objects improves the overall accuracy from 87.30% to 92.30% and improves the AI of the tree, highway, railway, grass, and building classes by 4.78%, 68.96%, 22.23%, 4.79%, and 22.44%, respectively.
To better display the performance of the object-based classification, Figure 11 compares the urban objects classified in shadow areas (especially buildings, trees, grass, and highways) by the proposed object-based classification method with those classified by the traditional pixel-based SVM classifier. As shown in Figure 11 (the blue circle), more buildings (especially in residential areas) have been extracted, and the shapes of the buildings are more complete. Figure 11 also shows that it was not possible to extract trees in shadow areas using the pixel-based SVM classifier, whereas the object-based classifier could extract most of the trees in shadow areas. This is mainly because the object-based classifier can make full use of the height information in the nDSM and the spectral information of the NDVI imagery; the same holds for grass. Highways are difficult to extract from hyperspectral imagery, let alone highways in shaded areas. In this study, intensity imagery derived from LiDAR data was also used to extract highways.
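The accuracy measures and the McNemar statistic used in this assessment can be illustrated with a short sketch. Several points are assumptions rather than details taken from the study: the error matrix is taken to have classified labels in rows and reference labels in columns, the McNemar statistic is shown in its common uncorrected form $(b - c)^2 / (b + c)$, and all counts are invented. The accuracy index (AI) is not computed because its definition is not given in this section.

```python
from typing import List

Matrix = List[List[int]]  # rows: classified labels, columns: reference labels (assumed)


def overall_accuracy(matrix: Matrix) -> float:
    """Share of correctly classified samples: diagonal sum over grand total."""
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(len(matrix)))
    return diagonal / total


def users_accuracy(matrix: Matrix) -> List[float]:
    """Per class: correct samples divided by all samples classified as that class."""
    return [matrix[i][i] / sum(matrix[i]) for i in range(len(matrix))]


def producers_accuracy(matrix: Matrix) -> List[float]:
    """Per class: correct samples divided by all reference samples of that class."""
    size = len(matrix)
    column_totals = [sum(matrix[r][c] for r in range(size)) for c in range(size)]
    return [matrix[i][i] / column_totals[i] for i in range(size)]


def mcnemar_chi2(b: int, c: int) -> float:
    """Uncorrected McNemar statistic.

    b: samples correct with classifier 1 but wrong with classifier 2
    c: samples wrong with classifier 1 but correct with classifier 2
    """
    return (b - c) ** 2 / (b + c)


# Invented two-class error matrix and discordant counts.
error_matrix = [[45, 5],
                [10, 40]]
oa = overall_accuracy(error_matrix)
ua = users_accuracy(error_matrix)
pa = producers_accuracy(error_matrix)
chi2 = mcnemar_chi2(b=10, c=30)
```

Only the discordant pairs, that is, the samples on which the two classifiers disagree, enter the McNemar statistic, which is why it can separate two classifications whose overall accuracies look close.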

Discussion
In cloud shadow areas, the spectral information of objects is degraded or missing, making it difficult to extract urban objects. In this study, two different types of remotely sensed data and a proposed classification method were used to extract objects in cloud shadow areas. The advantages of the method are that: (1) the pixel-based SVM classifier is effective for high-dimensional data classification; (2) the object-based classification method is suitable for classifying nDSM data [71]; and (3) decision fusion can fuse different types of data from different sensors, and it is independent of errors in data registration. In this respect, the decision-fusion method is much better than fusion at other levels [72,73]. The object-based classification method uses LiDAR elevation data and shape attributes to separate buildings, trees, and elevated roads. Buildings and trees are above ground, as is part of the highway, so the elevation information of the nDSM was used to separate these objects from the others. Furthermore, although some spectral information was missing, the NDVI of trees and grass was still distinct and valuable for the separation of vegetation and non-vegetation areas. Therefore, the height information of the nDSM, the spectral information of the NDVI, and the flexible parameters of the object-based classification method were used for object extraction from cloud shadows. Table 2 shows that the overall accuracy increases from 87.30% to 92.30%. Table 3 shows increased accuracy for buildings, highways, trees, grass, and railways. Figure 11 also shows the advantage of the object-based classification method in cloud shadow areas using fused hyperspectral-LiDAR data. All of these results indicate the effectiveness of the proposed method.
The results from this study can also be compared with other classifications using fused hyperspectral and LiDAR data. Man et al. [30] fused LiDAR and hyperspectral data in a pixel-level and feature-level fusion strategy, and the overall accuracy of the classification was improved to 94.70%. Because different segmentation methods and spectral band combinations were used, the threshold values differ slightly between the two studies. Luo et al. [50] classified the cloud-shadow area with fused hyperspectral and LiDAR data. Their study mainly focused on the selection of training data in cloud-shadow areas. The proposed method improves the overall accuracy by 5.00% for the whole study area. However, a single segmentation scale parameter is not suitable for all classes, because different urban objects have different heterogeneities. To achieve the best segmentation for different urban objects, it is necessary to find the most suitable segmentation scale for each and to obtain different types of image objects. Therefore, the multi-resolution segmentation method was used for the segmentation and the subsequent classification. In summary, the method proposed in this study is valuable for improving the accuracy of urban land cover classification in cloud shadows.

Conclusions
The aim of this study was to explore the performance of a proposed method for the extraction of objects in shadow areas using fused hyperspectral and LiDAR data. Although previous studies have evaluated the performance of fused hyperspectral and LiDAR data in urban land use classification, to our knowledge, few studies have attempted to explore fused hyperspectral and LiDAR data in shadow areas, especially using the object-based hierarchical classification method. This study combined a pixel-based SVM classifier and an object-based hierarchical classifier to extract urban objects in shadow areas using fused hyperspectral and LiDAR data. The following conclusions can be drawn on the basis of the results.
(1) The proposed method yields better accuracy in urban shadow areas, as confirmed by visual interpretation. The decision fusion results of the SVM and object-based classifiers improve the overall classification accuracy by 5.00% (from 87.30% to 92.30%). The overall accuracy improvement mainly occurs in the extraction of objects in the shadow area. In particular, this was observed for the classes of tree (AI increased from 88.71% to 93.49%), highway (AI increased from 6.76% to 75.72%), railway (AI increased from 56.58% to 78.81%), grass (AI increased from 93.38% to 98.17%), and building (AI increased from 63.64% to 86.08%). This is mainly because of the height information of the LiDAR datasets and the flexibility of the object-based classifier, which were very helpful for separating trees from low vegetation and buildings from roads. Overall, the results from this study suggest that the combination of the pixel-based SVM classifier and the object-based classifier with fused hyperspectral and LiDAR data has considerable potential to achieve high classification accuracy in urban land use classification, especially for urban object extraction in shadow areas.
(2) Compared with the pixel-level fusion of hyperspectral and LiDAR data, the decision-level fusion of pixel- and object-based classifications is very effective for urban object extraction in shadow areas. However, the segmentation threshold values and rules used in this study may not be readily applicable to other urban areas, even when the same type of remotely sensed data is used.
(3) In the future, the object-based classifier will be applied to the whole study area, and its result will be fused at the decision level with that of the pixel-based SVM classifier in order to obtain a better result and further evaluate the performance of the proposed method for the whole study area. Furthermore, more classification algorithms and multi-source remote sensing data will also be considered to further improve the classification results of the shadow areas.

Figure 1. False color composite of hyperspectral imagery (top) and the normalized digital surface model (nDSM) derived from LiDAR data (bottom).

Figure 2. Flowchart of the proposed method.

Figure 3. Hyperspectral imagery in cloud-shadow areas, true color display (left) and false color display (right).

Figure 4. The flowchart of urban object extraction from shadow areas with the fusion of airborne light detection and ranging (LiDAR) and hyperspectral data.

Figure 5. Spectral curve of vegetation in shadow areas. Here, red/green/yellow lines represent max/mean/min values of hyperspectral data, respectively. The longitudinal axis represents radiance, and the transverse axis represents wavelength (nm).

Figure 6. Normalized difference vegetation index (NDVI) imagery of the shadow area (left) and the distribution of vegetation samples in different intervals of the NDVI (right). Here, longitudinal and transverse axes represent the number of samples and NDVI, respectively.

Figure 8. Classified maps in the shadow area obtained from object-based classification (left) and the traditional pixel-based support vector machine (SVM) classifier (right).

Figure 9. The nDSM imagery of the cloud shadow area (left) and the hyperspectral data of the cloud shadow area (right).

Figure 10. The traditional pixel-based SVM classified map of the whole study area (top) and the final decision fusion classification map (bottom).

Figure 11. A comparison of the classified results in the shadow area between the traditional pixel-based SVM classifier (left) and the proposed object-based classification method (right).

Table 1. Details of training and validation samples used for classification.

Table 2. Comparison of overall accuracy of SVM and SVM + OB (object-based) decision-fusion results using fused hyperspectral and LiDAR data. OA means overall accuracy.

Table 3. Classification results at the class level using SVM and SVM + OB classifiers. PA, producer's accuracy; UA, user's accuracy; AI, accuracy index.