Article

Adaptive Unsupervised-Shadow-Detection Approach for Remote-Sensing Image Based on Multichannel Features

1
School of Computer Science, China University of Geosciences, Wuhan 430074, China
2
Artificial Intelligence School, Wuchang University of Technology, Wuhan 430223, China
3
China National Engineering Research Centre for Geographic Information System, Wuhan 430074, China
4
Wuhan Zondy Advanced Technology Institute Co., Ltd., Wuhan 430074, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2022, 14(12), 2756; https://doi.org/10.3390/rs14122756
Submission received: 20 April 2022 / Revised: 2 June 2022 / Accepted: 6 June 2022 / Published: 8 June 2022

Abstract

Shadow detection is an essential research topic in the remote-sensing domain, as the presence of shadow causes the loss of ground-object information in real areas. It is hard to define specific threshold values for the identification of shadow areas with the existing unsupervised approaches due to the complexity of remote-sensing scenes. In this study, an adaptive unsupervised-shadow-detection method based on multichannel features is proposed, which can adaptively distinguish shadow in different scenes. First, new multichannel features were designed in the hue, saturation, and intensity color space, and the shadow properties of high hue, high saturation, and low intensity were considered to solve the insufficient feature-extraction problem of shadows. Then, a dynamic local adaptive particle swarm optimization was proposed to calculate the segmentation thresholds for shadows in an adaptive manner. Finally, experiments performed on the Aerial Imagery dataset for Shadow Detection (AISD) demonstrated the superior performance of the proposed approach in comparison with traditional unsupervised shadow-detection and state-of-the-art deep-learning methods. The experimental results show that the proposed approach can detect the shadow areas in remote-sensing images more accurately and efficiently, with the F index being 82.70% on the testing images. Thus, the proposed approach has better application potential in scenarios without a large number of labeled samples.

1. Introduction

Shadow in remote-sensing images is formed when direct light is totally or partially blocked by ground objects [1,2,3]. Although shadow information can be used to estimate the direction of light sources or the height of buildings [4,5,6], the existence of shadow leads to a partial loss of radiometric information in the blocked areas. Recently, high-resolution and super-resolution remote-sensing images have been applied widely [7]. Shadow makes image interpretation (e.g., object detection and classification) difficult or impossible [8,9,10], especially in high-resolution remote-sensing images. Therefore, the extraction of shadow regions in high-resolution remote-sensing images is an essential step for subsequent image processing, and the accuracy of shadow detection has an important effect on applications such as classification, object detection, and change detection.
The existing shadow-detection methods for remote-sensing images can be divided into two categories: unsupervised shadow detection and supervised shadow detection. Unsupervised shadow detection refers to detecting shadow regions through certain features of shadow, without training samples. The main idea of these methods is to derive parameters from shadow characteristics, such as spectral differences [11,12,13] and geometric features [14,15,16], usually with the help of prior knowledge about the position, height, camera calibration parameters, and orientation of scenes and sensors. The disadvantage of these methods is that the scene-related parameters are hard to obtain, such as the spectrum, intensity, color temperature, camera correction parameters, and geometric characteristics of ground objects (height, angle, etc.). For example, Tian [17] proposed a tricolor attenuation model to extract shadow areas, but the parameters of this model require knowledge of the spectral power distribution of daylight and skylight, and such illumination information is lacking in many scenes. Nakajima [11] used airborne laser scanner data to derive a digital surface model for the purpose of simulating the shadow. However, in many cases, the digital surface model of remote-sensing images is hard to obtain. Therefore, a method that performs well in one situation may perform poorly in other situations because these parameters change. The supervised learning methods, as the name implies, rely on many labeled shadow samples. Deep-learning methods, which can learn the complex nonlinear characteristics of training data, have been widely used for the shadow detection of remote-sensing images in recent years [18,19,20,21,22]. Although these methods can achieve a high shadow-detection accuracy, few labeled shadow samples are available in many practical situations, and labeling shadow samples often incurs substantial labor costs.
An adaptive unsupervised shadow-detection method for remote-sensing images was developed in this study to improve the generalization ability of shadow-detection models and to reduce the dependence on labeled samples.
Specifically, the main contributions of this study can be summarized as follows:
  • Multichannel features are designed on the basis of the hue, saturation, and intensity (HSI) space. The designed multichannel corresponds to the visible spectral bands (400–700 nm) for aerial images. The shadow characteristics of low intensity, high hue, and high saturation are considered to capture the shadow characteristics comprehensively;
  • A dynamic local adaptive particle-swarm-optimization (DLA-PSO) algorithm is proposed to calculate the threshold for determining shadow regions adaptively. The main idea of DLA-PSO is to select the optimal segmentation threshold for multi-channel features.
The structure of this paper is organized as follows. Section 2 reviews the related traditional unsupervised methods and the state-of-the-art deep-learning methods for the shadow detection of remote-sensing images. Section 3 describes the proposed shadow-detection method in detail. Section 4 verifies the rationality of the channel design and the robustness of DLA-PSO qualitatively and quantitatively. Section 5 discusses the parameter settings, advantages, and limitations of the proposed model. Section 6 provides the conclusions.

2. Related Work

A variety of shadow-detection methods have been proposed in the past decades. According to whether labeled shadow samples are required, these methods can be roughly divided into two categories: unsupervised shadow detection and supervised shadow detection.

2.1. Unsupervised Shadow Detection

Unsupervised shadow detection does not require a large amount of prior knowledge and labeled samples. Unsupervised shadow detection can be divided into model-based and property-based methods. The model-based methods require the prior position, height, and orientation knowledge of the scenes and sensors when detecting shadow regions [1,23]. Zhu developed a cloud shadow-detection method [24]. In their study, the illuminating angle and view angle of the sensor are used together to predict shadow locations. Lee detected the shadowed objects by considering the direction of the shadow area toward the regions of light sources [25]. However, model-based methods are limited because acquiring accurate related geometrical parameters is difficult. The property-based methods distinguish between shadow and nonshadow regions by considering properties such as the geometric and spectral characteristics (e.g., the intensity). Compared with model-based methods, the property-based methods require minimal previous information. The property-based methods have been widely used in the existing literature due to the simplicity of the principle and application.
For the property-based methods, two crucial issues (namely, the selection of the shadow property in the color space and the determination of suitable thresholds) should be solved carefully to identify shaded regions. Color can be represented in different 3D spaces (e.g., the red, green, blue (RGB); HSI; hue, saturation, value (HSV); and C1C2C3 color spaces), and one color space can be converted into another through a mathematical transformation. Each color space has special properties and may be suitable for a specific scenario. Yang proposed a normalized blue channel on the basis of the high blue values of shadow regions in the RGB color space [26]. In the normalized blue channel, the shadow regions are mainly concentrated in the high-pixel-value part, but blue ground objects also have high pixel values. The main disadvantage of the RGB color space is its insufficient ability to distinguish between similar colors. Tsai introduced a spectral ratio of hue to intensity (SRHI) to detect shadow in the HSI chromaticity space [27]. However, such a method cannot distinguish shadow from ground objects with high hue values (e.g., dark blue or dark green). The C1C2C3 color space is widely used and is considered the best nonlinear transformation for shadow detection [28]. The selection of the color space affects the shadow-detection performance. Tsai compared the effectiveness of shadow detection in different color spaces and found that the best shadow-detection performance is achieved in the HSI color space [27]. Some researchers have tried to detect shadows by combining multiple color spaces (e.g., RGB and HSI) to utilize the spectral properties of shadows [29]. Previous studies have shown that three different indices for shadow detection based on the HSI color space produce more accurate results than using the HSI color space only.
In summary, the principle and application of property-based methods are simple and require minimal information about the illumination conditions and geometric scenes. The selection of a suitable color space and shadow features is helpful to promote the detection accuracy.
The thresholds in property-based methods can be calculated in a global or local manner. The global thresholds can be determined by using the histogram bimodal method [29], Otsu method [30], and maximum entropy method [31]. The idea of the histogram bimodal method is to treat the valley of the gray histogram as the threshold value. However, the gray distribution of an image does not necessarily fit the bimodal, and the determined threshold cannot describe the shaded regions accurately. The maximum entropy method calculates the threshold by maximizing the information entropy of foreground and background images. The Otsu method determines the optimal threshold by maximizing the interclass variance. However, the global threshold cannot ensure the performance of shadow detection due to the color diversity of ground objects. Commonly used local-threshold-selection methods include the seeded-region-growing and max-flow/min-cut algorithms [32]. The former merges the adjacent pixels with the same attribute to separate the connected region, and the latter minimizes the energy function of the network by constructing a graph cut network. The general idea of these methods is to calculate the threshold of shadows by considering the similarity of neighboring pixels in the local regions of the remote-sensing image. However, the local-threshold-selection methods are often trapped in the local-optimum problem, without considering the characteristic of the whole image.

2.2. Supervised Shadow Detection

Shadow detection can be achieved with the assistance of machine learning or artificial intelligence. The supervised-shadow-detection method can effectively identify the shaded areas by learning features from considerable training data. Lorenzi detected a shadow region from very high-resolution satellite images by using support vector machine [33]. Yuan developed a shadow-region classifier by combining the logistic-regression and conditional-random-field models [34]. Recently, the deep-learning technique has been applied to shadow detection. Deep-learning models, such as convolutional neural networks (CNNs) and graph convolutional networks (GCNs), have shown superiority in semantic segmentation or object recognition [35,36]. Shadow detection from high-resolution remote-sensing images can be regarded as a semantic-segmentation problem. Ronneberger verified the effectiveness of the U-Net network in shadow detection [37]. Badrinarayanan proposed a Segnet framework [38]. The advantage of Segnet is that the decoder performs nonlinear upsampling on its low-resolution input feature, so as to perform semantic segmentation on the target object in the image. Lei designed a recursive-attention-residual (RAR) module and bidirectional feature pyramid network on the basis of a CNN and proposed a BDRAR algorithm to enhance the detected shadow details [39].
Although the supervised-shadow-detection methods can often achieve better accuracy, the model training process depends heavily on labeled samples. On the one hand, ensuring sufficient training samples is difficult for every scenario. On the other hand, labeling the training samples incurs substantial labor costs. Thus, the supervised-shadow-detection method is limited in some application scenarios, especially in small-sample situations.

2.3. Critical Analysis of Existing Work

In summary, the supervised and unsupervised approaches each have advantages and disadvantages. Although the supervised approaches (including deep-learning models) can achieve a high shadow-detection accuracy, few open shadow datasets exist for the remote-sensing field in practice, which seriously limits their shadow-detection performance. Labeling shadow samples on the basis of satellite images (such as DOTA [40], INRIA aerial datasets [41], etc.) often incurs substantial labor costs. Luo [2] publicly released the first Aerial Imagery dataset for Shadow Detection (AISD). This dataset consists of orthorectified color aerial imagery with a spatial resolution of 0.3 m, covering five major regions of the world. Moreover, due to the complexity of remote-sensing object types and geometric structures, shadow areas are often scattered across remote-sensing images with irregular shapes, making it difficult for mainstream deep-learning methods to distinguish complex and diverse features. In addition, when nonlinear features are extracted, the color characteristics of weakly illuminated sunlit areas are similar to those of shadow areas, which also causes false detection. The unsupervised approach requires minimal training samples and can be applied to a wide range of applications. However, it is difficult to capture the shadow characteristics comprehensively and to determine the segmentation threshold for shadow adaptively, given the diverse characteristics of complex ground objects. Thus, objects with similar spectral properties are misjudged. For example, high-hue (green, red) ground objects in nonshadow areas are detected as shadowed areas due to the small difference between the hue and brightness values. Therefore, it is necessary to develop an unsupervised-shadow-detection approach that can capture the shadow characteristics comprehensively and calculate the shadow threshold adaptively.

3. Method

3.1. Research Framework

An adaptive unsupervised-shadow-detection method based on multichannel features is described. First, aerial images in the RGB color space should be converted into the HSI color space because the HSI color space is more consistent with visual characteristics [42]. Considering that Gaussian noise may be produced due to the poor optical elements or high temperature of the sensor in the process of image acquisition, a Gaussian filter is applied after the conversion to remove the Gaussian noise in each channel. The main approach is roughly divided into three parts, as follows: (1) the design of multiple channels to capture shadow characteristics comprehensively; (2) the determination of the segment threshold for each channel on the basis of an optimization method, named DLA-PSO; and (3) regional optimization by applying mathematical morphology. The overview of the research framework is presented in Figure 1.

3.2. Conversion from RGB to HSI Color Space

Firstly, the RGB image should be converted into the HSI color space, which is composed of hue (H), saturation (S), and intensity (I) channels, due to the superiority of the HSI color space in discriminating similar colors. The conversion can be achieved via the following formulas [42]:
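$$I = \frac{1}{3}(R + G + B) \tag{1}$$
$$S = 1 - \frac{3}{R + G + B}\min(R, G, B) \tag{2}$$
$$H = \begin{cases} \theta, & B \le G \\ 360^{\circ} - \theta, & B > G \end{cases} \tag{3}$$
$$\theta = \arccos\left\{ \frac{\frac{1}{2}\left[(R - G) + (R - B)\right]}{\sqrt{(R - G)^2 + (R - B)(G - B)}} \right\} \tag{4}$$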
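The conversion can be illustrated for a single pixel with a minimal sketch. It assumes RGB components scaled to [0, 1] and uses the standard textbook form of the angle θ; the function name `rgb_to_hsi` and the conventions chosen for black and gray pixels (where saturation or hue is undefined) are illustrative assumptions, not part of the paper.

```python
import math

def rgb_to_hsi(r, g, b):
    """Convert one RGB pixel (components in [0, 1]) to (H in degrees, S, I)."""
    total = r + g + b
    intensity = total / 3.0
    # Saturation is undefined for pure black; 0 is used here by convention.
    saturation = 0.0 if total == 0 else 1.0 - 3.0 * min(r, g, b) / total
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0:
        # Gray pixel: hue is undefined; 0 is an arbitrary convention.
        hue = 0.0
    else:
        # Clamp to guard against rounding slightly outside [-1, 1].
        theta = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
        hue = theta if b <= g else 360.0 - theta
    return hue, saturation, intensity
```

Under these conventions, pure red, green, and blue map to hues of about 0°, 120°, and 240°, each with saturation 1 and intensity 1/3.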
Gaussian noise exists in each component channel, and the presence of Gaussian noise affects shadow recognition and threshold selection [43]. Therefore, Gaussian filtering can be used to eliminate the noise in the image. Usually, a 2D Gaussian function can be used to remove noise, which is expressed as follows:
$$h_g(n_1, n_2, \delta) = \frac{1}{2\pi\delta^2}\, e^{-\frac{n_1^2 + n_2^2}{2\delta^2}} \tag{5}$$
$$h(n_1, n_2) = \frac{h_g(n_1, n_2, \delta)}{\sum_{n_1}\sum_{n_2} h_g} \tag{6}$$
where $n_1$ and $n_2$ are the coordinates of pixels, and $\delta$ is the standard deviation of the normal distribution. The size of $\delta$ determines the smoothing effect on the image, and the default value of $\delta$ is usually 0.5 [43]. The normalized kernel $h(n_1, n_2)$ supplies the weights, so each output pixel is the weighted average of its adjacent pixels.
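As a small sketch of the two Gaussian formulas, the kernel can be built and normalized as follows. The 3 × 3 window size is an assumption made for illustration (the text only gives δ = 0.5), and `gaussian_kernel` is a hypothetical helper name.

```python
import math

def gaussian_kernel(size=3, delta=0.5):
    """Return a size x size kernel h(n1, n2) whose weights sum to 1."""
    half = size // 2
    raw = [[math.exp(-(n1 * n1 + n2 * n2) / (2.0 * delta * delta))
            / (2.0 * math.pi * delta * delta)
            for n2 in range(-half, half + 1)]
           for n1 in range(-half, half + 1)]
    total = sum(sum(row) for row in raw)
    # Dividing by the sum keeps the overall image brightness unchanged.
    return [[value / total for value in row] for row in raw]
```

With δ = 0.5 most of the weight sits on the center pixel, so the filter smooths only lightly.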

3.3. Design of Multiple Channels for Shadow Detection

In this paper, the design of multiple channels considers both the properties of shadow and the interference of special remote-sensing features on shadow detection. The design of shadow channels should reflect the characteristics of the shadow region in the HSI color space. Shadow areas are caused by objects blocking the sun’s light; thus, they have low intensity [44]. The saturation in the shadow region is high due to the influence of atmospheric Rayleigh scattering. The Phong illumination model shows that the shaded areas often have high hue values [45].
Figure 2 shows the grayscale maps of multiple channels for the same image in shadow detection. Figure 2a,b illustrates the original image and the true labeled shaded region, and Figure 2c displays the grayscale map for the H channel. As observed in Figure 2c, the areas with high hue (lawns, football fields, etc.) cannot be distinguished from real shadow areas through the hue channel. The intensity value of these high-hue areas is higher than the shadow areas, as shown in the I channel in Figure 2f. On the basis of the characteristics of the shadow areas (i.e., high hue and low brightness), a new channel (namely, the H-I channel) is designed in this study. The rationality of the H-I channel depends on its ability to strengthen the difference between the H and I channels and distinguish the objects with high hue (lawns, football fields, etc.) and the shadow regions from the image. The value of the shadow areas through the H-I channel are higher than the high-hue objects in the nonshadow region, which appears white in Figure 2d and can be easily separated. Dark objects (e.g., black roof or gray roads) with high hue and low intensity are found in the nonshadow areas. They are easily mistaken for shadows through the H-I channel. Therefore, a saturation channel is considered. These dark objects in nonshadow areas (i.e., the misjudged shadows) are usually less saturated and are brighter than real shadow areas. Therefore, the shadow regions detected through the H-I channel are further refined by using the low-saturation and high-intensity constraints. Specifically, the initial shadow-detection results are first obtained through the H-I channel. The initial detected shadow regions are processed in the saturation and intensity channels. The intersection regions of the detected regions in different channels are treated as the final shadow regions. Accurate shadow-detection results can be realized through the designed detection channels and framework. 
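The intersection logic described above can be sketched as follows. This is an illustration under stated assumptions: the three channel images are already computed and given as nested lists, the thresholds `t_hi`, `t_s`, and `t_i` are placeholders for the values found adaptively in the next subsection, and how the H-I channel itself is computed is deliberately left outside the sketch.

```python
def shadow_mask(h_minus_i, s, i, t_hi, t_s, t_i):
    """Mark a pixel as shadow only if it passes all three channel tests.

    Shadow is expected to have a high H-I value, high saturation,
    and low intensity; the final mask is the intersection of the tests.
    """
    rows, cols = len(i), len(i[0])
    return [[h_minus_i[y][x] > t_hi and s[y][x] > t_s and i[y][x] < t_i
             for x in range(cols)]
            for y in range(rows)]
```

A dark roof with a high H-I value but low saturation fails the second test and is therefore dropped from the final mask.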
The ground truth in this paper is the shadow mask provided by manual annotation in the AISD dataset. Owing to the large number of shadows with various types and shapes in the selected aerial images, the shadow masks were carefully annotated by Luo et al. [2]. In recent years, the AISD dataset has been widely used for evaluating shadow-detection performance [2,46]. As shown in Figure 2g, the detected shadow is consistent with the ground truth.

3.4. Threshold Determination Using the DLA-PSO Algorithm

Another issue is how to select the optimal thresholds for each channel (i.e., the H-I, S, and I channels) adaptively and accurately. In this study, a DLA-PSO is proposed on the basis of the traditional Otsu algorithm and PSO. PSO is a classical intelligent optimization algorithm that has been widely used in image registration [47,48,49] and image segmentation [50,51,52]. The advantages of the PSO algorithm include simplicity, few parameters, and fast convergence [53]. The thresholds in shadow detection can be determined by using the PSO algorithm. In the PSO algorithm, an optimization function needs to be set to judge the fitness of the particle before and after the movement. On the basis of the fitness function, the individual optimal value ($pbest_i$) of the particle ($i$) and the global optimal value ($gbest$) can be found. Shadow detection is itself a pixel-level binary image-classification task, and the interclass-variance formula divides the image into foreground and background according to the gray characteristics of the image. For shadow detection, the threshold of each channel therefore uses the interclass variance as the evaluation standard. The maximum interclass variance means the minimum misclassification probability. Given a particle swarm of size $n$, $Q = \{X_1, X_2, \ldots, X_n\}$, the velocity of each particle is randomly set as $V = \{v_1, v_2, \ldots, v_n\}$. The optimal threshold for shadow detection can be obtained by maximizing the interclass variance, which can be calculated as:
$$g = w_0(\mu_0 - \mu)^2 + w_1(\mu_1 - \mu)^2 \tag{7}$$
where $g$ represents the interclass variance, $w_0$ represents the proportion of shadow pixels in the overall image, and $w_1$ represents the proportion of nonshadow pixels in the overall image. $\mu_0$ represents the average gray value of the shadow pixels, $\mu_1$ represents the average gray value of the nonshadow pixels, and $\mu$ represents the average gray value of the whole image, which can be expressed as $\mu = w_0\mu_0 + w_1\mu_1$. Because $w_0 + w_1 = 1$, substituting $\mu$ into Formula (7) gives the simplified formula of the interclass variance:
$$g = w_0 w_1 (\mu_0 - \mu_1)^2 \tag{8}$$
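The simplified interclass variance can be evaluated directly on a gray-level histogram, as in the sketch below. The function name `interclass_variance` and the convention that levels at or below the threshold form one class are assumptions for illustration; which class corresponds to shadow depends on the channel.

```python
def interclass_variance(hist, t):
    """Interclass variance g = w0 * w1 * (mu0 - mu1)^2 for threshold t.

    hist[level] is the pixel count at that gray level; levels <= t form
    one class and levels > t the other.
    """
    total = sum(hist)
    n0 = sum(hist[: t + 1])
    n1 = total - n0
    if n0 == 0 or n1 == 0:
        return 0.0  # one class is empty, so there is no separation
    w0, w1 = n0 / total, n1 / total
    mu0 = sum(level * c for level, c in enumerate(hist[: t + 1])) / n0
    mu1 = sum(level * c for level, c in enumerate(hist) if level > t) / n1
    return w0 * w1 * (mu0 - mu1) ** 2
```

An exhaustive Otsu search is then `max(range(len(hist)), key=lambda t: interclass_variance(hist, t))`; the particle swarm is a way of finding the same maximum without scanning every level.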
To solve the local-optimal problem for the classical PSO algorithm, a DLA-PSO algorithm is proposed in this study. For the k-th iteration, the PSO algorithm changes the velocity and position of the particle in accordance with the velocity-update and position-update formula of the particle (i), which can be expressed as:
$$v_i^k = w v_i^{k-1} + c_1 r_1 (pbest_i - X_i^{k-1}) + c_2 r_2 (gbest^{k-1} - X_i^{k-1}) \tag{9}$$
$$X_i^k = X_i^{k-1} + v_i^k \tag{10}$$
where $c_1$ and $c_2$ are the learning factors, which are usually set to 2, and $r_1$ and $r_2$ are random factors between 0 and 1, which are used to enhance the performance of the PSO algorithm [53,54,55]. In Formula (9), the inertia weight ($w$) determines the influence of prior knowledge on the current velocity and affects the convergence speed of the algorithm. When $w$ is too small, the historical velocity has too little influence on the current velocity, so the particle moves only within a local range and ends in a local optimal solution. However, if $w$ is too large, the historical velocity has a heavy influence, so the particles still move over a wide range in the later period, which leads to nonconvergence. A large $w$ should therefore be specified at the initial stage so that particles can find the optimal value domain as soon as possible without being trapped in a local optimum, and a small $w$ should be given at the later stage so that a stable search speed can be maintained to accurately search for the optimal threshold within the optimal value domain. This balances the global search speed and the local optimization effect. For this reason, a dynamic inertia weight ($w$) is introduced, which is calculated as:
$$w = w_{max} - (w_{max} - w_{min}) \cdot \frac{k}{Max_{iteration}} \tag{11}$$
The dynamic inertia weight changes with the iteration number ($k$), the total number of iterations ($Max_{iteration}$), the maximum weight ($w_{max}$), and the minimum weight ($w_{min}$). $w_{max}$ is set to 0.95, and $w_{min}$ is set to 0.4 [56].
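The linear decay of the inertia weight is easy to check numerically. The following sketch uses the stated values of 0.95 and 0.4 as defaults; the function name `inertia_weight` is an illustrative choice.

```python
def inertia_weight(k, max_iteration, w_max=0.95, w_min=0.4):
    """Inertia weight that decays linearly from w_max (k = 0) to w_min."""
    return w_max - (w_max - w_min) * k / max_iteration
```

For a run of 100 iterations the weight starts at 0.95, passes 0.675 at the midpoint, and approaches 0.4 at the end, so early iterations favor wide exploration and late iterations favor fine local search.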
From Formula (9), it can be found that the classical PSO algorithm adjusts the particle velocity by the individual optimal position ($pbest_i$) and the global optimal position ($gbest^{k-1}$). When the value of $gbest^{k-1} - X_i^{k-1}$ is too large, the particle converges quickly in the early stage, resulting in a local optimum. However, when the value of $pbest_i - X_i^{k-1}$ is too large, nonconvergence will occur in the late searching period because global information is lacking. To alleviate this contradiction, neighborhood particles are considered, and the current iteration and the maximum iteration are combined to design the $\beta$ value. Global PSO and local PSO are then combined to define the velocity-updating formula:
$$v_i^k = (1 - \beta) \cdot GlobalPSO_i + \beta \cdot LocalPSO_i \tag{12}$$
$$GlobalPSO_i = w v_i^{k-1} + c_1 r_1 (pbest_i - X_i^{k-1}) + c_2 r_2 (gbest^{k-1} - X_i^{k-1}) \tag{13}$$
$$LocalPSO_i = w v_i^{k-1} + c_1 r_1 (pbest_i - X_i^{k-1}) + c_3 r_3 (pbest_{i+1} - X_i^{k-1}) \tag{14}$$
$$\beta = \frac{k}{Max_{iteration}} \tag{15}$$
$$X_i^k = X_i^{k-1} + v_i^k \tag{16}$$
where Formula (13) represents the velocity updating on the basis of the global PSO, and Formula (14) represents the velocity updating on the basis of the optimization of particles in the local neighborhood.
The particle-velocity-update formula ($v_i^k$) is divided into four parts: $w v_i^{k-1}$ is the product of the particle velocity in the (k − 1)-th iteration and the dynamic inertia weight, which is regarded as the prior knowledge part of the particle ($i$); $pbest_i - X_i^{k-1}$ is the local perception part, which is the distance between the current position of the particle ($i$) and its own individual optimal position, and it reflects the self-cognition of the particle ($i$); $gbest^{k-1} - X_i^{k-1}$ is the global perception part, which is the distance between the global optimal position and the current position of the particle, and it reflects the global cognition of the particle ($i$); $pbest_{i+1} - X_i^{k-1}$ is the neighborhood perception part, which is the distance between the optimal position of the neighborhood particle and the current position of the particle, and it reflects the communication and information sharing between the particle ($i$) and its companions. Thus, at the beginning of the iterations, $gbest^{k-1}$ is more suitable for searching the approximate range of optimal thresholds. At the same time, the existence of nearby particles can also limit the local-optimal problem caused by excessive velocity. In addition, more consideration is given to the influence of adjacent particles and current particles in the late search process, so that the optimal threshold can be accurately found within the rough range determined in the early stage to avoid the problem of nonconvergence. The framework of the DLA-PSO algorithm is illustrated in Algorithm 1.
Algorithm 1: Framework of the DLA-PSO algorithm
Input: Designed channels (H-I, I, S); n; Max_iteration
Output: gbest
  • Initialization: $c_1 = 2$, $c_2 = 2$, $w_{max} = 0.95$, $w_{min} = 0.4$; $r_1, r_2 = \mathrm{rand}()$; $v^0 = 2$;
  • while ($k < Max_{iteration}$) do
  • Calculate the fitness function according to the Otsu algorithm:
  • $pbest_i^k = w_0 w_1 (\mu_0 - \mu_1)^2$;
  • Set the optimal particle values $pbest_i^k$ and $gbest^k$ in the k-th iteration:
  • $pbest_i^k = \{pbest_1^k, pbest_2^k, \ldots, pbest_n^k\}$;
  • $gbest^k = \max(pbest_i^k)$;
  • for i = 1 to n do
  • if ($g(pbest_i^k) > g(pbest_i)$) then
  • $pbest_i = pbest_i^k$;
  • if ($g(gbest^k) > g(gbest)$) then
  • $gbest = gbest^k$;
  • Update $X_i^k$ and $v_{DLA\text{-}PSO}^{k}$:
  • $v_{DLA\text{-}PSO}^{k} = (1 - \beta) \cdot GlobalPSO_i + \beta \cdot LocalPSO_i$;
  • $X_i^k = X_i^{k-1} + v_{DLA\text{-}PSO}^{k}$;
  • $k \leftarrow k + 1$;
  • return gbest.
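The loop in Algorithm 1 can be sketched for a single channel as follows. This is an illustrative reading rather than the authors' implementation, and several details are assumptions not fixed by the text: particles are one-dimensional threshold positions on a gray-level histogram, $c_3$ is set to 2 and $r_3$ is drawn like $r_1$ and $r_2$, the neighbor of particle $i$ is particle $i + 1$ with wrap-around, positions are clamped to the valid gray-level range, and the random generator is seeded only so that the sketch is repeatable.

```python
import random

def otsu_fitness(hist, t):
    """Interclass variance w0 * w1 * (mu0 - mu1)^2 for levels <= t vs. > t."""
    total = sum(hist)
    n0 = sum(hist[: t + 1])
    n1 = total - n0
    if n0 == 0 or n1 == 0:
        return 0.0
    mu0 = sum(level * c for level, c in enumerate(hist[: t + 1])) / n0
    mu1 = sum(level * c for level, c in enumerate(hist) if level > t) / n1
    return (n0 / total) * (n1 / total) * (mu0 - mu1) ** 2

def dla_pso_threshold(hist, n=10, max_iteration=50, seed=0):
    """Search a histogram threshold with the blended global/local update."""
    rng = random.Random(seed)      # seeded for repeatability (assumption)
    top = len(hist) - 1
    c1 = c2 = c3 = 2.0             # c3 = 2 is an assumption
    w_max, w_min = 0.95, 0.4

    def fit(pos):
        return otsu_fitness(hist, int(round(pos)))

    x = [rng.uniform(0, top) for _ in range(n)]
    v = [2.0] * n                  # initial velocity of 2, as in Algorithm 1
    pbest = list(x)
    gbest = max(pbest, key=fit)
    for k in range(max_iteration):
        w = w_max - (w_max - w_min) * k / max_iteration   # dynamic inertia
        beta = k / max_iteration                          # local share grows
        for i in range(n):
            r1, r2, r3 = rng.random(), rng.random(), rng.random()
            shared = w * v[i] + c1 * r1 * (pbest[i] - x[i])
            global_pso = shared + c2 * r2 * (gbest - x[i])
            local_pso = shared + c3 * r3 * (pbest[(i + 1) % n] - x[i])
            v[i] = (1 - beta) * global_pso + beta * local_pso
            x[i] = min(max(x[i] + v[i], 0.0), float(top))  # stay in range
            if fit(x[i]) > fit(pbest[i]):
                pbest[i] = x[i]
        # gbest is refreshed once per iteration, matching gbest^(k-1) above.
        gbest = max(pbest + [gbest], key=fit)
    return int(round(gbest))
```

Calling `dla_pso_threshold(hist)` on a bimodal histogram should return a level in the valley between the two modes, which is the same answer an exhaustive Otsu scan would give on such data.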

3.5. Regional Optimization

The initial shadow-detection results can be obtained by using the proposed DLA-PSO algorithm. However, small low-brightness objects (e.g., black vehicles on the road) in nonshadow areas or bright ground objects (e.g., white vehicles) in shadow regions are often misjudged. Two types of regional optimization are adopted in the proposed approach to further improve the shadow-detection precision. First, the connected regions in the shadow-segmentation results are computed to pick out the low-brightness objects in the nonshadow area. To eliminate these small objects, a spatial lower limit for the area of a region should be given. In this paper, the spatial lower limit is set according to the spatial resolution of the experimental datasets and the size of vehicles in real life. The spatial resolution of the AISD dataset is 0.3 m, meaning that one pixel in the image represents an actual area of 0.09 square meters. In real life, the length of a vehicle is about 4–6 m, and the width is about 2 m. Thus, a car occupies about 90–130 pixels in the AISD dataset, and the spatial lower limit is set as 130 for regional optimization. Second, bright ground objects in shadow (e.g., white vehicles) are often mistaken for sunlit areas. Because white vehicles are small in the remote-sensing image, they appear as small holes in the detected shaded area, as shown in Figure 3d. The closing operation of mathematical morphology is applied to the detected shadow areas to fill these holes.
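The area limit follows from simple arithmetic: a 4 m × 2 m vehicle covers 8 / 0.09 ≈ 89 pixels and a 6 m × 2 m vehicle covers 12 / 0.09 ≈ 133 pixels. A minimal sketch of the small-region removal is given below; the use of 4-connectivity, the strict "smaller than the limit" comparison, and the function name `remove_small_regions` are assumptions made for illustration.

```python
from collections import deque

def remove_small_regions(mask, min_area=130):
    """Clear 4-connected regions of 1-pixels whose area is below min_area."""
    rows, cols = len(mask), len(mask[0])
    out = [list(row) for row in mask]
    seen = [[False] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            if mask[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first search collects one connected region.
            region, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) < min_area:
                for ry, rx in region:
                    out[ry][rx] = 0
    return out
```

The default of 130 matches the limit stated for the 0.3 m AISD imagery; for data with another resolution the limit would need to be rescaled by the pixel area.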
Mathematical morphology is widely used in image processing, especially in edge extraction and image segmentation [57]. Its two basic operations are dilation and erosion. Small holes can be filled by using the closing operation, which is dilation followed by erosion. For a target set $A$ and a structuring element $V$, the operations are expressed as:
$$A \oplus V = \{\sigma \mid (\hat{V})_\sigma \cap A \neq \varnothing\} \tag{17}$$
$$A \ominus V = \{\sigma \mid (V)_\sigma \subseteq A\} \tag{18}$$
$$A \bullet V = (A \oplus V) \ominus V \tag{19}$$
Formulas (17)–(19) represent the dilation, erosion, and closing operations, respectively. Dilation enlarges the boundary of the target set ($A$), and erosion shrinks the boundary of the target set ($A$). By definition, the closing operation of mathematical morphology can fill small holes and eliminate small, narrow cracks and voids while keeping the shapes and positions of objects unchanged. Therefore, the closing operation is adopted to eliminate the small holes in the detected shadow that are caused by bright small objects in the shadow area, as illustrated in Figure 3e.
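A compact sketch of the three operations on a binary mask is shown below. It assumes a 3 × 3 square structuring element (the text does not state which element is used) and pads the mask with a one-pixel border of zeros before closing so that regions touching the image edge are handled consistently; the function names are illustrative.

```python
def _get(mask, y, x):
    """Pixel value, with everything outside the image treated as 0."""
    if 0 <= y < len(mask) and 0 <= x < len(mask[0]):
        return mask[y][x]
    return 0

def _apply(mask, combine):
    """Combine each pixel's 3 x 3 neighborhood with `any` or `all`."""
    return [[1 if combine(_get(mask, y + dy, x + dx)
                          for dy in (-1, 0, 1) for dx in (-1, 0, 1)) else 0
             for x in range(len(mask[0]))]
            for y in range(len(mask))]

def dilate(mask):
    return _apply(mask, any)   # 1 if any neighbor is 1

def erode(mask):
    return _apply(mask, all)   # 1 only if every neighbor is 1

def closing(mask):
    """Dilation followed by erosion, computed on a zero-padded copy."""
    cols = len(mask[0])
    padded = ([[0] * (cols + 2)]
              + [[0] + list(row) + [0] for row in mask]
              + [[0] * (cols + 2)])
    closed = erode(dilate(padded))
    return [row[1:-1] for row in closed[1:-1]]
```

Applied to a 3 × 3 shadow block with a single missing center pixel, `closing` fills the hole and leaves the outline of the block where it was.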

4. Experiment

The indices of the accuracy, precision, recall, and F-measure are introduced to make the comparison reliable and to evaluate these methods quantitatively, which are formulated as:
$$ACC = \frac{TP + TN}{TP + TN + FP + FN}$$
$$Precision = \frac{TP}{TP + FP}$$
$$Recall = \frac{TP}{TP + FN}$$
$$F_{Measure} = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall}$$
where TP and TN refer to the pixels that are correctly detected as shadow and nonshadow, respectively. FP represents the nonshadow pixels that are mistakenly detected as shadow, and FN represents the shadow pixels that are missed. ACC is adopted as the accuracy metric, and the F-measure is an aggregate indicator that combines precision and recall, ranging from 0 to 1.
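A minimal sketch of how these four indices can be computed from a predicted mask and a ground-truth mask is given below. It follows the usual confusion-matrix convention (FP: nonshadow pixels labeled as shadow; FN: shadow pixels labeled as nonshadow). The function name and the flat 0/1 input format are assumptions made for the example.

```python
def shadow_metrics(pred, truth):
    """Accuracy, precision, recall and F-measure for flat 0/1 sequences (1 = shadow)."""
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    # Guard against empty denominators (e.g., no pixel predicted as shadow).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Toy example: 2 true positives, 1 false positive, 1 false negative, 4 true negatives.
scores = shadow_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 1, 0, 0, 0, 0])
```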

4.1. Rationality of Channel Design

This section first discusses the rationality of the channel design on shadow detection. A remote-sensing image of Innsbruck is adopted as the original image in Figure 4a. Hue, saturation, and intensity serve as single-channel detection in the HSI color space in Figure 4b–d. However, the characteristics of the single channel ignore the important shadow attributes of other channels. For example, in Figure 4d, only the intensity of shadows is considered and, as a result, the playground with low intensity is mistaken as shadow. Shadow detection is often interfered with by high-hue ground objects (grass, etc.), which are of high hue and high intensity in the illumination area but of high hue and low intensity in the shadow area. As shown in Table 1, the F value of the single channel (hue, saturation, and intensity) is less than 0.8, resulting in a poor comprehensive detection performance. Therefore, the rationality of the H-I channel depends on its ability to strengthen the difference between the H and I channels and distinguish the objects with high hue (lawns, football fields, etc.) and the shadow regions from the image, as shown in Figure 4e. As can be seen from Figure 4c,d, some ground objects with low intensity and high saturation in the sunlit area will also cause interference in shadow detection, and so shadow detection that combines the saturation channel and intensity is considered to obtain better results. After determining the designed channels, the order of the H-I, S, and I channels should be considered in the next step. Figure 4f shows the shadow-detection results of a nonordered HSI channel design. As shown in Table 1, the F value of the nonordered HSI channel is 0.8061. Although a nonordered design can detect most of the shadow regions, the boundary of the shadow areas is blurred and irregular. The proposed DLA-PSO, based on multichannel features, has an F value of 0.9386, as shown in Figure 4g and Table 1. 
Comparing the performances of the single-channel and multichannel designs shows that the channel design of the DLA-PSO method achieves a more accurate shadow-detection result.

4.2. Robust Evaluation of DLA-PSO

In this part, the robustness of the proposed approach is evaluated on a public remote-sensing dataset. The popular AISD dataset is used as the benchmark to compare the performances of different shadow-detection methods. The dataset includes 514 remote-sensing images with a spatial resolution of 0.3 m, covering five different cities (namely, Austin, Vienna, Innsbruck, Chicago, and Tyrol). The AISD dataset can be obtained from https://github.com/RSrscoder/AISD (accessed on 1 March 2022). The remote-sensing images in the AISD dataset contain only RGB bands. However, the proposed method can be extended in future work by adding other bands (e.g., the near-infrared band) [58] to detect shadows.
For the shadow-detection methods, the proposed approach is compared with traditional unsupervised methods and the latest deep-learning methods. The traditional methods include the SRHI, normalized-blue (NB), histogram-threshold-detection (HTD), and C1C2C3 space detection (C1C2C3) methods. Three representative deep-learning algorithms are selected (namely, the U-Net, Segnet, and BDRAR methods). The U-Net model adopts an encoder–decoder structure and a skip-connection design so as to retain more image details and achieve good results in shadow detection [37,59,60]. Compared with U-Net, the Segnet model restores the feature map to high resolution in the upsampling process, thus preserving the location of the target feature, and finally achieving shadow detection [38,61]. The BDRAR model adds an attention residual module and a bidirectional pyramid network to the CNN framework. This model fully mines the global and local context information in different layers of the CNN encoder, so that the features of shadow samples are extracted more comprehensively [39].
Considering that each city has different illumination characteristics and urban architectural features, five representative urban images were selected to analyze the performances of the different methods qualitatively and quantitatively. In each of Figures 5–9, panel (a) shows the original image and panel (j) shows the ground truth.

4.2.1. Comparative Experimental Analysis on Austin Dataset

Figure 5b–e shows the results of the traditional shadow-detection methods. The C1C2C3 method fails to detect most of the shadow regions, and buildings and trees cannot be distinguished from their shadows, as shown in Figure 5e. In Figure 5b,d, the SRHI and HTD methods wrongly detect the dark roofs as shadow areas. Although the NB method can separate roofs from shadow regions, vegetation is still mistaken for shadow in Figure 5c. This analysis shows that the traditional methods tend to falsely detect sunlit dark objects as shadow and shadowed bright objects as nonshadow because they rely heavily on shadow properties.
Among the deep-learning methods, U-Net, Segnet, and BDRAR detect most of the shadow regions correctly, without significant errors. However, many sunlit objects (vehicles, etc.) are mixed into the results of the U-Net and Segnet methods, which reduces their accuracy, as shown in Figure 5f,g. The BDRAR method achieves better results and eliminates the errors caused by vehicles, but the boundaries of its shadow areas are blurred and irregular, resulting in some errors in the details, and narrow shadows are not effectively recognized, as shown in Figure 5h. This is because deep-learning methods have limited ability to learn the complex ground objects in remote-sensing images. The method proposed in this paper considers both the shadow attributes and the characteristics of easily confused objects (such as high-hue grassland and high-saturation roads) and designs multiple channels accordingly, so as to detect shadows accurately and better retain the boundary characteristics. The DLA-PSO method precisely detects the shadow regions, avoids the interference of vegetation, vehicles, and roofs, and extracts clear and regular shadow boundaries. According to the quantitative indicators in Table 2, the accuracy of the DLA-PSO method reaches 96%, which is higher than that of the other compared methods. Its F-measure, the comprehensive indicator, is also the best among all methods, with a score of 87.7%. In summary, the shadow results detected by the proposed DLA-PSO method are more consistent with the ground truth in Austin City.

4.2.2. Comparative Experimental Analysis on Vienna Dataset

Although shadow areas can be detected by the traditional methods in Figure 6b–e, some dark roofs are still wrongly detected as shadows, leading to a low accuracy and F value. For the deep-learning methods, the detection results of the U-Net and Segnet methods in Figure 6f,g are mixed with additional interference (dark cars, trees) due to the complex features in Vienna. The building outlines produced by the BDRAR method are smooth, without sharp edges or corners, resulting in some errors in the details, as shown in Figure 6h. As shown in Figure 6i and Table 3, the proposed DLA-PSO method detects the shadow area more accurately, with an accuracy of 95.1%. The comprehensive evaluation indicator (F) reaches 0.8873, which is the highest score among all the methods. The proposed DLA-PSO method thus achieves excellent shadow-detection results for Vienna.

4.2.3. Comparative Experimental Analysis on Innsbruck Dataset

As shown in the results in Figure 7b,d,e, the SRHI, HTD, and C1C2C3 methods wrongly detect the dark green roofs as shadow, leading to a low accuracy and F value. U-Net and Segnet mistakenly recognize the sunlit dark roofs in the upper part of the figure as shadow in Figure 7f,g. Although the NB and BDRAR methods can well avoid the interference of sunlit dark ground objects, the F value in Table 4 shows that the comprehensive performance of shadow detection is low, and the accuracy is slightly lower than the DLA-PSO method. Compared with deep-learning methods, the proposed DLA-PSO method has a slight improvement in the comprehensive performance of detection, with a score of 85.45%. The shadow-detection accuracy has the highest score: up to 93.78%. The proposed DLA-PSO method has the best effect on remote-sensing shadow detection in Innsbruck City.

4.2.4. Comparative Experimental Analysis on Chicago Dataset

As shown in the results in Figure 8b,d,e, the SRHI, HTD, and C1C2C3 methods detect the dark road surface and lawns in the upper left corner as shadow. The shadow results of the U-Net and Segnet methods mix with additional interferences (sunlit objects), as shown in Figure 8f,g. However, the BDRAR method loses small ground objects in Figure 8h. As observed in Figure 8i and Table 5, the proposed DLA-PSO method can accurately identify shadow regions, with the highest accuracy (94.9%) and F value (0.7849) among all the methods.

4.2.5. Comparative Experimental Analysis on Tyrol Dataset

As shown in Figure 9b,d,e, the SRHI, HTD, and C1C2C3 methods wrongly recognize lawns as shadow due to the similar spectral ratios between the vegetation and the shadow in the Tyrol dataset. For deep-learning methods, the results of the U-Net and Segnet methods identify small sunlit areas by mistake, thereby greatly reducing the detection accuracy, as shown in Figure 9f,g. The NB and BDRAR methods can detect shadow regions well from the original image. However, the proposed DLA-PSO method has a detection accuracy of 97.31% and an F value of 0.8646, which are higher than other methods, as shown in Table 6. Therefore, the proposed method has the best detection performance in Tyrol City.
A boxplot for 50 testing images is introduced to show the F-value distributions of the DLA-PSO method and comparative approaches to verify the robustness of the proposed DLA-PSO. A boxplot contains five indicators, which are the minimum, lower quartile, median, upper quartile, and maximum. The advantage of the boxplot is that it excludes the interference of outliers and depicts the dispersion distribution of the data stably. The region between the lower quartile and upper quartile determines the box size. The smaller the region, the more concentrated the F value.
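The five boxplot indicators can be computed as in the sketch below. The F values used here are made-up illustrative numbers, not results from this paper, and the "inclusive" quartile convention is an assumption, since the paper does not state which convention was used.

```python
import statistics


def five_number_summary(values):
    """Minimum, lower quartile, median, upper quartile, maximum and box size."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "iqr": q3 - q1,   # box size: a smaller value means more concentrated F values
    }


# Hypothetical F values for five test images (illustration only).
summary = five_number_summary([0.70, 0.75, 0.80, 0.85, 0.90])
```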
The average F values of the different shadow-detection methods are shown in Table 7. The F values of the SRHI, HTD, and C1C2C3 methods are less than 65%, whereas those of deep-learning methods such as U-Net, Segnet, and BDRAR are above 70%. Deep-learning methods therefore perform better than traditional methods. The F value of the proposed DLA-PSO is 82.70%, which is higher than those of all the compared methods. Although the average F value of DLA-PSO is only 0.35% higher than that of the U-Net method, the U-Net method requires training samples in advance, whereas the proposed method is an adaptive unsupervised method that can quickly and accurately identify the shadow area. Most importantly, the average F values in Table 7 need to be analyzed together with the boxplot of the F-value distribution in Figure 10. For the DLA-PSO method, both the box (the region between the lower and upper quartiles) and the range between the maximum and minimum are the smallest, indicating the strongest robustness. The box of DLA-PSO is also positioned the highest, demonstrating superior shadow-detection performance. By contrast, the U-Net method has a larger fluctuation range than the DLA-PSO method, and its lowest value reaches 0.58, indicating weaker robustness. The above analysis indicates that the DLA-PSO method is robust and performs well in shadow detection.
From the above analysis, the proposed DLA-PSO method is consistent with the real shadow areas. The accuracy and F value of the proposed method are the highest compared with the other methods under five different urban scenes. From the boxplot, the DLA-PSO method performs the best among the compared methods.

4.2.6. Comparative Experimental Analysis on Vegetation Application

In this part, the vegetation images of the Kitsap district and San Francisco are selected to verify the performance of the proposed method for shadow detection in comparison with the existing methods in both qualitative and quantitative ways. The visualization results of the shadow detection on the vegetation images are shown in Figure 11 and Figure 12.
Figure 11b–e and Figure 12b–e show the results of the traditional shadow-detection methods. Vegetation in the same area can show slight color differences due to the light, soil type, vegetation type, and many other factors. Some darker vegetation can be mistaken for shadow, as shown in Figure 11b,d and Figure 12b,d. Secondly, as shadows of vegetation are complex, some small shadows just fall on the light-colored vegetation, resulting in misjudgment, as shown in Figure 11c,e and Figure 12c,e. In conclusion, traditional methods are not effective for vegetation shadow detection. Figure 11f–h and Figure 12f–h show the results of the deep-learning methods for vegetation shadow detection. Compared with traditional methods, the detection accuracy of deep-learning methods is greatly improved. However, the U-Net and Segnet methods also have false detection, as shown in Figure 11f,g and Figure 12f,g. The reason is that these models are not capable of learning irregular shadows of remote-sensing images. The BDRAR model can improve the performance of shadow detection due to the addition of the bidirectional pyramid mechanism. However, some small vegetation shadows will be mistaken as nonshadow in Figure 11h and Figure 12h, which is caused by the insufficient learning ability of these models with regard to small target shadows. The proposed method in this paper can realize the detection of small shadows, and the detected shadows are closer to the real shadows of vegetation visually, as shown in Figure 11i and Figure 12i.
The quantitative results of the vegetation shadow detection in the Kitsap district and San Francisco are shown in Table 8 and Table 9. The performances of the traditional methods in the vegetation application are generally low, with F values lower than 70% in most cases. The deep-learning methods perform better than the traditional methods: their accuracy is above 90%, and their F value is around 80% in most cases. In particular, due to its fine network design, the BDRAR model outperforms the U-Net and Segnet methods, with the F value reaching 84%. The proposed method has the highest accuracy of 97% and a superior performance, with the F value close to 90%. Moreover, the shadows detected by the proposed DLA-PSO method are more consistent with the ground truth in the vegetation application. In summary, the proposed method has excellent performance in vegetation shadow detection.

5. Discussion

5.1. Parameter Settings

The numbers of particles and iterations influence the determination of thresholds in the PSO algorithm. In the DLA-PSO algorithm, the position of a particle represents a candidate segmentation value, and the optimal segmentation value is the threshold that separates the shadow regions from the nonshadow regions. Therefore, the possible particle positions, and hence the reasonable number of particles, range between 0 and 255. If the number of particles is too small, the DLA-PSO algorithm may fall into a local optimum, whereas an excessive number of particles reduces the computational efficiency. The maximum-iteration parameter determines the number of particle movements and, together with the number of particles, determines the computational efficiency. In practice, both parameters should be chosen by balancing the computational efficiency against the detection precision. With the maximum number of iterations set to 100 and the initial velocity (v0) set to 2, a comparison of the precision and efficiency under different parameters shows that a particle number in the range of 20 to 30 is reasonable.
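To make the roles of these parameters concrete, the sketch below searches for a threshold with a plain global-best PSO whose fitness is the interclass (between-class) variance used in OTSU thresholding. It is not the DLA-PSO algorithm itself: the dynamic local adaptive mechanism is omitted, and the inertia weight `w`, the learning factors `c1` and `c2`, the random seed, and the 256-bin single-channel histogram input are assumptions made for illustration. Only the particle count, iteration count, and initial velocity follow the values quoted in the text.

```python
import random


def between_class_variance(hist, t):
    """Interclass variance when gray levels are split into [0, t] and [t + 1, 255]."""
    total = sum(hist)
    n0 = sum(hist[: t + 1])
    n1 = total - n0
    if n0 == 0 or n1 == 0:
        return 0.0
    mean0 = sum(g * hist[g] for g in range(t + 1)) / n0
    mean1 = sum(g * hist[g] for g in range(t + 1, len(hist))) / n1
    return (n0 / total) * (n1 / total) * (mean0 - mean1) ** 2


def pso_threshold(hist, n_particles=20, max_iter=100, v0=2.0,
                  w=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain global-best PSO over thresholds; w, c1, c2 and seed are assumed values."""
    rng = random.Random(seed)
    top = len(hist) - 1

    def fitness(x):
        return between_class_variance(hist, int(x))

    # Particle positions are candidate thresholds in [0, 255].
    pos = [rng.uniform(0, top) for _ in range(n_particles)]
    vel = [rng.choice((-v0, v0)) for _ in range(n_particles)]
    pbest = pos[:]
    pbest_fit = [fitness(x) for x in pos]
    best = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[best], pbest_fit[best]

    for _ in range(max_iter):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            # Step toward the particle's own best and the swarm's best position.
            vel[i] = (w * vel[i] + c1 * r1 * (pbest[i] - pos[i])
                      + c2 * r2 * (gbest - pos[i]))
            pos[i] = min(max(pos[i] + vel[i], 0.0), float(top))
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i], fit
    return int(gbest)
```

With 20 particles and 100 iterations, the fitness is evaluated about 2000 times per channel, which illustrates why both parameters directly affect the computing time.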
In this study, the computational efficiency of the DLA-PSO algorithm with the proposed parameters is compared with that of the other methods. Five different urban datasets from AISD were selected as the benchmark, and the average computing times are shown in Table 10. The average computing time of the DLA-PSO method is the shortest among all the compared methods, which verifies the computational performance of the proposed DLA-PSO under these parameters.

5.2. Shadow/Nonshadow Contrast Effect

In this section, we show that shadow/nonshadow contrast may affect the performance of shadow detection. Shadow/nonshadow contrast is defined as:
$$R = \frac{I_{shadow}}{I_{sunlit}}$$
where $I_{shadow}$ and $I_{sunlit}$ represent the average brightness of the shadow and sunlit areas, respectively. The value of R will affect the performance of shadow detection. A large R value means that the difference between shadow and nonshadow is small, which makes shadow detection difficult. This study selects different shadow/nonshadow contrast images from the AISD dataset to explore the relationship between the contrast ratio (R) and the shadow-detection performance (F value). The result is shown in Figure 13.
It can be concluded from Figure 13 that, when the R value is in the range of 0.17–0.51, the F value remains around 0.9, and the shadow-detection performance stays at a high level. However, when R exceeds 0.51, F drops by 20% to around 0.75. In future experiments, the maximum shadow/nonshadow contrast should therefore be limited to 0.51 to keep the shadow-detection performance at a high level.
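The contrast ratio R can be computed from an intensity image and a shadow mask, as in the minimal sketch below. The function name, the flat-sequence input format, and the use of a ground-truth mask to separate the two areas are assumptions made for the example.

```python
def shadow_contrast(intensity, shadow_mask):
    """R = mean shadow intensity / mean sunlit intensity (flat sequences, 1 = shadow)."""
    shadow = [v for v, m in zip(intensity, shadow_mask) if m == 1]
    sunlit = [v for v, m in zip(intensity, shadow_mask) if m == 0]
    if not shadow or not sunlit:
        raise ValueError("both shadow and sunlit pixels are required")
    return (sum(shadow) / len(shadow)) / (sum(sunlit) / len(sunlit))


# Toy example: mean shadow brightness 50, mean sunlit brightness 200.
r_value = shadow_contrast([40, 60, 180, 220], [1, 1, 0, 0])
```

In this toy example R is 0.25, which lies inside the 0.17–0.51 range where the text reports a stable F value.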

5.3. Advantage and Limitation

In this study, multiple channels (i.e., H-I, S, I) were designed to capture the shadow characteristics comprehensively. The rationality of the H-I channel depends on its ability to strengthen the difference between the H and I channels and distinguish the objects with high hue (lawns, football fields, etc.) and the shadow regions from the image. The influence of the combination order of different channels is studied to improve the performance of shadow detection. The DLA-PSO method is proposed by combining the OTSU and PSO algorithms to calculate the threshold for determining shadow regions adaptively. The experimental results on different urban scenes prove that the proposed approach can achieve a good shadow-detection performance, without a large number of labeled samples. Traditional methods use unsupervised ways to achieve shadow detection, but an adaptive unsupervised method, proposed in this paper, shows a great improvement in shadow detection compared with them. Shadow-detection methods based on the CNN, U-Net, and Segnet frameworks have been highly recommended in recent years. Their advantage lies in the use of the supervised learning method to automatically learn the nonlinear characteristics of training samples so as to achieve a high accuracy of shadow detection. However, the effectiveness of deep-learning models depends on training samples, to a large extent, and the learning ability of shadow details is insufficient. The accuracy of the proposed method in this paper is slightly improved compared with the deep-learning method, but the robustness of the method is higher than the current deep-learning method (as illustrated in Figure 10). In conclusion, the proposed algorithm shows advantages in detection precision and computational efficiency compared with seven mainstream algorithms.
The proposed algorithm has some limitations. The learning factors (i.e., c1 and c2 in Formula (9)) in the DLA-PSO method must be manually specified by users. They affect the step size of particle i in the directions of the personal best ($pbest_i$) and the global best ($gbest$) positions, and hence the performance of the DLA-PSO algorithm. Following common practice, c1 and c2 were fixed as constants, and their influence on the shadow-detection results was not investigated. The performance of the proposed algorithm is also influenced by the fitness function. In this study, the interclass variance is set as the fitness function of the DLA-PSO algorithm. However, such a fitness function does not guarantee the best shadow-detection performance in all situations. In future studies, other fitness functions may be adopted to improve the robustness of the proposed algorithm.

6. Conclusions

In this study, an adaptive unsupervised-shadow-detection approach for remote-sensing images, based on multichannel features, was developed. We designed multiple channels for shadow detection in accordance with the shadow characteristics in the HSI color space. The designed combination of multiple channels can capture the shadow characteristics comprehensively and can eliminate the interference of high-hue and dark-colored ground objects (e.g., lawns, black roofs, and soccer fields) in remote-sensing images. For the optimal threshold selection, a new DLA-PSO method is proposed to find the optimal threshold accurately and quickly. Local regions are optimized through small-area removal and the morphological closing operation. The experimental results on different urban scenes show that the proposed approach can detect shadow areas more accurately and efficiently, with the F index being 82.70% on the testing images. Moreover, in the vegetation application, the proposed method has the highest accuracy of 97% and a superior performance, with the F value close to 90%. The qualitative and quantitative comparisons with traditional and deep-learning methods verify the superiority and stability of the proposed method in shadow detection. However, the shadow-detection method proposed in this study is mainly based on the visible spectral features of aerial images. In future work, the addition of an infrared spectral band and the redesign of the shadow-detection channels can be considered. Furthermore, future work will focus on the model transferability of the proposed shadow-detection method. Remote-sensing images with different spatial resolutions or complex land-cover types should be used to verify the robustness of the proposed DLA-PSO method. Shadow removal and image interpretation that incorporate spectral and spatial information will also be a focus of future research.

Author Contributions

Conceptualization of the experiments: Z.H. and Z.Z.; methodology: Z.Z. and M.G.; writing—original draft preparation: Z.H. and Z.Z.; writing—review and editing: Z.Z., L.W. and Y.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by National Natural Science Foundation of China (NO. U21A2013; NO. 42171408; NO. 41971356) and the Open Fund of Key Laboratory of Urban Land Resources Monitoring and Simulation, the Ministry of Natural Resources (NO. KF-2020-05-011).

Data Availability Statement

The data are available in a publicly accessible repository. The data presented in this study are openly available at https://github.com/RSrscoder/AISD (accessed on 1 April 2020), reference number [2].

Acknowledgments

The authors would like to sincerely thank all the funders.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Li, Z.; Shen, H.; Li, H.; Xia, G.; Gamba, P.; Zhang, L. Multi-feature combined cloud and cloud shadow detection in GaoFen-1 wide field of view imagery. Remote Sens. Environ. 2017, 191, 342–358.
  2. Luo, S.; Li, H.; Shen, H. Deeply supervised convolutional neural network for shadow detection based on a novel aerial shadow imagery dataset. ISPRS J. Photogramm. Remote Sens. 2020, 167, 443–457.
  3. Adeline, K.; Chen, M.; Briottet, X.; Pang, S.; Paparoditis, N. Shadow detection in very high spatial resolution aerial images: A comparative study. ISPRS J. Photogramm. Remote Sens. 2013, 80, 21–38.
  4. Shettigara, V.K.; Sumerling, G.M. Height determination of extended objects using shadows in SPOT images. Photogramm. Eng. Remote Sens. 1998, 64, 35–44.
  5. Chen, Y.; Wen, D.; Jing, L.; Shi, P. Shadow information recovery in urban areas from very high resolution satellite imagery. Int. J. Remote Sens. 2007, 28, 3249–3254.
  6. Irvin, R.B.; McKeown, D.M. Methods for exploiting the relationship between buildings and their shadows in aerial imagery. IEEE Trans. Syst. Man Cybern. 1989, 19, 1564–1575.
  7. Xu, Y.; Luo, W.; Hu, A.; Xie, Z.; Xie, X.; Tao, L. TE-SAGAN: An Improved Generative Adversarial Network for Remote Sensing Super-Resolution Images. Remote Sens. 2022, 14, 2425.
  8. Guo, R.; Dai, Q.; Hoiem, D. Single-image shadow detection and removal using paired regions. In CVPR 2011; IEEE: Piscataway, NJ, USA, 2011; pp. 2033–2040.
  9. Khan, S.H.; Bennamoun, M.; Sohel, F.; Togneri, R. Automatic Shadow Detection and Removal from a Single Image. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 38, 431–446.
  10. Das, S.; Aery, A. A Review: Shadow Detection and Shadow Removal from Images. Int. J. Eng. Trends Technol. 2013, 4, 1764–1767.
  11. Nakajima, T.; Tao, G.; Yasuoka, Y. Simulated recovery of information in shadow areas on IKONOS image by combing ALS data. In Proceedings of the Asian Conference on Remote Sensing, ACRS 2002, Kathmandu, Nepal, 25–29 November 2002; pp. 1–7.
  12. Finlayson, G.; Fredembach, C.; Drew, M.S. Detecting Illumination in Images. In Proceedings of the IEEE International Conference on Computer Vision, Rio De Janeiro, Brazil, 14–21 October 2007; IEEE: Piscataway, NJ, USA, 2007; pp. 1–8.
  13. Makarau, A.; Richter, R.; Muller, R.; Reinartz, P. Adaptive Shadow Detection Using a Blackbody Radiator Model. IEEE Trans. Geosci. Remote Sens. 2011, 49, 2049–2059.
  14. Salvador, E.; Cavallaro, A.; Ebrahimi, T. Cast shadow segmentation using invariant color features. Comput. Vis. Image Underst. 2004, 95, 238–259.
  15. Yao, J.; Zhang, Z.M. Hierarchical shadow detection for color aerial images. Comput. Vis. Image Underst. 2006, 102, 60–69.
  16. Wang, Q.; Yan, L.; Yuan, Q.; Ma, Z. An Automatic Shadow Detection Method for VHR Remote Sensing Orthoimagery. Remote Sens. 2017, 9, 469.
  17. Tian, J.; Jing, S.; Tang, Y. Tricolor Attenuation Model for Shadow Detection. IEEE Trans. Image Process. 2009, 18, 2355–2363.
  18. Jiang, H.; Peng, M.; Zhong, Y.; Xie, H.; Hao, Z.; Lin, J.; Ma, X.; Hu, X. A Survey on Deep Learning-Based Change Detection from High-Resolution Remote Sensing Images. Remote Sens. 2022, 14, 1552.
  19. Li, D.; Wang, S.; Xiang, S.; Li, J.; Yang, Y.; Tang, X.-S. Dual-stream shadow detection network: biologically inspired shadow detection for remote sensing images. Neural Comput. Appl. 2022, 34, 10039–10049.
  20. Zhou, T.; Fu, H.; Sun, C.; Wang, S. Shadow Detection and Compensation from Remote Sensing Images under Complex Urban Conditions. Remote Sens. 2021, 13, 699.
  21. Han, H.; Han, C.; Lan, T.; Huang, L.; Hu, C.; Xue, X. Automatic Shadow Detection for Multispectral Satellite Remote Sensing Images in Invariant Color Spaces. Appl. Sci. 2020, 10, 6467.
  22. Yue, G.; Xing, Z. Building Shadow Detection of Remote Sensing Images Based on Shadow Probability Constraint. Laser Optoelectron. Prog. 2018, 55, 041006.
  23. Luo, S.; Shen, H.; Li, H.; Chen, Y. Shadow removal based on separated illumination correction for urban aerial remote sensing images. Signal Process. 2019, 165, 197–208.
  24. Zhu, Z.; Woodcock, C.E. Object-based cloud and cloud shadow detection in Landsat imagery. Remote Sens. Environ. 2012, 118, 83–94.
  25. Lee, G.-B.; Lee, M.-J.; Lee, W.-K.; Park, J.-H.; Kim, T.-H. Shadow Detection Based on Regions of Light Sources for Object Extraction in Nighttime Video. Sensors 2017, 17, 659.
  26. Yang, J.; Zhao, Z. Shadow processing method based on normalized RGB color model. Opto-Electron. Eng. 2007, 34, 92–96.
  27. Tsai, V.J.D. A comparative study on shadow compensation of color aerial images in invariant color models. IEEE Trans. Geosci. Remote Sens. 2006, 44, 1661–1671.
  28. Sarabandi, P.; Yamazaki, F.; Matsuoka, M.; Kiremidjian, A. Shadow detection and radiometric restoration in satellite high resolution images. Geoscience & Remote Sensing Symposium. IGARSS Proceedings. IEEE Int. 2004, 6, 3744–3747.
  29. Bao, H.; Yan, L. Research on shadow detection and shadow elimination methods for urban aerial images. Remote Sens. Inf. 2010, 1, 44–47.
  30. Liu, J.Z.; Li, W.Q. Automatic thresholding of gray-level pictures via two-dimensional OTSU method. Zidonghua Xuebao/Acta Autom. Sin. 1993, 19, 101–105.
  31. Kapur, J.N.; Sahoo, P.K.; Wong, A.K.C. A new method for gray-level picture thresholding using the entropy of the histogram. Comput. Vis. Graph. Image Process. 1985, 29, 140.
  32. Shi, J.; Malik, J. Normalized Cuts and Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 888–905.
  33. Lorenzi, L.; Melgani, F.; Mercier, G. A Complete Processing Chain for Shadow Detection and Reconstruction in VHR Images. IEEE Trans. Geosci. Remote Sens. 2012, 50, 3440–3452.
  34. Yuan, X.; Ebner, M.; Wang, Z. Single-image shadow detection and removal using local colour constancy computation. Image Processing IET 2015, 9, 118–126.
  35. Long, J.; Shelhamer, E.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 39, 640–651.
  36. Xu, Y.; Jin, S.; Chen, Z.; Xie, X.; Hu, S.; Xie, Z. Application of a graph convolutional network with visual and semantic features to classify urban scenes. Int. J. Geogr. Inf. Sci. 2022, 1–26.
  37. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015; Springer International Publishing: Cham, Switzerland, 2015; pp. 234–241.
  38. Badrinarayanan, V.; Kendall, A.; Cipolla, R. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495.
  39. Zhu, L.; Deng, Z.; Hu, X.; Fu, C.-W.; Xu, X.; Qin, J.; Heng, P.-A. Bidirectional Feature Pyramid Network with Recurrent Attention Residual Modules for Shadow Detection. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; Springer International Publishing: Cham, Switzerland, 2018; pp. 122–137.
  40. Ding, J.; Xue, N.; Xia, G.-S.; Bai, X.; Yang, W.; Yang, M.; Belongie, S.; Luo, J.; Datcu, M.; Pelillo, M.; et al. Object Detection in Aerial Images: A Large-Scale Benchmark and Challenges. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 1.
  41. Maggiori, E.; Tarabalka, Y.; Charpiat, G.; Alliez, P. Can semantic labeling methods generalize to any city? In Proceedings of the 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Fort Worth, TX, USA, 23–28 July 2017; pp. 3226–3229.
  42. Gonzalez, R.C.; Woods, R.E. Digital image processing. IEEE Trans. Acoust. Speech Signal Processing 1980, 28, 484–486.
  43. Kotecha, J.H.; Djuric, P.M. Gaussian particle filtering. IEEE Trans. Signal Processing 2003, 51, 2592–2601.
  44. Liu, J.; Tao, F.; Li, D. Shadow Detection in Remotely Sensed Images Based on Self-Adaptive Feature Selection. IEEE Trans. Geosci. Remote Sens. 2011, 49, 5092–5103.
  45. Polidorio, A.M.; Flores, F.C.; Imai, N.N.; Tommaselli, A.M.G.; Franco, C. Automatic shadow segmentation in aerial color images. In Proceedings of the 16th Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI 2003), Sao Carlos, Brazil, 12–15 October 2003; IEEE: Piscataway, NJ, USA, 2003; pp. 270–277.
  46. Liu, D.; Zhang, J.; Wu, Y.; Zhang, Y. A Shadow Detection Algorithm Based on Multiscale Spatial Attention Mechanism for Aerial Remote Sensing Images. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5.
  47. Di, Z.; Wang, H.; Xu, W. Adaptive particle swarm optimization for medical image registration. In Proceedings of the 2011 International Conference on Electrical and Control Engineering, Yichang, China, 16–18 September 2011.
  48. Mandal, D.; Chatterjee, A.; Maitra, M. Robust medical image segmentation using particle swarm optimization aided level set based global fitting energy active contour approach. Eng. Appl. Artif. Intell. 2014, 35, 199–214.
  49. Ahmed, S.A.; Ghali, N.I.; Hassanien, A.E. Optimize the correspondence using Particle Swarm Optimization for medical image registration. In Proceedings of the International Conference on Hybrid Intelligent Systems, Gammarth, Tunisia, 4–6 December 2013.
  50. Manikantan, K.; Arun, B.V.; Yaradoni, D. Optimal Multilevel Thresholds based on Tsallis Entropy Method using Golden Ratio Particle Swarm Optimization for Improved Image Segmentation. Procedia Eng. 2012, 30, 364–371.
  51. Yetirajam, M.; Jena, P.K. Enhanced Color Image Segmentation of Foreground Region using Particle Swarm Optimization. Int. J. Comput. Appl. 2012, 28, 283–287.
  52. Li, H.; He, H.; Wen, Y. Dynamic particle swarm optimization and K-means clustering algorithm for image segmentation. Opt.–Int. J. Light Electron Opt. 2015, 126, 4817–4822.
  53. Kennedy, J.; Eberhart, R. Particle Swarm Optimization. In Proceedings of the ICNN’95–International Conference on Neural Networks, Perth, Australia, 27 November–1 December 1995; Volume 4, pp. 1942–1948.
  54. Trelea, I.C. The particle swarm optimization algorithm: Convergence analysis and parameter selection. Inf. Processing Lett. 2003, 85, 317–325.
  55. Zhang, L.P.; Yu, H.J.; Hu, S.X. Optimal choice of parameters for particle swarm optimization. J. Zhejiang Univ. Sci. 2005, 6, 528–534.
  56. Shi, Y.; Eberhart, R.C. Empirical study of particle swarm optimization. In Proceedings of the 1999 Congress on Evolutionary Computation-CEC99 (Cat. No. 99TH8406), Washington, DC, USA, 6–9 July 1999.
  57. Sandi, D. Mathematical morphology in image analysis. In Proceedings of the XI Conference on Applied Mathematics, PRIM’96, Budva, Montenegro, 1 June 1996.
  58. Zhang, Z.; Lin, H.; Wang, M.; Liu, X.; Chen, Q.; Wang, C.; Zhang, H. A Review of Satellite Synthetic Aperture Radar Interferometry Applications in Permafrost Regions: Current Status, Challenges, and Trends. IEEE Geosci. Remote Sens. Mag. 2022, 2–23.
  59. Dong, Y.; Feng, H.; Xu, Z.; Chen, Y.; Li, Q. Attention Res-Unet: An efficient shadow detection algorithm. J. Zhejiang Univ. (Eng. Sci.) 2019, 53, 373–381.
  60. Jin, Y.; Xu, W.; Hu, Z.; Jia, H.; Luo, X.; Shao, D. GSCA-UNet: Towards Automatic Shadow Detection in Urban Aerial Imagery with Global-Spatial-Context Attention Module. Remote Sens. 2020, 12, 2864.
  61. Chai, D.; Newsam, S.; Zhang, H.K.; Qiu, Y.; Huang, J. Cloud and cloud shadow detection in Landsat imagery based on deep convolutional neural networks. Remote Sens. Environ. 2019, 225, 307–316.
Figure 1. Process of shadow-detection method for remote-sensing images.
Figure 2. (a) Original Image; (b) ground truth; (c) H channel; (d) H-I channel; (e) S channel; (f) I channel; (g) shadow-detection results.
Figure 3. (a) Original image; (b) ground truth; (c) initial detection chart; (d) small-area-removal diagram; (e) mathematical morphology closed-operation diagram.
Figure 4. (a) Original image; (b) hue channel; (c) saturation channel; (d) intensity channel; (e) H-I channel; (f) nonordered HSI; (g) DLA-PSO.
Figure 5. (a) Original image; (b) SRHI; (c) NB; (d) HTD; (e) C1C2C3; (f) U-Net method; (g) Segnet method; (h) BDRAR method; (i) DLA-PSO method; (j) ground truth.
Figure 6. (a) Original image; (b) SRHI; (c) NB; (d) HTD; (e) C1C2C3; (f) U-Net method; (g) Segnet method; (h) BDRAR method; (i) DLA-PSO method; (j) ground truth.
Figure 7. (a) Original image; (b) SRHI; (c) NB; (d) HTD; (e) C1C2C3; (f) U-Net method; (g) Segnet method; (h) BDRAR method; (i) DLA-PSO method; (j) ground truth.
Figure 8. (a) Original image; (b) SRHI; (c) NB; (d) HTD; (e) C1C2C3; (f) U-Net method; (g) Segnet method; (h) BDRAR method; (i) DLA-PSO method; (j) ground truth.
Figure 9. (a) Original image; (b) SRHI; (c) NB; (d) HTD; (e) C1C2C3; (f) U-Net method; (g) Segnet method; (h) BDRAR method; (i) DLA-PSO method; (j) ground truth.
Figure 10. Boxplot of the F-score distributions of different shadow-detection methods.
Figure 11. Vegetation shadow detection in Kitsap district: (a) original image; (b) SRHI; (c) NB; (d) HTD; (e) C1C2C3; (f) U-Net method; (g) Segnet method; (h) BDRAR method; (i) DLA-PSO method; (j) ground truth.
Figure 12. Vegetation shadow detection in San Francisco district: (a) original image; (b) SRHI; (c) NB; (d) HTD; (e) C1C2C3; (f) U-Net method; (g) Segnet method; (h) BDRAR method; (i) DLA-PSO method; (j) ground truth.
Figure 13. Relationship between shadow/nonshadow contrast (R) and F value.
Table 1. Comparative evaluation for designed channels (single channels and multiple channels).
| Metric | H | S | I | H-I | Nonordered HSI | DLA-PSO |
|---|---|---|---|---|---|---|
| Precision | 0.6596 | 0.4966 | 0.3373 | 0.9158 | 0.9862 | 0.9393 |
| Recall | 0.9563 | 0.7063 | 0.9911 | 0.9404 | 0.6816 | 0.9379 |
| F | 0.7807 | 0.5832 | 0.5033 | 0.9279 | 0.8061 | 0.9386 |
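The F values in Table 1 can be cross-checked against the reported precision and recall. The sketch below assumes that F is the usual harmonic mean of precision and recall, F = 2PR / (P + R); that definition is not stated in this section, and the helper name `f_measure` is purely illustrative. The numbers are copied from Table 1.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of F)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing is detected
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F) for each column of Table 1
table1 = {
    "H": (0.6596, 0.9563, 0.7807),
    "S": (0.4966, 0.7063, 0.5832),
    "I": (0.3373, 0.9911, 0.5033),
    "H-I": (0.9158, 0.9404, 0.9279),
    "Nonordered HSI": (0.9862, 0.6816, 0.8061),
    "DLA-PSO": (0.9393, 0.9379, 0.9386),
}

for name, (p, r, reported) in table1.items():
    computed = f_measure(p, r)
    # Report the recomputed value next to the published one
    print(f"{name:15s} computed F = {computed:.4f}, reported F = {reported:.4f}")
```

Under this assumption, the recomputed values agree with the reported F row to within rounding, which also confirms that the flattened table cells were split at the right positions.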
Table 2. Austin shadow-test-evaluation index.
| Metric | SRHI | NB | HTD | C1C2C3 | U-Net | Segnet | BDRAR | DLA-PSO |
|---|---|---|---|---|---|---|---|---|
| Precision | 0.5801 | 0.6687 | 0.6061 | 0.3228 | 0.8046 | 0.7383 | 0.9391 | 0.9646 |
| Recall | 0.8623 | 0.8771 | 0.9718 | 0.7739 | 0.9482 | 0.9585 | 0.6903 | 0.8042 |
| F-Measure | 0.6936 | 0.7589 | 0.7466 | 0.4556 | 0.8705 | 0.8341 | 0.7957 | 0.8771 |
| Accuracy | 0.8774 | 0.9103 | 0.8938 | 0.7023 | 0.9546 | 0.9386 | 0.9429 | 0.9606 |
Table 3. Vienna shadow-test-evaluation index.
| Metric | SRHI | NB | HTD | C1C2C3 | U-Net | Segnet | BDRAR | DLA-PSO |
|---|---|---|---|---|---|---|---|---|
| Precision | 0.8362 | 0.8571 | 0.6852 | 0.7323 | 0.865 | 0.82 | 0.976 | 0.9348 |
| Recall | 0.7663 | 0.7708 | 0.9445 | 0.7467 | 0.9072 | 0.918 | 0.7512 | 0.8443 |
| F-Measure | 0.7997 | 0.8116 | 0.7942 | 0.7394 | 0.8865 | 0.8662 | 0.849 | 0.8873 |
| Accuracy | 0.9123 | 0.9183 | 0.8881 | 0.8797 | 0.9464 | 0.9374 | 0.9389 | 0.951 |
Table 4. Innsbruck shadow-test-evaluation index.
| Metric | SRHI | NB | HTD | C1C2C3 | U-Net | Segnet | BDRAR | DLA-PSO |
|---|---|---|---|---|---|---|---|---|
| Precision | 0.4041 | 0.8775 | 0.6573 | 0.4195 | 0.8282 | 0.8116 | 0.9799 | 0.9036 |
| Recall | 0.6552 | 0.6759 | 0.9413 | 0.7335 | 0.8529 | 0.8879 | 0.6104 | 0.8105 |
| F-Measure | 0.4999 | 0.7636 | 0.7741 | 0.5337 | 0.8403 | 0.848 | 0.7522 | 0.8545 |
| Accuracy | 0.7072 | 0.9065 | 0.8773 | 0.7137 | 0.9276 | 0.935 | 0.9102 | 0.9378 |
Table 5. Chicago shadow-test-evaluation index.
| Metric | SRHI | NB | HTD | C1C2C3 | U-Net | Segnet | BDRAR | DLA-PSO |
|---|---|---|---|---|---|---|---|---|
| Precision | 0.3937 | 0.8642 | 0.5299 | 0.519 | 0.6805 | 0.5894 | 0.9048 | 0.9215 |
| Recall | 0.8422 | 0.7043 | 0.9436 | 0.6908 | 0.8841 | 0.9268 | 0.6326 | 0.6837 |
| F-Measure | 0.5365 | 0.7761 | 0.6787 | 0.5927 | 0.7691 | 0.7205 | 0.7446 | 0.7849 |
| Accuracy | 0.8075 | 0.9462 | 0.8818 | 0.8744 | 0.9298 | 0.9049 | 0.9426 | 0.949 |
Table 6. Tyrol shadow-test-evaluation index.
| Metric | SRHI | NB | HTD | C1C2C3 | U-Net | Segnet | BDRAR | DLA-PSO |
|---|---|---|---|---|---|---|---|---|
| Precision | 0.1946 | 0.9888 | 0.2479 | 0.2125 | 0.8788 | 0.7646 | 0.9682 | 0.9822 |
| Recall | 0.7872 | 0.7298 | 0.9934 | 0.7301 | 0.7176 | 0.7507 | 0.7238 | 0.7721 |
| F-Measure | 0.3121 | 0.8398 | 0.3968 | 0.3292 | 0.7880 | 0.7576 | 0.8284 | 0.8646 |
| Accuracy | 0.6146 | 0.9691 | 0.6646 | 0.6696 | 0.9357 | 0.9200 | 0.9667 | 0.9731 |
Table 7. Average shadow-detection F values of different shadow-detection methods.
| Index | SRHI | NB | HTD | C1C2C3 | U-Net | Segnet | BDRAR | DLA-PSO |
|---|---|---|---|---|---|---|---|---|
| F | 58.04% | 74.52% | 64.55% | 53.65% | 82.35% | 78.15% | 72.15% | 82.70% |
Table 8. Evaluation index for vegetation shadow in Kitsap district.
| Metric | SRHI | NB | HTD | C1C2C3 | U-Net | Segnet | BDRAR | DLA-PSO |
|---|---|---|---|---|---|---|---|---|
| Precision | 0.4425 | 0.9998 | 0.6055 | 0.8889 | 0.7446 | 0.6962 | 0.9543 | 0.8856 |
| Recall | 0.6665 | 0.6905 | 0.9101 | 0.3254 | 0.8824 | 0.9549 | 0.7607 | 0.9664 |
| F-Measure | 0.5319 | 0.8169 | 0.7272 | 0.4764 | 0.8076 | 0.8053 | 0.8465 | 0.9243 |
| Accuracy | 0.8241 | 0.9536 | 0.8976 | 0.8931 | 0.9391 | 0.9314 | 0.9613 | 0.9732 |
Table 9. Evaluation index for vegetation shadow in San Francisco district.
| Metric | SRHI | NB | HTD | C1C2C3 | U-Net | Segnet | BDRAR | DLA-PSO |
|---|---|---|---|---|---|---|---|---|
| Precision | 0.5711 | 0.9436 | 0.4586 | 0.5482 | 0.6997 | 0.584 | 0.9323 | 0.8665 |
| Recall | 0.8397 | 0.447 | 0.9986 | 0.488 | 0.9412 | 0.9729 | 0.7208 | 0.9091 |
| F-Measure | 0.6798 | 0.6066 | 0.6286 | 0.5164 | 0.8026 | 0.7299 | 0.813 | 0.8873 |
| Accuracy | 0.9103 | 0.9342 | 0.8697 | 0.899 | 0.9583 | 0.9301 | 0.963 | 0.9745 |
Table 10. Comparison of average computing times (in seconds) on different urban datasets.
| Method | Austin Dataset | Vienna Dataset | Innsbruck Dataset | Chicago Dataset | Tyrol Dataset |
|---|---|---|---|---|---|
| DLA-PSO | 1.51 | 1.65 | 1.39 | 1.59 | 1.07 |
| SRHI | 7.9 | 5.41 | 2.3 | 8.45 | 1.73 |
| NB | 6.4 | 3.92 | 1.85 | 6.23 | 1.48 |
| HTD | 5.94 | 2.65 | 1.67 | 6.95 | 1.68 |
| C1C2C3 | 7.91 | 5.46 | 3 | 7.15 | 2.26 |
| U-Net | 5.28 | 4.17 | 4.64 | 4.1 | 2.02 |
| Segnet | 10.41 | 5.1 | 4.6 | 4.53 | 2.05 |
| BDRAR | 11.76 | 4.36 | 3.03 | 4.32 | 3.7 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

He, Z.; Zhang, Z.; Guo, M.; Wu, L.; Huang, Y. Adaptive Unsupervised-Shadow-Detection Approach for Remote-Sensing Image Based on Multichannel Features. Remote Sens. 2022, 14, 2756. https://doi.org/10.3390/rs14122756