Article

Multiscale Change Detection Domain Adaptation Model Based on Illumination–Reflection Decoupling

1 School of Automation, Northwestern Polytechnical University, Xi’an 710129, China
2 Shaanxi Provincial Innovation Center for Geology and Intelligent Remote Sensing Application, Xi’an 710129, China
3 School of Aeronautics and Astronautics, Fudan University, Shanghai 200433, China
4 China Association for Science and Technology Service Center for Societies, Beijing 100038, China
5 School of Marine Science and Technology, Northwestern Polytechnical University, Xi’an 710129, China
* Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(5), 799; https://doi.org/10.3390/rs16050799
Submission received: 15 January 2024 / Revised: 8 February 2024 / Accepted: 14 February 2024 / Published: 25 February 2024
(This article belongs to the Section AI Remote Sensing)

Abstract

In the change detection (CD) task, the substantial variation in feature distributions across different CD datasets significantly limits the reusability of supervised CD models. To alleviate this problem, we propose an illumination–reflection decoupled change detection multi-scale unsupervised domain adaptation model, referred to as IRD-CD-UDA. IRD-CD-UDA maintains its performance on the original dataset (source domain) and improves its performance on unlabeled datasets (target domain) through a novel CD-UDA structure and methodology. IRD-CD-UDA synergizes mid-level global feature marginal distribution domain alignment, classifier layer feature conditional distribution domain alignment, and an easy-to-hard sample selection strategy to increase the generalization performance of CD models on cross-domain datasets. Extensive experiments conducted on the LEVIR, SYSU, and GZ optical remote sensing image datasets demonstrate that the IRD-CD-UDA model effectively mitigates feature distribution discrepancies between source and target CD data, thereby achieving optimal recognition performance on unlabeled target domain datasets.

1. Introduction

Remote sensing image change detection (CD) involves extracting features from multi-temporal homogeneous and heterogeneous images to identify and analyze variations in terrestrial instances within the same geographic area [1]. The ability of deep neural networks to map nonlinear feature spaces is an effective approach for extracting high-dimensional hidden features and minimizing interference. As a result, deep learning models have become increasingly popular in CD tasks [2,3,4,5].
However, differences in imaging modes, spatial resolution, spectral irradiance, and noise perturbations among change detection (CD) datasets can lead to covariate shifts [6]. Furthermore, the heterogeneity of multi-temporal data distributions can hinder the generalization capabilities of CD models [7]. Unfortunately, CD models based on supervised deep learning are highly dependent on the feature distribution of the training data. This leads to significant performance degradation when models trained on labeled datasets (referred to as the source domain) are tested on unlabeled datasets (referred to as the target domain). To improve model robustness, it is necessary to continuously integrate annotated data with a variety of feature distributions. However, annotating remote sensing image CD data is a costly and labor-intensive process. Furthermore, the rapid deployment of models on unannotated data with different feature distributions for urgent tasks is impractical.
In the field of CD, methods to improve performance on unlabeled target domain data primarily include fine-tuning strategies based on prior knowledge and unsupervised domain adaptation (UDA). Fine-tuning strategies typically include pre-training, difference mapping, and post-processing techniques [8,9,10]. These strategies require a substantial amount of annotated data from the source domain to train the model, and do not directly reduce the disparity in feature distribution between different data domains.
For related but not identical source and target domain data, an effective way to improve the recognition performance of a CD model on target domain data is UDA, referred to here as CD-UDA. CD-UDA reduces the distributional differences between the two data domains by extracting invariant feature representations shared by the source and target domains [11]. For CD models, one line of research uses image translation models to reduce the visual difference between the source and target domains and thereby improve performance on the target domain. The main methods include CycleGAN [12,13], attention-gate-based GANs [14], and multiple discriminators [15]. However, image-to-image translation methods may unintentionally alter the image content, resulting in visual inconsistencies between the original and translated images. Furthermore, this problem is exacerbated by the lack of constraints between pairs of bi-temporal CD data.
Therefore, employing domain feature alignment strategies to achieve CD-UDA is an important research direction. However, the implementation of domain feature alignment in CD-UDA faces several challenges, as follows:
  • Lack of a specific UDA framework for CD. Unlike other per-pixel segmentation models, CD models are bi-stream models designed specifically for bi-temporal data. The differential features generated by CD models have difficulty accurately representing bi-temporal data. In addition, the complexity of land cover in CD data leads to poor intra-class consistency in the features generated by CD models, increasing the difficulty of feature alignment;
  • Imbalance in CD samples. The number of samples in unaltered areas significantly exceeds the number of samples in altered areas. This imbalance can cause CD-UDA models to converge to local optima, incorrectly identifying changed areas as unchanged. This in turn leads to the generation of incorrect pseudo-labels for the target domain.
To address the aforementioned problems, we propose a novel illumination–reflection decoupled change detection domain adaptation (IRD-CD-UDA) framework. The IRD-CD-UDA is able to achieve illumination–reflection feature decoupling and multi-scale feature fusion, effectively reducing the global discrepancies in multi-temporal remote sensing images. Tailored to the specifics of CD-UDA, we have developed a domain alignment strategy for mid-layer global feature marginal distributions, a domain alignment strategy for classifier-layer feature conditional distributions, and an easy-to-hard sample selection strategy based on CD map probability thresholds. The novel contributions of this paper can be summarized as follows:
  • A high-performance feature extraction model with illumination–reflection feature decoupling (IRD) and fusion is proposed. The model consists of a module for extracting low-frequency illumination features, a module for extracting high-frequency reflection features, a module for fusing high-frequency and low-frequency features, and a module for decoding difference features. The IRD model can achieve high-performance supervised CD by decoupling the strongly coupled illumination and reflection features, and provides a backbone model for the CD-UDA strategy;
  • A global marginal distribution domain alignment method for middle layer illumination features is proposed. Utilizing the IRD model to extract shared illumination features from multi-temporal data, this method represents bi-temporal data in both source and target domains, promoting global style alignment and stable model convergence between the domains;
  • A conditional distribution domain alignment method for deep features is designed. By minimizing intra-domain disparities and maximizing inter-domain differences, this approach alleviates the issue of fitting features of different classes into the same feature space, due to the poor intra-class consistency of differential features;
  • An easy-to-hard sample selection strategy based on the CD entropy threshold is proposed. Aiming to obtain reliable pseudo-labels with balanced high confidence, this strategy reduces the transfer failure caused by sample imbalance and promotes stable convergence in CD-UDA.
The remainder of this paper is organized as follows. Section 2 details the basic principles and the latest method of domain adaptation; Section 3 introduces the IRD-CD-UDA model, including the algorithm framework and domain adaptation strategies. Section 4 outlines the experiments conducted using the proposed algorithm on three datasets and discusses the experimental results of the study. The paper is concluded in Section 5.

2. Domain Adaptation

Feature alignment-based UDA methods are an important research direction for CD-UDA. In DA, the domain $D$ is the subject of learning and contains data samples (samples: $x_i \in X$, labels: $y_i \in Y$). The domain with a large number of labeled samples is usually called the source domain $D_s = \{x_i, y_i\}_{i=1}^{n_s}$, and the domain to which the model needs to transfer is the target domain $D_t = \{x_j, y_j\}_{j=1}^{n_t}$. The feature distribution-based transformation method determines the feature transformation $T$ under a given distance measure. Its optimization function is as follows:
$$f^{*} = \arg\min_{f \in \mathcal{F}} \frac{1}{n_s} \sum_{i=1}^{n_s} l\big(f(x_i), y_i\big) + \lambda R\big(T(D_s), T(D_t)\big) \quad \text{s.t.} \ \ n_s \ \ n_t, \tag{1}$$
where $n_s$ is the number of samples of source domain data; $T(\cdot)$ is the feature mapping function; $f(\cdot)$ is a feature extractor; $l(\cdot)$ is the supervised classification loss on the source domain; $R(T(D_s), T(D_t))$ is the transfer regularization term; and $\lambda$ is the weight for feature distribution alignment. The idea behind DA is to learn a feature transform $T(\cdot)$ that reduces the regularization term $R(\cdot)$ and thereby increases the similarity between the domains.
The transfer method based on feature transformation is directly related to the measurement of probability distribution differences [16]. According to the metric used for the multi-domain feature distribution difference, DA methods based on feature distribution transformation mainly include marginal distribution alignment [17] and conditional distribution alignment [18]. The former assumes that the marginal distributions of the domains are different ($P_s(x) \neq P_t(x)$), but the conditional probabilities are the same ($P_s(y|x) \approx P_t(y|x)$). It achieves the transfer by reducing the distance between the marginal probability distributions of the two domains, i.e., $D(P_s(y|x), P_t(y|x)) \approx D(P_s(x), P_t(x))$. Common metrics are MMD [19], KL divergence [20], and CORAL [21].
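To make the notion of a marginal distribution distance concrete, the following minimal sketch estimates a squared MMD with a linear kernel, which reduces to the squared distance between the feature means of the two domains. The function name `linear_mmd2` and the synthetic features are illustrative assumptions, not part of the proposed model.

```python
import numpy as np

def linear_mmd2(xs: np.ndarray, xt: np.ndarray) -> float:
    """Squared MMD with a linear kernel: ||mean(xs) - mean(xt)||^2.

    xs: (n_s, d) source features, xt: (n_t, d) target features.
    """
    delta = xs.mean(axis=0) - xt.mean(axis=0)
    return float(delta @ delta)

# Hypothetical toy features: the target domain has a shifted mean.
rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, size=(500, 8))
target = rng.normal(0.5, 1.0, size=(400, 8))

print(linear_mmd2(source, source[::-1]))  # same samples reordered: close to 0
print(linear_mmd2(source, target))        # shifted marginal: clearly larger
```

Kernelized variants such as MK-MMD replace the linear kernel with a combination of kernels, but the quantity being reduced plays the same role as the regularization term $R$.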
Building on these metrics, MK-MMD [22] introduced the concept of multi-kernel function mapping into MMD. JDA [23] and VDA [24] introduce conditional probability loss to MMD. DDC [25], DAN [16], and JMMD [26] use MMD for deep features generated by deep learning. BDA [27] and DDAN [28] acquire coefficients by computing the A-distance between source and target domains to harmonize conditional/marginal probability fractions. WDAN [29] appends the sample class fractions in the source domain to the MMD. DGR [30] regulates intraclass and interclass loss fractions using Wasserstein distance. CAN [31] optimizes an MMD to explicitly model intra-class domain disparities and inter-class domain discrepancies. In the area of CD, Chen et al. [32] first proposed the use of multi-kernel maximum mean discrepancy (MK-MMD) to manipulate the disparity features generated by the Siamese fully connected network in order to achieve alignment between the source and target domain disparity features. The selection of high-confidence shared features between multiple domains is key to improving the efficiency and performance of DA. In general, sample selection strategies are combined with measures of feature distribution differences, such as transfer strategies based on centroid distance iteration [33]; transfer schemes based on centroid memory mechanisms [34]; and transfer strategies based on course management [35].
Based on the above introduction, a UDA method based on feature alignment consists mainly of three parts: high-confidence sample screening, extraction of the features to be aligned, and calculation of the domain feature difference metric. Our work is therefore organized around these three parts and is targeted at CD-UDA.

3. Methodology

The proposed multi-scale change detection unsupervised domain adaptation model based on illumination–reflection decoupling (IRD-CD-UDA) is shown in Figure 1. The model implements an end-to-end CD-UDA framework that includes a mid-deep layer domain alignment strategy and a target domain bi-temporal high-confidence pseudo-labeled sample selection strategy.
The IRD-CD-UDA comprises the following five parts: 1. The low-frequency illumination module extracts the shared global features of the two temporal images (Section 3.1.1) and is also used to align the marginal cross-domain distributions in the middle layer (Section 3.2.1); 2. The high-frequency reflection module extracts the content features of the two temporal images (Section 3.1.2); 3. The atrous spatial pyramid pooling (ASPP)-based fusion module [36] achieves the multi-scale fusion of illumination and reflection features (Section 3.1.3); 4. The difference feature decoding module maps the fused difference features into a CD probability map (Section 3.1.4), and the classifier layer features are used for deep domain conditional distribution alignment (Section 3.2.2); 5. An entropy-based easy-to-hard sample selection strategy selects and generates the transferable layer features for CD-UDA (Section 3.2.3).

3.1. Structure of the Illumination–Reflection Decoupled Change Detection Multi-Scale Unsupervised Domain Adaptation Model (IRD-CD-UDA)

Global disturbances affecting multi-temporal remote sensing images, such as illumination, shading, and color distribution differences, can significantly distort the feature distributions of the same region at different times and prevent the CD model from extracting robust difference features [37,38]. Therefore, a CD model with illumination–reflection decoupling is established on the basis of the Retinex theory [39] used in image enhancement algorithms. According to the Retinex theory, an image can be described as the product of illumination and reflectance components, where the reflectance component usually represents an intrinsic property of the object. The general formulation of the Retinex theory is as follows:
$$I(x, y) = R(x, y) \times L(x, y), \tag{2}$$
where $I(x, y)$ represents the original image captured by the sensors, $L(x, y)$ represents the illumination component, which reflects the overall illumination information of the image, and $R(x, y)$ is the reflectance component, which characterizes the inherent properties of the object. The main factors determining the effectiveness of remote sensing imaging are the reflectivity of ground objects and lighting conditions. Assuming that the satellite is located at the zenith and not taking into account the scattering and absorption effects of the atmosphere, the reflectance is the ratio of the intensity of the reflected radiation received by the sensor from the ground target to the intensity of the solar radiation received by the ground target itself.
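The multiplicative Retinex model can be illustrated with a classical, non-learned decomposition. The sketch below works in the log domain, where the product becomes a sum, and uses a box filter as a crude low-pass estimate of the illumination. This is only an assumption-laden illustration of the product model; it is not the deep decoupling network proposed here, and the names `box_blur` and `retinex_split` are hypothetical.

```python
import numpy as np

def box_blur(img: np.ndarray, k: int = 15) -> np.ndarray:
    """k x k mean filter (k odd) with edge padding, via a summed-area table."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    # Prepend a zero row/column so every window sum is four table lookups.
    sat = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    h, w = img.shape
    window_sum = (sat[k:k + h, k:k + w] - sat[:h, k:k + w]
                  - sat[k:k + h, :w] + sat[:h, :w])
    return window_sum / (k * k)

def retinex_split(image: np.ndarray, k: int = 15, eps: float = 1e-6):
    """Split a positive single-channel image into (reflectance, illumination)."""
    log_i = np.log(image + eps)
    log_l = box_blur(log_i, k)   # smooth component, treated as log-illumination
    log_r = log_i - log_l        # residual detail, treated as log-reflectance
    return np.exp(log_r), np.exp(log_l)

# Toy scene: checkerboard reflectance under a left-to-right illumination ramp.
yy, xx = np.mgrid[0:64, 0:64]
reflectance = 0.3 + 0.4 * (((yy // 8) + (xx // 8)) % 2)
illumination = 0.2 + 0.8 * xx / 63.0
image = reflectance * illumination

r_est, l_est = retinex_split(image)
print(np.allclose(r_est * l_est, image + 1e-6))  # True: R x L reproduces I
```

By construction the two estimated components always multiply back to the input; how well they match the true reflectance and illumination depends on the low-pass filter, which is the part a learned module replaces.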
Based on the aforementioned idea, and considering the coupling between illumination and reflectance features in remote sensing images, a deep network model is employed to achieve image feature decoupling. The schematic diagram of the model, shown in Figure 1, includes a low-frequency illumination feature extraction module, a high-frequency reflectance feature extraction module, an illumination and reflectance feature fusion module, and a difference feature decoding module.

3.1.1. Illumination Feature Extraction Module

The role of the low-frequency illumination feature extraction module consists of three main parts. First, it aims to correct the global illumination features of two remote sensing images by ignoring the discrete high-frequency difference information, so as to effectively mitigate the distribution differences caused by global factors (illumination and atmosphere). Second, the module generates global representation features of the source or target domain to provide domain-discriminative information for CD-UDA, and minimizes cross-domain global differences in subsequent convolutional neural network processing. Finally, the bi-temporal shared features generated by the model in the intermediate layer are used to achieve cross-domain marginal distribution alignment, which improves overall performance.
The structure of the low-frequency illumination feature extraction module is shown in Figure 2. The module is a two-stream structure with shared parameters. The operational steps are as follows: first, the multi-temporal images $\{X_1, X_2\}$ are input; next, convolutional networks perform feature extraction, including a convolution with a kernel size of seven to obtain a large receptive field; finally, the generated deep features are fed to the spatial attention module [40] to obtain the low-frequency illumination features $X^{L} \in \mathbb{R}^{1 \times \frac{H}{8} \times \frac{W}{8}}$, where $H$ and $W$ are the height and width of the feature, respectively.

3.1.2. Reflection Feature Extraction Module

The high-frequency reflection feature extraction module is the backbone module of the IRD-CD model and is composed of residual structures [41]. It aims to improve the CD model’s ability to identify changed regions by emphasizing the edges of critical feature targets through high-frequency texture feature weighting operations. As shown in Algorithm 1, the operational steps for the high-frequency reflection feature extraction module are as follows: 1. Convert the multi-temporal data to the Lab color space separately and extract the L channel to obtain $\{X_{L1}, X_{L2}\}$; 2. Convolve $\{X_{L1}, X_{L2}\}$ with two fixed-parameter Sobel operators to derive the high-frequency texture features $\{X_{Ln}^{S_x}, X_{Ln}^{S_y}\}, n = 1, 2$ in both the X and Y directions; 3. Use the square root operation to compute the texture weighting coefficients $\{X_{L1}^{W}, X_{L2}^{W} \mid X_{Ln}^{W} = \sqrt{(X_{Ln}^{S_x})^{2} + (X_{Ln}^{S_y})^{2}}, n = 1, 2\}$; 4. Apply the weighting to $\{X_1, X_2\}$, yielding $\{X_1^{WI}, X_2^{WI} \mid X_n^{WI} = X_n + X_n^{W}, n = 1, 2\}$; 5. Obtain the high-frequency texture features $\{X_1^{H}, X_2^{H} \mid X_n^{H} \in \mathbb{R}^{128 \times \frac{H}{8} \times \frac{W}{8}}\}$ by using a residual structure.
Algorithm 1 Reflection feature extraction module
Require: multi-temporal images $\{X_1, X_2\}$.
High-frequency weighting:
 1: calculate the L channel $\{X_{L1}, X_{L2}\}$;
 2: calculate the high-frequency texture features $\{X_{Ln}^{S_x}, X_{Ln}^{S_y}\}, n = 1, 2$ with Sobel operators;
 3: calculate the texture weighting factors $\{X_{L1}^{W}, X_{L2}^{W}\}$;
 4: weight the high-frequency features to obtain $\{X_1^{WI}, X_2^{WI}\}$;
Feature extraction: obtain the high-frequency texture features $\{X_1^{H}, X_2^{H} \mid X_n^{H} \in \mathbb{R}^{128 \times \frac{H}{8} \times \frac{W}{8}}\}$ by using a residual structure.
Output: $\{X_1^{H}, X_2^{H}\}$
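The high-frequency weighting stage (steps 1 to 4) can be sketched in NumPy as follows. Two simplifications are assumed and should be read as such: a Rec. 601 luma combination stands in for the Lab L channel, and the weighting in step 4 is interpreted as adding the texture map to every channel. The residual feature extractor of the final step is not reproduced, and the helper names are hypothetical.

```python
import numpy as np

SOBEL_X = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])
SOBEL_Y = SOBEL_X.T

def filter3x3(img: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Same-size 3x3 cross-correlation with edge padding."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w))
    for di in range(3):
        for dj in range(3):
            out += kernel[di, dj] * padded[di:di + h, dj:dj + w]
    return out

def texture_weighted(rgb: np.ndarray):
    """rgb: (H, W, 3) floats in [0, 1]. Returns (weighted image, weight map)."""
    # Step 1 (assumption): Rec. 601 luma as a stand-in for the Lab L channel.
    lum = rgb @ np.array([0.299, 0.587, 0.114])
    # Step 2: fixed-parameter Sobel responses in the X and Y directions.
    gx = filter3x3(lum, SOBEL_X)
    gy = filter3x3(lum, SOBEL_Y)
    # Step 3: gradient magnitude as the texture weighting coefficient.
    weight = np.sqrt(gx ** 2 + gy ** 2)
    # Step 4 (assumption): add the weight map to every channel.
    return rgb + weight[..., None], weight

# Toy image with a single vertical edge between columns 3 and 4.
rgb = np.zeros((6, 8, 3))
rgb[:, 4:, :] = 1.0
weighted, weight = texture_weighted(rgb)
print(np.round(weight[0], 3))  # non-zero only in the two columns at the edge
```

Flat regions receive zero weight and are left unchanged, while pixels on either side of the edge are boosted, which is the edge-emphasis effect described for this module.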

3.1.3. Reflection and Illumination Feature Fusion Module

After separately extracting high-frequency and low-frequency features, the feature fusion module is employed to execute the weighted fusion of $\{X_n^{L}, X_n^{H}\}, n = 1, 2$. Given the considerable variation in feature target sizes within remote sensing images, the atrous spatial pyramid pooling (ASPP) [36] module is utilized to achieve multi-scale fusion of features, thereby enhancing the completeness of multi-scale feature fusion. The operation steps are as follows: $X_n^{H}$ and $X_n^{L}$ are input to the ASPP module to obtain $X_n^{H_{aspp}} = \varphi(\{x_1^{H}, \ldots, x_k^{H}\})$ and $X_n^{L_{aspp}} = \varphi(\{x_1^{L}, \ldots, x_k^{L}\})$, respectively; then the fused features $\{X_1^{F}, X_2^{F} \mid X_n^{F} = \mathrm{conv}(X_n^{H_{aspp}} \cdot (1 + X_n^{L_{aspp}})), n = 1, 2\}$ are obtained by multi-scale weighting, splicing, and convolution operations.
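The weighting step of the fusion, in which the reflection features are multiplied by one plus the illumination response, can be sketched as below. The ASPP outputs are assumed to be already computed, the single-channel illumination map is assumed to broadcast over the reflection channels, and the convolution is simplified to a 1 × 1 convolution written as an `einsum`; the function name `fuse` and the output width of 64 channels are hypothetical.

```python
import numpy as np

def fuse(x_high: np.ndarray, x_low: np.ndarray, conv_w: np.ndarray) -> np.ndarray:
    """Sketch of conv(X_H * (1 + X_L)).

    x_high: (C, H, W) reflection features after ASPP (assumed given).
    x_low:  (1, H, W) illumination response after ASPP (assumed given).
    conv_w: (C_out, C) weights of a 1x1 convolution (simplifying assumption).
    """
    modulated = x_high * (1.0 + x_low)   # illumination map broadcast over channels
    return np.einsum("oc,chw->ohw", conv_w, modulated)

rng = np.random.default_rng(0)
x_high = rng.normal(size=(128, 4, 4))
conv_w = rng.normal(size=(64, 128))

no_illum = fuse(x_high, np.zeros((1, 4, 4)), conv_w)
unit_illum = fuse(x_high, np.ones((1, 4, 4)), conv_w)
print(no_illum.shape)
```

The "1 +" term acts like a residual path: where the illumination response is zero the reflection features pass through unchanged, so the illumination branch can only re-weight the content features rather than suppress them entirely.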

3.1.4. Differential Feature Decoding Module

The difference feature decoding module is designed to map fused features $\{X_1^{F}, X_2^{F}\}$ into difference features and decode them into CD probabilistic maps. The module uses deconvolution and residual blocks for feature mapping. A skip connection between the high-frequency texture feature extraction module and the differential feature mapping module is incorporated to facilitate cascading between layers with the same subsampling and coding scales. This method utilizes the spatial details in the shallow layers of the network to complement the more abstract and less localized information in the coded data, thus improving the sensitivity to the difference features. After three deconvolution iterations, the change prediction probability map $f_{out} \in \mathbb{R}^{B, 2, W, H}$ is finally obtained.
During training, the IRD-CD model utilizes the cross-entropy loss $L_c$ to assess the discrepancy between the model’s output and the true label distribution:
$$L_c(p_s, l_s) = -\left[\, l_s \cdot \log(p_s) + (1 - l_s) \cdot \log(1 - p_s) \,\right], \tag{3}$$
where $p_s$ and $l_s$ are the CD prediction result and the CD label of the source domain, respectively.
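As a small numerical illustration, the sketch below evaluates the binary cross-entropy in its standard form, in which the label weights the logarithm of the predicted probability. Averaging over pixels and clipping the probabilities for numerical stability are assumptions added for the example.

```python
import numpy as np

def cross_entropy(p_s: np.ndarray, l_s: np.ndarray, eps: float = 1e-7) -> float:
    """Mean binary cross-entropy between change probabilities and 0/1 labels."""
    p = np.clip(p_s, eps, 1.0 - eps)   # avoid log(0)
    per_pixel = -(l_s * np.log(p) + (1.0 - l_s) * np.log(1.0 - p))
    return float(per_pixel.mean())

labels = np.array([1.0, 0.0, 0.0, 1.0])
confident = np.array([0.9, 0.1, 0.2, 0.8])
uncertain = np.array([0.5, 0.5, 0.5, 0.5])

print(cross_entropy(confident, labels))   # small loss
print(cross_entropy(uncertain, labels))   # equals log(2) for p = 0.5 everywhere
```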

3.2. Strategy of Change Detection Unsupervised Domain Adaptation Model (CD-UDA)

Based on the CD model described in Section 3.1, the proposed CD-UDA strategy comprises the following three parts: first, a domain alignment strategy targeting the marginal distribution of features in the middle layer; second, a domain alignment strategy targeting the conditional distribution of features in the classification layer; third, an easy-to-hard sample selection strategy based on an entropy threshold.

3.2.1. Marginal Distribution Domain Alignment of Illumination Feature

Different domain datasets have different global features such as illumination and atmosphere [37], and the global style features are stable and consistent within a domain. Therefore, the use of the low-frequency illumination feature extraction module (Section 3.1.1) is proposed, in order to extract the global style features in the source and target domains, and realize the marginal distribution domain alignment, which ultimately achieves the purposes of inter-domain global feature alignment and stabilized model convergence.
Although the global illumination feature $X^{L}$ can serve as one of the features for domain discrimination, the global features differ significantly even within the same domain (same dataset) because of complex atmospheric illumination. In such cases, forcing the global features of different domains toward each other would drive $X^{L}$ to a constant value, which is not physically meaningful. Therefore, the similarity between the global features $\{X_S^{L}, X_T^{L}\}$ of the source and target domains, rather than their absolute difference, is used as the metric for aligning the mid-layer features. The style similarity measure is implemented using the Frobenius norm computed on low-level features of different images, an approach borrowed from image style transfer models. Notably, prior work in the DA field also leverages the Frobenius norm to achieve domain alignment [21].
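Based on the above method, the dimensions of $\{X_S^{L}, X_T^{L}\}$ are first converted to $\{X_S^{L} \in \mathbb{R}^{1 \times \frac{H + W}{8}}, X_T^{L} \in \mathbb{R}^{1 \times \frac{H + W}{8}}\}$, respectively. Then, they are normalized with respect to their mean $\bar{X}^{L}$ to obtain $\tilde{X}^{L}$. The optimization function for the mid-layer domain alignment is as follows:
$$\begin{aligned} L_{midDA} &= \frac{1}{4 n_t^{2}} \left\| C_S - C_T \right\|_F^{2}, \\ C_S &= \frac{1}{n_s - 1} \left( \tilde{X}_s^{L\top} \tilde{X}_s^{L} - \frac{1}{n_s} \left( \mathbf{1}^{\top} \tilde{X}_s^{L} \right)^{\top} \left( \mathbf{1}^{\top} \tilde{X}_s^{L} \right) \right), \\ C_T &= \frac{1}{n_t - 1} \left( \tilde{X}_t^{L\top} \tilde{X}_t^{L} - \frac{1}{n_t} \left( \mathbf{1}^{\top} \tilde{X}_t^{L} \right)^{\top} \left( \mathbf{1}^{\top} \tilde{X}_t^{L} \right) \right), \end{aligned} \tag{4}$$
where $n_t$ is the number of samples in the target domain, and the global difference matrix between the source and target domains is set as $A = C_S - C_T$, $A \in \mathbb{R}^{n \times m}$. The Frobenius norm $\| \cdot \|_F$ is formulated as follows:
$$\| A \|_F = \sqrt{ \sum_{i=1}^{n} \sum_{j=1}^{m} a_{i,j}^{2} }, \quad a_{i,j} \in A. \tag{5}$$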
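The mid-layer alignment loss compares second-order statistics of the two domains rather than the features themselves. A minimal NumPy sketch, assuming the illumination features have already been flattened to one row per sample, is given below; the covariance follows the form $\frac{1}{n-1}\left(X^{\top}X - \frac{1}{n}(\mathbf{1}^{\top}X)^{\top}(\mathbf{1}^{\top}X)\right)$ and the loss is the squared Frobenius norm of the covariance difference scaled by $1/(4 n_t^{2})$. The function names and toy data are hypothetical.

```python
import numpy as np

def covariance(x: np.ndarray) -> np.ndarray:
    """x: (n, d) features, one row per sample. Returns the (d, d) covariance."""
    n = x.shape[0]
    col_sum = x.sum(axis=0, keepdims=True)            # 1^T X, shape (1, d)
    return (x.T @ x - col_sum.T @ col_sum / n) / (n - 1)

def mid_da_loss(xs: np.ndarray, xt: np.ndarray) -> float:
    """Squared Frobenius distance between covariances, scaled by 1 / (4 n_t^2)."""
    diff = covariance(xs) - covariance(xt)
    n_t = xt.shape[0]
    return float(np.sum(diff ** 2) / (4.0 * n_t ** 2))

rng = np.random.default_rng(0)
xs = rng.normal(0.0, 1.0, size=(32, 16))   # toy source illumination features
xt = rng.normal(0.0, 2.0, size=(24, 16))   # toy target features, larger spread

print(mid_da_loss(xs, xs))         # 0.0: identical statistics
print(mid_da_loss(xs, xt))         # positive: different second-order statistics
print(mid_da_loss(xs + 5.0, xt))   # nearly unchanged by a constant offset
```

The last line illustrates why this measure aligns style similarity rather than absolute values: adding a constant offset to one domain leaves the covariance, and hence the loss, essentially unchanged.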

3.2.2. Conditional Distribution Domain Alignment of Classification Feature

As deep neural network models deepen, their features evolve from general to specific features. Shallow features enable the characterization of the global style, but they cannot extract crucial feature information for surface instances. The proposed deep feature domain alignment method utilizes the input features $\{X_S^{C}, X_T^{C}\}$, $X_S^{C} \in \mathbb{R}^{32 \times H \times W}$, $X_T^{C} \in \mathbb{R}^{32 \times H \times W}$ of the classification layer to achieve conditional distribution alignment based on the pseudo-labels output by the CD classifier.
However, because of the gap between the learned difference features and the large variability of Euclidean distances between domains, the Euclidean distance cannot appropriately characterize domain similarity, which leads the model to converge to a local minimum. To address this issue, the cosine distance is used to measure the difference between domains. Specifically, the feature vector of a point of the source domain image in the transferable feature is denoted as $x_s \in X_S^{C}$, and the feature vector of a point in the target domain is denoted as $x_t \in X_T^{C}$. The inter-domain difference matrix $D_{st}(x_s^{i}, x_t^{j})$ is
$$D_{st}(x_s^{i}, x_t^{j}) = \frac{(x_s^{i})(x_t^{j})^{\top}}{\| x_s^{i} \| \times \| x_t^{j} \|}, \quad i \in (0, n_s), \; j \in (0, n_t), \tag{6}$$
where $n_s$ and $n_t$ are the numbers of samples in the source and target domains. Since $P(X_s) \neq P(X_t)$, the difference features between the changed and unchanged regions exhibit significantly different feature distributions. If the difference between the two domains is directly reduced using marginal distribution alignment, features from different classes will be fitted into the same feature space. The specificity of the features between the different classes would be reduced, resulting in lower classification performance.
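The inter-domain matrix defined above, whose entry $(i, j)$ is the inner product of a source and a target feature vector divided by the product of their norms, can be computed for all pairs at once. The sketch below assumes the classifier-layer features have been reshaped to one row per pixel; the small constant added to the norms is an assumption for numerical safety.

```python
import numpy as np

def cosine_matrix(xs: np.ndarray, xt: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """xs: (n_s, c), xt: (n_t, c). Returns the (n_s, n_t) matrix of cosine values."""
    xs_unit = xs / (np.linalg.norm(xs, axis=1, keepdims=True) + eps)
    xt_unit = xt / (np.linalg.norm(xt, axis=1, keepdims=True) + eps)
    return xs_unit @ xt_unit.T

rng = np.random.default_rng(0)
xs = rng.normal(size=(5, 32))
xt = rng.normal(size=(7, 32))

d_st = cosine_matrix(xs, xt)
d_scaled = cosine_matrix(10.0 * xs, xt)
print(d_st.shape, np.allclose(d_st, d_scaled))
```

Because each vector is normalized, rescaling the features of one domain does not change the matrix, which is the property that makes the cosine measure less sensitive than Euclidean distance to magnitude differences between domains.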
To address the above issues, we propose a conditional distribution adaptation that minimizes the intra-class domain difference and maximizes the inter-class domain difference. To prevent transfer failure caused by fitting features from different classes to the same feature space, the labels of the source domain are denoted as $l_s = \{l_{s,uchg}, l_{s,chg}\} \in \mathbb{R}^{n_s \times 2}$, where $l_{s,uchg}$ and $l_{s,chg}$ represent the unchanged and changed labels of the source domain data, respectively. The CD probability of the target domain obtained by IRD-CD is represented as $p_t = \{p_{t,uchg}, p_{t,chg}\} \in \mathbb{R}^{n_t \times 2}$, where $p_{t,uchg}$ and $p_{t,chg}$ denote the probabilities of the unchanged and changed classification results obtained when the target domain data are fed into the model trained on the source domain, respectively. The pseudo-label of the target domain is $l_t \in \mathbb{R}^{n_t}$. The domain weight matrix $M_{ts} \in \mathbb{R}^{(n_s + n_t) \times (n_s + n_t)}$ is as follows:
$$M_{ts} = p_{t,uchg} \, l_{s,uchg}^{\top}, \; p_{t,chg} \, l_{s,chg}^{\top}. \tag{7}$$
The adaptive weighted difference feature DA loss $L_{hDA}$ is defined as follows:
$$L_{hDA}\left( \{X_s, l_s\}, \{X_t\} \mid \theta \right) = M_{ts} \cdot D\left( f_{\theta}(x_s^{i}), f_{\theta}(x_t^{j}) \right), \tag{8}$$
where $D(\cdot)$ is the Euclidean distance.
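The idea of weighting pairwise feature distances by class agreement can be sketched as follows. This is a simplified illustration under stated assumptions: the weight of a target/source pair is taken as the probability that the target sample belongs to the class of the source sample (the product of the target probabilities with the one-hot source labels), only the intra-class attraction is modeled, and the result is normalized by the total weight. The function name and the toy features are hypothetical.

```python
import numpy as np

def class_weighted_distance(fs, ls_onehot, ft, pt) -> float:
    """fs: (n_s, c) source features, ls_onehot: (n_s, 2) one-hot labels,
    ft: (n_t, c) target features, pt: (n_t, 2) target class probabilities."""
    # Pairwise Euclidean distances between target and source features, (n_t, n_s).
    dist = np.linalg.norm(ft[:, None, :] - fs[None, :, :], axis=2)
    # Same-class weight for every target/source pair, (n_t, n_s).
    weight = pt @ ls_onehot.T
    return float((weight * dist).sum() / weight.sum())

# Two source samples (unchanged, changed) and two nearby target samples.
fs = np.array([[0.0, 0.0], [5.0, 5.0]])
ls_onehot = np.array([[1.0, 0.0], [0.0, 1.0]])
ft = np.array([[0.1, 0.0], [5.0, 5.1]])

pt_good = np.array([[0.9, 0.1], [0.1, 0.9]])   # pseudo-labels agree with geometry
pt_bad = pt_good[:, ::-1]                       # pseudo-labels flipped

print(class_weighted_distance(fs, ls_onehot, ft, pt_good))
print(class_weighted_distance(fs, ls_onehot, ft, pt_bad))
```

With reliable target probabilities the weighted distance is small because only same-class pairs are pulled together; with flipped probabilities the loss is dominated by cross-class pairs, which is why the quality of the pseudo-labels matters for this term.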

3.2.3. Sample Selection Strategy

There is frequently an imbalance between unchanged and changed regions in CD samples, resulting in a higher number of unchanged instances. The sample imbalance causes the CD model to easily converge to a local minimum, which can misidentify the changed regions as unchanged regions and reduce the recall of the classifier. The above problem is particularly prominent in the CD-UDA domain. In addition, in convolution-based CD networks, a model with one data batch provides a large amount of deep feature data, leading to excessive computational intensity. Therefore, as shown in Algorithm 2, an easy-to-hard sample selection strategy is proposed, which can be used to extract effective target domain data by manipulating the entropy threshold of the CD map.
Algorithm 2 Transferable sample selection strategy
Require: The transferable features from the source and target domains $\{f_s, f_t\}$.
Initialization: Set the entropy threshold $T_p$ and the numbers of samples to be selected $n_s$ and $n_t$.
Transferable sample selection:
 1: the samples with $E_t > T_p$ in $f_t$ are selected, and their number is counted as $n_{t0}$;
 2: if $n_{t0} < n_t$ then $T_p = \upsilon \cdot T_p$;
 3: select the features with $E_t > T_p$ in $f_t$ and count their number as $n_{t1}$. Get the target samples $f_t$ by concatenating;
 4: $n_{t1}$ transferable features are randomly selected in $f_s$, and then the samples are concatenated to obtain $f_s$.
Output: Transferable samples $\{f_s, f_t\}$.
Entropy can be used as a measure of the model’s prediction confidence on samples from the target domain. For the CD task, a low entropy value usually indicates that the model is relatively more certain about its predictions, while a high entropy value indicates a high degree of uncertainty in the predictions. The transferable features within the source and target domains are denoted as $f_s \in \mathbb{R}^{b \times c \times h \times w}$ and $f_t \in \mathbb{R}^{b \times c \times h \times w}$, respectively. Based on the target domain classification probability $\{p_t^{unchg}, p_t^{chg}\}$, the entropy $E_t$ of the target domain is calculated as follows:
$$E_t^{chg} = -p_t^{chg} \log_2 (p_t^{chg}), \qquad E_t^{unchg} = -p_t^{unchg} \log_2 (p_t^{unchg}), \tag{9}$$
where $E_t^{chg}$ and $E_t^{unchg}$ are the entropy values of the changed and unchanged regions, respectively. Features greater than the entropy threshold $T_p$ are selected as $\{f_t^{chg_i}, f_t^{unchg_i} \mid E_t^{i} > T_p, \; f_t^{i} \in \mathbb{R}^{c}, \; i = 1, \ldots, n_{t0}\}$ from the target domain samples.
However, the imbalance of samples leads to a decrease in the confidence $E_t^{chg}$ of the samples in the changed region predicted by the CD model, resulting in a large number of samples in the changed region being incorrectly discarded. Therefore, if the number of samples $n_{t0}$ obtained from the first entropy screening is less than $n_t$ (a preset parameter), the probability threshold of the changed region for the second screening is reduced to $\upsilon \cdot T_p$, and $\{f_t^{chg_i}, f_t^{unchg_i} \mid E_t^{unchg_i} > \upsilon \cdot T_p, \; E_t^{chg_i} > T_p, \; f_t^{i} \in f_t, \; i = 1, \ldots, n_{t1}\}$ is obtained, where the number of samples is $n_{t1}$. The $n_{t1}$ source domain features are randomly selected from the source domain samples. After concatenation, the source domain features for alignment are $f_s = [f_s^{chg_0}, \ldots, f_s^{chg_n}, f_s^{unchg_0}, \ldots, f_s^{unchg_m}]$, $f_s \in \mathbb{R}^{(n + m), c}$, and the target domain features are $f_t = [f_t^{chg_0}, \ldots, f_t^{chg_a}, f_t^{unchg_0}, \ldots, f_t^{unchg_b}]$, $f_t \in \mathbb{R}^{(a + b), c}$. To overcome the effect of sample imbalance on domain alignment, the number of selected changed/unchanged features is set to $n = m = a = b$. Figure 3 shows the results of the sample selection.
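The two-stage screening and the class balancing can be sketched in NumPy as follows. Several choices here are assumptions made only for illustration: pseudo-labels are obtained by thresholding the change probability at 0.5, the inequality direction follows the text as written (entropy term greater than $T_p$), the relaxed threshold $\upsilon \cdot T_p$ is applied to the changed class as described in the prose, and the default values of `t_p`, `n_min`, and `upsilon` are arbitrary.

```python
import numpy as np

def entropy_term(p: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Per-sample term -p * log2(p)."""
    return -p * np.log2(np.clip(p, eps, 1.0))

def select_balanced(p_chg, t_p=0.3, n_min=4, upsilon=0.5, seed=0):
    """p_chg: (n,) predicted change probabilities of target samples.

    Returns equally sized index arrays for the changed and unchanged classes.
    """
    is_chg = p_chg >= 0.5                        # assumed pseudo-label rule
    e_chg = entropy_term(p_chg)
    e_unchg = entropy_term(1.0 - p_chg)
    chg = np.flatnonzero(is_chg & (e_chg > t_p))
    unchg = np.flatnonzero(~is_chg & (e_unchg > t_p))
    if chg.size + unchg.size < n_min:
        # Second screening: relax the threshold for the changed class only.
        chg = np.flatnonzero(is_chg & (e_chg > upsilon * t_p))
    k = min(chg.size, unchg.size)                # balance the two classes
    rng = np.random.default_rng(seed)
    return rng.permutation(chg)[:k], rng.permutation(unchg)[:k]

p = np.array([0.55, 0.60, 0.99, 0.45, 0.40, 0.35, 0.01])
chg_idx, unchg_idx = select_balanced(p)
print(np.sort(chg_idx).tolist(), np.sort(unchg_idx).tolist())
```

Truncating both index sets to the smaller class size is one simple way to realize the constraint that equal numbers of changed and unchanged features take part in the alignment.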
In summary, the algorithm flow of IRD-CD-UDA is shown in Algorithm 3. The overall optimization function is as follows:
$$L_{DA} = L_c + L_{midDA} + L_{hDA}. \tag{10}$$
Algorithm 3 Framework of the IRD-CD-UDA algorithm
Require: Multi-temporal data from the source domain and target domain: $\{I_S, I_T\} \in \mathbb{R}^{w \times h \times c}$.
 Initialization: randomly initialize the IRD-CD parameters $\Theta$ from the standard normal distribution.
 Normalization: $\{\hat{I}_S, \hat{I}_T\}$.
 Pretraining based on $\hat{I}_S$:
 1: Using $l_s$ and $p_s$, update $\Theta$ by minimizing $L_c$ (3);
 Unsupervised domain adaptation of CD:
 2: Obtain the classification-layer transferable features $\{X_S^C, X_T^C\}$ and the mid-layer transferable features $\{X_S^L, X_T^L\}$ from $\{\hat{I}_S, \hat{I}_T, \Theta\}$;
 3: Use the easy-to-hard sample selection strategy based on the probability threshold to obtain $\{f_{s,chg}^{i}, f_{t,chg}^{i}, f_{s,unchg}^{i}, f_{t,unchg}^{i}\}$;
 4: Perform the mid-layer feature marginal distribution domain adaptation by (4);
 5: Perform the classification-layer feature conditional distribution domain adaptation by (8);
 6: Update $\Theta$ by (10) until convergence;
 Output: CD results on the target domain $\hat{I}_T$.
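To make the structure of the overall optimization function more concrete, the following sketch combines a classification loss with two alignment terms. It is only an illustration: the alignment losses of the paper are defined by its own equations (referenced as (4) and (8) in Algorithm 3), which are not reproduced here, so a single-kernel RBF maximum mean discrepancy (MMD) is used as an assumed stand-in. The names `rbf_mmd2`, `conditional_mmd2`, and `overall_loss`, and the bandwidth `sigma`, are hypothetical.

```python
import numpy as np


def rbf_mmd2(x, y, sigma=1.0):
    """Biased estimate of the squared MMD between samples x and y (one row per sample)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    def kernel(a, b):
        # Pairwise squared Euclidean distances, clipped at 0 for numerical safety.
        d2 = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
        return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()


def conditional_mmd2(fs_chg, ft_chg, fs_unchg, ft_unchg, sigma=1.0):
    """Class-wise alignment: average the MMD of the changed and unchanged features."""
    return 0.5 * (rbf_mmd2(fs_chg, ft_chg, sigma) + rbf_mmd2(fs_unchg, ft_unchg, sigma))


def overall_loss(l_c, l_mid_da, l_h_da):
    """L_DA = L_c + L_mid^DA + L_h^DA (unweighted sum, as in the text)."""
    return l_c + l_mid_da + l_h_da


rng = np.random.default_rng(0)
src = rng.normal(size=(64, 8))              # stand-in source features
tgt_shifted = src + 3.0                     # stand-in target features, shifted
same = rbf_mmd2(src, src)                   # identical samples
shifted = rbf_mmd2(src, tgt_shifted, sigma=2.0)
```

The discrepancy is zero for identical samples and grows as the two feature sets move apart, which is the quantity a domain alignment term drives down during step 6 of Algorithm 3.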

4. Experiment

4.1. Experimental Setting and Datasets

The experiments are organized in three parts: first, experiments on the performance of the IRD-CD model proposed in Section 3.1, to verify the effectiveness of the proposed improvements; second, ablation experiments on the proposed UDA strategy, to verify its effectiveness; third, a comparison with domain adaptation methods for change detection, together with experiments that add the CD-UDA strategy to well-performing CD models, to validate the generality of the proposed CD-UDA strategy. An NVIDIA 3090 graphics processor with 24 GB of memory is used as the hardware platform for model training, with a batch size of 10. The learning rate is $10^{-3}$, and training is performed for five epochs. Table 1 provides an overview of the experimental datasets, where $P_c$ and $P_u$ represent the proportions of samples in the changed and unchanged regions, respectively.

4.2. Evaluation Metrics

The experimental evaluation metrics [10] include overall accuracy (OA), change area accuracy (CAcc), unchanged area accuracy (UAcc), mean intersection over union (mIoU), and F1 score (F1), all computed from the confusion matrix. Because CD samples are imbalanced, OA alone is a poor indicator of model performance: a model that has converged to a poor local minimum and assigns most samples to the majority class can still reach a high OA while having no application value. Therefore, the experimental analysis focuses on F1 and mIoU [45].
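The following sketch shows how these metrics can be derived from the binary confusion matrix, and why OA is misleading on imbalanced data. The exact formulas are not spelled out in this section, so the definitions here are assumptions: CAcc and UAcc are taken as the per-class recall of the changed and unchanged classes, and F1 and mIoU are taken as the mean of the two per-class values. The function name `cd_metrics` is hypothetical.

```python
import numpy as np


def _safe_div(a, b):
    """Return a / b, or 0.0 when the denominator is zero."""
    return a / b if b else 0.0


def cd_metrics(pred, label):
    """Compute OA, UAcc, CAcc, F1, and mIoU from binary change maps (1 = changed)."""
    pred = np.asarray(pred).astype(bool).ravel()
    label = np.asarray(label).astype(bool).ravel()

    tp = int(np.sum(pred & label))      # changed pixels detected as changed
    tn = int(np.sum(~pred & ~label))    # unchanged pixels detected as unchanged
    fp = int(np.sum(pred & ~label))     # unchanged pixels detected as changed
    fn = int(np.sum(~pred & label))     # changed pixels detected as unchanged

    f1_chg = _safe_div(2 * tp, 2 * tp + fp + fn)
    f1_unchg = _safe_div(2 * tn, 2 * tn + fp + fn)
    iou_chg = _safe_div(tp, tp + fp + fn)
    iou_unchg = _safe_div(tn, tn + fp + fn)

    return {
        "OA": _safe_div(tp + tn, tp + tn + fp + fn),
        "UAcc": _safe_div(tn, tn + fp),
        "CAcc": _safe_div(tp, tp + fn),
        "F1": 0.5 * (f1_chg + f1_unchg),       # assumed: mean over both classes
        "mIoU": 0.5 * (iou_chg + iou_unchg),   # assumed: mean over both classes
    }


# A degenerate model that predicts "unchanged" everywhere on a map
# in which 5% of the pixels have changed.
label = np.array([1] * 5 + [0] * 95)
trivial = cd_metrics(np.zeros_like(label), label)
```

For the degenerate prediction, OA remains high because the unchanged class dominates, while CAcc is zero and the class-averaged F1 and mIoU drop sharply, which is the reason the analysis emphasizes the latter two metrics.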

4.3. Performance Evaluation of Change Detection

Since CD-UDA is an extension and enhancement of CD, the performance of the CD model has a direct impact on the transferability of the CD-UDA model. Therefore, the performance of the supervised IRD-CD model needs to be evaluated first. Six methods with stable and advanced performance are selected for comparative experiments: DeepLab [46], FCSiamD [47], EUNet [48], UCDNet [49], ISNet [50], and the proposed IRDNet. These comparison experiments are performed separately for each dataset, and the results are presented in Table 2, with the best results highlighted in bold.
The supervised CD methods used for comparison are robust and perform well, and the results show that their overall performance is relatively close. Among these methods, DeepLab and EUNet (both single-stream models) show suboptimal performance on all three datasets, which supports the idea that the two-stream Siamese model is more suitable for the CD task. Compared with the other models, the proposed IRD-CD model achieves the best performance on all three datasets, with a clear advantage on the most imbalanced GZ dataset.

4.4. Ablation Experiment of Change Detection Unsupervised Domain Adaptation Model (CD-UDA)

According to the DA strategy and the sample selection strategy introduced in Section 3.2, the effectiveness of each DA strategy is verified by an ablation experiment. This experiment has the following three parts:
  • D0: Evaluate the UDA performance of the model when pre-trained with the source domain data, using only the mid-layer marginal distribution alignment (Section 3.2.1);
  • D1: Evaluate the UDA performance of the model when pre-trained with the source domain data, using transferable features derived from the probabilistic easy-to-hard target domain sample selection strategy (Section 3.2.3) for the inter-domain marginal distribution alignment (Section 3.2.1) prior to the classification layer;
  • D2: Building on D1, evaluate the UDA performance of the mid-layer marginal distribution alignment (Section 3.2.2).
The results of the ablation experiments are shown in Table 3, where G→L denotes that the source domain data are GZ and the target domain data are LEVIR. These results show that the proposed mid-level marginal distribution domain alignment, deep conditional distribution domain alignment, and sample selection strategies improve the performance of CD-UDA in most cases. Among them, the probability-based easy-to-hard sample selection strategy significantly mitigates the convergence of CD-UDA to a local optimum caused by sample imbalance. The D1 approach with conditional distribution domain constraints shows significant overall improvement in the experiments, and in particular, the UDA strategy is more effective in mitigating the severe sample imbalance in the CD task. These results show that the sample imbalance problem can be alleviated by selecting the same number of transferable samples for the changed and unchanged regions during sample screening. However, if the mid-level marginal distribution adaptation is not included, the convergence direction of the model will differ from that of the source domain model, i.e., the prior knowledge of the source domain is ignored. Given that the target domain data are unlabeled, CD-UDA can only improve the target domain test performance by minimizing the feature distribution difference between the source and target domain data.
The sample screening strategy (Algorithm 2) has two user-defined parameters, $T_p$ and $\upsilon$, whose values are determined through ablation experiments. These experiments are carried out with the full D2 CD-UDA strategy. The ablation experiment is first performed for the initial probability threshold $\{T_p \mid 0.5 < T_p \le 1\}$. Omitting the second screening leads to model optimization failure due to sample imbalance, so the second screening parameter $\upsilon$ is set to the empirical value of 0.9 here. Table 4 shows the performance for $T_p \in \{0.6, 0.7, 0.8, 0.9\}$ on different cross-domain datasets, where the experimental metrics are F1 and mIoU. After fixing $T_p = 0.9$, ablation experiments are performed on the second sample screening weight $\upsilon$. Table 5 shows the performance for $\upsilon \in \{0.6, 0.7, 0.8, 0.9\}$ on different cross-domain datasets, where the experimental metrics are F1 and mIoU and the best results are in bold. Based on these results, $\upsilon$ is set to 0.8.

4.5. Performance Evaluation of Change Detection Unsupervised Domain Adaptation Model (CD-UDA)

Research on CD-UDA is very limited. To evaluate the performance of the proposed CD-UDA, the experiments are divided into three parts: first, evaluate the transferable characteristics of the multi-domain fully connected layer aligned by MK-MMD using the DSDANet model; second, incorporate the inter-domain conditional distribution difference and the probabilistic sample-selection-based transfer strategy proposed in this paper into the two-stream UCDNet and ISNet models, so as to verify the generality of the proposed CD-UDA method; third, establish a benchmark by testing several well-performing CD models without a DA strategy on the cross-domain datasets. The experimental results are shown in Tables 6–11, with the best results in bold.
UDA is essentially a transfer of existing knowledge from the source domain; knowledge cannot be created from nothing. Therefore, it is first necessary to analyze the datasets to be transferred and then identify the application scenarios of CD-UDA. The experimental results and the analysis of the experimental process show that the target domain test results of CD models without the UDA strategy can be used to infer the distribution relationship between the datasets, although these results may oscillate drastically during the iterations. Based on the experimental results of the models without the CD-UDA strategy, the following conclusions can be drawn: firstly, the CAcc metric is significantly lower in most experimental results, which indicates richer content information and greater domain uncertainty in the changed region in the CD task; secondly, F1 and CAcc are higher in S→L (Table 10) and S→G (Table 11), which can be attributed to the fact that SYSU differs from the other two datasets not only in data distribution but also in labeling range, which is larger in SYSU than in the other two datasets; thirdly, as shown in Table 6 and Table 9, the domain similarity is more obvious between the G and L datasets.
The experimental results show that the DSDANet model is noticeably unstable on the CD-DA task, for two reasons: firstly, because of the sample imbalance in the CD dataset, a model without a suitable sample selection strategy easily converges to the significantly changed region and the unchanged region, which produces negative transfer; secondly, because of the specificity of the CD dual-stream model, if domain feature alignment is implemented only in the classification layer, the style information of the samples is ignored, and the aligned domain features only characterize the change detection results.
To evaluate the generalization of the CD-UDA strategy proposed in this paper, FCSiamD and UCDNet are chosen as the base models to which the CD-UDA strategy is added. Since FCSiamD and UCDNet cannot generate transferable features applicable to the marginal distribution alignment in the middle layer, only the conditional distribution alignment strategy and the sample selection strategy for the classification layer are added in this experiment. Overall, adding the proposed UDA strategy to the CD models yields significant improvements, especially in S→L (Table 10) and S→G (Table 11).
In L→S (Table 8) and G→S (Table 7), the results show high accuracy for unchanged regions, low accuracy for changed regions, and insignificant CD-UDA gains. Combined with the overview of the experimental data (Table 1), we believe that this is caused by two factors: firstly, SYSU differs from the other two datasets not only in data distribution but also in labeling range (the discrimination threshold for the changed region), which is larger in SYSU than in the other two datasets; secondly, source domain data with a serious sample imbalance increase the difficulty of performing the CD-UDA task.
From the experimental results of S→L (Table 10) and S→G (Table 11), we can see that most of the methods improve their results significantly after adding the proposed UDA strategy. Combined with the overview of the experimental data (Table 1), we believe that the performance of the CD-UDA model on the target domain dataset is enhanced when the source domain data are broader and the feature distribution is wider.
From the experimental results of G→L (Table 6) and L→G (Table 9), it can be seen that CD-UDA mitigates the sample imbalance that otherwise makes the target domain test results converge easily to the unchanged region. Combined with the overview of the experimental data (Table 1), CD-UDA is more effective for datasets with similar labeling scales (GZ and LEVIR) and is also more stable during the training process.
The proposed IRD-CD-UDA achieves the best performance in most of the cross-domain data experiments, and in particular, the F1 and mIoU metrics are significantly improved, with F1 improving by 3–22% and mIoU by 2–13%. This demonstrates the effectiveness of the proposed CD-UDA in mitigating the sample imbalance problem, which often causes the model to converge to a local minimum. Meanwhile, the OA performance achieves the best results on G→S, L→S, L→G, and S→G, and outperforms most of the comparison methods on S→L and L→S. The results show that the CD-UDA proposed in this paper improves the performance on the target domain data without destroying the prior knowledge of the source domain and improves the generalization ability of the model.

4.6. Visualization

A set of bi-temporal data randomly selected from the three datasets is fed into the trained IRD-CD-UDA to produce visualization results. As shown in Figure 4, the annotation criteria vary across the CD datasets. These criteria dictate the designation of regions as changed or unchanged, which directly affects the performance of the CD-UDA model. The model exhibits increased sensitivity to regional changes when the source domain is SYSU, compared to when it is LEVIR or GZ. Because of the large amount of data in the SYSU dataset and the richness of its annotation categories, the changed regions of the GZ data can be better identified.
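A color-coded error map of the kind shown in Figure 4 (black for TN, white for TP, blue for FP, and red for FN) can be produced from a predicted change map and its ground truth as sketched below. This is a minimal illustration, not the authors' plotting code; the function name `error_map` and the RGB channel order are assumptions, and FP/FN follow the standard definitions (FP: an unchanged pixel detected as changed; FN: a changed pixel detected as unchanged).

```python
import numpy as np

# Colors follow the Figure 4 legend; RGB order is an assumption of this sketch.
COLORS = {
    "TN": (0, 0, 0),        # black
    "TP": (255, 255, 255),  # white
    "FP": (0, 0, 255),      # blue
    "FN": (255, 0, 0),      # red
}


def error_map(pred, label):
    """Return an (H, W, 3) uint8 image that color-codes TN/TP/FP/FN pixels."""
    pred = np.asarray(pred).astype(bool)
    label = np.asarray(label).astype(bool)
    rgb = np.zeros(pred.shape + (3,), dtype=np.uint8)  # TN pixels stay black
    rgb[pred & label] = COLORS["TP"]
    rgb[pred & ~label] = COLORS["FP"]
    rgb[~pred & label] = COLORS["FN"]
    return rgb


# 2 x 2 toy example that contains one pixel of each outcome.
pred = np.array([[1, 1], [0, 0]])
label = np.array([[1, 0], [1, 0]])
vis = error_map(pred, label)
```

The resulting array can be passed directly to an image writer or to a plotting routine for side-by-side comparison of methods.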
To visualize the distribution of differential features before and after CD-UDA, the t-SNE algorithm [51] was used to plot the classification feature distribution before and after domain adaptation, as shown in Figure 5, which shows the coupled changed/unchanged region features, the changed intra-class features, and the unchanged intra-class features. Prior to DA (Figure 5a–c), the distributions of the coupled features and the intra-class features are mixed, and the intra-class feature distributions differ distinctly between domains. After DA (Figure 5d–f), the distributions of the coupled features are clustered by class, while the intra-class feature distributions of the different domains move closer together within each class. Due to the sample imbalance, the model exhibits reduced confidence in the features of changed regions within a few target domains, which causes the deep features of these changed regions to be confused with features of other categories.

5. Conclusions

In this paper, a multi-scale unsupervised domain adaptation model for illumination–reflection decoupled change detection (IRD-CD-UDA) is proposed. The illumination–reflection decoupled CD model is able to extract and fuse illumination–reflection features to improve supervised CD performance. Three strategies are designed to address the specific characteristics of CD-UDA: a domain alignment method for the marginal distribution of global features in the mid-layer, a domain alignment scheme for the conditional distribution of features in the classification layer, and an easy-to-hard sample selection method based on the probabilistic threshold of the CD map. The experiments verify that the proposed IRD-CD-UDA model can effectively improve performance on a new homogeneous CD dataset (target domain) without additional data labeling. In addition, ablation experiments and comparative analysis on three different CD datasets confirm that embedding the proposed CD-UDA strategy into different CD models can effectively improve their performance on different datasets and ultimately improve the reusability of CD models.

Author Contributions

Conceptualization, J.Y. and R.F.; methodology, R.F.; software, R.F.; validation, R.F. and J.X.; formal analysis, R.F.; resources, J.Y. and Z.H.; data curation, Y.X.; writing—original draft preparation, R.F.; writing—review and editing, J.Y. and H.H.; visualization, R.F.; supervision, J.X.; project administration, J.Y.; funding acquisition, J.Y. and H.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was financially supported by the National Natural Science Foundation of China (Grant No. 12174314) and the Innovation Foundation for Doctor Dissertation of Northwestern Polytechnical University (No. CX2023064).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
CD: Change Detection
UDA: Unsupervised Domain Adaptation
IRD: Illumination-Reflection Feature Decoupling
ASSP: Atrous Spatial Pyramid Pooling

References

  1. Shi, W.; Zhang, M.; Zhang, R.; Chen, S.; Zhan, Z. Change Detection Based on Artificial Intelligence: State-of-the-Art and Challenges. Remote Sens. 2020, 12, 1688.
  2. Zhang, R.; Zhang, H.; Ning, X.; Huang, X.; Wang, J.; Cui, W. Global-aware siamese network for change detection on remote sensing images. ISPRS J. Photogramm. Remote Sens. 2023, 199, 61–72.
  3. Wang, X.; Yan, X.; Tan, K.; Pan, C.; Ding, J.; Liu, Z.; Dong, X. Double U-Net (W-Net): A change detection network with two heads for remote sensing imagery. Int. J. Appl. Earth Obs. Geoinf. 2023, 122, 103456.
  4. Li, S.; Wang, Y.; Cai, H.; Lin, Y.; Wang, M.; Teng, F. MF-SRCDNet: Multi-feature fusion super-resolution building change detection framework for multi-sensor high-resolution remote sensing imagery. Int. J. Appl. Earth Obs. Geoinf. 2023, 119, 103303.
  5. Kumar, A.; Mishra, V.; Panigrahi, R.K.; Martorella, M. Application of Hybrid-Pol SAR in Oil-Spill Detection. IEEE Geosci. Remote Sens. Lett. 2023, 20, 1–5.
  6. Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the International Conference on Machine Learning, Lille, France, 6–11 July 2015; pp. 448–456.
  7. Wu, C.; Du, B.; Zhang, L. Slow Feature Analysis for Change Detection in Multispectral Imagery. IEEE Trans. Geosci. Remote Sens. 2014, 52, 2858–2874.
  8. Li, H.C.; Yang, G.; Yang, W.; Du, Q.; Emery, W.J. Deep nonsmooth nonnegative matrix factorization network with semi-supervised learning for SAR image change detection. ISPRS J. Photogramm. Remote Sens. 2020, 160, 167–179.
  9. Zhang, X.; Su, H.; Zhang, C.; Gu, X.; Tan, X.; Atkinson, P.M. Robust unsupervised small area change detection from SAR imagery using deep learning. ISPRS J. Photogramm. Remote Sens. 2021, 173, 79–94.
  10. Jiang, X.; Li, G.; Zhang, X.P.; He, Y. A Semisupervised Siamese Network for Efficient Change Detection in Heterogeneous Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–18.
  11. Xie, S.; Zheng, Z.; Chen, L.; Chen, C. Learning Semantic Representations for Unsupervised Domain Adaptation. In Proceedings of the 35th International Conference on Machine Learning PMLR, Stockholm, Sweden, 10–15 July 2018; Dy, J., Krause, A., Eds.; Volume 80, pp. 5423–5432.
  12. Vega, P.J.S.; da Costa, G.A.O.P.; Feitosa, R.Q.; Adarme, M.X.O.; de Almeida, C.A.; Heipke, C.; Rottensteiner, F. An unsupervised domain adaptation approach for change detection and its application to deforestation mapping in tropical biomes. ISPRS J. Photogramm. Remote Sens. 2021, 181, 113–128.
  13. Li, X.; Du, Z.; Huang, Y.; Tan, Z. A deep translation (GAN) based change detection network for optical and SAR remote sensing images. ISPRS J. Photogramm. Remote Sens. 2021, 179, 14–34.
  14. Zhao, W.; Chen, X.; Ge, X.; Chen, J. Using Adversarial Network for Multiple Change Detection in Bitemporal Remote Sensing Imagery. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5.
  15. Li, J.; Zi, S.; Song, R.; Li, Y.; Hu, Y.; Du, Q. A Stepwise Domain Adaptive Segmentation Network With Covariate Shift Alleviation for Remote Sensing Imagery. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–15.
  16. Long, M.; Cao, Y.; Wang, J.; Jordan, M. Learning transferable features with deep adaptation networks. In Proceedings of the International Conference on Machine Learning, PMLR, Lille, France, 7–9 July 2015; pp. 97–105.
  17. Geng, J.; Deng, X.; Ma, X.; Jiang, W. Transfer Learning for SAR Image Classification Via Deep Joint Distribution Adaptation Networks. IEEE Trans. Geosci. Remote Sens. 2020, 58, 5377–5392.
  18. Zhao, H.; Combes, R.; Zhang, K.; Gordon, G.J. On Learning Invariant Representation for Domain Adaptation. arXiv 2019, arXiv:1901.09453.
  19. Pan, S.J.; Tsang, I.W.; Kwok, J.T.; Yang, Q. Domain Adaptation via Transfer Component Analysis. IEEE Trans. Neural Netw. 2011, 22, 199–210.
  20. Zellinger, W.; Grubinger, T.; Lughofer, E.; Natschläger, T.; Saminger-Platz, S. Central moment discrepancy (cmd) for domain-invariant representation learning. arXiv 2017, arXiv:1702.08811.
  21. Sun, B.; Saenko, K. Deep coral: Correlation alignment for deep domain adaptation. In Proceedings of the Computer Vision–ECCV 2016 Workshops, Amsterdam, The Netherlands, 8–10 and 15–16 October 2016; Proceedings, Part III 14. Springer: Berlin/Heidelberg, Germany, 2016; pp. 443–450.
  22. Gretton, A.; Sejdinovic, D.; Strathmann, H.; Balakrishnan, S.; Pontil, M.; Fukumizu, K.; Sriperumbudur, B.K. Optimal kernel choice for large-scale two-sample tests. Adv. Neural Inf. Process. Syst. 2012, 25.
  23. Long, M.; Wang, J.; Ding, G.; Sun, J.; Yu, P.S. Transfer Feature Learning with Joint Distribution Adaptation. In Proceedings of the 2013 IEEE International Conference on Computer Vision (CVPR), Portland, OR, USA, 23–28 June 2013; pp. 2200–2207.
  24. Zhang, J.; Li, W.; Ogunbona, P. Joint Geometrical and Statistical Alignment for Visual Domain Adaptation. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 5150–5158.
  25. Tzeng, E.; Hoffman, J.; Zhang, N.; Saenko, K.; Darrell, T. Deep domain confusion: Maximizing for domain invariance. arXiv 2014, arXiv:1412.3474.
  26. Long, M.; Zhu, H.; Wang, J.; Jordan, M.I. Deep transfer learning with joint adaptation networks. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 2208–2217.
  27. Wang, J.; Chen, Y.; Hao, S.; Feng, W.; Shen, Z. Balanced Distribution Adaptation for Transfer Learning. In Proceedings of the 2017 IEEE International Conference on Data Mining (ICDM), New Orleans, LA, USA, 18–21 November 2017; pp. 1129–1134.
  28. Wang, J.; Chen, Y.; Feng, W.; Yu, H.; Huang, M.; Yang, Q. Transfer learning with dynamic distribution adaptation. ACM Trans. Intell. Syst. Technol. (TIST) 2020, 11, 1–25.
  29. Yan, H.; Ding, Y.; Li, P.; Wang, Q.; Xu, Y.; Zuo, W. Mind the Class Weight Bias: Weighted Maximum Mean Discrepancy for Unsupervised Domain Adaptation. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 945–954.
  30. Shen, J.; Qu, Y.; Zhang, W.; Yu, Y. Wasserstein distance guided representation learning for domain adaptation. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32.
  31. Kang, G.; Jiang, L.; Yang, Y.; Hauptmann, A.G. Contrastive Adaptation Network for Unsupervised Domain Adaptation. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 4888–4897.
  32. Chen, H.; Wu, C.; Du, B.; Zhang, L. DSDANet: Deep Siamese domain adaptation convolutional neural network for cross-domain change detection. arXiv 2020, arXiv:2006.09225v1.
  33. Ahmed, S.M.; Raychaudhuri, D.S.; Paul, S.; Oymak, S.; Roy-Chowdhury, A.K. Unsupervised Multi-source Domain Adaptation Without Access to Source Data. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 10098–10107.
  34. Liang, J.; Hu, D.; Feng, J. Domain Adaptation with Auxiliary Target Domain-Oriented Classifier. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 16627–16637.
  35. Yang, L.; Balaji, Y.; Lim, S.N.; Shrivastava, A. Curriculum manager for source selection in multi-source domain adaptation. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part XIV 16. Springer: Berlin/Heidelberg, Germany, 2020; pp. 608–624.
  36. Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 834–848.
  37. Wu, H.; Zheng, S.; Zhang, J.; Huang, K. Fast end-to-end trainable guided filter. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 1838–1847.
  38. Weyermann, J.; Kneubühler, M.; Schläpfer, D.; Schaepman, M.E. Minimizing Reflectance Anisotropy Effects in Airborne Spectroscopy Data Using Ross–Li Model Inversion With Continuous Field Land Cover Stratification. IEEE Trans. Geosci. Remote Sens. 2015, 53, 5814–5823.
  39. Petro, A.B.; Sbert, C.; Morel, J.M. Multiscale Retinex. Image Process. Line 2014, 71–88.
  40. Hu, J.; Shen, L.; Sun, G. Squeeze-and-Excitation Networks. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
  41. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the 2016 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
  42. Shi, Q.; Liu, M.; Li, S.; Liu, X.; Wang, F.; Zhang, L. A Deeply Supervised Attention Metric-Based Network and an Open Aerial Image Dataset for Remote Sensing Change Detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–16.
  43. Peng, D.; Bruzzone, L.; Zhang, Y.; Guan, H.; Ding, H.; Huang, X. SemiCDNet: A Semisupervised Convolutional Neural Network for Change Detection in High Resolution Remote-Sensing Images. IEEE Trans. Geosci. Remote Sens. 2021, 59, 5891–5906.
  44. Chen, H.; Shi, Z. A spatial-temporal attention-based method and a new dataset for remote sensing image change detection. Remote Sens. 2020, 12, 1662.
  45. Xie, B.; Yuan, L.; Li, S.; Liu, C.H.; Cheng, X. Towards Fewer Annotations: Active Learning via Region Impurity and Prediction Uncertainty for Domain Adaptive Semantic Segmentation. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 8058–8068.
  46. Luo, X.; Li, X.; Wu, Y.; Hou, W.; Wang, M.; Jin, Y.; Xu, W. Research on Change Detection Method of High-Resolution Remote Sensing Images Based on Subpixel Convolution. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 1447–1457.
  47. Caye Daudt, R.; Le Saux, B.; Boulch, A. Fully Convolutional Siamese Networks for Change Detection. In Proceedings of the 2018 25th IEEE International Conference on Image Processing (ICIP), Athens, Greece, 7–10 October 2018; pp. 4063–4067.
  48. Raza, A.; Huo, H.; Fang, T. EUNet-CD: Efficient UNet++ for Change Detection of Very High-Resolution Remote Sensing Images. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5.
  49. Basavaraju, K.S.; Sravya, N.; Lal, S.; Nalini, J.; Reddy, C.S.; Acqua, D. UCDNet: A Deep Learning Model for Urban Change Detection From Bi-Temporal Multispectral Sentinel-2 Satellite Images. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–10.
  50. Cheng, G.; Wang, G.; Han, J. ISNet: Towards Improving Separability for Remote Sensing Image Change Detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–11.
  51. Maaten, L. Accelerating t-SNE using tree-based algorithms. J. Mach. Learn. Res. 2014, 15, 3221–3245.
Figure 1. Framework of the IRD-CD-UDA algorithm.
Figure 2. Structure of low-frequency illumination feature extraction network.
Figure 3. The results of the proposed sample selection strategy; the yellow dots are the changed region sample selection points, and the green areas are the unchanged region sample selection points. (a,b) are the LEVIR dataset's multi-temporal phase data; (c,d) are the GZ dataset's multi-temporal phase data; (a,c) are the unchanged region heat mask maps; and (b,d) are the changed region heat mask maps.
Figure 4. CD-UDA visualization results, where black is TN, white is TP, blue is FP, and red is FN. TN indicates that an unchanged pixel is correctly detected, TP indicates that a changed pixel is correctly detected, FN indicates that a changed pixel is incorrectly detected as unchanged, and FP indicates that an unchanged pixel is incorrectly detected as changed. (a–d) are LEVIR→SYSU (top) and GZ→SYSU (bottom) experimental results, and the UDA methods are, respectively, FCSiamD-UDA, DSDANet, and IRD-CD-UDA; (e–h) are LEVIR→GZ (top) and SYSU→GZ (bottom) experimental results, and the UDA methods are, respectively, FCSiamD-UDA, DSDANet, and MDCS-CD-UDA; (i–l) are SYSU→LEVIR (top) and GZ→LEVIR (bottom) experimental results, and the UDA methods are, respectively, FCSiamD-UDA, DSDANet, and IRD-CD-UDA.
Figure 5. Feature distribution before and after DA. $x_{t,uchg}$ is the transferable unchanged feature vector of a point in the target domain, $x_{t,uchg} \in x_t$; $x_{t,chg}$ is the transferable changed feature vector of a point in the target domain, $x_{t,chg} \in x_t$; $x_{s,uchg}$ is the transferable unchanged feature vector of a point in the source domain, $x_{s,uchg} \in x_s$; and $x_{s,chg}$ is the transferable changed feature vector of a point in the source domain, $x_{s,chg} \in x_s$. (a–c) Feature distribution visualization before DA. (d–f) Feature distribution visualization after DA.
Table 1. Overview information of experimental datasets.

| Name | Source | Ground Target Composition | $P_c : P_u$ |
|---|---|---|---|
| SYSU [42] | Aerial images | Urban construction, suburbs, pre-construction, vegetation, road, ocean | 0.20:1 |
| GZ [43] | Google Earth | Water bodies, roads, farmland, bare land, forests, buildings, ships | 0.10:1 |
| LEVIR [44] | Google Earth | Buildings, including urban and rural scenes | 0.05:1 |

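The $P_c : P_u$ column in Table 1 can be read as the ratio of changed to unchanged pixels in the ground-truth labels. A minimal sketch of that computation follows, assuming this reading of $P_c$ and $P_u$ and assuming binary label masks with 1 for changed and 0 for unchanged; the function name `change_ratio` is hypothetical.

```python
def change_ratio(label_masks):
    """Return P_c / P_u over a collection of 0/1 label masks.

    Assumption: P_c counts changed pixels (1) and P_u counts
    unchanged pixels (0), summed over every mask in the dataset.
    """
    changed = 0
    unchanged = 0
    for mask in label_masks:
        for row in mask:
            for value in row:
                if value == 1:
                    changed += 1
                else:
                    unchanged += 1
    if unchanged == 0:
        raise ValueError("no unchanged pixels; the ratio is undefined")
    return changed / unchanged


# Two hypothetical label masks: 1 changed pixel and 10 unchanged pixels.
masks = [
    [[1, 0, 0, 0], [0, 0, 0, 0]],
    [[0, 0, 0]],
]
ratio = change_ratio(masks)  # 1 / 10, i.e. "0.10:1" in the notation of Table 1
```

A small ratio such as 0.05:1 means the changed class is strongly under-represented, which is relevant when reading the per-class accuracies in the tables that follow.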
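Table 2. Performance of six CD models on three datasets.

| Data | Method | OA | UAcc | CAcc | F1 | mIoU | Params |
|---|---|---|---|---|---|---|---|
| GZ | DeepLab | 0.967 | 0.992 | 0.679 | 0.876 | 0.795 | 26.7M |
| GZ | FCSiamD | 0.980 | 0.991 | 0.870 | 0.933 | 0.884 | 2.1M |
| GZ | EUNet | 0.968 | 0.999 | 0.627 | 0.872 | 0.794 | 2.3M |
| GZ | UCDNet | 0.964 | 0.979 | 0.800 | 0.884 | 0.805 | 4.3M |
| GZ | ISNet | 0.968 | 0.969 | 0.951 | 0.907 | 0.839 | 12.3M |
| GZ | IRDNet | 0.985 | 0.989 | 0.933 | 0.909 | 0.908 | 6.3M |
| LEVIR | DeepLab | 0.994 | 0.998 | 0.733 | 0.927 | 0.838 | 26.7M |
| LEVIR | FCSiamD | 0.995 | 0.998 | 0.789 | 0.921 | 0.863 | 2.1M |
| LEVIR | EUNet | 0.992 | 0.997 | 0.637 | 0.856 | 0.775 | 2.3M |
| LEVIR | UCDNet | 0.995 | 0.999 | 0.774 | 0.920 | 0.862 | 4.3M |
| LEVIR | ISNet | 0.994 | 0.996 | 0.895 | 0.918 | 0.859 | 12.3M |
| LEVIR | IRDNet | 0.996 | 0.998 | 0.847 | 0.936 | 0.886 | 6.3M |
| SYSU | DeepLab | 0.895 | 0.953 | 0.687 | 0.824 | 0.732 | 26.7M |
| SYSU | FCSiamD | 0.897 | 0.942 | 0.730 | 0.844 | 0.741 | 2.1M |
| SYSU | EUNet | 0.864 | 0.944 | 0.575 | 0.781 | 0.661 | 2.3M |
| SYSU | UCDNet | 0.888 | 0.918 | 0.780 | 0.840 | 0.734 | 4.3M |
| SYSU | ISNet | 0.893 | 0.954 | 0.670 | 0.832 | 0.726 | 12.3M |
| SYSU | IRDNet | 0.905 | 0.962 | 0.697 | 0.851 | 0.751 | 6.3M |
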
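The metric columns used in these tables (OA, UAcc, CAcc, F1, mIoU) can all be derived from the four confusion counts TP, TN, FP, and FN. The sketch below is an illustration under stated assumptions rather than the authors' evaluation code: it assumes UAcc and CAcc are the per-class recalls of the unchanged and changed classes, and that F1 and mIoU are averaged over the two classes. The function name `cd_metrics` and the example counts are hypothetical.

```python
def _safe_div(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def cd_metrics(tp, tn, fp, fn):
    """Compute change detection metrics from pixel-level confusion counts.

    Assumed definitions (not confirmed by this section of the paper):
      OA   = overall accuracy
      UAcc = recall of the unchanged class
      CAcc = recall of the changed class
      F1   = mean of the per-class F1 scores
      mIoU = mean of the per-class IoU scores
    """
    total = tp + tn + fp + fn
    f1_changed = _safe_div(2 * tp, 2 * tp + fp + fn)
    f1_unchanged = _safe_div(2 * tn, 2 * tn + fp + fn)
    iou_changed = _safe_div(tp, tp + fp + fn)
    iou_unchanged = _safe_div(tn, tn + fp + fn)
    return {
        "OA": _safe_div(tp + tn, total),
        "UAcc": _safe_div(tn, tn + fp),
        "CAcc": _safe_div(tp, tp + fn),
        "F1": (f1_changed + f1_unchanged) / 2,
        "mIoU": (iou_changed + iou_unchanged) / 2,
    }


# Hypothetical counts for a model that predicts almost everything as unchanged.
scores = cd_metrics(tp=1, tn=787, fp=1, fn=211)
```

With these counts OA and UAcc stay high while CAcc is close to zero, which illustrates why OA alone can be misleading when changed pixels are rare.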
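Table 3. Ablation experimental results for domain adaptation strategies with different source domains.

| D0 | D1 | D2 | Data | OA | UAcc | CAcc | F1 | mIoU |
|---|---|---|---|---|---|---|---|---|
| ✓ | × | × | G→L | 0.973 | 0.992 | 0.232 | 0.643 | 0.568 |
| ✓ | ✓ | × | G→L | 0.964 | 0.989 | 0.257 | 0.646 | 0.577 |
| ✓ | ✓ | ✓ | G→L | 0.974 | 0.990 | 0.305 | 0.673 | 0.597 |
| ✓ | × | × | G→S | 0.796 | 0.992 | 0.076 | 0.511 | 0.434 |
| ✓ | ✓ | × | G→S | 0.792 | 0.984 | 0.087 | 0.517 | 0.435 |
| ✓ | ✓ | ✓ | G→S | 0.797 | 0.966 | 0.188 | 0.584 | 0.478 |
| ✓ | × | × | L→S | 0.786 | 0.976 | 0.088 | 0.514 | 0.432 |
| ✓ | ✓ | × | L→S | 0.785 | 0.981 | 0.065 | 0.496 | 0.421 |
| ✓ | ✓ | ✓ | L→S | 0.777 | 0.958 | 0.111 | 0.524 | 0.434 |
| ✓ | × | × | L→G | 0.941 | 0.994 | 0.138 | 0.597 | 0.534 |
| ✓ | ✓ | × | L→G | 0.941 | 0.987 | 0.247 | 0.657 | 0.574 |
| ✓ | ✓ | ✓ | L→G | 0.940 | 0.978 | 0.361 | 0.698 | 0.606 |
| ✓ | × | × | S→L | 0.939 | 0.953 | 0.333 | 0.587 | 0.521 |
| ✓ | ✓ | × | S→L | 0.843 | 0.845 | 0.769 | 0.551 | 0.472 |
| ✓ | ✓ | ✓ | S→L | 0.951 | 0.966 | 0.348 | 0.614 | 0.548 |
| ✓ | × | × | S→G | 0.671 | 0.654 | 0.924 | 0.524 | 0.400 |
| ✓ | ✓ | × | S→G | 0.852 | 0.856 | 0.788 | 0.657 | 0.547 |
| ✓ | ✓ | ✓ | S→G | 0.877 | 0.886 | 0.733 | 0.678 | 0.571 |
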
Table 4. Effect of the initial probability threshold $T_p$ (F1/mIoU).

| $T_p$ | 0.6 | 0.7 | 0.8 | 0.9 |
|---|---|---|---|---|
| G→L | 0.534/0.509 | 0.608/0.549 | 0.613/0.608 | 0.615/0.556 |
| G→S | 0.497/0.420 | 0.558/0.522 | 0.562/0.531 | 0.568/0.511 |
| L→S | 0.488/0.418 | 0.485/0.417 | 0.502/0.419 | 0.497/0.418 |
| L→G | 0.613/0.544 | 0.643/0.565 | 0.654/0.573 | 0.660/0.581 |
| S→L | 0.542/0.499 | 0.579/0.531 | 0.582/0.532 | 0.610/0.553 |
| S→G | 0.533/0.495 | 0.594/0.543 | 0.597/0.548 | 0.604/0.493 |

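To illustrate how an initial probability threshold such as $T_p$ might enter an easy-to-hard sample selection strategy, the following is a minimal sketch under stated assumptions. The selection rule (keep target samples whose predicted class confidence is at least the threshold) and the linearly decaying schedule are generic pseudo-labeling conventions, not the rule defined in the paper; the function names, the `step` and `floor` parameters, and the example probabilities are all hypothetical.

```python
def select_pseudo_labels(change_probs, threshold):
    """Return (index, pseudo_label) pairs for confidently predicted samples.

    change_probs: predicted probability of the 'changed' class per target sample.
    A sample is kept when max(p, 1 - p) >= threshold (assumed rule).
    """
    selected = []
    for index, p in enumerate(change_probs):
        confidence = max(p, 1.0 - p)
        if confidence >= threshold:
            selected.append((index, 1 if p >= 0.5 else 0))
    return selected


def easy_to_hard_schedule(initial_threshold, rounds, step=0.05, floor=0.5):
    """Lower the threshold each round so harder samples are admitted later.

    The linear decay, step size and floor are assumptions for illustration.
    """
    return [
        max(floor, round(initial_threshold - r * step, 2))
        for r in range(rounds)
    ]


# Hypothetical target-domain probabilities for four samples.
probs = [0.95, 0.55, 0.10, 0.70]
thresholds = easy_to_hard_schedule(0.8, rounds=3)
first_round = select_pseudo_labels(probs, thresholds[0])
last_round = select_pseudo_labels(probs, thresholds[-1])
```

In this sketch a higher initial threshold admits fewer but more reliable pseudo-labels in early rounds, which is one way to interpret the sensitivity to $T_p$ reported in Table 4.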
Table 5. Effect of the initial probability threshold $\upsilon$ (F1/mIoU).

| $\upsilon$ | 0.6 | 0.7 | 0.8 | 0.9 |
|---|---|---|---|---|
| G→L | 0.583/0.536 | 0.655/0.580 | 0.673/0.597 | 0.658/0.584 |
| G→S | 0.535/0.456 | 0.557/0.478 | 0.584/0.479 | 0.542/0.461 |
| L→S | 0.485/0.416 | 0.497/0.418 | 0.523/0.434 | 0.510/0.428 |
| L→G | 0.627/0.564 | 0.684/0.652 | 0.698/0.610 | 0.661/0.585 |
| S→L | 0.596/0.529 | 0.615/0.551 | 0.614/0.548 | 0.614/0.549 |
| S→G | 0.608/0.549 | 0.627/0.564 | 0.659/0.562 | 0.627/0.564 |

Table 6. Performance of CD-UDA with the GZ source domain and the LEVIR target domain (GZ→LEVIR).

| Method | OA | UAcc | CAcc | F1 | mIoU |
|---|---|---|---|---|---|
| DeepLab | 0.970 | 0.992 | 0.095 | 0.559 | 0.521 |
| FCSiamD | 0.973 | 0.994 | 0.087 | 0.551 | 0.522 |
| EUNet | 0.976 | 0.999 | 0.030 | 0.522 | 0.502 |
| UCDNet | 0.974 | 0.997 | 0.044 | 0.460 | 0.507 |
| ISNet | 0.969 | 0.988 | 0.218 | 0.621 | 0.558 |
| IRDNet | 0.974 | 0.991 | 0.176 | 0.609 | 0.552 |
| IRD-CD-UDA-Source | 0.969 | 0.988 | 0.930 | 0.899 | 0.909 |
| UCDNet-UDA | 0.966 | 0.998 | 0.052 | 0.541 | 0.513 |
| FCSiamD-UDA | 0.966 | 0.988 | 0.074 | 0.541 | 0.509 |
| DSDANet | 0.941 | 0.994 | 0.126 | 0.589 | 0.528 |
| IRD-CD-UDA | 0.974 | 0.990 | 0.305 | 0.673 | 0.597 |

Table 7. Performance of CD-UDA with the GZ source domain and the SYSU target domain (GZ→SYSU).

| Method | OA | UAcc | CAcc | F1 | mIoU |
|---|---|---|---|---|---|
| DeepLab | 0.794 | 0.996 | 0.053 | 0.491 | 0.422 |
| FCSiamD | 0.793 | 0.995 | 0.054 | 0.508 | 0.422 |
| EUNet | 0.790 | 0.993 | 0.044 | 0.482 | 0.415 |
| UCDNet | 0.789 | 0.998 | 0.020 | 0.532 | 0.404 |
| ISNet | 0.786 | 0.979 | 0.077 | 0.505 | 0.427 |
| IRDNet | 0.789 | 0.980 | 0.081 | 0.510 | 0.430 |
| IRD-CD-UDA-Source | 0.979 | 0.990 | 0.927 | 0.901 | 0.901 |
| UCDNet-UDA | 0.791 | 0.977 | 0.101 | 0.527 | 0.440 |
| FCSiamD-UDA | 0.785 | 0.991 | 0.039 | 0.485 | 0.408 |
| DSDANet | 0.788 | 0.992 | 0.039 | 0.477 | 0.412 |
| IRD-CD-UDA | 0.800 | 0.966 | 0.188 | 0.585 | 0.478 |

Table 8. Performance of CD-UDA with the LEVIR source domain and the SYSU target domain (LEVIR→SYSU).

| Method | OA | UAcc | CAcc | F1 | mIoU |
|---|---|---|---|---|---|
| DeepLab | 0.787 | 0.999 | 0.005 | 0.445 | 0.395 |
| FCSiamD | 0.788 | 0.999 | 0.011 | 0.451 | 0.399 |
| EUNet | 0.787 | 0.999 | 0.005 | 0.445 | 0.396 |
| UCDNet | 0.787 | 0.999 | 0.006 | 0.447 | 0.396 |
| ISNet | 0.789 | 0.997 | 0.023 | 0.462 | 0.405 |
| IRDNet | 0.786 | 0.999 | 0.004 | 0.444 | 0.395 |
| IRD-CD-UDA-Source | 0.989 | 0.991 | 0.851 | 0.928 | 0.881 |
| UCDNet-UDA | 0.792 | 0.997 | 0.040 | 0.479 | 0.415 |
| FCSiamD-UDA | 0.786 | 0.999 | 0.046 | 0.453 | 0.405 |
| DSDANet | 0.786 | 0.999 | 0.054 | 0.441 | 0.399 |
| IRD-CD-UDA | 0.787 | 0.958 | 0.112 | 0.524 | 0.434 |

Table 9. Performance of CD-UDA with the LEVIR source domain and the GZ target domain (LEVIR→GZ).

| Method | OA | UAcc | CAcc | F1 | mIoU |
|---|---|---|---|---|---|
| DeepLab | 0.933 | 0.978 | 0.258 | 0.613 | 0.564 |
| FCSiamD | 0.942 | 0.994 | 0.152 | 0.608 | 0.541 |
| EUNet | 0.928 | 0.967 | 0.338 | 0.665 | 0.576 |
| UCDNet | 0.941 | 0.994 | 0.131 | 0.593 | 0.531 |
| ISNet | 0.913 | 0.988 | 0.228 | 0.648 | 0.621 |
| IRD-CD-UDA-Source | 0.991 | 0.996 | 0.851 | 0.938 | 0.888 |
| IRDNet | 0.948 | 0.992 | 0.299 | 0.613 | 0.607 |
| IRD-CD-UDA-Source | 0.989 | 0.991 | 0.851 | 0.928 | 0.881 |
| UCDNet-UDA | 0.912 | 0.953 | 0.251 | 0.605 | 0.528 |
| FCSiamD-UDA | 0.917 | 0.959 | 0.288 | 0.629 | 0.547 |
| DSDANet | 0.927 | 0.968 | 0.304 | 0.652 | 0.566 |
| IRD-CD-UDA | 0.950 | 0.980 | 0.361 | 0.698 | 0.610 |

Table 10. Performance of CD-UDA with the SYSU source domain and the LEVIR target domain (SYSU→LEVIR).

| Method | OA | UAcc | CAcc | F1 | mIoU |
|---|---|---|---|---|---|
| DeepLab | 0.947 | 0.966 | 0.177 | 0.556 | 0.511 |
| FCSiamD | 0.950 | 0.969 | 0.207 | 0.571 | 0.521 |
| EUNet | 0.754 | 0.755 | 0.733 | 0.491 | 0.408 |
| UCDNet | 0.600 | 0.601 | 0.552 | 0.373 | 0.313 |
| ISNet | 0.847 | 0.851 | 0.676 | 0.574 | 0.470 |
| IRDNet | 0.875 | 0.880 | 0.664 | 0.567 | 0.493 |
| IRD-CD-UDA-Source | 0.901 | 0.958 | 0.691 | 0.844 | 0.749 |
| UCDNet-UDA | 0.901 | 0.922 | 0.272 | 0.537 | 0.486 |
| FCSiamD-UDA | 0.969 | 0.992 | 0.103 | 0.533 | 0.512 |
| DSDANet | 0.874 | 0.877 | 0.769 | 0.578 | 0.499 |
| IRD-CD-UDA | 0.951 | 0.966 | 0.348 | 0.614 | 0.548 |

Table 11. Performance of CD-UDA with the SYSU source domain and the GZ target domain (SYSU→GZ).

| Method | OA | UAcc | CAcc | F1 | mIoU |
|---|---|---|---|---|---|
| DeepLab | 0.800 | 0.800 | 0.792 | 0.596 | 0.493 |
| FCSiamD | 0.880 | 0.906 | 0.496 | 0.637 | 0.541 |
| EUNet | 0.536 | 0.508 | 0.957 | 0.438 | 0.310 |
| UCDNet | 0.752 | 0.766 | 0.537 | 0.461 | 0.431 |
| ISNet | 0.807 | 0.812 | 0.730 | 0.604 | 0.494 |
| IRDNet | 0.815 | 0.816 | 0.789 | 0.619 | 0.507 |
| IRD-CD-UDA-Source | 0.902 | 0.951 | 0.681 | 0.841 | 0.742 |
| UCDNet-UDA | 0.818 | 0.821 | 0.766 | 0.619 | 0.508 |
| FCSiamD-UDA | 0.877 | 0.886 | 0.733 | 0.668 | 0.570 |
| DSDANet | 0.781 | 0.777 | 0.843 | 0.597 | 0.481 |
| IRD-CD-UDA | 0.898 | 0.924 | 0.491 | 0.659 | 0.562 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Fan, R.; Xie, J.; Yang, J.; Hong, Z.; Xu, Y.; Hou, H. Multiscale Change Detection Domain Adaptation Model Based on Illumination–Reflection Decoupling. Remote Sens. 2024, 16, 799. https://doi.org/10.3390/rs16050799