An Adaptive Weighted Method for Remote Sensing Image Retrieval with Noisy Labels

Abstract: Due to issues with sample quality, there is an increasing interest in deep learning models that can handle noisy labels. Currently, the optimal way to deal with noisy labels is by combining robust active and passive loss functions. However, the weighting parameters for these functions are typically determined manually or through a large number of experimental iterations, and the optimal weighting parameters change with the dataset and the noise rate. This tuning is time-consuming and can lead to suboptimal results. Therefore, we propose an adaptively weighted method for the combined active passive loss (APL) in remote sensing image retrieval with noisy labels. First, two metrics are selected to measure the noisy samples: the ratio of the entropy to the standard deviation and the difference of the predicted probabilities. Then, an adaptive weighted learning network with a hidden layer is designed to dynamically learn the weighting parameters. The network takes the above two metrics as inputs and is trained concurrently with the feature extraction network in each batch, without significantly increasing the computational complexity. Extensive experiments demonstrate that our improved APL method outperforms the original manually weighted APL method and other state-of-the-art robust loss methods while saving the time spent on manual parameter selection.


Introduction
The demand for the fast and efficient retrieval of images from large remote sensing databases has become increasingly urgent due to military and civilian needs in geospatial information science [1]. Content-based remote sensing image retrieval (CBRSIR) is an effective method to meet this demand [2] and has attracted an increasing amount of research interest.
In recent years, various methods based on deep learning models, which can automatically learn the high-level semantic features of remote sensing images, have become the mainstream approach for CBRSIR [3][4][5]. These deep learning model-based methods are data-driven and require large amounts of sample data. Therefore, to reduce the cost of labelling large datasets, many researchers have proposed clustering, semi-automatic labelling and crowdsourcing methods in real-world application scenarios [6][7][8]. However, these methods can introduce noisy labels into the sample datasets. According to the literature [9], existing datasets contain between 8.0% and 38.5% noisy labels. Such noisy labels may lead to overfitting of the deep learning-based methods [10], which can reduce the performance of remote sensing image retrieval. For example, Li et al. [11] found that noisy labels significantly affect the accuracy of deep learning-based classifiers, which can also negatively impact retrieval results. Kang et al. [12] also confirmed that deep learning models are not robust enough in benchmarking datasets with noisy labels. Therefore, it is crucial to consider the effect of noisy labels during deep learning model training.
To reduce the effect of noisy labels, several robust loss functions have been proposed. For example, Ghosh et al. [13] proposed a mean absolute error (MAE) loss, which is more robust than cross-entropy (CE). However, it converges slowly due to gradient saturation. By combining the MAE with the faster converging CE, Zhang et al. [14] proposed a generalised cross-entropy (GCE). Inspired by the symmetry of Kullback-Leibler divergence, a robust symmetric cross-entropy (SCE) [15] was proposed by combining the robust reverse cross-entropy (RCE) and CE with two weighting hyperparameters. Additionally, Chen et al. [16] proposed an adaptive cross-entropy (ACE) by replacing the two weighting parameters of the SCE with the output probability of the deep learning network. This approach eliminates the need to manually select the weighting parameters for the SCE. Furthermore, Ma et al. [17] categorised existing robust losses into active and passive losses and introduced an active passive loss (APL) by combining the two different types of robust losses with two weighting hyperparameters. This method provides a theoretical explanation for the effectiveness of the combination losses in combating noisy labels. Compared with other robust methods, the APL method has the best results. However, the APL method is time-consuming, and it may not always achieve optimal performance, as the two weighting parameters are typically determined manually or through a large number of experimental iterations. Worse, the two weighting parameters vary with the dataset and the noise rate. This limitation restricts its widespread use in real-world applications, especially in contexts where noise levels in datasets are unknown. In addition, the ACE parameter replacement method is not applicable to APL.
To solve this problem, we propose an adaptive weighted method for the robust APL method in remote sensing image retrieval. Specifically, a new metric based on the entropy and standard deviation of the predicted probabilities of the samples is developed to represent the complexity of the sample. Additionally, the predicted probability difference is chosen as a second metric of whether the sample has a noisy label. Then, an adaptive weighted learning network (AWNet) with one hidden layer is designed to dynamically learn the weighting parameters in each training batch using the above two metrics as inputs. Our code is available at https://github.com/GeoRSAI/APL_AWNet (accessed on 24 January 2024).
The rest of this paper is organised as follows. Section 2 reviews related works on CBRSIR based on deep learning, robust loss functions and multilayer perceptron for remote sensing. Section 3 details our proposed method, including the framework, two metrics and AWNet. The experimental results and analysis are presented in Section 4, while Section 5 provides the conclusion of this paper.

CBRSIR Based on Deep Learning
Deep learning has been extensively applied in CBRSIR and has achieved excellent performance. It has gradually replaced the low- and mid-level feature-based methods. This is due to the ability of deep neural networks to extract high-level semantic features, which can better represent the content of remote sensing images. For instance, Zhou et al. [18] trained a mainstream convolutional neural network model on a remote sensing dataset, and the performance of the model was significantly better than that of the low- and mid-level feature-based methods. Other research has attempted to enhance feature extraction by using more complex network structures, such as the contrastive self-supervised learning network [19] and attention mechanisms [20,21]. On clean datasets, many of the current methods have achieved near-saturation retrieval accuracy. However, when the training data contains noisy labels, the model tends to overfit to noisy samples. This significantly reduces the accuracy of the classification model and, subsequently, the retrieval accuracy. To solve the above problem, Li et al. [22] proposed a fault-tolerant deep learning method for remote sensing scene classification. The method utilises ensemble learning to enhance the accuracy of the error correction of noisy labels, with data cleaning as the underlying concept. However, the method fuses several large networks and iterates many times in the training process, so the number of parameters and computations is large. Damodaran et al. [23] proposed a loss with entropic optimal transport (CLEOT), which designs a robust loss by exploring the joint distribution of images and labels. It achieved good results but performed poorly at low noise proportions. Overall, there is a need to complement the research on remote sensing image retrieval with noisy labels.

Noise Robust Loss Functions
A loss function is considered noise-robust if the classifier achieves the same classification accuracy on both noisy and noise-free data [24]. Compared to other robust methods such as relabelling [25] and sample importance weighting [26], robust loss is a simpler and more general method. Currently, symmetric loss is one of the mainstream robust loss functions [27,28], with examples such as MAE [13] and RCE [15]. To make any loss symmetric, Ma et al. [17] proposed normalised loss functions using a simple normalisation operation. However, this operation actually changes the form of loss functions, which leads to difficulties in optimisation. Consequently, the fitting ability of the symmetric loss function is limited by the symmetry condition [29], and the model underfitting problem is prone to occur. Inspired by the advantages of symmetric [15] or complementary learning [30], the APL [17] framework was proposed for robust and sufficient learning. It combines an active loss and a passive loss that mutually reinforce each other, which solves the model underfitting problem. Although APL has shown excellent performance among many robust loss functions, it has two hyperparameters (α and β), and it cannot maintain optimal performance across datasets without tuning them. Recently, asymmetric loss functions have been proposed as a new class of robust loss functions by Zhou et al. [31]. Several asymmetric loss functions, such as asymmetric unhinged loss (AUL) and asymmetric exponential loss (AEL), have also been proposed by Zhou et al. [31]. However, this type of asymmetric loss function requires that each sample in the training dataset has a higher probability of being labelled with a true semantic label than any other class of labels during image classification. This means that it is ineffective for noisy label types that are easily confused. Therefore, in this paper, the active passive loss function (APL) is still chosen as the baseline to be improved for remote sensing image retrieval with noisy labels.

Multilayer Perceptron for Remote Sensing
As a representative neural network structure, the multilayer perceptron (MLP) is widely used in remote sensing tasks such as remote sensing image classification [32][33][34], object detection [35] and change detection [36,37]. The MLP can have multiple hidden layers between the input and output layers, but the simplest MLP has only one hidden layer. Despite its simple structure, numerous computer vision experiments [38] have demonstrated that the MLP has feature representation capabilities comparable to those of traditional convolution and transformers, even under complex network and large dataset conditions. In addition, the MLP has excellent compatibility with CNNs [39]. For example, by combining the spectral features extracted by MLP with the spatial features represented by a CNN, Zhang et al. [40] used a rule-based decision fusion method to integrate the MLP with the CNN and achieved excellent classification accuracy of very fine spatial resolution remote sensing images. Moreover, inspired by meta-learning [41], the MLP can be used for the automatic determination of hyperparameters such as weights. Therefore, we adopt the MLP as the main structure of our AWNet to automatically acquire active and passive loss weights, which, in turn, improves its generalisation.

Methodology
In this section, we first introduce the APL and analyse its limitations. Then, we present the framework of our method, which consists of feature extraction, AWNet and image retrieval components. Next, we explain the meaning and role of the two metrics required for the adaptive determination of the APL weights, namely the ratio of entropy to standard deviation (abbreviated as RES) and δ. Finally, we introduce our core method of the AWNet and describe the algorithm in detail, showing the connection between the AWNet, APL weights and classification.

The Active Passive Loss (APL)
To address the model underfitting problem caused by noisy labels, Ma et al. [17] proposed a framework to construct robust loss functions called APL, which combines the active loss (i.e., CE, normalised CE, focal loss and normalised focal loss) and the passive loss (i.e., MAE, normalised MAE, RCE and normalised RCE). The APL loss $\mathcal{L}_{\mathrm{APL}}$ can be defined as follows:

$$\mathcal{L}_{\mathrm{APL}} = \alpha \cdot \mathcal{L}_{\mathrm{Active}} + \beta \cdot \mathcal{L}_{\mathrm{Passive}} \tag{1}$$

where the parameters α and β weight the two robust losses and α, β > 0. The term $\mathcal{L}_{\mathrm{Active}}$ represents active losses that explicitly maximise the network's output probability at the class position specified by the label. The term $\mathcal{L}_{\mathrm{Passive}}$ represents passive losses, which explicitly minimise the probability at at least one other class position. Therefore, for noisy samples, more passive learning can preserve the effective information of the samples and avoid misguiding the model. Conversely, more active learning can speed up the learning of the model and avoid underfitting. However, we found that, for different datasets, the weights of active and passive losses need to be manually adjusted to achieve the best performance, which increases the training cost and limits the generalisation performance of the model. The aim of our method is to automatically adjust the weighting hyperparameters α and β in remote sensing image retrieval.
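As a minimal sketch of how Equation (1) combines an active and a passive term, the following Python snippet evaluates αNCE + βRCE for a single sample. The function names, the example probabilities and the constant used in place of log 0 in the RCE (here −4) are illustrative assumptions rather than the settings used in this paper; the NCE and RCE forms follow their usual definitions in [17] and [15].

```python
import math


def nce(probs, label):
    """Normalised cross-entropy (active): log p_y / sum_k log p_k.

    Assumes every probability is strictly positive.
    """
    logs = [math.log(p) for p in probs]
    return logs[label] / sum(logs)


def rce(probs, label, log_zero=-4.0):
    """Reverse cross-entropy (passive) for a one-hot label.

    log(0) is replaced by the constant `log_zero` (an assumed value),
    which reduces the loss to -log_zero * (1 - p_y).
    """
    return -log_zero * (1.0 - probs[label])


def apl(probs, label, alpha=1.0, beta=1.0):
    """Weighted sum of the active and passive terms, as in Equation (1)."""
    return alpha * nce(probs, label) + beta * rce(probs, label)


probs = [0.7, 0.2, 0.1]          # softmax output for a three-class toy sample
loss_if_label_agrees = apl(probs, label=0)
loss_if_label_disagrees = apl(probs, label=2)
```

With these toy values, the loss is larger when the given label disagrees with the prediction, and changing α and β shifts how much of that loss comes from the active or the passive term.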

Framework of Our Method
The framework of our method is shown in Figure 1. Our method mainly consists of a feature extraction module, the AWNet module and a querying module.
The feature extraction module can use any type of convolutional neural network model (e.g., ResNet, DenseNet, MobileNet, etc.) with the APL method to learn image features from training samples with noisy labels. The AWNet module aims to dynamically adjust the weighting hyperparameters of the APL method according to the predicted probability of the feature extraction module. In the training stage, two different metrics for the complexity and noise level of each sample image in a batch can be computed based on the predicted probability of the feature extraction module. The metric values of all images within the batch are used together as inputs to the AWNet module. With the AWNet, a different set of weighting hyperparameters, α and β, is obtained for each sampled image. To mitigate the negative effects of noisy labels in training samples, it is recommended to use more passive losses, which requires a large β. Conversely, for clean training samples with correct labels, it is advisable to use more active losses to facilitate quick convergence of the model, which requires a large α. The active passive loss can then be calculated for each image. The average APL of all images within a batch is used to update the classifier parameters of the image feature extraction module. In turn, the average value of the APL within the same batch is recalculated using the newly updated classifier and used to update the AWNet. Details of the two metrics and the adaptive weighted learning network are described in the following section.
In the query stage, we first use the optimal model obtained in the training stage to extract the features of the query image. Then, we calculate the similarity between the feature vector of the query image and the feature vector of each image in the database. In this paper, the Euclidean distance is selected as the similarity metric. Finally, based on the descending order of image feature similarity (i.e., the ascending order of Euclidean distance), the top-ranked images are selected as the final retrieval results.
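The query stage can be illustrated with a small Python sketch. It ranks database images by Euclidean distance to the query feature, so the most similar images (smallest distance) come first. The feature vectors, image names and the `retrieve` helper are hypothetical and stand in for the features extracted by the trained model.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query_feature, database, top_k=3):
    """Return the names of the top_k database images closest to the query.

    `database` maps an image name to its feature vector.
    """
    ranked = sorted(database.items(),
                    key=lambda item: euclidean(query_feature, item[1]))
    return [name for name, _ in ranked[:top_k]]


# Toy 2-D features; real features would come from the backbone network.
database = {
    "harbour_01": [0.9, 0.1],
    "forest_07": [0.1, 0.8],
    "harbour_12": [0.8, 0.2],
    "desert_03": [0.5, 0.5],
}
results = retrieve([1.0, 0.0], database, top_k=2)
```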

Two Metrics
The weighting hyperparameters of the combined robust loss are closely related to the training sample. For example, complex datasets require more active learning (such as a large α) and less passive learning (such as a small β) to achieve good performance [17]. Furthermore, the weighting parameters are correlated with the noise rate in the training samples, because noisy samples require more passive learning. Therefore, two metrics are designed to reflect the sample complexity and noise level.
In the field of digital image processing, the complexity of an image is reflected by its information entropy $E(\cdot)$, which is defined in Equation (2). Furthermore, Equation (3) defines the standard deviation $S(\cdot)$ of the output layer of the classifier. This indirectly reflects the complexity of the sample, because a more complex image is more difficult for the classifier, resulting in smaller differences among the predicted probabilities of the categories in the output layer and a lower value of $S(\cdot)$.

$$E(p_i) = -\sum_{i=1}^{N} p_i \log p_i \tag{2}$$

$$S(p_i) = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(p_i - \hat{p}_i\right)^2} \tag{3}$$

where $N$ represents the total number of labels, $p_i$ represents the predicted probability of each class from the output layer of the classifier, and $\hat{p}_i$ means the average value of $p_i$. It is important to note that the value of $E(\cdot)$ increases and the value of $S(\cdot)$ decreases as the image complexity increases. Therefore, as shown in Equation (4), the first metric is defined as the ratio of entropy to standard deviation (abbreviated as RES). The higher the RES, the higher the complexity of the sample.

$$\mathrm{RES} = \frac{E(p_i)}{S(p_i)} \tag{4}$$

In addition, the probability difference δ [42] is selected as the second metric, which can be used to determine the noise level of the sample. It is defined as follows:

$$\delta = p_y - p_o \tag{5}$$

where $p_y$ denotes the probability of being predicted as the correct category by the classifier, $p_o$ denotes the maximum probability of being predicted as one of the other, incorrect categories, and $\delta \in [-1, 1]$. The smaller the δ value of a sample when δ < 0, the more likely it is to be a noisy sample.
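The two metrics can be computed from a single softmax output, as the following Python sketch shows. It assumes the Shannon entropy with the natural logarithm, the population standard deviation over the class probabilities, and a small constant `eps` to avoid division by zero for a perfectly uniform prediction; these choices, the function names and the example probabilities are illustrative assumptions.

```python
import math


def entropy(probs):
    """Shannon entropy of the predicted class probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def std(probs):
    """Population standard deviation of the predicted class probabilities."""
    mean = sum(probs) / len(probs)
    return math.sqrt(sum((p - mean) ** 2 for p in probs) / len(probs))


def res(probs, eps=1e-12):
    """Ratio of entropy to standard deviation (RES); eps is an assumed guard."""
    return entropy(probs) / (std(probs) + eps)


def delta(probs, label):
    """Probability of the labelled class minus the largest other probability."""
    largest_other = max(p for i, p in enumerate(probs) if i != label)
    return probs[label] - largest_other


confident = [0.90, 0.05, 0.05]   # easy sample: low entropy, high spread
ambiguous = [0.40, 0.35, 0.25]   # complex sample: high entropy, low spread

res_confident = res(confident)
res_ambiguous = res(ambiguous)
delta_clean = delta(confident, label=0)   # label agrees with the prediction
delta_noisy = delta(confident, label=1)   # label disagrees with the prediction
```

In this toy case the ambiguous prediction has the larger RES, and δ is negative when the label disagrees with a confident prediction, which matches the interpretation of the two metrics given above.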

Adaptive Weighted Learning Network
Inspired by meta-learning [41], we expect the model to be able to learn the relationship between training samples and weighting hyperparameters automatically. At the same time, we aim to achieve this without incurring high computational costs. To accomplish this, we construct an adaptive weight learning network (AWNet) based on a multilayer perceptron. The network consists of an input layer, a hidden layer (with 100 neurons) and an output layer. The hidden and output layers each consist of a linear model and an activation function. The gradient of ReLU is either 1 or 0, which helps to avoid the issues of gradient explosion and gradient disappearance. Therefore, the activation function of the hidden layer is ReLU, and the activation function of the output layer is Sigmoid. Full connectivity is used between the different layers. Specifically, the complexity and noise level of the training samples are expressed quantitatively by two metrics: RES and δ. The two metrics are then used as the input of AWNet to fit their relationship with α and β, which can automatically weigh each sample. The forward computation procedure of AWNet can be written as

$$[\alpha, \beta] = \mu\left(W_2^{T}\,\mathrm{ReLU}\left(W_1^{T}\,[\mathrm{RES}, \delta]\right)\right) \tag{6}$$

where μ and ReLU are the Sigmoid and ReLU activation functions, respectively. $W_1^{T}$ and $W_2^{T}$ are the feature weights between the input layer and the hidden layer and between the hidden layer and the output layer, respectively.
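The forward pass of such a network can be sketched in plain Python as follows. The sketch assumes a 2-100-2 layout without bias terms, small uniformly random initial weights and a numerically stable Sigmoid; the function names and the initialisation are illustrative assumptions and do not reproduce the released implementation.

```python
import math
import random


def relu(x):
    return x if x > 0.0 else 0.0


def sigmoid(x):
    """Numerically stable Sigmoid, safe for large positive or negative inputs."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def init_awnet(hidden=100, seed=0):
    """Random weights for a 2 -> hidden -> 2 network (assumed initialisation)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(2)] for _ in range(hidden)]
    w2 = [[rng.uniform(-0.1, 0.1) for _ in range(hidden)] for _ in range(2)]
    return w1, w2


def awnet_forward(params, res_value, delta_value):
    """Map the two metrics (RES, delta) to the per-sample weights (alpha, beta)."""
    w1, w2 = params
    x = [res_value, delta_value]
    hidden = [relu(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    alpha, beta = (sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2)
    return alpha, beta


params = init_awnet()
alpha, beta = awnet_forward(params, res_value=17.3, delta_value=-0.5)
```

Because the output activation is a Sigmoid, both weights lie between 0 and 1 for every sample, whatever the scale of RES.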
Algorithm 1 describes the process for updating parameters of our proposed AWNet and the classifier during the training stage. To ensure the reliability of the classifier's output probability, we employ a fixed-weight APL (α = 1, β = 1) for pretraining the classifier in the first few epochs. In our experiments, we set the number of pretraining epochs to 3. Detailed information about this setting will be introduced in Section 4.5. Once the pretraining is completed, the classifier proceeds to the formal training stage. In this stage, we calculate the RES and δ for each sample by using the output probabilities of the classifier. These two metrics are then input into AWNet to obtain α and β for each sample. Subsequently, we update the parameters of the classifier by Equation (1) and then input the same samples into the updated classifier to update the AWNet parameters with the new loss. The classifier and AWNet parameters are iteratively updated until training is complete. Our method can be trained directly on noisy data without the need for additional clean data as a guide.

Datasets and Experimental Setup
Three widely used public remote sensing image datasets, namely the UC-Merced dataset (UCMD) [43], the aerial image dataset (AID) [44] and the Northwestern Polytechnical University dataset (NWPU) [45], are used to evaluate the effectiveness of our proposed method. The UCMD dataset [43] consists of 21 categories, and each category contains 100 images with a size of 256 × 256 pixels. The AID dataset [44] contains 30 categories and a total of 10,000 images, with 220-420 images per category. The image size is 600 × 600 pixels. The NWPU dataset [45] contains 31,500 images and 45 categories. There are 700 images per category, and each image is 256 × 256 pixels in size. These three datasets can be used to validate the robustness of the method, as they have different levels of complexity and intra-class diversity. Examples of all categories in the three datasets are shown in Figures 2-4. In the experiments, we randomly select 60% of the images as the training set, 20% of the images as the validation set and 20% of the images as the test set. To evaluate the effectiveness of the methods against noisy labels, we add different proportions of symmetric noise (e.g., 5%, 10%, 20% and 30%) to simulate noisy labels in the training and validation sets. Specifically, the label is flipped uniformly across all the classes with probability η, regardless of the similarity between the classes. In this case, the label transition matrix E has the entries 1 − η on the diagonal, with the remaining probability η spread evenly over the off-diagonal elements of each row.
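A symmetric noise injection of this kind can be sketched in Python as below. The sketch assumes that a flipped label is drawn uniformly from the other classes, so that each off-diagonal entry of the transition matrix equals the flip probability divided by the number of other classes; this convention, the function names and the toy label list are assumptions made for illustration.

```python
import random


def transition_matrix(num_classes, eta):
    """Row-stochastic matrix: 1 - eta on the diagonal, eta spread evenly elsewhere."""
    off_diagonal = eta / (num_classes - 1)
    return [[1.0 - eta if i == j else off_diagonal for j in range(num_classes)]
            for i in range(num_classes)]


def add_symmetric_noise(labels, num_classes, eta, seed=0):
    """Flip each label with probability eta to a uniformly chosen other class."""
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        if rng.random() < eta:
            noisy.append(rng.choice([c for c in range(num_classes) if c != y]))
        else:
            noisy.append(y)
    return noisy


# Toy example with 21 classes (as in UCMD) and a 20% noise rate.
clean_labels = [i % 21 for i in range(21000)]
noisy_labels = add_symmetric_noise(clean_labels, num_classes=21, eta=0.2)
flipped_fraction = sum(c != n for c, n in zip(clean_labels, noisy_labels)) / len(clean_labels)
matrix = transition_matrix(21, 0.2)
```

The fraction of flipped labels approaches the chosen noise rate as the number of samples grows.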
All experiments are repeated three times to ensure the reliability of the results. All methods are performed with ImageNet pretrained classifiers and the Adam optimiser. The initial learning rate and weight decay are both set to 0.00015. To avoid unreliable noisy sample evaluation results at the beginning of training, α and β are set to 1 for the first three epochs. Our method's feature extraction module uses ResNet50 as the basic backbone. AWNet also uses Adam as the optimiser, with an initial learning rate of 0.001 and a weight decay of 0.0001. All models are trained using PyTorch 1.8.2 on a single NVIDIA GeForce RTX 3090 GPU with a batch size of 128 for 20 epochs.

Experiments on Adaptive Weights versus Manual Weights
In these experiments, we chose αNCE + βRCE as a representative of the APL. Tables 1-3 display the retrieval performance of the CE, the APL with 12 manual weights and our method on three different datasets with varying noise rates. The green and red fonts represent the lowest and highest retrieval performance of the 12 manually weighted combinations of αNCE + βRCE, respectively.
Table 1. mAPs (%) (mean ± standard deviation) on the UCMD dataset for adaptive weights versus manual weights.

It can be seen from Tables 1-3 that the original APL method outperforms the traditional CE method in the presence of noisy labels. However, the weighting hyperparameters required for optimal performance vary across the three datasets and noise rates. This means that, as the dataset and noise rate change, the original APL method must be repeatedly tuned. This is a time-consuming process with uncertain results. Furthermore, it can also be seen that our method achieves the best performance on the three datasets at different noise rates. This indicates that our adaptive weighted method is effective for the robust APL method. In Table 4, we compare the image classification accuracy between adaptive weights and manual weights. It is evident that our method is more conducive to improving the feature extraction ability of the model and achieves higher image classification accuracy compared to manual weights. Therefore, our method is also suitable for image-level computer vision tasks based on high-level features such as image classification.

Comparison with Various SOTA Losses
In these experiments, we apply our method to four types of APLs on the UCMD dataset with 20% noise rate (results in Table 5) and compare them with seven state-of-the-art robust losses on the NWPU dataset with 20% noise rate (results in Table 6). These methods are described below.
(1) MAE [13]: It is a passive and symmetric loss, as described in Section 1. Although it can maintain gradient stability for different input values, its training is limited due to slow convergence.
(2) GCE [14]: It is also a passive and symmetric loss, as described in Section 1. Its robustness is achieved by combining the MAE with the CE, and it is only robust when reduced to the MAE loss.
(3) RCE [15]: It can be seen as the reverse version of the CE, as it exchanges the positions of the predicted probability and the one-hot coding in the formula of the cross-entropy loss function. However, it also converges slowly.
(4) SCE [15]: It combines the CE loss with the RCE. Its robustness and convergence stability are guaranteed by RCE and CE, respectively. However, it requires the adjustment of two hyperparameters.
(5) ACE [16]: It uses the predicted probabilities of the true labels of the samples to adaptively determine the two weights in SCE. As this probability tends toward zero for a sample, the loss gradually transforms into RCE.
(6) AUL [31]: It is a noise-robust function that is an asymmetric version of the unhinged loss [27].
(7) AEL [31]: It is an asymmetric noise-robust function, which assumes that the noise distribution in the data satisfies the clean label domination assumption.

Values of Table 6 (NWPU dataset, 20% noise), mAP (%) (mean ± standard deviation):
Loss                  mAP (%)           Loss        mAP (%)
[17]                  87.70 ± 0.11      MAE [13]    86.53 ± 0.13
NCE + MAE [17]        85.82 ± 0.53      GCE [14]    87.97 ± 0.48
NFL + RCE [17]        88.04 ± 0.32      RCE [15]    86.39 ± 0.37
NFL + MAE [17]        84.07 ± 1.20      SCE [15]    86.51 ± 0.19
A-NCE + RCE (ours)    88.37 ± 0.29 *    ACE [16]    86.52 ± 0.33
A-NCE + MAE (ours)    88.95 ± 0.45 *    AUL [31]    86.74 ± 0.82
A-NFL + RCE (ours)    88.76 ± 0.25 *    AEL [31]    85.79 ± 0.08
A-NFL + MAE (ours)    88.49 ± 0.26 *
* Represents the highest retrieval precision.
As shown in Table 5, the four APL combinations also have different weighting parameters to achieve optimality, and our method still outperforms their 12 manually weighted combinations. Additionally, our method outperforms the best manually weighted methods by 0.67-4.42% and other robust loss methods by 0.4-2.56%.
It can be seen from Table 6 that our improved APL methods only yield a slight improvement of 0.4-0.98% over the GCE method. However, it is important to note that the GCE method requires manual determination of a hyperparameter in the (0,1] interval, and unfortunately, the literature [14] does not provide a method for parameter selection. Compared to the ACE method, which also adaptively determines the hyperparameter, our improved APL methods show a more significant improvement of 1.85-2.43%. Moreover, our method is also superior to asymmetric loss functions such as AUL and AEL. The above results indicate that our weighted learning method is suitable for all types of APL methods and yields superior retrieval outcomes compared to other robust methods.
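The improvement ranges quoted above can be checked directly from the mean mAP values reported for the NWPU dataset with 20% noise. The short Python sketch below recomputes the gains of the four adaptive variants over GCE and ACE; the dictionary and helper names are illustrative, and only the mean values are used.

```python
# Mean mAP (%) values reported for the NWPU dataset with 20% noise.
ours = {
    "A-NCE + RCE": 88.37,
    "A-NCE + MAE": 88.95,
    "A-NFL + RCE": 88.76,
    "A-NFL + MAE": 88.49,
}
gce_map = 87.97
ace_map = 86.52


def gain_range(baseline):
    """Smallest and largest mAP gain of the adaptive variants over a baseline."""
    gains = [round(value - baseline, 2) for value in ours.values()]
    return min(gains), max(gains)


gain_over_gce = gain_range(gce_map)
gain_over_ace = gain_range(ace_map)
```

The first range matches the 0.4-0.98% improvement over GCE and the second matches the 1.85-2.43% improvement over ACE.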

Efficiency and Backbone Analysis
The training time and floating-point operations (FLOPs) are used to evaluate the efficiency on the UCMD dataset with a 20% noise rate using different backbones. The experimental results are presented in Table 7. On the one hand, it is evident that the FLOPs of our improved APL method do not increase significantly. This is because AWNet is a shallow neural network with significantly fewer parameters than deep learning models. As a result, the training time of our improved APL method does not increase significantly compared to the original APL. Considering that the original APL method has to try at least 12 different weighting parameters, it actually takes longer to train than our improved APL. On the other hand, it can be seen that our improved APL method achieves better retrieval results than the original APL method under different deep learning models. This indicates that our method has better generalisation.
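A rough parameter count illustrates why the extra cost is small. The Python sketch below counts the weights of a fully connected network with the 2-100-2 layout of AWNet and compares the total with the commonly quoted figure of roughly 25.6 million parameters for an ImageNet ResNet50 classifier. The inclusion of bias terms and the approximate backbone figure are assumptions made for illustration and are not measurements from this paper.

```python
def mlp_parameter_count(layer_sizes, use_bias=True):
    """Number of trainable parameters of a fully connected network."""
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out        # weight matrix
        if use_bias:
            total += fan_out             # one bias per output neuron
    return total


# Two inputs (RES, delta), 100 hidden neurons, two outputs (alpha, beta).
awnet_params = mlp_parameter_count([2, 100, 2])

# Approximate size of an ImageNet ResNet50 (assumed reference figure).
RESNET50_PARAMS_APPROX = 25_600_000
relative_size = awnet_params / RESNET50_PARAMS_APPROX
```

Under these assumptions, AWNet adds a few hundred parameters, which is several orders of magnitude fewer than the backbone.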

Ablation Experiment of Two Metrics
To verify the validity of the two metrics (RES and δ) of the AWNet, we perform an ablation experiment on the UCMD dataset with noise rates of 20% and 30%, respectively. In the ablation experiments, the retrieval effects of the metrics RES and δ alone as inputs of the AWNet are investigated. In addition, the three quantities that make up RES and δ, namely the prediction probability, the entropy E and the standard deviation S, are each examined alone as the input of the AWNet. Specifically, the prediction probability denotes the predicted probability of the model classifier (i.e., the output of the softmax output layer).
Table 8 displays the retrieval performance of the above experiments. The results indicate that using both RES and δ as inputs to AWNet can help the AWNet to better predict α and β, resulting in the highest retrieval accuracy. This finding confirms the validity of the two metrics.
Additionally, to assess the rationality of the number of pretraining epochs in this paper, we conduct ablation experiments on the UCMD dataset with 20% noise. The number of pretraining epochs is set to 1, 2, 3, 4 and 5, respectively. The experimental results are shown in Table 9. The results show that the best retrieval performance is achieved with three pretraining epochs. Too few pretraining epochs (e.g., 1) may lead to underfitting of the model and affect the reliability of RES and δ. Therefore, in other experiments, the number of pretraining epochs is set to 3.
To determine the optimal structure for the AWNet, we evaluate the model performance with varying numbers of hidden layers and neurons per layer on the UCMD dataset with 20% noise. Specifically, we test the model's performance with one, two and three hidden layers and with 50 and 100 neurons per layer, respectively. The experimental results are presented in Table 10. The results indicate that using one hidden layer yields a better retrieval accuracy compared to using two or three hidden layers while also requiring fewer parameters and thus reducing calculation costs. In addition, a hidden layer with 100 neurons is better suited for fitting the relationship between training samples and weighting hyperparameters, resulting in a higher retrieval accuracy.

Conclusions
In this paper, we propose an adaptively weighted method based on active passive loss (APL) for remote sensing image retrieval. To automatically determine the weight hyperparameters of active losses and passive losses of different samples, we first design or select two metrics to measure the sample complexity and noise level based on entropy, standard deviation and predicted probability difference. Then, an adaptive weighted learning network (AWNet) based on the multilayer perceptron is designed to automatically predict the weighting parameters.
In order to verify the effectiveness and portability of our method, four groups of experiments are designed. First of all, the experimental results show that the retrieval accuracy of our method is better than that of 12 manual weighting combinations on three datasets: UCMD, AID and NWPU with five noise rates. Secondly, compared with seven other state-of-the-art robust losses, our method achieves the best performance, and the mAPs are improved by 0.4% to 2.56%. In the third group of experiments, we compare the retrieval accuracy and model complexity of our method using six different backbones. The results show that our method has excellent performance and good portability without increasing the computational cost too much. Finally, we verify the rationality of the AWNet's input metrics and structure through several groups of ablation experiments. In addition, the results show that our method achieves better image classification accuracy than manual weighting. Therefore, our process of adaptively learning the weighting parameters can benefit other areas such as image classification and segmentation with noisy labels.
Although the existing retrieval models have achieved excellent retrieval performance on single-domain datasets, they are difficult to generalise to test datasets in other domains. To address this issue, many scholars have proposed remote sensing image retrieval methods that enhance the generalisation performance of models on different data sources. For example, Wang et al. [4] implemented unsupervised cross-domain remote sensing image retrieval by using pseudo-label self-training and consistency regularisation. However, those domain adaptation methods lack research on noisy labels. Therefore, our future work will concentrate on the effect of noisy samples on domain adaptation remote sensing image retrieval.

Figure 1. Framework of our proposed method.

Table 2. mAPs (%) (mean ± standard deviation) on the AID dataset for adaptive weights versus manual weights.
* Represents the highest retrieval precision at the same noise rate, and superscripts 1 and 2 represent the highest and lowest retrieval performance of the 12 manually weighted combinations of αNCE + βRCE, respectively.

Table 4. Comparison of image classification accuracy between manual and automatic weighting methods on the UCMD dataset with 20% noise.
* Represents the highest retrieval precision.

Table 5. mAPs (%) (mean ± standard deviation) on the UCMD dataset with 20% noise using 4 types of APLs.
* Represents the highest retrieval precision, and superscripts 1 and 2 represent the highest and lowest retrieval performance of the 12 manually weighted combinations of αNCE + βRCE, respectively.

Table 6. mAPs (%) (mean ± standard deviation) of comparison with robust losses on the NWPU dataset with 20% noise.
* Represents the highest retrieval precision.

Table 10. mAPs (%) (mean ± standard deviation) of different numbers of hidden layers on the UCMD dataset with 20% noise.

Number of Hidden Layers The Number of Neurons in Each
* Represents the highest retrieval precision.