Author Contributions
Conceptualization, J.L. and X.Z.; methodology, X.M. and J.L.; software, R.D.; validation, X.M.; formal analysis, X.M. and J.L.; resources, X.M. and S.N.; data curation, R.D.; writing—review and editing, X.M. and J.L.; writing—original draft, X.M.; visualization, X.M. and R.D.; funding acquisition, J.L.; supervision, J.L., S.N. and X.Z. All authors have read and agreed to the published version of the manuscript.
Figure 1.
Architecture of the proposed SDLS framework. First, features are extracted from the original images using the self-distillation stream. Second, local images are generated via the LIGM. Third, features are extracted from the local images using the local stream. Finally, the representative logits from both streams are fused for final classification. FC represents the fully connected layer. , and indicate the number of channels in the CNN internal feature maps. and represent the trainable parameters used for generating the local images. In this figure, numbers 1–3 denote the classification scores produced by Branch1_MobileNetV2, Branch2_MobileNetV2, and the CNN backbone, respectively. Number 4 represents the classification score from the feature fusion classifier, and number 5 corresponds to the classification score generated by Local_MobileNetV2.
Figure 2.
Structural details of the MGA module. Conv denotes the standard convolution operation. Dconv, Pconv, and Gconv refer to depthwise convolution, pointwise convolution, and group convolution, respectively. The symbols k, s, p, and g represent kernel size, stride, padding, and groups in the convolution. and represent the height and width of the feature map, respectively. and denote the number of channels.
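The convolution types named in this caption differ mainly in how the groups setting g changes the number of weights. The following is a minimal sketch using the standard weight-count formula k · k · (C_in / g) · C_out; the channel count of 64 and the group count of 4 are hypothetical values chosen for illustration and are not taken from the MGA module.

```python
def conv_params(c_in, c_out, k, g=1, bias=False):
    """Number of weights in a 2-D convolution with a k x k kernel and g groups."""
    if c_in % g or c_out % g:
        raise ValueError("channel counts must be divisible by the number of groups")
    weights = k * k * (c_in // g) * c_out
    return weights + (c_out if bias else 0)


c = 64  # hypothetical channel count, for illustration only

standard = conv_params(c, c, k=3)         # Conv: every output sees every input channel
depthwise = conv_params(c, c, k=3, g=c)   # Dconv: groups equal to the channel count
pointwise = conv_params(c, c, k=1)        # Pconv: 1 x 1 kernel mixes channels only
grouped = conv_params(c, c, k=3, g=4)     # Gconv: channels split into 4 groups

print(standard, depthwise, pointwise, grouped)
```

Under these assumptions, the depthwise and grouped variants need far fewer weights than the standard convolution with the same kernel size.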
Figure 3.
The confusion matrix on the RSSCN7 dataset with a training ratio of 50%.
Figure 4.
The confusion matrix on the UCM dataset with a training ratio of 50%.
Figure 5.
The confusion matrix on the AID dataset with a training ratio of 50%.
Figure 6.
The confusion matrix on the NWPU dataset with a training ratio of 20%.
Figure 7.
Visualization of the attention feature maps of the three networks equipped with SDLS on the UCM dataset with a training ratio of 80%.
Figure 8.
Visualization of the attention feature maps of the ShuffleNetV2-1.5 network equipped with SDLS on the UCM dataset with a training ratio of 80%.
Figure 9.
Visualization results of the ResNet18 network equipped with SDLS on the AID dataset with a training ratio of 50%.
Figure 10.
Variations of the two trainable parameters during training. (a) Variation of the parameter during training. (b) Variation of the parameter during training.
Figure 11.
Visualization results of the ResNet18 network equipped with SDLS at different epochs on the AID dataset with a training ratio of 50%. “” denotes the original weights. “” represents the weights after applying the fourth power transformation.
Figure 12.
Examples of the top five misclassified categories identified from the confusion matrix.
Table 1.
Experimental results: OA (%) on four datasets.
| Backbone | Networks/Classifiers | AID (20%) | AID (50%) | RSSCN7 (20%) | RSSCN7 (50%) | UCM (50%) | UCM (80%) | NWPU (10%) | NWPU (20%) |
|---|---|---|---|---|---|---|---|---|---|
| | Baseline | 93.71 ± 0.16 | 96.10 ± 0.16 | 91.75 ± 0.87 | 94.70 ± 0.55 | 97.75 ± 0.49 | 98.71 ± 0.12 | 89.75 ± 0.27 | 92.46 ± 0.13 |
| | Branch1_MobileNetV2 | 94.33 ± 0.23 | 96.49 ± 0.26 | 93.66 ± 0.40 | 96.13 ± 0.41 | 98.46 ± 0.42 | 99.05 ± 0.48 | 90.28 ± 0.18 | 92.89 ± 0.07 |
| | Branch2_MobileNetV2 | 94.40 ± 0.25 | 96.61 ± 0.14 | 93.88 ± 0.50 | 96.00 ± 0.40 | 98.44 ± 0.21 | 99.10 ± 0.28 | 90.28 ± 0.15 | 92.85 ± 0.20 |
| ResNet18 | ResNet18 | 94.46 ± 0.11 | 96.55 ± 0.14 | 93.16 ± 0.61 | 95.47 ± 0.43 | 98.36 ± 0.18 | 98.62 ± 0.51 | 90.50 ± 0.08 | 92.82 ± 0.15 |
| | Feature fusion classifier | 95.35 ± 0.24 | 97.22 ± 0.19 | 94.34 ± 0.40 | 96.37 ± 0.40 | 98.78 ± 0.28 | 99.15 ± 0.12 | 91.88 ± 0.05 | 94.05 ± 0.12 |
| | Local_MobileNetV2 | 95.41 ± 0.18 | 97.21 ± 0.17 | 94.31 ± 0.44 | 96.46 ± 0.28 | 98.69 ± 0.38 | 99.05 ± 0.34 | 91.96 ± 0.12 | 94.07 ± 0.11 |
| | SDLS | 95.47 ± 0.22 | 97.24 ± 0.15 | 94.50 ± 0.48 | 96.42 ± 0.36 | 98.78 ± 0.28 | 99.15 ± 0.19 | 92.03 ± 0.05 | 94.11 ± 0.13 |
| | Baseline | 94.37 ± 0.29 | 96.56 ± 0.23 | 93.12 ± 0.61 | 95.69 ± 0.16 | 97.94 ± 0.17 | 98.72 ± 0.44 | 91.15 ± 0.18 | 93.64 ± 0.06 |
| | Branch1_MobileNetV2 | 94.50 ± 0.35 | 96.67 ± 0.15 | 93.75 ± 0.56 | 95.97 ± 0.37 | 98.50 ± 0.22 | 99.05 ± 0.52 | 90.22 ± 0.20 | 93.01 ± 0.11 |
| | Branch2_MobileNetV2 | 94.43 ± 0.24 | 96.48 ± 0.21 | 93.90 ± 0.47 | 95.90 ± 0.45 | 98.53 ± 0.21 | 99.29 ± 0.40 | 90.33 ± 0.29 | 93.05 ± 0.09 |
| WRN50-2 | WRN50-2 | 94.44 ± 0.25 | 96.62 ± 0.19 | 94.10 ± 0.32 | 95.97 ± 0.44 | 98.46 ± 0.23 | 99.05 ± 0.42 | 91.22 ± 0.11 | 93.44 ± 0.15 |
| | Feature fusion classifier | 95.59 ± 0.27 | 97.45 ± 0.18 | 94.78 ± 0.30 | 96.64 ± 0.24 | 98.91 ± 0.27 | 99.38 ± 0.32 | 92.52 ± 0.16 | 94.56 ± 0.12 |
| | Local_MobileNetV2 | 95.66 ± 0.22 | 97.46 ± 0.10 | 94.62 ± 0.33 | 96.46 ± 0.30 | 98.91 ± 0.28 | 99.47 ± 0.35 | 92.53 ± 0.20 | 94.53 ± 0.18 |
| | SDLS | 95.64 ± 0.26 | 97.47 ± 0.14 | 94.82 ± 0.33 | 96.60 ± 0.22 | 98.90 ± 0.24 | 99.43 ± 0.35 | 92.57 ± 0.18 | 94.58 ± 0.16 |
| | Baseline | 94.80 ± 0.17 | 96.90 ± 0.26 | 92.59 ± 0.76 | 94.87 ± 0.29 | 98.44 ± 0.28 | 98.86 ± 0.41 | 91.52 ± 0.15 | 93.94 ± 0.06 |
| | Branch1_MobileNetV2 | 94.41 ± 0.36 | 96.45 ± 0.10 | 94.00 ± 0.40 | 95.97 ± 0.15 | 98.57 ± 0.21 | 99.19 ± 0.44 | 90.17 ± 0.29 | 92.95 ± 0.15 |
| | Branch2_MobileNetV2 | 94.55 ± 0.27 | 96.60 ± 0.12 | 93.86 ± 0.49 | 95.87 ± 0.33 | 98.55 ± 0.30 | 99.05 ± 0.40 | 90.18 ± 0.20 | 92.85 ± 0.09 |
| Efficientnet_b3 | Efficientnet_b3 | 95.20 ± 0.29 | 97.06 ± 0.11 | 93.35 ± 0.53 | 95.76 ± 0.29 | 98.54 ± 0.24 | 99.05 ± 0.37 | 92.13 ± 0.25 | 94.25 ± 0.08 |
| | Feature fusion classifier | 95.96 ± 0.27 | 97.62 ± 0.08 | 94.81 ± 0.45 | 96.59 ± 0.15 | 98.97 ± 0.23 | 99.38 ± 0.32 | 92.85 ± 0.24 | 94.94 ± 0.05 |
| | Local_MobileNetV2 | 95.75 ± 0.33 | 97.54 ± 0.19 | 94.68 ± 0.42 | 96.45 ± 0.20 | 98.84 ± 0.22 | 99.38 ± 0.28 | 92.60 ± 0.27 | 94.83 ± 0.08 |
| | SDLS | 95.92 ± 0.29 | 97.59 ± 0.14 | 94.84 ± 0.48 | 96.56 ± 0.14 | 98.95 ± 0.22 | 99.43 ± 0.32 | 92.82 ± 0.24 | 94.91 ± 0.06 |
| | Baseline | 93.44 ± 0.11 | 95.90 ± 0.16 | 92.66 ± 0.38 | 94.92 ± 0.50 | 97.67 ± 0.26 | 98.38 ± 0.23 | 89.13 ± 0.16 | 92.11 ± 0.12 |
| | Branch1_MobileNetV2 | 94.43 ± 0.32 | 96.47 ± 0.16 | 94.04 ± 0.37 | 96.09 ± 0.41 | 98.57 ± 0.34 | 98.95 ± 0.24 | 90.14 ± 0.28 | 92.74 ± 0.03 |
| | Branch2_MobileNetV2 | 94.45 ± 0.24 | 96.54 ± 0.17 | 93.91 ± 0.29 | 96.03 ± 0.35 | 98.69 ± 0.22 | 99.15 ± 0.12 | 90.17 ± 0.19 | 92.86 ± 0.13 |
| ShuffleNetV2-1.5 | ShuffleNetV2-1.5 | 93.62 ± 0.22 | 95.90 ± 0.23 | 92.53 ± 0.85 | 95.23 ± 0.46 | 97.64 ± 0.34 | 98.76 ± 0.35 | 89.21 ± 0.14 | 92.03 ± 0.09 |
| | Feature fusion classifier | 95.33 ± 0.21 | 97.11 ± 0.14 | 94.57 ± 0.34 | 96.47 ± 0.38 | 98.88 ± 0.28 | 99.29 ± 0.15 | 91.59 ± 0.17 | 94.03 ± 0.04 |
| | Local_MobileNetV2 | 95.24 ± 0.24 | 97.12 ± 0.17 | 94.35 ± 0.40 | 96.43 ± 0.20 | 98.74 ± 0.37 | 99.38 ± 0.28 | 91.67 ± 0.24 | 94.07 ± 0.04 |
| | SDLS | 95.35 ± 0.22 | 97.16 ± 0.11 | 94.50 ± 0.36 | 96.41 ± 0.33 | 98.88 ± 0.26 | 99.38 ± 0.19 | 91.75 ± 0.18 | 94.13 ± 0.05 |
Table 2.
Experimental results: OA (%) on two datasets.
| Networks/Classifiers | AID (20%) | AID (50%) | NWPU (10%) | NWPU (20%) |
|---|---|---|---|---|
| Baseline (300 × 300) | 95.68 ± 0.18 | 97.38 ± 0.15 | 92.45 ± 0.13 | 94.57 ± 0.09 |
| Branch1_MobileNetV2 | 94.93 ± 0.21 | 97.06 ± 0.08 | 90.66 ± 0.20 | 93.36 ± 0.14 |
| Branch2_MobileNetV2 | 94.98 ± 0.25 | 97.08 ± 0.25 | 90.62 ± 0.22 | 93.27 ± 0.13 |
| EfficientNet-b3 | 96.07 ± 0.20 | 97.66 ± 0.12 | 92.73 ± 0.14 | 94.78 ± 0.14 |
| Feature fusion classifier | 96.55 ± 0.13 | 97.91 ± 0.08 | 93.29 ± 0.14 | 95.33 ± 0.08 |
| Local_EfficientNet-b3 | 96.57 ± 0.12 | 97.99 ± 0.07 | 93.50 ± 0.10 | 95.44 ± 0.08 |
| SDLS | 96.63 ± 0.12 | 98.00 ± 0.09 | 93.59 ± 0.10 | 95.46 ± 0.10 |
| Baseline (330 × 330) | 96.07 ± 0.13 | 97.67 ± 0.13 | 92.70 ± 0.13 | 94.77 ± 0.08 |
| Branch1_MobileNetV2 | 95.37 ± 0.21 | 97.26 ± 0.19 | 91.01 ± 0.17 | 93.39 ± 0.08 |
| Branch2_MobileNetV2 | 95.33 ± 0.23 | 97.19 ± 0.13 | 91.05 ± 0.18 | 93.40 ± 0.10 |
| EfficientNet-b3 | 96.33 ± 0.14 | 97.78 ± 0.11 | 93.08 ± 0.11 | 94.94 ± 0.18 |
| Feature fusion classifier | 96.79 ± 0.18 | 98.05 ± 0.09 | 93.64 ± 0.13 | 95.42 ± 0.07 |
| Local_EfficientNet-b3 | 96.82 ± 0.15 | 98.09 ± 0.07 | 93.87 ± 0.13 | 95.46 ± 0.10 |
| SDLS | 96.93 ± 0.15 | 98.11 ± 0.06 | 93.89 ± 0.08 | 95.52 ± 0.10 |
Table 3.
Comparison with other methods.
| Method (Year) | AID (20%) | AID (50%) | NWPU (10%) | NWPU (20%) |
|---|---|---|---|---|
| ! Attention GANs (2020) [13] | 78.95 ± 0.23 | 84.52 ± 0.18 | 72.21 ± 0.21 | 77.99 ± 0.19 |
| † MIDC-Net (2020) [16] | 88.51 ± 0.41 | 92.95 ± 0.17 | 86.12 ± 0.29 | 87.99 ± 0.18 |
| † TFADNN (2020) [17] | 93.21 ± 0.32 | 95.64 ± 0.16 | 87.78 ± 0.11 | 90.86 ± 0.24 |
| † MSA-Network (2021) [18] | 93.53 ± 0.21 | 96.01 ± 0.43 | 90.38 ± 0.17 | 93.52 ± 0.21 |
| † LSE-Net (2021) [53] | 94.41 ± 0.16 | 96.36 ± 0.19 | 92.23 ± 0.14 | 93.34 ± 0.15 |
| † SKAL_ResNet18 (2022) [23] | 94.38 ± 0.10 | 96.76 ± 0.20 | 90.04 ± 0.15 | 92.79 ± 0.11 |
| † LML_ResNet50 (2022) [24] | 96.21 ± 0.13 | 97.86 ± 0.09 | 92.67 ± 0.15 | 94.73 ± 0.11 |
| ‡ ET-GSNet (2022) [40] | 95.58 ± 0.18 | 96.88 ± 0.19 | 92.72 ± 0.28 | 94.50 ± 0.18 |
| ‡ EMTCAL (2022) [54] | 94.69 ± 0.14 | 96.41 ± 0.23 | 91.63 ± 0.19 | 93.65 ± 0.12 |
| ‡ HHTL (2022) [55] | 95.62 ± 0.13 | 96.88 ± 0.21 | 92.07 ± 0.44 | 94.21 ± 0.09 |
| ‡ SCViT (2022) [56] | 95.56 ± 0.17 | 96.98 ± 0.16 | 92.72 ± 0.04 | 94.66 ± 0.10 |
| † LSCNet (2023) [57] | 95.38 ± 0.15 | 97.14 ± 0.14 | 92.80 ± 0.14 | 94.54 ± 0.19 |
| ‡ EMSCNet_ViT-B (2023) [58] | 96.02 ± 0.18 | 97.35 ± 0.17 | 93.58 ± 0.22 | 95.37 ± 0.07 |
| ‡ LDBST (2023) [59] | 95.10 ± 0.09 | 96.84 ± 0.20 | 93.86 ± 0.18 | 94.36 ± 0.12 |
| † MDRCN (2024) [60] | 93.64 ± 0.19 | 95.66 ± 0.18 | 91.59 ± 0.29 | 93.82 ± 0.17 |
| † CGINet (2024) [61] | 95.35 ± 0.14 | 97.10 ± 0.24 | 92.28 ± 0.17 | 94.38 ± 0.13 |
| † HFAM (2024) [62] | 94.71 ± 0.06 | 96.73 ± 0.04 | 90.81 ± 0.11 | – |
| ‡ MSE-Net (2024) [63] | 96.30 ± 0.10 | 97.44 ± 0.05 | 92.80 ± 0.17 | 94.70 ± 0.16 |
| † LSDGNet_ENet-b3 (2025) [64] | 95.81 ± 0.14 | 97.49 ± 0.19 | 93.60 ± 0.17 | 95.05 ± 0.11 |
| † MSCN (2025) [65] | 95.86 ± 0.16 | 97.46 ± 0.12 | 92.64 ± 0.09 | 94.59 ± 0.11 |
| ‡ STMSF (2025) [66] | 96.15 ± 0.16 | 97.51 ± 0.37 | 92.88 ± 0.16 | 94.95 ± 0.11 |
| † ENet-b3_MV2_224 (SDLS) | 95.92 ± 0.29 | 97.59 ± 0.14 | 92.82 ± 0.24 | 94.91 ± 0.06 |
| † ENet-b3_ENet-b3_300 (SDLS) | 96.63 ± 0.12 | 98.00 ± 0.09 | 93.59 ± 0.10 | 95.46 ± 0.10 |
| † ENet-b3_ENet-b3_330 (SDLS) | 96.93 ± 0.15 | 98.11 ± 0.06 | 93.89 ± 0.08 | 95.52 ± 0.10 |
Table 4.
Comparison of model complexity and OA.
| Method (Year) | Params | FLOPs | AID_20% | NWPU_10% |
|---|---|---|---|---|
| EFCOMFFNetv1-DenseNet (2023) [67] | 19.94 M | 6.02 G | 95.86 ± 0.13 | 92.40 ± 0.15 |
| EFCOMFFNetv2-DenseNet (2023) [67] | 27.75 M | 5.93 G | 95.69 ± 0.15 | 92.36 ± 0.12 |
| MBFANet (2023) [68] | 24.48 M | 4.51 G | 93.98 ± 0.15 | 91.61 ± 0.14 |
| CGINet (2024) [61] | 26.10 M | 4.14 G | 95.35 ± 0.14 | 92.28 ± 0.17 |
| MBFNet (2025) [69] | 22.09 M | 4.25 G | 95.81 ± 0.13 | – |
| ResNet18_MV2_224 (SDLS) | 18.65 M | 2.85 G | 95.47 ± 0.22 | 92.03 ± 0.05 |
| ENet-b3_MV2_224 (SDLS) | 19.55 M | 2.08 G | 95.92 ± 0.29 | 92.82 ± 0.24 |
Table 5.
Model evaluation results on the AID dataset with a 20% training ratio.
| Networks/Classifiers | OA (%) | Params (M) | FLOPs (G) | Inference Time on RTX 3090 (ms/Image) | Inference Time on Tesla P100 (ms/Image) | Inference Time on Tesla T4 (ms/Image) |
|---|---|---|---|---|---|---|
| Baseline (ResNet18) | 93.71 ± 0.16 | 11.20 (1.00×) | 1.83 (1.00×) | 3.12 | 2.37 | 2.31 |
| Branch1_MobileNetV2 | 94.33 ± 0.23 | 2.43 (0.22×) | 0.93 (0.51×) | 7.84 | 5.47 | 5.68 |
| Branch2_MobileNetV2 | 94.40 ± 0.25 | 2.96 (0.26×) | 1.33 (0.73×) | 8.68 | 5.97 | 6.25 |
| ResNet18 | 94.46 ± 0.11 | 11.20 (1.00×) | 1.83 (1.00×) | 3.12 | 2.37 | 2.31 |
| Feature fusion classifier | 95.35 ± 0.24 | 16.35 (1.46×) | 2.53 (1.38×) | 17.69 | 12.59 | 12.98 |
| Local_MobileNetV2 | 95.41 ± 0.18 | 18.61 (1.66×) | 2.85 (1.56×) | 24.37 | 17.35 | 17.88 |
| SDLS | 95.47 ± 0.22 | 18.65 (1.67×) | 2.85 (1.56×) | 24.39 | 17.89 | 17.84 |
| Baseline (WRN50-2) | 94.37 ± 0.29 | 66.89 (1.00×) | 11.44 (1.00×) | 9.01 | 8.07 | 9.74 |
| Branch1_MobileNetV2 | 94.50 ± 0.35 | 2.93 (0.04×) | 2.50 (0.22×) | 8.95 | 5.95 | 6.18 |
| Branch2_MobileNetV2 | 94.43 ± 0.24 | 6.45 (0.10×) | 5.37 (0.47×) | 11.12 | 7.24 | 7.59 |
| WRN50-2 | 94.44 ± 0.25 | 66.89 (1.00×) | 11.44 (1.00×) | 9.01 | 8.07 | 9.74 |
| Feature fusion classifier | 95.59 ± 0.27 | 74.02 (1.11×) | 12.33 (1.08×) | 23.54 | 15.44 | 15.89 |
| Local_MobileNetV2 | 95.66 ± 0.22 | 76.28 (1.14×) | 12.65 (1.11×) | 30.40 | 20.16 | 21.22 |
| SDLS | 95.64 ± 0.26 | 76.32 (1.14×) | 12.65 (1.11×) | 31.00 | 20.43 | 20.88 |
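The multipliers in parentheses in this table are each model's parameter count and FLOPs divided by the baseline of the same backbone. The short sketch below reproduces three ResNet18 entries from the values listed above; the dictionary layout and the function name `relative_cost` are illustrative choices rather than part of the original evaluation code.

```python
# Baseline (ResNet18) row of the table: parameters in millions, FLOPs in billions.
BASELINE = {"params_m": 11.20, "flops_g": 1.83}

# (Params (M), FLOPs (G)) copied from the ResNet18 block of the table.
ROWS = {
    "Branch1_MobileNetV2": (2.43, 0.93),
    "Feature fusion classifier": (16.35, 2.53),
    "SDLS": (18.65, 2.85),
}


def relative_cost(params_m, flops_g, baseline=BASELINE):
    """Return (params ratio, FLOPs ratio) versus the baseline, rounded to 2 decimals."""
    return (
        round(params_m / baseline["params_m"], 2),
        round(flops_g / baseline["flops_g"], 2),
    )


ratios = {name: relative_cost(p, f) for name, (p, f) in ROWS.items()}
for name, (p_ratio, f_ratio) in ratios.items():
    print(f"{name}: {p_ratio:.2f}x params, {f_ratio:.2f}x FLOPs")
```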
Table 6.
Experimental results: OA (%) on the UCM dataset with an 80% training ratio.
| Backbone | Networks/Classifiers | Without MGA Module (×) | With MGA Module (✓) |
|---|---|---|---|
| | Branch1_MobileNetV2 | 99.05 ± 0.26 | 99.05 ± 0.48 |
| | Branch2_MobileNetV2 | 99.05 ± 0.26 | 99.10 ± 0.28 |
| | ResNet18 | 98.67 ± 0.36 | 98.62 ± 0.51 |
| ResNet18 | Feature fusion classifier | 99.05 ± 0.21 | 99.15 ± 0.12 |
| | Local_MobileNetV2 | 99.10 ± 0.18 | 99.05 ± 0.34 |
| | SDLS | 99.15 ± 0.12 | 99.15 ± 0.19 |
| | Branch1_MobileNetV2 | 99.05 ± 0.21 | 99.05 ± 0.52 |
| | Branch2_MobileNetV2 | 99.00 ± 0.32 | 99.29 ± 0.40 |
| | WRN50-2 | 98.71 ± 0.49 | 99.05 ± 0.42 |
| WRN50-2 | Feature fusion classifier | 99.29 ± 0.26 | 99.38 ± 0.32 |
| | Local_MobileNetV2 | 99.24 ± 0.18 | 99.47 ± 0.35 |
| | SDLS | 99.29 ± 0.21 | 99.43 ± 0.35 |
| | Branch1_MobileNetV2 | 99.05 ± 0.26 | 98.95 ± 0.24 |
| | Branch2_MobileNetV2 | 99.10 ± 0.53 | 99.15 ± 0.12 |
| | ShuffleNetV2-1.5 | 98.76 ± 0.46 | 98.76 ± 0.35 |
| ShuffleNetV2-1.5 | Feature fusion classifier | 99.29 ± 0.40 | 99.29 ± 0.15 |
| | Local_MobileNetV2 | 99.24 ± 0.46 | 99.38 ± 0.28 |
| | SDLS | 99.28 ± 0.50 | 99.38 ± 0.19 |
Table 7.
Branch ablation and attention module comparison of MGA.
| Networks/Classifiers | Branch Ablation: w/o Conv | Branch Ablation: w/o Depthwise | Branch Ablation: w/o Group | Attention Module: MGA | Attention Module: SE | Attention Module: CBAM |
|---|---|---|---|---|---|---|
| Branch1_MobileNetV2 | 90.22 ± 0.33 | 90.20 ± 0.23 | 90.34 ± 0.29 | 90.28 ± 0.18 | 90.23 ± 0.16 | 89.98 ± 0.20 |
| Branch2_MobileNetV2 | 90.30 ± 0.15 | 90.16 ± 0.23 | 90.16 ± 0.17 | 90.28 ± 0.15 | 90.26 ± 0.25 | 89.91 ± 0.15 |
| ResNet18 | 90.34 ± 0.18 | 90.47 ± 0.18 | 90.36 ± 0.20 | 90.50 ± 0.08 | 90.33 ± 0.19 | 90.50 ± 0.18 |
| Feature fusion classifier | 91.79 ± 0.20 | 91.86 ± 0.19 | 91.74 ± 0.22 | 91.88 ± 0.05 | 91.73 ± 0.20 | 91.76 ± 0.12 |
| Local_MobileNetV2 | 91.76 ± 0.23 | 91.86 ± 0.25 | 91.73 ± 0.26 | 91.96 ± 0.12 | 91.78 ± 0.18 | 91.79 ± 0.16 |
| SDLS | 91.86 ± 0.22 | 91.95 ± 0.23 | 91.84 ± 0.25 | 92.03 ± 0.05 | 91.82 ± 0.19 | 91.89 ± 0.15 |
Table 8.
Results of distillation on the AID dataset with a 50% training ratio.
| Networks/Classifiers | Without Distillation (×) | With Distillation (✓) |
|---|---|---|
| Branch1_MobileNetV2 | 96.24 ± 0.26 | 96.49 ± 0.26 |
| Branch2_MobileNetV2 | 96.31 ± 0.31 | 96.61 ± 0.14 |
| ResNet18 | 95.94 ± 0.18 | 96.55 ± 0.14 |
| Feature fusion classifier | 96.86 ± 0.25 | 97.22 ± 0.19 |
| Local_MobileNetV2 | 96.89 ± 0.24 | 97.21 ± 0.17 |
| SDLS | 96.95 ± 0.21 | 97.24 ± 0.15 |
| Branch1_MobileNetV2 | 96.37 ± 0.06 | 96.67 ± 0.15 |
| Branch2_MobileNetV2 | 96.39 ± 0.12 | 96.48 ± 0.21 |
| WRN50-2 | 96.44 ± 0.13 | 96.62 ± 0.19 |
| Feature fusion classifier | 97.21 ± 0.13 | 97.45 ± 0.18 |
| Local_MobileNetV2 | 97.17 ± 0.11 | 97.46 ± 0.10 |
| SDLS | 97.22 ± 0.12 | 97.47 ± 0.14 |
Table 9.
Effect of distillation temperature T.
| Networks/Classifiers | Distillation Temperature T | | | | |
|---|---|---|---|---|---|
| Branch1_MobileNetV2 | 94.26 ± 0.07 | 94.33 ± 0.23 | 94.22 ± 0.49 | 82.72 ± 3.04 | 14.45 ± 2.52 |
| Branch2_MobileNetV2 | 94.32 ± 0.14 | 94.40 ± 0.25 | 94.28 ± 0.42 | 82.98 ± 3.36 | 14.62 ± 2.34 |
| ResNet18 | 93.84 ± 0.21 | 94.46 ± 0.11 | 94.42 ± 0.25 | 83.46 ± 3.67 | 18.62 ± 2.54 |
| Feature fusion classifier | 95.29 ± 0.11 | 95.35 ± 0.24 | 95.16 ± 0.41 | 85.29 ± 3.19 | 15.14 ± 2.08 |
| Local_MobileNetV2 | 95.27 ± 0.18 | 95.41 ± 0.18 | 95.28 ± 0.42 | 85.37 ± 3.13 | 15.39 ± 1.88 |
| SDLS | 95.39 ± 0.16 | 95.47 ± 0.22 | 95.30 ± 0.43 | 85.35 ± 3.18 | 15.18 ± 2.05 |
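To illustrate what the distillation temperature T controls, the sketch below uses the widely used temperature-scaled softmax and KL-divergence formulation of knowledge distillation. The exact loss used by SDLS is not given in this section, so this formulation, the function names, and the example logits are assumptions made for illustration only.

```python
import math


def softmax_with_temperature(logits, T):
    """Softmax of logits / T; a larger T gives a flatter (softer) distribution."""
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, T):
    """KL(teacher || student) on temperature-softened outputs, scaled by T^2."""
    p = softmax_with_temperature(teacher_logits, T)
    q = softmax_with_temperature(student_logits, T)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return T * T * kl


teacher = [4.0, 1.0, 0.5]  # hypothetical logits
student = [2.5, 1.5, 0.2]  # hypothetical logits

sharp = softmax_with_temperature(teacher, T=1.0)
soft = softmax_with_temperature(teacher, T=4.0)
print(max(sharp), max(soft), kd_loss(student, teacher, T=4.0))
```

In this formulation, raising T spreads probability mass over the non-target classes, which is the mechanism by which the temperature changes the signal passed from teacher to student.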
Table 10.
Results of two-stream logits fusion.
| Networks/Classifiers | Sum Fusion | Mean Fusion | Max Fusion |
|---|---|---|---|
| Branch1_MobileNetV2 | 96.49 ± 0.26 | 96.88 ± 0.13 | 96.70 ± 0.18 |
| Branch2_MobileNetV2 | 96.61 ± 0.14 | 96.90 ± 0.11 | 96.66 ± 0.13 |
| ResNet18 | 96.55 ± 0.14 | 96.80 ± 0.10 | 96.53 ± 0.19 |
| Feature fusion classifier | 97.22 ± 0.19 | 97.18 ± 0.12 | 96.97 ± 0.22 |
| Local_MobileNetV2 | 97.21 ± 0.17 | 97.22 ± 0.10 | 97.04 ± 0.19 |
| SDLS | 97.24 ± 0.15 | 97.20 ± 0.12 | 96.98 ± 0.22 |
| Branch1_MobileNetV2 | 96.67 ± 0.15 | 96.98 ± 0.06 | 96.86 ± 0.20 |
| Branch2_MobileNetV2 | 96.48 ± 0.21 | 96.93 ± 0.05 | 96.90 ± 0.10 |
| WRN50-2 | 96.62 ± 0.19 | 97.12 ± 0.23 | 97.01 ± 0.16 |
| Feature fusion classifier | 97.45 ± 0.18 | 97.40 ± 0.15 | 97.25 ± 0.12 |
| Local_MobileNetV2 | 97.46 ± 0.10 | 97.42 ± 0.16 | 97.27 ± 0.11 |
| SDLS | 97.47 ± 0.14 | 97.42 ± 0.15 | 97.25 ± 0.12 |
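The three fusion techniques compared in this table can be sketched as element-wise operations on the logit vectors of the two streams. This is a minimal illustration, not the authors' implementation: the function names and the example logits are hypothetical, and the sketch covers only how the fused prediction is formed, not how each variant is trained.

```python
def fuse_logits(global_logits, local_logits, mode="sum"):
    """Element-wise fusion of two per-class logit vectors."""
    if len(global_logits) != len(local_logits):
        raise ValueError("both streams must score the same number of classes")
    pairs = list(zip(global_logits, local_logits))
    if mode == "sum":
        return [a + b for a, b in pairs]
    if mode == "mean":
        return [(a + b) / 2 for a, b in pairs]
    if mode == "max":
        return [max(a, b) for a, b in pairs]
    raise ValueError(f"unknown fusion mode: {mode}")


def predict(logits):
    """Index of the highest-scoring class."""
    return max(range(len(logits)), key=logits.__getitem__)


global_stream = [2.0, 1.0, 0.5]  # hypothetical logits from the self-distillation stream
local_stream = [0.2, 1.5, 3.0]   # hypothetical logits from the local stream

fused = {m: fuse_logits(global_stream, local_stream, m) for m in ("sum", "mean", "max")}
predictions = {m: predict(v) for m, v in fused.items()}
print(fused, predictions)
```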
Table 11.
Results of different feature fusion strategies.
| Networks/Classifiers | Sum | Cat | Weighted Average |
|---|---|---|---|
| Branch1_MobileNetV2 | 96.49 ± 0.26 | 96.64 ± 0.09 | 96.68 ± 0.20 |
| Branch2_MobileNetV2 | 96.61 ± 0.14 | 96.56 ± 0.11 | 96.50 ± 0.12 |
| ResNet18 | 96.55 ± 0.14 | 96.39 ± 0.17 | 96.50 ± 0.23 |
| Feature fusion classifier | 97.22 ± 0.19 | 97.20 ± 0.14 | 97.25 ± 0.14 |
| Local_MobileNetV2 | 97.21 ± 0.17 | 97.20 ± 0.17 | 97.13 ± 0.08 |
| SDLS | 97.24 ± 0.15 | 97.20 ± 0.11 | 97.28 ± 0.12 |
| Branch1_MobileNetV2 | 96.67 ± 0.15 | 96.52 ± 0.25 | 96.60 ± 0.13 |
| Branch2_MobileNetV2 | 96.48 ± 0.21 | 96.50 ± 0.13 | 96.56 ± 0.14 |
| WRN50-2 | 96.62 ± 0.19 | 96.40 ± 0.23 | 96.54 ± 0.27 |
| Feature fusion classifier | 97.45 ± 0.18 | 97.36 ± 0.27 | 97.38 ± 0.14 |
| Local_MobileNetV2 | 97.46 ± 0.10 | 97.36 ± 0.23 | 97.25 ± 0.09 |
| SDLS | 97.47 ± 0.14 | 97.36 ± 0.23 | 97.36 ± 0.11 |
Table 12.
Ablation results for varying the initial values of the two trainable parameters used for local image generation on the AID dataset with a 50% training ratio.
| Networks/Classifiers | Initial Values (0.2, 0.8) | Initial Values (0.4, 0.6) | Initial Values (0.6, 0.4) | Initial Values (0.8, 0.2) |
|---|---|---|---|---|
| Branch1_MobileNetV2 | 96.49 ± 0.26 | 96.62 ± 0.16 | 96.54 ± 0.15 | 96.46 ± 0.13 |
| Branch2_MobileNetV2 | 96.61 ± 0.14 | 96.42 ± 0.11 | 96.56 ± 0.21 | 96.54 ± 0.10 |
| ResNet18 | 96.55 ± 0.14 | 96.39 ± 0.18 | 96.33 ± 0.10 | 96.38 ± 0.17 |
| Feature fusion classifier | 97.22 ± 0.19 | 97.12 ± 0.25 | 97.23 ± 0.16 | 97.15 ± 0.12 |
| Local_MobileNetV2 | 97.21 ± 0.17 | 97.12 ± 0.19 | 97.22 ± 0.11 | 97.10 ± 0.11 |
| SDLS | 97.24 ± 0.15 | 97.15 ± 0.24 | 97.28 ± 0.10 | 97.17 ± 0.11 |
Table 13.
Analysis of the exponent x in the power-based transformation on the AID dataset with a 20% training ratio.
| Networks/Classifiers | Exponent x in | | | |
|---|---|---|---|---|
| Branch1_MobileNetV2 | 94.47 ± 0.30 | 94.41 ± 0.18 | 94.33 ± 0.23 | 94.48 ± 0.13 |
| Branch2_MobileNetV2 | 94.41 ± 0.24 | 94.42 ± 0.13 | 94.40 ± 0.25 | 94.31 ± 0.20 |
| ResNet18 | 94.36 ± 0.15 | 94.32 ± 0.21 | 94.46 ± 0.11 | 94.47 ± 0.19 |
| Feature fusion classifier | 95.35 ± 0.19 | 95.30 ± 0.13 | 95.35 ± 0.24 | 95.35 ± 0.17 |
| Local_MobileNetV2 | 95.35 ± 0.22 | 95.39 ± 0.16 | 95.41 ± 0.18 | 95.35 ± 0.20 |
| SDLS | 95.40 ± 0.23 | 95.40 ± 0.12 | 95.47 ± 0.22 | 95.40 ± 0.17 |
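As a rough illustration of what a power-based transformation with exponent x does to a set of weights, the sketch below raises min-max-normalised weights to the power x. The normalisation step, the function name, and the sample weights are assumptions introduced here; the paper's exact transformation is not reproduced in this section.

```python
def power_transform(weights, x):
    """Min-max normalise weights to [0, 1] (an assumption), then raise them to the power x."""
    lo, hi = min(weights), max(weights)
    span = hi - lo
    if span == 0:
        return [1.0 for _ in weights]  # all weights equal: nothing to contrast
    return [((w - lo) / span) ** x for w in weights]


weights = [0.1, 0.5, 0.9, 1.0]  # hypothetical attention weights

identity = power_transform(weights, x=1)
fourth = power_transform(weights, x=4)  # the fourth-power case mentioned for Figure 11
print(identity)
print(fourth)
```

With weights in [0, 1], a larger exponent leaves the strongest response at 1 while suppressing intermediate values, so the highlighted region becomes more concentrated.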
Table 14.
Statistical significance analysis on the AID and NWPU datasets.
| Dataset | Metric | Branch1_MobileNetV2 | Branch2_MobileNetV2 | EfficientNet_b3 | Feature Fusion Classifier | Local_MobileNetV2 | SDLS |
|---|---|---|---|---|---|---|---|
| AID 20% | Mean ± Std | 95.35 ± 0.26 | 95.34 ± 0.24 | 96.36 ± 0.18 | 96.79 ± 0.20 | 96.84 ± 0.19 | 96.93 ± 0.16 |
| | 95% CI | [−0.921, −0.527] | [−0.925, −0.549] | [+0.162, +0.416] | [+0.576, +0.858] | [+0.606, +0.938] | [+0.708, +1.004] |
| | p-value | 1.65 | 9.79 | 6.17 | 1.08 | 2.31 | 3.61 |
| AID 50% | Mean ± Std | 97.23 ± 0.18 | 97.22 ± 0.12 | 97.82 ± 0.12 | 98.09 ± 0.10 | 98.11 ± 0.07 | 98.15 ± 0.08 |
| | 95% CI | [−0.604, −0.316] | [−0.629, −0.319] | [−0.026, +0.274] | [+0.250, +0.554] | [+0.261, +0.567] | [+0.335, +0.589] |
| | p-value | 4.82 | 6.87 | 9.49 | 2.09 | 1.74 | 1.76 |
| NWPU 10% | Mean ± Std | 91.04 ± 0.20 | 91.02 ± 0.21 | 93.03 ± 0.12 | 93.61 ± 0.15 | 93.81 ± 0.15 | 93.86 ± 0.13 |
| | 95% CI | [−1.783, −1.461] | [−1.746, −1.536] | [+0.261, +0.469] | [+0.825, +1.067] | [+1.026, +1.264] | [+1.095, +1.303] |
| | p-value | 2.89 | 5.73 | 2.36 | 2.59 | 4.30 | 8.38 |
| NWPU 20% | Mean ± Std | 93.45 ± 0.10 | 93.44 ± 0.12 | 94.95 ± 0.17 | 95.43 ± 0.08 | 95.49 ± 0.10 | 95.53 ± 0.10 |
| | 95% CI | [−1.420, −1.180] | [−1.416, −1.198] | [+0.050, +0.356] | [+0.586, +0.768] | [+0.634, +0.854] | [+0.680, +0.880] |
| | p-value | 1.45 | 6.18 | 1.48 | 4.21 | 9.16 | 2.81 |
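The signed 95% confidence intervals in this table read as intervals on an accuracy difference relative to a reference model. The sketch below shows one common way to obtain such an interval, a paired t-interval on per-run differences. The choice of a paired design, the number of runs (five), and all accuracy values are hypothetical, since the statistical procedure is not described in this section.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t for 4 degrees of freedom (5 paired runs).
T_CRIT_95_DF4 = 2.776


def paired_ci95(method_oa, reference_oa, t_crit):
    """95% confidence interval for the mean per-run difference (method minus reference)."""
    if len(method_oa) != len(reference_oa):
        raise ValueError("paired runs must have the same length")
    diffs = [m - r for m, r in zip(method_oa, reference_oa)]
    mean_diff = statistics.mean(diffs)
    sem = statistics.stdev(diffs) / math.sqrt(len(diffs))  # standard error of the mean
    return mean_diff - t_crit * sem, mean_diff + t_crit * sem


# Hypothetical OA (%) values from five runs; not taken from the paper.
method_runs = [96.9, 97.0, 96.8, 97.1, 96.9]
reference_runs = [96.0, 96.2, 96.1, 96.1, 96.0]

low, high = paired_ci95(method_runs, reference_runs, T_CRIT_95_DF4)
print(f"95% CI of the difference: [{low:+.3f}, {high:+.3f}]")
```

An interval that lies entirely above zero, as in this hypothetical case, indicates a gain over the reference that is unlikely to be explained by run-to-run variation alone.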