Author Contributions
Conceptualization, Z.W. (Zhechao Wang); methodology, Z.W. (Zhechao Wang); software, Z.W. (Zhechao Wang); validation, Z.W. (Zhechao Wang), P.C., S.D., K.C. and Z.W. (Zhirui Wang); formal analysis, Z.W. (Zhechao Wang) and P.C.; writing—original draft preparation, Z.W. (Zhechao Wang); writing—review and editing, Z.W. (Zhechao Wang), P.C., S.D., K.C., Z.W. (Zhirui Wang), X.L. and X.S.; visualization, Z.W. (Zhechao Wang); supervision, P.C., Z.W. (Zhirui Wang), and X.S.; project administration, P.C., Z.W. (Zhirui Wang), X.L. and X.S.; funding acquisition, Z.W. (Zhirui Wang), X.L. and X.S. All authors have read and agreed to the published version of the manuscript.
Figure 1.
Illustration of multiple remote sensing platforms’ collaborative perception. The members of the collaborative group can enhance their local perception with the help of collaborative information from other platforms.
Figure 2.
Overall framework of the proposed DCP-Net. This framework defines the platform in need of collaboration as the requester, potential collaborating platforms as candidates, and the selected candidate as the supporter. In the given scenario, platform 1 acts as the requester, while the remaining platforms serve as candidates. Platform 4 is selected as the supporter through the SMIM module and is responsible for providing collaborative features.
Figure 3.
Functional schematic of the SMIM module. Within the collaborative perception network, each remote sensing platform makes autonomous decisions based on its local observations. For instance, platform 1, upon recognizing the necessity for collaboration, sends requests to the other platforms. After assessing the correlation feedback, platform 1 selects platform 2 as the suitable supporter. Meanwhile, platforms 2, 3, and 4 execute their perception tasks independently.
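The request-then-select flow described in the Figure 3 caption can be sketched roughly as follows. This is a toy illustration, not the SMIM implementation: the function name `select_supporter`, the scalar `self_score`, the default threshold of 0.5, the direction of the threshold comparison, and the arg-max selection rule are all assumptions, since the caption does not specify how the scores are computed.

```python
from typing import Dict, Optional


def select_supporter(self_score: float,
                     correlation_feedback: Dict[int, float],
                     request_threshold: float = 0.5) -> Optional[int]:
    """Return the id of the chosen supporter, or None if no request is sent.

    All names and the decision rules are hypothetical placeholders.
    """
    # Stage 1 (assumed rule): the platform asks for help only when its own
    # self-assessment falls below the request threshold.
    if self_score >= request_threshold:
        return None
    # No candidate answered the request, so there is nobody to select.
    if not correlation_feedback:
        return None
    # Stage 2 (assumed rule): pick the candidate with the highest
    # correlation feedback as the supporter.
    return max(correlation_feedback, key=correlation_feedback.get)


# Platform 1 is poorly informed and receives feedback from platforms 2, 3, 4.
feedback = {2: 0.9, 3: 0.4, 4: 0.1}
print(select_supporter(0.2, feedback))
# A well-informed platform sends no request at all.
print(select_supporter(0.8, feedback))
```

With the made-up feedback values above, platform 2 is selected, mirroring the scenario in the caption; a platform whose self-assessment clears the threshold skips collaboration entirely.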
Figure 4.
Process framework of the SMIM Module. The module is divided into two stages: “self-information match”, depicted in the left part, and “mutual information match”, depicted in the right part.
Figure 5.
Illustration of the RFF module. The reshape operation converts the image features into a sequence format. The complementary operation can be regarded simply as subtracting the input from 1.
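As a rough illustration of the two operations named in the Figure 5 caption, the sketch below uses plain Python lists. The (C, H, W) input layout, the row-major token order, and the function names `reshape_to_sequence` and `complement` are assumptions made for this example; the caption does not fix these details.

```python
def reshape_to_sequence(feature):
    """Flatten a (C, H, W) nested list into H*W tokens of length C.

    The layout and token order are assumptions for illustration only.
    """
    channels = len(feature)
    height = len(feature[0])
    width = len(feature[0][0])
    # One token per spatial position, holding that position's C channel values.
    return [[feature[c][h][w] for c in range(channels)]
            for h in range(height)
            for w in range(width)]


def complement(weights):
    """Complementary operation: subtract every input value from 1."""
    return [1.0 - w for w in weights]


# A tiny feature map with 2 channels and a 2x2 spatial grid.
feature = [[[1, 2], [3, 4]],
           [[5, 6], [7, 8]]]
print(reshape_to_sequence(feature))
print(complement([0.25, 1.0, 0.0]))
```

Each spatial position becomes one sequence element, and the complement turns a weight `w` into `1 - w`, so the two weights always sum to 1.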
Figure 6.
Simulated multiplatform observation dataset of ISPRS Potsdam.
Figure 7.
Simulated multiplatform observation dataset of iSAID.
Figure 8.
The images taken by Gaofen-2 and SuperView-1 are presented in row 1 and row 2, respectively.
Figure 9.
Multiplatform joint observation dataset of DFC23.
Figure 10.
Visualization of the Homo-CIS mode.
Figure 11.
Visualization of the Homo-PIS mode.
Figure 12.
Visualization of the Hetero-PIS mode.
Figure 13.
Differences in accuracy between DCP-Net and When2com when assessing collaboration opportunities and selecting supporters in the Homo-CIS mode.
Figure 14.
Ablation experiments on the request threshold in the self-information match stage of the SMIM module.
Figure 15.
Ablation experiments on the request size used in the mutual information match stage of the SMIM module.
Figure 16.
The quantitative and visual descriptions of the influence of our designed SMIM and RFF modules in the collaboration of Homo-CIS mode.
Figure 17.
The quantitative and visual descriptions of the influence of our designed SMIM and RFF modules in the collaboration of Homo-PIS mode.
Figure 18.
The quantitative and visual descriptions of the influence of our designed SMIM and RFF modules in the collaboration of Hetero-PIS mode.
Figure 19.
The visualization of results predicted by various baselines and DCP-Net in the Homo-CIS mode of the Potsdam dataset.
Figure 20.
The visualization of results predicted by various baselines and DCP-Net in the Homo-PIS mode of the Potsdam dataset.
Figure 21.
The visualization of results predicted by various baselines and DCP-Net in the Homo-CIS mode of the iSAID dataset.
Figure 22.
The visualization of results predicted by various baselines and DCP-Net in the Homo-PIS mode of the iSAID dataset.
Figure 23.
The visualization of results predicted by various baselines and DCP-Net in the Hetero-PIS mode of the DFC23 dataset.
Figure 24.
The large-scale visualization of predicted results in the DFC23 dataset.
Table 1.
Summary of datasets utilized in the study: the number of classes and training/validation samples.
Dataset | Classes | Training Samples | Validation Samples |
---|
Potsdam [65] | 6 | 7200 | 2800 |
iSAID [66] | 16 | 19,790 | 6289 |
DFC23 [64] | 2 | 3688 | 1752 |
Table 2.
Description and comparison of different collaborative perception methods.
Method | Description | Advantages | Shortcomings |
---|
No-Interaction | Independent execution of downstream tasks without any information interaction. | No transmission expense. | Limited performance due to lack of collaboration. |
Who2com [15] | Utilizes an attention mechanism to select a perception helper based on relevance during each iteration. | Selects the most relevant platform for information supplementation during each interaction. | Can be redundant for well-informed platforms. |
When2com [16] | An evolved version of Who2com, determining the suitable collaboration opportunity based on interplatform correlation. | Reduces unnecessary interactions by determining whether collaboration is needed. | May establish low-profit collaboration prompted by highly relevant platforms, even when local information is abundant. |
MASH [17] | Collects predictions from each collaborator and uses local features to generate masked predictions for final fusion. | Low transmission volume due to result-level fusion. | High complexity and potential for prediction conflicts. |
MRCP-GNN [18] | Models relationships among collaborative platforms as a graph and aggregates feature maps from neighbors through a GCN-based feature fusion mechanism. | Effectively integrates features through a graph network structure. | High computational and communication cost. |
RandCom | A naive distributed collaboration approach where one of the other platforms is randomly selected as a perception supporter. | Simple and easy to implement, requiring only one perception assistant at a time. | Random selection may not always be optimal. |
CatAll | A simple centralized model baseline that concatenates the extracted features from all platforms for downstream tasks. | Simple and easy to implement. | High communication overhead. |
AuxAttend | Employs an attention mechanism to assign weights to the auxiliary views provided by other platforms. | Simple and easy to implement. | High communication overhead. |
Table 3.
Comparison of different advanced methods in terms of parameters (Params) and computational complexity measured in floating-point operations (FLOPs).
Type | Method | Params (M) | FLOPs (G) |
---|
Centralized | MRCP-GNN [18] | 29.39 | 1.88 |
Centralized | MASH [17] | 28.24 | 2.55 |
Distributed | Who2com [15] | 18.71 | 1.40 |
Distributed | When2com [16] | 18.83 | 1.41 |
Distributed | Ours | 24.95 | 1.66 |
Table 4.
Experimental results of baselines and DCP-Net in the Homo-CIS mode. Arrows indicate whether a higher or lower value is preferable: ↑ means higher is better, and ↓ means lower is better. Since No-Interaction incurs no transmission cost, its Comm. Cost and CE values are reported as N/A.
Type | Method | Potsdam mIoU ↑ (Noisy) | Potsdam mIoU ↑ (Normal) | Potsdam mIoU ↑ (Avg.) | Potsdam Comm. Cost ↓ | Potsdam CE ↑ | iSAID mIoU ↑ (Noisy) | iSAID mIoU ↑ (Normal) | iSAID mIoU ↑ (Avg.) | iSAID Comm. Cost ↓ | iSAID CE ↑ |
---|
Individual | No-Interaction | 50.18 | 65.09 | 57.38 | N/A | N/A | 38.77 | 49.33 | 44.24 | N/A | N/A |
Centralized | CatAll | 55.48 | 65.92 | 60.60 | 1.50 | 2.15 | 44.57 | 52.20 | 48.47 | 1.50 | 2.82 |
Centralized | AuxAttend | 64.25 | 65.49 | 65.10 | 1.50 | 5.15 | 49.01 | 53.15 | 51.12 | 1.50 | 4.59 |
Centralized | MRCP-GNN [18] | 54.54 | 65.04 | 59.67 | 1.50 | 1.53 | 46.31 | 52.14 | 49.28 | 1.50 | 3.36 |
Centralized | MASH [17] | 64.55 | 65.77 | 65.17 | 3.00 | 5.19 | 46.37 | 51.60 | 49.06 | 3.00 | 1.61 |
Distributed | RandCom | 49.58 | 60.67 | 54.98 | 0.50 | −4.80 | 40.03 | 51.06 | 45.73 | 0.50 | 2.98 |
Distributed | Who2com [15] | 64.59 | 65.90 | 65.25 | 0.50 | 15.74 | 43.66 | 50.06 | 46.91 | 0.50 | 5.34 |
Distributed | When2com [16] | 64.62 | 65.12 | 64.88 | 0.26 | 29.07 | 48.61 | 49.52 | 49.03 | 0.28 | 17.42 |
Distributed | Ours | 65.39 | 66.36 | 65.87 | 0.26 | 33.29 | 51.45 | 52.13 | 51.71 | 0.25 | 29.88 |
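The CE column is not defined in the caption of Table 4. The minimal sketch below assumes that CE is the gain in average mIoU over No-Interaction divided by Comm. Cost. This reading is an inference from the tabulated numbers, not a definition taken from the text: it reproduces several Potsdam rows (for example CatAll, AuxAttend, and Who2com), while other rows may differ, for instance where Comm. Cost is shown rounded to two decimals. The function name `collaboration_efficiency` is hypothetical.

```python
def collaboration_efficiency(miou_avg, miou_no_interaction, comm_cost):
    """Assumed CE: average mIoU gain over No-Interaction per unit of Comm. Cost."""
    # No-Interaction transmits nothing, so CE is undefined (N/A in the table).
    if comm_cost is None or comm_cost == 0:
        return None
    return (miou_avg - miou_no_interaction) / comm_cost


# Potsdam, Homo-CIS: (Avg. mIoU, Comm. Cost) pairs copied from Table 4.
baseline_miou = 57.38  # No-Interaction Avg. mIoU
rows = {
    "CatAll": (60.60, 1.50),
    "AuxAttend": (65.10, 1.50),
    "Who2com": (65.25, 0.50),
}
for method, (miou_avg, comm_cost) in rows.items():
    ce = collaboration_efficiency(miou_avg, baseline_miou, comm_cost)
    print(method, round(ce, 2))
```

Under this assumption, a method that lowers the average mIoU relative to No-Interaction receives a negative CE, which is consistent with the negative entries in the table.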
Table 5.
Experimental results of baselines and DCP-Net in the Homo-PIS mode.
Type | Method | Potsdam mIoU ↑ (Noisy) | Potsdam mIoU ↑ (Normal) | Potsdam mIoU ↑ (Avg.) | Potsdam Comm. Cost ↓ | Potsdam CE ↑ | iSAID mIoU ↑ (Noisy) | iSAID mIoU ↑ (Normal) | iSAID mIoU ↑ (Avg.) | iSAID Comm. Cost ↓ | iSAID CE ↑ |
---|
Individual | No-Interaction | 48.47 | 63.30 | 55.37 | N/A | N/A | 38.59 | 50.42 | 44.69 | N/A | N/A |
Centralized | CatAll | 50.56 | 64.18 | 56.92 | 1.50 | 1.03 | 41.83 | 51.61 | 46.89 | 1.50 | 1.47 |
Centralized | AuxAttend | 51.11 | 63.67 | 56.98 | 1.50 | 1.07 | 43.17 | 52.32 | 47.91 | 1.50 | 2.15 |
Centralized | MRCP-GNN [18] | 51.00 | 63.54 | 56.86 | 1.50 | 0.99 | 43.54 | 53.17 | 47.79 | 1.50 | 2.07 |
Centralized | MASH [17] | 51.47 | 63.68 | 57.14 | 3.00 | 0.59 | 42.58 | 52.05 | 47.64 | 3.00 | 0.98 |
Distributed | RandCom | 48.81 | 63.60 | 55.66 | 0.50 | 0.58 | 39.85 | 48.84 | 44.46 | 0.50 | −0.46 |
Distributed | Who2com [15] | 49.64 | 63.44 | 56.07 | 0.50 | 1.40 | 40.33 | 48.67 | 44.61 | 0.50 | −0.16 |
Distributed | When2com [16] | 49.71 | 61.77 | 55.33 | 0.02 | −2.67 | 36.42 | 48.43 | 42.59 | 0.11 | −19.09 |
Distributed | Ours | 54.43 | 63.70 | 58.91 | 0.39 | 9.12 | 45.56 | 50.59 | 47.97 | 0.39 | 8.52 |
Table 6.
Experimental results of baselines and DCP-Net in the Hetero-PIS mode.
Type | Method | DFC23 mIoU ↑ (Noisy) | DFC23 mIoU ↑ (Normal) | DFC23 mIoU ↑ (Avg.) | DFC23 Comm. Cost ↓ | DFC23 CE ↑ |
---|
Individual | No-Interaction | 54.94 | 61.04 | 57.88 | N/A | N/A |
Centralized | CatAll | 55.46 | 61.43 | 58.37 | 1.50 | 0.33 |
Centralized | AuxAttend | 55.50 | 62.22 | 58.85 | 1.50 | 0.65 |
Centralized | MRCP-GNN [18] | 55.87 | 62.17 | 58.92 | 1.50 | 0.69 |
Centralized | MASH [17] | 55.74 | 62.60 | 58.96 | 3.00 | 0.36 |
Distributed | RandCom | 55.82 | 61.58 | 58.62 | 0.50 | 1.48 |
Distributed | Who2com [15] | 54.28 | 61.63 | 57.91 | 0.50 | 0.06 |
Distributed | When2com [16] | 55.74 | 61.82 | 58.77 | 0.27 | 3.36 |
Distributed | Ours | 56.03 | 62.81 | 59.39 | 0.36 | 4.25 |
Table 7.
Experimental results of various backbones in the Hetero-PIS mode on the DFC23 dataset.
Backbone | Type | Method | DFC23 mIoU ↑ (Noisy) | DFC23 mIoU ↑ (Normal) | DFC23 mIoU ↑ (Avg.) | DFC23 Comm. Cost ↓ | DFC23 CE ↑ |
---|
MobileNetV2-1.0 [69] | Individual | No-Interaction | 55.25 | 61.74 | 58.38 | N/A | N/A |
MobileNetV2-1.0 [69] | Centralized | MRCP-GNN [18] | 56.14 | 62.40 | 59.21 | 1.50 | 0.55 |
MobileNetV2-1.0 [69] | Centralized | MASH [17] | 56.43 | 63.10 | 59.72 | 3.00 | 0.45 |
MobileNetV2-1.0 [69] | Distributed | Who2com [15] | 55.14 | 62.05 | 58.67 | 0.50 | 0.58 |
MobileNetV2-1.0 [69] | Distributed | When2com [16] | 56.14 | 62.40 | 59.32 | 0.23 | 4.02 |
MobileNetV2-1.0 [69] | Distributed | Ours | 56.75 | 62.94 | 59.84 | 0.32 | 4.56 |
EfficientNet-B0 [70] | Individual | No-Interaction | 55.84 | 62.30 | 59.06 | N/A | N/A |
EfficientNet-B0 [70] | Centralized | MRCP-GNN [18] | 57.01 | 63.20 | 60.08 | 1.50 | 0.68 |
EfficientNet-B0 [70] | Centralized | MASH [17] | 57.23 | 63.12 | 60.14 | 3.00 | 0.36 |
EfficientNet-B0 [70] | Distributed | Who2com [15] | 56.43 | 62.40 | 59.42 | 0.50 | 0.72 |
EfficientNet-B0 [70] | Distributed | When2com [16] | 57.08 | 62.86 | 59.96 | 0.28 | 3.17 |
EfficientNet-B0 [70] | Distributed | Ours | 57.36 | 63.17 | 60.24 | 0.31 | 3.76 |
Table 8.
Experimental results of various downstream decoders in the Hetero-PIS mode.
Decoder | Type | Method | DFC23 mIoU ↑ (Avg.) | DFC23 Comm. Cost ↓ | DFC23 CE ↑ |
---|
PSPNet [68] | Individual | No-Interaction | 57.76 | N/A | N/A |
PSPNet [68] | Centralized | MRCP-GNN [18] | 58.92 | 1.50 | 0.77 |
PSPNet [68] | Centralized | MASH [17] | 58.88 | 3.00 | 0.37 |
PSPNet [68] | Distributed | Who2com [15] | 58.57 | 0.50 | 1.62 |
PSPNet [68] | Distributed | When2com [16] | 58.87 | 0.25 | 4.44 |
PSPNet [68] | Distributed | Ours | 59.61 | 0.39 | 4.72 |
DeepLabV3 [34] | Individual | No-Interaction | 61.12 | N/A | N/A |
DeepLabV3 [34] | Centralized | MRCP-GNN [18] | 61.61 | 3.00 | 0.16 |
DeepLabV3 [34] | Centralized | MASH [17] | 61.69 | 3.00 | 0.19 |
DeepLabV3 [34] | Distributed | Who2com [15] | 61.50 | 1.00 | 0.38 |
DeepLabV3 [34] | Distributed | When2com [16] | 61.36 | 0.63 | 0.38 |
DeepLabV3 [34] | Distributed | Ours | 62.22 | 0.68 | 1.62 |
UPerNet [35] | Individual | No-Interaction | 60.15 | N/A | N/A |
UPerNet [35] | Centralized | MRCP-GNN [18] | 60.46 | 1.50 | 0.21 |
UPerNet [35] | Centralized | MASH [17] | 60.86 | 3.00 | 0.24 |
UPerNet [35] | Distributed | Who2com [15] | 60.01 | 0.50 | −0.28 |
UPerNet [35] | Distributed | When2com [16] | 60.46 | 0.24 | 1.29 |
UPerNet [35] | Distributed | Ours | 61.22 | 0.40 | 2.68 |
Table 9.
Ablation experiments on each designed module in the Homo-CIS mode.
SMIM | RFF | Potsdam mIoU ↑ (Noisy) | Potsdam mIoU ↑ (Normal) | Potsdam mIoU ↑ (Avg.) | Potsdam Comm. Cost ↓ | Potsdam CE ↑ | iSAID mIoU ↑ (Noisy) | iSAID mIoU ↑ (Normal) | iSAID mIoU ↑ (Avg.) | iSAID Comm. Cost ↓ | iSAID CE ↑ |
---|
| | 50.18 | 65.09 | 57.38 | N/A | N/A | 38.77 | 49.33 | 44.24 | N/A | N/A |
| ✓ | 65.49 | 66.36 | 65.92 | 1.50 | 5.69 | 51.65 | 52.96 | 52.24 | 1.50 | 5.33 |
✓ | | 64.74 | 65.87 | 65.31 | 0.26 | 31.10 | 50.11 | 51.37 | 50.69 | 0.25 | 25.98 |
✓ | ✓ | 65.39 | 66.36 | 65.87 | 0.26 | 33.29 | 51.45 | 52.13 | 51.71 | 0.25 | 30.09 |
Table 10.
Ablation experiments on each designed module in the Homo-PIS mode.
SMIM | RFF | Potsdam mIoU ↑ (Noisy) | Potsdam mIoU ↑ (Normal) | Potsdam mIoU ↑ (Avg.) | Potsdam Comm. Cost ↓ | Potsdam CE ↑ | iSAID mIoU ↑ (Noisy) | iSAID mIoU ↑ (Normal) | iSAID mIoU ↑ (Avg.) | iSAID Comm. Cost ↓ | iSAID CE ↑ |
---|
| | 48.47 | 63.30 | 55.37 | N/A | N/A | 38.59 | 50.42 | 44.69 | N/A | N/A |
| ✓ | 56.19 | 64.54 | 60.11 | 1.50 | 3.16 | 47.20 | 50.82 | 48.95 | 1.50 | 2.84 |
✓ | | 49.98 | 63.48 | 56.19 | N/A | N/A | 39.93 | 49.67 | 45.03 | 0.31 | 1.08 |
✓ | ✓ | 54.43 | 63.70 | 58.91 | 0.39 | 9.12 | 45.56 | 50.59 | 47.97 | 0.39 | 8.52 |
Table 11.
Ablation experiments on each designed module in the Hetero-PIS mode.
SMIM | RFF | DFC23 mIoU ↑ (Noisy) | DFC23 mIoU ↑ (Normal) | DFC23 mIoU ↑ (Avg.) | DFC23 Comm. Cost ↓ | DFC23 CE ↑ |
---|
| | 54.94 | 61.04 | 57.88 | N/A | N/A |
| ✓ | 55.96 | 63.29 | 59.66 | 1.50 | 1.19 |
✓ | | 55.73 | 62.21 | 58.89 | 0.36 | 2.85 |
✓ | ✓ | 56.03 | 62.81 | 59.39 | 0.36 | 4.25 |