Author Contributions
Conceptualization, Y.Z. (Yue Zhang) and R.Y.; methodology, Y.Z. (Yue Zhang), R.Y., Q.D. and L.W.; software, Y.Z. (Yue Zhang) and R.Y.; validation, Y.Z. (Yue Zhang), R.Y., Q.D., J.W. and L.W.; formal analysis, L.W. and Q.D.; investigation, W.X. and Y.Z. (Yili Zhao); resources, L.W., Q.D., W.X. and Y.Z. (Yili Zhao); data curation, Y.Z. (Yue Zhang) and R.Y.; writing—original draft preparation, Y.Z. (Yue Zhang) and R.Y.; writing—review and editing, Y.Z. (Yue Zhang), R.Y., L.W., Q.D., W.X. and Y.Z. (Yili Zhao); visualization, Y.Z. (Yue Zhang) and R.Y.; supervision, L.W. and Q.D.; project administration, W.X. and Y.Z. (Yili Zhao); funding acquisition, J.W., L.W., Q.D., W.X. and Y.Z. (Yili Zhao). All authors have read and agreed to the published version of the manuscript.
Figure 1.
Visual comparison of object boundaries obtained by the Res-UNet network with different inputs: (a) RGB bands (3 channels); (b) RGB-NIR + NDWI + GNDVI bands (6 channels); (c) RGB-NIR + NDWI + GNDVI + NDVI + DVI + RVI bands (9 channels). White marks the ground-truth boundary, red marks the boundary obtained from the semantic segmentation, and green marks where the segmentation boundary overlaps the ground truth.
Figure 2.
The proposed HAE-RNet architecture and block details. (a) HAE-RNet is in an encoder–decoder style. (b) The ASPP Block in the bridge part. (c) The Residual Block in the encoder–decoder part. (d) The ECA Block in the encoder part.
Figure 3.
Edge detection detail output map.
Figure 4.
Belief maps (outputs of the softmax function) obtained using HAE-RNet with 3-channel, 4-channel, 6-channel, and 9-channel inputs.
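As an illustration of what a softmax belief map contains, the sketch below turns per-pixel class scores into probabilities and keeps the highest one. It assumes the map shows the largest class probability at each pixel, which the caption does not state; the function names and the tiny input are hypothetical, and plain Python lists stand in for image arrays.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one pixel's class scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def belief_map(logit_image):
    """Highest class probability per pixel for an H x W x C nested list.

    Assumption: 'belief' means the maximum softmax probability.
    """
    return [[max(softmax(pixel)) for pixel in row] for row in logit_image]

# Hypothetical 1 x 2 image with three class scores per pixel.
logits = [[[2.0, 0.5, 0.1], [0.3, 0.3, 0.3]]]
beliefs = belief_map(logits)
```

A pixel whose scores are all equal gets a belief of 1/3 here, while a pixel with one dominant score gets a value closer to 1.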
Figure 5.
Comparison of segmentation results of HAE-RNet for four channel combinations of 3C, 4C, 6C-2, and 9C on the GID test set.
Figure 6.
Comparison of segmentation results of HAE-RNet for four channel combinations of 3C, 4C, 6C-2, and 9C on the Vaihingen test set.
Figure 7.
Comparison of edge segmentation results of Res-UNet, Res-UNet+ASPP, Res-UNet+ED, and HAE-RNet against the ground truth on four GID test images. The x-axis gives the image index (the 3rd, 4th, 9th, and 10th images of the displayed GID test set). For the four networks, the panels show (a) EES values, (b) GCE values, (c) VoI values, and (d) PRI values.
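Two of the measures named in this caption have widely used definitions that can be sketched briefly. The code below assumes VoI is the variation of information, H(A) + H(B) - 2 I(A; B), and that PRI is evaluated against a single ground truth, in which case it reduces to the ordinary Rand index. Whether the paper computes them exactly this way is not stated in this section; EES and GCE are not covered, and the function names are hypothetical.

```python
import math
from collections import Counter

def variation_of_information(seg_a, seg_b):
    """VoI = H(A) + H(B) - 2 I(A; B) for two flat label sequences."""
    n = len(seg_a)
    joint = Counter(zip(seg_a, seg_b))
    count_a = Counter(seg_a)
    count_b = Counter(seg_b)
    entropy_a = -sum(c / n * math.log(c / n) for c in count_a.values())
    entropy_b = -sum(c / n * math.log(c / n) for c in count_b.values())
    mutual = sum(
        c / n * math.log((c / n) / ((count_a[a] / n) * (count_b[b] / n)))
        for (a, b), c in joint.items()
    )
    return entropy_a + entropy_b - 2.0 * mutual

def rand_index(seg_a, seg_b):
    """Fraction of pixel pairs on which the two segmentations agree."""
    n = len(seg_a)
    pairs = n * (n - 1) // 2

    def same_pairs(counter):
        # Number of pixel pairs that share a label.
        return sum(c * (c - 1) // 2 for c in counter.values())

    both = same_pairs(Counter(zip(seg_a, seg_b)))
    in_a = same_pairs(Counter(seg_a))
    in_b = same_pairs(Counter(seg_b))
    # Agreements = pairs joined in both + pairs separated in both.
    return (pairs + 2 * both - in_a - in_b) / pairs

# Hypothetical flattened label maps of four pixels.
truth = [0, 0, 1, 1]
merged = [0, 0, 0, 0]
```

Under these definitions, lower VoI and higher Rand index both indicate closer agreement with the ground truth.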
Figure 8.
Comparison of edge segmentation results for the 3C, 6C, and 9C inputs against the ground truth on four GID test images. The x-axis gives the ordinal number of the test image. For each channel combination, the panels show (a) EES values, (b) GCE values, (c) VoI values, and (d) PRI values.
Figure 9.
Comparison of ground truth and HAE-RNet segmentation results on the GID test set. The first row shows the original RGB images, the second row the ground truth, and the third row our segmentation results. The fourth row is a red/green map in which green marks correctly classified pixels and red marks misclassified pixels. The fifth row overlays this red/green map on the original images.
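The red/green map described in this caption can be sketched as a per-pixel comparison of label maps followed by a blend with the original image. This is a minimal illustration with nested lists; the blend weight `alpha` and the function names are assumptions, not values taken from the paper.

```python
GREEN = (0, 255, 0)  # correctly classified pixel
RED = (255, 0, 0)    # misclassified pixel

def error_map(truth, pred):
    """Colour each pixel green where labels match and red where they differ."""
    return [
        [GREEN if t == p else RED for t, p in zip(truth_row, pred_row)]
        for truth_row, pred_row in zip(truth, pred)
    ]

def overlay(image, colors, alpha=0.5):
    """Blend the colour map onto an RGB image (alpha is a hypothetical weight)."""
    return [
        [
            tuple(round((1 - alpha) * i + alpha * c) for i, c in zip(px, col))
            for px, col in zip(image_row, color_row)
        ]
        for image_row, color_row in zip(image, colors)
    ]

# Hypothetical 1 x 2 example: the second pixel is misclassified.
truth = [[0, 1]]
pred = [[0, 2]]
image = [[(100, 100, 100), (200, 200, 200)]]
colors = error_map(truth, pred)
blended = overlay(image, colors)
```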
Figure 10.
Comparison of ground truth and HAE-RNet segmentation results on the Vaihingen test set. The first row shows the original RGB images, the second row the ground truth, and the third row our segmentation results. The fourth row is a red/green map in which green marks correctly classified pixels and red marks misclassified pixels. The fifth row overlays this red/green map on the original images.
Figure 11.
OA, Recall, F1, mIoU, and FWIoU plots for VGG-16, ResNet-50, U-Net, Res-UNet, and HAE-RNet: (a) results on the GID test set; (b) results on the Vaihingen test set.
Figure 12.
Comparison of segmentation results of VGG-16, U-Net, ResNet-50, Res-UNet, and HAE-RNet (6C-2) on the GID dataset.
Figure 13.
Comparison of segmentation results of VGG-16, U-Net, ResNet-50, Res-UNet, and HAE-RNet (6C-2) on the Vaihingen dataset.
Table 1.
Considered band combinations as inputs for HAE-RNet.
Combination Name | Combination Description |
---|---|
3C | Original Bands |
4C | Original Bands-1 |
6C-1 | Original Bands-1 + Water index/DSM + Vegetation index-1 |
6C-2 | Original Bands-1 + Water index/DSM + Vegetation index-2 |
6C-3 | Original Bands-1 + Water index/DSM + Vegetation index-3 |
6C-4 | Original Bands-1 + Water index/DSM + Vegetation index-4 |
9C | Original Bands-1 + Water index/DSM + All Vegetation indices |
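To make the band combinations concrete, the sketch below computes the indices named in the Figure 1 caption for a single pixel and stacks them onto the RGB-NIR bands. The formulas are the commonly used definitions and are an assumption here, since this section does not state which variants the paper uses. It also does not say which index corresponds to "Vegetation index-1" through "Vegetation index-4", so the example follows the 6-channel and 9-channel orderings given in the Figure 1 caption. The DSM alternative is not covered, and all function names are hypothetical.

```python
def spectral_indices(red, green, nir, eps=1e-6):
    """Common index definitions (assumed); eps guards against division by zero."""
    return {
        "NDWI": (green - nir) / (green + nir + eps),
        "GNDVI": (nir - green) / (nir + green + eps),
        "NDVI": (nir - red) / (nir + red + eps),
        "DVI": nir - red,
        "RVI": nir / (red + eps),
    }

def stack_channels(red, green, blue, nir, index_names):
    """Return the channel vector: original bands followed by the chosen indices."""
    indices = spectral_indices(red, green, nir)
    return [red, green, blue, nir] + [indices[name] for name in index_names]

# Hypothetical reflectance values for one pixel.
pixel = dict(red=0.2, green=0.3, blue=0.1, nir=0.6)
six = stack_channels(**pixel, index_names=["NDWI", "GNDVI"])
nine = stack_channels(**pixel, index_names=["NDWI", "GNDVI", "NDVI", "DVI", "RVI"])
```

In practice the same arithmetic would be applied to whole band arrays rather than to one pixel at a time.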
Table 2.
Evaluation of segmentation results for different band inputs of 3C, 4C, 6C-2, and 9C in the HAE-RNet method on the GID test set.
Channels | Built-Up Pre | Built-Up IoU | Farmland Pre | Farmland IoU | Forest Pre | Forest IoU | Meadow Pre | Meadow IoU | Water Pre | Water IoU | OA | Recall | F1 | mIoU | FWIoU |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
3C | 98.99 | 97.73 | 95.25 | 89.15 | 93.45 | 91.33 | 97.63 | 93.04 | 85.88 | 78.39 | 93.93 | 94.95 | 94.57 | 89.93 | 88.80 |
4C | 99.34 | 96.54 | 96.84 | 91.53 | 99.27 | 98.39 | 96.20 | 95.71 | 90.05 | 85.83 | 95.88 | 96.98 | 96.64 | 93.60 | 92.19 |
6C-2 | 98.90 | 98.51 | 98.10 | 95.87 | 99.75 | 99.43 | 97.85 | 96.67 | 96.38 | 93.15 | 97.86 | 98.45 | 98.32 | 96.72 | 95.82 |
9C | 99.33 | 98.37 | 98.79 | 94.01 | 99.11 | 98.55 | 97.43 | 97.25 | 91.20 | 89.40 | 97.08 | 98.24 | 97.67 | 95.52 | 94.40 |
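The columns of Table 2 can be related to a confusion matrix as sketched below. The averaging choices are assumptions: Recall is macro-averaged over classes, F1 is computed from macro precision and macro recall, and FWIoU weights each class IoU by its ground-truth pixel frequency. This section does not confirm that the paper uses these exact conventions. The function name and the example matrix are hypothetical.

```python
def segmentation_metrics(cm):
    """Metrics from a square confusion matrix.

    cm[i][j] = number of pixels of true class i predicted as class j.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    tp = [cm[i][i] for i in range(k)]
    true_count = [sum(cm[i]) for i in range(k)]
    pred_count = [sum(cm[i][j] for i in range(k)) for j in range(k)]

    def ratio(num, den):
        return num / den if den else 0.0

    precision = [ratio(tp[i], pred_count[i]) for i in range(k)]
    recall = [ratio(tp[i], true_count[i]) for i in range(k)]
    iou = [ratio(tp[i], true_count[i] + pred_count[i] - tp[i]) for i in range(k)]

    macro_p = sum(precision) / k
    macro_r = sum(recall) / k
    return {
        "Pre": precision,  # per-class precision
        "IoU": iou,        # per-class intersection over union
        "OA": ratio(sum(tp), total),
        "Recall": macro_r,
        "F1": ratio(2 * macro_p * macro_r, macro_p + macro_r),
        "mIoU": sum(iou) / k,
        "FWIoU": sum(ratio(true_count[i], total) * iou[i] for i in range(k)),
    }

# Hypothetical two-class confusion matrix.
metrics = segmentation_metrics([[8, 2], [1, 9]])
```

The unweighted-mean reading of mIoU is consistent with the table: the five class IoU values in the 3C row average to 89.93, which matches the reported mIoU.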
Table 3.
Evaluation of segmentation results for different band inputs of 3C, 4C, 6C-2, and 9C in the HAE-RNet method on the Vaihingen test set.
Input Channels | Building Pre | Building IoU | Imp-Surface Pre | Imp-Surface IoU | Tree Pre | Tree IoU | Low-Vegetation Pre | Low-Vegetation IoU | Car Pre | Car IoU | OA | Recall | F1 | mIoU | FWIoU |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
3C | 87.63 | 81.16 | 89.85 | 78.62 | 82.77 | 67.33 | 76.58 | 63.83 | 68.75 | 60.09 | 84.86 | 83.65 | 82.22 | 70.21 | 74.02 |
4C | 91.71 | 87.40 | 92.89 | 82.45 | 83.25 | 69.13 | 77.66 | 66.11 | 65.84 | 59.48 | 87.24 | 86.17 | 83.92 | 72.91 | 77.90 |
6C-2 | 93.85 | 88.70 | 93.81 | 86.27 | 88.35 | 75.00 | 75.00 | 64.38 | 66.05 | 59.58 | 89.59 | 87.34 | 85.07 | 74.79 | 81.78 |
9C | 91.47 | 87.83 | 93.35 | 81.41 | 80.73 | 67.63 | 76.66 | 64.44 | 61.76 | 55.15 | 86.64 | 85.33 | 82.69 | 71.29 | 77.06 |
Table 4.
Evaluation of segmentation results for different 6-channel combinations as inputs to the HAE-RNet method on the GID test set.
Channels | Built-Up Pre | Built-Up IoU | Farmland Pre | Farmland IoU | Forest Pre | Forest IoU | Meadow Pre | Meadow IoU | Water Pre | Water IoU | OA | Recall | F1 | mIoU | FWIoU |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
6C-1 | 99.05 | 97.25 | 97.71 | 92.90 | 99.04 | 97.24 | 97.66 | 96.48 | 90.75 | 88.23 | 96.50 | 97.40 | 97.10 | 94.42 | 93.31 |
6C-2 | 98.90 | 98.51 | 98.10 | 95.87 | 99.75 | 99.43 | 97.85 | 96.67 | 96.38 | 93.15 | 97.86 | 98.45 | 98.32 | 96.72 | 95.82 |
6C-3 | 99.12 | 97.50 | 99.04 | 94.91 | 99.67 | 97.77 | 96.23 | 95.69 | 92.72 | 91.68 | 97.34 | 98.09 | 97.69 | 95.51 | 94.86 |
6C-4 | 99.47 | 98.67 | 98.39 | 93.19 | 99.43 | 98.93 | 97.44 | 97.01 | 90.38 | 88.01 | 96.71 | 97.99 | 97.47 | 95.16 | 93.74 |
Table 5.
Evaluation of segmentation results for different 6-channel combinations as inputs to the HAE-RNet method on the Vaihingen test set.
Channels | Building Pre | Building IoU | Imp-Surface Pre | Imp-Surface IoU | Tree Pre | Tree IoU | Low-Vegetation Pre | Low-Vegetation IoU | Car Pre | Car IoU | OA | Recall | F1 | mIoU | FWIoU |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
6C-1 | 91.36 | 87.73 | 93.04 | 82.95 | 83.23 | 68.36 | 77.43 | 65.20 | 65.08 | 58.66 | 87.13 | 85.90 | 83.65 | 72.58 | 77.75 |
6C-2 | 93.85 | 88.70 | 93.81 | 86.27 | 88.35 | 75.00 | 75.00 | 64.38 | 66.05 | 59.58 | 89.59 | 87.34 | 85.07 | 74.79 | 81.78 |
6C-3 | 91.47 | 87.83 | 93.29 | 83.21 | 85.21 | 68.40 | 75.69 | 64.97 | 64.41 | 58.26 | 87.18 | 85.96 | 83.60 | 72.53 | 77.87 |
6C-4 | 91.47 | 87.53 | 93.45 | 82.16 | 83.16 | 68.37 | 75.52 | 64.66 | 67.78 | 60.16 | 86.92 | 85.58 | 83.69 | 72.58 | 77.47 |
Table 6.
Evaluation of the results of the HAE-RNet ablation experiment on the GID test set.
Baseline | ASPP | ED | OA | Recall | F1 | mIoU | FWIoU |
---|---|---|---|---|---|---|---|
√ | | | 97.42 | 98.19 | 97.79 | 95.71 | 95.00 |
√ | √ | | 97.51 | 98.10 | 97.81 | 95.74 | 95.16 |
√ | | √ | 97.70 | 98.30 | 98.13 | 96.36 | 95.54 |
√ | √ | √ | 97.86 | 98.45 | 98.32 | 96.72 | 95.82 |
Table 7.
Evaluation of the results of the HAE-RNet ablation experiment on the Vaihingen test set.
Baseline | ASPP | ED | OA | Recall | F1 | mIoU | FWIoU |
---|---|---|---|---|---|---|---|
√ | | | 86.60 | 85.55 | 82.10 | 70.56 | 76.94 |
√ | √ | | 86.31 | 84.99 | 82.79 | 71.27 | 76.54 |
√ | | √ | 86.23 | 85.47 | 81.68 | 69.99 | 76.37 |
√ | √ | √ | 89.59 | 87.34 | 85.07 | 74.79 | 81.78 |
Table 8.
Evaluation of segmentation results for VGG-16, ResNet-50, U-Net, Res-UNet, and HAE-RNet with the 6C-2 band combination as input on the GID test set.
Networks | Built-Up Pre | Built-Up IoU | Farmland Pre | Farmland IoU | Forest Pre | Forest IoU | Meadow Pre | Meadow IoU | Water Pre | Water IoU | OA | Recall | F1 | mIoU | FWIoU |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
VGG-16 | 98.19 | 96.37 | 99.16 | 92.62 | 96.83 | 96.10 | 93.91 | 93.72 | 88.51 | 87.53 | 96.08 | 97.84 | 96.49 | 93.27 | 92.53 |
ResNet-50 | 97.40 | 95.83 | 98.96 | 93.73 | 99.50 | 96.86 | 95.97 | 94.39 | 89.69 | 88.88 | 96.51 | 97.53 | 96.85 | 93.94 | 93.33 |
U-Net | 99.49 | 97.90 | 97.57 | 93.51 | 98.58 | 97.97 | 94.67 | 93.36 | 93.88 | 89.94 | 96.68 | 97.53 | 97.17 | 94.55 | 93.62 |
Res-UNet | 98.70 | 97.47 | 98.59 | 94.88 | 98.51 | 98.00 | 97.25 | 96.24 | 94.03 | 91.95 | 97.42 | 98.19 | 97.79 | 95.71 | 95.00 |
HAE-RNet | 98.90 | 98.51 | 98.10 | 95.87 | 99.75 | 99.43 | 97.85 | 96.67 | 96.38 | 93.15 | 97.86 | 98.45 | 98.32 | 96.72 | 95.82 |
Table 9.
Evaluation of segmentation results for VGG-16, ResNet-50, U-Net, Res-UNet, and HAE-RNet with the 6C-2 band combination as input on the Vaihingen test set.
Networks | Building Pre | Building IoU | Imp-Surface Pre | Imp-Surface IoU | Tree Pre | Tree IoU | Low-Vegetation Pre | Low-Vegetation IoU | Car Pre | Car IoU | OA | Recall | F1 | mIoU | FWIoU |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
VGG-16 | 88.34 | 82.49 | 90.23 | 77.86 | 83.78 | 67.49 | 73.90 | 61.83 | 52.75 | 46.37 | 84.49 | 82.73 | 79.66 | 67.21 | 73.69 |
ResNet-50 | 85.14 | 79.62 | 91.61 | 77.46 | 84.32 | 67.77 | 73.73 | 63.44 | 66.19 | 55.89 | 84.32 | 82.72 | 81.22 | 68.84 | 73.27 |
U-Net | 85.74 | 72.44 | 87.83 | 71.46 | 83.55 | 65.30 | 58.97 | 52.28 | 59.31 | 51.09 | 79.84 | 79.49 | 76.54 | 62.52 | 67.40 |
Res-UNet | 91.60 | 86.42 | 92.06 | 81.47 | 81.55 | 68.41 | 78.11 | 65.31 | 56.13 | 51.21 | 86.60 | 85.55 | 82.10 | 70.56 | 76.94 |
HAE-RNet | 93.85 | 88.70 | 93.81 | 86.27 | 88.35 | 75.00 | 75.00 | 64.38 | 66.05 | 59.58 | 89.59 | 87.34 | 85.07 | 74.79 | 81.78 |
Table 10.
Numbers of parameters and FLOPs used by different networks.
Network | Params | FLOPs (G) |
---|---|---|
VGG-16 | 14,716,416 | 1290 |
U-Net | 31,036,870 | 3520 |
ResNet-50 | 23,570,560 | 1610 |
Res-UNet | 22,085,446 | 5040 |
Ours | 52,085,830 | 5050 |
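The relative cost of each network is easier to read as a ratio. The short script below does that arithmetic on the Table 10 values, using Res-UNet as the reference. The choice of reference and the variable names are illustrative assumptions.

```python
# (parameters, FLOPs in G) copied from Table 10.
networks = {
    "VGG-16": (14_716_416, 1290),
    "U-Net": (31_036_870, 3520),
    "ResNet-50": (23_570_560, 1610),
    "Res-UNet": (22_085_446, 5040),
    "Ours": (52_085_830, 5050),
}

# Res-UNet is used as the reference network (an illustrative choice).
base_params, base_flops = networks["Res-UNet"]

for name, (params, flops) in networks.items():
    print(f"{name:10s} params x{params / base_params:.2f}  FLOPs x{flops / base_flops:.3f}")

param_ratio = networks["Ours"][0] / base_params
flops_ratio = networks["Ours"][1] / base_flops
```

On these figures the proposed network has roughly 2.36 times the parameters of Res-UNet, while its FLOPs are about 0.2% higher.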