EUNet: Edge-UNet for Accurate Building Extraction and Edge Emphasis in Gaofen-7 Images
Abstract
1. Introduction
- Considering that building extraction from GF-7 imagery can support subsequent 3D-modeling work, a building dataset (the GF-7 Building Dataset) is produced from orthophotos derived from GF-7 stereo images.
- To enhance edge perception and improve the accuracy and completeness of extracted building edges, a multi-task network (EUNet) based on the UNet structure is proposed, in which an edge-detection module is incorporated into the up-sampling path (a minimal illustrative sketch follows this list).
- To verify the superiority of the proposed network, we conduct experiments on the GF-7 dataset, the WHU dataset, and the Massachusetts Buildings Dataset; the proposed method outperforms existing approaches across all three datasets.
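As a rough illustration of this multi-task design, the PyTorch sketch below attaches an auxiliary one-channel edge head to a decoder feature map and sums a segmentation loss with a weighted edge loss. The names `EdgeHead` and `joint_loss` and the weight `lambda_edge` are assumptions for illustration only; the actual edge-detection block and loss are those defined in Sections 2.2.1–2.2.3.

```python
# Minimal multi-task sketch, assuming a PyTorch setting. `EdgeHead`,
# `joint_loss`, and `lambda_edge` are illustrative assumptions, not the
# authors' implementation of the EUNet edge-detection block (Section 2.2.3).
import torch
import torch.nn as nn

class EdgeHead(nn.Module):
    """Auxiliary 1-channel edge-logit head on a decoder feature map."""
    def __init__(self, in_ch: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, in_ch // 2, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_ch // 2, 1, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.conv(x)  # edge logits, same spatial size as x

def joint_loss(seg_logits, edge_logits, seg_gt, edge_gt, lambda_edge=0.5):
    """Binary segmentation loss plus a weighted auxiliary edge loss."""
    bce = nn.BCEWithLogitsLoss()
    return bce(seg_logits, seg_gt) + lambda_edge * bce(edge_logits, edge_gt)

if __name__ == "__main__":
    feat = torch.randn(1, 64, 128, 128)   # a decoder (up-sampling) feature map
    print(EdgeHead(64)(feat).shape)       # torch.Size([1, 1, 128, 128])
```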
2. Materials and Methods
2.1. Materials
2.1.1. Production of GF-7 Dataset
2.1.2. Existing Building Extraction Datasets
2.2. Methods
2.2.1. Optimization of the UNet Network Structure by Adding an Edge-Detection Block
2.2.2. Feature Extraction Block
2.2.3. Edge-Detection Block
2.2.4. Evaluation Metrics
- IoU (Intersection over Union) describes the degree of overlap between two sets: the number of elements in their intersection divided by the number of elements in their union. When used as an accuracy index for semantic segmentation, the formula is as follows, where TP, FP, and FN denote the numbers of true positive, false positive, and false negative pixels, respectively:

  $$\mathrm{IoU} = \frac{TP}{TP + FP + FN}$$
- Precision describes the proportion of pixels classified as positive samples that are true positives; the formula is as follows:

  $$\mathrm{Precision} = \frac{TP}{TP + FP}$$
- Recall describes the proportion of true positive pixels that are correctly classified as positive; the formula is given by:

  $$\mathrm{Recall} = \frac{TP}{TP + FN}$$
- The F1 score is a comprehensive assessment of binary-classifier performance that takes both Precision and Recall into account. It is the harmonic mean of the two and can be expressed by the following equation:

  $$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
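These four metrics follow directly from the pixel-level confusion counts. The NumPy sketch below is a minimal illustration; the function name `binary_metrics` and the `eps` smoothing term are assumptions, not the paper's evaluation code.

```python
# Minimal sketch: IoU, Precision, Recall, and F1 from binary masks.
import numpy as np

def binary_metrics(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8):
    """pred and gt: arrays of the same shape, nonzero = building pixel."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()    # true positives
    fp = np.logical_and(pred, ~gt).sum()   # false positives
    fn = np.logical_and(~pred, gt).sum()   # false negatives
    iou = tp / (tp + fp + fn + eps)
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    return iou, precision, recall, f1
```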
3. Experiments and Results
3.1. Experiments
3.2. Results
4. Discussion
4.1. Regarding the Result of the Comparison Experiments
4.2. Regarding the Proposed EUNet Framework
4.3. Regarding the GF-7 Building Dataset
4.4. Methodological Limitations and Perspectives for Future Work
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Zakharov, A.; Tuzhilkin, A.; Zhiznyakov, A. Automatic Building Detection from Satellite Images Using Spectral Graph Theory. In Proceedings of the 2015 International Conference on Mechanical Engineering, Automation and Control Systems (MEACS), Tomsk, Russia, 1–4 December 2015; pp. 1–5.
- Chen, L.-C.; Huang, C.-Y.; Teo, T.-A. Multi-Type Change Detection of Building Models by Integrating Spatial and Spectral Information. Int. J. Remote Sens. 2012, 33, 1655–1681.
- Zhang, Y. Optimisation of Building Detection in Satellite Images by Combining Multispectral Classification and Texture Filtering. ISPRS J. Photogramm. Remote Sens. 1999, 54, 50–60.
- Awrangjeb, M.; Zhang, C.; Fraser, C.S. Improved building detection using texture information. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2013, XXXVIII-3/W22, 143–148.
- Ding, Z.; Wang, X.Q.; Li, Y.L.; Zhang, S.S. Study on building extraction from high-resolution images using MBI. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2018, XLII-3, 283–287.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. Commun. ACM 2017, 60, 84–90.
- Song, J.; Gao, S.; Zhu, Y.; Ma, C. A Survey of Remote Sensing Image Classification Based on CNNs. Big Earth Data 2019, 3, 232–254.
- Zhang, F.; Du, B.; Zhang, L.; Xu, M. Weakly Supervised Learning Based on Coupled Convolutional Neural Networks for Aircraft Detection. IEEE Trans. Geosci. Remote Sens. 2016, 54, 5553–5563.
- Shelhamer, E.; Long, J.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 640–651.
- Li, Y.; He, B.; Long, T.; Bai, X. Evaluation the Performance of Fully Convolutional Networks for Building Extraction Compared with Shallow Models. In Proceedings of the 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Fort Worth, TX, USA, 23–28 July 2017; pp. 850–853.
- Sariturk, B.; Bayram, B.; Duran, Z.; Seker, D.Z. Feature Extraction from Satellite Images Using SegNet and Fully Convolutional Networks (FCN). Int. J. Eng. Geosci. 2020, 5, 138–143.
- Maggiori, E.; Tarabalka, Y.; Charpiat, G.; Alliez, P. Convolutional Neural Networks for Large-Scale Remote-Sensing Image Classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 645–657.
- Cui, W.; Xiong, B.; Zhang, L. Multi-scale fully convolutional neural network for building extraction. Acta Geod. Cartogr. Sin. 2019, 48, 597–608.
- Shrestha, S.; Vanneschi, L. Improved Fully Convolutional Network with Conditional Random Fields for Building Extraction. Remote Sens. 2018, 10, 1135.
- Bittner, K.; Cui, S.; Reinartz, P. Building extraction from remote-sensing data using fully convolutional networks. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2017, XLII-1/W1, 481–486.
- Bittner, K.; Adam, F.; Cui, S.; Körner, M.; Reinartz, P. Building Footprint Extraction From VHR Remote Sensing Images Combined With Normalized DSMs Using Fused Fully Convolutional Networks. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 2615–2629.
- Badrinarayanan, V.; Kendall, A.; Cipolla, R. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495.
- Noh, H.; Hong, S.; Han, B. Learning Deconvolution Network for Semantic Segmentation. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 1520–1528.
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015; Springer: Cham, Switzerland, 2015; pp. 234–241.
- Alsabhan, W.; Alotaiby, T. Automatic Building Extraction on Satellite Images Using Unet and ResNet50. Comput. Intell. Neurosci. 2022, 2022, e5008854.
- Mnih, V. Machine Learning for Aerial Image Labeling. Ph.D. Thesis, University of Toronto, Toronto, ON, Canada, 2013.
- Abdollahi, A.; Pradhan, B. Integrating Semantic Edges and Segmentation Information for Building Extraction from Aerial Images Using UNet. Mach. Learn. Appl. 2021, 6, 100194.
- Oktay, O.; Schlemper, J.; Folgoc, L.L.; Lee, M.J.; Heinrich, M.; Misawa, K.; Mori, K.; McDonagh, S.G.; Hammerla, N.Y.; Kainz, B.; et al. Attention U-Net: Learning Where to Look for the Pancreas. arXiv 2018, arXiv:1804.03999.
- Yu, M.; Chen, X.; Zhang, W.; Liu, Y. AGs-Unet: Building Extraction Model for High Resolution Remote Sensing Images Based on Attention Gates U Network. Sensors 2022, 22, 2932.
- Qiu, W.; Gu, L.; Gao, F.; Jiang, T. Building Extraction From Very High-Resolution Remote Sensing Images Using Refine-UNet. IEEE Geosci. Remote Sens. Lett. 2023, 20, 6002905.
- Hui, J.; Du, M.; Ye, X.; Qin, Q.; Sui, J. Effective Building Extraction From High-Resolution Remote Sensing Images With Multitask Driven Deep Neural Network. IEEE Geosci. Remote Sens. Lett. 2019, 16, 786–790.
- Chollet, F. Xception: Deep Learning with Depthwise Separable Convolutions. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 1800–1807.
- Yin, J.; Wu, F.; Qiu, Y.; Li, A.; Liu, C.; Gong, X. A Multiscale and Multitask Deep Learning Framework for Automatic Building Extraction. Remote Sens. 2022, 14, 4744.
- Hong, D.; Qiu, C.; Yu, A.; Quan, Y.; Liu, B.; Chen, X. Multi-Task Learning for Building Extraction and Change Detection from Remote Sensing Images. Appl. Sci. 2023, 13, 1037.
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 9992–10002.
- Yang, H.; Xu, M.; Chen, Y.; Wu, W.; Dong, W. A Postprocessing Method Based on Regions and Boundaries Using Convolutional Neural Networks and a New Dataset for Building Extraction. Remote Sens. 2022, 14, 647.
- Yang, G.; Zhang, Q.; Zhang, G. EANet: Edge-Aware Network for the Extraction of Buildings from Aerial Images. Remote Sens. 2020, 12, 2161.
- Moghalles, K.; Li, H.-C.; Al-Huda, Z.; Hezzam, E.A. Multi-Task Deep Network for Semantic Segmentation of Building in Very High Resolution Imagery. In Proceedings of the 2021 International Conference of Technology, Science and Administration (ICTSA), Taiz, Yemen, 22–24 March 2021; pp. 1–6.
- Shi, F.; Zhang, T. A Multi-Task Network with Distance–Mask–Boundary Consistency Constraints for Building Extraction from Aerial Images. Remote Sens. 2021, 13, 2656.
- 2D Semantic Labeling. Available online: https://www.isprs.org/education/benchmarks/UrbanSemLab/semantic-labeling.aspx (accessed on 23 April 2024).
- Maggiori, E.; Tarabalka, Y.; Charpiat, G.; Alliez, P. Can Semantic Labeling Methods Generalize to Any City? The Inria Aerial Image Labeling Benchmark. In Proceedings of the 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Fort Worth, TX, USA, 23–28 July 2017; pp. 3226–3229.
- Ji, S.; Wei, S.; Lu, M. Fully Convolutional Networks for Multisource Building Extraction From an Open Aerial and Satellite Imagery Data Set. IEEE Trans. Geosci. Remote Sens. 2019, 57, 574–586.
- Wang, J.; Hu, X.; Meng, Q.; Zhang, L.; Wang, C.; Liu, X.; Zhao, M. Developing a Method to Extract Building 3D Information from GF-7 Data. Remote Sens. 2021, 13, 4532.
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2014, arXiv:1409.1556.
- Xie, S.; Tu, Z. Holistically-Nested Edge Detection. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 1395–1403.
- He, J.; Zhang, S.; Yang, M.; Shan, Y.; Huang, T. BDCN: Bi-Directional Cascade Network for Perceptual Edge Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 100–113.
- Soria, X.; Riba, E.; Sappa, A. Dense Extreme Inception Network: Towards a Robust CNN Model for Edge Detection. In Proceedings of the 2020 IEEE Winter Conference on Applications of Computer Vision (WACV), Snowmass Village, CO, USA, 1–5 March 2020; pp. 1912–1921.
- Zhao, H.; Shi, J.; Qi, X.; Wang, X.; Jia, J. Pyramid Scene Parsing Network. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 6230–6239.
- Chen, L.-C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. In Computer Vision—ECCV 2018; Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y., Eds.; Springer International Publishing: Cham, Switzerland, 2018; pp. 833–851.
- Cao, H.; Wang, Y.; Chen, J.; Jiang, D.; Zhang, X.; Tian, Q.; Wang, M. Swin-Unet: Unet-Like Pure Transformer for Medical Image Segmentation. In Computer Vision—ECCV 2022 Workshops; Karlinsky, L., Michaeli, T., Nishino, K., Eds.; Springer Nature: Cham, Switzerland, 2023; pp. 205–218.
- Chen, S.; Zhang, Y.; Nie, K.; Li, X.; Wang, W. Extracting Building Areas from Photogrammetric DSM and DOM by Automatically Selecting Training Samples from Historical DLG Data. ISPRS Int. J. Geo-Inf. 2020, 9, 18.
- Liu, W.; Yang, M.; Xie, M.; Guo, Z.; Li, E.; Zhang, L.; Pei, T.; Wang, D. Accurate Building Extraction from Fused DSM and UAV Images Using a Chain Fully Convolutional Neural Network. Remote Sens. 2019, 11, 2912.
- Li, P.; Sun, Z.; Duan, G.; Wang, D.; Meng, Q.; Sun, Y. DMU-Net: A Dual-Stream Multi-Scale U-Net Network Using Multi-Dimensional Spatial Information for Urban Building Extraction. Sensors 2023, 23, 1991.
- Yan, Y.; Tan, Z.; Su, N.; Zhao, C. Building Extraction Based on an Optimized Stacked Sparse Autoencoder of Structure and Training Samples Using LIDAR DSM and Optical Images. Sensors 2017, 17, 1957.
- Luo, H.; He, B.; Guo, R.; Wang, W.; Kuai, X.; Xia, B.; Wan, Y.; Ma, D.; Xie, L. Urban Building Extraction and Modeling Using GF-7 DLC and MUX Images. Remote Sens. 2021, 13, 3414.
GF-7 DLC Imagery | Spectral Range (µm) | Spatial Resolution (m) | Viewing Angle (°) |
---|---|---|---|
Multispectral image | Blue: 0.45–0.52; Green: 0.52–0.59; Red: 0.63–0.69; NIR: 0.77–0.89 | 2.6 | Backward: −5 |
Panchromatic image | 0.45–0.90 | Forward: 0.8; Backward: 0.65 | Forward: +26; Backward: −5 |
Dataset | Resolution (m) | Source | Tile Size (pixels) | Tiles |
---|---|---|---|---|
Massachusetts | 1 | Aerial | 1500 × 1500 (cut to 500 × 500) | 151 (1072 after cutting) |
WHU | 0.3–2.5 | Aerial/satellite | 512 × 512 | 204 |
GF-7 | 0.65 | Satellite | 512 × 512 | 75 |
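A note on the Massachusetts row: cutting each 1500 × 1500 tile into a non-overlapping 500 × 500 grid is a simple operation, sketched below in NumPy (the function name `cut_tiles` is a hypothetical assumption). A full grid would yield 9 patches per tile, i.e. 1359 patches from 151 tiles, so the reported 1072 presumably reflects additional patch filtering (also an assumption).

```python
# Hypothetical sketch of the tiling implied by the table: split large tiles
# into non-overlapping 500 x 500 patches. `cut_tiles` is an assumed name.
import numpy as np

def cut_tiles(image: np.ndarray, size: int = 500) -> list[np.ndarray]:
    """Split an (H, W, ...) array into non-overlapping size x size patches."""
    h, w = image.shape[:2]
    return [
        image[r:r + size, c:c + size]
        for r in range(0, h - size + 1, size)
        for c in range(0, w - size + 1, size)
    ]

tile = np.zeros((1500, 1500, 3), dtype=np.uint8)  # one Massachusetts tile
assert len(cut_tiles(tile)) == 9                   # 3 x 3 grid of patches
```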
Network | mIoU (%) | Recall (%) | Precision (%) | F1 (%) |
---|---|---|---|---|
FCN8s | 65.2 | 71.9 | 82.6 | 76.9 |
Deeplabv3+ | 71.3 | 86.7 | 78.7 | 82.5 |
PSPNet | 73.0 | 85.8 | 81.9 | 83.8 |
Swin-UNet | 77.0 | 85.7 | 86.7 | 86.2 |
UNet | 81.2 | 88.6 | 89.4 | 89.0 |
Attention UNet | 81.4 | 88.7 | 89.6 | 89.1 |
EUNet | 81.1 | 89.2 | 94.9 | 92.0 |
Network | mIoU (%) | Recall (%) | Precision (%) | F1 (%) |
---|---|---|---|---|
FCN8s | 70.9 | 80.8 | 83.7 | 82.2 |
Deeplabv3+ | 71.1 | 83.7 | 81.5 | 82.6 |
Swin-UNet | 71.6 | 83.1 | 86.3 | 84.7 |
PSPNet | 76.6 | 85.2 | 87.4 | 86.3 |
UNet | 77.0 | 86.2 | 87.1 | 86.6 |
Attention UNet | 78.6 | 87.3 | 87.7 | 87.5 |
EUNet | 77.4 | 87.0 | 89.6 | 88.3 |
Network | mIoU (%) | Recall (%) | Precision (%) | F1 (%) |
---|---|---|---|---|
FCN8s | 65.8 | 75.4 | 81.5 | 78.3 |
PSPNet | 75.3 | 82.4 | 87.2 | 84.7 |
Deeplabv3+ | 74.7 | 84.8 | 84.5 | 84.6 |
Swin-UNet | 77.2 | 86.8 | 92.1 | 89.4 |
UNet | 84.1 | 91.2 | 90.9 | 91.0 |
Attention UNet | 85.6 | 92.2 | 91.8 | 92.0 |
EUNet | 84.6 | 92.3 | 94.9 | 93.6 |
Network | mIoU (%) | Recall (%) | Precision (%) | F1 (%) |
---|---|---|---|---|
UNet | 81.17 | 88.63 | 89.42 | 89.02 |
+Dexi-EDB | 81.02 | 89.18 | 94.83 | 91.92 |
+EDB | 81.11 | 89.21 | 94.86 | 91.95 |
Network | mIoU (%) | Recall (%) | Precision (%) | F1 (%) |
---|---|---|---|---|
UNet | 76.98 | 86.23 | 87.13 | 86.68 |
+Dexi-EDB | 76.94 | 86.59 | 89.35 | 87.95 |
+EDB | 77.41 | 86.97 | 89.58 | 88.26 |
Network | mIoU (%) | Recall (%) | Precision (%) | F1 (%) |
---|---|---|---|---|
UNet | 84.08 | 91.21 | 90.87 | 91.04 |
+Dexi-EDB | 84.36 | 92.17 | 94.77 | 93.45 |
+EDB | 84.57 | 92.33 | 94.85 | 93.57 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Han, R.; Fan, X.; Liu, J. EUNet: Edge-UNet for Accurate Building Extraction and Edge Emphasis in Gaofen-7 Images. Remote Sens. 2024, 16, 2397. https://doi.org/10.3390/rs16132397