Rural Road Extraction in Xiong’an New Area of China Based on the RC-MSFNet Network Model
Abstract
1. Introduction
2. Study Area and Research Data
2.1. Overview of the Study Area
2.2. Data Source and Preprocessing
2.3. XARoads Dataset Preparation
2.4. DeepGlobe Dataset
3. Rural Road Extraction Method
3.1. RC-MSFNet Network Structure
3.1.1. Downsampling Block Structure
3.1.2. Bottleneck Block Structure
3.1.3. Upsampling Block Structure
3.2. Experimental Design
3.2.1. Parameter Settings
3.2.2. Comparative Experiment Design
3.2.3. Loss Function
3.2.4. Evaluation Indicators
4. Experiment Results and Discussion
4.1. RC-MSFNet Model Parameter Optimization
4.2. Accuracy of Rural Road Extraction Using Different Deep Learning Models on the XARoads Dataset
4.3. Rural Road Extraction Effect of Different Deep Learning Models on the XARoads Dataset
4.4. Accuracy of Rural Road Extraction Using Different Deep Learning Models on the DeepGlobe Dataset
4.5. Rural Road Extraction Effect of Different Deep Learning Models on the DeepGlobe Dataset
5. Conclusions
- (1) By optimizing the RC-MSFNet model parameters, all road extraction metrics reach their optimal values when the d parameters in the MSF module are set to 1, 2, 2, 1 and the r parameter in the CoA module is set to 32. This indicates that the RC-MSFNet model is feasible and effective for rural road extraction from high-resolution remote sensing images.
- (2) The RC-MSFNet model achieved P, F1, IOU, and COM values of 0.8350, 0.7896, 0.6523, and 0.7489 for rural road extraction in the Xiong’an New Area (XARoads dataset). Compared with U-Net, FCN, SegNet, DeeplabV3+, R-Net, and RC-Net, RC-MSFNet performs better at extracting narrow rural roads, muddy roads with unclear boundaries, and roads obscured by shadows, with fewer missed or incorrect extractions. It can also identify real roads that were left unlabeled in the ground truth.
- (3) On the DeepGlobe public dataset, RC-MSFNet achieved P, F1, IOU, and COM values of 0.8266, 0.7821, 0.6380, and 0.7422, further validating the model’s applicability and scalability. This shows that RC-MSFNet is effective both on this study’s XARoads dataset and on publicly available datasets such as DeepGlobe.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
Downsampling Block | Input | Output | Type | Kernel | Stride | Padding
---|---|---|---|---|---|---
Encoder1 | 1024 × 1024 × 3 | 1024 × 1024 × 64 | Double Conv + ResNet | 3 × 3 | 1 | 1
 | 1024 × 1024 × 64 | 512 × 512 × 64 | Max pooling | 2 × 2 | 2 | 0
Encoder2 | 512 × 512 × 64 | 512 × 512 × 128 | Double Conv + ResNet | 3 × 3 | 1 | 1
 | 512 × 512 × 128 | 256 × 256 × 128 | Max pooling | 2 × 2 | 2 | 0
Encoder3 | 256 × 256 × 128 | 256 × 256 × 256 | Double Conv + ResNet | 3 × 3 | 1 | 1
 | 256 × 256 × 256 | 128 × 128 × 256 | Max pooling | 2 × 2 | 2 | 0
Encoder4 | 128 × 128 × 256 | 128 × 128 × 512 | Double Conv + ResNet | 3 × 3 | 1 | 1
 | 128 × 128 × 512 | 64 × 64 × 512 | Max pooling | 2 × 2 | 2 | 0
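Each encoder stage keeps the spatial size through its 3 × 3 convolutions (stride 1, padding 1) and halves it with 2 × 2 max pooling, while the channel count doubles stage by stage. The table's shapes can be verified with the standard output-size formula (a quick sketch, not part of the model code):

```python
def conv_out(size: int, kernel: int, stride: int, padding: int, dilation: int = 1) -> int:
    """Standard output-size formula for convolution and pooling layers."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

# A 3x3 convolution with stride 1 and padding 1 preserves spatial size:
assert conv_out(1024, kernel=3, stride=1, padding=1) == 1024
# A 2x2 max pooling with stride 2 halves it:
assert conv_out(1024, kernel=2, stride=2, padding=0) == 512

# Trace the four encoder stages from the table:
size = 1024
for _ in range(4):
    size = conv_out(size, kernel=3, stride=1, padding=1)  # Double Conv (size unchanged)
    size = conv_out(size, kernel=2, stride=2, padding=0)  # Max pooling
print(size)  # 64, the spatial size fed to the bottleneck
```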
MSF Branch | Input | Output | Type | Dilation Rate | Kernel | Stride | Padding |
---|---|---|---|---|---|---|---|
Branch0 | 64 × 64 × 512 | 64 × 64 × 512 | Dilated Conv | 1 | 3 × 3 | 1 | 1 |
Branch1 | 64 × 64 × 512 | 64 × 64 × 512 | Dilated Conv | 2 | 3 × 3 | 1 | 2 |
Branch2 | 64 × 64 × 512 | 64 × 64 × 512 | Dilated Conv | 2 | 3 × 3 | 1 | 2 |
Branch3 | 64 × 64 × 512 | 64 × 64 × 512 | Dilated Conv | 1 | 3 × 3 | 1 | 1 |
Connect | 64 × 64 × 2048 | 64 × 64 × 1024 | Conv | - | 1 × 1 | 1 | 0 |
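In the MSF table, every dilated 3 × 3 branch keeps the 64 × 64 spatial size because its padding equals its dilation rate d: the effective kernel span of a dilated convolution is d(k − 1) + 1, so with kernel 3 and stride 1 the output size is input + 2p − 2d. A short check of this relation (an illustrative sketch):

```python
def dilated_conv_out(size: int, kernel: int = 3, stride: int = 1,
                     padding: int = 0, dilation: int = 1) -> int:
    """Output size of a dilated convolution; effective kernel span is d*(k-1)+1."""
    effective = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective) // stride + 1

# With padding equal to the dilation rate, each 3x3 branch preserves 64 x 64:
for d in (1, 2, 2, 1):  # the d parameters of the four MSF branches
    assert dilated_conv_out(64, padding=d, dilation=d) == 64
```

Concatenating the four 512-channel branch outputs yields the 64 × 64 × 2048 tensor in the Connect row, which the final 1 × 1 convolution compresses to 1024 channels for the decoder.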
Upsampling Block | Input | Output | Type | Kernel | Stride | Padding
---|---|---|---|---|---|---
Decoder1 | 64 × 64 × 1024 | 128 × 128 × 512 | Upsampling | 2 × 2 | 2 | 0
 | 128 × 128 × 512 | 128 × 128 × 1024 | Skip connection (Encoder4) | - | - | -
 | 128 × 128 × 1024 | 128 × 128 × 512 | Double Conv + CoA | 3 × 3 | 1 | 1
Decoder2 | 128 × 128 × 512 | 256 × 256 × 256 | Upsampling | 2 × 2 | 2 | 0
 | 256 × 256 × 256 | 256 × 256 × 512 | Skip connection (Encoder3) | - | - | -
 | 256 × 256 × 512 | 256 × 256 × 256 | Double Conv + CoA | 3 × 3 | 1 | 1
Decoder3 | 256 × 256 × 256 | 512 × 512 × 128 | Upsampling | 2 × 2 | 2 | 0
 | 512 × 512 × 128 | 512 × 512 × 256 | Skip connection (Encoder2) | - | - | -
 | 512 × 512 × 256 | 512 × 512 × 128 | Double Conv + CoA | 3 × 3 | 1 | 1
Decoder4 | 512 × 512 × 128 | 1024 × 1024 × 64 | Upsampling | 2 × 2 | 2 | 0
 | 1024 × 1024 × 64 | 1024 × 1024 × 128 | Skip connection (Encoder1) | - | - | -
 | 1024 × 1024 × 128 | 1024 × 1024 × 64 | Double Conv + CoA | 3 × 3 | 1 | 1
 | 1024 × 1024 × 64 | 1024 × 1024 × 1 | Conv | 1 × 1 | 1 | 0
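The channel bookkeeping in each decoder stage follows a fixed pattern: upsampling doubles the spatial size and halves the channels, the skip connection concatenates the matching encoder feature map (doubling the channels again), and the Double Conv + CoA block halves them. Tracing the shapes confirms the table (a sketch of the arithmetic only, not the model code):

```python
# Trace (H, W, C) through the four decoder stages of the table.
# Skip-connection channel counts, deepest encoder first (Encoder4 .. Encoder1):
skips = [512, 256, 128, 64]

h = w = 64
c = 1024  # bottleneck output channels
for skip_c in skips:
    h, w, c = h * 2, w * 2, c // 2  # upsampling: double size, halve channels
    c = c + skip_c                  # skip connection: concatenate encoder features
    c = c // 2                      # Double Conv + CoA: halve channels
print((h, w, c))  # (1024, 1024, 64), Decoder4's output before the final 1x1 conv
```

The final 1 × 1 convolution then maps the 64 channels to a single-channel road probability map at the input resolution.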
No. | Model | Description |
---|---|---|
1 | U-Net | Encoder–decoder architecture |
2 | FCN | Fully Convolutional Network |
3 | SegNet | Encoder–decoder architecture |
4 | DeeplabV3+ | Dilated convolution and encoder–decoder architecture |
5 | R-Net | U-Net with ResNet added to the downsampling |
6 | RC-Net | U-Net with ResNet added to the downsampling and CoA to the upsampling |
No. | MSF Parameters | CoA Parameters | P | F1 | IOU | COM
---|---|---|---|---|---|---
A | d = 1, 2, 4, 8 | r = 4 | 0.8464 | 0.7801 | 0.6395 | 0.7234
 | d = 1, 2, 4, 8 | r = 8 | 0.8462 | 0.7799 | 0.6392 | 0.7233
 | d = 1, 2, 4, 8 | r = 16 | 0.8247 | 0.7807 | 0.6403 | 0.7412
 | d = 1, 2, 4, 8 | r = 32 | 0.8193 | 0.7699 | 0.6258 | 0.7261
B | d = 1, 4, 8, 12 | r = 4 | 0.8212 | 0.7771 | 0.6354 | 0.7374
 | d = 1, 4, 8, 12 | r = 8 | 0.8336 | 0.7670 | 0.6220 | 0.7102
 | d = 1, 4, 8, 12 | r = 16 | 0.8311 | 0.7712 | 0.6276 | 0.7193
 | d = 1, 4, 8, 12 | r = 32 | 0.8119 | 0.7820 | 0.6421 | 0.7442
C | d = 1, 2, 2, 1 | r = 4 | 0.8263 | 0.7710 | 0.6273 | 0.7227
 | d = 1, 2, 2, 1 | r = 8 | 0.8458 | 0.7800 | 0.6393 | 0.7177
 | d = 1, 2, 2, 1 | r = 16 | 0.8214 | 0.7662 | 0.6210 | 0.7179
 | d = 1, 2, 2, 1 | r = 32 | 0.8350 | 0.7896 | 0.6523 | 0.7489
Model | P | F1 | IOU | COM |
---|---|---|---|---|
U-Net | 0.7970 | 0.7414 | 0.5890 | 0.6930 |
FCN | 0.7672 | 0.6833 | 0.5190 | 0.6160 |
SegNet | 0.7565 | 0.6820 | 0.5174 | 0.6209 |
DeeplabV3+ | 0.8136 | 0.6981 | 0.5362 | 0.6114 |
R-Net | 0.8292 | 0.7592 | 0.6119 | 0.7001 |
RC-Net | 0.8097 | 0.7645 | 0.6188 | 0.7241 |
RC-MSFNet | 0.8350 | 0.7896 | 0.6523 | 0.7489
Model | P | F1 | IOU | COM |
---|---|---|---|---|
U-Net | 0.7979 | 0.7658 | 0.6204 | 0.7361 |
FCN | 0.7829 | 0.6787 | 0.5137 | 0.6126 |
SegNet | 0.7663 | 0.6383 | 0.4688 | 0.5469 |
DeeplabV3+ | 0.7945 | 0.6912 | 0.5281 | 0.6116 |
R-Net | 0.8118 | 0.7663 | 0.6211 | 0.7256 |
RC-Net | 0.8186 | 0.7769 | 0.6352 | 0.7393 |
RC-MSFNet | 0.8266 | 0.7821 | 0.6380 | 0.7422
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Yang, N.; Di, W.; Wang, Q.; Liu, W.; Feng, T.; Tian, X. Rural Road Extraction in Xiong’an New Area of China Based on the RC-MSFNet Network Model. Sensors 2024, 24, 6672. https://doi.org/10.3390/s24206672