Distorted Aerial Images Semantic Segmentation Method for Software-Based Analog Image Receivers Using Deep Combined Learning
Abstract
1. Introduction
- We propose a combined deep learning model in which an approximating network is paired with a segmentation network. A programmable interconnection between the approximating block and the segmentation block makes it possible to change, train, or adjust the participating deep learning models according to the problem to be solved.
- A comprehensive loss function is proposed to train the combined model optimally.
- The proposed method allows the software-based image receiver and the aerial image segmentation model to run together on a small-scale computer such as a single-board computer (SBC).
- The developed segmentation model is compared with similar benchmark networks to demonstrate its robustness, verifying relative improvements of up to 80% in terms of mean IoU.
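The mean IoU metric behind the 80% relative-improvement claim is computed per class and then averaged. A minimal sketch of that computation (the toy label maps and class count are illustrative assumptions, not data from the paper):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Per-class intersection-over-union, averaged over classes
    that appear in either the prediction or the ground truth."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:            # class absent everywhere: skip it
            continue
        inter = np.logical_and(pred_c, target_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious))

# Toy 2x2 label maps with 2 classes.
pred = np.array([[0, 0], [1, 1]])
target = np.array([[0, 1], [1, 1]])
print(mean_iou(pred, target, num_classes=2))  # (1/2 + 2/3) / 2 = 0.5833...
```

A relative improvement is then `(miou_new - miou_old) / miou_old`; for example, 0.3870 → 0.6418 in the U-Net row below is a roughly 66% relative gain.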
2. Related Works
2.1. Classical Segmentation Methods
2.2. Deep-Learning-Based Semantic Segmentation
2.3. Multiple Model Training Methods
3. Methodology
- The approximator model and the segmentation model must remain compatible at their connection point: the expected output of the approximator is fed directly to the segmentation model.
- Each model is composed of modules, and the method follows a modular approach, so different approximators can be combined with different segmentation models according to the user’s preference.
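The modular interconnection described above can be sketched with plain placeholder classes. This is not the paper's implementation (the actual blocks are trained CNNs); it only illustrates the design constraint that the approximator's output shape must match the segmenter's expected input shape at the connection point:

```python
import numpy as np

class Approximator:
    """Stand-in for the approximating network: maps a distorted
    frame to an approximation of the clean frame, same shape."""
    def __call__(self, x):
        return np.clip(x, 0.0, 1.0)      # placeholder computation

class Segmenter:
    """Stand-in for a segmentation network: maps an (H, W, 3) image
    to per-pixel class scores of shape (H, W, num_classes)."""
    def __init__(self, num_classes):
        self.num_classes = num_classes
    def __call__(self, x):
        h, w, _ = x.shape
        return np.zeros((h, w, self.num_classes))  # placeholder scores

class CombinedModel:
    """Programmable interconnection: any approximator can be paired
    with any segmenter, provided the shapes agree at the joint."""
    def __init__(self, approximator, segmenter):
        self.approximator = approximator
        self.segmenter = segmenter
    def __call__(self, distorted):
        approx = self.approximator(distorted)      # connection point
        assert approx.shape == distorted.shape, "incompatible modules"
        return self.segmenter(approx)

model = CombinedModel(Approximator(), Segmenter(num_classes=4))
frame = np.random.rand(64, 64, 3)
print(model(frame).shape)  # (64, 64, 4)
```

Swapping in a different segmenter (e.g., one built for a different class count) only requires that the connection-point contract still holds.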
3.1. Dataset
3.2. Distorted Aerial Images
3.3. Approximator Model
3.4. Segmentation Model
3.5. Combined Loss
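The exact combined loss is defined in the full text; as a hedged sketch only, a loss of this kind is typically a weighted sum of a reconstruction term on the approximator output and a segmentation term on the final prediction. The weight `alpha` and the choice of MSE plus cross-entropy below are illustrative assumptions, not values taken from the paper:

```python
import numpy as np

def mse_loss(pred, target):
    """Reconstruction term on the approximator output."""
    return float(np.mean((pred - target) ** 2))

def cross_entropy_loss(probs, labels, eps=1e-12):
    """Segmentation term: probs is (N, C) softmax output,
    labels is (N,) integer class indices."""
    n = labels.shape[0]
    return float(-np.mean(np.log(probs[np.arange(n), labels] + eps)))

def combined_loss(approx_out, clean_img, seg_probs, seg_labels, alpha=0.5):
    """Weighted sum of both terms, so one optimizer step trains
    the approximator and the segmenter jointly."""
    return (alpha * mse_loss(approx_out, clean_img)
            + (1 - alpha) * cross_entropy_loss(seg_probs, seg_labels))
```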
4. Experiments
5. Results
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Gerard, F.; Petit, S.; Smith, G.; Thomson, A.; Brown, N.; Manchester, S.; Wadsworth, R.; Bugár, G.; Halada, L.; Bezák, P.; et al. Land cover change in Europe between 1950 and 2000 determined employing aerial photography. Prog. Phys. Geogr. Earth Environ. 2010, 34, 183–205. [Google Scholar] [CrossRef] [Green Version]
- Zhou, W.; Huang, G.; Cadenasso, M.L. Does spatial configuration matter? Understanding the effects of land cover pattern on land surface temperature in urban landscapes. Landsc. Urban Plan. 2011, 102, 54–63. [Google Scholar] [CrossRef]
- Ahmed, O.; Shemrock, A.; Chabot, D.; Dillon, C.; Wasson, R.; Franklin, S. Hierarchical land cover and vegetation classification using multispectral data acquired from an unmanned aerial vehicle. Int. J. Remote Sens. 2017, 38, 2037–2052. [Google Scholar] [CrossRef]
- Gupta, A.; Watson, S.; Yin, H. Deep learning-based aerial image segmentation with open data for disaster impact assessment. Neurocomputing 2021, 439, 22–33. [Google Scholar] [CrossRef]
- Kyrkou, C.; Timotheou, S.; Kolios, P.; Theocharides, T.; Panayiotou, C.G. Optimized vision-directed deployment of UAVs for rapid traffic monitoring. In Proceedings of the 2018 IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, NV, USA, 12–14 January 2018; pp. 1–6. [Google Scholar] [CrossRef]
- Petrides, P.; Kyrkou, C.; Kolios, P.; Theocharides, T.; Panayiotou, C. Towards a holistic performance evaluation framework for drone-based object detection. In Proceedings of the 2017 International Conference on Unmanned Aircraft Systems (ICUAS), Miami, FL, USA, 13–16 June 2017; pp. 1785–1793. [Google Scholar] [CrossRef] [Green Version]
- Zhu, X.X.; Tuia, D.; Mou, L.; Xia, G.-S.; Zhang, L.; Xu, F.; Fraundorfer, F. Deep Learning in Remote Sensing: A Comprehensive Review and List of Resources. IEEE Geosci. Remote Sens. Mag. 2017, 5, 8–36. [Google Scholar] [CrossRef] [Green Version]
- Bastani, F.; He, S.; Abbar, S.; Alizadeh, M.; Balakrishnan, H.; Chawla, S.; Madden, S.; DeWitt, D. RoadTracer: Automatic Extraction of Road Networks from Aerial Images. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4720–4728. [Google Scholar]
- Gupta, A.; Welburn, E.; Watson, S.; Yin, H. CNN-Based Semantic Change Detection in Satellite Imagery. In Proceedings of the Artificial Neural Networks and Machine Learning–ICANN 2019: Workshop and Special Sessions: 28th International Conference on Artificial Neural Networks, Munich, Germany, 17–19 September 2019; pp. 669–684. [Google Scholar] [CrossRef]
- Boguszewski, A.; Batorski, D.; Ziemba-Jankowska, N.; Dziedzic, T.; Zambrzycka, A. LandCover.ai: Dataset for Automatic Mapping of Buildings, Woodlands, Water and Roads from Aerial Imagery. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Nashville, TN, USA, 20–25 June 2021; pp. 1102–1110. [Google Scholar]
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Munich, Germany, 5–9 October 2015; pp. 234–241. [Google Scholar]
- Chaurasia, A.; Culurciello, E. LinkNet: Exploiting encoder representations for efficient semantic segmentation. In Proceedings of the 2017 IEEE Visual Communications and Image Processing (VCIP), St. Petersburg, FL, USA, 10–13 December 2017; pp. 1–4. [Google Scholar] [CrossRef] [Green Version]
- Sethi, G.; Saini, B.; Singh, D. Segmentation of cancerous regions in liver using an edge-based and phase congruent region enhancement method. Comput. Electr. Eng. 2016, 53, 244–262. [Google Scholar] [CrossRef]
- Wu, K.; Zhang, D. Robust tongue segmentation by fusing region-based and edge-based approaches. Expert Syst. Appl. 2015, 42, 8027–8038. [Google Scholar] [CrossRef]
- Priyanka, V.P.; Patil, N.C. Gray Scale Image Segmentation using OTSU Thresholding Optimal Approach. J. Res. 2016, 2, 20–24. [Google Scholar]
- Aja-Fernández, S.; Curiale, A.H.; Vegas-Sánchez-Ferrero, G. A local fuzzy thresholding methodology for multiregion image segmentation. Knowl.-Based Syst. 2015, 83, 1–12. [Google Scholar] [CrossRef]
- Zaitoun, N.M.; Aqel, M.J. Survey on Image Segmentation Techniques. Procedia Comput. Sci. 2015, 65, 797–806. [Google Scholar] [CrossRef] [Green Version]
- Niu, S.; Chen, Q.; de Sisternes, L.; Ji, Z.; Zhou, Z.; Rubin, D.L. Robust noise region-based active contour model via local similarity factor for image segmentation. Pattern Recognit. 2017, 61, 104–119. [Google Scholar] [CrossRef] [Green Version]
- Er, A.; Kaur, E.R. Review of Image Segmentation Technique. Int. J. 2017, 8, 36–39. [Google Scholar]
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
- Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. arXiv 2015, arXiv:1411.4038. [Google Scholar]
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90. [Google Scholar] [CrossRef] [Green Version]
- Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1–9. [Google Scholar]
- Noh, H.; Hong, S.; Han, B. Learning deconvolution network for semantic segmentation. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1520–1528. [Google Scholar]
- Badrinarayanan, V.; Kendall, A.; Cipolla, R. Segnet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495. [Google Scholar] [CrossRef]
- Ranzato, M.; Huang, F.; Boureau, Y.; LeCun, Y. Unsupervised learning of invariant feature hierarchies with applications to object recognition. In Proceedings of the 2007 IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, MN, USA, 17–22 June 2007; pp. 1–8. [Google Scholar]
- Ngiam, J.; Khosla, A.; Kim, M.; Nam, J.; Lee, H.; Ng, A. Multimodal deep learning. In Proceedings of the 28th International Conference on Machine Learning (ICML-11), Washington, DC, USA, 28 June–2 July 2011; pp. 689–696. [Google Scholar]
- Liu, W.; Rabinovich, A.; Berg, A.C. ParseNet: Looking Wider to See Better. arXiv 2016, arXiv:1506.04579. [Google Scholar]
- Lin, T.-Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125. [Google Scholar]
- Zhang, L.; Liu, J.; Shang, F.; Li, G.; Zhao, J.; Zhang, Y. Robust segmentation method for noisy images based on an unsupervised denoising filter. Tsinghua Sci. Technol. 2021, 26, 736–748. [Google Scholar] [CrossRef]
- Huang, S.; Zhang, H.; Pizurica, A. Subspace Clustering for Hyperspectral Images via Dictionary Learning With Adaptive Regularization. IEEE Trans. Geosci. Remote. Sens. 2021, 60, 1–17. [Google Scholar] [CrossRef]
- Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Networks. arXiv 2014, arXiv:1406.2661. [Google Scholar] [CrossRef]
- Ledig, C.; Theis, L.; Huszár, F.; Caballero, J.; Cunningham, A.; Acosta, A.; Aitken, A.P.; Tejani, A.; Totz, J.; Wang, Z.; et al. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Isola, P.; Zhu, J.-Y.; Zhou, T.; Efros, A.A. Image-to-Image Translation with Conditional Adversarial Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 1125–1134. [Google Scholar]
- Zhang, J.; He, F.; Duan, Y.; Yang, S. AIDEDNet: Anti-interference and detail enhancement dehazing network for real-world scenes. Front. Comput. Sci. 2022, 17, 172703. [Google Scholar] [CrossRef]
- Tang, W.; He, F.; Liu, Y.; Duan, Y. MATR: Multimodal Medical Image Fusion via Multiscale Adaptive Transformer. IEEE Trans. Image Process. 2022, 31, 5134–5149. [Google Scholar] [CrossRef] [PubMed]
- Chen, X.; Kuang, T.; Deng, H.; Fung, S.H.; Gateno, J.; Xia, J.J.; Yap, P.-T. Dual Adversarial Attention Mechanism for Unsupervised Domain Adaptive Medical Image Segmentation. IEEE Trans. Med. Imaging 2022, 41, 3445–3453. [Google Scholar] [CrossRef] [PubMed]
- Satellite Images of Dubai Dataset. Available online: https://www.kaggle.com/datasets/humansintheloop/semantic-segmentation-of-aerial-imagery (accessed on 5 March 2021).
- Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. In Proceedings of the 3rd International Conference for Learning Representations, San Diego, CA, USA, 7–9 May 2015. [Google Scholar] [CrossRef]
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. arXiv 2017, arXiv:1709.01507. [Google Scholar]
- Huang, G.; Liu, Z.; van der Maaten, L.; Weinberger, K.Q. Densely Connected Convolutional Networks. arXiv 2017, arXiv:1608.06993. [Google Scholar]
- Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv 2017, arXiv:1704.04861. [Google Scholar]
- Tan, M.; Le, Q.V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. arXiv 2019, arXiv:1905.11946. [Google Scholar]
| Inference Method | U-Net | LinkNet | FPN |
|---|---|---|---|
| Segmentation model only | 0.3870 (0.008) | 0.3970 (0.012) | 0.4615 (0.006) |
| Approximator trained separately | 0.4084 (0.004) | 0.4367 (0.005) | 0.4587 (0.007) |
| Proposed method | 0.6418 (0.026) | 0.6962 (0.034) | 0.7089 (0.027) |
| Segmentation Model | Backbone | Segmentation Model Only | Approximator Trained Separately | Proposed Method |
|---|---|---|---|---|
| U-Net | ResNet18 [34] | 0.7022 (0.015) | 0.6238 (0.002) | 0.7749 (0.0004) |
| U-Net | SE-ResNet18 [41] | 0.7109 (0.017) | 0.7229 (0.020) | 0.7873 (0.0011) |
| U-Net | DenseNet121 [42] | 0.6433 (0.004) | 0.5423 (0.001) | 0.7139 (0.0015) |
| U-Net | InceptionV3 [23] | 0.3541 (0.050) | 0.4881 (0.008) | 0.7065 (0.0022) |
| U-Net | MobileNetV2 [43] | 0.4024 (0.031) | 0.5283 (0.002) | 0.7330 (0.0004) |
| U-Net | EfficientNetB0 [44] | 0.4836 (0.009) | 0.5773 (2 × 10⁻⁶) | 0.7689 (0.0002) |
| LinkNet | ResNet18 [34] | 0.7133 (0.017) | 0.5866 (6 × 10⁻⁵) | 0.7706 (0.0002) |
| LinkNet | SE-ResNet18 [41] | 0.6759 (0.009) | 0.6426 (0.004) | 0.7752 (0.0004) |
| LinkNet | DenseNet121 [42] | 0.6946 (0.013) | 0.6109 (0.001) | 0.7569 (9 × 10⁻⁶) |
| LinkNet | InceptionV3 [23] | 0.4850 (0.008) | 0.5034 (0.005) | 0.7252 (0.0008) |
| LinkNet | MobileNetV2 [43] | 0.4009 (0.031) | 0.4426 (0.018) | 0.6289 (0.0156) |
| LinkNet | EfficientNetB0 [44] | 0.6847 (0.011) | 0.6132 (0.001) | 0.7586 (2 × 10⁻⁵) |
| FPN | InceptionV3 [23] | 0.7327 (0.023) | 0.7086 (0.016) | 0.9189 (0.0272) |
| FPN | MobileNetV2 [43] | 0.4293 (0.022) | 0.5128 (0.004) | 0.7357 (0.0003) |
| Segmentation Model | Backbone | Parameters in Combined Form (M) | Evaluation Time (ms/batch) |
|---|---|---|---|
| U-Net | ResNet18 | 14.75 | 288 |
| U-Net | SE-ResNet18 | 14.84 | 286 |
| U-Net | DenseNet121 | 12.55 | 306 |
| U-Net | InceptionV3 | 30.34 | 316 |
| U-Net | MobileNetV2 | 8.46 | 288 |
| U-Net | EfficientNetB0 | 10.52 | 283 |
| LinkNet | ResNet18 | 11.93 | 266 |
| LinkNet | SE-ResNet18 | 12.02 | 275 |
| LinkNet | DenseNet121 | 8.76 | 294 |
| LinkNet | InceptionV3 | 26.68 | 311 |
| LinkNet | MobileNetV2 | 4.55 | 290 |
| LinkNet | EfficientNetB0 | 6.50 | 305 |
| FPN | InceptionV3 | 25.44 | 367 |
| FPN | MobileNetV2 | 5.62 | 342 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
De Silva, K.D.M.; Lee, H.J. Distorted Aerial Images Semantic Segmentation Method for Software-Based Analog Image Receivers Using Deep Combined Learning. Appl. Sci. 2023, 13, 6816. https://doi.org/10.3390/app13116816