A Ship Detection Method in Infrared Remote Sensing Images Based on Image Generation and Causal Inference
Abstract
1. Introduction
- (1)
- Replacing the mapping network of the original StyleGAN2. To reduce the computational complexity and parameter count of the original StyleGAN2 while preserving the original image information in the generated latent space, this paper replaces the mapping network of the original StyleGAN2 with a Variational Auto-Encoder (VAE). The latent space produced by the VAE retains the original image information, which benefits the generation of infrared remote sensing ship images;
- (2)
- Introducing a self-attention (SA) mechanism into the generator. Beyond replacing the mapping network, the SA mechanism addresses the loss of low-level detail features and the reduction of useful information during the convolution process, which otherwise causes generated infrared remote sensing ship images to lack edges, textures, and other features. Furthermore, the noise input of the generator is reduced from two noise injections per feature block to one, effectively preventing excessive noise from degrading the quality of generated images;
- (3)
- Conducting comparative experiments on an infrared remote sensing ship image dataset. The effectiveness of the proposed method is verified through ablation experiments and comparisons with mainstream image generation methods. Subjective and objective evaluation metrics show that the proposed method performs well, which helps to address the shortage of ship samples in infrared remote sensing images. The Fréchet Inception Distance (FID) is added to the common evaluation metrics of image generation methods to further assess the diversity of generated images and the similarity between the distributions of generated and real images;
- (4)
- Addressing the complex networks, large parameter counts, and poor interpretability of deep learning-based ship detection. This paper designs a lightweight multi-class ship detection method for infrared remote sensing images [15] that can detect multi-class ship objects in complex scenes, and proposes an interpretability method for ship detection based on causal reasoning. Singular value decomposition (SVD) and a Transformer are combined to reduce the dimensionality of the data and classify it, and the detection results are further explained to improve the interpretability of the model.
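Contribution (1) relies on the standard VAE mechanism: the encoder outputs a mean and a log-variance, a latent code is sampled via the reparameterization trick, and a KL term keeps the latent distribution close to a standard normal. The NumPy sketch below illustrates that mechanism under generic assumptions (a 512-dimensional latent; function names are illustrative, not from the paper's code):

```python
import numpy as np

def reparameterize(mu, log_var, rng=np.random.default_rng(0)):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1) (the reparameterization trick)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL divergence between N(mu, sigma^2) and N(0, 1), summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var))

# Toy m-dimensional encoder output: mean 0, log-variance 0 (i.e., sigma = 1)
mu = np.zeros(512)
log_var = np.zeros(512)
z = reparameterize(mu, log_var)   # latent code fed to the synthesis network
print(z.shape)                    # (512,)
print(kl_to_standard_normal(mu, log_var))  # 0.0: already a standard normal
```

Because the KL term is zero exactly when the encoder outputs a standard normal, minimizing it pulls the latent space toward N(0, 1), which is what step ③ of the encoding procedure in Section 3.2 describes.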
2. Related Works
2.1. Traditional Image Generation Methods
2.2. Deep Learning-Based Image Generation Methods
2.3. Infrared Remote Sensing Image Ship Detection Methods Based on Deep Learning
2.4. Causal Inference Methods
3. An Image Generation Method Based on Improved StyleGAN2
3.1. Improved StyleGAN2 Network Structure
3.1.1. Mapping Network Based on the Original Image Information Encoding
3.1.2. A Generator Network Introducing SA Mechanisms
3.2. An Infrared Ship Image Generation Method Based on Improved StyleGAN2 Network
- (1)
- The input infrared remote sensing image is preprocessed: the image data are converted into high-dimensional vector data, and the real infrared remote sensing ship image is encoded. Through the mapping network based on the original image information encoding, namely the VAE, the real infrared remote sensing ship image is mapped to the latent space, its feature information is extracted, and real-sample hidden variables close to a normal distribution are output. The encoding process is detailed below.
  - ① The sample image X is passed through the encoder, which outputs two m-dimensional vectors;
  - ② Assuming that a latent normal distribution can generate the input image, ε is sampled from the standard normal distribution N(0, 1), and the latent hidden variable is then obtained by Equation (2);
  - ③ KL divergence is used to measure the similarity between the latent-space distribution and the normal distribution, so that the distribution of the generated latent space is as close as possible to the normal distribution;
- (2)
- The generated hidden variable Z is broadcast as the input of the AdaIN modules in the Synthesis Network for training. The Synthesis Network consists of multiple Synthesis Blocks. The input of each Synthesis Block is composed of noise, the output of the previous style block (the input of the first style block is a 512-dimensional constant), and a style transformation. Each Synthesis Block includes up-sampling, convolution, and AdaIN operations. The original image X and the hidden variable Z are used as the input of the Synthesis Network. After the convolution operations at resolutions of 64×64 and 1024×1024, SA is added to generate the image;
- (3)
- The infrared remote sensing ship image generated by the generator and the real infrared ship image are simultaneously input into the discriminator for judgment, and the results are fed back to the generator;
- (4)
- The discriminator cost function is updated and optimized. Similarly, the generator cost function is also optimized;
- (5)
- Repeat the above steps to complete all iterations of model training. The number of iterations is set to 1600, and the generated image is output.
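Step (2) above inserts self-attention (SA) into the synthesis network at the 64×64 and 1024×1024 stages. The sketch below illustrates a SAGAN-style self-attention block of the kind typically used for this purpose: queries and keys are projected to a reduced channel dimension, an HW×HW attention map reweights the values, and a learned scale γ (initialized to 0) gates a residual connection. The weight matrices here are random stand-ins, not the paper's trained parameters:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv, gamma=0.0):
    """SAGAN-style self-attention on a C x H x W feature map.
    wq, wk: (C, C//8) projections; wv: (C, C); gamma is a learned scale (0 at init)."""
    c, h, w = x.shape
    flat = x.reshape(c, h * w).T            # (HW, C): one row per spatial position
    q, k, v = flat @ wq, flat @ wk, flat @ wv
    attn = softmax(q @ k.T, axis=-1)        # (HW, HW) attention map over positions
    out = (attn @ v).T.reshape(c, h, w)     # attended features, back to C x H x W
    return gamma * out + x                  # residual connection

rng = np.random.default_rng(0)
c, h, w = 16, 8, 8                          # toy feature-map size
x = rng.standard_normal((c, h, w))
wq = rng.standard_normal((c, c // 8))
wk = rng.standard_normal((c, c // 8))
wv = rng.standard_normal((c, c))
y = self_attention(x, wq, wk, wv)
print(y.shape)  # (16, 8, 8); with gamma=0 the block is an identity at initialization
```

Starting with γ = 0 lets the network first rely on local convolutional features and gradually learn how much global attention to mix in, which is the standard rationale for this design.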
4. A Multi-Class Ship Detection Method for Infrared Remote Sensing Images
4.1. A Lightweight Multi-Class Ship Detection Method in Infrared Remote Sensing Images
4.2. A Ship Detection Interpretability Method Based on Causal Reasoning
- (1)
- The singular value curve of the image is drawn to determine the distribution range of the information in the image;
- (2)
- The appropriate number of singular values k is chosen by reconstructing the image from candidate truncations and comparing the results;
- (3)
- The image after singular value decomposition is combined with Transformer.
- (4)
- Train the model and save weight parameters.
- (5)
- Load the weight parameters and model, input an image, and classify and recognize the input image.
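Steps (1) and (2) above amount to drawing the singular value curve and keeping only the k dominant singular values (a rank-k approximation) before the data are passed to the Transformer. A minimal NumPy illustration of that reconstruction follows; the image and the choice of k are placeholders, not values from the paper:

```python
import numpy as np

def singular_value_curve(img):
    """Normalized singular values; the curve shows where the image energy concentrates."""
    s = np.linalg.svd(img, compute_uv=False)
    return s / s.sum()

def svd_reconstruct(img, k):
    """Reconstruct an image from its k largest singular values (rank-k approximation)."""
    u, s, vt = np.linalg.svd(img, full_matrices=False)
    return u[:, :k] @ np.diag(s[:k]) @ vt[:k, :]

rng = np.random.default_rng(0)
img = rng.standard_normal((64, 64))         # stand-in for a grayscale infrared image
curve = singular_value_curve(img)           # inspect to pick a suitable k
approx = svd_reconstruct(img, k=10)
err_10 = np.linalg.norm(img - svd_reconstruct(img, 10))
err_30 = np.linalg.norm(img - svd_reconstruct(img, 30))
print(approx.shape, err_30 < err_10)        # larger k gives a smaller reconstruction error
```

In practice k is chosen at the knee of the singular value curve, where most of the image information is retained while the dimensionality fed to the Transformer is substantially reduced.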
5. Experiments and Analysis
5.1. Environment Configuration and Datasets
5.2. Parameters Setting and Evaluation Metrics
- (1)
- Parameters
- (2)
- Evaluation metrics
5.3. Experiment Results
5.3.1. Comparison Results of Image Generation Ablation Experiments
5.3.2. The Experiment Comparison Results and Analysis of the Proposed Method and Mainstream Methods
5.3.3. Experiment Comparison Results and Analysis of Lightweight Multi-Class Ship Detection Method in Infrared Remote Sensing Images
5.3.4. Experiment Results and Analysis of the Ship Detection Interpretability Method Based on Causal Reasoning
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Chang, L.; Chen, Y.T.; Wang, J.H.; Chang, Y.L. Modified Yolov3 for ship detection with visible and infrared images. Electronics 2022, 11, 739. [Google Scholar] [CrossRef]
- Huang, Y.; Liu, R.W.; Liu, J. A two-step image stabilization method for promoting visual quality in vision-enabled maritime surveillance systems. IET Intell. Transp. Syst. 2023, 17, 435–449. [Google Scholar] [CrossRef]
- Zhang, Z.; Gao, Q.; Liu, L.; He, Y. A high-quality rice leaf disease image data augmentation method based on a dual GAN. IEEE Access 2023, 11, 21176–21191. [Google Scholar] [CrossRef]
- Yang, W.J.; Chen, B.X.; Yang, J.F. CTDP: Depacking with guided depth upsampling networks for realization of multiview 3D video. In Proceedings of the Future of Information and Communication Conference, San Francisco, CA, USA, 2–3 March 2023; pp. 136–152. [Google Scholar]
- Tan, M.K.; Xu, S.K.; Zhang, S.H.; Chen, Q. A review on deep adversarial visual generation. J. Image Graph. 2021, 26, 2751–2766. [Google Scholar]
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. Commun. ACM 2020, 63, 139–144. [Google Scholar] [CrossRef]
- Abu-Srhan, A.; Abushariah, M.A.M.; Al-Kadi, O.S. The effect of loss function on conditional generative adversarial networks. J. King Saud Univ. Comput. Inf. Sci. 2022, 34, 6977–6988. [Google Scholar] [CrossRef]
- Gao, H.; Zhang, Y.; Lv, W.; Yin, J.; Qasim, T.; Wang, D. A deep convolutional generative adversarial networks-based method for defect detection in small sample industrial parts images. Appl. Sci. 2022, 12, 6569. [Google Scholar] [CrossRef]
- Phan, H.; Nguyen, H.L.; Chen, O.Y.; Koch, P.; Duong, N.Q.K.; McLoughlin, I.; Mertins, A. Self-attention generative adversarial network for speech enhancement. In Proceedings of the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; pp. 7103–7107. [Google Scholar]
- Chan, E.R.; Lin, C.Z.; Chan, M.A.; Nagano, K.; Pan, B.; Mello, S.; Gallo, O.; Guibas, L.; Tremblay, J.; Khamis, S.; et al. Efficient geometry-aware 3D generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 16102–16112. [Google Scholar]
- Brophy, E.; Wang, Z.; She, Q.; Ward, T. Generative adversarial networks in time series: A systematic literature review. ACM Comput. Surv. 2023, 55, 1–31. [Google Scholar] [CrossRef]
- Han, Y.; Liao, J.; Lu, T.; Pu, T.; Peng, Z. KCPNet: Knowledge-driven context perception networks for ship detection in infrared imagery. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5000219. [Google Scholar] [CrossRef]
- Kawai, N.; Koike, H. Facial mask completion using StyleGAN2 preserving features of the person. IEICE Trans. Inf. Syst. 2023, 106, 1627–1637. [Google Scholar] [CrossRef]
- Li, L.; Yu, J.; Chen, F. TISD: A three bands thermal infrared dataset for all day ship detection in spaceborne imagery. Remote Sens. 2022, 14, 5297. [Google Scholar] [CrossRef]
- Zhang, Y.M.; Li, R.Q. A lightweight multi-target detection method for infrared remote sensing image ships. J. Netw. Intell. 2023, 8, 535–545. [Google Scholar]
- Turk, M.A.; Pentland, A.P. Face recognition using eigenfaces. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Maui, HI, USA, 3–6 June 1991; pp. 586–591. [Google Scholar]
- Comon, P. Independent component analysis, a new concept? Signal Process. 1994, 36, 287–314. [Google Scholar] [CrossRef]
- Permuter, H.; Francos, J.; Jermyn, I.H. Gaussian mixture models of texture and colour for image database retrieval. In Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, Hong Kong, China, 21 May 2003; pp. 569–573. [Google Scholar]
- Rabiner, L.; Juang, B. An introduction to hidden Markov models. IEEE ASSP Mag. 1986, 3, 4–16. [Google Scholar] [CrossRef]
- Cross, G.R.; Jain, A.K. Markov random field texture models. IEEE Trans. Pattern Anal. Mach. Intell. 1983, 5, 25–39. [Google Scholar] [CrossRef] [PubMed]
- Hinton, G.E.; Salakhutdinov, R.R. Reducing the dimensionality of data with neural networks. Science 2006, 313, 504–507. [Google Scholar] [CrossRef] [PubMed]
- Fang, Z.; Fu, Y.; Liu, L.X. A dual of transformer features-related map-intelligent generation method. J. Image Graph. 2023, 28, 3281–3294. [Google Scholar]
- Huang, S.Y.; Wu, W.; Yang, Y.; Li, H.X.; Wang, B. A low-exposure image enhancement based on progressive dual network model. Chin. J. Comput. 2021, 44, 384–394. [Google Scholar]
- Wang, Y.H.; He, Y.; Wang, Z. Overview of text-to-image generation methods based on deep learning. Comput. Eng. Appl. 2022, 58, 50–67. [Google Scholar]
- Nishio, M. Machine learning/deep learning in medical image processing. Appl. Sci. 2021, 11, 11483. [Google Scholar] [CrossRef]
- Huang, M.; Mao, Z.; Chen, Z.; Zhang, Y. Towards accurate image coding: Improved autoregressive image generation with dynamic vector quantization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 22596–22605. [Google Scholar]
- Mak, H.W.L.; Han, R.; Yin, H.H.F. Application of Variational AutoEncoder (VAE) model and image processing approaches in game design. Sensors 2023, 23, 3457. [Google Scholar] [CrossRef]
- Zhou, T.; Li, Q.; Lu, H.; Cheng, Q.; Zhang, X. GAN review: Models and medical image fusion applications. Inf. Fusion 2023, 91, 134–148. [Google Scholar] [CrossRef]
- Ho, J.; Saharia, C.; Chan, W.; Fleet, D.J.; Norouzi, M.; Salimans, T. Cascaded diffusion models for high fidelity image generation. J. Mach. Learn. Res. 2022, 23, 1–33. [Google Scholar]
- Wang, S.M. Research on Intelligent Detection Technology of Optical Fiber End Face Based on Feature Fusion. Guangdong University of Technology, Guangzhou, China, 2020.
- Li, L.; Jiang, L.; Zhang, J.; Wang, S.; Chen, F. A complete YOLO-based ship detection method for thermal infrared remote sensing images under complex backgrounds. Remote Sens. 2022, 14, 1534. [Google Scholar] [CrossRef]
- Miao, C.K.; Lou, S.L.; Gong, W.F. Infrared ship target detection algorithm based on improved CenterNet. Laser Infrared 2022, 52, 1717–1722. [Google Scholar]
- Karras, T.; Aittala, M.; Laine, S.; Harkonen, E.; Hellsten, J.; Lehtinen, J.; Aila, T. Alias-free generative adversarial networks. Adv. Neural Inf. Process. Syst. 2021, 34, 852–863. [Google Scholar]
- Karras, T.; Laine, S.; Aittala, M.; Hellsten, J.; Lehtinen, J.; Aila, T. Analyzing and improving the image quality of StyleGAN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 8110–8119. [Google Scholar]
- Li, J.N.; Xiong, R.B.; Lan, Y.Y.; Pang, L.; Guo, J.F.; Cheng, X.Q. Overview of the frontier progress of causal machine learning. J. Comput. Res. Dev. 2023, 60, 59–84. [Google Scholar]
- Cui, P.; Athey, S. Stable learning establishes some common ground between causal inference and machine learning. Nat. Mach. Intell. 2022, 4, 110–115. [Google Scholar] [CrossRef]
- Shao, F.; Luo, Y.; Zhang, L.; Ye, L.; Tang, S.; Yang, Y.; Xiao, J. Improving weakly supervised object localization via causal intervention. In Proceedings of the 29th ACM International Conference on Multimedia, New York, NY, USA, 20–24 October 2021; pp. 3321–3329. [Google Scholar]
- Gao, G.; Li, X.; Du, Z. Custom attribute image generation based on improved StyleGAN2. In Proceedings of the 2023 15th International Conference on Machine Learning and Computing, New York, NY, USA, 17–20 February 2023; pp. 335–340. [Google Scholar]
- Sundar, S.; Sumathy, S. An effective deep learning model for grading abnormalities in retinal fundus images using Variational Auto-Encoders. Int. J. Imaging Syst. Technol. 2023, 33, 92–107. [Google Scholar] [CrossRef]
- Li, Y.Z.; Wang, Y.; Huang, Y.H.; Xiang, P.; Liu, W.-X.; Lai, Q.Q.; Gao, Y.Y.; Xu, M.S.; Guo, Y.F. RSU-Net: U-net based on residual and self-attention mechanism in the segmentation of cardiac magnetic resonance images. Comput. Methods Programs Biomed. 2023, 231, 107437. [Google Scholar] [CrossRef]
- Mi, Z.; Jiang, X.; Sun, T.; Xu, K. GAN-generated image detection with self-attention mechanism against gan generator defect. IEEE J. Sel. Top. Signal Process. 2020, 14, 969–981. [Google Scholar] [CrossRef]
- Ma, N.; Zhang, X.; Zheng, H.T.; Sun, J. ShuffleNet V2: Practical guidelines for efficient CNN architecture design. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 122–138. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 6000–6010. [Google Scholar]
- Yu, Y.; Buchanan, S.; Pai, D.; Chu, T.; Wu, Z.; Tong, S.; Haeffele, B.D.; Ma, Y. White-Box Transformers via Sparse Rate Reduction. arXiv 2023, arXiv:2306.01129. [Google Scholar]
- Wang, H.; Li, Y.; Ding, S.; Pan, X.; Gao, Z.; Wan, S.; Feng, J. Adaptive denoising for magnetic resonance image based on nonlocal structural similarity and lowrank sparse representation. Clust. Comput. 2023, 26, 2933–2946. [Google Scholar] [CrossRef]
- Dziembowski, A.; Mieloch, D.; Stankowski, J.; Grzelka, A. IV-PSNR: The objective quality metric for immersive video applications. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 7575–7591. [Google Scholar] [CrossRef]
- Lee, J.; Lee, M. FIDGAN: A generative adversarial network with an inception distance. In Proceedings of the 2023 International Conference on Artificial Intelligence in Information and Communication, Bali, Indonesia, 20–23 February 2023; pp. 397–400. [Google Scholar]
| Class | Liner | Bulk Carrier | Warship | Sailboat | Canoe | Container Ship | Fishing Boat |
|---|---|---|---|---|---|---|---|
| Amount | 1099 | 4103 | 1761 | 3675 | 1381 | 489 | 6514 |
| Methods | SSIM | PSNR | FID |
|---|---|---|---|
| StyleGAN2 | 0.82 | 22.86 | 55.60 |
| VAE + StyleGAN2 | 0.84 | 23.52 | 54.39 |
| StyleGAN2 + SA | 0.83 | 23.34 | 47.66 |
| Proposed method | 0.86 | 24.84 | 40.49 |
| Methods | SSIM (Single-Objective) | SSIM (Multi-Objective) | SSIM (Complex Scenarios) | PSNR (Single-Objective) | PSNR (Multi-Objective) | PSNR (Complex Scenarios) |
|---|---|---|---|---|---|---|
| StyleGAN2 | 0.73 | 0.63 | 0.23 | 25.37 | 18.58 | 14.80 |
| VAE + StyleGAN2 | 0.80 | 0.77 | 0.28 | 29.71 | 26.97 | 15.06 |
| StyleGAN2 + SA | 0.75 | 0.67 | 0.25 | 27.62 | 19.89 | 15.33 |
| Proposed method | 0.85 | 0.81 | 0.31 | 28.11 | 27.87 | 20.85 |
| Methods | SSIM | PSNR | FID |
|---|---|---|---|
| ProGAN | 0.76 | 19.51 | 64.40 |
| StyleGAN | 0.78 | 22.55 | 70.19 |
| StyleGAN2 | 0.82 | 22.86 | 55.60 |
| Proposed method | 0.86 | 24.84 | 40.49 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhang, Y.; Li, R.; Du, Z.; Ye, Q. A Ship Detection Method in Infrared Remote Sensing Images Based on Image Generation and Causal Inference. Electronics 2024, 13, 1293. https://doi.org/10.3390/electronics13071293