An Effective Method for Underwater Biological Multi-Target Detection Using Mask Region-Based Convolutional Neural Network
Abstract
1. Introduction
- (1) A commonly used image augmentation method and an improved CutMix algorithm were applied to expand the dataset and to mitigate the overfitting in deep learning training caused by the class imbalance that occurs in multi-target recognition.
- (2) A novel underwater image enhancement method based on simple weighted fusion was proposed to improve image quality in complex underwater environments.
- (3) The Mask-RCNN model was improved to reduce missed and false detections, thereby improving the identification accuracy of the model. The results revealed that the proposed model outperformed the compared models.
2. Related Works
2.1. Underwater Image Augmentation
2.2. Underwater Image Enhancement
2.3. Underwater Target Recognition
3. Proposed Method
3.1. Improved CutMix-Based Underwater Image Augmentation
- (1) Select starfish images with good clarity from the dataset and apply image segmentation techniques to extract the starfish, producing dataset X, in which the background is black and only the main body of each starfish is retained.
- (2) Select images containing fewer organisms to form dataset Y, so that overlap with other types of organisms is avoided or reduced when the number of starfish in the dataset is expanded.
- (3) Randomly select images from dataset Y without replacement and calculate the R, G, and B color components of the image background. According to the required number of additional starfish, randomly select several starfish images from dataset X and convert their black backgrounds to the calculated color components to achieve a better fusion effect.
- (4) Resize the selected starfish images and use the image fusion algorithm to fuse them into the images selected from dataset Y in step 3, thus completing the augmentation of the starfish samples.
- (5) Repeat steps 3 and 4 until no images remain in dataset Y (a minimal code sketch of one paste-and-fuse step is given below).
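The following is a minimal sketch, not the authors' code, of one paste-and-fuse step of the improved CutMix augmentation using OpenCV and NumPy; the function name, crop size, and fusion weight are illustrative assumptions.

```python
import random
import cv2
import numpy as np

def paste_starfish(background, starfish_crop, size=(64, 64), alpha=0.9):
    """Fuse one black-background starfish crop into a background image.

    `starfish_crop` is assumed to be a BGR image whose background pixels are
    exactly black, as produced by segmentation step (1). `alpha` is an
    illustrative fusion weight, not a value taken from the paper.
    """
    h, w = background.shape[:2]
    star = cv2.resize(starfish_crop, size, interpolation=cv2.INTER_AREA)

    # Step (3): estimate the mean colour of the target image and use it to
    # replace the black background of the crop for a smoother fusion.
    mean_bgr = background.reshape(-1, 3).mean(axis=0).astype(np.uint8)
    star = star.copy()
    star[np.all(star == 0, axis=2)] = mean_bgr

    # Step (4): place the resized crop at a random location and blend it in.
    y = random.randint(0, h - size[1])
    x = random.randint(0, w - size[0])
    out = background.copy()
    roi = out[y:y + size[1], x:x + size[0]]
    out[y:y + size[1], x:x + size[0]] = cv2.addWeighted(star, alpha, roi, 1 - alpha, 0)
    return out
```

Repeating this call for several crops per background image, over all images in dataset Y, corresponds to steps 3-5 above.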
3.2. Image Fusion-Based Underwater Image Enhancement
3.2.1. White Balance Algorithm
- (1) Calculate the sum of the R, G, and B values for each pixel and record the coordinates of the brightest point in the image.
- (2) Determine the threshold T from the top 10% (or another chosen ratio) of pixels ranked by the sum of their R, G, and B values.
- (3) Traverse every pixel in the image and compute the cumulative sum and average of the R, G, and B components over all pixels whose R + G + B exceeds the threshold T.
- (4) Calculate the gain coefficient of each channel from the value of the brightest point and the averages obtained in the previous step.
- (5) Scale each pixel by the gain coefficients and quantize the result to [0, 255] (a minimal code sketch of this procedure is given below).
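A minimal NumPy sketch of the white balance steps above is given below; it assumes an 8-bit BGR input and interprets the brightest-point value as the target per-channel level, which is one reasonable reading of step (4) rather than the authors' exact implementation.

```python
import numpy as np

def simple_white_balance(img_bgr, ratio=0.1):
    """White balance following steps (1)-(5) above (a sketch, not the authors' code).

    `img_bgr` is an 8-bit BGR image; `ratio` selects the brightest fraction
    of pixels used to estimate the illuminant (10% by default).
    """
    img = img_bgr.astype(np.float64)
    brightness = img.sum(axis=2)                 # step (1): R + G + B per pixel
    max_val = brightness.max()                   # value of the brightest point

    # Step (2): threshold T separating the brightest `ratio` of pixels.
    T = np.quantile(brightness, 1.0 - ratio)

    # Step (3): per-channel average over pixels brighter than T.
    bright_pixels = img[brightness >= T]         # shape (N, 3)
    channel_mean = bright_pixels.mean(axis=0)

    # Step (4): per-channel gain mapping the bright region toward the brightest value.
    gain = (max_val / 3.0) / channel_mean

    # Step (5): apply the gains and quantize back to [0, 255].
    return np.clip(img * gain, 0, 255).astype(np.uint8)
```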
3.2.2. Multiscale Retinex with Color Restoration
3.2.3. Dark Channel Prior Algorithm
3.2.4. Image Fusion-Based Underwater Image Enhancement
3.3. Underwater Biological Multi-Target Recognition Based on the Improved Mask-RCNN
3.3.1. Non-Maximum Suppression Algorithm
- (1) NMS Algorithm
- Step 1: Arrange all of the bounding boxes in set B in descending order of their confidence scores.
- Step 2: Calculate the intersection-over-union (IoU) between the bounding box M with the highest confidence score and each remaining bounding box bi. The threshold Nt is generally set manually. If IoU(M, bi) exceeds the hard threshold Nt, the confidence score of bi is set to zero.
- Step 3: Move the proposal m, with bounding box M, into the set F, which is initialized as an empty set.
- Step 4: Repeat the above three steps for the remaining bounding boxes in B until all boxes have been traversed.
- (2) Soft-NMS
Algorithm 1: Soft-NMS
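The following is a minimal NumPy sketch of the Gaussian Soft-NMS formulation of Bodla et al., given here for illustration rather than as the authors' implementation; the sigma and score-threshold values are illustrative defaults, not settings from the paper.

```python
import numpy as np

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay overlapping scores instead of zeroing them.

    `boxes` is an (N, 4) array of [x1, y1, x2, y2]; `scores` is an (N,) array.
    Returns the indices of the boxes that are kept, in selection order.
    """
    boxes = boxes.astype(np.float64)
    scores = scores.astype(np.float64).copy()
    idxs = np.arange(len(scores))
    keep = []

    while len(idxs) > 0:
        # Select the remaining box with the highest (possibly decayed) score.
        top = idxs[np.argmax(scores[idxs])]
        keep.append(top)
        idxs = idxs[idxs != top]
        if len(idxs) == 0:
            break

        # IoU between the selected box M and every remaining box bi.
        x1 = np.maximum(boxes[top, 0], boxes[idxs, 0])
        y1 = np.maximum(boxes[top, 1], boxes[idxs, 1])
        x2 = np.minimum(boxes[top, 2], boxes[idxs, 2])
        y2 = np.minimum(boxes[top, 3], boxes[idxs, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_top = (boxes[top, 2] - boxes[top, 0]) * (boxes[top, 3] - boxes[top, 1])
        area_rest = (boxes[idxs, 2] - boxes[idxs, 0]) * (boxes[idxs, 3] - boxes[idxs, 1])
        iou = inter / (area_top + area_rest - inter)

        # Gaussian decay instead of the hard threshold Nt, then drop tiny scores.
        scores[idxs] *= np.exp(-(iou ** 2) / sigma)
        idxs = idxs[scores[idxs] > score_thresh]

    return keep
```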
3.3.2. Improved Feature Pyramid Network
- (1) FPN
- (i). Nearest-neighbor interpolation is adopted for upsampling, so high-level semantic information may not be propagated effectively.
- (ii). There is a lack of effective interaction among receptive fields of different sizes.
- (iii). The FPN uses the outputs of only four backbone stages, which may be insufficient to provide rich multi-scale information.
- (2) AC-FPN
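To make the limitations listed for FPN concrete, the following is a minimal PyTorch sketch of a standard FPN top-down pathway with nearest-neighbor upsampling (limitation (i)); it is not the AC-FPN used in this work, and the channel counts are the typical ResNet stage values, assumed here for illustration.

```python
import torch.nn as nn
import torch.nn.functional as F

class SimpleFPN(nn.Module):
    """A minimal standard FPN top-down pathway (a sketch, not AC-FPN).

    `in_channels` lists the channel counts of the four backbone stage
    outputs C2-C5; every pyramid level is reduced to `out_channels`.
    """
    def __init__(self, in_channels=(256, 512, 1024, 2048), out_channels=256):
        super().__init__()
        self.lateral = nn.ModuleList(nn.Conv2d(c, out_channels, 1) for c in in_channels)
        self.smooth = nn.ModuleList(nn.Conv2d(out_channels, out_channels, 3, padding=1)
                                    for _ in in_channels)

    def forward(self, feats):
        # feats = [C2, C3, C4, C5], ordered from high to low spatial resolution.
        laterals = [lat(f) for lat, f in zip(self.lateral, feats)]
        # Top-down pathway: upsample the coarser map with nearest-neighbor
        # interpolation (limitation (i) above) and add it to the lateral map.
        for i in range(len(laterals) - 2, -1, -1):
            laterals[i] = laterals[i] + F.interpolate(
                laterals[i + 1], size=laterals[i].shape[-2:], mode="nearest")
        # 3x3 convolutions reduce the aliasing introduced by upsampling.
        return [smooth(p) for smooth, p in zip(self.smooth, laterals)]
```

AC-FPN augments this pathway with attention-guided context modules to address limitations (ii) and (iii).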
4. Experimental Results and Analysis
4.1. Development of the Underwater Dataset
4.2. Selection of Underwater Dataset
4.3. Parameter Configuration and Evaluation Criteria
4.4. Underwater Image Augmentation Results
4.5. Underwater Image Enhancement Results
4.6. Underwater Biological Multi-Target Recognition Results
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
| Category | Sea Urchins | Sea Cucumbers | Starfish |
|---|---|---|---|
| Initial number of creatures | 644 | 543 | 375 |
| Number of creatures after augmentation | 1367 | 1326 | 1298 |

| Image Augmentation | Sea Urchins (AP) | Sea Cucumbers (AP) | Starfish (AP) | mAP |
|---|---|---|---|---|
| No | 0.759 | 0.733 | 0.691 | 0.728 |
| Yes | 0.804 | 0.794 | 0.791 | 0.796 |

| Image Enhancement | Recall | Precision | mAP |
|---|---|---|---|
| No | 0.809 | 0.791 | 0.796 |
| Yes | 0.817 | 0.823 | 0.812 |

| Model | mAP | FPS |
|---|---|---|
| YOLOv5 | 0.661 | 26.72 |
| Mask-RCNN | 0.796 | 5.35 |
| Proposed | 0.828 | 4.97 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).