Efficient Image Segmentation of Coal Blocks Using an Improved DIRU-Net Model
Abstract
1. Introduction
- (1)
- Due to the current lack of a public coal block image dataset, we recorded videos of coal blocks on a conveyor belt in a coal mine and manually annotated the frames with LabelMe (version 5.2) to obtain a high-quality coal block dataset. Because coal block contours are complex and manual annotation is time-consuming, a data augmentation method was used to expand the limited dataset, saving a significant amount of annotation time. Since the RGB color information of the original image does not affect the result of the coal-block/background segmentation task, each image was first converted to grayscale, which reduces the image data volume by two-thirds and helps accelerate model training. Then, Contrast Limited Adaptive Histogram Equalization (CLAHE), bilateral filtering, and gamma correction were applied to enhance the contrast between coal blocks and the background and to adjust image brightness.
- (2)
- Building on the U-shaped deep learning segmentation architecture, this paper introduces dilated convolutions and inverted residual structures into the U-Net network and proposes a new coal block image segmentation model: DIRU-Net. The dilated convolutional layers of this model capture a wider receptive field, while the depthwise separable convolutional layers reduce the parameter count by assigning one convolution kernel to each channel, achieving rapid and accurate segmentation of coal block images.
- (3)
- To demonstrate the effectiveness of the proposed model, comparative experiments were conducted against classic image segmentation models. The results show that the algorithm proposed in this paper achieves more accurate segmentation than the other models while using the fewest parameters.
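As a minimal sketch of the preprocessing pipeline in contribution (1), the grayscale conversion and gamma correction can be written in plain NumPy; in practice, OpenCV's `cv2.createCLAHE` and `cv2.bilateralFilter` would supply the CLAHE and bilateral-filtering steps. The luma weights and `gamma` value below are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def to_grayscale(rgb):
    """Collapse an H x W x 3 RGB image to a single luminance channel,
    cutting the pixel data volume to one third."""
    weights = np.array([0.299, 0.587, 0.114])  # ITU-R BT.601 luma weights
    return rgb @ weights

def gamma_correct(gray, gamma=0.8):
    """Gamma < 1 brightens dark regions; gamma > 1 darkens them."""
    normalized = np.clip(gray, 0.0, 255.0) / 255.0
    return (normalized ** gamma) * 255.0

rng = np.random.default_rng(42)
rgb = rng.integers(0, 256, size=(128, 128, 3)).astype(np.float64)
gray = to_grayscale(rgb)       # shape (128, 128), one third of the RGB data
enhanced = gamma_correct(gray)  # values stay within [0, 255]
```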
2. Materials and Methods
2.1. Dataset Production
2.2. Image Preprocessing
3. DIRU-Net Framework
3.1. DIRU-Net Structure
- (1)
- The encoder-decoder design of U-Net is widely used in the field of segmentation and achieves high accuracy. Therefore, we adopt this structure as the backbone of our model.
- (2)
- Network depth is important for learning features with stronger representational power, but as layers grow deeper, the back-propagated gradient becomes unstable under repeated multiplication and can grow very large or very small, causing gradient explosion or vanishing. Although batch normalization and ReLU mitigate this problem, network performance still degrades as depth increases. Kaiming He and his team therefore proposed residual connections to address it. We adopt the same idea in our network to avoid performance degradation; the skip connections also pass feature information between shallower and deeper layers and improve efficiency.
- (3)
- The widespread application of deep learning networks across many fields has driven the development of compact models with smaller sizes and faster speeds. To lighten our model and accelerate prediction, we combine dilated convolution with inverted residuals and propose the DIR block, which reduces computational cost and speeds up inference.
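The receptive-field gain that motivates the dilated convolutions in the DIR block can be checked with a little arithmetic. The sketch below assumes stride-1 layers; the kernel sizes and dilation rates are illustrative, not the network's exact configuration.

```python
def effective_kernel(k, d):
    """A k x k kernel with dilation d keeps k taps per side but spaces
    them d pixels apart, so it spans k + (k - 1) * (d - 1) pixels."""
    return k + (k - 1) * (d - 1)

def receptive_field(layers):
    """Receptive field of a stack of stride-1 convolutions,
    each given as a (kernel_size, dilation) pair."""
    rf = 1
    for k, d in layers:
        rf += effective_kernel(k, d) - 1
    return rf

print(effective_kernel(3, 2))             # 5: a 3x3 kernel at dilation 2 spans 5 pixels
print(receptive_field([(3, 1), (3, 1)]))  # 5: two plain 3x3 convolutions
print(receptive_field([(3, 1), (3, 2)]))  # 7: dilation widens the view at the same parameter count
```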
3.2. DIR Block Architecture
3.2.1. DIR Block
3.2.2. Residual Connection
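A minimal NumPy sketch of the identity shortcut: the block computes ReLU(F(x) + x), so the addition gives gradients a direct path back to earlier layers and the block only has to learn the residual F. The two-layer form of F and the weight shapes here are illustrative assumptions, not the paper's exact layers.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Computes ReLU(F(x) + x): the identity shortcut lets the gradient
    flow through the addition unattenuated."""
    out = relu(x @ w1)    # first weight layer of the residual branch F
    out = out @ w2        # second weight layer of F
    return relu(out + x)  # identity shortcut joins here

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16))
w1 = rng.standard_normal((16, 16)) * 0.01
w2 = rng.standard_normal((16, 16)) * 0.01
y = residual_block(x, w1, w2)
print(y.shape)  # (4, 16): output shape matches the input, as the shortcut requires
```

Note that if the residual branch contributes nothing (zero weights), the block degenerates to ReLU(x), so a deep stack of such blocks can never do worse than the identity mapping.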
3.2.3. Depth-Wise Separable Convolution
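The parameter saving of depthwise separable convolution — one kernel per input channel followed by a 1 × 1 pointwise mix — can be quantified directly. The 128 → 256 channel counts below mirror one encoder stage of the network; the comparison itself is standard arithmetic, not a measurement from the paper.

```python
def standard_conv_params(c_in, c_out, k):
    """Every output channel mixes all input channels with its own k x k kernel."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """One k x k kernel per input channel, then a 1 x 1 pointwise
    convolution to mix the channels."""
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise

# Channel counts mirroring one encoder stage (128 -> 256) with 3 x 3 kernels.
dense = standard_conv_params(128, 256, 3)
separable = depthwise_separable_params(128, 256, 3)
print(dense, separable)  # 294912 33920, roughly an 8.7x reduction
```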
4. Experimental Results
4.1. Experimental System Configuration
4.2. Evaluation Criteria
4.3. Comparative Experiment
4.4. Ablation Study
4.5. Loss Function Curve
4.6. Real-Time Performance Testing on Different Hardware Types
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Sun, Z. Research on Comprehensive Mechanized Coal Mining Technology in Coal Mines. Inn. Mong. Coal Econ. 2025, 15, 55–57.
- Liu, Y.; Wang, X.; Zhang, Z.; Deng, F. Deep learning in image segmentation for mineral production: A review. Comput. Geosci. 2023, 180, 105455.
- Zhang, H.; Xiao, D. A high-precision and lightweight ore particle segmentation network for industrial conveyor belt. Expert Syst. Appl. 2025, 273, 126891.
- Li, F.; Liu, X.; Yin, Y.; Li, Z. DDR-Unet: A High-Accuracy and Efficient Ore Image Segmentation Method. IEEE Trans. Instrum. Meas. 2023, 72, 5027920.
- Mukherjee, D.P.; Potapovich, Y.; Levner, I.; Zhang, H. Ore Image Segmentation by Learning Image and Shape Features. Pattern Recognit. Lett. 2009, 30, 615–622.
- Zhan, Y.; Zhang, G. An Improved OTSU Algorithm Using Histogram Accumulation Moment for Ore Segmentation. Symmetry 2019, 11, 431.
- Zhang, G.Y.; Liu, G.Z.; Zhu, H.; Qiu, B. Ore Image Thresholding Using Bi-Neighbourhood Otsu’s Approach. Electron. Lett. 2010, 46, 1666.
- Dong, K.; Jiang, D. Automated Estimation of Ore Size Distributions Based on Machine Vision. In Unifying Electrical Engineering and Electronics Engineering; Xing, S., Chen, S., Wei, Z., Xia, J., Eds.; Lecture Notes in Electrical Engineering; Springer: New York, NY, USA, 2014; Volume 238, pp. 1125–1131. ISBN 978-1-4614-4980-5.
- Zhang, G.; Liu, G.; Zhu, H. Segmentation Algorithm of Complex Ore Images Based on Templates Transformation and Reconstruction. Int. J. Miner. Metall. Mater. 2011, 18, 385–389.
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015; Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F., Eds.; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2015; Volume 9351, pp. 234–241. ISBN 978-3-319-24573-7.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Identity Mappings in Deep Residual Networks. In Proceedings of the Computer Vision—ECCV 2016—14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016.
- Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv 2017, arXiv:1704.04861.
- Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.-C. MobileNetV2: Inverted Residuals and Linear Bottlenecks. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4510–4520.
- Howard, A.; Sandler, M.; Chen, B.; Wang, W.; Chen, L.-C.; Tan, M.; Chu, G.; Vasudevan, V.; Zhu, Y.; Pang, R.; et al. Searching for MobileNetV3. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1314–1324.
- Zhou, Z.; Yuan, H.; Cai, X. Rock Thin Section Image Identification Based on Convolutional Neural Networks of Adaptive and Second-Order Pooling Methods. Mathematics 2023, 11, 1245.
- Gu, Y.; Deng, L. STAGCN: Spatial–Temporal Attention Graph Convolution Network for Traffic Forecasting. Mathematics 2022, 10, 1599.
- Haq, M.A.; Khan, I.; Ahmed, A.; Eldin, S.M.; Alshehri, A.; Ghamry, N.A. DCNNBT: A Novel Deep Convolution Neural Network-Based Brain Tumor Classification Model. Fractals 2023, 31, 2340102.
- Aguilera, R.C.; Ortiz, M.P.; Banda, A.A.; Aguilera, L.E.C. Blockchain CNN Deep Learning Expert System for Healthcare Emergency. Fractals 2021, 29, 2150227.
- Wang, L.; Wang, X.; Zhao, Z.; Wu, Y.; Xu, J.; Zhang, H.; Yu, J.; Sun, Q.; Bai, Y. Multi-Factor Status Prediction by 4d Fractal CNN Based on Remote Sensing Images. Fractals 2022, 30, 2240101.
- Ma, X.; Zhang, P.; Man, X.; Ou, L. A New Belt Ore Image Segmentation Method Based on the Convolutional Neural Network and the Image-Processing Technology. Minerals 2020, 10, 1115.
- Li, H.; Pan, C.; Chen, Z.; Wulamu, A.; Yang, A. Ore Image Segmentation Method Based on U-Net and Watershed. Comput. Mater. Contin. 2020, 65, 563–578.
- Xiao, D.; Liu, X.; Le, B.T.; Ji, Z.; Sun, X. An Ore Image Segmentation Method Based on RDU-Net Model. Sensors 2020, 20, 4979.
- Jin, Q.; Meng, Z.; Pham, T.D.; Chen, Q.; Wei, L.; Su, R. DUNet: A Deformable Network for Retinal Vessel Segmentation. Knowl.-Based Syst. 2019, 178, 149–162.
- Wang, W.; Li, Q.; Xiao, C.; Zhang, D.; Miao, L.; Wang, L. An Improved Boundary-Aware U-Net for Ore Image Semantic Segmentation. Sensors 2021, 21, 2615.
- Huang, Z.; Zhao, Y.; Liu, Y.; Song, G. GCAUNet: A Group Cross-Channel Attention Residual UNet for Slice Based Brain Tumor Segmentation. Biomed. Signal Process. Control 2021, 70, 102958.
- Gu, Z.; Cheng, J.; Fu, H.; Zhou, K.; Hao, H.; Zhao, Y.; Zhang, T.; Gao, S.; Liu, J. CE-Net: Context Encoder Network for 2D Medical Image Segmentation. IEEE Trans. Med. Imaging 2019, 38, 2281–2292.
- Li, Y.; Yang, J.; Ni, J.; Elazab, A.; Wu, J. TA-Net: Triple Attention Network for Medical Image Segmentation. Comput. Biol. Med. 2021, 137, 104836.
- Shi, Z.; Wang, T.; Huang, Z.; Xie, F.; Liu, Z.; Wang, B.; Xu, J. MD-Net: A Multi-Scale Dense Network for Retinal Vessel Segmentation. Biomed. Signal Process. Control 2021, 70, 102977.
- Yu, F.; Koltun, V. Multi-Scale Context Aggregation by Dilated Convolutions. arXiv 2016, arXiv:1511.07122.
- Zhou, Z.; Siddiquee, M.M.R.; Tajbakhsh, N.; Liang, J. UNet++: Redesigning Skip Connections to Exploit Multiscale Features in Image Segmentation. IEEE Trans. Med. Imaging 2020, 39, 1856–1867.
- Diakogiannis, F.I.; Waldner, F.; Caccetta, P.; Wu, C. ResUNet-a: A Deep Learning Framework for Semantic Segmentation of Remotely Sensed Data. ISPRS J. Photogramm. Remote Sens. 2020, 162, 94–114.
- Badrinarayanan, V.; Kendall, A.; Cipolla, R. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495.
- Alom, M.Z.; Yakopcic, C.; Taha, T.M.; Asari, V.K. Nuclei Segmentation with Recurrent Residual Convolutional Neural Networks Based U-Net (R2U-Net). In Proceedings of the NAECON 2018—IEEE National Aerospace and Electronics Conference, Dayton, OH, USA, 23–26 July 2018; pp. 228–233.

| Layer | Contracting Path | Output Shape | Layer | Expanding Path | Output Shape |
|---|---|---|---|---|---|
| Input | Input | [1, 128, 128] |  | Upsample | [512, 16, 16] |
| L2 | Conv-1 + BN + ReLU | [16, 128, 128] | L23 | Conv-17 + BN + Hswish | [256, 16, 16] |
| L3 | Conv-2 + BN + ReLU | [32, 128, 128] | L24 | DConv-18 + BN + Hswish | [256, 16, 16] |
| L4 | DConv-3 + BN + ReLU | [32, 128, 128] | L25 | Conv-19 + BN | [256, 16, 16] |
| L5 | Conv-4 + BN | [32, 128, 128] | L26 | Upsample | [256, 32, 32] |
| L6 | MaxPool2d | [32, 64, 64] | L27 | Conv-20 + BN + Hswish | [128, 32, 32] |
| L7 | Conv-5 + BN + Hswish | [64, 64, 64] | L28 | DConv-21 + BN + Hswish | [128, 32, 32] |
| L8 | DConv-6 + BN + Hswish | [64, 64, 64] | L29 | Conv-22 + BN | [128, 32, 32] |
| L9 | Conv-7 + BN | [64, 64, 64] | L30 | Upsample | [128, 64, 64] |
| L10 | MaxPool2d | [64, 32, 32] | L31 | Conv-23 + BN + Hswish | [64, 64, 64] |
| L11 | Conv-8 + BN + Hswish | [128, 32, 32] | L32 | DConv-24 + BN + Hswish | [64, 64, 64] |
| L12 | DConv-9 + BN + Hswish | [128, 32, 32] | L33 | Conv-25 + BN | [64, 64, 64] |
| L13 | Conv-10 + BN | [128, 32, 32] | L34 | Upsample | [64, 128, 128] |
| L14 | MaxPool2d | [128, 16, 16] | L35 | Conv-26 + BN + ReLU | [32, 128, 128] |
| L15 | Conv-11 + BN + Hswish | [256, 16, 16] | L36 | DConv-27 + BN + ReLU + Hswish | [32, 128, 128] |
| L16 | DConv-12 + BN + Hswish | [256, 16, 16] | L37 | Conv-28 + BN + ReLU | [16, 128, 128] |
| L17 | Conv-13 + BN | [256, 16, 16] | L38 | Output = Conv-29 + BN | [1, 128, 128] |
| L18 | MaxPool2d | [256, 8, 8] |  |  |  |
| L19 | Conv-14 + BN + Hswish | [512, 8, 8] |  |  |  |
| L20 | DConv-15 + BN + Hswish | [512, 8, 8] |  |  |  |
| L21 | Conv-16 + BN + Hswish | [512, 8, 8] |  |  |  |

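The spatial sizes in the architecture table follow a symmetric halving/doubling pattern (four 2 × 2 max-pools down from 128 to 8, four 2× upsamples back up), which is what lets the encoder and decoder stages pair up via skip connections. A quick sketch confirms the symmetry:

```python
def pool_shapes(size=128, stages=4):
    """Spatial size after each 2x2 max-pool in the contracting path."""
    shapes = [size]
    for _ in range(stages):
        size //= 2
        shapes.append(size)
    return shapes

def upsample_shapes(size=8, stages=4):
    """Spatial size after each 2x upsample in the expanding path."""
    shapes = [size]
    for _ in range(stages):
        size *= 2
        shapes.append(size)
    return shapes

down = pool_shapes()     # [128, 64, 32, 16, 8]
up = upsample_shapes()   # [8, 16, 32, 64, 128]
assert up == down[::-1]  # symmetric, so encoder and decoder stages pair up
```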

| Methods | ACC | Precision | Recall | F1 | Time (s) | FLOPs (MB) | Params (MB) |
|---|---|---|---|---|---|---|---|
| Ours | 0.948 | 0.912 | 0.917 | 0.914 | 6.25 | 399.14 | 0.77 |
| U-Net++ [31] | 0.942 | 0.913 | 0.893 | 0.903 | 18 | 8716.42 | 9.16 |
| ResUnet [32] | 0.941 | 0.901 | 0.906 | 0.903 | 27.5 | 20,208.16 | 13.04 |
| SegNet [33] | 0.923 | 0.896 | 0.844 | 0.869 | 477.5 | 457,657.07 | 29.44 |
| U-Net [10] | 0.943 | 0.911 | 0.901 | 0.905 | 24.75 | 16,361.72 | 34.53 |
| R2U-Net [34] | 0.799 | 0.941 | 0.357 | 0.517 | 59.25 | 38,227.67 | 39.09 |
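For reference, the ACC, precision, recall, and F1 columns follow the standard pixel-level definitions, with coal pixels as the positive class. The confusion-matrix counts below are illustrative only, not taken from the paper's test set:

```python
def segmentation_metrics(tp, fp, fn, tn):
    """Pixel-level metrics from a binary confusion matrix
    (coal = positive class, background = negative class)."""
    acc = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return acc, precision, recall, f1

# Illustrative pixel counts only -- not measured on the paper's dataset.
acc, precision, recall, f1 = segmentation_metrics(tp=900, fp=80, fn=70, tn=950)
print(round(acc, 3), round(f1, 3))  # 0.925 0.923
```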

| Methods | ACC | Precision | Recall | F1 | Time (s) | FLOPs (MB) | Params (MB) |
|---|---|---|---|---|---|---|---|
| Baseline (U-Net) | 0.943 | 0.911 | 0.901 | 0.905 | 24.75 | 16,361.72 | 34.53 |
| Baseline + DIR block (DIRU-Net) | 0.948 | 0.912 | 0.917 | 0.914 | 6.25 | 399.14 | 0.77 |

| Methods | i5-7500 CPU Time (s) | GTX1050Ti GPU Time (s) |
|---|---|---|
| U-Net++ | 26 | 18 |
| ResUnet | 41.23 | 27.5 |
| SegNet | 754.5 | 477.5 |
| U-Net | 34.69 | 24.75 |
| R2U-Net | 89.76 | 59.25 |
| DIRU-Net | 9.45 | 6.25 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Liu, J.; Fan, G.; Maimutimin, B. Efficient Image Segmentation of Coal Blocks Using an Improved DIRU-Net Model. Mathematics 2025, 13, 3541. https://doi.org/10.3390/math13213541
