Research on Coal and Rock Identification by Integrating Terahertz Time-Domain Spectroscopy and Multiple Machine Learning Algorithms
Abstract
1. Introduction
2. Experiment and Methods
2.1. Experimental Setup
2.2. Sample Preparation
2.3. Optical Parameter Extraction Method
2.4. Modeling in Machine Learning
3. Results and Analysis
3.1. THz Spectral Characteristics of Samples Mixed with Different Coal–Rock Ratios
- (1) For samples with coal content between 0% and 30%, the THz time-domain signals exhibit clear, measurable features. Terahertz waves are sensitive to intermolecular vibrations, lattice phonon modes, and hydrogen-bonding networks in organic matter, all of which can produce characteristic absorption in this frequency range. For samples with coal content of 40% and above, however, the transmitted THz signal becomes extremely weak (transmission approaches zero) because of strong absorption and scattering by the coal matrix, so optical parameters cannot be extracted reliably. The proposed method is therefore effective for coal contents of 0–30%, which corresponds to the early stage of coal–rock mixing or to rock-dominated interfaces, and the optical parameter analysis that follows is restricted to this range.
- (2) Differences in refractive index (n) between samples change the optical path length (n·d) traversed by the THz pulse, which produces different time delays between the reference signal and the sample signals in the time-domain spectra [33]. The reference peak appears at 16.7 ps, whereas the peaks of samples with 0–30% coal content appear between 18.6 and 19.1 ps, a delay of 1.9–2.4 ps relative to the reference. The delay increases with coal content.
- (3) The THz wave is attenuated as it propagates, mainly through absorption in the medium and through reflection and scattering at its surface. The time-domain amplitude of every sample is significantly lower than that of the reference signal, and the attenuation is greater for samples with higher coal content.
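As a rough check on the delays in item (2), a peak shift Δt can be converted to an average refractive index with n ≈ 1 + c·Δt/d. The sketch below is illustrative only: the 1.0 mm thickness is an assumed nominal value (measured pellet thicknesses scatter around it), `index_from_delay` is a hypothetical helper name, and a peak-to-peak delay gives a band-averaged estimate rather than a frequency-resolved index.

```python
# Speed of light in mm/ps, convenient for mm thicknesses and ps delays.
C_MM_PER_PS = 0.299792458


def index_from_delay(delay_ps: float, thickness_mm: float) -> float:
    """Band-averaged refractive index from a peak delay: n = 1 + c * dt / d."""
    return 1.0 + C_MM_PER_PS * delay_ps / thickness_mm


T_REF_PS = 16.7              # reference peak position reported in the text
T_SAMPLE_PS = (18.6, 19.1)   # range of sample peak positions reported in the text
THICKNESS_MM = 1.0           # ASSUMPTION: nominal pellet thickness, not a measured value

n_low, n_high = (index_from_delay(t - T_REF_PS, THICKNESS_MM) for t in T_SAMPLE_PS)
print(f"{n_low:.2f} {n_high:.2f}")  # prints "1.57 1.72"
```

Under the 1.0 mm assumption, the reported peak positions correspond to indices of roughly 1.57 to 1.72. For a real estimate, each sample's measured thickness should replace the nominal value.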
- (1) The absorption coefficient fluctuates strongly below 0.7 THz and above 1.3 THz, primarily because the signal-to-noise ratio is low in these spectral regions. The calculated absorption coefficient outside 0.7–1.3 THz is therefore unreliable, and only the absorption spectrum within this effective band is analyzed.
- (2) At lower frequencies, the absorption coefficients of samples with different coal contents differ only slightly. As the frequency increases, the absorption coefficients of all six data sets rise, and the differences between samples become increasingly pronounced.
- (3) At any given frequency, the absorption coefficient increases with the coal content of the sample.
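Frequency-resolved optical parameters of this kind are commonly obtained from the ratio of the sample and reference spectra. The minimal sketch below uses the standard thick-sample approximation, n(f) = 1 + c·Δφ/(2πf·d) and α(f) = (2/d)·ln[4n/(|H|(n+1)²)], and is not confirmed to be the exact procedure used in this work. The synthetic Gaussian pulses, the 1.0 mm thickness, the amplitude ratio of 0.3, and the function name `extract_optical_params` are all assumptions for illustration.

```python
import numpy as np

C_MM_PER_PS = 0.299792458  # speed of light in mm/ps


def extract_optical_params(t_ps, e_ref, e_sam, d_mm, band_thz=(0.7, 1.3)):
    """Return (f_THz, n, alpha_per_cm) inside band_thz (thick-sample approximation)."""
    dt = t_ps[1] - t_ps[0]
    f = np.fft.rfftfreq(len(t_ps), dt)       # THz, because time is in ps
    ref_spec = np.fft.rfft(e_ref)
    sam_spec = np.fft.rfft(e_sam)
    keep = f <= band_thz[1]                  # discard noisy high-frequency bins first
    f = f[keep]
    h = sam_spec[keep] / ref_spec[keep]      # complex transfer function
    # With numpy's FFT sign convention a delay appears as a negative phase.
    phase = -np.unwrap(np.angle(h))          # unwrapped from DC upward
    band = f >= band_thz[0]
    f, h, phase = f[band], h[band], phase[band]
    n = 1.0 + C_MM_PER_PS * phase / (2.0 * np.pi * f * d_mm)
    alpha_per_mm = (2.0 / d_mm) * np.log(4.0 * n / (np.abs(h) * (n + 1.0) ** 2))
    return f, n, 10.0 * alpha_per_mm         # convert mm^-1 to cm^-1


# ASSUMPTION: synthetic pulses standing in for measured waveforms.
t = np.arange(1024) * 0.05                   # 51.2 ps window, 0.05 ps step


def pulse(t0_ps, amplitude, sigma_ps=0.2):
    return amplitude * np.exp(-((t - t0_ps) ** 2) / (2.0 * sigma_ps ** 2))


e_ref = pulse(16.7, 1.0)                     # reference peak position from the text
e_sam = pulse(18.8, 0.3)                     # delayed and attenuated copy (assumed values)
f, n, alpha = extract_optical_params(t, e_ref, e_sam, d_mm=1.0)
print(f"n = {n.mean():.3f}, alpha = {alpha.mean():.1f} cm^-1")
```

Because the synthetic sample pulse is a pure delayed and scaled copy of the reference, n and α come out flat across the band. Measured waveforms would give frequency-dependent curves, and restricting the output to 0.7–1.3 THz mirrors the effective band described in item (1).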
3.2. Classification and Identification of Samples Using Machine Learning Algorithms
3.2.1. Dimensionality Reduction Data Research Based on Principal Component Analysis
- (1) The variance contribution rates of the first principal component (PC1) and the second principal component (PC2) are 98.91% and 0.92%, respectively, giving a cumulative contribution of 99.83%. The first two components therefore retain nearly all of the variance in the refractive index spectra, which indicates that principal component analysis extracts their features very effectively.
- (2) The PC1 scores of the three samples with coal contents of 30%, 40%, and 50% are positive, whereas those of the other three samples are negative. For the sample with 30% coal content, the PC1 and PC2 scores do not differ significantly.
- (3) As the coal content increases, the PC1 score changes from negative to positive and is positively correlated with coal content.
- (4) Among the negative scores, the PC1 score of the sample with 0% coal content (the pure rock sample) has the largest magnitude; among the positive scores, that of the sample with 50% coal content is the largest.

- (1) The variance contribution rates of the first two principal components are 86.29% and 6.65%, respectively, giving a cumulative contribution of 92.94%. This indicates that principal component analysis also extracts features from the absorption spectra of the samples effectively.
- (2) The PC1 scores of the three samples with coal contents of 0%, 10%, and 20% are negative, whereas those of the other three samples are positive.
- (3) As the coal content increases, the PC1 score changes from negative to positive and is positively correlated with coal content.
- (4) Among the negative scores, the PC1 score of the sample with 0% coal content (the pure rock sample) has the largest magnitude; among the positive scores, that of the sample with 50% coal content is the largest.
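The variance contribution rates and PC1 scores described above can be illustrated with a small PCA computed by singular value decomposition. This is a sketch on synthetic spectra, not the measured data: the linear dependence on coal content, the noise level, the five replicates per class, and all variable names are assumptions chosen only to show how the quantities are obtained.

```python
import numpy as np

rng = np.random.default_rng(0)

# ASSUMPTION: synthetic stand-in spectra, 5 replicates for each of 6 coal contents.
coal_pct = np.arange(0, 60, 10)                   # 0, 10, ..., 50 %
labels = np.repeat(coal_pct, 5)
freqs = np.linspace(0.7, 1.3, 31)                 # effective band in THz
spectra = 5.0 + 0.4 * labels[:, None] * freqs[None, :]
spectra += rng.normal(scale=0.05, size=spectra.shape)

# PCA via SVD of the mean-centred data matrix.
centred = spectra - spectra.mean(axis=0)
u, s, vt = np.linalg.svd(centred, full_matrices=False)
ratio = s**2 / np.sum(s**2)                       # variance contribution rate per component
scores = u * s                                    # sample scores on each component

# The sign of a principal component is arbitrary; orient PC1 to rise with coal content.
if np.corrcoef(scores[:, 0], labels)[0, 1] < 0:
    scores[:, 0] *= -1
    vt[0] *= -1

class_means = np.array([scores[labels == c, 0].mean() for c in coal_pct])
print("variance contribution (PC1, PC2) in %:", np.round(100 * ratio[:2], 2))
print("mean PC1 score per coal content:", np.round(class_means, 2))
```

Because the data are mean-centred before decomposition, classes below the average coal content receive negative PC1 scores and classes above it receive positive ones, which is the same sign pattern reported for the measured spectra. The explicit sign flip matters when comparing scores across runs or libraries, since the decomposition itself does not fix the direction of a component.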

3.2.2. Evaluation of Data Classification Effectiveness Based on Machine Learning Algorithms
4. Discussion
4.1. Comparison with Existing Coal–Rock Identification Techniques
4.2. Limitations and Future Work
5. Conclusions
- (1) Terahertz waves are highly sensitive to the physical and chemical properties of coal–rock mixed media with 0–30% coal content, which produces distinctive spectral characteristics. The amplitude decay of the time-domain signal and, within the effective frequency band (0.7–1.3 THz), optical parameters such as the refractive index and absorption coefficient all correlate clearly with coal content, verifying the feasibility of THz-TDS for coal–rock identification within this range.
- (2) After dimensionality reduction by PCA, the random forest algorithm achieved the best classification performance, with a test set accuracy of 96% and a 10-fold cross-validation accuracy of 95.3 ± 1.9%.
- (3) In transmission mode, the proposed method is currently limited to coal content of up to 30%. Within this range, however, it overcomes the accuracy bottleneck of traditional techniques and provides a new solution for detecting rock-dominated coal–rock interfaces. Future work will extend the detectable range using reflection-mode THz-TDS.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Yuan, L.; Zhang, T.; Wang, Y.H.; Wang, X.; Wang, Y.; Hao, X. Scientific problems and key technologies for safe and efficient mining of deep coal resources. J. China Coal Soc. 2025, 50, 1–12.
- Yuan, L. Strategic Conception of Carbon Neutralization in Coal Industry. J. Strateg. Study Chin. Acad. Eng. 2023, 25, 103–110.
- Yuan, L. Challenges and countermeasures for high quality development of China’s coal industry. J. China Coal 2020, 46, 6–12.
- Liu, W.-B. Thoughts on the high-quality development path of coal enterprises under the goal of ‘double carbon’. J. China Coal Ind. 2022, 5, 12–14.
- Zhang, K. Thinking and practice of green development of coal enterprises under the goal of ‘double carbon’. J. China Coal Ind. 2024, 12, 67–69.
- Wang, G.-F.; Zhang, J.-H.; Ren, H.-W.; Du, Y.; Zhang, D.; Yan, R.; Yu, X. Research and application of digital intelligence technology and complete equipment for efficient coal mining. J. China Coal Soc. 2025, 50, 43–64.
- Wei, R.; Xu, L.-J.; Meng, X.-Y.; Wu, J.-F.; Zhang, K. Coal rock recognition method based on hyperspectral characteristic absorption peak. Spectrosc. Spectr. Anal. 2021, 41, 1942–1948.
- Ge, S.-R. Development history of shearer technology (6)-coal-rock interface detection. J. China Coal 2020, 46, 10–24.
- Wang, G.-F.; Pang, Y.-H.; Ren, H.-W. Intelligent mining mode and technical path of coal mine. J. Min. Strat. Control Eng. 2020, 2, 5–19.
- Feng, G.; Zhang, N.; Feng, X.-W.; Xie, Z.; Li, Y. Autonomous prediction of rock deformation in fault zones of coal roadways using supervised machine learning. Tunn. Undergr. Space Technol. 2024, 147, 105724.
- Zhou, Y.; Qu, J.-B.; Bai, J.-W.; Feng, G.; Cui, B.; Ren, W.; Liu, D.; Zhang, L. Interpretable damage state identification in coal-backfilling structures using hybrid signal processing and machine learning approaches. Mater. Today Commun. 2025, 48, 113308.
- Liu, Y.; Xu, Y.-P.; Chen, P.; Li, J.-Y.; Liu, D.; Chu, X.-L. Non-destructive spectroscopy assisted by machine learning for coal industrial analysis: Strategies, progress, and future prospects. Trends Anal. Chem. 2025, 192, 118322.
- Lu, J.; Jiang, W.; Xie, H.-P.; Gao, H.; Zhang, D. Dynamic disaster mechanism and acoustic emission evolution of deep coal-rock under true triaxial disturbance stress. J. Rock Mech. Geotech. Eng. 2025, 17, 5829–5844.
- Zhang, Y.; Tong, L.; Lai, X.-P.; Cao, S.; Yan, B.; Liu, Y.; Sun, H.; Yang, Y.; He, W. Perception and accurate recognition of coal-rock interface in tunneling space under coal dust environment based on machine vision. J. China Coal Soc. 2024, 49, 3276–3290.
- Liu, Y.; Lei, S.; Wang, Z.-B.; Wei, D.; Gu, J.; Li, X.; Dai, J. A time–frequency analysis method of electromagnetic signal for coal and rock properties recognition while drilling based on CWT and GAPSO-ROA. Measurement 2025, 253, 117447.
- Wei, D.-B. Research of Seam Thickness Detection to Automatically Raise Shearer Arm Based on Natural γ-ray. J. Coal Mine Mach. 2015, 36, 3.
- Shi, C.-W.; Gao, Z.-B. Study on the influence of rough surface on ground penetrating radar identification results of coal-rock interface. J. Coal Eng. 2024, 56, 176–181.
- Jepsen, P.U.; Cooke, D.G.; Koch, M. Terahertz spectroscopy and imaging—Modern techniques and applications. Laser Photonics Rev. 2011, 5, 124–166.
- Liu, X.-M.; Yu, J.-S.; Chen, X.-D. Millimeter wave and terahertz quasi-optical technology: Theory, application and development. J. Terahertz Sci. Electron. Inf. 2022, 20, 631–652.
- Qu, B.-L.; Zhu, H.-Q.; Tian, R.; Hu, L.; Wang, J.; Liao, Q.; Gao, R.; Wang, H. Investigation of the impact of pyrite content on the terahertz dielectric response of coals and rapid recognition with kernel-svm. Energy 2023, 285, 129546.
- Zhang, T.; Zheng, Z.-Y.; Zhang, M.-R.; Li, S.; Zheng, X.; Huang, H.; Shen, J.; Zhang, Z.; Qiu, K. Quantitatively characterization of rare earth ore by terahertz time-domain spectroscopy. Infrared Phys. Technol. 2024, 142, 105587.
- Moffa, C.; Curcio, A.; Merola, C.; Migliorati, M.; Palumbo, L.; Felici, A.C.; Petrarca, M. Discrimination of natural and synthetic forms of azurite: An innovative approach based on high-resolution terahertz continuous wave (THz-CW) spectroscopy for Cultural Heritage. Dyes Pigment. 2024, 229, 112287.
- Zhu, H.-Q.; Wang, H.-R.; Liu, J.-L.; Wang, W.; Gao, R.; Zhang, Y. Application of terahertz dielectric constant spectroscopy for discrimination of oxidized coal and unoxidized coal by machine learning algorithms. Fuel 2021, 293, 120470.
- Huang, S.T.; Deng, H.-X.; Wei, X.; Zhang, J. Progress in application of terahertz time-domain spectroscopy for pharmaceutical analyses. Front. Bioeng. Biotechnol. 2023, 11, 1219042.
- Liang, B. Stretchable and wearable terahertz absorbing composites based on three-dimensional graphene. Technol. Wind 2020, 32, 180–181.
- Rytik, A.P.; Tuchin, V.V. Effect of terahertz radiation on cells and cellular structures. Front. Optoelectron. 2025, 18, 25–55.
- Zhao, C.-X. Test and analysis of terahertz wave transmission characteristics of polymer materials. Sci. Technol. Inf. 2010, 22, 92.
- Hagenvik, H.O.; Skaar, J. Magnetic permeability in Fresnel’s equation. J. Opt. Soc. Am. B 2019, 36, 1386–1395.
- Taneco-Hernandez, M.A.; Morales-Delgado, V.F.; Gómez-Aguilar, J.F. Fundamental solutions of the fractional Fresnel equation in the real half-line. Phys. A-Stat. Mech. Its Appl. 2019, 521, 807–827.
- Duvillaret, L.; Garet, F.; Coutaz, J.L. Highly precise determination of optical constants and sample thickness in terahertz time-domain spectroscopy. Appl. Opt. 1999, 38, 409–415.
- Dorney, T.D.; Baraniuk, R.G.; Mittleman, D.M. Material parameter estimation with terahertz time-domain spectroscopy. J. Opt. Soc. Am. A 2001, 18, 1562–1571.
- Little, M.A.; Varoquaux, G.; Saeb, S.; Lonini, L.; Jayaraman, A.; Mohr, D.C.; Kording, K.P. Using and understanding cross-validation strategies. Perspectives on Saeb et al. GigaScience 2017, 6, 1–6.
- Zhang, Y.; Zhou, W.-H.; Ge, H.-Y.; Jiang, Y.-Y.; Guo, C.-Y.; Wang, H.; Wen, Q.-Q.; Wang, Y.-X. Research on defect detection of GFRP composite materials based on terahertz imaging technology. Spectrosc. Spectr. Anal. 2025, 45, 1874–1881.
- Naftaly, M.; Tikhomirov, I.; Hou, P.; Markl, D. Measuring Open Porosity of Porous Materials Using THz-TDS and an Index-Matching Medium. Sensors 2020, 20, 3120.
- Murphy, K.N.; Naftaly, M.; Nordon, A.; Markl, D. Polymer Pellet Fabrication for Accurate THz-TDS Measurements. Appl. Sci. 2022, 12, 3475.

| Serial Number | Coal Powder Content (%) | Coal Powder Weight (mg) |
|---|---|---|
| 1 | 0 | 0 |
| 2 | 10 | 4.0 |
| 3 | 20 | 8.0 |
| 4 | 30 | 12.0 |
| 5 | 40 | 16.0 |
| 6 | 50 | 20.0 |
| 7 | 60 | 24.0 |
| 8 | 70 | 28.0 |
| 9 | 80 | 32.0 |
| 10 | 90 | 36.0 |
| 11 | 100 | 40.0 |

| Serial Number | Coal Powder Content (%) | Thickness (mm) | Serial Number | Coal Powder Content (%) | Thickness (mm) | Serial Number | Coal Powder Content (%) | Thickness (mm) |
|---|---|---|---|---|---|---|---|---|
| 01 | 0 | 1.01 | 20 | 30 | 1.00 | 39 | 70 | 1.16 |
| 02 | 0 | 1.05 | 21 | 40 | 1.21 | 40 | 70 | 1.11 |
| 03 | 0 | 0.89 | 22 | 40 | 1.00 | 41 | 80 | 1.22 |
| 04 | 0 | 0.89 | 23 | 40 | 1.08 | 42 | 80 | 1.07 |
| 05 | 0 | 1.13 | 24 | 40 | 1.08 | 43 | 80 | 1.16 |
| 06 | 10 | 1.22 | 25 | 40 | 0.91 | 44 | 80 | 1.03 |
| 07 | 10 | 1.09 | 26 | 50 | 1.12 | 45 | 80 | 0.91 |
| 08 | 10 | 1.18 | 27 | 50 | 1.27 | 46 | 90 | 0.98 |
| 09 | 10 | 1.01 | 28 | 50 | 1.25 | 47 | 90 | 1.14 |
| 10 | 10 | 1.07 | 29 | 50 | 1.19 | 48 | 90 | 0.98 |
| 11 | 20 | 1.20 | 30 | 50 | 0.97 | 49 | 90 | 1.08 |
| 12 | 20 | 1.10 | 31 | 60 | 1.26 | 50 | 90 | 1.17 |
| 13 | 20 | 0.91 | 32 | 60 | 0.96 | 51 | 100 | 1.15 |
| 14 | 20 | 1.01 | 33 | 60 | 1.13 | 52 | 100 | 1.05 |
| 15 | 20 | 1.22 | 34 | 60 | 1.02 | 53 | 100 | 1.09 |
| 16 | 30 | 1.03 | 35 | 60 | 1.10 | 54 | 100 | 1.00 |
| 17 | 30 | 1.14 | 36 | 70 | 0.86 | 55 | 100 | 1.04 |
| 18 | 30 | 1.14 | 37 | 70 | 1.20 | / | / | / |
| 19 | 30 | 1.26 | 38 | 70 | 1.10 | / | / | / |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Ye, D.; Hu, L.; Xu, J.; Yang, Y.; Liu, Z.; Li, S.; Li, J.; Liu, L.; Li, C. Research on Coal and Rock Identification by Integrating Terahertz Time-Domain Spectroscopy and Multiple Machine Learning Algorithms. Photonics 2026, 13, 409. https://doi.org/10.3390/photonics13050409
