Deep Q-Network for Optimal Decision for Top-Coal Caving
Abstract
1. Introduction
- (1) The optimal control of the hydraulic support window’s action is formulated as a Markov decision process, and a new method based on the deep Q-network (DQN) is proposed to find the optimal decision for the window’s action. In this method, the state of the environment, the loss function of the optimizer, and the reward of each step are defined according to the top-coal caving process.
- (2) A 3D discrete element method (DEM) simulation platform based on Yade is created to analyze the top-coal caving process. Simulation experiments carried out on this platform validate, in theory, a feasible way of applying the intelligent method to top-coal caving.
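As a point of reference for contribution (1), the standard DQN objective from the deep reinforcement learning literature takes the form below. This is the generic formulation and is an assumption about, not a transcription of, the loss used in this work. The symbols are notation assumed here: $s$ is the state, $a$ the window action, $r$ the step reward, $s'$ the next state, $\theta$ the weights of the online network, $\theta^{-}$ the weights of a periodically updated target network, $\gamma$ the discount factor, and $D$ the replay memory.

```latex
L(\theta) = \mathbb{E}_{(s,a,r,s') \sim D}
\left[ \left( r + \gamma \max_{a'} Q(s', a'; \theta^{-}) - Q(s, a; \theta) \right)^{2} \right]
```

Minimizing this loss pulls the predicted value $Q(s,a;\theta)$ toward the one-step bootstrapped target $r + \gamma \max_{a'} Q(s',a';\theta^{-})$.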
2. Top-Coal Caving 3D Simulation Platform
3. Optimal Decision of Top-Coal Caving with Deep Q-Network
3.1. Markov Process of Top-Coal Caving
3.2. Deep Q-Network for Top-Coal Caving
Algorithm 1: DQN for top-coal caving.
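The steps of Algorithm 1 are not reproduced here, so the following is only a minimal sketch of three building blocks that a generic DQN training loop usually relies on: a replay memory, epsilon-greedy action selection, and the one-step target. It is not the authors' algorithm. The constants `GAMMA` and `EPSILON`, the two-action encoding, and all function and class names are hypothetical choices made for illustration.

```python
import random
from collections import deque

GAMMA = 0.9      # discount factor (assumed value)
EPSILON = 0.1    # exploration probability (assumed value)
N_ACTIONS = 2    # assumed encoding: 0 = close window, 1 = open window


class ReplayMemory:
    """Fixed-size buffer of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement, as in plain experience replay.
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def epsilon_greedy(q_values, epsilon=EPSILON, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward, next_q_values, done, gamma=GAMMA):
    """One-step target: r if terminal, else r + gamma * max over next Q-values."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


# Hypothetical transition: state = (total particle count, coal rate).
memory = ReplayMemory(capacity=1000)
memory.push(((120, 0.98), 1, 1.0, (119, 0.97), False))
```

In a full training loop, a sampled batch from `memory` would be turned into targets with `td_target` and used to update the Q-network; that update step is omitted here because it depends on the network library in use.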
4. Experiment on Top-Coal Caving
4.1. DQN Model of Top-Coal Caving
4.2. Experiment and Result Analysis
5. Conclusions
- (1) The DQN method obtains noticeably more coal particles than the classical method at the cost of a very small increase in the rock rate. Over the 10 tests, the average number of coal particles is 658.7 for the classical method and 682.3 for the DQN, while the rock rate of the DQN rises by only 0.001.
- (2) The reward of the window’s action under the DQN is higher than under the classical method. Over the 10 tests, the average reward of the DQN is 633.7, whereas that of the classical method is 613.1. This means the DQN can yield more benefit than the classical method.
- (1) The state of the DQN is chosen as the total number of particles and the coal rate. At present, our method is used only in simulation; one obstacle to practical application is that the DQN needs these state data, which are difficult to obtain in practice. Hence, in future work, we will try to use a deep neural network to approximate the needed data from other geological information.
- (2) The DQN obtains the optimal Q-value by training, so it needs as much experience as possible; in practice, however, experience from top-coal caving is not as easy to collect as in simulation. Hence, in future work, the learning mechanism of the DQN will be studied to obtain a lightweight learning framework based on the state space.
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
6.8 m | 1.5 m | 3.8 m | 3 m | 2 m | 50° | 15° | 45° |
Material | Young’s Modulus (Pa) | Cohesion (Pa) | Density (kg/m³) | Friction Angle (°) | Poisson’s Ratio | Tensile Strength (Pa) | Normal Stiffness (Pa) | Shear Stiffness (Pa) |
---|---|---|---|---|---|---|---|---|
coal | 2 × 10⁸ | 2.06 × 10⁶ | 1373 | 44.82 | 0.29 | 6.9 × 10⁵ | 1.5 × 10⁶ | 1.13 × 10⁶ |
rock | 4 × 10⁸ | 2.11 × 10⁶ | 2542 | 33.6 | 0.23 | 1.51 × 10⁶ | 15.1 × 10⁶ | 1.13 × 10⁶ |
Parameter | Input Layer | Hidden Layer 1 | Hidden Layer 2 | Output Layer |
---|---|---|---|---|
Number of neurons | 2 | 56 | 128 | 2 |
Initial | 0.5 | 0.5 | 0.5 | 0.1 |
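To make the layer sizes in the table above concrete, the following is a minimal pure-Python sketch of a forward pass through a fully connected 2-56-128-2 network that maps a two-component state to one Q-value per action. Several points are assumptions: the ReLU activation on the hidden layers, the linear output layer, the uniform weight initialization (which does not reproduce the "Initial" row), the normalized example state, and all function names.

```python
import random

LAYER_SIZES = [2, 56, 128, 2]  # neuron counts taken from the table above


def init_network(sizes, rng):
    """Create one (weights, biases) pair per layer with small random weights."""
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        # Assumed initialization: uniform in [-0.1, 0.1], zero biases.
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, state):
    """Return one Q-value per action for the given state vector."""
    activation = list(state)
    for index, (weights, biases) in enumerate(layers):
        pre = [sum(w * x for w, x in zip(row, activation)) + b
               for row, b in zip(weights, biases)]
        is_output = index == len(layers) - 1
        # ReLU on hidden layers is an assumption; the output layer stays linear.
        activation = pre if is_output else [max(0.0, v) for v in pre]
    return activation


rng = random.Random(0)
network = init_network(LAYER_SIZES, rng)
# Hypothetical normalized state: (scaled total particle count, coal rate).
q_values = forward(network, [0.5, 0.98])
greedy_action = max(range(len(q_values)), key=lambda a: q_values[a])
```

With these sizes the network has 2·56 + 56 + 56·128 + 128 + 128·2 + 2 = 7722 trainable parameters, which is small enough to train quickly inside a simulation loop.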
No. | Coal Number (Cmethod) | Coal Number (DQN) | Rock Number (Cmethod) | Rock Number (DQN) | Coal Rate (Cmethod) | Coal Rate (DQN) | Rock Rate (Cmethod) | Rock Rate (DQN) | Reward (Cmethod) | Reward (DQN) |
---|---|---|---|---|---|---|---|---|---|---|
1 | 566 | 680 | 2 | 21 | 0.996 | 0.970 | 0.004 | 0.030 | 560 | 617 |
2 | 683 | 704 | 28 | 28 | 0.961 | 0.962 | 0.039 | 0.038 | 599 | 620 |
3 | 680 | 693 | 25 | 24 | 0.965 | 0.967 | 0.035 | 0.033 | 605 | 621 |
4 | 655 | 667 | 16 | 15 | 0.976 | 0.978 | 0.024 | 0.022 | 607 | 622 |
5 | 672 | 664 | 17 | 10 | 0.975 | 0.985 | 0.025 | 0.015 | 621 | 634 |
6 | 649 | 676 | 9 | 13 | 0.986 | 0.981 | 0.014 | 0.019 | 622 | 637 |
7 | 646 | 701 | 6 | 21 | 0.991 | 0.971 | 0.009 | 0.029 | 628 | 638 |
8 | 673 | 675 | 15 | 12 | 0.978 | 0.983 | 0.022 | 0.017 | 628 | 639 |
9 | 682 | 670 | 18 | 10 | 0.974 | 0.985 | 0.026 | 0.015 | 628 | 640 |
10 | 681 | 693 | 16 | 8 | 0.977 | 0.989 | 0.023 | 0.011 | 633 | 669 |
Average | 658.7 | 682.3 | 15.2 | 16.2 | 0.978 | 0.977 | 0.022 | 0.023 | 613.1 | 633.7 |
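The averages in the table above can be recomputed directly from the per-test rows, as in the short sketch below. It also checks one regularity in the tabulated numbers: in every test the reward equals the coal count minus three times the rock count. That relation is an observation about the data, and the penalty weight of 3 is inferred from it; neither is a reward definition quoted from the text. All function names are hypothetical.

```python
# (coal, rock, reward) per test, copied from the table above.
CMETHOD = [(566, 2, 560), (683, 28, 599), (680, 25, 605), (655, 16, 607), (672, 17, 621),
           (649, 9, 622), (646, 6, 628), (673, 15, 628), (682, 18, 628), (681, 16, 633)]
DQN = [(680, 21, 617), (704, 28, 620), (693, 24, 621), (667, 15, 622), (664, 10, 634),
       (676, 13, 637), (701, 21, 638), (675, 12, 639), (670, 10, 640), (693, 8, 669)]


def mean(values):
    return sum(values) / len(values)


def summarize(rows):
    """Average coal count, rock count, and reward over all tests."""
    coal = mean([r[0] for r in rows])
    rock = mean([r[1] for r in rows])
    reward = mean([r[2] for r in rows])
    return round(coal, 1), round(rock, 1), round(reward, 1)


def coal_rate(coal, rock):
    """Fraction of drawn particles that are coal."""
    return coal / (coal + rock)


def matches_linear_reward(rows, rock_penalty=3):
    """True if reward == coal - rock_penalty * rock in every test (inferred relation)."""
    return all(reward == coal - rock_penalty * rock for coal, rock, reward in rows)


print(summarize(CMETHOD))  # (658.7, 15.2, 613.1)
print(summarize(DQN))      # (682.3, 16.2, 633.7)
print(matches_linear_reward(CMETHOD), matches_linear_reward(DQN))  # True True
```

On these numbers the DQN gains 23.6 coal particles on average while drawing 1.0 more rock particle, which is where the 20.6-point difference in average reward comes from under the inferred weighting.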
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Yang, Y.; Li, X.; Li, H.; Li, D.; Yuan, R. Deep Q-Network for Optimal Decision for Top-Coal Caving. Energies 2020, 13, 1618. https://doi.org/10.3390/en13071618