The Application of a Decision Tree and Stochastic Forest Model in Summer Precipitation Prediction in Chongqing
Abstract
1. Introduction
2. Materials and Methods
2.1. Materials
2.2. Methods
2.2.1. The Decision Tree
2.2.2. Random Forest
- (1) If the size of the training set is N (50 in this paper, that is, 1961–2010), then for each tree N training samples are drawn at random with replacement from the training set (this sampling method is called bootstrap sampling) and used as the training set of that tree.
- (2) If the feature dimension of each sample is M, specify a constant m << M, randomly select a subset of m features (m = 20 in this paper) from the M features, and choose the optimal split among these m features each time the tree splits.
- (3) Every tree is grown to the largest extent possible, and there is no pruning.
- (4) A large number of decision trees are built according to steps (1)–(3), forming a random forest. The classification result is determined by the votes of the tree classifiers.
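Steps (1)–(4) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gini impurity split criterion, the function names, and the tiny toy data set are all invented here for demonstration (the paper uses N = 50 and m = 20, whereas the toy data has 6 samples and 3 features).

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (assumed split criterion)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, feature_ids):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score, n = None, gini(y), len(y)
    for f in feature_ids:
        for t in sorted({row[f] for row in X}):
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def grow_tree(X, y, m, rng):
    """Step (3): grow until nodes are pure or no split helps; no pruning."""
    if len(set(y)) == 1:
        return y[0]
    features = rng.sample(range(len(X[0])), m)  # step (2): m of M features
    split = best_split(X, y, features)
    if split is None:
        return Counter(y).most_common(1)[0][0]  # majority-class leaf
    f, t = split
    li = [i for i in range(len(y)) if X[i][f] <= t]
    ri = [i for i in range(len(y)) if X[i][f] > t]
    return (f, t,
            grow_tree([X[i] for i in li], [y[i] for i in li], m, rng),
            grow_tree([X[i] for i in ri], [y[i] for i in ri], m, rng))

def predict_tree(tree, row):
    """Walk from the root to a leaf; leaves are plain string labels."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def grow_forest(X, y, n_trees, m, seed=0):
    rng = random.Random(seed)
    n, forest = len(y), []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # step (1): bootstrap sample
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx], m, rng))
    return forest

def predict_forest(forest, row):
    """Step (4): the class with the most tree votes wins."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

# Toy data invented for illustration only (not the paper's predictors).
X_toy = [[-2.0, -1.0, -3.0], [-1.0, -2.0, -1.0], [-1.5, -0.5, -2.0],
         [1.0, 2.0, 0.5], [2.0, 0.5, 1.0], [1.5, 1.0, 3.0]]
y_toy = ["less", "less", "less", "more", "more", "more"]
forest = grow_forest(X_toy, y_toy, n_trees=25, m=2)
```

Calling `predict_forest(forest, row)` on a new feature vector returns the majority class. An odd number of trees avoids tied votes when there are two classes.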
2.2.3. Test Method
3. Results and Analysis of Precipitation Prediction in Summer Test
3.1. Decision Tree Model Test
3.2. Prediction Experiment of Random Forest Model in Summer
4. Conclusions and Discussion
- (1) Among the concurrent circulation indices that affect summer precipitation in Chongqing, the ridge line of the western Pacific subtropical high is a very important factor. However, if only this ridge line is considered, the trend forecast is correct in just 5 of the 8 years from 2011 to 2018. When the synergistic effects of the India subtropical high area and landfalling typhoons are also considered, the trend is predicted correctly in all 8 years, and the trend consistency rate rises by 37.5 percentage points. For the SST factors of the preceding winter, taking multiple factors into account likewise gives a correct precipitation trend in all 8 years, which is 12.5 percentage points higher than when only the Atlantic meridional mode SST is considered. This shows that in operational prediction, because the climate system results from the interaction of multiple factors and multiple systems, we need not only to analyze the characteristics and cycles of each part of the system separately, but also to study the integrated behavior of the whole system and the interactions among its subsystems. Using a decision tree to construct a multi-system collaborative impact model is an effective technical approach. It improves prediction accuracy and, unlike the fully black-box behavior of a neural network, the model built by a decision tree exposes the influencing processes relatively clearly, so it has good application prospects for research such as mechanism analysis.
- (2) When random forest is used to predict summer precipitation in Chongqing, its Ps, Cc and PC scores for 2014 to 2018 are on average higher than those of the released forecasts. Whereas the quality of the released forecasts is unstable, the quality of the random forest forecasts is relatively stable. The results show that it is feasible to use the random forest algorithm to predict summer precipitation in Chongqing in operational forecasting. In addition, the random forest algorithm does not place high requirements on the data: it does not need to consider constraints such as distribution conditions, interactions, nonlinear effects, or even missing values of the variables. In most cases, the default parameters of the model give good simulation results without tedious parameter tuning. Therefore, the random forest algorithm has good prospects for application in operational climate prediction.
Author Contributions
Funding
Conflicts of Interest
Situation | Condition 1 | Condition 2 | Condition 3 | Condition 4 |
---|---|---|---|---|
1 | Northerly “+” of the western Pacific subtropical ridge (74%) | More landfall typhoons “+” (88%) | ||
2 | Northerly “+” of the western Pacific subtropical ridge (74%) | Fewer landfall typhoons “−” (50%) | North Atlantic North American subtropical high northern boundary is southerly (83%) | |
3 | Southerly “−” of the western Pacific | India’s subtropical high is larger (75%) | ||
4 | Southerly “−” of the western Pacific subtropical ridge (30%) | India’s sub-high area is relatively small “−” (21%) | Scandinavian teleconnection positive “+” (40%) | 30 hPa zonal wind “+” (100%) |
5 | Northerly “+” of the western Pacific subtropical ridge (26%) | Fewer landfall typhoons “−” (50%) | North Atlantic north American subtropical high northern boundary “+” (100%) | |
6 | Southerly “−” of the western Pacific subtropical high ridge (70%) | India’s sub-high area is smaller (79%) | Scandinavian teleconnection negative “−” (100%) | |
7 | Southerly “−” of the western Pacific subtropical high ridge (70%) | India’s sub-high area is smaller (79%) | Scandinavian teleconnection positive “+” (60%) | 30 hPa zonal wind is relatively small “−” (86%) |
Year | Ridge Line of Western Pacific Subtropical High | Landfall Typhoon | India Sub-High Area | North Atlantic North American Subtropical High Northern Boundary | Scandinavian Teleconnection Pattern | 30 hPa Zonal Wind | Prediction | Observation (%) |
---|---|---|---|---|---|---|---|---|
2011 | 2.22 | 0.14 | −0.04 | 0.48 | 0.04 | −0.33 | less | −30.5 |
2012 | 2.07 | 0.18 | 0.07 | 0.91 | −0.2 | −5.69 | less | −22.1 |
2013 | −0.10 | 0.81 | 0.09 | 1.5 | 0 | 4.56 | less | −26.1 |
2014 | −1.85 | −0.52 | −0.02 | 0.12 | −0.16 | −3.19 | more | 6.3 |
2015 | 0.62 | −0.19 | 0.29 | 1.39 | −0.22 | 2.72 | more | 11.7 |
2016 | −0.49 | −0.19 | 0.36 | 2.73 | −0.36 | 0.72 | less | 9.5 |
2017 | −2.48 | 0.48 | 0.03 | 2.31 | −0.41 | −1.66 | more | −14.4 |
2018 | 3.79 | 1.14 | 0.03 | 2.14 | −0.17 | −4.5 | less | −24.8 |
Situation | Condition 1 | Condition 2 | Condition 3 | Condition 4 |
---|---|---|---|---|
1 | High Atlantic meridional SST “+” (60%) | NINOA low “−” (75%) | Cold tongue type ENSO index is larger “+” (90%) | |
2 | High Atlantic meridional SST “+” (60%) | NINOA low “−” (75%) | Cold tongue type ENSO index is small “−” (50%) | Warm pool index of the western hemisphere is relatively small “−” (75%) |
3 | High Atlantic meridional SST “+” (60%) | NINOA high “+” (33%) | Warm pool index of the western hemisphere is larger “−” (50%) | NINO4 low “−” (100%) |
4 | Low Atlantic meridional SST “−” (48%) | NINOA high “+” (67%) | High SST in Oyashio region “+” (100%) | |
5 | Low Atlantic meridional SST “−” (48%) | NINOA high “+” (67%) | Low SST in Oyashio region “−” (25%) | Subtropical southern Indian Ocean dipole is positive “+” (100%) |
6 | Low Atlantic meridional SST “−” (48%) | NINOA low “−” (38%) | Low SST in Oyashio region “−” (56%) | Cold tongue type ENSO index is small “−” (80%) |
7 | High Atlantic meridional SST “+” (40%) | NINOA high “+” (67%) | Warm pool index of the western hemisphere is relatively small “−” (100%) | |
8 | High Atlantic meridional SST “+” (40%) | NINOA high “+” (67%) | Warm pool index of the western hemisphere is larger “+” (50%) | NINO4 high “+” (75%) |
9 | High Atlantic meridional SST “+” (40%) | NINOA low “−” (67%) | Cold tongue type ENSO index is small “−” (50%) | Warm pool index of the western hemisphere is larger “+” (100%) |
10 | Low Atlantic meridional SST “−” (52%) | NINOA high “+” (33%) | Low SST in Oyashio region “−” (75%) | Subtropical southern Indian Ocean dipole is small “−” (100%) |
11 | Low Atlantic meridional SST “−” (52%) | NINOA low “−” (63%) | High SST in Oyashio region “+” (85%) | |
12 | Low Atlantic meridional SST “−” (52%) | NINOA low “−” (63%) | Low SST in Oyashio region “−” (44%) | Cold tongue type ENSO index is larger “+” (75%) |
Year | Atlantic Meridional SST | NINOA | Western Hemisphere Warm Pool Index | Cold Tongue Type ENSO Index | SST in Oyashio Region | NINO4 | Warm Pool Type ENSO Index | Subtropical Southern Indian Ocean Dipole | Tropical Indian Ocean Dipole | Prediction | Observation (%) |
---|---|---|---|---|---|---|---|---|---|---|---|
2011 | 6.04 | −0.23 | −0.1 | −0.46 | 1.92 | −0.94 | −0.83 | 0.6 | 0.12 | less | −30.5 |
2012 | 2.98 | −0.09 | −0.05 | −0.5 | 0.96 | −0.75 | −0.34 | −0.41 | 0.17 | less | −22.1 |
2013 | 2.8 | −0.57 | 0.14 | 0.16 | 1.45 | −0.06 | −0.77 | −0.61 | 0.13 | less | −26.1 |
2014 | 0.57 | −0.32 | 0.29 | −0.06 | 1.24 | −0.25 | −0.38 | −0.53 | −0.05 | more | 6.3 |
2015 | −0.41 | −0.33 | 0.16 | 0.49 | −0.28 | 0.68 | 0.2 | −0.26 | −0.14 | more | 11.7 |
2016 | −0.6 | 0.71 | 2.38 | 0.24 | −0.52 | 1.2 | 1.91 | −0.43 | −0.28 | more | 9.5 |
2017 | 1.79 | 0.46 | 0.35 | −0.29 | −0.35 | −0.37 | −0.08 | 0.53 | 0.04 | less | −14.4 |
2018 | 2.68 | 0.09 | 0.3 | 0.46 | 0.85 | −0.94 | −0.82 | 0.06 | 0.47 | less | −24.8 |
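The trend consistency rate for the table above can be checked with a few lines of Python. This is a sketch under two assumptions that the extracted text does not state explicitly: that "Observation (%)" is the precipitation anomaly percentage, and that a forecast counts as a hit when "more" coincides with a positive anomaly and "less" with a non-positive one. The function name is hypothetical.

```python
def trend_consistency_rate(predicted, observed_pct):
    """Percentage of years in which the predicted trend matches the anomaly sign.

    Assumption: "more" is a hit when the observed anomaly is positive,
    "less" is a hit when it is zero or negative.
    """
    hits = sum((p == "more") == (o > 0) for p, o in zip(predicted, observed_pct))
    return 100.0 * hits / len(predicted)

# Prediction and Observation (%) columns of the table above, 2011-2018.
predicted = ["less", "less", "less", "more", "more", "more", "less", "less"]
observed_pct = [-30.5, -22.1, -26.1, 6.3, 11.7, 9.5, -14.4, -24.8]

multi_factor_rate = trend_consistency_rate(predicted, observed_pct)

# One missed year out of 8 lowers the rate by one eighth of 100 points.
one_year_in_points = 100.0 / len(predicted)

print(f"multi-factor rate: {multi_factor_rate:.1f}%")
print(f"each year is worth {one_year_in_points:.1f} percentage points")
```

With 8 verification years, each year is worth 12.5 percentage points, so a difference of 12.5 points corresponds to one year and a difference of 37.5 points to three years.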
Year | Ps (Random Forest) | Ps (Report) | Cc (Random Forest) | Cc (Report) | PC (Random Forest) | PC (Report) |
---|---|---|---|---|---|---|
2011 | 82.2 | | 0.12 | | 61.8 | |
2012 | 95.2 | | 0.55 | | 85.3 | |
2013 | 90.3 | | 0.28 | | 73.5 | |
2014 | 78.8 | 58.8 | 0.26 | 0.15 | 58.8 | 38.2 |
2015 | 79.4 | 58.2 | 0.03 | −0.58 | 58.8 | 29.4 |
2016 | 86.1 | 83.9 | 0.31 | −0.06 | 70.6 | 73.5 |
2017 | 85.0 | 85.7 | 0.30 | 0.25 | 64.7 | 67.6 |
2018 | 93.5 | 75.4 | 0.44 | −0.37 | 82.4 | 55.9 |
2014–2018 mean | 84.6 | 72.4 | 0.27 | −0.12 | 67.1 | 52.9 |
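The 2014–2018 summary row can be reproduced from the yearly scores. The sketch below assumes that the summary row is the arithmetic mean over the five years and that "Report" denotes the released operational forecast. The variable and function names are hypothetical.

```python
from statistics import mean

YEARS = [2014, 2015, 2016, 2017, 2018]

# Yearly scores copied from the table above (random forest vs. released report).
SCORES = {
    "Ps": {"random_forest": [78.8, 79.4, 86.1, 85.0, 93.5],
           "report": [58.8, 58.2, 83.9, 85.7, 75.4]},
    "Cc": {"random_forest": [0.26, 0.03, 0.31, 0.30, 0.44],
           "report": [0.15, -0.58, -0.06, 0.25, -0.37]},
    "PC": {"random_forest": [58.8, 58.8, 70.6, 64.7, 82.4],
           "report": [38.2, 29.4, 73.5, 67.6, 55.9]},
}

def summarize(scores):
    """Mean score per source and the number of years the random forest scores higher."""
    summary = {}
    for name, by_source in scores.items():
        rf, rep = by_source["random_forest"], by_source["report"]
        summary[name] = {
            "rf_mean": mean(rf),
            "report_mean": mean(rep),
            "rf_wins": sum(a > b for a, b in zip(rf, rep)),
        }
    return summary

summary = summarize(SCORES)
for name, s in summary.items():
    print(f"{name}: random forest {s['rf_mean']:.2f}, report {s['report_mean']:.2f}, "
          f"random forest higher in {s['rf_wins']} of {len(YEARS)} years")
```

The year-by-year count is included because a higher mean does not imply a higher score in every single year.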
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Xiang, B.; Zeng, C.; Dong, X.; Wang, J. The Application of a Decision Tree and Stochastic Forest Model in Summer Precipitation Prediction in Chongqing. Atmosphere 2020, 11, 508. https://doi.org/10.3390/atmos11050508