Identification and Temporal Distribution of Typical Rainfall Types Based on K-Means++ Clustering and Probability Distribution Analysis
Abstract
1. Introduction
2. Overview of the Study Area and Data
3. Methodology
3.1. Data Pre-Processing Methods
- (1) Definition of independent rainfall events
- (2) Establishing a minimum rainfall criterion
- (3) Screening of typical rainfall events
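
The first two pre-processing steps can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function name `split_events` and the thresholds `min_dry_hours` and `min_total_mm` are hypothetical, and the default values are placeholders rather than values taken from this article.

```python
def split_events(hourly_mm, min_dry_hours=6, min_total_mm=2.0):
    """Split an hourly rainfall series into independent events.

    A new event starts once at least `min_dry_hours` consecutive dry hours
    have passed. Events whose total depth is below `min_total_mm` are dropped.
    Both thresholds are illustrative assumptions, not values from the article.
    Returns a list of (start_index, hourly_values) tuples.
    """
    events = []
    current = []   # hourly depths of the event being built
    start = None   # index of the first wet hour of that event
    dry_run = 0    # consecutive dry hours since the last wet hour

    for i, depth in enumerate(hourly_mm):
        if depth > 0:
            if start is None:
                start = i
            else:
                # Short dry gaps stay inside the event as zero-rain hours.
                current.extend([0.0] * dry_run)
            current.append(float(depth))
            dry_run = 0
        elif start is not None:
            dry_run += 1
            if dry_run >= min_dry_hours:
                # The dry spell is long enough to close the event.
                events.append((start, current))
                current, start, dry_run = [], None, 0

    if start is not None:
        events.append((start, current))

    # Minimum rainfall criterion: keep only events with enough total depth.
    return [(s, ev) for s, ev in events if sum(ev) >= min_total_mm]
```

Trailing dry hours are not stored with an event, so each returned list runs from the first to the last wet hour.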
3.2. Clustering Algorithm
3.2.1. Principal Component Analysis (PCA)
- (1) Construct the initial data matrix $X = (x_{ij})_{n \times m}$ from the n samples and their m characteristic values, where $x_{ij}$ is the j-th characteristic value of the i-th sample.
- (2) Standardize the data to eliminate dimensional differences, which gives the standardized matrix.
- (3) Compute the covariance matrix of the standardized data and solve for its eigenvalues and eigenvectors.
- (4) Determine the number of principal components using Kaiser’s criterion (eigenvalues greater than 1) together with a cumulative contribution rate greater than 90%, and construct the low-dimensional feature space from the retained components.
- (5) Use the dimensionality-reduced principal component features as input to the K-means++ cluster analysis.
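
Steps (1) to (4) can be sketched with NumPy as below. This is an illustrative sketch under stated assumptions: the function name `pca_reduce` and the parameter `min_cumulative` are hypothetical, the sample standard deviation (`ddof=1`) is assumed for standardization, and only the cumulative-contribution threshold is applied when choosing the number of components.

```python
import numpy as np

def pca_reduce(X, min_cumulative=0.90):
    """PCA on an (n_samples, m_features) matrix.

    Returns (scores, eigenvalues, cumulative_contribution, k), where k is the
    number of retained components. Assumes no feature column is constant.
    """
    X = np.asarray(X, dtype=float)

    # Step 2: z-score standardization removes dimensional differences.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

    # Step 3: covariance matrix of the standardized data and its
    # eigen-decomposition (eigh is suited to symmetric matrices).
    cov = np.cov(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]          # sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Step 4: keep the fewest components whose cumulative contribution
    # rate exceeds the threshold.
    contribution = eigvals / eigvals.sum()
    cumulative = np.cumsum(contribution)
    k = int(np.argmax(cumulative > min_cumulative)) + 1

    # Project the standardized data onto the retained components.
    scores = Z @ eigvecs[:, :k]
    return scores, eigvals, cumulative, k
```

The returned `scores` array is what would be passed to the clustering step; the eigenvalues can also be inspected against Kaiser’s criterion before settling on `k`.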
3.2.2. K-Means++ Cluster Analysis
- (1) Choose one sample point uniformly at random to serve as the first cluster center.
- (2) For each sample x not yet selected, compute the minimum distance D(x) to the existing cluster centers. Then choose the next center with probability proportional to D(x)² (weighted probability selection), so that points farther from the existing centers have a higher chance of being chosen.
- (3) Repeat step (2) until k initial cluster centers have been selected.
- (4) Run the standard K-means algorithm for iterative optimization until the cluster centers converge or another termination condition is satisfied.
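
The seeding and iteration steps above can be sketched as follows. This is a minimal NumPy illustration rather than the implementation used in the study: the function names, the squared Euclidean distance, the convergence tolerance `tol`, and the iteration cap `max_iter` are all assumptions, and the input is assumed to contain at least k distinct points.

```python
import numpy as np

def kmeans_pp_init(X, k, rng):
    """Pick k initial centers with K-means++ seeding (steps 1 to 3)."""
    n = X.shape[0]
    centers = [X[rng.integers(n)]]                 # step 1: uniform random pick
    for _ in range(1, k):
        # D(x)^2: squared distance from each sample to its nearest center.
        diffs = X[:, None, :] - np.asarray(centers)[None, :, :]
        d2 = (diffs ** 2).sum(axis=2).min(axis=1)
        probs = d2 / d2.sum()                      # step 2: weights ~ D(x)^2
        centers.append(X[rng.choice(n, p=probs)])
    return np.asarray(centers)

def nearest_center(X, centers):
    """Index of the closest center for every sample."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(X, k, max_iter=100, tol=1e-8, seed=0):
    """Standard K-means iterations started from K-means++ centers (step 4)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = kmeans_pp_init(X, k, rng)
    for _ in range(max_iter):
        labels = nearest_center(X, centers)
        # Move each center to the mean of its members; keep it if empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:                            # centers have converged
            break
    return nearest_center(X, centers), centers
```

Already chosen centers have D(x) = 0 and therefore zero selection probability, which is why the seeding never picks the same sample twice when the points are distinct.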
3.3. Probability Distribution Analysis Method
4. Results and Discussion
4.1. Clustering Results
4.2. Results of the Temporal Distribution of Typical Rainfall Events
4.3. Discussion
- (1) How these typical patterns interact with Hangzhou’s urban landscape and drainage infrastructure.
- (2) Investigating pattern stability under climate change scenarios would support long-term planning, particularly relevant given Liu et al.’s [13] findings on changing precipitation patterns in eastern China.
- (3) Additionally, expanding this analysis to include sub-hourly rainfall data and seasonal variations would provide a more comprehensive understanding of Hangzhou’s precipitation characteristics, especially in the context of East Asian monsoon influences.
5. Conclusions
- (1) Rainfall in central Hangzhou is predominantly characterized by Type I rainfall: short-duration, concentrated single-peak events whose peak typically occurs in the middle to late portion of the event. Type I rainfall accounts for over 96% of all rainfall events across the selected recurrence periods (1, 2, 3, and 5 years), which were chosen to represent common precipitation scenarios relevant for integrated urban water management. This prevalence establishes Type I as the most representative precipitation type for central Hangzhou’s typical hydrological conditions.
- (2) Through secondary classification of Type I rainfall events based on peak position (early peak, middle peak, and late peak), we identified distinct statistical distributions across different recurrence periods. The proportion of late-peak events is higher for longer recurrence periods, while early-peak events constitute a larger percentage for shorter recurrence periods. This distribution characteristic is significant for hydrological applications because it indicates that more severe typical rainfall events in central Hangzhou are more likely to exhibit late-peak features.
- (3) The probability density function curves for rainfall events at all peak positions show higher peaks and narrower distributions as the recurrence period increases. This indicates that typical rainfall events with longer recurrence periods exhibit shorter durations, higher intensities, and greater temporal concentration. The late-peak type at the 5-year recurrence period shows a slightly earlier peak position, suggesting potential nonlinear variations that warrant further investigation.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Paschalis, A.; Fatichi, S.; Molnar, P.; Rimkus, S.; Burlando, P. On the Effects of Small Scale Space–Time Variability of Rainfall on Basin Flood Response. J. Hydrol. 2014, 514, 313–327.
- Ghate, A.S.; Timbadiya, P.V. True Interval Non-Stationary Intensity-Duration-Frequency Curves under Changing Climate and Effect of Temporal Discretisation on Rainfall Extremes. J. Hydrol. 2023, 618, 129136.
- IPCC. 2023: Climate Change 2023: Synthesis Report. 2023. Available online: https://www.ipcc.ch/report/sixth-assessment-report-cycle/ (accessed on 1 September 2024).
- Sun, Q.; Zhang, X.; Zwiers, F.; Westra, S.; Alexander, L.V. A Global, Continental, and Regional Analysis of Changes in Extreme Precipitation. J. Clim. 2021, 34, 243–258.
- Hirabayashi, Y.; Mahendran, R.; Koirala, S.; Konoshima, L.; Yamazaki, D.; Watanabe, S.; Kim, H.; Kanae, S. Global flood risk under climate change. Nat. Clim. Change 2013, 3, 816–821.
- Zhang, D. Rapid urbanization and more extreme rainfall events. Sci. Bull. 2020, 65, 516–518.
- Ren, X.; Hong, N.; Li, L.; Kang, J.; Li, J. Effect of infiltration rate changes in urban soils on stormwater runoff process. Geoderma 2020, 363, 114158.
- Ding, Y.; Wang, H.; Liu, Y.; Chai, B.; Bin, C. The spatial overlay effect of urban waterlogging risk and land use value. Sci. Total Environ. 2024, 947, 174290.
- Zhou, Z.; Smith, J.A.; Baeck, M.L.; Wright, D.B.; Smith, B.K.; Liu, S. The Impact of the Spatiotemporal Structure of Rainfall on Flood Frequency over a Small Urban Watershed: An Approach Coupling Stochastic Storm Transposition and Hydrologic Modeling. Hydrol. Earth Syst. Sci. 2021, 25, 4701–4717.
- Cristiano, E.; ten Veldhuis, M.-C.; van de Giesen, N. Spatial and Temporal Variability of Rainfall and Their Effects on Hydrological Response in Urban Areas—A Review. Hydrol. Earth Syst. Sci. 2017, 21, 3859–3878.
- Qi, W.; Liu, Y.; Ma, C.; Xu, H.; Lian, J.; Xu, K.; Yao, Y. A combined qualitative–quantitative method for adaptive configuration of urban flood mitigation measure. Urban Clim. 2024, 56, 102004.
- Zhu, Z.; Wright, D.B.; Yu, G. The Impact of Rainfall Space-Time Structure in Flood Frequency Analysis. Water Resour. Res. 2018, 54, 8983–8998.
- Liu, J.; Li, B.; Ma, M. Spatiotemporal Variation and Causes of Typical Extreme Precipitation Events in Shandong Province over the Last 50 Years. Remote Sens. 2023, 16, 1283.
- Chen, G.; Hou, J.; Liu, Y.; Xue, S.; Wu, H.; Wang, T.; Lv, J.; Jing, J.; Yang, S. Urban inundation rapid prediction method based on multi-machine learning algorithm and rain pattern analysis. J. Hydrol. 2024, 633, 131059.
- Oh, B.; Kim, J.; Hwang, S. Influence of Rainfall Patterns on Rainfall–Runoff Processes: Indices for the Quantification of Temporal Distribution of Rainfall. Water 2023, 16, 2904.
- Ghajarnia, N.; Arasteh, P.; Araghinejad, S.; Liaghat, M. The hybrid Bayesian-SVD based method to detect false alarms in PERSIANN precipitation estimation product using related physical parameters. J. Hydrol. 2016, 538, 640–650.
- An, D.; Du, Y.; Berndtsson, R.; Niu, Z.; Zhang, L.; Yuan, F. Evidence of climate shift for temperature and precipitation extremes across Gansu Province in China. Theor. Appl. Climatol. 2020, 139, 1137–1149.
- Guenni, L.; Degaetano, A.; Subba Rao, T.; Serio, G. A model for seasonal variation of rainfall at Adelaide and Turen. Ecol. Model. 1996, 85, 203–217.
- Machiwal, D.; Jha, M.K. Comparative evaluation of statistical tests for time series analysis: Application to hydrological time series. Hydrol. Sci. J. 2008, 53, 353–366.
- Jin, H.; Chen, X.; Wu, P.; Song, C.; Xia, W. Evaluation of spatial-temporal distribution of precipitation in mainland China by statistic and clustering methods. Atmos. Res. 2021, 262, 105772.
- Huff, F.A. Time distribution of rainfall in heavy storms. Water Resour. Res. 1967, 3, 1007–1019.
- Pilgrim, D.H.; Cordery, I. Rainfall temporal patterns for design floods. J. Hydraul. Div. 1975, 101, 81–95.
- Willems, P. Compound intensity/duration/frequency-relationships of extreme precipitation for two seasons and two storm types. J. Hydrol. 2000, 233, 189–205.
- Koutsoyiannis, D.; Kozonis, D.; Manetas, A. A mathematical framework for studying rainfall intensity-duration-frequency relationships. J. Hydrol. 1998, 206, 118–135.
- Tanguy, M.; Chokmani, K.; Bernier, M.; Poulin, J.; Raymond, S. River flood mapping in urban areas combining Radarsat-2 data and flood return period data. Remote Sens. Environ. 2017, 198, 442–459.
- Guo, Y.P.; Adams, B.J. Hydrologic analysis of urban catchments with event-based probabilistic models: 1. Runoff volume. Water Resour. Res. 1998, 34, 3421–3431.
- Medina-Cetina, Z.; Nadim, F. Stochastic design of an early warning system. Georisk Assess. Manag. Risk Eng. Syst. Geohazards 2008, 2, 223–236.
- Fletcher, T.D.; Andrieu, H.; Hamel, P. Understanding, management and modelling of urban hydrology and its consequences for receiving waters: A state of the art. Adv. Water Resour. 2013, 51, 261–279.
- Arnbjerg-Nielsen, K.; Willems, P.; Olsson, J.; Beecham, S.; Pathirana, A.; Bülow Gregersen, I.; Madsen, H.; Nguyen, V.T.V. Impacts of climate change on rainfall extremes and urban drainage systems: A review. Water Sci. Technol. 2013, 68, 16–28.
- Burns, M.J.; Fletcher, T.D.; Walsh, C.J.; Ladson, A.R.; Hatt, B.E. Hydrologic shortcomings of conventional urban stormwater management and opportunities for reform. Landsc. Urban Plan. 2012, 105, 230–240.
- Qiao, Y.; Wang, Y.; Jin, N.; Zhang, S.; Giustozzi, F.; Ma, T. Assessing flood risk to urban road users based on rainfall scenario simulations. Transp. Res. Part D Transp. Environ. 2023, 123, 103919.
- Jiang, Q.; Li, W.; Fan, Z.; He, X.; Sun, W.; Chen, S.; Wen, J.; Gao, J.; Wang, J. Evaluation of the ERA5 Reanalysis Precipitation Dataset over Chinese Mainland. J. Hydrol. 2021, 595, 125660.
- Gomis-Cebolla, J.; Rattayova, V.; Salazar-Galán, S.; Francés, F. Evaluation of ERA5 and ERA5-Land Reanalysis Precipitation Datasets over Spain (1951–2020). Atmos. Res. 2023, 284, 106606.
- Dunkerley, D. Identifying individual rain events from pluviograph records: A review with analysis of data from an Australian dryland site. Hydrol. Process. 2008, 22, 5024–5036.
- Bell, C.D.; McMillan, S.K.; Clinton, S.M.; Jefferson, A.J. Hydrologic response to stormwater control measures in urban watersheds. J. Hydrol. 2016, 541, 1488–1500.
- Greenacre, M.; Groenen, P.J.; Hastie, T.; Iodice, A.; Markos, A.; Tuzhilina, E. Principal component analysis. Nat. Rev. Methods Primers 2022, 2, 100.
- Chatterjee, S.; Khan, A.; Akbari, H.; Wang, Y. Monotonic trends in spatio-temporal distribution and concentration of monsoon precipitation (1901–2002), West Bengal, India. Atmos. Res. 2016, 182, 54–75.
- Zhang, Z.; Chen, X.; Wang, C.; Wang, R.; Song, W.; Nie, F. Structured multi-view k-means clustering. Pattern Recognit. 2025, 160, 111113.
- Abdi, A.; Hassanzadeh, Y.; Ouarda, T.B. Regional frequency analysis using Growing Neural Gas network. J. Hydrol. 2017, 550, 92–102.
- Kapoor, A.; Singhal, A. A Comparative Study of K-Means, K-Means++ and Fuzzy C-Means Clustering Algorithms. In Proceedings of the 2017 3rd International Conference on Computational Intelligence & Communication Technology (CICT), Ghaziabad, India, 9–10 February 2017; pp. 1–6.
- Deka, P.; Saha, U. Introduction of k-means clustering into random cascade model for disaggregation of rainfall from daily to 1-hour resolution with improved preservation of extreme rainfall. J. Hydrol. 2023, 620, 129478.
- Yuan, W.; Tu, X.; Su, C.; Liu, M.; Yan, D.; Wu, Z. Research on the Critical Rainfall of Flash Floods in Small Watersheds Based on the Design of Characteristic Rainfall Patterns. Water Resour. Manag. 2021, 35, 3297–3319.
- Ye, Z.; Ding, L.; Liu, Z.; Chen, F. Research on the joint adjustment model of regional water resource network based on the network flow theory. Aqua 2024, 73, 608–622.
- Wang, H.; Hu, Y.; Guo, Y.; Wu, Z.; Yan, D. Urban flood forecasting based on the coupling of numerical weather model and stormwater model: A case study of Zhengzhou city. J. Hydrol. Reg. Stud. 2022, 39, 100985.
- Yin, S.Q.; Wang, Y.; Xie, Y.; Liu, A.L. Characteristics of intra-storm temporal pattern over China. Adv. Water Sci. 2014, 25, 617–624.

Category | Characteristic Value Name | Formula | Formula Meaning |
---|---|---|---|
Basic rainfall characteristics | Rainfall duration ($T_d$) | $T_d = t_e - t_s$ | $t_s$ is the time when the rainfall starts, and $t_e$ is the time when the rainfall ends. |
Basic rainfall characteristics | Cumulative rainfall ($P$) | $P = \sum_{i=1}^{T_d} p_i$ | $p_i$ is the rainfall in the i-th hour, and $T_d$ is the rainfall duration. |
Rainfall distribution characteristics | Precipitation concentration index (PCI) | | $p_i$ is the rainfall in the i-th hour, and $P$ is the cumulative rainfall. The larger the PCI value, the more concentrated the rainfall is in time [37]. |
Rainfall distribution characteristics | Coefficient of variation (CV) | $\mathrm{CV} = \sigma / \mu$ | $\sigma$ is the standard deviation of rainfall, and $\mu$ is the mean rainfall. |
Rainfall peak characteristics | Peak precipitation ratio | $p_{\max} / P$ | $p_{\max}$ is the maximum hourly rainfall, and $P$ is the cumulative rainfall. |
Rainfall peak characteristics | Peak position ratio | $t_p / T_d$ | $t_p$ is the time when the peak rainfall occurs, and $T_d$ is the rainfall duration. |
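
The characteristic values in the table above can be computed for a single event as in the following sketch. Several choices here are assumptions, not details confirmed by this article: the function name `rainfall_features` and the dictionary keys are hypothetical, PCI is taken in the commonly used form 100·Σpᵢ²/P², the standard deviation is the population form, and the peak time is the 1-based hour of the first maximum.

```python
import statistics

def rainfall_features(hourly_mm):
    """Characteristic values of one event from its hourly depths.

    `hourly_mm` runs from the first to the last wet hour of the event.
    """
    duration = len(hourly_mm)              # rainfall duration in hours
    total = sum(hourly_mm)                 # cumulative rainfall

    # Assumed PCI form: 100 * sum(p_i^2) / P^2 (larger = more concentrated).
    pci = 100.0 * sum(p * p for p in hourly_mm) / total ** 2

    # Coefficient of variation: standard deviation over mean hourly rainfall.
    cv = statistics.pstdev(hourly_mm) / (total / duration)

    peak = max(hourly_mm)                  # maximum hourly rainfall
    peak_hour = hourly_mm.index(peak) + 1  # 1-based hour of the first peak

    return {
        "duration": duration,
        "cumulative": total,
        "pci": pci,
        "cv": cv,
        "peak_ratio": peak / total,
        "peak_position_ratio": peak_hour / duration,
    }
```

Stacking these dictionaries row by row for all events gives a feature matrix of the kind that PCA takes as input.
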

Principal Components | Eigenvalue | Contribution Rate (%) | Cumulative Contribution Rate (%) |
---|---|---|---|
PC1 | 5.61 | 43.09 | 43.09 |
PC2 | 3.10 | 23.80 | 66.89 |
PC3 | 1.97 | 15.14 | 82.03 |
PC4 | 1.36 | 10.47 | 92.50 |

Peak Position | Peak Position Ratio Range | Recurrence Period T = 1 | Recurrence Period T = 2 | Recurrence Period T = 3 | Recurrence Period T = 5 |
---|---|---|---|---|---|
Early peak | 0 to 0.4 | 17.6% | 10.0% | 9.5% | 0% |
Middle peak | 0.4 to 0.6 | 47.1% | 45.0% | 42.9% | 33.3% |
Late peak | 0.6 to 1 | 35.3% | 45.0% | 47.6% | 66.7% |
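
The secondary classification by peak position can be sketched as below, using the thresholds 0.4 and 0.6 from the table. The function names are hypothetical, and assigning the boundary values 0.4 and 0.6 to the later class is an assumption, since the table does not show which side is inclusive.

```python
from collections import Counter

def classify_peak_position(ratio):
    """Label an event as early, middle, or late peak from its peak position ratio."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("peak position ratio must lie in [0, 1]")
    if ratio < 0.4:
        return "early"
    if ratio < 0.6:      # assumed: 0.4 itself counts as middle peak
        return "middle"
    return "late"        # assumed: 0.6 itself counts as late peak

def peak_type_shares(ratios):
    """Percentage of events in each peak class for one group of events."""
    counts = Counter(classify_peak_position(r) for r in ratios)
    n = len(ratios)
    return {name: 100.0 * counts[name] / n for name in ("early", "middle", "late")}
```

Applying `peak_type_shares` separately to the events of each recurrence period would produce one column of percentages per period.
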
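
Rain Pattern | Recurrence Period | Best Fit Function | Function Formula |
---|---|---|---|
Early-peak rainfall | T = 1 | Logistic5 | |
Early-peak rainfall | T = 2 | Logistic5 | |
Early-peak rainfall | T = 3 | Logistic5 | |
Middle-peak rainfall | T = 1 | Logistic5 | |
Middle-peak rainfall | T = 2 | Logistic5 | |
Middle-peak rainfall | T = 3 | Logistic5 | |
Middle-peak rainfall | T = 5 | Logistic5 | |
Late-peak rainfall | T = 1 | Logistic5 | |
Late-peak rainfall | T = 2 | Logistic5 | |
Late-peak rainfall | T = 3 | Logistic5 | |
Late-peak rainfall | T = 5 | Logistic5 | |
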
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, Q.; Qian, J. Identification and Temporal Distribution of Typical Rainfall Types Based on K-Means++ Clustering and Probability Distribution Analysis. Hydrology 2025, 12, 88. https://doi.org/10.3390/hydrology12040088