Article

Investigation of the Regularities of the Influence of Meteorological Factors on Avalanches in Eastern Kazakhstan

by Marzhan Rakhymberdina 1, Natalya Denissova 2, Yerkebulan Bekishev 1, Gulzhan Daumova 1,*, Milan Konečný 3, Zhanna Assylkhanova 1 and Azamat Kapasov 1
1 School of Earth Sciences, D. Serikbayev East Kazakhstan Technical University, Ust-Kamenogorsk 070000, Kazakhstan
2 Department of Information Technology, D. Serikbayev East Kazakhstan Technical University, Ust-Kamenogorsk 070000, Kazakhstan
3 Laboratory on Geoinformatics and Cartography, Faculty of Science, Masaryk University, Kotlářská 2, 611 37 Brno, Czech Republic
* Author to whom correspondence should be addressed.
Atmosphere 2025, 16(6), 723; https://doi.org/10.3390/atmos16060723
Submission received: 21 April 2025 / Revised: 4 June 2025 / Accepted: 11 June 2025 / Published: 15 June 2025
(This article belongs to the Section Meteorology)

Abstract

This paper studies the influence of meteorological factors on avalanche occurrence in East Kazakhstan using modern data analysis methods. A dataset of 111 avalanche events in nine avalanche-prone areas of the region, recorded between 2012 and 2023, was compiled. Primary data on avalanche dates were obtained from the Department of Emergency Situations of East Kazakhstan Region (DES EKR), and meteorological data were sourced from the Kazhydromet website. Descriptive statistics, correlation analysis, principal component analysis (PCA), K-means clustering, and the DBSCAN algorithm were used for the analysis. Correlation analysis revealed significant relationships between avalanches and meteorological parameters such as air temperature, snow cover depth, wind speed and direction, precipitation, and relative humidity, as well as both negative and positive relationships among the meteorological parameters themselves. PCA of the meteorological conditions preceding avalanches identified the main weather factors affecting avalanche formation, with temperature, snow cover height, and wind making the greatest contributions. Clustering of the 111 avalanches using the K-means method identified four scenario types: gradual snow accumulation without wind (33 cases), upper layer thawing due to warming (34), high snow cover (28), and storm impact (16). The DBSCAN method revealed two anomalous cases related to extreme snow depth. Cluster analysis demonstrated that avalanches could occur under different combinations of weather conditions within the same areas, confirming the complex nature of avalanche-forming processes.
The results emphasize the need for an integrated approach to avalanche forecasting that accounts for the multi-parametric interactions of meteorological factors, and may contribute to the improvement of avalanche risk monitoring and mitigation systems in mountain regions.

1. Introduction

Avalanche forecasting and the analysis of meteorological factors affecting their occurrence are important areas in the study of natural hazards. Snow avalanches represent one of the most dangerous natural phenomena, capable of causing serious damage to both people and infrastructure [1]. Avalanches can destroy buildings, roads, and other structures, lead to loss of human life, and damage ecosystems. Such natural disasters become particularly relevant in light of climate change, as rising temperatures, changing precipitation patterns, and other factors can significantly affect avalanche risk. These processes require constant attention from researchers who use modern methods for accurate forecasting and risk assessment.
Eastern Kazakhstan, located at a junction with China, Mongolia, and Russia, is characterized by diverse geographical and climatic conditions, making it particularly prone to avalanche activity. The climate here is sharply continental: winters are long, cold, and snowy (with temperatures dropping to −40 °C in some places), while summers are hot and dry [2]. The region has about 500 avalanche-prone areas, with more than half (≈60%) directly threatening settlements and economic facilities [3].
Given the high avalanche risk in Eastern Kazakhstan, research aimed at improving avalanche risk assessment and forecasting has important practical significance. Traditional forecasting methods, developed several decades ago, often rely on simple empirical rules and expert intuition. However, such approaches do not always provide sufficiently accurate results under changing climatic and meteorological conditions.
In the period of global climate change, which is mainly associated with warming, there is significant variability in the spatial and temporal distribution of meteorological factors [4]. Global warming is a topic of worldwide significance. The authors of [5] note an increase in air temperature, precipitation, and the duration and intensity of meteorological phenomena. Based on research conducted in Eastern Kazakhstan covering the period from 1942 to 2022, there is an overall increase in air temperature of 0.2–0.4 °C every 10 years. According to weather station data, some settlements show an increase in precipitation of 1.6–7.7 mm every 10 years, while other cases show precipitation increases of 4.1–5.9 mm every 10 years. The authors note a positive trend in air temperature and precipitation anomalies over the last decade. The authors of [6] concluded that air temperature in Kazakhstan increased over the period from 1990 to 2021, with a slight trend toward decreasing average precipitation over the same period. The authors of [7] examine the impact of climate change on avalanche risk in Eastern Kazakhstan. Analyses of long-term data indicate that temperature trends in the region align with global patterns, demonstrating a gradual increase in air temperature. This warming affects snow cover formation and stability through multiple mechanisms. Higher temperatures increase the frequency of melt–freeze cycles, which promote the development of weak layers within the snowpack. Additionally, warmer air can hold more moisture, often leading to heavier snowfall that may overload existing snow layers and elevate the avalanche risk. These processes indicate a connection between regional warming and heightened avalanche hazard, particularly during periods of rapidly changing weather. Strapazzon et al. [8] suggest that, due to climate change, future avalanche risk will be shaped by the interaction between rising air temperatures and potentially increased precipitation intensity.
Consequently, climate change in the region influences the risks of natural disasters, including avalanches, underscoring the importance of accurate prediction and assessment of these hazards.
Research conducted in North America [9], Northern Europe [10], and the Alps [11] has demonstrated the importance and necessity of studying meteorological data for avalanche forecasting using databases. The relationship between climate change and avalanche occurrence has also been investigated in India [12] and China [13]. In [12], the authors highlight that the formation of snow avalanches is influenced by meteorological conditions such as air temperature, atmospheric pressure, and wind speed. In [13], it is noted that precipitation, air temperature, and snow cover conditions can change significantly with climate warming, which also leads to avalanche formation and activity. All of these studies indicate that avalanche formation is influenced by a range of meteorological conditions, with regional climatic features and the relative importance of specific factors varying over time.
Machine learning methods are widely used to analyze the factors influencing avalanches. The successful application of machine learning is primarily due to its ability to extract complex non-linear relationships from data [14]. This is achieved through the use of various algorithms that can adapt to different data structures and reveal hidden patterns. To study the influence of factors on avalanches in the French Alps, the authors of [15] used different predictive frameworks, and the implementation of SHAP (Shapley Additive exPlanations) identified the most important attributes for avalanche susceptibility detection, such as slope, altitude, and wind speed for the region under consideration.
The work by Durlević et al. [16], dedicated to the use of Geographic Information Systems (GIS) for spatial modeling of avalanches in the Šar Mountains, Serbia, represents an important contribution to the field of avalanche hazard forecasting under climate change conditions. The researchers used the Analytic Hierarchy Process (AHP) to evaluate the influence of various factors such as slope, terrain, precipitation, and temperature on avalanche probability. This approach combines expert assessments with GIS data, enabling more accurate modeling of avalanche risks in mountain regions.
To improve forecasting accuracy, it is necessary to employ modern approaches to analyzing large meteorological datasets, including machine learning algorithms [17].
Recent advances in machine learning offer new opportunities for analyzing and predicting avalanche hazard.
Machine learning techniques such as neural networks, decision trees, and ensemble methods enable the modeling of complex dependencies that are often undetectable using traditional statistical approaches. These algorithms are trained on large datasets, allowing them to efficiently process various types of information—including text, images, and time series—and significantly enhance the quality of results [18].
Machine learning methods are particularly well suited for predicting and modeling natural phenomena that involve large volumes of data. The effectiveness of these approaches depends on the careful selection of input variables and the appropriate choice of model architecture [19].
For example, in the work by Tiwari et al. [20], the use of machine learning methods for predicting avalanche locations is investigated using the example of the Leh-Manali Highway in India. In this study, an assessment of parameter importance was conducted, including temperature, precipitation, slope, and terrain, with the aim of improving the model’s predictive capability. The results show that analysis of these factors, as well as the application of machine learning methods such as random forests and decision trees, significantly increases the accuracy of avalanche forecasting.
Graph Neural Networks (GNNs) are one machine learning approach to analyzing spatiotemporal data. They are a powerful tool for modeling complex interactions between various parameters in space and time, which is particularly useful for analyzing natural phenomena such as landslides [21] and avalanches. Graphs allow data to be represented as nodes and edges, where nodes can be, for example, different regions or time intervals, and edges represent connections between these regions or time periods. This enables effective modeling of spatiotemporal dependencies between meteorological factors such as temperature, precipitation, and wind speed, as well as forecasting avalanche hazard dynamics.
In the work by Zhang et al. [22], the use of Graph Neural Networks for predicting dynamic characteristics of granular avalanches is considered. The study showed that GNNs effectively predict parameters such as velocity and density of granular flows, which can be applied to avalanche models where it is important to consider snow cover dynamics and its stability. Unlike traditional methods based on simplified models, the use of GNNs allows for consideration of complex spatiotemporal dependencies, significantly improving prediction accuracy. Graph Neural Networks can account for complex interactions between various meteorological and physical factors, substantially improving forecast accuracy and helping to better understand avalanche-related risks.
The following work by Fromm and Schönberger [23] describes the use of an artificial neural network for assessing snow avalanche danger based on a comprehensive snow cover model. The work presents an approach that combines the power of neural networks and physical snow cover modeling for accurate avalanche hazard forecasting. The artificial neural network, which is the central element of the proposed model, consists of recurrent and convolutional layers, allowing effective combination of temporal and spatial data about snow cover and meteorological factors. As input data for training the neural network, five meteorological records and twenty-five variables of modeled snow cover with hourly resolution were used. These data provide detailed information about snow cover conditions and meteorological conditions that directly affect avalanche hazard. The combined use of recurrent and convolutional layers allows the neural network to account for both temporal and spatial changes, significantly improving avalanche prediction accuracy.
Additionally, Joshi et al. [24] describe the HIM-STRAT model, which is based on neural networks and designed for snow cover simulation and avalanche hazard prediction in the North-West Himalayas. The model assesses and predicts avalanche risk using snow cover data, climatic parameters, and topographic characteristics of the region. It allows not only for modeling snow cover distribution but also for assessing avalanche probability, considering complex interactions between various meteorological and physical factors. The model was trained on snow cover data, temperature, precipitation, and other climatic variables, also taking into account the influence of terrain on avalanche hazard.
Another important contribution to avalanche forecasting is the work by Yariyan et al. [25], which considers another direction—the use of hybrid models based on Geographic Information Systems (GIS) for snow avalanche susceptibility mapping. In this work, the authors integrate machine learning methods with GIS to create more accurate avalanche hazard maps. The application of such hybrid models allows for consideration of parameters such as terrain, precipitation, and climatic conditions, making forecasts more dynamic and adapted to environmental changes. The study results showed that the use of hybrid models significantly improves forecast accuracy, surpassing traditional methods based solely on statistical data.
Hybrid machine learning models using combinations of various approaches were applied for spatial modeling of snow avalanche susceptibility by Akay [26]. This research employs a combination of machine learning methods such as random forests and neural networks with geoinformation data, allowing for more accurate assessment of avalanche risks in mountain regions. The results showed that hybrid models integrating GIS and machine learning significantly improve avalanche hazard prediction accuracy, providing more dynamic and adapted forecasts.
A significant contribution to avalanche forecasting is the work by Mayer et al. [27], in which the authors use physical snowpack simulations to assess natural dry-snow avalanche activity. In their work, the researchers used avalanche data from the Swiss Alps and one-dimensional physical snowpack models for virtual slopes to develop a model predicting the probability of dry-snow avalanches in the region surrounding automated weather stations, based on output from a recently developed instability model. The authors presented an integrated approach that combines physical snowpack modeling and machine learning methods for more accurate assessment of avalanche hazard.
Additionally, an important study in avalanche forecasting is the work by Olsen [28], in which the author uses C-band Synthetic Aperture Radar (SAR) imagery from Sentinel 1 satellites to detect avalanches. The work applies the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm for detecting small-scale avalanches that are difficult to detect using traditional methods. This algorithm effectively classifies and identifies areas with increased avalanche probability using spatial data obtained from satellites. The DBSCAN method applies clustering principles to group data points, allowing for the identification of anomalous events—in this case, small avalanches. This approach enables more accurate detection of previously unnoticed avalanches and serves as an important tool for monitoring avalanche activity in hard-to-reach areas.
In avalanche research, machine learning methods such as nearest neighbor algorithms for avalanche prediction have been well described in [29]. The NXD2000 and NXD-REG scales presented in that work demonstrate the wide applicability of this method. Gauthier et al. [30] used logistic regression (LR) to calculate the daily probability of avalanches. In their study, the authors analyzed the meteorological conditions affecting avalanches in northern Gaspésie, and the developed LR models proved to be effective tools for predicting days with high levels of avalanche activity. Chawla et al. [31] noted the effectiveness of Random Forest (RF) algorithms for identifying patterns in historical data and for minimizing and quantifying uncertainty in the avalanche forecasting process. In [32], mountainous regions in southwest Iran were investigated for mapping various hazards, including avalanches. The study applied support vector machines (SVM), boosted regression trees (BRT), and generalized linear models (GLM). The authors found that the FDA method provided the most accurate model for predicting avalanche risk. The open-source Pandas library for data analysis and manipulation in Python (Python 3.12.7, packaged by Anaconda, Inc., Austin, TX, USA; Pandas v2.2.2) is a powerful and user-friendly tool for large-scale data mining [33].
As these articles point out, machine learning methods can automatically update their models as new data become available, making them particularly useful in dynamically changing environments. This feature allows organizations to quickly adapt to new challenges and make decisions based on the latest data.
Unsupervised methods such as clustering [34] and principal component analysis (PCA) [35] help to identify hidden patterns in the data, which can be useful for understanding the factors contributing to avalanche activity. In [36], the authors note that the AdaBoost-based PCA model outperforms other machine learning algorithms in terms of computational time and is the most suitable model for weather forecasting.
The k-means clustering method is commonly used to classify weather types from meteorological data [37] by grouping observations into clusters with similar conditions. Horton et al. [34] state that k-means clustering is the simplest approach for interpreting complex snow model output data and identifying areas of uncertainty.
Thus, special attention in modern research is given to machine learning methods, which show high potential in detecting anomalous events in natural hazards, including avalanches. In this regard, a deeper and more comprehensive study of the specifics of Eastern Kazakhstan is essential, as the region’s unique geographical and climatic conditions require tailored approaches to risk assessment.
To understand the processes involved in the formation of avalanche factors in mountain regions such as Eastern Kazakhstan, it is important to consider research conducted in other areas with similar climatic conditions. One such study is that of Liu et al. [38], which analyzes the triggering factors and activity characteristics of avalanches in the Aerxiangou area of the Western Tianshan Mountains, China. This work examines key factors such as temperature, precipitation, terrain, and wind, all of which strongly influence avalanche conditions. The assessment of these factors showed that in February, temperature and precipitation have the greatest influence on avalanche occurrence, while in March and April, temperature, slope, and wind speed are the most significant factors.
This approach, based on an integrated analysis of meteorological factors and avalanche modeling, is important for developing more accurate avalanche hazard forecasts in East Kazakhstan, where similar natural conditions are conducive to avalanche activity.
The novelty of this study lies in its comprehensive approach to analyzing the impact of climate change on avalanche risk in East Kazakhstan, making it a unique contribution for this region.
A key strength of this research is the detailed analysis of climatic changes in East Kazakhstan over an extended period (2001–2023), focusing on critical parameters such as rising air temperatures, shifts in precipitation patterns, and their influence on avalanche dynamics in mountainous areas.
By applying machine learning techniques—including Principal Component Analysis (PCA), K-means clustering, and DBSCAN—alongside spatial and temporal analysis of avalanche and meteorological data, this study identifies distinct patterns of avalanche activity that reflect the region’s specific climatic and geographical conditions.
This research presents a typology of avalanches based on the clustering of 111 avalanche events using meteorological data, resulting in a more precise classification system that aligns with the international standards of the European Avalanche Warning Service (EAWS) [39].
Moreover, the findings reveal long-term trends in avalanche activity linked to climate change and offer practical implications for regional authorities and emergency services. These insights can support the development of more effective avalanche risk mitigation strategies and decision-making tools in mountain regions.
The purpose of this study is to identify patterns and determine the most important meteorological factors influencing avalanches.

2. Materials and Methods

2.1. Subject of the Study

East Kazakhstan, with its diverse relief and climatic conditions, is a zone of increased avalanche danger [3]. The region features mountainous terrain with high ridges, creating a variety of microclimatic conditions. The presence of slopes with varying steepness and exposure affects the accumulation and stability of the snow cover. Winters are characterized by low temperatures, often falling below −20 °C, with extremes reaching as low as −54 °C. In summer, temperatures can reach an average of +30 °C. Such a wide seasonal temperature range creates conditions for significant changes in the snow cover. Winter is marked by substantial precipitation in the form of snow, which forms a stable snowpack. Precipitation varies with altitude and location [40].
The distribution of annual precipitation in Eastern Kazakhstan is uneven. In the northeast of the region, precipitation amounts to 400–650 mm, especially in mountainous and foothill areas, while in intermontane depressions, the amount drops below 200 mm per year [41]. About 80% of precipitation falls during the warm period of the year (April–October), predominating over precipitation during the cold period (December–March), which is a characteristic feature of the region’s continental climate [42]. The maximum amount of precipitation is observed over most of the territory during summer, most often in the second half of the season [40]. During the winter period in Ust-Kamenogorsk, the average monthly precipitation is 142 mm, predominantly in the form of snow. In the Zaisan Basin, precipitation can be even lower, reaching only 130–200 mm per year [41].

2.2. Data Sources

Information on avalanche dates was obtained from the Department of Emergency Situations of the East Kazakhstan region, Republic of Kazakhstan (Table S1).
The table contains data on 111 avalanche events at 9 avalanche-prone areas from 2012 to 2023 (Figure 1, Table 1). All 111 cases are related to natural avalanches. The impact of avalanches on the region’s infrastructure is quite significant, as at the foot of the mountains there are roads of regional importance connecting the center of the region with settlements. The blocking of roads by snow debris prevents communication and the transport of vital goods for the population.
Table 1 summarizes the key topographic parameters of avalanche-prone areas derived from SRTM elevation data. The listed characteristics include geographic coordinates, elevation range, slope steepness, and slope aspect. The elevation of the studied avalanche-prone areas varies significantly, from 430 m (Putintsevo pit) to 2452 m (Berel), indicating that avalanche activity occurs across a wide altitudinal gradient. Most areas exhibit slope angles between 25° and 42°, which aligns with typical avalanche-prone slope thresholds reported in the literature [3,43,44]. The slope aspect is predominantly oriented toward the southwest (SW) and northeast (NE), suggesting that both sun-exposed and shaded slopes contribute to avalanche formation.
Initial meteorological data were downloaded from the Kazhydromet website [40]. The dataset content was organized according to the “7 + 1” system, i.e., 7 previous days and the day of avalanche. The data were then averaged (Table 2).
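The “7 + 1” averaging can be illustrated with a minimal sketch. It assumes daily observations are held in a pandas DataFrame indexed by date; the column names, the values, and the helper `window_mean` are hypothetical and are not taken from the study's code.

```python
import pandas as pd


def window_mean(daily: pd.DataFrame, event_date: str, days_before: int = 7) -> pd.Series:
    """Average each column over the event day and the `days_before` days preceding it."""
    end = pd.Timestamp(event_date)
    start = end - pd.Timedelta(days=days_before)
    # Label-based slicing on a DatetimeIndex includes both endpoints,
    # so the window holds days_before + 1 daily records.
    window = daily.loc[start:end]
    return window.mean()


# Hypothetical daily observations for one site (not the study data).
dates = pd.date_range("2020-01-01", periods=10, freq="D")
daily = pd.DataFrame(
    {
        "air_temp": list(range(-10, 0)),    # degrees C
        "snow_depth": list(range(50, 60)),  # cm
    },
    index=dates,
)

avg = window_mean(daily, "2020-01-09")
print(avg)
```

Each avalanche event then contributes one row of averaged values to the analysis dataset.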
The dataset consists of averaged meteorological data, dates, and a categorical column ‘Site Code’, which was converted to numeric form using label encoding. This technique converts categorical values into numerical values so that they can be used in machine learning models; that is, each site was assigned an integer.
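As an illustration of label encoding, the following sketch uses scikit-learn's `LabelEncoder`. The site names in the list are placeholders chosen for the example, not the actual content of the ‘Site Code’ column.

```python
from sklearn.preprocessing import LabelEncoder

# Placeholder site codes (assumption: the real column holds nine distinct sites).
site_codes = ["Berel", "Putintsevo", "Berel", "Site_C"]

encoder = LabelEncoder()
site_ids = encoder.fit_transform(site_codes)  # one integer per distinct site

# The encoder sorts the distinct labels and numbers them from zero,
# so the integers carry no physical meaning or ordering.
print(dict(zip(encoder.classes_, range(len(encoder.classes_)))))
print(site_ids)
```

Because the integers are arbitrary identifiers, they are suited to grouping and bookkeeping rather than to distance-based methods.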
Table 2 shows the descriptive statistics for the numeric columns. The table consists of the following values:
Mean: arithmetic mean of the column;
Min and max: minimum and maximum values in the column;
Percentiles (25%, 50%, 75%): values below which the given percentage of the data falls (25% is the first quartile, 50% the median, 75% the third quartile);
Std (standard deviation): a measure of the spread of data around the mean.
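Statistics of this kind are what `pandas.DataFrame.describe` returns for numeric columns. The sketch below runs it on a small made-up table; the column names and values are illustrative only.

```python
import pandas as pd

# Made-up averaged values for five events (not the study data).
df = pd.DataFrame(
    {
        "air_temp": [-12.0, -8.0, -5.0, -3.0, 2.0],       # degrees C
        "snow_depth": [40.0, 55.0, 60.0, 75.0, 120.0],    # cm
    }
)

# describe() reports count, mean, std, min, the 25/50/75% percentiles, and max.
stats = df.describe()
print(stats)
```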

2.3. Data Processing Pipeline

Data processing was based on standard statistical methods (descriptive statistics, correlation) and machine learning methods: Principal Component Analysis (PCA) [31], K-means clustering [45], and Density-Based Spatial Clustering of Applications with Noise (DBSCAN) [46]. All data were processed in Python 3.12.7 (Anaconda distribution) using the Pandas [33], NumPy (v1.26.4) [47], and scikit-learn (v1.5.1) [48] libraries and visualized using Matplotlib (v3.9.2) and Seaborn (v0.13.2) [49]. Figure 2 shows the workflow used to apply machine learning methods to identify patterns and determine the most important meteorological factors influencing avalanches.
The data processing workflow includes the following key ML steps and methods:
  • Meteorological data collection. This study encompassed nine avalanche-prone areas in the East Kazakhstan region, for which meteorological data were collected over an 8-day period preceding each recorded avalanche event (7 days prior to the event and the day of occurrence). The data were sourced from the national hydro meteorological service, Kazhydromet, and included parameters such as air and surface temperatures, snow cover depth, precipitation, wind speed, atmospheric pressure, and relative humidity.
  • Descriptive statistics: analysis of the basic characteristics of the data, including mean values, standard deviation, and minimum and maximum values.
  • Scaling: normalizing the data so that subsequent statistical and machine learning techniques can be applied correctly.
  • Correlation analysis: identification of relationships between meteorological parameters. These steps were performed to check the adequacy and accuracy of the meteorological data for further use [50,51].
  • Principal Component Analysis (PCA). To identify key factors influencing avalanche formation, Principal Component Analysis (PCA) was applied. This method facilitated the reduction in the feature space’s dimensionality and enabled the determination of each parameter’s contribution to the total variance. The primary meteorological factors were interpreted based on component loadings.
  • K-means clustering. To classify avalanche scenarios, the K-means algorithm was applied to the dataset after dimensionality reduction using PCA. The first four principal components, which captured the majority of variance in the meteorological data, were selected as input features for clustering. The optimal number of clusters was determined using the elbow method and the silhouette coefficient [52]. K-means was chosen for its interpretability and its ability to identify stable groups when parameters are selected appropriately. To reduce the method’s sensitivity to the choice of initial cluster centers, k-means++ initialization was used, which spreads the initial centers more evenly and helps avoid poor local minima. The algorithm was run 10 times (n_init = 10) with different initializations, and the result with minimal within-cluster variance was selected.
  • DBSCAN clustering. A K-distance graph, in which points are sorted by the distance to their k-th nearest neighbor, was constructed to select the distance threshold objectively. The key parameters of DBSCAN are eps (neighborhood radius) and minPts (minimum number of points for a dense region); eps was read from the K-distance graph, while minPts was chosen considering the dimensionality and density of the sample. DBSCAN was then run with these values to find dense groups of points and separate outliers. Although DBSCAN may struggle with clusters of different densities, in this case the main task was not clustering but identifying anomalous avalanche events.
  • Clustering results are visualized, allowing analysis of the resulting groups of data [53].
  • Cartogram of avalanche distribution by clusters, which gives an idea of avalanche occurrence patterns depending on meteorological conditions.
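The two clustering steps in the workflow above can be sketched as follows. This is a minimal illustration on synthetic data, not the study's code: the blob centers, `min_pts = 5`, and the use of the 95th percentile of the k-distance curve as a stand-in for reading the elbow visually are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.neighbors import NearestNeighbors

# Synthetic stand-in for four PCA scores per avalanche event (not the study data).
centers = np.array([[0, 0, 0, 0], [5, 5, 0, 0], [0, 5, 5, 0], [5, 0, 5, 5]], dtype=float)
X, _ = make_blobs(n_samples=111, centers=centers, cluster_std=0.6, random_state=0)

# K-means with k-means++ initialization and 10 restarts for each candidate k.
inertia, silhouette = {}, {}
for k in range(2, 8):
    km = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=0).fit(X)
    inertia[k] = km.inertia_                         # within-cluster sum of squares (elbow curve)
    silhouette[k] = silhouette_score(X, km.labels_)  # mean silhouette coefficient
best_k = max(silhouette, key=silhouette.get)

# K-distance curve for DBSCAN. Querying the training points returns each point
# as its own first neighbor, which matches DBSCAN's min_samples counting the point itself.
min_pts = 5  # assumed value for illustration
nn = NearestNeighbors(n_neighbors=min_pts).fit(X)
dist, _ = nn.kneighbors(X)
k_dist = np.sort(dist[:, -1])
eps = float(np.percentile(k_dist, 95))  # crude substitute for picking the elbow by eye

db = DBSCAN(eps=eps, min_samples=min_pts).fit(X)
outliers = np.flatnonzero(db.labels_ == -1)  # label -1 marks noise points

print("k chosen by silhouette:", best_k)
print("eps:", round(eps, 3), "| noise points:", len(outliers))
```

In practice the inertia and k-distance curves would be plotted and inspected, since an automatic rule for the elbow is only a rough guide.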

3. Results and Discussion

3.1. Correlation

To identify relationships among meteorological variables and to preliminarily examine the data before applying dimensionality reduction or clustering methods, a correlation analysis was performed. Pearson correlation coefficients were calculated to quantify the linear relationships between variables. The analysis utilized 8-day average values of surface and air temperatures, snow depth, wind speed, relative humidity, and atmospheric pressure. The resulting correlation matrix is presented in Figure 3.
A strong positive correlation (r = 0.96) was observed between air and surface temperatures, confirming their close relationship. Snow depth exhibited weak positive correlations with surface temperature (r = 0.29) and air temperature (r = 0.24), suggesting that snow cover can persist at higher temperatures, although other factors also influence its stability.
Negative correlations were found between wind speed and both snow depth (r = −0.19) and precipitation (r = −0.25), indicating that higher wind speeds may contribute to reduced snow accumulation and precipitation levels. Conversely, a positive correlation between wind speed and air temperature suggests that wind intensity increases with rising temperatures.
An above-average positive correlation (r = 0.67) was identified between relative humidity and atmospheric pressure. Additionally, air temperature showed negative correlations with relative humidity (r = −0.49) and atmospheric pressure (r = −0.48), aligning with established meteorological patterns. These findings support the adequacy of the initial dataset for further analysis.
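The correlation step can be illustrated with a minimal sketch. It assumes the 8-day averages are held in a pandas DataFrame; the column names and the synthetic values below are hypothetical stand-ins, not the study data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the 8-day averaged meteorological table (NOT the study data).
rng = np.random.default_rng(0)
n = 111
air_temp = rng.normal(-9.0, 4.7, n)
df = pd.DataFrame({
    "air_temp": air_temp,
    # Surface temperature is generated to track air temperature closely.
    "surface_temp": air_temp - 1.2 + rng.normal(0.0, 1.0, n),
    "snow_depth": rng.uniform(5.0, 115.0, n),
    "wind_speed": rng.gamma(4.0, 0.7, n),
})

# Pairwise Pearson coefficients: a symmetric matrix with ones on the diagonal.
corr = df.corr(method="pearson")
print(corr.round(2))
```

A heatmap of `corr` would give a figure of the same kind as the correlation matrix in Figure 3.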

3.2. PCA

Principal Component Analysis (PCA) computes the eigenvectors of the covariance matrix of the data and projects the data onto them, creating new variables (components). This method is used to reduce the dimensionality of the dataset and to determine the weight of each original variable (column).
Before PCA was applied, the data were preprocessed: all variables were standardized with StandardScaler from scikit-learn to bring them to a common scale. Site and date codes were also removed from the dataset, and SimpleImputer (strategy = ‘mean’) was used to replace missing values (NaN) with the column mean.
Four principal components were retained for the new dataset. The proportions of explained variance were as follows: PC1, 45.14%; PC2, 23.54%; PC3, 12.83%; and PC4, 7.38%, for a total of 88.89%. This level was considered sufficient because adding further components increased the explained variance only slightly (3–4%). PC1 and PC2 together explain more than two-thirds (68.68%) of the variance in the data.
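The preprocessing and PCA steps can be sketched as a scikit-learn pipeline. This is an illustration under assumptions, not the authors' script: the input matrix is synthetic, and the order of the steps (imputation, then scaling, then PCA) is one reasonable choice. The last lines only re-add the explained-variance shares reported in the text.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic 111 x 7 matrix standing in for the seven meteorological variables.
rng = np.random.default_rng(42)
X = rng.normal(size=(111, 7))
X[:, 1] = X[:, 0] + 0.3 * rng.normal(size=111)   # two strongly related columns
X[rng.integers(0, 111, 5), 3] = np.nan            # a few missing values

pipeline = make_pipeline(
    SimpleImputer(strategy="mean"),  # replace NaN with the column mean
    StandardScaler(),                # zero mean, unit variance per column
    PCA(n_components=4),             # keep four principal components
)
scores = pipeline.fit_transform(X)
ratios = pipeline.named_steps["pca"].explained_variance_ratio_
print("explained variance ratios:", ratios.round(4))

# Cumulative sum of the shares reported in the text (in percent).
reported = np.array([45.14, 23.54, 12.83, 7.38])
reported_cumulative = np.cumsum(reported)
print("reported cumulative shares:", reported_cumulative.round(2))
```

The array `scores` has one row per avalanche and one column per retained component, which is the form of input used for clustering in the following subsections.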
In the next step, the loadings, i.e., the contribution of each original feature to each principal component, were determined (Table 3 and Figure 4).
Table 3 presents the contribution of each original feature to the principal components (PC1, PC2, PC3, PC4). Based on these contributions, the most influential features for each component were identified.
PC1 is the most significant factor (45.1% of the variance). The principal component is negatively related to relative humidity (−0.46) and atmospheric pressure (−0.44). Surface temperature (0.43) and air temperature (0.48) have significant positive weights, indicating that this component is strongly influenced by increasing temperature. This component reflects warming conditions. Increasing temperature weakens the snow cover through melting, forming weak snow cover layers.
PC2 is dominated by positive loadings for snow depth (0.51) and precipitation (0.54). The component is associated with active snowfall and snow accumulation; an increase in the amount of fresh snow raises the load on slopes. This component can be conventionally termed the snow accumulation component.
The key variables in PC3 are snow depth, with a positive loading (0.69), and wind speed, with a negative loading (−0.44). This component is likely to characterize periods of heavy snowfall with low wind speed.
The fourth component (7.38% of the variance) is characterized by negative loadings for surface temperature (−0.35) and air temperature (−0.32) and positive loadings for precipitation (0.52) and wind speed (0.55). This pattern resembles storm conditions with blizzards. Strong winds during snowfalls favor the rapid formation of unstable snow layers (Figure 4).
From this, we can conclude that the greatest avalanche risk is associated with components 1 and 2 (together ~68.7% of the variance): warming (PC1) and rapid snow accumulation (PC2).
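The reading of the loadings can be reproduced from the values in Table 3. The sketch below ranks features by absolute loading for each component; the short column names and the helper `top_features` are hypothetical and introduced only for illustration.

```python
import pandas as pd

# Loadings as listed in Table 3 (rows: components, columns: features).
features = ["surface_temp", "air_temp", "snow_depth", "wind_speed",
            "rel_humidity", "precipitation", "pressure"]
loadings = pd.DataFrame(
    [[0.43, 0.48, 0.08, 0.37, -0.46, -0.19, -0.44],
     [0.42, 0.34, 0.51, -0.24, 0.24, 0.54, 0.17],
     [-0.19, -0.17, 0.69, -0.44, -0.22, -0.42, -0.21],
     [-0.35, -0.32, 0.37, 0.55, -0.21, 0.52, -0.17]],
    index=["PC1", "PC2", "PC3", "PC4"],
    columns=features,
)

def top_features(component, k=2):
    """Return the k features with the largest absolute loading, sign preserved."""
    row = loadings.loc[component]
    order = row.abs().sort_values(ascending=False).index[:k]
    return row[order]

for pc in loadings.index:
    print(pc, top_features(pc).to_dict())
```

Ranking by absolute value keeps strongly negative loadings, such as relative humidity on PC1, in view alongside the positive ones.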

3.3. K-Means

K-means was applied to the principal components obtained above to divide avalanches into groups (clusters) according to the type of meteorological influence. K-means is a clustering algorithm that divides a dataset into k non-overlapping clusters. Its advantages are a simple algorithm and easily interpretable results; its disadvantage is that the number of clusters must be specified in advance.
To determine the optimal number of clusters for the K-means algorithm, the elbow method and silhouette analysis were used. In the elbow method, the within-cluster sum of squares (WCSS) is computed for a range of cluster counts and plotted, and the number of clusters is read off at the bend of the curve.
Silhouette analysis is a method for assessing the quality of clustering. It shows how similar objects within the same cluster are to each other and how well separated they are from other clusters. The optimal number of clusters is the one that maximizes the silhouette coefficient (Figure 5).
Figure 5A shows that WCSS decreases sharply up to four clusters and then the decrease slows down. According to Figure 5B, the silhouette coefficient reaches its maximum value at four clusters.
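The selection of the number of clusters can be sketched as follows. The data are synthetic (four well-separated groups in a four-dimensional space standing in for the PCA scores), and the tested range of k and the random seeds are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the four-component PCA scores: four compact groups.
centers = [[0, 0, 0, 0], [6, 0, 0, 0], [0, 6, 0, 0], [0, 0, 6, 0]]
X, _ = make_blobs(n_samples=111, centers=centers, cluster_std=0.6, random_state=0)

wcss, silhouettes = {}, {}
for k in range(2, 9):
    model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=0).fit(X)
    wcss[k] = model.inertia_                             # within-cluster sum of squares
    silhouettes[k] = silhouette_score(X, model.labels_)  # mean silhouette coefficient

# The elbow is read from the WCSS curve; the silhouette criterion picks the maximum.
best_k = max(silhouettes, key=silhouettes.get)
print("WCSS by k:", {k: round(v, 1) for k, v in wcss.items()})
print("k with the highest silhouette coefficient:", best_k)
```

Plotting `wcss` and `silhouettes` against k would give curves of the kind shown in Figure 5A and Figure 5B.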
After the number of clusters was determined, k-means was applied to the PCA data. The results are shown in PC1 and PC2 coordinates (Figure 6) and in 3D using three principal components (Figure 7).
Figure 6 and Figure 7 illustrate avalanche clusters in principal component coordinates. The interpretation of key factors that influence the division into these groups (clusters) was then carried out on the basis of these figures. Based on the results of the analysis, all 111 avalanches were classified into four clusters, each characterized by specific meteorological conditions (e.g., warming, high snow depth, storm, etc.). The mean values of the data included in these four clusters were derived (Table 4).
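Cluster labels, per-cluster means and cluster sizes of the kind reported in Table 4 and Table 5 can be derived as in the sketch below. The DataFrame is synthetic, the column names are hypothetical, and standardized variables are used in place of the PCA scores to keep the example short.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic meteorological table (NOT the study data).
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "air_temp": rng.normal(-9.0, 4.7, 111),
    "snow_depth": rng.uniform(5.0, 115.0, 111),
    "wind_speed": rng.uniform(0.4, 8.1, 111),
})

# Stand-in for the PCA scores: the standardized variables themselves.
scores = StandardScaler().fit_transform(df)

kmeans = KMeans(n_clusters=4, init="k-means++", n_init=10, random_state=0)
df["Cluster"] = kmeans.fit_predict(scores)

# Mean of every variable per cluster, and the number of events per cluster.
cluster_means = df.groupby("Cluster").mean().round(2)
cluster_sizes = df["Cluster"].value_counts().sort_index()
print(cluster_means)
print(cluster_sizes)
```

Because the means are computed on the original variables rather than on the component scores, each cluster can be read directly in physical units.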
The low-temperature regime and high values of relative humidity and atmospheric pressure are responsible for the avalanche group of cluster 0 (purple points). This cluster is indicative of snow cover accumulation under anticyclonic conditions. It is probable that the gradual accumulation of snow without wind compaction leads to loading or fragility of the snowpack layer.
Cluster 1 (blue points) is located on the right side of the PC1 axis, indicating relatively high air and surface temperatures, and low values of humidity and pressure. The cluster also has average values on the PC2 axis (Figure 6) and a high position in PC3 (Figure 7), indicating high snow cover on the slopes. The assumption is made that warming causes thawing of the upper snow layer, which makes it heavier and less stable, and that wind forms wind slabs—dense layers of snow that can collapse.
Cluster 2 (green points) is characterized by the highest snow cover (mean 77.91 cm), abundant precipitation and low wind speeds. The group is characterized by the accumulation of loose, unstable snow. Snowfall without wind transport does not compact the cover, resulting in the formation of weak snow layers.
Avalanches in cluster 3 (yellow points) can be attributed to the following meteorological factors: low average air temperatures, precipitation, and wind gusts. Presumably, snowfall with wind leads to the formation of wind-transported snow, and a sharp temperature drop after precipitation increases snow fragility. This group of avalanches can be tentatively called ‘Storm Influence’.
The number of avalanches falling into each cluster was then calculated (Table 5). Spatially, the clusters are scattered over all nine avalanche-prone areas.
According to Table 5, it can be concluded that there is no dominant meteorological condition causing avalanches. In the studied region, the storm influence factor has less weight than the other three pre-avalanche meteorological conditions.
The classification of avalanche scenarios was compared with the international typology adopted by the European Avalanche Warning Service (EAWS), which identifies problems related to new snow, wind-drifted snow, temperature increase, and deep weak layers [39].
Table 6 presents the classification of avalanche scenarios derived from K-Means clustering, aligned with the typology developed by the EAWS. The clusters reflect typical meteorological and snowpack conditions associated with avalanche hazards.
Cluster 0 is characterized by gradual snow accumulation without wind influence and corresponds to the New Snow Problem in the EAWS framework. Cluster 1 represents warming and snowmelt processes, related to the Wet Snow Problem. Cluster 2 reflects conditions of deep snowpack, associated with the formation of Persistent Weak Layers. Finally, Cluster 3 is linked to wind-driven snow transport and precipitation, matching the Wind Slab problem.
The use of unsupervised learning (K-Means) effectively revealed distinct avalanche scenarios, each corresponding to a specific EAWS problem type, thus allowing for standardized hazard interpretation and improved communication in avalanche forecasting.
In the K-means results, outliers were visually detected in cluster 2 (Figure 6); the k-means method forcibly assigned them to that cluster. The DBSCAN method was therefore used to analyze the data points (rows) and detect outliers.

3.4. DBSCAN

Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is a clustering algorithm based on data density. Unlike K-means, it does not require specifying the number of clusters in advance and can find clusters of arbitrary shape. DBSCAN uses two main parameters: eps (epsilon), the radius of the neighborhood in which neighboring points are searched for, and minPts, the minimum number of points required to form a cluster.
The NearestNeighbors class from scikit-learn was used to select a suitable eps value. For each point, the distance to its k-th nearest neighbor is computed, and these distances are sorted and plotted. The graph reveals a kink, i.e., a point after which the distances increase sharply (Figure 8).
In the graph (Figure 8), the ‘kink’ is at 15, which was chosen as the eps value. The value of minPts was determined by testing values from 3 to 10. The optimal value was found to be 5, as lower values excessively increase the number of clusters (up to 6–7), and higher values increase the number of outliers (noise points).
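The K-distance curve can be computed as in the following sketch, which assumes synthetic four-dimensional data and k = 5 (the minPts value stated above). Plotting is left out so that the example stays self-contained.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Synthetic stand-in for the four-component PCA scores.
rng = np.random.default_rng(0)
X = rng.normal(size=(111, 4))

k = 5  # assumed equal to minPts

# When the training data are passed to kneighbors, each point is returned as its
# own first neighbour (distance 0), so the last column holds the distance to the
# k-th point counting the point itself.
nn = NearestNeighbors(n_neighbors=k).fit(X)
distances, _ = nn.kneighbors(X)

# Sorted k-distances: the value at the kink of this curve is a candidate for eps.
k_distances = np.sort(distances[:, -1])
print(k_distances[:5].round(3), k_distances[-5:].round(3))
```

Counting the point itself matches the convention of scikit-learn's DBSCAN, where `min_samples` includes the point under consideration.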
With eps = 15 and minPts = 5, DBSCAN was applied to the four principal components obtained by PCA; the results are presented in Figure 9.
DBSCAN divided the PCA (avalanche) data into two main clusters. It merged k-means clusters 0 and 2 into one new cluster 0 (green points) and merged clusters 1 and 3 into cluster 1 (yellow points). The algorithm identified the noise points and assigned them the label −1.
The number of noise points was 2, which is 1.80% of the total number of avalanches studied (111). The outliers are one avalanche at the Pikhtovka pit (19 February 2017) and one avalanche at the Putintsevo pit avalanche-prone area (29 January 2020).
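How DBSCAN separates noise from dense groups can be shown on a toy example. The data below are synthetic (two compact groups plus two isolated points, 111 points in total), and the eps value is chosen for these data only; it is not the value used in the study.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Two compact synthetic groups and two isolated points (111 points in total).
rng = np.random.default_rng(0)
dense_a = rng.normal(loc=0.0, scale=0.5, size=(60, 4))
dense_b = rng.normal(loc=8.0, scale=0.5, size=(49, 4))
isolated = np.array([[30.0, 30.0, 30.0, 30.0], [-30.0, 30.0, -30.0, 30.0]])
X = np.vstack([dense_a, dense_b, isolated])

# eps is illustrative for this toy data; min_samples plays the role of minPts.
labels = DBSCAN(eps=3.0, min_samples=5).fit_predict(X)

n_noise = int(np.sum(labels == -1))    # DBSCAN marks noise with the label -1
n_clusters = len(set(labels) - {-1})
noise_share = 100.0 * n_noise / len(X)
print(n_clusters, n_noise, round(noise_share, 2))
```

With two noise points among 111 events, the share is 2/111, i.e., about 1.8%.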
Analysis of the original dataset revealed that the cause of these anomalies was high snow cover. Before the anomalous avalanche at the Pikhtovka pit, the snow depth on the slopes was 95.13 cm, which is 17.22 cm above the mean value of cluster 2, to which k-means had forcibly assigned it. At the Putintsevo pit, the anomalous avalanche had a snow depth of 115 cm, which is 37.09 cm higher than the mean value of the cluster.
In the next step, by adding a new column ‘Cluster’ to the original dataset, the belonging of each avalanche to a particular cluster was determined. The results of this can be seen in the histogram (Figure 10).
The cartogram and histogram (Figure 10) show that one area can contain several clusters, i.e., avalanches in one area can be triggered by different pre-avalanche meteorological conditions. At the same time, there are no locations where all four k-means clusters are mixed. Each avalanche-prone area contains avalanches from two clusters: either 0 and 2 or 1 and 3. The only exception is the Pikhtovka pit (No. 5 on the map), where an avalanche from cluster 2 occurs together with clusters 1 and 3. The explanation is that this avalanche is anomalous (an outlier) and was forcibly assigned to a cluster by the k-means method. The second anomalous avalanche (Putintsevo pit, No. 8 on the map) is not highlighted in the histogram because it belongs to cluster 2, which is the main cluster in the Putintsevo pit avalanche-prone area.
The distribution of avalanches by location corresponds to the DBSCAN clustering: avalanches in avalanche-prone areas 1, 2, 3, 8 and 9 belong to DBSCAN cluster 0, while the remaining avalanches belong to DBSCAN cluster 1.
Thus, the study used machine learning methods and algorithms (Figure 2), which allowed us to establish correlations between meteorological conditions and avalanches and to identify anomalous avalanches with a high deviation of snow depth at the Pikhtovka and Putintsevo pit avalanche-prone areas. Correlation analysis showed a high correlation between ground surface and air temperature, while PCA allowed us to identify the main components affecting avalanche hazard: warming and rapid snow accumulation (together ~68.7% of the variance). Clustering analyses using the K-Means and DBSCAN algorithms demonstrated that avalanches can be triggered by diverse meteorological conditions, even within the same avalanche-prone area.
The PCA results indicate that temperature and snow accumulation are the two primary factors influencing avalanche conditions (PC1 and PC2). These factors support the hypothesis that regional warming plays a critical role in altering snow cover stability. For instance, Cluster 1 reflects conditions involving the melting of surface layers and the formation of wet, heavy snow—factors well known to precede avalanche events. Similarly, Cluster 2 is characterized by intense snowfall, which may be intensified by evolving climatic patterns. These scenarios align with the understanding that rising temperatures can simultaneously weaken snowpack stability and increase snow load, thereby directly contributing to heightened avalanche activity.

4. Conclusions

The present study has successfully identified key meteorological factors influencing avalanche formation in Eastern Kazakhstan and developed a methodology for their analysis using modern machine learning algorithms.
In this research, we applied unsupervised machine learning methods—K-Means, DBSCAN, and Principal Component Analysis (PCA)—to classify 111 avalanche events based on meteorological features. The analysis revealed four distinct clusters corresponding to different avalanche scenarios in terms of triggering conditions and spatial characteristics.
The main conclusions are as follows:
  • K-Means successfully grouped avalanches into four clearly separated clusters, each associated with specific combinations of snow cover thickness, temperature fluctuations, and wind speed.
  • DBSCAN provided additional insights, particularly in identifying noise points or outliers that do not conform to dominant avalanche growth patterns.
  • PCA effectively reduced the dimensionality of the dataset while preserving the most informative variance, thereby improving the interpretability of cluster separation.
  • These clusters may reflect different types of avalanche mechanisms according to the EAWS classification, such as New Snow Problem (cluster 0), Wet Snow Problem (cluster 1), Persistent Weak Layer (cluster 2), and Wind Slab (cluster 3).
The results obtained have high practical significance and can be used to develop more accurate avalanche monitoring and warning systems in mountain regions, as well as to optimize the placement of engineering infrastructure and emergency planning. The methodological approach presented here, combining statistical analysis and machine learning, can be adapted for other regions with similar climatic and geographical conditions.
Future research will focus on improving the understanding of the processes that influence avalanche formation, such as stress changes in the snow cover and snow flow dynamics under different precipitation types and wind loads. Particular attention will be paid to the development of predictive models that can take into account the spatial and temporal characteristics of the data, which will provide more accurate avalanche hazard prediction.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/atmos16060723/s1, Table S1. Avalanche events by location (2012–2023).

Author Contributions

Conceptualization, M.R. and Y.B.; methodology, Y.B.; software, Y.B.; validation, M.R., Y.B. and Z.A.; formal analysis, G.D.; investigation, M.R. and Y.B.; resources, M.R., Y.B., A.K. and Z.A.; data curation, Y.B. and A.K.; writing – original draft preparation, M.R., Y.B., M.K., G.D. and N.D.; writing – review and editing, M.R., M.K., Y.B., G.D. and Z.A.; visualization, M.R. and Y.B.; supervision, N.D.; project administration, N.D.; funding acquisition, N.D. All authors have read and agreed to the published version of the manuscript.

Funding

The article presents the results of scientific research obtained during the implementation of a scientific and technical program of the Committee of Science of the Ministry of Science and Higher Education of the Republic of Kazakhstan, BR21882022, on the topic “Research of avalanche activity in the East Kazakhstan region for development of monitoring systems and scientific substantiation of their placement”, within the framework of program-targeted financing.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data sources are listed in the text.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. Denissova, N.; Nurakynov, S.; Petrova, O.; Chepashev, D.; Daumova, G.; Yelisseyeva, A. Remote Sensing Techniques for Assessing Snow Avalanche Formation Factors and Building Hazard Monitoring Systems. Atmosphere 2024, 15, 1343.
  2. Petrova, O.; Denissova, N.; Daumova, G.; Ivashchenko, Y.; Sergazinov, E. Regional climatic changes and their impact on the level of avalanche hazard in East Kazakhstan. Heliyon 2025, 11, e41807.
  3. Rakhymberdina, M.; Levin, E.; Daumova, G.; Bekishev, Y.; Assylkhanova, Z.; Kapasov, A. Combined Remote Sensing and GIS Methods for Detecting Avalanches in Eastern Kazakhstan. ES Energy Environ. 2024, 26, 1350.
  4. Wen, H.; Wu, X.; Shu, X.; Wang, D.; Zhao, S.; Zhou, G.; Li, X. Spatial heterogeneity and temporal tendency of channeled snow avalanche activity retrieved from Landsat images in the maritime snow climate of the Parlung Tsangpo catchment, Southeastern Tibet. Cold Reg. Sci. Technol. 2024, 223, 104206.
  5. Munaitpasova, A.; Orakova, G.; Musralinova, G.; Zheksenbaeva, A.; Nyshanbay, A. Modern climate changes in Eastern Kazakhstan. Hydrometeorol. Ecol. 2024, 3, 31–39.
  6. Climate Knowledge Portal. 2021. Available online: https://climateknowledgeportal.worldbank.org/country/kazakhstan/climate-data-historical (accessed on 20 May 2025).
  7. Denissova, N.; Nurakynov, S.; Petrova, O.; Daumova, G.; Chepashev, D.; Alpysbay, M.; Chettykbayev, R. Dependence of Avalanche Risk on Slope Insolation Level and Albedo. Atmosphere 2025, 16, 556.
  8. Strapazzon, G.; Schweizer, J.; Chiambretti, I.; Brodmann Maeder, M.; Brugger, H.; Zafren, K. Effects of Climate Change on Avalanche Accidents and Survival. Front. Physiol. 2021, 12, 639433.
  9. Schauer, A.R.; Hendrikx, J.; Birkeland, K.W.; Mock, C.J. Synoptic atmospheric circulation patterns associated with deep persistent slab avalanches in the western United States. Nat. Hazards Earth Syst. Sci. 2021, 21, 757–774.
  10. Hancock, H.; Hendrikx, J.; Eckerstorfer, M.; Wickström, S. Synoptic control on snow avalanche activity in central Spitsbergen. Cryosphere 2021, 15, 3813–3837.
  11. Eckert, N.; Keylock, C.J.; Castebrunet, H.; Lavigne, A.; Naaim, M. Temporal trends in avalanche activity in the French Alps and subregions: From occurrences and runout altitudes to unsteady return periods. J. Glaciol. 2013, 59, 93–114.
  12. Ballesteros-Cánovas, J.A.; Trappmann, D.; Madrigal-González, J.; Eckert, N.; Stoffel, M. Climate warming enhances snow avalanche risk in the Western Himalayas. Proc. Natl. Acad. Sci. USA 2018, 115, 3410–3415.
  13. Hao, J.; Zhang, X.; Cui, P.; Li, L.; Wang, Y.; Zhang, G.; Li, C. Impacts of Climate Change on Snow Avalanche Activity Along a Transportation Corridor in the Tianshan Mountains. Int. J. Disast. Risk Sc. 2023, 14, 510–522.
  14. Seifert, A.; Rasp, S. Potential and limitations of machine learning for modeling warm-rain cloud microphysical processes. J. Adv. Model. Earth Syst. 2020, 12, e2020MS002301.
  15. Kayhan, E.C.; Ekmekcioğlu, Ö. Coupling Different Machine Learning and Meta-Heuristic Optimization Techniques to Generate the Snow Avalanche Susceptibility Map in the French Alps. Water 2024, 16, 3247.
  16. Durlević, U.; Valjarević, A.; Novković, I.; Ćurčić, N.B.; Smiljić, M.; Morar, C.; Stoica, A.; Barišić, D.; Lukić, T. GIS-Based Spatial Modeling of Snow Avalanches Using Analytic Hierarchy Process. A Case Study of the Šar Mountains, Serbia. Atmosphere 2022, 13, 1229.
  17. Blagovechshenskiy, V.; Medeu, A.; Gulyayeva, T.; Zhdanov, V.; Ranova, S.; Kamalbekova, A.; Aldabergen, U. Application of Artificial Intelligence in the Assessment and Forecast of Avalanche Danger in the Ile Alatau Ridge. Water 2023, 15, 1438.
  18. Reichstein, M.; Camps-Valls, G.; Stevens, B.; Jung, M.; Denzler, J.; Carvalhais, N.; Prabhat. Deep learning and process understanding for data-driven Earth system science. Nature 2019, 566, 195–204.
  19. Kuglitsch, M.; Albayrak, A.; Aquino, R.; Craddock, A.; Edward-Gill, J.; Kanwar, R.; Koul, A.; Ma, J.; Marti, A.; Menon, M.; et al. Artificial Intelligence for Disaster Risk Reduction: Opportunities, Challenges and Prospects. Early Warn. Anticip. Action. 2022, 71. Available online: https://wmo.int/media/magazine-article/artificial-intelligence-disaster-risk-reduction-opportunities-challenges-and-prospects (accessed on 22 May 2025).
  20. Tiwari, A.; Arun, A.; Vishwakarma, B.D. Parameter importance assessment improves efficacy of machine learning methods for predicting snow avalanche sites in Leh-Manali Highway, India. Sci. Total Environ. 2021, 794, 148738.
  21. Bao, S.; Liu, J.; Wang, L.; Konečný, M.; Che, X.; Xu, S.; Li, P. Landslide Susceptibility Mapping by Fusing Convolutional Neural Networks and Vision Transformer. Sensors 2023, 23, 88.
  22. Zhang, L.; Chen, J.; Zhang, H.; Huang, D. The prediction of dynamical quantities in granular avalanches based on graph neural networks. J. Chem. Phys. 2023, 159, 214901.
  23. Fromm, R.; Schönberger, C. Estimating the danger of snow avalanches with a machine learning approach using a comprehensive snow cover model. Mach. Learn. Appl. 2022, 10, 100405.
  24. Joshi, J.C.; Kaur, P.; Kumar, B.; Singh, A.; Satyawi, P.K. HIM-STRAT: A neural network-based model for snow cover simulation and avalanche hazard prediction over North-West Himalaya. Nat. Hazards 2020, 103, 1239–1260.
  25. Yariyan, P.; Omidvar, E.; Karami, M.; Cerdà, A.; Pham, Q.B.; Tiefenbacher, J.P. Evaluating novel hybrid models based on GIS for snow avalanche susceptibility mapping: A comparative study. Cold Reg. Sci. Technol. 2022, 194, 103453.
  26. Akay, H. Spatial modeling of snow avalanche susceptibility using hybrid and ensemble machine learning techniques. Catena 2021, 206, 105524.
  27. Mayer, S.; Techel, F.; Schweizer, J.; van Herwijnen, A. Prediction of natural dry-snow avalanche activity using physics-based snowpack simulations. Nat. Hazards Earth Syst. Sci. 2023, 23, 3445–3465.
  28. Olsen, O. Using Sentinel 1 C-Band SAR imagery to Detect Avalanches: An Analysis of Smaller Scale Avalanches and Proposed Algorithm. Bachelor’s Thesis, Harvard University Engineering and Applied Sciences, Boston, MA, USA, 2024.
  29. Gassner, M.; Brabec, B. Nearest neighbour models for local and regional avalanche forecasting. Nat. Hazards Earth Syst. Sci. 2002, 2, 247–253.
  30. Gauthier, F.; Germain, D.; Hétu, B. Logistic models as a forecasting tool for snow avalanches in a cold maritime climate: Northern Gaspésie, Québec, Canada. Nat. Hazards J. Int. Soc. Prev. Mitig. Nat. Hazards 2017, 89, 201–232.
  31. Chawla, M.; Singh, A. Data efficient Random Forest model for avalanche forecasting. Nat. Hazards Earth Syst. Sci. 2019, 379, 1–33.
  32. Yousefi, S.; Pourghasemi, H.R.; Emami, S.N.; Pouyan, S.; Eskandari, S.; Tiefenbacher, J.P. A machine learning framework for multi-hazards modeling and mapping in a mountainous area. Sci. Rep. 2020, 10, 12144.
  33. Campesato, O. Chapter 1: Introduction to Pandas. In Python 3 and Machine Learning Using ChatGPT/GPT-4, 1st ed.; Mercury Learning and Information: Boston, Berlin, 2024; pp. 1–35.
  34. Horton, S.; Herla, F.; Haegeli, P. Clustering simulated snow profiles to form avalanche forecast regions. Geosci. Model. Dev. 2025, 18, 193–209.
  35. Greenacre, M.; Groenen, P.J.F.; Hastie, T.; D’Enza, A.I.; Markos, A.; Tuzhilina, E. Principal component analysis. Nat. Rev. Methods Primers. 2022, 2, 100.
  36. Sen, S.; Saha, S.; Chaki, S.; Saha, P.; Dutta, P. Analysis of PCA based AdaBoost Machine Learning Model for Predict Mid-Term Weather Forecasting. Comput. Intell. Mach. Learn. 2021, 2, 41–52.
  37. Doan, Q.-V.; Amagasa, T.; Pham, T.-H.; Sato, T.; Chen, F.; Kusaka, H. Structural k-means (S k-means) and clustering uncertainty evaluation framework (CUEF) for mining climate data. Geosci. Model. Dev. 2023, 16, 2215–2233.
  38. Liu, J.; Zhang, T.; Hu, C.; Wang, B.; Yang, Z.; Sun, X.; Yao, S. A Study on Avalanche-Triggering Factors and Activity Characteristics in Aerxiangou, West Tianshan Mountains, China. Atmosphere 2023, 14, 1439.
  39. EAWS. Standards and Avalanche Problems; European Avalanche Warning Services: Davos, Switzerland, 2023; Available online: https://www.avalanches.org/standards/ (accessed on 17 May 2025).
  40. East Kazakhstan Region. Available online: https://www.kazhydromet.kz/uploads/files/68/file/5ec145aed3e93.pdf (accessed on 28 March 2025).
  41. Visit East Kazakhstan. Nature of Eastern Kazakhstan. Available online: https://visiteast.kz/en/o-vostochnom-kazaxstane/priroda-vostochnogo-kazaxstana.html (accessed on 22 May 2025).
  42. Kazhydromet. Climate of Kazakhstan. Available online: https://www.kazhydromet.kz/en/klimat/klimat-kazahstana (accessed on 22 May 2025).
  43. Schweizer, J.; Jamieson, J.B. Snowpack properties for snow profile analysis. Cold Reg. Sci. Technol. 2003, 37, 233–241.
  44. Yang, J.; Li, C.; Li, L.; Ding, J.; Zhang, R.; Han, T.; Liu, Y. Automatic detection of regional snow avalanches with scattering and interference of C-band SAR data. Remote Sens. 2020, 12, 2781.
  45. Fromm, R. Estimating the forecasting success of artificially triggering of avalanches with the combination of cluster and discriminant analysis. In Proceedings of the International Snow Science Workshop, Davos, Switzerland, 27 September–2 October 2009; Available online: https://arc.lib.montana.edu/snow-science/objects/issw-2009-0366-0370.pdf (accessed on 21 May 2025).
  46. Hanafi, N.; Saadatfar, H. A fast DBSCAN algorithm for big data based on efficient density calculation. Expert Syst. Appl. 2022, 203, 117501.
  47. Gupta, P.; Bagchi, A. Chapter 4: Introduction to NumPy. In Essentials of Python for Artificial Intelligence and Machine Learning; Synthesis Lectures on Engineering, Science, and Technology; Springer: Berlin/Heidelberg, Germany, 2024; pp. 127–159.
  48. Scikit-learn. Machine Learning in Python. Available online: https://scikit-learn.org/stable/ (accessed on 28 December 2024).
  49. Sial, A.; Rashdi, S.; Khan, A. Comparative analysis of data visualization libraries Matplotlib and Seaborn in Python. Int. J. Adv. Trends Comput. Sci. Eng. 2021, 10, 277–281.
  50. Sadenova, M.A.; Beisekenov, N.A.; Rakhymberdina, M.; Varbanov, P.S.; Klemeš, J.J. Mathematical Modelling in Crop Production to Predict Crop Yields. Chem. Eng. Trans. 2021, 88, 1225–1230.
  51. Cerruti, B.; Vives, E. Correlations in avalanche critical points. Phys. Rev. E 2009, 80, 011105.
  52. Umargono, E.; Suseno, J.; Vincensius Gunawan, S.K. K-Means Clustering Optimization Using the Elbow Method and Early Centroid Determination Based on Mean and Median Formula. In Proceedings of the 2nd International Seminar on Science and Technology (ISSTEC 2019), Yogyakarta, Indonesia, 25 November 2019.
  53. Migoya-Orué, Y.; Abe, O.E.; Radicella, S. Regional Spatial Mean of Ionospheric Irregularities Based on K-Means Clustering of ROTI Maps. Atmosphere 2024, 15, 1098.
Figure 1. Location map of avalanche prone areas in the territory of East Kazakhstan region.
Figure 2. Workflow for identifying patterns in the influence of meteorological parameters on avalanches.
Figure 3. Correlation matrix of meteorological data.
Figure 4. Feature loading plot.
Figure 5. Determination of the optimal number of clusters: (A) elbow method; (B) silhouette analysis.
Figure 6. K-means clustering of PCA data in PC1 and PC2 coordinates.
Figure 7. Distribution of data in the space of three main components.
Figure 8. K-Distance graph (search for optimal eps).
Figure 9. Results of DBSCAN.
Figure 10. Distribution of avalanches by clusters (a) cartogram; (b) histogram.
Table 1. Features of avalanche-prone areas based on Shuttle Radar Topography Mission (SRTM) data.
| Avalanche-Prone Area | Coordinates | Elevation Range, m | Slope Angle Range, Degrees | Aspect, Degrees (Direction) |
|---|---|---|---|---|
| Prokhodnaya | 49° 58′ 25″ N, 82° 57′ 04″ E | 400–712 | 30–35 | 200–220 (SW) |
| Tainty | 49° 19′ 53″ N, 83° 06′ 40″ E | 1030–1260 | 25–30 | 230–250 (SW) |
| Laily | 49° 07′ 59″ N, 83° 20′ 07″ E | 770–850 | 25–28 | 40–50 (NE) |
| Chekmar | 50° 23′ 11″ N, 83° 36′ 02″ E | 780–933 | 28–32 | 180–200 (SW) |
| Pikhtovka | 49° 43′ 34″ N, 83° 17′ 24″ E | 500–850 | 30–37 | 70–80 (NE), 220–230 (SW) |
| Sogornoe | 49° 15′ 48″ N, 85° 18′ 10″ E | 670–915 | 35–42 | 60–70 (NE) |
| Berel | 49° 28′ 36″ N, 86° 25′ 09″ E | 1315–2452 | 30–40 | 250–260 (SW) |
| Putintsevo pit | 49° 50′ 13″ N, 84° 19′ 16″ E | 430–650 | 20–30 | 40–50 (NE) |
| Bogatyrevskaya pit | 49° 49′ 25″ N, 84° 20′ 44″ E | 440–712 | 35–37 | 70–80 (NE) |
Note: SW: South-West; NE: North-East.
Table 2. Dataset statistics.
| Parameter | Mean | Min | 25% | 50% | 75% | Max | Std |
|---|---|---|---|---|---|---|---|
| Surface temperature, °C | −10.21 | −23 | −13.74 | −10.12 | −8.06 | 14.5 | 5.41 |
| Air temperature, °C | −9.03 | −20.87 | −12.11 | −9.28 | −6.78 | 3.08 | 4.67 |
| Snow depth, cm | 50.53 | 5 | 31.87 | 50.62 | 69.88 | 115 | 24.84 |
| Wind speed, m/s | 2.81 | 0.38 | 1.58 | 2.7 | 3.9 | 8.12 | 1.46 |
| Relative humidity, % | 74.52 | 53 | 70.92 | 75.43 | 78.71 | 90.43 | 7.71 |
| Precipitation, mm | 1.97 | 0 | 0.71 | 1.33 | 2.58 | 10.7 | 1.89 |
| Atmospheric pressure, hPa | 943.37 | 893.35 | 905.89 | 963.61 | 970.27 | 989.25 | 32.64 |
Table 3. Loadings on principal components.
| Principal Component | Surface Temp., °C | Air Temp., °C | Snow Depth, cm | Wind Speed, m/s | Relative Humidity, % | Precipitation, mm | Atmospheric Pressure, hPa |
|---|---|---|---|---|---|---|---|
| PC1 | 0.43 | 0.48 | 0.08 | 0.37 | −0.46 | −0.19 | −0.44 |
| PC2 | 0.42 | 0.34 | 0.51 | −0.24 | 0.24 | 0.54 | 0.17 |
| PC3 | −0.19 | −0.17 | 0.69 | −0.44 | −0.22 | −0.42 | −0.21 |
| PC4 | −0.35 | −0.32 | 0.37 | 0.55 | −0.21 | 0.52 | −0.17 |
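A quick consistency check on Table 3, offered as a minimal sketch rather than the authors' procedure: if the loadings are eigenvectors of the correlation matrix (an assumption, since the table does not state how they were scaled), each row should have length close to 1 and different rows should be nearly orthogonal. The snippet copies the printed two-decimal values, so small deviations due to rounding are expected. The names `FEATURES`, `LOADINGS`, `dot`, and `top_features` are illustrative and do not come from the paper.

```python
import math

# Column order follows Table 3; the short names are illustrative.
FEATURES = ["surface_temp", "air_temp", "snow_depth", "wind_speed",
            "rel_humidity", "precipitation", "pressure"]

# Loadings copied from Table 3 (rounded to two decimals in the source).
LOADINGS = {
    "PC1": [0.43, 0.48, 0.08, 0.37, -0.46, -0.19, -0.44],
    "PC2": [0.42, 0.34, 0.51, -0.24, 0.24, 0.54, 0.17],
    "PC3": [-0.19, -0.17, 0.69, -0.44, -0.22, -0.42, -0.21],
    "PC4": [-0.35, -0.32, 0.37, 0.55, -0.21, 0.52, -0.17],
}


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def top_features(row, n=3):
    """Return the n (feature, loading) pairs with the largest absolute loading."""
    ranked = sorted(zip(FEATURES, row), key=lambda p: abs(p[1]), reverse=True)
    return ranked[:n]


# Euclidean length of each loading vector (expected to be close to 1).
norms = {pc: math.sqrt(dot(row, row)) for pc, row in LOADINGS.items()}

# Largest absolute dot product between two different components
# (expected to be close to 0 for orthogonal eigenvectors).
names = list(LOADINGS)
max_cross = max(
    abs(dot(LOADINGS[a], LOADINGS[b]))
    for i, a in enumerate(names)
    for b in names[i + 1:]
)

for pc in names:
    print(pc, round(norms[pc], 3), top_features(LOADINGS[pc]))
print("largest |dot product| between components:", round(max_cross, 3))
```

With the values as printed, the rows are close to unit length and nearly orthogonal; PC1 is led by air temperature, relative humidity, and atmospheric pressure, while PC3 is led by snow depth.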
Table 4. Mean values of clusters.
| Cluster | Surface Temp., °C | Air Temp., °C | Snow Depth, cm | Wind Speed, m/s | Relative Humidity, % | Precipitation, mm | Atmospheric Pressure, hPa |
|---|---|---|---|---|---|---|---|
| 0 | −12.84 | −11.62 | 31.04 | 2.41 | 79.32 | 2.29 | 973.96 |
| 1 | −7.64 | −5.04 | 60.89 | 3.65 | 67.26 | 1.03 | 906.69 |
| 2 | −11.11 | −10.14 | 77.91 | 1.79 | 78.89 | 2.97 | 966.89 |
| 3 | −12.23 | −10.23 | 20.79 | 3.68 | 72.42 | 1.52 | 917.07 |
Table 5. Number of avalanches in each cluster.
| Cluster | Number of Avalanches |
|---|---|
| 0 | 33 |
| 1 | 34 |
| 2 | 28 |
| 3 | 16 |
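Tables 2, 4, and 5 can be cross-checked against each other: the mean of each variable over all avalanches should equal the average of the cluster means weighted by cluster size. The following is a minimal sketch using only the printed values; the variable and function names are illustrative, and two-decimal rounding in the source limits the precision of the comparison.

```python
# Cluster sizes from Table 5.
COUNTS = {0: 33, 1: 34, 2: 28, 3: 16}

# Column order shared by Tables 2 and 4; the short names are illustrative.
FEATURES = ["surface_temp", "air_temp", "snow_depth", "wind_speed",
            "rel_humidity", "precipitation", "pressure"]

# Cluster means from Table 4.
CLUSTER_MEANS = {
    0: [-12.84, -11.62, 31.04, 2.41, 79.32, 2.29, 973.96],
    1: [-7.64, -5.04, 60.89, 3.65, 67.26, 1.03, 906.69],
    2: [-11.11, -10.14, 77.91, 1.79, 78.89, 2.97, 966.89],
    3: [-12.23, -10.23, 20.79, 3.68, 72.42, 1.52, 917.07],
}

# "Mean" column of Table 2, in the same feature order.
OVERALL_MEANS = [-10.21, -9.03, 50.53, 2.81, 74.52, 1.97, 943.37]


def weighted_means(cluster_means, counts):
    """Average the cluster means, weighting each cluster by its size."""
    total = sum(counts.values())
    n_features = len(next(iter(cluster_means.values())))
    return [
        sum(counts[c] * cluster_means[c][j] for c in counts) / total
        for j in range(n_features)
    ]


recomputed = weighted_means(CLUSTER_MEANS, COUNTS)

print("total avalanches:", sum(COUNTS.values()))
for name, got, ref in zip(FEATURES, recomputed, OVERALL_MEANS):
    print(f"{name:14s} recomputed={got:8.2f}  table2={ref:8.2f}  diff={got - ref:+.2f}")
```

With these inputs the cluster sizes sum to 111, and six of the seven recomputed means fall within about 0.01 of Table 2. Surface temperature comes out near −10.72 °C against the tabulated −10.21 °C, a gap that the printed tables alone do not explain.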
Table 6. Clustering by EAWS types.
| Cluster | Description | EAWS Type |
|---|---|---|
| 0 | Gradual accumulation of snow without wind | New Snow Problem |
| 1 | Warming and thawing | Wet Snow Problem |
| 2 | High snow cover | Persistent Weak Layer |
| 3 | Wind and precipitation | Wind Slab |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Rakhymberdina, M.; Denissova, N.; Bekishev, Y.; Daumova, G.; Konečný, M.; Assylkhanova, Z.; Kapasov, A. Investigation of the Regularities of the Influence of Meteorological Factors on Avalanches in Eastern Kazakhstan. Atmosphere 2025, 16, 723. https://doi.org/10.3390/atmos16060723
