Harnessing TabTransformer Model and Particle Swarm Optimization Algorithm for Remote Sensing-Based Heatwave Susceptibility Mapping in Central Asia

Wang, Antao; Sun, Linan; Jia, Huicong

doi:10.3390/atmos16101166

by Antao Wang ¹, Linan Sun ² and Huicong Jia ³,⁴,*

¹ Department of Cyber Security, Henan Police College, Zhengzhou 450046, China

² School of Urban and Rural Planning, Henan University of Economics and Law, Zhengzhou 450046, China

³ International Research Center of Big Data for Sustainable Development Goals, Beijing 100094, China

⁴ Key Laboratory of Digital Earth Science, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China

* Author to whom correspondence should be addressed.

Atmosphere 2025, 16(10), 1166; https://doi.org/10.3390/atmos16101166

Submission received: 14 August 2025 / Accepted: 2 October 2025 / Published: 7 October 2025

(This article belongs to the Special Issue Environmental Footprints of Drought: Focusing on Emerging Issues and Their Underlying Mechanisms (2nd Edition))

Abstract

This study pioneers a fully remote sensing-based framework for mapping heatwave susceptibility, integrating the TabTransformer deep learning model with Particle Swarm Optimization (PSO) for robust hyperparameter tuning. The central question addressed is whether a fully remote sensing-driven, PSO-optimized TabTransformer can achieve accurate, scalable, and spatially detailed heatwave susceptibility mapping in data-scarce regions such as Central Asia. Utilizing ERA5-derived heatwave evidence and thirteen environmental and socio-economic predictors, the workflow produces high-resolution susceptibility maps spanning five Central Asian countries. Comparative analysis evidences that the PSO-optimized TabTransformer model outperforms the baseline across multiple metrics. On the test set, the optimized model achieved an RMSE of 0.123, MAE of 0.034, and R² of 0.938, outperforming the standalone TabTransformer (RMSE = 0.132, MAE = 0.038, R² = 0.93). Discriminative capacity also improved, with AUROC increasing from 0.933 to 0.940. The PSO-tuned model delivered faster convergence, lower final loss, and more stable accuracy during training and validation. Spatial outputs reveal heightened susceptibility in southern and southwestern sectors—Turkmenistan, Uzbekistan, southern Kazakhstan, and adjacent lowlands—with statistically significant improvements in spatial precision and class delineation confirmed by Chi-squared, Friedman, and Wilcoxon tests, all with congruent p-values of <0.0001. Feature importance analysis consistently identifies maximum temperature, frequency of hot days, and rainfall as dominant predictors. These advancements validate the potential of data-driven, deep learning approaches for reliable, scalable environmental hazard assessment, crucial for climate adaptation planning in vulnerable regions.

Keywords:

natural hazards; deep learning; metaheuristic techniques; evolutionary optimization; attention algorithm; risk management

1. Introduction

Heatwaves, defined as prolonged periods of excessively high temperatures, represent one of the most severe climate-induced hazards of the 21st century [1,2]. Their frequency, duration, and spatial extent have increased markedly in recent decades, particularly in regions already experiencing arid and semi-arid conditions [3]. Heatwaves have well-established and significant effects on public health, playing a major role in increasing heat-related deaths and illnesses across diverse regions such as Europe, the United States, Russia, and Korea [4,5]. Between 2000 and 2016, heatwaves affected approximately 125 million additional people worldwide and contributed to over 166,000 heat-related deaths in the preceding decade, with low-income countries facing the greatest risks due to limited capacity for adaptation and response [6]. Heatwaves are driving considerable economic and ecological losses globally by disrupting agriculture, reducing labor productivity, straining infrastructure, and damaging ecosystems. In Europe, heat-related damages have already reached up to 0.5% of GDP, with projections suggesting a fivefold increase by 2060 if climate action remains insufficient [7,8]. Agriculture is particularly affected, as extreme heat and drought conditions cause significant reductions in crop yields—especially in wheat and maize [9,10,11,12]. Labor productivity also declines during heatwaves, especially in low- and middle-income countries, slowing economic growth [13,14,15,16]. Moreover, heatwaves inflict lasting damage on ecosystems, triggering mass tree die-offs, coral bleaching, and wildlife losses, as seen in a Western Australian event that affected over 300,000 km² of land and sea [11,17]. The broader financial toll includes heightened wildfire threats, increased power outages, and stressed water supplies, as demonstrated by the 2010 Russian heatwave, which resulted in US$15 billion in damages and extensive land loss [18,19,20,21,22].

Central Asia, characterized by its continental climate, complex topography, and socio-economic vulnerabilities, has emerged as a hotspot for heatwave intensification. The region’s exposure is exacerbated by rapid urban expansion, land use changes, climate change, and limited adaptive capacity, all of which amplify the risks associated with extreme heat events [5,6,23]. From declining crop yields and water shortages to elevated mortality and morbidity rates [4], the cascading impacts of heatwaves in Central Asia are multifaceted, urgent, and insufficiently mapped. Despite the rising threat, heatwave susceptibility assessments in Central Asia remain sparse, fragmented, and often constrained by the lack of comprehensive and high-resolution spatial data. Conventional studies have largely relied on meteorological station data, static indices, and machine learning models [24,25,26,27,28], which are limited in spatial coverage and unable to capture the dynamic, multi-dimensional nature of heatwave causality. This presents a substantial gap in hazard science, where a robust, spatially explicit, and predictive approach is critical for proactive risk management and mitigation strategies. Recent scholarship has broadened the scope of heatwave research to encompass vulnerability assessment, climate risk modeling, and the operationalization of adaptation strategies in diverse contexts. Case studies highlight adaptation planning in European cities [29,30], and the role of socio-economic capacity in shaping heatwave resilience [31]. Advances in modeling approaches combine machine learning and remote sensing for urban heat vulnerability mapping [32], and employ climate modeling to project future heatwave risks at national and regional scales [33,34,35,36,37]. Together, these works underline the imperative of integrating physical hazard mapping with socio-economic vulnerability metrics to support climate adaptation. 
A key challenge lies in integrating diverse environmental, climatic, and anthropogenic variables into a unified framework that can identify areas most prone to heatwave occurrence and intensity. Remote sensing technologies offer an effective solution to this challenge [38]. By capturing consistent, repeatable, and wide-area observations, satellite-based data enable the extraction of essential biophysical indicators that drive heatwave dynamics. Variables such as land surface temperature (LST), vegetation indices (e.g., NDVI), soil moisture, built-up density, albedo, and elevation can serve as either evidence layers or causative factors in susceptibility mapping [39,40,41,42,43]. These data sources are particularly valuable in Central Asia, where ground-based observations are often sparse or inaccessible [44]. The ability of remote sensing to characterize environmental gradients at fine resolutions provides a powerful foundation for developing sophisticated and regionally adapted models of heatwave susceptibility [45].

Parallel to the advancement of remote sensing is the rapid evolution of artificial intelligence, particularly deep learning, which has transformed the modeling landscape in geospatial sciences. Deep learning models are uniquely capable of handling complex, nonlinear interactions and high-dimensional datasets that characterize environmental systems [43,46]. Among these, the TabTransformer model has recently gained attention for its superior performance in tabular data analysis, combining the interpretability of decision trees with the deep representational power of transformer-based architectures [47,48]. Its capacity to handle mixed-type inputs (continuous and categorical), learn contextual relationships, and generalize well with limited labeled data makes it an ideal candidate for susceptibility modeling where data heterogeneity is high [49,50]. However, deep learning models are notoriously sensitive to hyperparameter settings. Suboptimal configurations can significantly degrade model performance, interpretability, and computational efficiency. To address this, metaheuristic optimization algorithms, particularly Particle Swarm Optimization (PSO), provide an efficient mechanism for global hyperparameter tuning. Inspired by the social behavior of bird flocking, PSO balances exploration and exploitation to converge on optimal solutions in complex search spaces. When integrated with deep learning, PSO can significantly enhance model convergence, generalizability, and classification accuracy in predictive mapping tasks [51,52].

This study proposes a novel framework that combines the TabTransformer deep learning architecture with PSO-based optimization to develop a high-resolution, remote sensing-driven heatwave susceptibility map for Central Asia. All input data layers, whether evidence or causative, are derived exclusively from satellite observations and processed to maintain spatial consistency and thematic relevance. The hybrid model not only leverages the spatiotemporal strengths of remote sensing but also harnesses the computational intelligence of deep learning and evolutionary optimization [52,53,54,55]. By targeting an underexplored yet highly vulnerable geographic region, this research offers methodological innovation and regional insight that can inform climate adaptation policies, emergency planning, and sustainable land management strategies. It fills a critical gap in hazard assessment by providing an interpretable, scalable, and data-rich approach to understanding and mitigating heatwave risks under accelerating climate change. Previous efforts to map heatwave hazard and susceptibility have primarily relied on station-based indices and gridded meteorological datasets [24,25,26,27,28], often applying empirical or statistical thresholds for extreme heat events [2,4,5]. For instance, Perkins and Alexander [56] provide a comprehensive review of heatwave measurement methodologies and station-based index applications across diverse climatic settings. Classic machine learning algorithms such as Random Forests, Support Vector Machines, and logistic regression have been utilized to improve predictive mapping, yet they remain limited in spatial coverage, scalability, or in capturing non-linear interactions and high-dimensional data [24,25,26,27,28,43]. In response to such limitations, researchers have increasingly explored novel combinations of data sources and modelling strategies. 
Beyond heatwave-specific research, recent studies have demonstrated the potential of integrating machine learning with high-resolution reanalysis datasets for environmental and atmospheric applications. For example, Shikhovtsev et al. [57] employed neural networks trained on meteorological parameters from ERA5 to model atmospheric optical turbulence (“seeing”), achieving improved prediction skill over purely physical parameterizations. Similarly, Sun et al. [58] applied a random-forest-based correction of ERA5 precipitation estimates using dense gauge networks across the Third Pole region, producing a 70-year, 10-km gridded dataset that substantially improved hydrological model performance. These studies illustrate how coupling reanalysis data with machine learning algorithms can enhance predictive accuracy in complex environmental systems, an approach that directly supports the methodology adopted in the present work. Recent studies have also highlighted the intensification and impacts of severe heatwaves, with Russo et al. [59] characterizing the top European events since 1950, and Murari et al. [60] examining future risks and mortality outcomes in India using statistical and ML-based approaches. More recent advances have explored the use of remote sensing products to overcome data sparsity and provide spatially consistent, wide-area biophysical variables (including LST, NDVI, soil moisture, and urban/built-up indices) as key inputs for hazard assessment [38,39,40,41,42,43,44]. Tomlinson et al. [61] demonstrate the use of MODIS-based LST data for detailed urban heat island mapping, while Ahmadalipour and Moradkhani [62] leverage satellite observations to quantify escalating heat-stress mortality risk in the Middle East and North Africa. Nevertheless, most approaches remain constrained by either static indices or regionally fragmented data [44,45].
Smid et al. [63] employ both remote sensing and city-level observational data to rank European capitals by their exposure to heatwaves, highlighting methodological diversity but also fragmentation. The integration of deep learning in environmental hazard mapping is an emerging direction [43,46], with transformer-based models such as TabTransformer recently showing strong potential for handling diverse tabular geospatial data [47,48,49,50]. However, the application of such advanced models, especially in combination with metaheuristic optimization techniques like PSO, for large-scale, remote sensing-based heatwave susceptibility mapping is exceedingly rare or absent from the literature [51,52,53,54,55]. Consequently, our study addresses these gaps by proposing a fully remote sensing-driven and metaheuristically optimized TabTransformer framework, designed for high-resolution and transferable susceptibility mapping across the trans-boundary and data-scarce environment of Central Asia.

While various machine learning models have been applied to heatwave risk mapping, most rely on conventional algorithms such as RF, SVM, or logistic regression using mixed meteorological and static variables, often with limited scalability to heterogeneous, data-scarce regions. The TabTransformer offers a unique capacity to model complex, non-linear interactions in mixed-type, RS-derived tabular datasets by encoding contextual relationships between features—an aspect underexplored in climate hazard mapping. Its integration with PSO enables efficient, wide-range hyperparameter search, improving convergence stability and generalization in high-dimensional geospatial contexts. Despite the maturity of transformers and PSO separately, no prior work combines them for fully remote sensing–based heatwave susceptibility mapping, nor applies them in the trans-boundary, data-limited landscapes of Central Asia. This coupling not only enhances methodological robustness but also supports location-specific and country-level interpretations, enabling targeted adaptation strategies at sub-national scales (Section 4.6 and Section 4.7) and informing management and policy pathways for heatwave preparedness in arid and semi-arid environments (Section 4.8). The relevance of this framework extends beyond methodological innovation, as Section 4.1 provides a detailed examination of the underlying hazard dynamics, identifying key atmospheric and surface drivers of heatwave susceptibility in Central Asia and comparing their roles with findings from other climatic contexts.

The central research question guiding this study is whether a fully remote sensing–driven, PSO-optimized TabTransformer framework can reliably produce high-resolution, spatially transferable heatwave susceptibility maps in a data-scarce, trans-boundary region such as Central Asia. Accordingly, the primary objectives of this study are fourfold: (1) to develop a comprehensive heatwave susceptibility mapping framework for Central Asia using entirely remote sensing-derived indicators that reflect climatic, environmental, and anthropogenic drivers; (2) to employ the TabTransformer deep learning model for robust classification and pattern recognition in high-dimensional tabular geospatial datasets; (3) to enhance model performance and generalizability through Particle Swarm Optimization (PSO)-based hyperparameter tuning; and (4) to identify and spatially delineate regions of varying heatwave susceptibility and their contributing factors, thereby providing critical insights for targeted risk reduction, policy-making, and climate adaptation planning. By integrating cutting-edge deep learning with satellite-based environmental monitoring, this research contributes a novel, scalable, and transferable methodology to the growing field of climate hazard assessment.

2. Materials and Methods

2.1. Study Area

Central Asia, comprising Kazakhstan, Kyrgyzstan, Tajikistan, Turkmenistan, and Uzbekistan, is a vast landlocked region situated within the heart of the Eurasian continent (Figure 1). Its continental climate—with extreme temperature fluctuations—is shaped by its distance from the ocean, strong seasonal contrasts, and diverse orography [64]. The region spans semi-arid steppes, arid deserts, and high-altitude mountain zones such as the Tian Shan, Pamir, and Alai ranges. Annual precipitation varies widely—from less than 100 mm in low deserts to several hundred millimeters in foothills and upland areas—peaking in early spring and declining sharply toward summer. Climatically, Central Asia is recognized as one of the most rapidly warming regions globally, with its summers becoming hotter and longer, and its winters experiencing shortening periods of freezing conditions. Specifically, Kazakhstan has undergone an average warming rate of approximately 0.31 °C per decade from 1950 to 2020 [64]. Documented heatwaves in the region increased significantly between 1981 and 2020, particularly in the eastern and southwestern zones, where both intensity and duration have risen steadily [65]. These conditions exacerbate soil moisture depletion, triggering compound drought–heatwave events whose agricultural and hydrological consequences are amplified.

In March 2025, an exceptional early spring heatwave impacted Central Asia, with maximum daily temperatures approaching or exceeding 30 °C in lowland areas and minima remaining unusually warm even at elevations of 1000 m [66]. Record anomalies during this event ranged from 10 to 15 °C above climatological March norms, making it the “second-largest” warming event globally since 1990, behind only a heatwave in South America. Attribution studies confirm that anthropogenic influences accounted for at least 4 °C of this temperature surge, and that such events are becoming more likely given the rapid warming of spring months. The socio-environmental ramifications in Central Asia are pressing. Many inhabitants rely on rainfed or glacier-fed agriculture, and approximately half of workers in Tajikistan and Uzbekistan are employed in farming, making timing-sensitive crop stages highly vulnerable to unexpected heat. Moreover, upstream glacial melt in the Tian Shan and Pamir systems, accelerated by rising temperatures, is reducing long-term water security. Urban centers like Tashkent and Bishkek, already experiencing rapid urbanization and increasing Urban Heat Island effects, face heightened public health and energy stress during heat extremes. Hence, Central Asia presents a compelling region for heatwave susceptibility mapping: its continental climate, rising temperature trends, varied ecotypes (from semi-arid steppes to alpine mountain ranges), and socio-economic reliance on temperature-sensitive systems create a multidimensional hazard context. The March 2025 heatwave and the broader warming trend underscore the urgent need for spatially explicit, data-rich mapping approaches to inform regional resilience planning.

2.2. Data Sources and Processing

This study adopts an integrated, data-driven framework to map heatwave susceptibility across Central Asia, leveraging remote sensing data, the TabTransformer deep learning model, and Particle Swarm Optimization (PSO) (Figure 2).

The approach emphasizes full remote sensing dependency, from generating the evidence layer of past heatwave occurrences to extracting the causative factors, ensuring consistency, scalability, and applicability over vast and diverse landscapes. The workflow involves five main stages: (1) extraction of a binary heatwave evidence layer using the ERA5 atmospheric reanalysis dataset, which assimilates satellite observations, (2) selection and preprocessing of key environmental and anthropogenic causative factors from multi-source remote sensing data, (3) construction of a TabTransformer model tailored to capture complex patterns in structured geospatial inputs, (4) optimization of model hyperparameters via PSO, and (5) production of a continuous susceptibility map indicating the spatial probability of future heatwave exposure. A key component of the study is the comparative analysis between the standalone TabTransformer model and its PSO-coupled counterpart, aiming to assess the impact of automated hyperparameter tuning on predictive performance. The use of TabTransformer allows for nuanced representation of variable interactions through attention mechanisms, while PSO ensures efficient and effective model calibration. Together, they provide a robust analytical backbone for heatwave risk prediction, enabling the generation of scientifically grounded, spatially explicit susceptibility outputs to inform climate adaptation and regional planning efforts.
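Stage (4) of the workflow can be illustrated with a minimal global-best PSO sketch. It is not the authors' implementation: the swarm size, inertia and acceleration coefficients, the two tuned hyperparameters (log10 learning rate and dropout), and the quadratic `toy_validation_loss` that stands in for a real validation loss are all assumptions made for illustration.

```python
import random

def pso_minimize(objective, bounds, n_particles=12, n_iters=60,
                 inertia=0.7, c_cognitive=1.5, c_social=1.5, seed=0):
    """Minimise objective(position) inside box bounds with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull towards personal best + pull towards swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_cognitive * r1 * (pbest[i][d] - pos[i][d])
                             + c_social * r2 * (gbest[d] - pos[i][d]))
                # Clamp the updated position to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def toy_validation_loss(params):
    """Hypothetical stand-in for a validation loss; minimum at log10(lr) = -3, dropout = 0.2."""
    log_lr, dropout = params
    return (log_lr + 3.0) ** 2 + (dropout - 0.2) ** 2

best_params, best_loss = pso_minimize(
    toy_validation_loss, bounds=[(-5.0, -1.0), (0.0, 0.5)])
```

In a real tuning run, `toy_validation_loss` would be replaced by a function that trains the model with the candidate hyperparameters and returns its validation error, so each objective call is expensive and the swarm size and iteration count become a budget decision.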

2.2.1. Evidence Layer: Heatwave Occurrence Mapping

To construct the binary evidence layer required for supervised modeling, this study utilized the Heat Wave Index (HWI) derived from the ERA5 reanalysis dataset, accessed through the World Bank Climate Knowledge Portal (https://climateknowledgeportal.worldbank.org/; accessed on 20 June 2025) (Figure 3) [15]. ERA5 provides hourly global atmospheric reanalysis data at a spatial resolution of approximately 31 km (~0.25°), making it a reliable source for identifying extreme temperature events over multi-decadal periods. The dataset is produced by the European Centre for Medium-Range Weather Forecasts (ECMWF) through the assimilation of a wide range of satellite observations (including those from NOAA polar orbiters, Meteosat geostationary satellites, and instruments such as ATOVS) alongside in situ measurements and other conventional observations into a global numerical weather prediction model [67,68,69,70]. This multi-source integration corrects for observational gaps, ensures spatial consistency, and yields data products validated against independent reference networks, providing high accuracy for variables such as temperature, humidity, and wind. The hourly temporal sampling captures diurnal and synoptic variability critical for heatwave identification, while the 31 km resolution enables robust regional-scale mapping. Our study period spans 1991–2020, covering three decades of climate variability across Central Asia, during which heatwave events were identified based on exceedance of temperature thresholds sustained over consecutive days, consistent with established climatological definitions. In this study, the Heat Wave Index (HWI) was derived from ERA5 daily maximum 2 m temperature fields ($T_{\max,d}$). Following commonly adopted climatological practice [56,59], a heatwave was defined as a period of ≥3 consecutive days where $T_{\max,d}$ exceeded the 90th percentile ($T_{90}$) of daily maximum temperatures for the corresponding calendar day, calculated over a 30-year baseline (1991–2020) for each grid cell. The binary HWI for day $d$ was computed as (Equation (1)):

$$\mathrm{HWI}_{d} = \begin{cases} 1, & \text{if } T_{\max,d} > T_{90} \text{ for at least 3 consecutive days} \\ 0, & \text{otherwise} \end{cases} \qquad (1)$$

where $T_{90}$ is location- and calendar-day specific, ensuring the index reflects local climatology. This approach captures both the duration and intensity aspects of extreme temperature events while providing spatially consistent heatwave occurrence data across the study area.
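A minimal sketch of how Equation (1) can be evaluated for one grid cell is given below. It assumes that every day belonging to a qualifying run is flagged with 1, which is one reading of the definition, and that the percentile uses linear interpolation between order statistics; the text specifies neither. The function names and sample temperatures are illustrative only.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    s = sorted(values)
    rank = (len(s) - 1) * q / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (rank - lo)

def heatwave_index(tmax, t90, min_run=3):
    """Return a 0/1 list: 1 where the day belongs to a run of at least
    min_run consecutive days with tmax[d] > t90[d]."""
    exceed = [t > q for t, q in zip(tmax, t90)]
    hwi = [0] * len(exceed)
    run_start = None
    for d, flag in enumerate(exceed + [False]):  # sentinel closes a trailing run
        if flag and run_start is None:
            run_start = d
        elif not flag and run_start is not None:
            if d - run_start >= min_run:
                for k in range(run_start, d):
                    hwi[k] = 1
            run_start = None
    return hwi

# Hypothetical daily maxima (deg C) against a constant 35 deg C threshold.
# In practice t90[d] would be percentile(baseline_values_for_calendar_day, 90).
tmax = [30, 36, 37, 38, 31, 36, 36, 30]
t90 = [35] * len(tmax)
hwi = heatwave_index(tmax, t90)  # [0, 1, 1, 1, 0, 0, 0, 0]
```

The three-day run (days 1 to 3) is flagged, while the two-day exceedance (days 5 and 6) is not, which is the duration criterion at work.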

We extracted 200 georeferenced points throughout the study area, labeling each as either ‘heatwave-affected’ or ‘non-affected’ to form a binary classification set. Samples were distributed to represent different elevations, land cover types, and climate zones. The dataset was subsequently split into training and testing subsets using a 70:30 ratio [71,72,73]. This stratified division preserved the balance between positive and negative cases in both sets, ensuring that the machine learning model could generalize beyond the training data. These labeled samples formed the target variable input for training the TabTransformer model in the subsequent stages of analysis. To minimize overfitting risks and ensure model robustness, particularly given the limited dataset size, we applied a 10-fold cross-validation procedure in which the model was iteratively trained and validated on different data partitions, with each sample serving as validation exactly once.
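The sampling design above can be sketched as follows. The class balance (100 affected, 100 non-affected), the random seeds, and the choice to run the 10 folds over the training portion are assumptions for illustration; the text reports only the 200 points, the 70:30 ratio, and the use of 10-fold cross-validation.

```python
import random

def stratified_split(labels, test_fraction=0.3, seed=42):
    """Return (train_idx, test_idx), sampling test_fraction of each class."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]
    return sorted(train_idx), sorted(test_idx)

def k_fold_indices(indices, k=10, seed=42):
    """Yield (train, val) index lists; each index is used for validation exactly once."""
    shuffled = indices[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[f::k] for f in range(k)]
    for f in range(k):
        val = folds[f]
        train = [i for g in range(k) if g != f for i in folds[g]]
        yield train, val

# Hypothetical labels: 100 heatwave-affected (1) and 100 non-affected (0) points.
labels = [1] * 100 + [0] * 100
train_idx, test_idx = stratified_split(labels)
folds = list(k_fold_indices(train_idx))
```

Note that `k_fold_indices` shuffles without stratifying, so individual folds may be slightly imbalanced; a stratified variant would apply the same per-class logic as `stratified_split`.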

2.2.2. Causative Factors for Heatwave Susceptibility

The selection of causative factors for modeling heatwave susceptibility was grounded in empirical evidence and previous climatological, environmental, and hazard susceptibility research. A total of thirteen variables were employed, each representing a biophysical or anthropogenic characteristic known to influence the likelihood, intensity, or spatial occurrence of heatwave events. These factors were carefully chosen to reflect both static and dynamic environmental conditions across Central Asia, a region marked by heterogeneous topography, continental climate regimes, and varying degrees of human influence. Topographic variables (elevation, slope, aspect) were obtained from SRTM to capture how terrain influences heat retention. Vegetation and surface reflectance (NDVI, albedo, LST) came from Landsat-8 (2015–2024). These indicators reflect vegetation health, surface energy balance, and local thermal dynamics that affect heatwaves. Land cover data extracted from ESA Sentinel-2 products (2017–2023) was included to distinguish between urban, agricultural, and natural surfaces, each with distinct thermal properties. In parallel, population density data—sourced from WorldPop—served as a proxy for anthropogenic heat emissions and population exposure, both critical for heatwave impact assessments in urban and peri-urban environments. To capture climatic and meteorological conditions directly linked to heatwave formation, four variables were obtained from the ERA5 reanalysis dataset (1991–2020): maximum temperature, total rainfall, relative humidity, and the number of hot days (defined as days exceeding 30 °C). Additionally, the number of days with a Heat Index above 35 °C was included to account for perceived thermal stress, which integrates both temperature and humidity effects. The complete list of variables, along with their respective data sources and temporal coverage, is presented in Table 1. 
Thematic representations of the associated causative factors are provided in Figure 4. Our dataset combined high-resolution sources (e.g., Landsat-8 at 30 m, Sentinel-2 at 10 m) with coarser datasets (e.g., ERA5 at 0.25° ≈ 27.75 km, WorldPop at 1 km). To ensure spatial consistency, all predictors were resampled to a unified 30 m grid, chosen as a compromise between spatial detail and data volume, using bilinear interpolation for coarse-to-fine transitions. Examples include interpolating ERA5 maximum temperature, rainfall, and humidity layers, as well as 1 km population density, to match the Landsat/Sentinel grid. This method preserved detail from fine-resolution inputs and produced outputs suitable for local-scale decision-making. Coarse-to-fine resampling can introduce artificial spatial patterns, smooth gradients, and inflate certainty in areas dominated by coarse inputs. The literature notes these risks (e.g., Yu and Chen [74]) but also finds that fine-resolution mapping generally improves model discrimination compared with uniformly coarse resampling. The alternative, downsampling all data to the coarsest resolution, prevents false detail but discards valuable high-resolution information, reducing spatial accuracy. Thus, our use of coarse-to-fine bilinear interpolation was a deliberate choice to balance spatial detail and interpretability. Further, categorical variables were encoded as necessary, and samples with missing values were omitted prior to model training. Continuous variables were standardized using z-score normalization to remove scale effects, ensuring comparability across predictors. This combination of remote sensing and reanalysis-based indicators offers a comprehensive and multi-dimensional view of heatwave causality, setting a robust foundation for the deep learning modeling phase that follows.
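The two preprocessing steps described here, bilinear coarse-to-fine resampling and z-score standardization, can be sketched in a few lines. This is an illustrative simplification: it aligns grids on their nodes rather than pixel centres, ignores map projections, and uses the population standard deviation, none of which is specified in the text. The 2 × 2 temperature grid is hypothetical.

```python
def bilinear_sample(grid, row, col):
    """Bilinearly interpolate a 2-D list `grid` at fractional (row, col)."""
    r0, c0 = int(row), int(col)
    r1 = min(r0 + 1, len(grid) - 1)
    c1 = min(c0 + 1, len(grid[0]) - 1)
    fr, fc = row - r0, col - c0
    top = grid[r0][c0] * (1 - fc) + grid[r0][c1] * fc
    bottom = grid[r1][c0] * (1 - fc) + grid[r1][c1] * fc
    return top * (1 - fr) + bottom * fr

def upsample(grid, factor):
    """Resample a coarse grid onto a grid `factor` times denser along each axis."""
    n_rows = (len(grid) - 1) * factor + 1
    n_cols = (len(grid[0]) - 1) * factor + 1
    return [[bilinear_sample(grid, i / factor, j / factor) for j in range(n_cols)]
            for i in range(n_rows)]

def z_score(values):
    """Standardise a list to zero mean and unit (population) standard deviation.
    Assumes the values are not all identical."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / std for v in values]

# Hypothetical coarse maximum-temperature cells (deg C).
coarse = [[30.0, 34.0], [32.0, 36.0]]
fine = upsample(coarse, 2)
# fine == [[30.0, 32.0, 34.0], [31.0, 33.0, 35.0], [32.0, 34.0, 36.0]]
standardized = z_score([v for row in fine for v in row])
```

The interpolated cells vary smoothly between the four coarse values and carry no information beyond them, which is the "false detail" caveat raised in the paragraph above.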

2.3. TabTransformer Model Architecture

The TabTransformer model was employed as the core deep learning architecture for predicting heatwave susceptibility across Central Asia. Originally developed to handle structured tabular data, TabTransformer integrates the strengths of both deep neural networks and Transformer-based attention mechanisms [75], making it particularly suited for high-dimensional, heterogeneous geospatial datasets such as those used in this study. At the heart of the TabTransformer architecture lies its ability to model complex, non-linear interactions between features while capturing contextual dependencies through self-attention layers. Unlike conventional feed-forward models that treat each input feature independently, TabTransformer embeds categorical and continuous variables into a shared latent space where attention is applied across feature tokens. This allows the model to learn contextual relationships among causative variables—such as the combined effect of land cover, albedo, and elevation on localized heatwave dynamics.

The model architecture includes three main components: (1) an embedding layer that transforms numerical and categorical inputs into dense vectors; (2) a series of Transformer encoder blocks with multi-head self-attention mechanisms that enable inter-feature interaction modeling; and (3) a fully connected neural network head responsible for the final prediction of heatwave susceptibility [75,76,77]. Dropout and layer normalization techniques were used throughout the network to improve generalization and prevent overfitting. To accommodate the structured format of geospatial data, continuous variables (e.g., LST, NDVI, slope) were normalized and passed directly into the embedding layer, while categorical variables (e.g., land cover classes) were one-hot encoded and then embedded. The final output layer employed a sigmoid activation function to produce a probabilistic score for each input sample, indicating the relative susceptibility to heatwave occurrence. The interpretability of TabTransformer is another advantage, as attention weights can be analyzed to evaluate the contribution of individual features to the model’s predictions. This capability is particularly relevant in environmental modeling, where transparent decision-making is critical for policy and planning applications. In mathematical terms, the TabTransformer processes categorical features

$C = \{c_{1}, c_{2}, \dots, c_{m}\}$ and continuous features $X \in \mathbb{R}^{n_{f}}$ by embedding the categorical inputs into dense vectors [75]:

$$E_{i} = W_{emb,i} \cdot \mathrm{one\_hot}(c_{i}), \quad i = 1, \dots, m \tag{2}$$

These embeddings are passed to a multi-head self-attention (MHSA) encoder to model inter-feature dependencies:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^{T}}{\sqrt{d_{k}}}\right) V \tag{3}$$

where $Q = E W_{Q}$, $K = E W_{K}$, $V = E W_{V}$, and $d_{k}$ is the key dimension.

The encoder output $Z$ is concatenated with normalized continuous features $\tilde{X}$:

$$H = [Z; \tilde{X}] \tag{4}$$

Finally, $H$ is passed to a feed-forward multilayer perceptron (MLP) for prediction:

$$\tilde{y} = f_{MLP}(H) \tag{5}$$
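As a rough illustration of Equations (2)–(5), the following Python sketch runs one sample through a heavily simplified forward pass: an embedding lookup per categorical feature, a single attention head, concatenation with the continuous features, and a one-layer sigmoid head standing in for the MLP. It is not the configuration used in this study. The toy dimensions, the random weights, and all names (`forward`, `params`, `n_categories`, `n_cont`) are assumptions made for illustration, and layer normalization, dropout, residual connections, and multiple heads are omitted.

```python
import math
import random


def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m); both are lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(E, W_q, W_k, W_v):
    """Single-head scaled dot-product attention over feature tokens, Eq. (3)."""
    Q, K, V = matmul(E, W_q), matmul(E, W_k), matmul(E, W_v)
    d_k = len(K[0])
    K_T = [list(col) for col in zip(*K)]
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in matmul(Q, K_T)]
    return matmul(weights, V), weights


def forward(cat_indices, cont_values, params):
    """Return (score in (0, 1), attention weights) for one sample."""
    # Eq. (2): multiplying by a one-hot vector selects one vector from table i.
    E = [params["emb"][i][c] for i, c in enumerate(cat_indices)]
    Z, weights = attention(E, params["W_q"], params["W_k"], params["W_v"])
    # Eq. (4): flatten the encoder output and append the normalized continuous features.
    H = [v for row in Z for v in row] + list(cont_values)
    # Eq. (5): the MLP is reduced here to one linear layer with a sigmoid output.
    logit = sum(w * h for w, h in zip(params["w_out"], H)) + params["b_out"]
    return 1.0 / (1.0 + math.exp(-logit)), weights


def random_matrix(rng, rows, cols):
    """Small random weight matrix; stands in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d = 4                  # toy embedding dimension (assumption)
n_categories = [3, 5]  # hypothetical cardinalities of two categorical features
n_cont = 3             # hypothetical number of continuous features
params = {
    "emb": [random_matrix(rng, n, d) for n in n_categories],
    "W_q": random_matrix(rng, d, d),
    "W_k": random_matrix(rng, d, d),
    "W_v": random_matrix(rng, d, d),
    "w_out": [rng.uniform(-0.5, 0.5) for _ in range(len(n_categories) * d + n_cont)],
    "b_out": 0.0,
}

score, attn = forward([2, 4], [0.1, -1.2, 0.7], params)
print(f"susceptibility score: {score:.3f}")
```

Each row of `attn` sums to one and shows how strongly one feature token attends to the others, which is the quantity typically inspected when attention weights are used for interpretation.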

In addition to the primary 70:30 train–validation division, the TabTransformer architecture incorporates an internal batch-wise evaluation process during training. The model is trained in mini-batches, where a small portion of the training data is held out within each iteration to compute validation-like metrics before parameter updates. This mechanism allows the model to continually assess its generalization ability during each epoch, analogous to an embedded validation loop. Moreover, stochastic elements such as random shuffling of mini-batches and dropout regularization ensure that the network encounters diverse data combinations at different stages of the optimization process, enhancing robustness. When paired with the 10-fold cross-validation protocol applied to the overall training/validation set, this dual-layer validation scheme provides a comprehensive safeguard against overfitting. Performance stability was monitored through training vs. internal validation convergence curves at each fold. Hence, the near-parallel trajectories and convergence of loss and accuracy curves across folds—reinforced by the agreement between cross-validated results and the independent 30% validation set—serve to mitigate overfitting and demonstrate the model’s capacity to generalize across the diverse climatic and environmental gradients of Central Asia.

2.4. Particle Swarm Optimization for Hyperparameter Tuning

Particle Swarm Optimization (PSO) was applied to fine-tune the hyperparameters of the TabTransformer model, ensuring that its performance was optimized for the spatial and statistical characteristics of the study area. PSO, inspired by the social behavior of bird flocks and fish schools, is a population-based metaheuristic algorithm that excels at exploring complex, multi-dimensional search spaces efficiently [78]. Its adaptability and computational efficiency make it especially suitable for deep learning hyperparameter tuning tasks, where traditional grid or random search approaches can be computationally prohibitive. In this study, key hyperparameters of the TabTransformer architecture (such as learning rate, number of Transformer layers, attention heads, dropout rate, and batch size) were defined as search dimensions within the PSO framework. An initial population (swarm) of candidate solutions was randomly generated, each representing a unique hyperparameter configuration [79]. During each iteration, the particles adjusted their positions based on both their own best-known performance (personal best) and the globally best-performing particle (global best), guided by a fitness function defined as the performance of the model trained with that configuration, evaluated on the validation data [80]. The optimization process proceeded until a stopping criterion was met, namely reaching a maximum number of iterations or observing no significant improvement in the global best fitness. In mathematical terms, given a swarm of

$N$ particles, each particle $i$ has a position vector $X_{i}^{t}$ (candidate hyperparameters) and velocity vector $V_{i}^{t}$ at iteration $t$. The velocity update is

$$V_{i}^{t+1} = \omega V_{i}^{t} + c_{1} r_{1} (p_{i} - X_{i}^{t}) + c_{2} r_{2} (g - X_{i}^{t}) \tag{6}$$

The position update is

$$X_{i}^{t+1} = X_{i}^{t} + V_{i}^{t+1} \tag{7}$$

where $\omega$ is the inertia weight, $c_{1}, c_{2}$ are the cognitive and social learning coefficients, $r_{1}, r_{2} \sim U(0,1)$ are random scalars, $p_{i}$ is the particle’s best-known position, and $g$ is the global best-known position. The objective function $J(X)$ is the validation loss metric, which PSO minimizes by updating particles until a stopping criterion (maximum iterations or convergence threshold) is met.

This process allowed for the automatic discovery of configurations that maximized the model’s predictive accuracy while limiting overfitting. The synergy between TabTransformer’s high representational capacity and PSO’s global search ability supported a robust, data-driven calibration of the deep learning model. To verify the added value of PSO, the PSO-tuned model was directly compared to a baseline TabTransformer trained with default parameters, allowing for a critical assessment of PSO’s contribution to model accuracy and generalization. In this study, the PSO algorithm was configured with a local (cognitive) learning coefficient of 2 and a global (social) learning coefficient of 2, which control the relative influence of the personal and global best positions on particle movement (commonly varied between ~1 and 3). The minimum inertia weight was set to 0.4 to balance global search and local exploitation (often tuned between 0.4 and 0.9), with a population size of 50 particles governing the breadth of the search, and a maximum of 100 epochs determining the number of iterations available for convergence.
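The velocity and position updates in Equations (6) and (7) can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: the objective is a hypothetical quadratic stand-in for the validation loss (a real run would train a TabTransformer per particle), the two search dimensions and their bounds are invented, the starting inertia of 0.9 with linear decay to the reported minimum of 0.4 is an assumed schedule, and clipping positions to the bounds is an added convenience.

```python
import random


def pso_minimize(objective, bounds, n_particles=50, n_iter=100,
                 c1=2.0, c2=2.0, w_max=0.9, w_min=0.4, seed=0):
    """Minimize `objective` over a box; returns (best position, best value, history)."""
    rng = random.Random(seed)
    dim = len(bounds)
    X = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    p_best = [x[:] for x in X]               # personal best positions p_i
    p_val = [objective(x) for x in X]
    g_idx = min(range(n_particles), key=p_val.__getitem__)
    g_best, g_val = p_best[g_idx][:], p_val[g_idx]  # global best position g
    history = [g_val]
    for t in range(n_iter):
        # Assumed schedule: inertia decays linearly from w_max to w_min.
        w = w_max - (w_max - w_min) * t / max(n_iter - 1, 1)
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()  # random scalars in [0, 1)
            for d in range(dim):
                # Eq. (6): inertia + cognitive pull + social pull.
                V[i][d] = (w * V[i][d]
                           + c1 * r1 * (p_best[i][d] - X[i][d])
                           + c2 * r2 * (g_best[d] - X[i][d]))
                # Eq. (7), followed by clipping to the search bounds (assumption).
                lo, hi = bounds[d]
                X[i][d] = min(max(X[i][d] + V[i][d], lo), hi)
            val = objective(X[i])
            if val < p_val[i]:
                p_best[i], p_val[i] = X[i][:], val
                if val < g_val:
                    g_best, g_val = X[i][:], val
        history.append(g_val)
    return g_best, g_val, history


def toy_validation_loss(x):
    """Hypothetical stand-in for J(X): a bowl over (log10 learning rate, dropout)."""
    log10_lr, dropout = x
    return (log10_lr + 3.0) ** 2 + (dropout - 0.2) ** 2


bounds = [(-5.0, -1.0), (0.0, 0.5)]
best, best_val, history = pso_minimize(toy_validation_loss, bounds)
print(best, best_val)
```

The `history` list records the global best value after each iteration, so plotting it gives a convergence curve of the same kind as the one used to judge when the swarm has stopped improving.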

2.5. Model Performance Assessment During Training and Validation

To ensure a rigorous and transparent evaluation of model performance at both the training and validation stages, a suite of complementary statistical and graphical approaches was employed [81,82]. These methods collectively enabled a detailed examination of the learning dynamics, calibration, predictive accuracy, and the statistical significance of spatial susceptibility outputs for both the baseline and PSO-optimized TabTransformer models.

2.5.1. Loss and Accuracy Plots

Throughout the training process, we systematically recorded and visualized the progression of loss and classification accuracy for both training and validation datasets. Loss curves, derived from the binary cross-entropy function, reflect how effectively the model minimizes misclassification risk over epochs, while accuracy curves indicate the proportion of correctly classified samples as learning progresses. These plots enable the monitoring of convergence speed, the stability of learning, and the detection of overfitting or underfitting behaviors by highlighting divergence or plateauing trends between training and validation stages. The comparative analysis of these trajectories for the default and PSO-optimized TabTransformer directly informed conclusions regarding the efficacy and regularization impact of hyperparameter optimization.

2.5.2. Quantitative Performance Metrics (RMSE, MAE, R²)

Following training, both models were evaluated using regression-based metrics to capture the correspondence between predicted and observed susceptibility at the pixel or sample level. Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) provide absolute measures of prediction error magnitude, reflecting model precision and robustness across the range of susceptibility outputs. The coefficient of determination (R²) quantifies the proportion of variance in observed values accounted for by the model, serving as an indicator of overall model fit and explanatory power. By reporting these metrics for both training and independent test sets, we assessed not only in-sample prediction quality but also the generalizability and stability of each model configuration.
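For reference, the three metrics follow their standard definitions and can be computed as in the sketch below. The four observed and predicted values are made up purely to show the arithmetic and are not taken from the study’s data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values only (hypothetical labels and predicted susceptibilities).
y_true = [0, 1, 1, 0]
y_pred = [0.1, 0.9, 0.8, 0.2]
print(rmse(y_true, y_pred), mae(y_true, y_pred), r2(y_true, y_pred))
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same predictions, and a wide gap between the two points to a few large errors rather than uniformly small ones.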

2.5.3. Discriminative Performance and AUROC Analysis

To further interrogate the models’ classification capabilities, we computed and plotted the Area Under the Receiver Operating Characteristic Curve (AUROC). The AUROC evaluates a model’s ability to distinguish between heatwave-prone and non-prone areas across all possible decision thresholds by aggregating the trade-off between true positive and false positive rates. Corresponding ROC plots offer a graphical summary of this relationship; a higher area under the curve signifies stronger discriminative power. This threshold-independent assessment is especially critical in environmental applications, where optimal thresholds may vary by context.
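The threshold independence of AUROC is easiest to see in its pairwise form: it equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below uses this standard equivalence on a small made-up example; it is an illustration, not the evaluation code used in the study.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = heatwave-prone) and predicted susceptibility scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, scores))  # 3 of the 4 positive-negative pairs are ordered correctly
```

The pairwise loop is quadratic in the number of samples, so for full-resolution maps a rank-based or library implementation would be the practical choice.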

2.5.4. Statistical Comparison of Susceptibility Outputs

To rigorously evaluate whether hyperparameter optimization induced significant changes in susceptibility classification, we applied three non-parametric statistical tests to the predicted susceptibility maps from each model. The Chi-squared test assessed whether the frequency distributions of susceptibility classes differed significantly between the two models, providing insight into large-scale shifts in susceptibility categorization. The Friedman test enabled comparison of mean rank differences in susceptibility values across models due to its suitability for related, non-normally distributed samples, quantifying the spatial reordering of susceptibility assignment. The Wilcoxon signed-rank test offered a pairwise, location-specific assessment of whether the observed differences in susceptibility predictions were statistically significant, revealing local agreement or divergence in model outputs. Together, these statistical analyses validate whether the refinements introduced by PSO tuning are reflected not only in map patterns and summary metrics, but also in highly significant improvements in the spatial precision and plausibility of mapped heatwave susceptibility.

By integrating graphical diagnostics with a comprehensive set of performance metrics and robust statistical hypothesis tests, our methodological framework ensured that both the learning behavior and the final spatial predictions of each model variant were subjected to critical, quantitative scrutiny. This multi-faceted evaluation underpins the reliability and operational relevance of the reported heatwave susceptibility assessments for Central Asia.

3. Results

3.1. Feature Importance Analysis

The relative importance of the 13 causative factors used to model heatwave susceptibility was evaluated using the feature attribution capabilities embedded in the TabTransformer architecture. These values reflect the contribution of each variable to the model’s predictive performance, with higher scores indicating stronger influence on classification outcomes. The results revealed that Maximum Temperature had the highest importance score (0.313), followed by Number of Hot Days (0.245), and Rainfall (0.190), underscoring the dominant role of thermal and atmospheric indicators in driving heatwave occurrences in Central Asia. Land Surface Temperature (LST) and Aspect also showed considerable influence, with importance scores of 0.140 and 0.136, respectively. Variables such as Humidity (0.114), Slope (0.100), Albedo (0.098), Population Density (0.096), and Elevation (0.095) displayed moderate contributions. On the lower end, NDVI (0.072) and Land Cover (0.047) exhibited more limited predictive relevance. Notably, Number of Days with Heat Index >35 °C recorded the lowest importance (0.028), suggesting a relatively weaker direct impact in the context of the modeling framework. These values are summarized in Table 2, which lists the full ranking of feature importance as derived from the trained TabTransformer model.

3.2. PSO Hyperparameter Optimization Outcomes

To enhance the predictive performance of the TabTransformer model, Particle Swarm Optimization (PSO) was employed to fine-tune key architectural hyperparameters. The PSO algorithm was configured with standard coefficients—a local coefficient of 2, a global coefficient of 2, and a minimum inertia weight of 0.4—and was executed for 100 epochs with a swarm population size of 50 particles (Table 3). The optimization process targeted maximization of the R² score on the validation set as the fitness criterion. The comparison between the default and optimized hyperparameters is presented in Table 4. The default configuration of the TabTransformer included an embedding dimension of 32, transformer depth of 4, 8 attention heads, and fixed dropout rates of 0.1 for both attention and feedforward layers. In contrast, PSO converged on a notably simpler architecture: an embedding dimension of 8, transformer depth of 1, 3 attention heads, an attention dropout of 0.2, and a feedforward dropout of 0.17. This optimized configuration yielded a best R² value of 0.96, reflecting strong predictive alignment with the validation data under this parameter setting. The results suggest that a more compact and regularized TabTransformer architecture, discovered through PSO, may generalize better under the specific data structure and feature interactions characteristic of heatwave susceptibility modeling in Central Asia.

3.3. PSO Convergence Dynamics

The convergence curve of the PSO algorithm is shown in Figure 5. The R² score remained stable at 0.90 for the first 25 iterations, followed by a sharp improvement between iterations 30 and 63. The curve plateaued at 0.96 by iteration 63 and remained unchanged through iteration 100, indicating early convergence and effective optimization of the TabTransformer model’s hyperparameters.

3.4. Learning Dynamics of TabTransformer and TabTransformer–PSO: Loss and Accuracy Trajectories

Overfitting risk was monitored via training vs. validation loss and accuracy convergence curves at each epoch. The analysis of epoch-wise loss and accuracy plots for both TabTransformer and its PSO-optimized counterpart, TabTransformer–PSO, provides a clear view of the learning progression and generalization capacity achieved by each model during training and validation (Figure 6 and Figure 7). For the baseline TabTransformer, the loss curves for training and validation decline steadily and in close alignment, initiating at values above 0.8 and reaching approximately 0.12 after 100 epochs. The accuracy plot for this model reveals a rapid initial increase, with both training and validation accuracy quickly surpassing 80% after ~18 epochs and continuing to climb, ultimately stabilizing at values above 95% by the end of the training process. Importantly, the proximity between training and validation trajectories for both loss and accuracy demonstrates a well-calibrated model that learns effectively without overfitting, albeit at a moderate convergence pace. In comparison, the TabTransformer–PSO model, benefitting from hyperparameter optimization via Particle Swarm Optimization, displays a pronounced acceleration in learning. The loss curves for both training and validation decrease sharply during the early epochs, approaching a minimum close to 0.10 by epoch 75 and remaining virtually flat through to epoch 200. Strikingly, the validation loss not only mirrors but at times slightly undercuts the training loss, suggesting robust generalization and effective regularization. Correspondingly, the accuracy curves show the TabTransformer–PSO reaching over 90% validation accuracy within the first 25 epochs, and nearly perfect accuracy, close to 98–99%, by epoch 100. Throughout the latter stages of training, the validation accuracy matches or modestly exceeds its training counterpart, further underscoring the stability and consistency of the model’s learning process. 
Overall, these results demonstrate that the incorporation of PSO results in faster convergence, lower final loss, and higher, more stable accuracy on both training and validation datasets for heatwave susceptibility prediction. The absence of divergence and the consistently high validation scores across all epochs, alongside strong performance on the independent validation set, indicate that neither the standalone model nor the PSO-optimized model exhibited overfitting tendencies.

3.5. Spatial Patterns of Heatwave Susceptibility: A Geographic Comparison

Figure 8 presents the spatial distribution of heatwave susceptibility across Central Asia as derived from the TabTransformer (a) and TabTransformer–PSO (b) models. Both susceptibility maps classify the region into five distinct categories: very low, low, moderate, high, and very high susceptibility. In general, the overall spatial trends are consistent between the two modeling approaches, each capturing the pronounced north–south and west–east gradients in exposure. However, the PSO-optimized model exhibits a notable refinement and sharper delineation of susceptibility classes, with transitional zones better resolved and extreme classes more localized. In both outputs, the southern and southwestern sectors, encompassing much of Turkmenistan, Uzbekistan, southern Kazakhstan, and parts of southwestern Kyrgyzstan and Tajikistan, are persistently mapped as zones of high and very high susceptibility. These areas are characterized by broad, contiguous swathes of intense exposure, reflecting their climatological and topographic context. Conversely, northern Kazakhstan and the northeastern fringes—spanning foothill and mountainous areas—reliably emerge as regions of low to very low susceptibility under both models. Areas classified as moderate susceptibility dominate transitional steppe and foothill regions, where environmental and climatic gradients interact.

Comparing the two outputs, the TabTransformer map (a) tends to depict broader swathes of high and very high susceptibility, with somewhat diffused transitions between classes, particularly across western Uzbekistan and Turkmenistan as well as the Kazakh steppe margin. The TabTransformer–PSO model (b), however, demonstrates more spatial precision: susceptible zones, especially those marked as “very high,” appear more concentrated in the south and southwest, while moderate and low-susceptibility regions expand in the north and east. This results in a more nuanced, stratified pattern that enhances the interpretability and operational utility of the susceptibility delineation. Critically, the PSO-optimized model reduces the “bleed” of high-susceptibility predictions into peripheral zones, aligning more closely with environmental realities and known climatic trends within Central Asia.

3.6. Model Performance Comparison: TabTransformer vs. TabTransformer-PSO

To assess the impact of hyperparameter tuning on model performance, both the default TabTransformer and its PSO-optimized counterpart were evaluated using three standard regression metrics: Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and the Coefficient of Determination (R²). These metrics were computed for both training and test datasets to measure predictive accuracy and generalization ability. As shown in Table 5, the PSO-tuned model outperformed the default configuration across all performance measures. On the training set, TabTransformer–PSO achieved an RMSE of 0.062, an MAE of 0.019, and an R² of 0.98, compared to 0.082, 0.027, and 0.97, respectively, for the default model. This improvement was sustained on the test set, where TabTransformer–PSO registered lower RMSE (0.123 vs. 0.132) and MAE (0.034 vs. 0.038) values, and a slightly higher R² score (0.938 vs. 0.93). These results demonstrate the effectiveness of PSO in enhancing the TabTransformer’s performance, particularly in minimizing prediction errors and improving model fit. The marginal yet consistent improvements across both datasets suggest that hyperparameter tuning via PSO contributes to better generalization without overfitting.

Table 6 compares heatwave susceptibility prediction accuracy between the baseline TabTransformer model and its PSO-enhanced counterpart (TabTransformer–PSO) for each of the five Central Asian countries. It reports the percentage improvements in MAE and RMSE, showing how the optimization benefit varies at the country level relative to the regional aggregate performance (Table 5).

The TabTransformer–PSO model demonstrated superior performance compared to the baseline TabTransformer across all Central Asian countries, with Uzbekistan emerging as the leading beneficiary in terms of both MAE and RMSE reduction. Specifically, Uzbekistan achieved the most substantial improvements with MAE decreasing from 0.137 to 0.058 (57.66% improvement) and RMSE from 0.174 to 0.072 (58.62% improvement), representing the lowest absolute error values among all countries. Turkmenistan followed closely with 46.94% MAE improvement and 51.11% RMSE improvement, while Kyrgyzstan showed the most modest gains at 23.46% and 18.69%, respectively. Notably, the country-specific improvements (averaging 43.28% for MAE and 42.03% for RMSE) substantially exceeded the regional aggregate improvements of 10.53% and 6.82%, revealing significant geographic heterogeneity in model optimization benefits that becomes masked when data is pooled across the entire Central Asian region.
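The percentage improvements quoted above are relative error reductions, (baseline − optimized) / baseline × 100. A short check, using only the error values reported in the text, reproduces the Uzbekistan and regional figures; the function name is chosen here for illustration.

```python
def pct_improvement(baseline, optimized):
    """Relative error reduction in percent."""
    return 100.0 * (baseline - optimized) / baseline


# Error values as reported in the text: (baseline, PSO-optimized).
reported = {
    "Uzbekistan MAE": (0.137, 0.058),
    "Uzbekistan RMSE": (0.174, 0.072),
    "Regional MAE (test set)": (0.038, 0.034),
    "Regional RMSE (test set)": (0.132, 0.123),
}
improvements = {k: round(pct_improvement(b, o), 2) for k, (b, o) in reported.items()}
for name, value in improvements.items():
    print(f"{name}: {value}%")
```

Because the measure is relative to each baseline, the country-level and regional percentages are not expected to average out to one another.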

3.7. AUROC-Based Validation Assessment

To further validate the classification performance of both models, the ROC curve and the area under it (AUROC, hereafter AUC) were computed. AUC serves as a threshold-independent metric that evaluates the model’s ability to distinguish between heatwave and non-heatwave occurrences. The results, presented in Figure 9 and Table 7, reveal strong discriminative capability for both models. The standalone TabTransformer achieved an AUC of 93.3%, confirming its high ability to differentiate between the two classes. The PSO-optimized variant further improved this result, reaching an AUC of 94.0%. While the gain is marginal, it highlights the role of hyperparameter tuning in refining the model’s decision boundaries. This improvement, though modest in magnitude, supports the assertion that PSO contributes positively to model calibration, enabling slightly more confident and consistent predictions, especially in borderline cases.

While the dataset was split into training (70%) and validation (30%) subsets, the training portion inherently served a dual role (model fitting and internal validation), owing both to its sufficient size and spatial diversity across Central Asia and to the TabTransformer’s built-in iterative training process, which evaluates performance on withheld batches during each epoch. This internal checking, combined with the spatial representativeness of the data, ensured that the model’s learning was balanced across different environmental and climatic zones. To further safeguard against potential sampling bias, we employed 10-fold cross-validation on the training/validation data. This approach partitions the dataset into ten mutually exclusive folds, iteratively training on nine folds while validating on the remaining one, thereby producing ten validation performance estimates that reflect model robustness under different data splits. These results are presented in Figure 10 and Table 8 for both the baseline TabTransformer and the TabTransformer–PSO models.
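The fold construction described above can be sketched as follows. This is a generic shuffled k-fold partition over sample indices, given as an assumption about the mechanics rather than the exact procedure used in the study; the sample count of 105 is arbitrary, and the helper for summarizing per-fold scores uses the sample standard deviation.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, val_idx) pairs for k mutually exclusive folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # fold sizes differ by at most one
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def mean_and_std(values):
    """Mean and sample standard deviation of per-fold scores."""
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / (len(values) - 1)
    return m, var ** 0.5


splits = list(k_fold_indices(n_samples=105, k=10))  # 105 is a hypothetical size
for train_idx, val_idx in splits[:2]:
    print(len(train_idx), len(val_idx))
```

Each sample appears in exactly one validation fold, so the ten scores are computed on disjoint data even though the training sets overlap heavily.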

Across the folds, both models achieved AUC values well above 0.80, indicating high discriminative capability in all validation splits. Perfect scores (AUC = 1.00) were occasionally observed in both models, particularly in folds 2, 3, and 6 for TabTransformer, and folds 2, 3, 6, and 8 for TabTransformer–PSO, showing that in some partitions the models achieved ideal separation between classes. The lower scores (e.g., TabTransformer fold 4 = 0.81, fold 9 = 0.83) point to partitions where classification was more challenging, likely due to differences in the distribution of training and validation samples. Notably, PSO tuning mitigated some of the drop-offs seen in the baseline model—for example, in fold 4, performance improved from 0.81 to 0.94. The mean AUCs (TabTransformer = 0.93, TabTransformer–PSO = 0.94) confirm that both approaches provide excellent predictive performance. The slightly smaller standard deviation in the PSO-optimized model (0.06 vs. 0.07) suggests more consistent outcomes across folds, reflecting greater robustness to sampling variability.

3.8. Areal Extent and Statistical Analysis of Susceptibility Classes

The spatial distribution of susceptibility categories varied slightly between the baseline TabTransformer and the PSO-optimized model (Table 9). In the TabTransformer output, very-low- and low-susceptibility zones accounted for approximately 20.22% (≈808,800 km²) and 21.91% (≈876,400 km²) of Central Asia, respectively, while 18.15% (≈726,000 km²) was classified as very high susceptibility. In the TabTransformer–PSO output, very-high-susceptibility zones expanded to 19.55% (≈782,000 km²), representing an areal increase of about 56,000 km², and low-susceptibility areas decreased to 20.13% (≈805,200 km²), representing an areal reduction of about 71,200 km². Moderate and high categories maintained similar shares across both models, each covering roughly one-fifth of the total area.
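The areal figures above follow directly from the class shares. The sketch below converts the reported percentages into areas and differences; the total of 4,000,000 km² is not stated explicitly but is implied by the reported pairs (for example, 808,800 km² at 20.22%), so it is treated here as a back-calculated assumption.

```python
TOTAL_AREA_KM2 = 4_000_000  # back-calculated from the text: 808,800 km² / 20.22%


def share_to_area(share_pct, total=TOTAL_AREA_KM2):
    """Convert a class share in percent to an area in km², rounded to whole km²."""
    return round(share_pct / 100 * total)


# Class shares (%) as reported in the text for the two models.
baseline_share = {"low": 21.91, "very high": 18.15}
pso_share = {"low": 20.13, "very high": 19.55}

change_km2 = {
    cls: share_to_area(pso_share[cls]) - share_to_area(baseline_share[cls])
    for cls in baseline_share
}
print(change_km2)
```

A positive value means the class covers more area in the TabTransformer–PSO map than in the baseline map.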

To rigorously compare the spatial susceptibility outputs of the TabTransformer and TabTransformer–PSO models, several non-parametric statistical tests were employed (Table 10). The Chi-squared test indicated highly significant differences in the distribution of susceptibility classes between the two models (p < 0.0001). The Friedman test, used for ranking multiple related samples, revealed a statistically significant difference in mean ranks between the models, with TabTransformer–PSO exhibiting a higher mean rank (1.9) compared to the standalone TabTransformer (1.008), again at p < 0.0001. Finally, the Wilcoxon signed-rank test confirmed significant pairwise differences in susceptibility values between the two modeling approaches (Z = −9.49, p < 0.0001). Together, these tests provide strong statistical evidence that hyperparameter tuning via PSO induces meaningful variations in the spatial patterns of predicted heatwave susceptibility across Central Asia.

4. Discussion

The following discussion addresses the central research question—namely, whether a PSO-optimized TabTransformer, powered entirely by remote sensing data, can generate accurate, high-resolution, and transferable heatwave susceptibility maps across Central Asia’s data-scarce, trans-boundary landscapes. Section 4.1 examines model behavior and variable influence to elucidate the environmental drivers captured by the framework. Section 4.2, Section 4.3 and Section 4.4 evaluate the role of PSO in hyperparameter tuning, its effect on learning dynamics, and the resulting gains in efficiency and generalization. Section 4.5 situates performance metrics in the context of real-world applicability, while Section 4.6 and Section 4.7 interpret spatial outputs quantitatively and geographically at national scales. Finally, Section 4.8 and Section 4.9 discuss broader implications, limitations, and directions for future research, placing the findings within the broader hazard mapping and climate adaptation literature.

4.1. Model Behavior and Variable Influence

The feature importance results from the TabTransformer model show a clear ranking of the 13 factors used to predict heatwave susceptibility in Central Asia. Thermal and atmospheric indicators dominate, with maximum temperature as the most important predictor—an expected result, since areas with frequent high temperatures are naturally more prone to heat stress, especially when cooling is limited. Following closely, Number of Hot Days (Tmax > 30 °C) scored an importance of 0.245, indicating that not only extreme maximum temperatures but also the frequency of moderately hot days substantially contribute to heatwave vulnerability. This factor reflects the persistence and accumulation of heat exposure, which intensifies physiological and environmental stress over time. The high importance of this metric suggests that areas with many such days are primed for transitioning into full heatwave conditions under favorable synoptic circumstances. Rainfall (0.190) ranked third in importance, highlighting its inverse relationship with heatwave susceptibility. Reduced precipitation limits surface moisture and evaporative cooling, increases soil drying and vegetation stress, and raises surface albedo, thereby promoting conditions favorable for intense heating in Central Asia’s arid regions.

Surprisingly, LST, while conceptually a direct indicator of surface heat stress, held the fourth rank with an importance score of 0.140. Because LST represents the current thermal state of the land and is affected by diurnal and seasonal variability, it appears less predictive than Tmax or the frequency of hot days for assessing long-term or large-scale heatwave susceptibility. Among the topographic variables, Aspect (0.136) demonstrated a greater influence than Slope (0.100) and Elevation (0.095). In Central Asia, aspect is critical for localized heating, as south-facing slopes receive greater solar radiation and are consequently more susceptible to elevated daytime temperatures. Meanwhile, elevation inversely correlates with temperature through lapse rate dynamics, with higher altitudes experiencing cooler conditions, thus reducing susceptibility. Humidity (0.114) had moderate importance, reflecting its role in altering perceived temperature through the heat index: low humidity is associated with dry heat in arid regions, while high humidity intensifies thermal stress. Albedo (0.098) and population density (0.096) showed similar contributions. Lower albedo surfaces increase heat absorption and local warming, while higher population density is linked to anthropogenic heat and the Urban Heat Island (UHI) effect. While urbanization is less extensive in Central Asia than in densely populated regions, major cities such as Tashkent, Almaty, and Bishkek still exhibit clear heat island signatures, which the model effectively captures.

On the lower end of the importance spectrum, NDVI scored 0.072, indicating a relatively minor yet still relevant role in determining heatwave susceptibility. While vegetation cover cools locally via transpiration and shading, its predictive power is likely weakened by NDVI’s seasonal variability and Central Asia’s widespread low natural vegetation cover. Furthermore, NDVI is more impactful on microclimatic scales, and its effect may be overshadowed by stronger thermal and atmospheric drivers in a macro-scale susceptibility model. Land Cover (0.047) ranked second-lowest, likely because it is relatively static in the region and overlaps with other variables (e.g., NDVI, LST) that indirectly reflect land surface properties. In this modeling framework, land cover may have provided redundant or overlapping information, thereby reducing its standalone contribution. Finally, Number of Days with Heat Index >35 °C was assigned the lowest importance score of 0.028. Although the heat index is widely used in human comfort studies, its limited relevance here likely stems from two factors: (1) it requires high humidity to register significant values, and many regions in Central Asia experience dry heat, making this metric less sensitive; and (2) it overlaps with other thermal indicators like Tmax and LST, which capture similar phenomena with higher temporal resolution. Thus, the model appears to downweight this variable in favor of more directly informative predictors. In sum, thermal metrics (Tmax, heat day frequency) are the top predictors, followed by climate modulators (rainfall, humidity), surface energy balance (LST, albedo), topography, and anthropogenic factors. Variables such as NDVI, land cover, and heat index, while theoretically relevant, contribute less in this specific model due to regional climatic characteristics, redundancy with other inputs, or lower temporal stability. 
This layered understanding enhances both the interpretability of the model and its value as a decision-support tool.

A review of the relevant literature reveals that in process-based and regional studies, the role of vegetation (often captured by NDVI or phenology metrics) is frequently linked to microclimatic cooling [83,84]. Greener areas or increased blue-green infrastructure consistently relate to lower modeled or observed heat risk. However, in large-scale, ML-driven or purely meteorological prediction frameworks [25], NDVI is often excluded or found unimportant, likely due to predictor selection (a preference for atmospheric/synoptic variables), coarse data, or limited landscape variation within the model domain. Land cover type, especially the distinction between vegetation (forest, grassland) and built-up areas, repeatedly emerges as a primary heatwave risk modulator in urban and peri-urban hazard models [83]. Increased construction land is associated with elevated UTCI/heat stress, while higher proportions of greenspace and water bodies buffer extreme heat exposure. In climate/impact modeling [84], explicit “what-if” land cover change (LCC) scenarios demonstrate that increased tree cover can meaningfully reduce peak heatwave temperatures, especially under extreme soil moisture/heat stress conditions. Additionally, heat index and number-of-days variables serve either as a threshold for defining heatwave events or as a primary output variable (e.g., the annual number of heatwave days, HWD, in Asadollah et al. [25]). In most complex ML modeling frameworks, predictors based on atmospheric states (e.g., humidity, wind, synoptic pressure) outweigh NDVI or land cover in importance when predicting the number of heatwave days. Nevertheless, in risk mapping oriented around human experience (e.g., UTCI, LST models), the number/duration of heat events is used to partition risk, with land cover/NDVI acting as spatial moderators rather than core predictors. 
In sum, in regions where high-resolution landscape data are available and the focus is on urban areas, NDVI and land cover are frequently highly influential because local microclimate and greenness are decisive for heat exposure [83,84]. In large-scale, synoptic, or ML-intensive models (especially in arid/semi-arid zones or where predictor selection omits local land metrics), atmospheric and meteorological variables dominate variable importance rankings; NDVI and land cover play lesser roles or are omitted (Asadollah et al. [25]; also echoed in the spatial patterns reported by Dong et al. [83] for non-urban zones). Studies emphasizing landscape heterogeneity and operational urban planning needs almost always find land cover/NDVI meaningful, while purely predictive climatological models may mask or minimize their effect. Thus, our finding of low importance for NDVI, land cover, and the number of days with heat index >35 °C in Central Asia is consistent with other scale-matched, climatically contextualized studies, and the divergence from some urban-focused findings is a consequence of well-documented methodological and environmental distinctions.

4.2. Benefits and Limitations of PSO for Model Tuning

Particle Swarm Optimization (PSO) applied to hyperparameter tuning significantly enhanced the TabTransformer’s predictive capabilities by optimizing architecture parameters that govern learning dynamics, model capacity, and regularization. The PSO framework operates by simulating a population (swarm) of candidate solutions that collectively explore the multidimensional hyperparameter space. Each particle adjusts its trajectory based on its own experience (personal best) and the swarm’s collective knowledge (global best), enabling an adaptive balance between exploration of new configurations and exploitation of promising regions. This metaheuristic strategy is effective for deep learning models, where hyperparameters (e.g., embedding size, layer count, attention heads, dropout) interact complexly to shape convergence, generalization, and overfitting. The optimization resulted in a notably smaller embedding dimension (8 vs. default 32), fewer transformer layers (1 vs. 4), and fewer attention heads (3 vs. 8), accompanied by adjusted dropout rates. This reduction in model complexity suggests that the dataset and problem structure favor a more compact representation, which likely prevents overfitting while retaining sufficient expressiveness. The elevated dropout values (attention dropout 0.2 and feedforward dropout 0.17) imply stronger regularization, helping to mitigate the risk of memorizing noise in the training data. These hyperparameter shifts demonstrate how PSO navigates trade-offs inherent in deep learning: increasing complexity can model richer patterns but risks overfitting, while simpler architectures may generalize better but potentially underfit. PSO’s global search capability effectively balances this trade-off without exhaustive brute-force searches that are computationally prohibitive in such high-dimensional spaces. 
Despite these advantages, PSO’s stochastic nature and requirement for multiple iterations can incur considerable computational cost, especially for transformer models with long training times. Moreover, the optimal hyperparameters identified are contingent on the specific dataset, model initialization, and PSO parameters (e.g., population size, coefficients), necessitating replication to ensure robustness. Nevertheless, the demonstrated improvements justify its integration as a routine tuning step in transformer-based environmental modeling.
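The personal-best/global-best update described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: it reuses the inertia weight (0.4) and the two acceleration coefficients (both 2) reported in Section 4.3, while the objective `toy_objective`, the search bounds, the swarm size, the iteration budget, and the random seed are all assumptions made for the example. In a real tuning run the objective would train a TabTransformer and return a validation score.

```python
import random

def toy_objective(position):
    # Hypothetical stand-in for "train the model, return validation loss".
    # Its minimum sits at (8, 1, 3), echoing the tuned embedding size,
    # layer count, and head count quoted in the text (illustration only).
    target = (8.0, 1.0, 3.0)
    return sum((p - t) ** 2 for p, t in zip(position, target))

def pso(objective, bounds, n_particles=20, n_iters=100,
        inertia=0.4, c_personal=2.0, c_global=2.0, seed=0):
    """Minimise `objective` inside box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # personal best positions
    pbest_val = [objective(p) for p in pos]      # personal best values
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # global best so far
    history = [gbest_val]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia term + pull toward personal best + pull toward global best
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (pbest[i][d] - pos[i][d])
                             + c_global * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # move, then clamp the particle back into the search box
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history

# Assumed search ranges: embedding size, number of layers, attention heads.
bounds = [(4.0, 64.0), (1.0, 6.0), (1.0, 8.0)]
best_pos, best_val, history = pso(toy_objective, bounds)
```

Positions here are continuous; integer hyperparameters such as layer count would need rounding inside the objective. The `history` list records the best value after each iteration and never increases, so it traces the kind of convergence curve discussed in Section 4.3.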

4.3. Metaheuristic Learning Behavior and Efficiency of Convergence

The convergence pattern exhibited by PSO during hyperparameter tuning provides critical insight into the optimization landscape of transformer-based models when applied to complex geospatial classification tasks such as heatwave susceptibility modeling. Unlike traditional exhaustive search methods such as grid search or even stochastic approaches like random search, PSO leverages swarm intelligence to balance exploration and exploitation—and this dynamic is clearly evident in the optimization trace. The flat R² in early iterations (1–25) suggests broad exploration, as the swarm navigated a homogeneous solution space. This phase is vital in ensuring the algorithm does not prematurely commit to a suboptimal basin of attraction. Once more promising regions are located—evidenced by the sharp upturn beginning around iteration 30—the swarm collectively shifts toward higher-performing configurations, exploiting known good regions while still maintaining some exploratory behavior through inertia. The rapid improvement and early stabilization at R² = 0.96 by iteration 63 demonstrate that PSO was highly effective in identifying near-optimal or optimal configurations well before exhausting the iteration budget. This suggests the underlying loss surface of the TabTransformer architecture, in this context, is sufficiently smooth for global optimization algorithms to operate efficiently—an important insight for computational resource allocation in future model development. Additionally, the lack of oscillations or regressions in the convergence curve points to algorithmic stability. This outcome affirms that the balance between global and local coefficients (both set to 2 in this case), combined with an appropriate inertia weight (0.4), created a well-calibrated swarm dynamic capable of sustained convergence. 
From an operational standpoint, this behavior is highly desirable: it implies that PSO can reduce the number of training iterations and thus computational cost without sacrificing performance. More importantly, the convergence trajectory validates the decision to pair PSO with TabTransformer in a remote sensing context, particularly where high-dimensional inputs and nonlinear dependencies exist—conditions that commonly arise in environmental modeling. The PSO convergence behavior not only confirms the algorithm’s tuning efficacy but also highlights its suitability as a robust hyperparameter optimization method in data-scarce, remote sensing–driven climate applications.

While direct, in-study benchmarking against alternative optimizers such as random search or Bayesian optimization was beyond the scope of this work, the choice of PSO was informed by its strong theoretical and empirical record in the literature. Numerous studies report that PSO variants achieve faster convergence, lower computational cost, and higher accuracy than other stochastic metaheuristics, owing to their ability to track the global best solution while adaptively refining search around promising regions [85,86,87]. Enhanced PSO versions have also outperformed established methods such as COBYLA and Differential Evolution in tasks with smooth optimization landscapes [88]. In specific applied domains, PSO-based methods have achieved performance levels exceeding 85%, consistently surpassing competing algorithms [89]. Nonetheless, existing literature seldom includes head-to-head evaluations against random search or Bayesian optimization on identical, domain-specific tasks, and results can vary substantially depending on algorithm settings, data structure, and search space complexity [90]. In the present study, coupling PSO with TabTransformer produced measurable improvements over untuned baselines across all reported performance metrics, validating its suitability for large-scale, high-dimensional hyperparameter optimization in heatwave susceptibility mapping.

4.4. Comparative Analysis of Model Learning Behavior and Optimization Impact

A detailed examination of the loss and accuracy trajectories reveals profound distinctions in the learning behavior and generalization capability between the baseline TabTransformer and the PSO-enhanced TabTransformer. The baseline model presents a pattern of reliable and incremental improvement, with the gradual convergence of loss and accuracy suggesting that the transformer architecture, when configured with default hyperparameters, is inherently suited to structured geospatial data, offering effective regularization and steady learning. The close association between training and validation performance indicates a well-balanced model free from substantial overfitting or underfitting. However, this conventional setup, while robust, manifests a relatively slow ascent to peak performance and terminates with slightly elevated loss values, hinting at possible inefficiencies in resource use and representational capacity when faced with complex, high-dimensional environmental inputs. PSO hyperparameter tuning significantly alters the model’s training dynamics. The TabTransformer–PSO exhibits not only a sharper and earlier reduction in loss for both training and validation data but also a more pronounced and sustained plateau at minimal loss values, highlighting the efficacy of the discovered hyperparameter configuration. Importantly, the validation loss remains as low as or marginally lower than the training loss throughout most of the process, a sign of well-matched regularization and model complexity—a delicate equilibrium rarely achieved by manual tuning. On the accuracy front, the PSO-tuned model achieves a remarkable leap in early-stage performance, with validation accuracy exceeding 90% almost immediately and stabilizing near the theoretical maximum as training progresses. 
The consistent parity between training and validation accuracy, and at times the slightly higher validation accuracy, strongly implies that the network is not merely memorizing training patterns but has internalized the essential data-generating processes key to generalization.

These observed differences are attributable to the tailored adjustment of architectural parameters—embedding dimension, layer depth, attention heads, and dropout rates—facilitated by PSO’s global search strategy. PSO dynamically navigates the hyperparameter landscape, striking an optimal balance between model representational power and regularization, thereby enhancing both learning efficiency and predictive robustness. The resource savings from faster convergence and the reduction in risk of overfitting are particularly valuable for large-scale, operational geospatial modeling. Furthermore, the refined decision boundaries and enhanced calibration resulting from this process translate directly into more reliable susceptibility delineation on the ground, which is critical for environmental risk mapping where actionable decisions depend on the model’s accuracy and stability. The epoch-wise training and validation curves exemplify the transformative impact of PSO-based hyperparameter optimization, confirming its suitability and substantial benefit in state-of-the-art, transformer-based remote sensing applications for climate hazard assessment.

4.5. Model Performance in Context

The quantitative comparison between the baseline TabTransformer and the PSO-tuned model revealed consistent improvements across training and test datasets. Key regression metrics—RMSE, MAE, and R²—indicate both better predictive accuracy and stronger explanatory power following PSO. On the training set, the tuned model achieved an RMSE reduction from 0.082 to 0.062 and MAE reduction from 0.027 to 0.019, with R² increasing from 0.97 to 0.98. The test set exhibited similar trends: RMSE declined from 0.132 to 0.123, MAE from 0.038 to 0.034, and R² improved from 0.93 to 0.938. These differences, while seemingly modest, are meaningful in spatial environmental modeling where incremental gains can translate into substantially improved risk delineation over heterogeneous landscapes. The reduction in error metrics implies that the PSO-tuned model’s predictions more closely approximate observed heatwave occurrences, reducing uncertainty in spatial risk assessments. Higher R² values indicate better model fit, capturing more variance and underlying environmental drivers. The substantial discrepancy between country-specific and regional aggregate performance improvements reveals a critical geographic dimension in heatwave susceptibility modeling that warrants careful consideration. While the TabTransformer–PSO model consistently outperformed its baseline counterpart across all Central Asian nations, the magnitude of improvement varied dramatically between individual countries and their collective regional representation. This phenomenon suggests the presence of distinct climatic, topographic, and environmental characteristics within each country that respond differently to the PSO algorithm, indicating that heatwave susceptibility patterns are inherently heterogeneous across the Central Asian landscape. 
The exceptional performance observed in Uzbekistan, which achieved nearly 58% improvement in both MAE and RMSE metrics, can be attributed to several interconnected factors that distinguish this nation within the regional context. Uzbekistan’s geographic position in the heart of Central Asia, characterized by diverse topographic features ranging from the Kyzylkum Desert to mountainous regions, creates complex microclimatic conditions that benefit significantly from the enhanced feature selection and hyperparameter optimization provided by the PSO algorithm.
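To put the regional gains on the same relative scale as the country-level percentages, the reported error values can be converted to percent reductions. The sketch below gives the standard definitions of RMSE, MAE, and R² and applies a simple percent-reduction formula to the training and test values quoted above; the function names are illustrative and not taken from the study's code.

```python
import math

def rmse(y_true, y_pred):
    # Root mean squared error.
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    # Mean absolute error.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    # Coefficient of determination: 1 - SS_res / SS_tot.
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def percent_reduction(baseline, tuned):
    # Relative error reduction of the tuned model, in percent.
    return 100.0 * (baseline - tuned) / baseline

# Values quoted in the text (baseline TabTransformer -> TabTransformer-PSO).
test_rmse_gain = percent_reduction(0.132, 0.123)
test_mae_gain = percent_reduction(0.038, 0.034)
train_rmse_gain = percent_reduction(0.082, 0.062)
train_mae_gain = percent_reduction(0.027, 0.019)
```

With the quoted values, the regional test-set reductions come to about 6.8% for RMSE and 10.5% for MAE, and the training-set reductions to about 24.4% and 29.6%. These figures give the regional reference point against which the country-level improvements can be read.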

The geographic clustering evident in the heatwave susceptibility maps provides additional context for understanding these performance variations. The southwestern regions, encompassing much of Uzbekistan and Turkmenistan, display predominantly very high heatwave susceptibility, creating distinct patterns that the PSO-optimized model can more effectively differentiate and predict. This spatial coherence in high-risk areas allows the enhanced model to better delineate the boundaries between susceptibility classes, resulting in more accurate classifications and consequently lower error metrics. In contrast, countries such as Kazakhstan show more heterogeneous susceptibility patterns across their vast territory, potentially explaining why their improvement percentages, while substantial, do not reach the levels observed in Uzbekistan. The aggregation effect that reduces apparent model improvements from country-specific levels to the regional level is reminiscent of Simpson’s paradox in geospatial analysis, although here the pooled improvement is diluted rather than reversed. When individual country datasets are pooled to create a regional model, the unique characteristics and optimization benefits specific to each geographic domain become diluted within the larger, more generalized dataset. This phenomenon has profound implications for practical implementation of heatwave early warning systems, suggesting that country-specific model deployment would yield significantly better prediction accuracy than a single regional model approach. The weighted average effect of pooling diverse geographic conditions, climate regimes, and topographic features across the entire Central Asian region creates a modeling environment where the PSO algorithm cannot fully exploit the localized patterns that drive its exceptional performance at individual country scales. The implications of these findings extend beyond academic interest to practical climate adaptation strategies across Central Asia. 
The demonstrated country-specific benefits of PSO suggest that national meteorological services would achieve superior heatwave prediction accuracy by implementing individualized models rather than relying on regional approaches. This localized modeling strategy could significantly enhance early warning system effectiveness, potentially saving lives and reducing economic impacts associated with extreme heat events. Moreover, the geographic heterogeneity in model improvements indicates that countries experiencing the highest susceptibility to heatwaves, such as those in the southwestern portions of the region, also benefit most from advanced modeling techniques, creating a synergistic relationship between vulnerability and predictive capability enhancement.

Furthermore, the AUROC analysis confirmed a strengthened ability to discriminate between heatwave and non-heatwave classes. High AUROC values (>0.9) across both models denote strong classification performance, with PSO tuning yielding consistent gains. This reinforces the tuned model’s potential utility in operational hazard mapping where precise identification of high-risk zones is critical. Importantly, the congruence of training and test performances indicates good generalization, signifying that regularization and model complexity control mechanisms embedded within the transformer and optimized via PSO effectively prevent overfitting. Given the spatial autocorrelation and temporal variability inherent in environmental data, this robust performance highlights the framework’s capacity to handle real-world complexities. The comparative 10-fold cross-validation results indicate that both the baseline TabTransformer and the PSO-optimized variant delivered consistently strong classification performance, achieving mean AUC values above 0.93 across all folds. The TabTransformer–PSO model not only attained a slightly higher mean AUC (0.94 vs. 0.93) but also recorded a lower standard deviation (0.06 vs. 0.07), suggesting a more stable performance profile across the folds. This modest yet consistent improvement supports the effectiveness of PSO in fine-tuning hyperparameters to enhance both predictive accuracy and model robustness. The relatively narrow variance in fold-wise results for both models underscores their generalizability, while PSO appears to further reduce susceptibility to performance fluctuations caused by variations in the training–validation split. In the context of heatwave susceptibility mapping, such stability is crucial for ensuring reliable spatial predictions under varying input conditions.
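For readers less familiar with the metric, AUROC can be read as the probability that a randomly chosen heatwave sample receives a higher score than a randomly chosen non-heatwave sample, with ties counted as one half. The sketch below implements that pairwise definition directly and adds a helper for the mean and standard deviation across cross-validation folds. The function names and the toy labels and scores are assumptions for illustration, not data from the study.

```python
from statistics import mean, stdev

def auroc(labels, scores):
    """Pairwise (Mann-Whitney) AUROC for binary labels in {0, 1}."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive correctly ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))

def summarize_folds(fold_aucs):
    # Mean and sample standard deviation of per-fold AUROC values.
    return mean(fold_aucs), stdev(fold_aucs)

# Toy data (hypothetical): three heatwave and three non-heatwave samples.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
example_auc = auroc(labels, scores)
```

In the toy data, eight of the nine positive-negative pairs are ranked correctly, so the result is 8/9. The quadratic pairwise loop is chosen for clarity; a rank-based formulation would be preferable for large rasters.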

It is noteworthy that our goal was to develop a truly remote sensing–based framework, given the unique characteristics of heatwaves and the geographic realities of Central Asia. Unlike sudden-onset hazards such as landslides, where precise event localization is possible through on-site observation, heatwaves are dynamic, spatially extensive, “creeping” phenomena. Their occurrence and intensity fluctuate gradually across regions, often without a singular, pinpointable ground locus. Remote sensing–derived indices—such as the Heat Wave Index (HWI)—enable the systematic detection of widespread anomalies over large areas, which is essential for regions of this scale. Moreover, Central Asia’s vast territorial expanse presents practical challenges for in situ data collection, even where ground-based records exist. Ensuring well-distributed, spatially representative heatwave evidence is considerably more difficult than simply maximizing the number of samples. Our approach intentionally balanced abundance with representativeness, selecting 200 georeferenced points from remote sensing products to reflect the diversity of climatological zones and land covers across all five countries. A larger sample size (e.g., thousands of densely clustered points) could risk overfitting and degrade generalizability. This rationale is consistent with similar environmental susceptibility studies—for example, Liu et al. [91] employed a composite Standardized Drought Condition Index (SDCI) derived entirely from remote sensing as an evidence layer for drought modeling when harmonized, large-scale ground truth was unavailable. Even where heatwave impact records exist, they are typically sparse, irregularly distributed, or influenced by differences in national reporting standards. The rarity of large-scale heatwave susceptibility mapping reflects not only methodological novelty but also the constraints of assembling suitable ground evidence across international, multi-ecotope domains.

Finally, it should be noted that both remote sensing indices and ground-based datasets have inherent sources of uncertainty. While RS-based indices may be limited by coarse spatial or temporal resolution, ground-truth heatwave data is itself complicated by subjective thresholds (e.g., population vulnerability, adaptation, and perception), making the construction of a comprehensive and standardized ground-evidence database inherently challenging.

4.6. Agreement and Divergence in Susceptibility Outputs

The areal comparisons of heatwave susceptibility maps indicate that PSO led to subtle but meaningful shifts in the spatial pattern of susceptibility classes. Specifically, the TabTransformer–PSO model produced a larger extent of very-high-susceptibility zones (increase of ≈ 56,000 km²) coupled with reductions in the low (decrease of ≈ 71,200 km²) and very low categories. Although the relative percentage changes may appear small, the absolute differences translate to tens of thousands of square kilometers—an extent large enough to encompass multiple provinces or major metropolitan regions in Central Asia. If we treat the PSO-optimized model as the more reliable baseline, then the weaker-performing TabTransformer effectively misclassified this ≈ 56,000 km² as low or very-low risk when it should have been flagged as very high susceptibility. Such false-negative errors carry significant management and socio-economic consequences: regions wrongly assumed to be safe may become targets for urban expansion, infrastructure investment, or agricultural intensification, placing populations, economic assets, and critical systems squarely in harm’s way during future heatwaves. This creates a risk of avoidable losses in human health, agricultural productivity, and infrastructure resilience, along with increased emergency response burdens. Conversely, the contraction of low-risk areas in the PSO-optimized output signals that safety margins in certain locations may be narrower than previously assumed, highlighting the need for adaptive planning even in zones historically considered resilient. While the shifts in moderate- and high-susceptibility classes were relatively minor, the consistent coverage patterns across models still reinforce their agreement on large-scale susceptibility gradients.

The statistical evaluation of susceptibility maps generated by the two models sheds light on spatial consistency and differences resulting from hyperparameter tuning. The Chi-squared test revealed highly significant differences (p < 0.0001) in class frequency distributions, indicating that the models allocate areas differently across susceptibility categories. This suggests that PSO tuning modifies the sensitivity and thresholds applied to raw predictive outputs, refining the spatial classification of heatwave hazard. The Friedman test (p < 0.0001) confirmed systematic differences in spatial susceptibility rankings. This test’s sensitivity to ordinal differences highlights how PSO tuning affects not only class frequencies but the relative ordering of areas by vulnerability. Finally, the Wilcoxon signed-rank test comparing paired spatial predictions confirmed that differences at individual locations are significant, reinforcing that tuning alters local susceptibility estimates rather than merely shifting global statistics. This is critical for practical applications, where local accuracy determines the effectiveness of targeted interventions. Together, these statistical tests validate that hyperparameter tuning materially changes susceptibility outputs. The changes likely arise from enhanced model generalization, improved feature weighting, and better calibration of decision boundaries within the transformer architecture. While the broad spatial patterns of susceptibility remain coherent, PSO tuning provides a more nuanced delineation of hotspots and transitional zones, potentially enabling more precise resource allocation and hazard mitigation.
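As an illustration of the first of these tests, the Pearson chi-squared statistic for two class-frequency distributions can be computed from a 2 × k contingency table (two maps, k susceptibility classes). The sketch below is one plausible reading of that comparison, not the authors' exact procedure, and the pixel counts are hypothetical placeholders.

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the hypothesis of identical class distributions.
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Hypothetical pixel counts per class: very low, low, moderate, high, very high.
baseline_counts = [300, 250, 200, 150, 100]
tuned_counts = [280, 230, 200, 160, 130]
stat, dof = chi_squared_statistic([baseline_counts, tuned_counts])

# Critical value of the chi-squared distribution for 4 degrees of freedom
# at the 0.05 level.
CRITICAL_DF4_ALPHA_05 = 9.488
significant = stat > CRITICAL_DF4_ALPHA_05
```

For these toy counts the statistic is about 5.76 on 4 degrees of freedom, below the 0.05 critical value. The statistic scales linearly with the total count, so the same proportional shift evaluated over a much larger number of pixels would cross the threshold easily.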

4.7. Geographic Interpretation: Country-Wise Discussion of Heatwave Susceptibility

A country-wise examination of the susceptibility outputs, contextualized using an overlay of country borders, reveals substantive patterns that underscore both regional climatic realities and the advantages conferred by PSO-based model optimization. Kazakhstan, the largest country in the study area, displays striking north–south contrasts. In both models, the northern and eastern regions, covering the Kazakh Uplands and the Altai foothills, are predominantly classified as low to very low susceptibility, corresponding to their cooler, higher-elevation, and more temperate climate regimes. Moving southward, particularly below approximately 45° N latitude, susceptibility intensifies markedly, with the strongest effects observed in the dry, expansive lowlands bordering Uzbekistan and around the Aral Sea basin. The TabTransformer–PSO output offers a distinctly sharper separation of these zones compared to the baseline model, narrowing the bands of high susceptibility to those areas that are physiographically most exposed to extreme heat. This refinement aligns well with observed climatic trends, lending support to the model’s improved realism. Uzbekistan emerges as one of the most heatwave-prone countries in the study, a finding consistent across both susceptibility maps. Most of the republic falls into the high or very-high-susceptibility categories, particularly the Kyzylkum Desert, the lower Amu Darya basin, and urban centers such as Tashkent and Samarkand. The PSO-enhanced map delineates urban and irrigated oases with better granularity, differentiating moderate susceptibility zones in the Fergana Valley and river corridors from the contiguous high-susceptibility deserts. Notably, the cities and agricultural heartlands are more tightly circumscribed by zones of extreme exposure in the PSO-optimized model, emphasizing elevated vulnerability in densely populated areas.

Turkmenistan consistently shows widespread high to very high susceptibility throughout its interior, especially across the Karakum Desert and surrounding lowland plains. The area of maximal susceptibility is slightly reduced and more precisely defined in the TabTransformer–PSO result, with clearer boundaries between high-risk desert interiors and the relatively less-susceptible borderlands near the Kopet Dag foothills along the Iranian frontier. This improved spatial resolution is critical for targeting interventions in this country, where marginal climatic and water resource conditions prevail. Kyrgyzstan is characterized by markedly lower susceptibility in its mountainous east and central regions, as both models capture the climate-mitigating influence of high elevations in the Tien Shan. Lower-lying areas in the southwest, including the Fergana Valley and the Chuy Valley near Bishkek, transition into higher susceptibility categories, though the PSO-based model portrays these at slightly greater spatial fidelity, avoiding overgeneralization into neighboring highlands. The result is a risk landscape that mirrors actual climatic and topographical gradients, crucial for a predominantly mountainous nation. Tajikistan, dominated by the Pamir and Alai mountains, overwhelmingly falls into the very low to moderate susceptibility classes, especially in the east and central massif. However, both models identify the southwestern and western lowland areas—especially the Vakhsh and Panj Valley corridors—as moderate to high-risk zones. The optimized TabTransformer–PSO provides a finer discrimination of these river-basin risk areas, crucial for hazard mapping in Tajikistan’s densely settled, agriculturally intensive valleys.

In a nutshell, while broad patterns are preserved, PSO ensures greater spatial precision and ecological plausibility across all countries. Susceptibility classes are more tightly correlated with topo-climatic and environmental characteristics—such as temperature and rainfall, aspect and elevation, and known heatwave hotspots—under the TabTransformer–PSO model. This sharpening of boundaries and improved mapping of transitional zones increases the operational value of susceptibility outputs, supporting targeted policy and intervention efforts in the region’s varied socio-environmental contexts. Ultimately, the integration of advanced deep learning and metaheuristic optimization yields susceptibility maps that are not only statistically robust but also geographically meaningful—an advance of direct relevance to heatwave adaptation planning in Central Asia.

The country-wise susceptibility patterns produced by our TabTransformer–PSO framework are strongly supported by independent climatological and impact-focused studies. In Kazakhstan, our very high-risk zones in the southern lowlands and around the Aral Sea closely correspond to regions identified by Wang et al. [42] as experiencing the most frequent and intense heatwaves, driven primarily by rapid soil moisture decline from reduced precipitation and elevated net radiation. The March 2025 West Asia heatwave attribution study further highlights southern and eastern Kazakhstan as recording unprecedented early-season heat, coinciding with sensitive agricultural phenophases [66]; this aligns with the agricultural–public health vulnerability focus of Broomandi et al. [64]. In Uzbekistan, our concentration of very high susceptibility in the Kyzylkum Desert, the lower Amu Darya basin, and urban Tashkent mirrors WWA [66] findings, which attribute these hotspots to combined soil moisture scarcity, UHI intensification, and dependence on glacier-fed irrigation. For Turkmenistan, we map pervasive high susceptibility across the Karakum Desert core, with moderating effects along Caspian coastal regions; this is consistent with Wang et al. [42], who observed chronic aridity and radiation load in the desert interior, and with WWA [66], which documented severe March 2025 anomalies in the country’s central and southern lowlands. In Kyrgyzstan, our model emphasizes elevated susceptibility in the low-lying Chuy and Fergana valleys, matching the hotspot valleys recognized by Fallah et al. [36] in multi-decadal heat extremes, and corroborating WWA [66] observations of UHI amplification in Bishkek and other expanding urban centers. Similarly, in Tajikistan, our identification of the Vakhsh and Panj valleys as higher susceptibility zones is supported by Fallah et al. [36], who cite intense pre-monsoon radiation and low evaporative cooling as key drivers, and WWA [66], which notes early-season extremes affecting these irrigated lowlands. 
Across all countries, the dominant environmental factors emphasized in the previous literature (notably soil moisture depletion, precipitation deficits, high solar radiation, UHI effects, and elevation) are explicitly represented among our predictors, reinforcing the physical credibility of the susceptibility patterns. Where differences arise, such as our sharper discrimination of Caspian coastal moderation in Turkmenistan, they largely reflect the finer 30 m RS-driven resolution and heatwave-specific focus of our approach, compared with broader temperature-extreme assessments in some studies. These comparisons are provided in detail in Table 11, summarizing country-specific hotspots, associated environmental drivers, and the level of agreement between our high-resolution heatwave-focused mapping and broader temperature-extreme or multi-hazard assessments from the literature.

4.8. Broader Implications and Applications

The integration of advanced transformer-based deep learning with PSO hyperparameter tuning in this study offers a powerful, data-driven methodology for heatwave susceptibility mapping in Central Asia. This region’s wide-ranging climatic zones, complex terrain, and rapid socioeconomic changes pose significant challenges to traditional statistical or physically based models, which often rely on sparse observations and linear assumptions. By leveraging high-resolution satellite data and flexible neural architectures, this approach captures the nonlinear, multivariate relationships driving heatwave dynamics, supporting scalable and transferable modeling. PSO’s improvements support its broader use in geospatial deep learning to boost performance and reliability. The practical applications of these findings are manifold. Urban planners and disaster management agencies can use fine-grained susceptibility maps to identify vulnerable neighborhoods and prioritize mitigation actions such as increasing urban green spaces, improving building materials, and implementing early warning systems. Policymakers can integrate these spatial risk assessments into climate adaptation strategies, directing investments to areas with greatest need. Furthermore, the methodological framework’s reliance on publicly available data and open-source algorithms enhances reproducibility and facilitates adaptation to other regions facing heatwave risks. Future work can build on this foundation by incorporating additional socio-economic variables, climate projections, or multimodal datasets to enrich susceptibility characterization.

From a climate risk governance perspective, accuracy gains that translate into the reclassification of tens of thousands of square kilometers are not trivial: they can reshape priorities for early-warning systems, heat-health action plans, agricultural advisories, and infrastructure cooling strategies, preventing long-term socio-economic setbacks in one of the world’s most heatwave-vulnerable regions. This case highlights that even small numerical differences in performance metrics should not be overlooked; a model with seemingly high accuracy, such as the baseline TabTransformer, may still lead to substantial misclassification when applied across large geographic extents, resulting in far-reaching and potentially detrimental consequences. It should be emphasized that susceptibility maps, particularly those generated at continental-to-regional scales, are intended to guide preliminary susceptibility assessments and policy prioritization. They identify relative hotspots that warrant further, site-specific evaluation, rather than providing definitive, actionable sites at high spatial precision. For management application, top-down analytical frameworks such as ours should be followed by detailed, local validation and customized interventions.
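
The reclassified area discussed above can be tallied by cross-tabulating two class maps and multiplying pixel counts by the pixel area. The following is a minimal sketch under stated assumptions, not the procedure used in this study: the function name `transition_area`, the toy 3 x 4 maps, the five-class coding (0 = very low to 4 = very high), and the 1 km² pixel area are all hypothetical.

```python
import numpy as np

def transition_area(baseline, optimized, n_classes, pixel_area_km2):
    """Cross-tabulate two class maps and convert pixel counts to area (km²).

    Entry [i, j] is the area assigned to class i by the baseline map and to
    class j by the optimized map.
    """
    # Encode each (baseline, optimized) pair as a single integer, then count.
    flat = baseline.ravel() * n_classes + optimized.ravel()
    counts = np.bincount(flat, minlength=n_classes ** 2)
    return counts.reshape(n_classes, n_classes) * pixel_area_km2

# Hypothetical 3 x 4 maps with classes 0 (very low) to 4 (very high).
baseline = np.array([[0, 0, 1, 4],
                     [1, 2, 3, 4],
                     [0, 1, 4, 4]])
optimized = np.array([[4, 0, 1, 4],
                      [4, 2, 3, 4],
                      [0, 1, 3, 4]])
area = transition_area(baseline, optimized, n_classes=5, pixel_area_km2=1.0)

# Area the baseline called low or very low but the optimized map calls very high.
missed = area[:2, 4].sum()
print(missed)
```

The diagonal of the resulting matrix holds the area on which both maps agree, so off-diagonal entries show directly which class transitions account for the differences between models.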

4.9. Uncertainty Considerations in Heatwave Susceptibility Mapping

Uncertainty is an inherent component of environmental risk mapping, arising from multiple sources within both the input data and the modeling process. In the present study, all predictors and evidence layers were derived from remote sensing and global reanalysis products, which, although spatially continuous, differ in spatial resolution, temporal frequency, and retrieval algorithms. For example, ERA5-derived heatwave evidence is provided at a coarse grid (0.25°) and must be downscaled to align with finer-resolution environmental predictors such as MODIS-based NDVI or land surface temperature. This multi-scale integration can introduce spatial misalignment and propagate errors through the modeling pipeline. Additionally, variable retrievals—such as vegetation indices or land cover classifications—are sensitive to atmospheric correction quality, sensor calibration, and seasonal acquisition timing, contributing to measurement uncertainty.
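
To illustrate why multi-scale integration matters, the sketch below replicates each coarse cell onto a finer grid with NumPy (nearest-neighbour resampling). It is a simplified stand-in and not the resampling procedure used in this study; the function name `upsample_nearest`, the toy values, and the refinement factor are hypothetical.

```python
import numpy as np

def upsample_nearest(coarse: np.ndarray, factor: int) -> np.ndarray:
    """Replicate each coarse cell into a factor x factor block of fine pixels."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    # np.repeat along both axes copies every value without interpolation.
    return np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)

# Hypothetical 2 x 2 field of heatwave-day counts on a coarse grid.
coarse = np.array([[3.0, 7.0],
                   [5.0, 9.0]])
fine = upsample_nearest(coarse, factor=4)   # 8 x 8 fine grid

# All fine pixels inside one coarse cell share a single value.
block = fine[:4, :4]
print(fine.shape, block.min(), block.max())
```

Because every fine pixel within a coarse cell inherits the same value, fine-resolution predictors can vary inside a cell while the evidence layer cannot, which is one route by which scale mismatch enters the model.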

Model-driven uncertainty is also relevant. While the TabTransformer architecture can capture complex, non-linear relationships across mixed-type variables, its training outcomes are sensitive to hyperparameter selection, initialization, and stochastic learning dynamics. Although PSO-based optimization helped converge toward a performant solution, slight variations in training data splits or optimizer hyperparameters could yield differences in predicted susceptibility surfaces. Furthermore, the absence of in situ ground-truth data for direct validation introduces epistemic uncertainty: while the susceptibility maps display strong internal performance metrics, they do not quantify how residual errors vary across different environmental contexts or administrative regions.

From a management perspective, such uncertainties influence the confidence with which susceptibility classes can be used for early warning, land-use planning, or infrastructure prioritization. Overestimation could lead to inefficient allocation of resources, whereas underestimation might leave vulnerable areas unprepared. These uncertainties could be mitigated through improved spatial harmonization of input datasets, incorporation of independent ground observations where feasible, and ensemble modeling that averages over varying model configurations and data perturbations. Future work should extend the present framework with a formal spatial uncertainty analysis (such as Monte Carlo simulations, prediction interval mapping, or pixel-level variance decomposition) to generate both susceptibility and confidence surfaces. Such paired outputs would allow stakeholders to integrate uncertainty directly into the translation of heatwave susceptibility into risk assessment and decision-making, supporting more robust and transparent hazard management.
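
One way to obtain paired susceptibility and confidence surfaces is to summarize an ensemble of model runs pixel by pixel. The sketch below is an assumption-laden illustration and was not part of this study: the function name `ensemble_surfaces` is hypothetical, and synthetic noise around a fixed field stands in for what would in practice be repeated model training with different seeds or resampled data.

```python
import numpy as np

def ensemble_surfaces(predictions: np.ndarray):
    """Return per-pixel mean, standard deviation, and 5-95% interval width.

    predictions has shape (n_members, rows, cols); each member is one model
    run (for example a different random seed or resampled training set).
    """
    mean = predictions.mean(axis=0)
    std = predictions.std(axis=0, ddof=1)            # sample spread across members
    lo, hi = np.percentile(predictions, [5, 95], axis=0)
    return mean, std, hi - lo

# Hypothetical stand-in for repeated model runs: a fixed susceptibility field
# plus member-specific noise. A real analysis would retrain the model instead.
rng = np.random.default_rng(0)
field = np.linspace(0.0, 1.0, 12).reshape(3, 4)
members = np.clip(field + rng.normal(0.0, 0.05, size=(30, 3, 4)), 0.0, 1.0)

mean, std, width = ensemble_surfaces(members)
print(mean.round(2))
print(std.round(3))
```

The mean array plays the role of the susceptibility surface, while the standard deviation or interval width can be mapped alongside it as a confidence surface.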

4.10. Limitations and Future Research Directions

While remote sensing-based approaches offer powerful advantages for transboundary, data-scarce susceptibility mapping, they also introduce limitations. The absence of ground-based meteorological and impact data in this study both reflects regional data scarcity and necessarily restricts some aspects of model calibration and validation. Although the workflow leverages ERA5 and satellite products for maximally consistent and reproducible mapping, ground observations could provide additional insights for model adjustment, especially regarding localized or microclimatic extreme events. Differences in remote sensing product resolution, revisit intervals, and potential data artifacts introduce further uncertainty. Additionally, the dynamic and distributed nature of heatwaves means that associations with on-the-ground impacts are not always direct; population- and context-dependent vulnerabilities further complicate the establishment of hard ground “truth”. Nevertheless, as more in situ meteorological and health-impact datasets become available in Central Asia, future work should pursue integrated, hybrid approaches that fuse remote sensing and ground-based evidence. This could improve both the generality and precision of susceptibility mapping, as well as contribute to operational heatwave risk management. Finally, susceptibility maps should be understood as a screening tool for broad hazard assessment and policy prioritization. While our results identify generalized hazard hotspots, localized on-site validation and more granular investigations are needed before implementing targeted adaptation or management actions. Future work could therefore combine detailed field studies or citizen-reported impact records with RS-derived susceptibility to refine and localize intervention strategies.

In addition, while this study adopted a two-way (training/validation) partitioning strategy supplemented with 10-fold cross-validation and internal validation within the TabTransformer architecture, we acknowledge that a three-way split (training/validation/testing) can be especially valuable for smaller, noisier, or highly imbalanced datasets. Such partitioning allows for clearer separation between hyperparameter tuning and final performance evaluation. Although the abundance and spatial representativeness of our dataset, coupled with rigorous cross-validation, rendered a third, dedicated subset unnecessary for the present work, future studies in data-limited settings (particularly those relying on shorter climatic records or rare event samples) could benefit from adding one to further safeguard against overfitting and improve generalizability.
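
A three-way partition of the kind described above can be sketched in a few lines. This is a generic illustration, not the partitioning used in this study; the function name `three_way_split`, the 15% fractions, and the sample count are hypothetical.

```python
import numpy as np

def three_way_split(n_samples, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle sample indices and cut them into train/validation/test parts."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_frac))
    n_val = int(round(n_samples * val_frac))
    test = idx[:n_test]
    val = idx[n_test:n_test + n_val]
    train = idx[n_test + n_val:]
    return train, val, test

train, val, test = three_way_split(1000)
# Hyperparameters (for example via PSO) would be scored on `val` only;
# `test` is reserved for a single final evaluation.
print(len(train), len(val), len(test))
```

Keeping the test indices out of every tuning step is what separates hyperparameter selection from final performance reporting.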

Another methodological limitation relates to the absence of direct benchmarking between PSO and alternative hyperparameter optimization strategies such as random search, Bayesian optimization, or other metaheuristics. While the present study adopted PSO based on established literature highlighting its convergence efficiency and suitability for high-dimensional search spaces, the relative performance of different optimizers was not empirically assessed for this specific application. Future work should therefore incorporate systematic head-to-head evaluations under comparable experimental conditions to quantify trade-offs in accuracy, convergence speed, and computational cost, and to better characterize optimizer–model interactions in the context of environmental hazard mapping.
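
A head-to-head comparison of the kind proposed above requires that each optimizer receives the same number of objective evaluations. The sketch below shows that setup on a toy problem; it is not the PSO configuration used in this study. The sphere function stands in for a validation loss, and the swarm size, inertia weight `w`, acceleration coefficients `c1` and `c2`, bounds, and budget are illustrative assumptions.

```python
import numpy as np

def sphere(x):
    """Toy objective with minimum 0 at the origin; stands in for validation loss."""
    return float(np.sum(x ** 2))

def random_search(f, bounds, budget, rng):
    """Evaluate `budget` uniformly sampled points and keep the best value."""
    lo, hi = bounds
    samples = rng.uniform(lo, hi, size=(budget, lo.size))
    return min(f(s) for s in samples), budget

def pso(f, bounds, budget, rng, n_particles=10, w=0.7, c1=1.5, c2=1.5):
    """Basic global-best PSO that stops once the evaluation budget is spent."""
    lo, hi = bounds
    pos = rng.uniform(lo, hi, size=(n_particles, lo.size))
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), np.array([f(p) for p in pos])
    evals = n_particles
    while evals + n_particles <= budget:
        g = pbest[np.argmin(pbest_val)]                  # global best position
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
        pos = np.clip(pos + vel, lo, hi)                 # keep particles in bounds
        vals = np.array([f(p) for p in pos])
        evals += n_particles
        better = vals < pbest_val                        # update personal bests
        pbest[better], pbest_val[better] = pos[better], vals[better]
    return float(pbest_val.min()), evals

bounds = (np.full(3, -5.0), np.full(3, 5.0))
budget = 200
rs_best, rs_evals = random_search(sphere, bounds, budget, np.random.default_rng(1))
pso_best, pso_evals = pso(sphere, bounds, budget, np.random.default_rng(1))
print(f"random search: {rs_best:.4f} ({rs_evals} evals)")
print(f"PSO:           {pso_best:.4f} ({pso_evals} evals)")
```

A single seeded run says little about either method; repeating the comparison over many seeds and reporting the distribution of best values would be needed before drawing conclusions.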

5. Conclusions

This study demonstrated the integration of the TabTransformer deep learning architecture with Particle Swarm Optimization (PSO) metaheuristics for remote sensing-based heatwave susceptibility mapping in Central Asia. Multiple independent evaluation approaches, including 10-fold cross-validation, statistical significance testing (Chi-squared, Friedman, and Wilcoxon, all p < 0.0001), and areal extent analysis, consistently showed that the PSO-optimized model delivered higher predictive accuracy, greater stability across folds, and improved spatial delineation of susceptibility zones compared with the baseline. These findings attest to the robustness of the framework under different data partitions and support its generalizability across varying geographic subsets within the study area. The model’s applicability is reinforced by the fact that even marginal gains in AUROC and classification precision translated into reclassifications of up to tens of thousands of square kilometers, enough to substantially alter the designation of priority areas for adaptation.

From a practical standpoint, the areal comparison between models underscores the tangible management consequences of enhanced mapping accuracy: if the weaker-performing baseline had been used as the sole decision support, approximately 56,000 km² of land—previously labeled as low or very low risk—would not have been recognized as very high susceptibility. Such false-negative misclassifications could inadvertently channel urban expansion projects, infrastructure investment, or agricultural intensification into areas facing elevated heatwave risk, with long-term socio-economic and public health consequences. Conversely, the reduction in low-risk zones in the PSO-optimized output warns of narrower safety margins, prompting more cautious land-use and climate adaptation planning even in places traditionally seen as resilient. This case demonstrates that even modest shifts in headline performance metrics can trigger large-scale changes in operational decision landscapes when applied to continental-scale hazard mapping, yet the consistent broad-pattern agreement between models still affirms their convergence on the dominant regional susceptibility gradients.

The produced susceptibility maps highlight pronounced spatial patterns across the study area, identifying southern and southwestern subregions—such as Turkmenistan, Uzbekistan, and southern Kazakhstan—as most prone to heatwave hazards, in line with climatic and physiographic realities. The PSO-optimized model particularly excelled at sharpening spatial transitions and reducing the overextension of high-susceptibility zones, enhancing practical interpretability and value for targeting interventions. Feature importance analyses highlighted the primacy of thermal metrics—maximum temperature and frequency of hot days—while also affirming supporting roles for rainfall, land surface temperature, and topography. From a policy and planning perspective, the susceptibility maps generated through this approach can serve as actionable tools for prioritizing communities and infrastructure for heat adaptation interventions. Decision-makers can use these spatial outputs to identify high-risk hotspots, optimize early warning systems, and guide targeted allocation of resources such as cooling centers, green infrastructure, and public awareness campaigns.

However, we acknowledge that the spatial resolution of satellite-derived datasets imposes certain limitations, particularly in heterogeneous or urbanized landscapes where local-scale variations may be underrepresented. While in situ impact data were not uniformly available for our study region, we sought out comparable and relevant studies reporting historical heatwave impacts in analogous climatic contexts to provide a fair, real-world basis for interpreting our validation results. Integrating in situ impact datasets in future applications will strengthen context-specific calibration, enhance operational confidence in the model’s outputs, and improve the granularity and accuracy of subsequent susceptibility assessments.

The proposed framework is also highly scalable and transferable to other data-scarce regions that experience heatwave hazards, provided that comparable remote sensing and ancillary datasets are available. This scalability arises from three features: the exclusive use of open-access, globally consistent satellite-derived indicators; the domain-agnostic structure of the TabTransformer model, which flexibly accommodates diverse geospatial and environmental variables; and the PSO-based automatic hyperparameter optimization, which minimizes manual adjustment and the need for expert regional intervention. Adaptation to diverse climatic and landscape contexts can thus be achieved with only minor modifications to input features or thresholds, enabling rapid and reproducible deployment at broader geographic scales.

Looking forward, expanding the framework to incorporate multi-year temporal validation, finer-resolution predictors in urbanized or heterogeneous areas, and integration with region-specific impact datasets will further enhance confidence in operational deployment. These steps, combined with its demonstrated robustness, scalability, and consistency with regional heatwave trends, position the approach as an immediately valuable and continually improvable tool for climate adaptation strategy in Central Asia and beyond.

Author Contributions

Conceptualization, A.W.; methodology, A.W.; software, A.W.; validation, A.W. and L.S.; formal analysis, A.W.; investigation, A.W.; resources, A.W.; data curation, A.W.; writing—original draft preparation, A.W.; writing—review and editing, A.W., L.S. and H.J.; visualization, A.W.; supervision, H.J.; project administration, A.W.; funding acquisition, A.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was jointly funded by the 2024 Henan Police College’s college-level research project (HNJY-2024-SSZX-37) and Henan Province 2024 Science and Technology Research and Development Program Joint Fund (242103810099).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author(s).

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Nairn, J.R.; Fawcett, R.J. The excess heat factor: A metric for heatwave intensity and its use in classifying heatwave severity. Int. J. Environ. Res. Public Health 2015, 12, 227–253.
2. Nishant, N.; Ji, F.; Guo, Y.; Herold, N.; Green, D.; Di Virgilio, G.; Perkins-Kirkpatrick, S. Future population exposure to Australian heatwaves. Environ. Res. Lett. 2022, 17, 064030.
3. Dong, C.; Wang, X.; Ran, Y.; Nawaz, Z. Heatwaves significantly slow the vegetation growth rate on the Tibetan Plateau. Remote Sens. 2022, 14, 2402.
4. Ngarambe, J.; Santamouris, M.; Yun, G.Y. The impact of urban warming on the mortality of vulnerable populations in Seoul. Sustainability 2022, 14, 13452.
5. Cvijanovic, I.; Mistry, M.N.; Begg, J.D.; Gasparrini, A.; Rodó, X. Importance of humidity for characterization and communication of dangerous heatwave conditions. NPJ Clim. Atmos. Sci. 2023, 6, 33.
6. Klingelhöfer, D.; Braun, M.; Brüggmann, D.; Groneberg, D.A. Heatwaves: Does global research reflect the growing threat in the light of climate change? Glob. Health 2023, 19, 56.
7. García-León, D.; Casanueva, A.; Standardi, G.; Burgstall, A.; Flouris, A.D.; Nybo, L. Current and projected regional economic impacts of heatwaves in Europe. Nat. Commun. 2021, 12, 5807.
8. Chen, K.; Boomsma, J.; Holmes, H.A. A multiscale analysis of heatwaves and urban heat islands in the western US during the summer of 2021. Sci. Rep. 2023, 13, 9570.
9. Daryanto, S.; Wang, L.; Jacinthe, P.A. Global synthesis of drought effects on maize and wheat production. PLoS ONE 2016, 11, e0156362.
10. Heinicke, S.; Frieler, K.; Jägermeyr, J.; Mengel, M. Global gridded crop models underestimate yield responses to droughts and heatwaves. Environ. Res. Lett. 2022, 17, 044026.
11. Ghafarian, F.; Wieland, R.; Nendel, C. Estimating the evaporative cooling effect of irrigation within and above soybean canopy. Water 2022, 14, 319.
12. Chen, Y.; Zhang, W.; Zhou, T. Increasing exposure of global croplands productivity to growing season heatwaves under climate warming. Environ. Res. Lett. 2024, 19, 104073.
13. Dell, M.; Jones, B.F.; Olken, B.A. What do we learn from the weather? The new climate-economy literature. J. Econ. Lit. 2014, 52, 740–798.
14. Sudarshan, A.; Somanathan, E.; Somanathan, R.; Tewari, M. The Impact of Temperature on Productivity and Labor Supply-Evidence from Indian Manufacturing; Working Paper No. 244; RWI – Leibniz-Institut für Wirtschaftsforschung: Essen, Germany, 2015.
15. Wang, Y.; Zhao, N. Spatiotemporal Variations of Global Human-Perceived Heatwave Risks and their Driving Factors Based on Machine Learning. Remote Sens. 2023, 15, 3627.
16. Renninger, A.; Holubowska, O.; Blanchard, P. Remote sensing and GPS mobility reveal heat’s impact on human activity across diverse climates. Phys. Soc. 2024, in press.
17. Ruthrof, K.X.; Breshears, D.D.; Fontaine, J.B.; Froend, R.H.; Matusick, G.; Kala, J.; Hardy, G.E.S.J. Subcontinental heat wave triggers terrestrial and marine, multi-taxa responses. Sci. Rep. 2018, 8, 13094.
18. Ruffault, J.; Curt, T.; Moron, V.; Trigo, R.M.; Mouillot, F.; Koutsias, N.; Belhadj-Khedher, C. Increased likelihood of heat-induced large wildfires in the Mediterranean Basin. Sci. Rep. 2020, 10, 13790.
19. Stone, B., Jr.; Mallen, E.; Rajput, M.; Gronlund, C.J.; Broadbent, A.M.; Krayenhoff, E.S.; Georgescu, M. Compound climate and infrastructure events: How electrical grid failure alters heat wave risk. Environ. Sci. Technol. 2021, 55, 6957–6964.
20. Kephart, J.L.; Sánchez, B.N.; Moore, J.; Schinasi, L.H.; Bakhtsiyarava, M.; Ju, Y.; Rodríguez, D.A. City-level impact of extreme temperatures and mortality in Latin America. Nat. Med. 2022, 28, 1700–1705.
21. García-García, A.; Cuesta-Valero, F.J.; Miralles, D.G.; Mahecha, M.D.; Quaas, J.; Reichstein, M.; Peng, J. Soil heat extremes can outpace air temperature extremes. Nat. Clim. Change 2023, 13, 1237–1241.
22. Chongtaku, T.; Taparugssanagorn, A.; Miyazaki, H.; Tsusaka, T.W. Enhanced Spatiotemporal Heatwave Analysis in Urban and Non-Urban Thai Environments Through Integration of In-Situ and Remote Sensing Data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 19174–19193.
23. Argüeso, D.; Evans, J.P.; Fita, L.; Bormann, K.J. Simulated impact of urban expansion on future temperature heatwaves in Sydney. In Proceedings of the 20th International Congress on Modelling and Simulation, Adelaide, Australia, 1–6 December 2013; pp. 3179–3185.
24. Park, J.; Kim, J. Defining heatwave thresholds using an inductive machine learning approach. PLoS ONE 2018, 13, e0206872.
25. Asadollah, S.B.H.S.; Khan, N.; Sharafati, A.; Shahid, S.; Chung, E.S.; Wang, X.J. Prediction of heat waves using meteorological variables in diverse regions of Iran with advanced machine learning models. Stoch. Environ. Res. Risk Assess. 2022, 36, 2387–2402.
26. Suthar, G.; Singh, S.; Kaul, N.; Khandelwal, S.; Singhal, R.P. Prediction of maximum air temperature for defining heat wave in Rajasthan and Karnataka states of India using machine learning approach. Remote Sens. Appl. Soc. Environ. 2023, 32, 101048.
27. Toure, M.; Thiaw, W.M.; Sy, I.; Bekele, E.; Gueye, O.; Bhuiyan, M.A.E.; Gaye, A.T. Machine learning-based prediction of heatwave-related hospitalizations: A case study in Matam, Senegal. Int. J. Environ. Res. Public Health 2025, 22, 1349.
28. Chongtaku, T.; Taparugssanagorn, A.; Miyazaki, H.; Tsusaka, T.W. Integrating remote sensing and ground-based data for enhanced spatial–temporal analysis of heatwaves: A machine learning approach. Appl. Sci. 2024, 14, 3969.
29. Reischl, C.; Rauter, R.; Posch, A. Urban vulnerability and adaptation to heatwaves: A case study of Graz (Austria). Clim. Policy 2018, 18, 63–75.
30. Cremonini, L.; Georgiadis, T.; Nardino, M.; Rossi, F.; Rossi, A.; Pinca, G.; Fazzini, M. Tools for urban climate adaptation plans: A case study on Bologna and outcomes for heat wave impact reduction. Challenges 2023, 14, 48.
31. Zografos, C.; Anguelovski, I.; Grigorova, M. When exposure to climate change is not enough: Exploring heatwave adaptive capacity of a multi-ethnic, low-income urban community in Australia. Urban Clim. 2016, 17, 248–265.
32. Li, F.; Yigitcanlar, T.; Nepal, M.; Thanh, K.N.; Dur, F. A novel urban heat vulnerability analysis: Integrating machine learning and remote sensing for enhanced insights. Remote Sens. 2024, 16, 3032.
33. Nasim, W.; Amin, A.; Fahad, S.; Awais, M.; Khan, N.; Mubeen, M.; Wahid, A.; Rehman, M.H.; Ihsan, M.Z.; Ahmad, S.; et al. Future risk assessment by estimating historical heat wave trends with projected heat accumulation using SimCLIM climate model in Pakistan. Atmos. Res. 2018, 205, 118–133.
34. Dubey, A.K.; Lal, P.; Kumar, P.; Kumar, A.; Dvornikov, A.Y. Present and future projections of heatwave hazard-risk over India: A regional earth system model assessment. Environ. Res. 2021, 201, 111573.
35. Tripathy, K.P.; Mukherjee, S.; Mishra, A.K.; Mann, M.E.; Williams, A.P. Climate change will accelerate the high-end risk of compound drought and heatwave events. Proc. Natl. Acad. Sci. USA 2023, 120, e2219825120.
36. Fallah, B.; Didovets, I.; Rostami, M.; Hamidi, M. Climate change impacts on Central Asia: Trends, extremes and future projections. Int. J. Climatol. 2024, 44, 3191–3213.
37. Nair, A.V.; Araujo, D.S.; Dobariya, R.; Gurung, D.R.; Nikolopoulos, E.I. Will high heatwave risk be the new normal over high-elevation regions of northwestern High Mountain Asia in the future? Environ. Res. Commun. 2025, 7, 061001.
38. Yang, S.; Ren, Q.; Zhou, N.; Zhang, Y.; Wu, X. Deep Learning for Near-Surface Air Temperature Estimation from FengYun 4A Satellite Data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 17, 13108–13119.
39. Liang, L.; Yu, L.; Wang, Z. Identifying the dominant impact factors and their contributions to heatwave events over mainland China. Sci. Total Environ. 2022, 848, 157527.
40. Adnan, M.S.G.; Dewan, A.; Botje, D.; Shahid, S.; Hassan, Q.K. Vulnerability of Australia to heatwaves: A systematic review on influencing factors, impacts, and mitigation options. Environ. Res. 2022, 213, 113703.
41. Xi, D.; Liu, L.; Zhang, M.; Huang, C.; Burkart, K.G.; Ebi, K.; Ji, J.S. Risk factors associated with heatwave mortality in Chinese adults over 65 years. Nat. Med. 2024, 30, 1489–1498.
42. Wang, X.; Li, Y.; Chen, Y.; Li, Y.; Wang, C.; Kaldybayev, A.; Duan, W. Intensification of heatwaves in Central Asia from 1981 to 2020–Role of soil moisture reduction. J. Hydrol. 2023, 627, 130395.
43. Li, T.; Song, F.; Bao, J.; De Maeyer, P.; Yuan, Y.; Huang, X.; Goethals, P. Historical and projected cropland impacts of heatwaves in central Asia under climate change. Earth’s Future 2025, 13, e2024EF005595.
44. Zheng, J.; Wu, X.; Li, X.; Peng, J. Near-Surface Air Temperature Estimation Based on an Improved Conditional Generative Adversarial Network. Sensors 2024, 24, 5972.
45. Hoang, N.D.; Tran, V.D.; Huynh, T.C. From Data to Insights: Modeling Urban Land Surface Temperature Using Geospatial Analysis and Interpretable Machine Learning. Sensors 2025, 25, 1169.
46. Zhang, Y.; Liu, J.; Shen, W. A review of ensemble learning algorithms used in remote sensing applications. Appl. Sci. 2022, 12, 8654.
47. Tombe, R.; Viriri, S. Remote sensing image scene classification: Advances and open challenges. Geomatics 2023, 3, 137–155.
48. Deng, P.; Xu, K.; Huang, H. When CNNs meet vision transformer: A joint framework for remote sensing scene classification. IEEE Geosci. Remote Sens. Lett. 2021, 19, 1–5.
49. Xu, K.; Deng, P.; Huang, H. Vision transformer: An excellent teacher for guiding small networks in remote sensing image scene classification. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–15.
50. Wang, X.; Zhu, J.; Yan, Z.; Zhang, Z.; Zhang, Y.; Chen, Y.; Li, H. LaST: Label-free self-distillation contrastive learning with transformer architecture for remote sensing image scene classification. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5.
51. Zhang, J.; Tang, B.; Hu, S. Infrared and visible image fusion based on particle swarm optimization and dense block. Front. Energy Res. 2022, 10, 1001450.
52. Afroosheh, S.; Askari, M. Fusion of Deep Learning and GIS for Advanced Remote Sensing Image Analysis. arXiv 2024, arXiv:2412.19856.
53. Li, P.; Yu, Y.; Huang, D.; Wang, Z.H.; Sharma, A. Regional heatwave prediction using graph neural network and weather station data. Geophys. Res. Lett. 2023, 50, e2023GL103405.
54. Wu, Y.; Gong, M.; Miao, Q.; Qin, K. Computational Intelligence in Remote Sensing. Remote Sens. 2023, 15, 5325.
55. Yokoya, N.; Yamanoi, K.; He, W.; Baier, G.; Adriano, B.; Miura, H.; Oishi, S. Breaking limits of remote sensing by deep learning from simulated data for flood and debris-flow mapping. IEEE Trans. Geosci. Remote Sens. 2020, 60, 1–15.
56. Perkins, S.E.; Alexander, L.V. On the measurement of heat waves. J. Climate 2013, 26, 4500–4517.
57. Shikhovtsev, A.Y.; Kovadlo, P.G.; Kiselev, A.V.; Eselevich, M.V.; Lukin, V.P. Neural network modeling of atmospheric optical turbulence using meteorological reanalysis data. Publ. Astron. Soc. Pac. 2023, 135, 014503.
58. Sun, H.; Yao, T.; Su, F.; He, Z.; Tang, G.; Li, N.; Zheng, B.; Huang, J.; Meng, F.; Ou, T.; et al. Corrected ERA5 precipitation by machine learning significantly improved flow simulations for the Third Pole basins. J. Hydrometeorol. 2022, 23, 1663–1682.
59. Russo, S.; Sillmann, J.; Fischer, E.M. Top ten European heatwaves since 1950 and their occurrence in the coming decades. Environ. Res. Lett. 2015, 10, 124003.
60. Murari, K.K.; Ghosh, S.; Patwardhan, A.; Daly, E.; Salvi, K. Intensification of future severe heat waves in India and their effect on heat stress and mortality. Reg. Environ. Change 2015, 15, 569–579.
61. Tomlinson, C.; Chapman, L.; Thornes, J.; Baker, C. Derivation of Birmingham’s summer surface urban heat island from MODIS satellite images. Int. J. Climatol. 2012, 32, 214–224.
62. Ahmadalipour, A.; Moradkhani, H. Escalating heat-stress mortality risk due to global warming in the Middle East and North Africa (MENA). Environ. Int. 2018, 117, 215–225.
63. Smid, M.; Russo, S.; Costa, A.C.; Granell, C.; Pebesma, E. Ranking European capitals by exposure to heat waves and cold waves. Urban Clim. 2019, 27, 388–402.
64. Broomandi, P.; Satyanaga, A.; Bagheri, M.; Hadei, M.; Galán-Madruga, D.; Fard, A.M.; Kim, J.R. Extreme Temperature Events in Kazakhstan and Their Impacts on Public Health and Energy Demand. Glob. Chall. 2025, 9, 2400207.
65. Wang, C.; Li, Z.; Chen, Y.; Li, Y.; Liu, X.; Hou, Y.; Sun, F. Increased compound droughts and heatwaves in a double pack in Central Asia. Remote Sens. 2022, 14, 2959.
66. WWA [World Weather Attribution]. Extraordinary March Heatwave in Central Asia Up to 10 °C Hotter in a Warming Climate. 2025. Available online: http://www.worldweatherattribution.org/wp-content/uploads/WWA-scientific-report-West-Asia-heat.pdf (accessed on 20 June 2025).
67. European Centre for Medium-Range Weather Forecasts (ECMWF). ERA5 Documentation. Available online: https://climate.copernicus.eu/climate-reanalysis (accessed on 20 June 2025).
68. Hersbach, H.; Bell, B.; Berrisford, P.; Hirahara, S.; Horányi, A.; Muñoz-Sabater, J.; Nicolas, J.; Peubey, C.; Radu, R.; Schepers, D.; et al. The ERA5 global reanalysis. Q. J. R. Meteorol. Soc. 2020, 146, 1999–2049.
69. Hersbach, H. The Climate Data Guide: ERA5 Atmospheric Reanalysis. National Center for Atmospheric Research Staff, Ed.; Available online: https://climatedataguide.ucar.edu/climate-data/era5-atmospheric-reanalysis (accessed on 20 June 2025).
70. Schneider, D.P.; Deser, C.; Fasullo, J.; Trenberth, K.E. Climate Data Guide Spurs Discovery and Understanding. Eos Trans. AGU 2013, 94, 121–122.
71. Kornejady, A.; Ownegh, M.; Bahremand, A. Landslide susceptibility assessment using maximum entropy model with two different data sampling methods. Catena 2017, 152, 144–162.
72. Kornejady, A.; Ownegh, M.; Rahmati, O.; Bahremand, A. Landslide susceptibility assessment using three bivariate models considering the new topo-hydrological factor: HAND. Geocarto Int. 2018, 33, 1155–1185.
73. Pourghasemi, H.R.; Kornejady, A.; Kerle, N.; Shabani, F. Investigating the effects of different landslide positioning techniques, landslide partitioning approaches, and presence-absence balances on landslide susceptibility mapping. Catena 2020, 187, 104364.
74. Yu, X.; Chen, H. Research on the influence of different sampling resolution and spatial resolution in sampling strategy on landslide susceptibility mapping results. Sci. Rep. 2024, 14, 1549.
75. Huang, X.; Khetan, A.; Cvitkovic, M.; Karnin, Z. TabTransformer: Tabular data modeling using contextual embeddings. arXiv 2020, arXiv:2012.06678.
76. Vyas, T.K. Deep learning with tabular data: A self-supervised approach. arXiv 2024, arXiv:2401.15238.
77. Zhao, J.; Gao, L.; Ren, S. Prediction of open-pit mine truck travel time based on LSTM-TabTransformer. Sci. Rep. 2025, 15, 7427.
78. Kennedy, J.; Eberhart, R. Particle swarm optimization. In Proceedings of the ICNN’95 - International Conference on Neural Networks, Perth, Australia, 27 November–1 December 1995; IEEE: New York, NY, USA, 1995; Volume 4, pp. 1942–1948.
79. Shi, Y. Particle swarm optimization: Developments, applications and resources. In Proceedings of the 2001 Congress on Evolutionary Computation (IEEE Cat. No. 01TH8546), Seoul, Republic of Korea, 26–29 May 2001; IEEE: New York, NY, USA, 2001; Volume 1, pp. 81–86.
80. Sengupta, S.; Basak, S.; Peters, R.A. Particle Swarm Optimization: A survey of historical and recent developments with hybridization perspectives. Mach. Learn. Knowl. Extr. 2018, 1, 157–191.
81. Rahmati, O.; Kornejady, A.; Samadi, M.; Deo, R.C.; Conoscenti, C.; Lombardo, L.; Bui, D.T. PMT: New analytical framework for automated evaluation of geo-environmental modelling approaches. Sci. Total Environ. 2019, 664, 296–311.
82. Tian, Y.; Su, D.; Lauria, S.; Liu, X. Recent advances on loss functions in deep learning for computer vision. Neurocomputing 2022, 497, 129–158.
83. Dong, J.; Peng, J.; He, X.; Corcoran, J.; Qiu, S.; Wang, X. Heatwave-induced human health risk assessment in megacities based on heat stress–social vulnerability–human exposure framework. Landsc. Urban Plan. 2020, 203, 103907.
84. Stéfanon, M.; Schindler, S.; Drobinski, P.; de Noblet-Ducoudré, N.; Andrea, F.D. Simulating the effect of anthropogenic vegetation land cover on heatwave temperatures over central France. Clim. Res. 2014, 60, 133–146.
85. Al-Jodah, A.; Abbas, S.J.; Hasan, A.F.; Humaidi, A.J.; Mahdi Al-Obaidi, A.S.; Al-Qassar, A.A.; Hassan, R.F. PSO-based optimized neural network PID control approach for a four wheeled omnidirectional mobile robot. Int. Rev. Appl. Sci. Eng. 2023, 14, 58–67.
86. Zhao, J.; Fu, Y.; Mei, J. An improved dynamic cooperative random drift particle swarm optimization algorithm based on search history decision. J. Algorithms Comput. Technol. 2020, 14, 1748302620973537.
87. Swain, A.; Salkuti, S.R.; Swain, K. An optimized and decentralized energy provision system for smart cities. Energies 2021, 14, 1451.
88. Chernyak, Y.; Mohammad, I.A.; Masnicak, N.; Pivoluska, M.; Plesch, M. Harmonic oscillator based particle swarm optimization. PLoS ONE 2025, 20, e0326173.
89. Onyelowe, K.C.; Moghal, A.A.B.; Ahmad, F.; Rehman, A.U.; Hanandeh, S. Numerical model of debris flow susceptibility using slope stability failure machine learning prediction with metaheuristic techniques trained with different algorithms. Sci. Rep. 2024, 14, 19562.
90. Wang, Z.L.; Ogawa, T.; Adachi, Y. Influence of algorithm parameters of Bayesian optimization, genetic algorithm, and particle swarm optimization on their optimization performance. Adv. Theory Simul. 2019, 2, 1900110.
91. Liu, J.; Li, M.; Li, R.; Shalamzari, M.J.; Ren, Y.; Silakhori, E. Comprehensive assessment of drought susceptibility using predictive modeling, climate change projections, and land use dynamics for sustainable management. Land 2025, 14, 337.

Figure 1. Map of the study area in Central Asia, including its constituent countries.

Figure 2. Methodological flowchart adopted in this study.

Figure 3. Heat Wave Index (HWI) derived from the ERA5 reanalysis dataset and 200 extracted heatwave samples.

Figure 4. Thematic maps of heatwave causative factors used as input variables for the deep learning-based susceptibility model; (a) Temperature; (b) Rainfall; (c) Humidity; (d) Number of hot days; (e) Number of heat index days; (f) Elevation; (g) Slope; (h) NDVI; (i) LST; (j) Population density; (k) Albedo; (l) Land cover; (m) Aspect.

Figure 5. PSO convergence curve showing R² score improvement over 100 iterations.

Figure 6. Loss (a) and accuracy plots (b) over training epochs (learning progression) in the TabTransformer model.

Figure 7. Loss (a) and accuracy plots (b) over training epochs (learning progression) in the PSO-optimized TabTransformer model.

Figure 8. Heatwave susceptibility maps (HSM) generated by (a) the baseline TabTransformer model and (b) its PSO-optimized counterpart.

Figure 9. ROC curves and AUC values comparing the classification performance of TabTransformer and TabTransformer–PSO models on the test set.

Figure 10. ROC curves for the 10-fold cross-validation of (a) TabTransformer and (b) TabTransformer–PSO models. Thin colored lines represent individual folds, the thick blue line shows the mean ROC curve, and the shaded region marks ± 1 SD. The dashed red line indicates the no-skill classifier (AUC = 0.5).

Table 1. Causative factors used in the heatwave susceptibility model in Central Asia along with their data sources.

Factor	Source	Temporal Coverage	Spatial Resolution
Elevation	Shuttle Radar Topography Mission (SRTM)	Static	30 m
Slope	Shuttle Radar Topography Mission (SRTM)	Static	30 m
Aspect	Shuttle Radar Topography Mission (SRTM)	Static	30 m
Normalized Difference Vegetation Index (NDVI)	Landsat-8	2015–2024	30 m
Land surface temperature (LST)	Landsat-8	2015–2024	30 m
Albedo	Landsat-8	2015–2024	30 m
Population density	https://hub.worldpop.org/ (accessed on 20 June 2025)	2020	1 km
Land cover	ESA Sentinel-2	2017–2023	10 m
Maximum Temperature	ERA5	1991–2020	0.25° (~27.75 km)
Rainfall	ERA5	1991–2020	0.25° (~27.75 km)
Humidity	ERA5	1991–2020	0.25° (~27.75 km)
Number of Hot Days (Tmax > 30 °C)	ERA5	1991–2020	0.25° (~27.75 km)
Number of Days with Heat Index (>35 °C)	ERA5	1991–2020	0.25° (~27.75 km)

Table 2. Relative importance scores of the 13 remote sensing-derived causative factors in predicting heatwave susceptibility, as derived from the trained TabTransformer model.

Independent Variables	Importance
Elevation	0.095
Slope	0.1
Aspect	0.136
Normalized Difference Vegetation Index (NDVI)	0.072
Land surface temperature (LST)	0.14
Albedo	0.098
Population density	0.096
Land cover	0.047
Maximum Temperature	0.313
Rainfall	0.19
Humidity	0.114
Number of Hot Days (Tmax > 30 °C)	0.245
Number of Days with Heat Index (>35 °C)	0.028

Table 3. Configuration parameters used in the Particle Swarm Optimization (PSO) algorithm for hyperparameter tuning of the TabTransformer model.

Parameter	Value
Local learning coefficient	2
Global learning coefficient	2
Minimum inertia weight	0.4
Number of epochs	100
Population size (swarm size)	50
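
The settings in Table 3 can be read as the inputs to a standard PSO loop. The following is a minimal sketch only, not the authors' implementation: the toy `sphere` cost stands in for the model-based objective (the study tunes TabTransformer hyperparameters against R²), and the maximum inertia weight `w_max = 0.9`, the linear inertia decay, the search bounds, and the velocity clamp are all assumptions that Table 3 does not specify. The "number of epochs" is interpreted here as the number of PSO iterations.

```python
import random


def sphere(x):
    """Toy cost to minimize; a hypothetical stand-in for the real objective."""
    return sum(v * v for v in x)


def pso(cost, dim=2, bounds=(-5.0, 5.0), swarm_size=50, n_iter=100,
        c1=2.0, c2=2.0, w_min=0.4, w_max=0.9, seed=0):
    """Basic global-best PSO. swarm_size, n_iter, c1, c2 and w_min follow
    Table 3; w_max, bounds and the velocity clamp are assumed values."""
    rng = random.Random(seed)
    lo, hi = bounds
    v_max = 0.2 * (hi - lo)  # assumed velocity clamp
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(swarm_size)]
    vel = [[0.0] * dim for _ in range(swarm_size)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(swarm_size), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    history = [gbest_cost]

    for t in range(n_iter):
        # Inertia weight decays linearly from w_max to w_min (assumed schedule).
        w = w_max - (w_max - w_min) * t / max(n_iter - 1, 1)
        for i in range(swarm_size):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])   # local term
                             + c2 * r2 * (gbest[d] - pos[i][d]))     # global term
                vel[i][d] = max(-v_max, min(v_max, vel[i][d]))
                pos[i][d] = max(lo, min(hi, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
        history.append(gbest_cost)
    return gbest, gbest_cost, history


best_pos, best_cost, history = pso(sphere)
print(best_pos, best_cost)
```

In a hyperparameter search, each particle position would encode one candidate configuration (for example embedding dimension, depth, and dropout rates), and `cost` would train and score the model; maximizing R² is equivalent to minimizing its negative.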

Table 4. Comparison of default and PSO-optimized TabTransformer models’ hyperparameters, along with the best R² score achieved during model training.

Algorithm	Embedding Dimension	Transformer Depth	Attention Heads	Attention Dropout	Feedforward Dropout	Best Cost (R²)
TabTransformer (Default)	32	4	8	0.1	0.1	-
PSO	8	1	3	0.2	0.17	0.96

Table 5. Comparison of model performance metrics (RMSE, MAE, and R²) for TabTransformer and TabTransformer–PSO on the training and test datasets.

Model	Train RMSE	Train MAE	Train R²	Test RMSE	Test MAE	Test R²
TabTransformer	0.082	0.027	0.97	0.132	0.038	0.93
TabTransformer–PSO	0.062	0.019	0.98	0.123	0.034	0.938
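
For readers who want to reproduce the kind of figures reported in Table 5, the sketch below computes RMSE, MAE, and R² from their standard definitions. It is illustrative only: the function name `regression_metrics` and the four-sample toy data are assumptions, not the study's code or data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R²) using the standard definitions."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2


# Hypothetical labels (1 = heatwave sample) and predicted susceptibilities.
y_true = [0, 1, 1, 0]
y_pred = [0.1, 0.9, 0.8, 0.2]
rmse, mae, r2 = regression_metrics(y_true, y_pred)
print(round(rmse, 3), round(mae, 3), round(r2, 3))
```

Because RMSE squares the errors before averaging, it penalizes a few large misses more heavily than MAE does, which is one reason the two are usually reported together.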

Table 6. Performance comparison between baseline TabTransformer and TabTransformer–PSO models across Central Asian countries, showing MAE and RMSE metrics with corresponding percentage improvements achieved through PSO enhancement.

Country	Base Model MAE	PSO Model MAE	MAE Improvement (%)	Base Model RMSE	PSO Model RMSE	RMSE Improvement (%)
Kazakhstan	0.122	0.073	40.16	0.168	0.110	34.52
Kyrgyzstan	0.081	0.062	23.46	0.107	0.087	18.69
Tajikistan	0.100	0.053	47.00	0.100	0.053	47.00
Turkmenistan	0.098	0.052	46.94	0.135	0.066	51.11
Uzbekistan	0.137	0.058	57.66	0.174	0.072	58.62
Regional Average	0.038	0.034	10.53	0.132	0.123	6.82
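
The improvement columns in Table 6 are consistent with a relative reduction in error, (base − PSO) / base × 100. The short check below recomputes two rows under that assumption; the helper name `improvement_pct` is illustrative.

```python
def improvement_pct(base, tuned):
    """Relative error reduction in percent, rounded to two decimals."""
    return round((base - tuned) / base * 100, 2)


# (base MAE, PSO MAE, base RMSE, PSO RMSE) copied from Table 6.
rows = {
    "Kazakhstan": (0.122, 0.073, 0.168, 0.110),
    "Uzbekistan": (0.137, 0.058, 0.174, 0.072),
}

improvements = {
    country: (improvement_pct(mae_b, mae_p), improvement_pct(rmse_b, rmse_p))
    for country, (mae_b, mae_p, rmse_b, rmse_p) in rows.items()
}
print(improvements)
```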

Table 7. AUROC performance metrics of TabTransformer and TabTransformer–PSO models, including AUC values, standard errors, and 95% confidence intervals on the test dataset.

Models	AUC	Standard Error	95% Confidence Interval
TabTransformer	0.933	0.0247	0.873 to 0.971
TabTransformer–PSO	0.940	0.0222	0.881 to 0.975

Table 8. Fold-wise AUC values from 10-fold cross-validation for the baseline TabTransformer and the PSO-optimized TabTransformer–PSO models.

Fold	TabTransformer AUC	TabTransformer–PSO AUC
1	0.86	0.94
2	1	1
3	1	1
4	0.81	0.94
5	0.94	0.92
6	1	1
7	0.89	0.86
8	0.97	1
9	0.83	0.83
10	0.97	0.89
Mean	0.93	0.94
Standard Deviation	0.07	0.06
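
The summary rows of Table 8 can be recomputed from the fold-wise values with Python's standard library. This sketch uses the sample standard deviation (`statistics.stdev`); whether the study used the sample or the population form is not stated, though both round to the same two-decimal values here.

```python
import statistics

# Fold-wise AUC values copied from Table 8.
auc_base = [0.86, 1, 1, 0.81, 0.94, 1, 0.89, 0.97, 0.83, 0.97]
auc_pso = [0.94, 1, 1, 0.94, 0.92, 1, 0.86, 1, 0.83, 0.89]

summary = {
    name: (round(statistics.mean(vals), 2), round(statistics.stdev(vals), 2))
    for name, vals in (("TabTransformer", auc_base),
                       ("TabTransformer-PSO", auc_pso))
}
print(summary)
```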

Table 9. Areal extent of heatwave susceptibility classes for Central Asia derived from the baseline TabTransformer and PSO-optimized TabTransformer–PSO models. Percentages are relative to the total area of Central Asia (~4 million km²); corresponding areas in km² are shown in parentheses and rounded to the nearest hundred.

Susceptibility Class	TabTransformer (%)	TabTransformer (km²)	TabTransformer–PSO (%)	TabTransformer–PSO (km²)
Very low	20.22	808,800	19.7	788,000
Low	21.91	876,400	20.13	805,200
Moderate	20.29	811,600	20.64	825,600
High	19.43	777,200	19.98	799,200
Very high	18.15	726,000	19.55	782,000
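
The areas in Table 9 follow from applying each class percentage to the approximate total area of about 4 million km² stated in the caption and rounding to the nearest hundred. A minimal sketch of that conversion, with the helper name `class_area_km2` chosen for illustration:

```python
TOTAL_AREA_KM2 = 4_000_000  # approximate area of Central Asia (Table 9 caption)


def class_area_km2(pct, total=TOTAL_AREA_KM2):
    """Convert a class share in percent to km², rounded to the nearest hundred."""
    return int(round(pct / 100 * total, -2))


# Class shares for the baseline TabTransformer map, copied from Table 9.
base_pct = {"Very low": 20.22, "Low": 21.91, "Moderate": 20.29,
            "High": 19.43, "Very high": 18.15}
base_area = {cls: class_area_km2(p) for cls, p in base_pct.items()}
print(base_area)
```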

Table 10. Statistical test results comparing susceptibility maps generated by TabTransformer and TabTransformer–PSO models.

Test	Models Compared	Statistic	Mean Rank (If Applicable)	Significance Level
Chi-squared test	TabTransformer	69.7	–	p < 0.0001
Chi-squared test	TabTransformer–PSO	69.6	–	p < 0.0001
Friedman test	TabTransformer	–	1.008	p < 0.0001
Friedman test	TabTransformer–PSO	–	1.9	p < 0.0001
Wilcoxon test	TabTransformer vs. TabTransformer–PSO	Z = −9.49	–	p < 0.0001

Table 11. Cross-validation of our remote sensing (RS)-driven TabTransformer–PSO heatwave susceptibility outputs against independent observational, attribution, and climate-impact studies in Central Asia.

Country	Literature Hotspots	Our Model Hotspots	Main Drivers in Studies	Agreement/Notes
Kazakhstan	Southern lowlands incl. Syr Darya & Chu basins, Aral Sea vicinity; NW Kazakhstan extreme heatwave frequency hotspots. Early March 2025 severe heat in S & E Kazakhstan [66].	Very high susceptibility in south & west, lower in NE uplands.	Soil moisture depletion, reduced precipitation, high net radiation [42]; early onset heat linked to anticyclonic circulation [66]; agriculture–health exposure synergy [64].	Strong agreement; PSO-enhanced maps provide finer gradient in south.
Uzbekistan	Kyzylkum Desert, Fergana Valley, Amu Darya delta, urban Tashkent high risk [66].	Very high susceptibility in Kyzylkum and Amu Darya basin; high in Tashkent & irrigated oases.	Soil moisture deficit; UHI in Tashkent; irrigation dependence [66].	Agreement; our maps reveal sharper urban–rural contrast.
Turkmenistan	Karakum Desert; Caspian lowlands moderately less extreme [42]. March 2025 event struck central/south [66].	Widespread high susceptibility in Karakum; moderate in Caspian lowlands.	Chronic soil dryness; low precipitation; high radiation load; elevation moderation near Kopet Dag [42].	Matches broad literature, with finer delineation of coastal moderation.
Kyrgyzstan	Low valleys esp. Chuy & Fergana suffer hottest extremes [36,66]; UHI in Bishkek [66].	Moderate–high susceptibility in Chuy & Fergana valleys; lower in highlands.	Elevation effects; valley heat pooling; UHI amplification from urban expansion [36].	Agreement; PSO resolution outlines valley floors distinctly.
Tajikistan	Vakhsh & Panj valleys hotter months; pre-monsoon extremes [36,66].	Higher susceptibility in Vakhsh & Panj; lower in Pamir highlands.	Solar radiation, low evapotranspiration, irrigation, low elevation [36].	Agreement; our maps nuance variation between valley segments.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Wang, A.; Sun, L.; Jia, H. Harnessing TabTransformer Model and Particle Swarm Optimization Algorithm for Remote Sensing-Based Heatwave Susceptibility Mapping in Central Asia. Atmosphere 2025, 16, 1166. https://doi.org/10.3390/atmos16101166

