Spatiotemporal Variations of Global Human-Perceived Heatwave Risks and their Driving Factors Based on Machine Learning

: With ongoing global warming, heatwave-related disasters are on the rise, exerting a multifaceted impact on both the natural ecosystem and human society. While temperature has been extensively studied in the effects of extreme heat on human health, humidity has often been ignored. It is crucial to consider the combined inﬂuence of temperature and humidity when assessing heatwave risks and safeguarding human well-being. This study, leveraging remote sensing products and reanalysis data, presented the ﬁrst analysis of the spatiotemporal variations in global human-perceived heatwaves on a seasonal scale from 2000 to 2020


Introduction
Heatwaves have profound impacts on various aspects of human activities and natural ecosystems, such as threatening human health [1], increasing morbidity [2] and mortality [3], reducing worker productivity [4], damaging public facilities, and reducing crop yields [5] and economic output [6].In the current global warming environment, scholars have observed a significant increase in heatwaves in several regions [7][8][9][10], jeopardizing rapid and sustainable global development [11][12][13][14].Many experts in the field of climate change are currently conducting studies on extreme heat and its risk, most of which have traditionally focused only on the most fundamental single variable, i.e., near-surface temperature.However, the complex nature of these events necessitates a more comprehensive understanding beyond temperature alone.
To better understand extreme heat events and their effects, scholars are increasingly working on the interaction between temperature and other factors [15][16][17][18][19][20].In hot environments, the human body relies on perspiration to dissipate heat and maintain its well-being, and high relative humidity can impede the body's convection and heat dissipation abilities.In severe cases, the inability to effectively dissipate heat can lead to elevated body temperatures beyond the body's tolerance, resulting in various heat-related illnesses and even fatalities.Therefore, focusing solely on near-surface air temperature is insufficient to accurately assess the comfort level of the environment for human health [20,21], and it is crucial to consider the synergistic effects of temperature and humidity when studying heatwave risks, which are at the forefront of current climate change prediction research [22][23][24][25][26][27].However, studies in relation to heat events that consider the synergistic effects of temperature and humidity which are more reflective of real human comfort remain limited.As a result, these studies lack sufficient reference value for fully understanding the actual impact on human beings.Besides, further research is also needed to better understand the main influencing factors and driving mechanisms involved.
Previous qualitative analyses in spatiotemporal variability of extreme heat events and their attribution have mostly relied on methods such as linear regression [28][29][30], which are important for understanding the drivers of extreme heat variability, but may also lack a certain degree of objectivity in the analysis.In contrast, RF regression offers several advantages over traditional methods [31].By utilizing multiple decision trees, RF regression can capture complex nonlinear relationships, allowing for the learning of diverse patterns and features in the data.Additionally, RF regression is robust against noise and outliers, as it averages predictions from multiple trees, thus reducing the impact of individual inaccuracies or outliers and minimizing model variance.Furthermore, RF regression mitigates the risk of overfitting through random sampling and the random nature of decision trees.By controlling parameters such as maximum depth and minimum leaf samples, the model's complexity can be regulated, enhancing its generalization capability [31][32][33].RF regression has found wide application in various fields, including hydrology [34][35][36][37], climate [38][39][40], disease [41][42][43][44], and the environment [45][46][47][48].By combining expert knowledge and selecting relevant drivers through human decision-making, the application of RF models can provide a comprehensive analysis of the drivers influencing heatwave distribution.
Understanding the spatiotemporal variation of global heatwaves and identifying the factors influencing their distribution is a prerequisite for the effective mitigation of extreme heatwave risks.However, despite a growing number of studies focusing on heat events, few have considered the synergistic effect of temperature and humidity.Based on the near-surface temperature and relative humidity data from the ERA5 dataset and a series of remote sensing products as driving factors, we here present the first global analysis of spatiotemporal variation characteristics of human-perceived heatwaves and their risks at seasonal scales during 2000-2020 using HI reflecting actual human comfort.At the same time, the RF model was applied to quantify the explanatory power and importance of different driving factors in topography, climate, and humans, on the spatial distribution of heatwave risk and the impact after interaction between each two factors.The findings of this study aim to serve as scientific references for informing policy design in climate change adaptation, disaster prevention and mitigation, heatwave management, and risk reduction strategies across different regions.

Materials and Methods
Based on the near-surface temperature and relative humidity data from the ERA5 dataset and a series of remote sensing products as driving factors, this study analyzes the spatiotemporal characteristics of global heatwaves and their risks at seasonal scales during 2000-2020.Followed by several previous related studies, DEM [49], slope [50], and aspect [51] are selected as topographic factors, precipitation [52] as a climatic factor, and population density (popdens) [53], nighttime light (NTL) [54], and land-use coverage (LUCC) [55] as human factors (see Table 1).According to some existing research, these above factors could be inferred to contribute, to a greater or lesser extent, directly or indirectly, to extreme heat events such as heatwaves.For example, elevation may affect temperature distribution, with temperatures generally decreasing as altitude increases, leading to varying heatwave effects in areas with different elevations [56].Besides, slope and aspect influence solar radiation received and heat distribution on the ground surface, subsequently affecting the surface temperature.South-facing or sunlit slopes tend to receive more solar radiation, potentially culminating in elevated temperatures and heightened heatwave risks [57].Apart from that, precipitation impacts regional humidity and evapotranspiration, which may in turn shape human perception of heatwaves.Areas with less precipitation are likely to experience relatively amplified heatwave risks [58].Furthermore, nighttime lights can serve as an indicator of urbanization, and the urban heat island effect prevalent in urban areas can exacerbate heatwaves [59].Additionally, in densely populated areas, the limited capacity for human heat dissipation intensifies the detrimental effects of heatwaves on human health [60].Lastly, different land use types affect surface albedo, vegetation cover, soil moisture, and other factors, which may also influence the spatial distribution of heatwaves [61].The effects of the seven abovementioned factors on the spatial distribution of global heatwaves were quantified by applying RF model regression after a uniform spatial data preprocessing approach, and the technical route is shown in Figure 1.

Data Sources
The climate data used in this study are the 2 m temperature, precipitation, and relative humidity data from the ERA5 monthly averaged dataset and hourly dataset (https: //cds.climate.copernicus.eu/cdsapp#!/search?type=dataset(accessed on 23 February 2023)).ERA5 is the fifth-generation ECMWF reanalysis for the global climate and weather for the past 8 decades available from 1940 onwards, which replaces the ERA-Interim reanalysis.Reanalysis combines model data with observations from across the world into a globally complete and consistent dataset using the laws of physics, and there are various observations (both satellite and in-situ) used as input to ERA5 [62].We divided a year into four quarters, with March-May as the first quarter (S1); June-August as the second quarter (S2); September-November as the third quarter (S3); and December-February as the fourth quarter (S4), and the seasonal averaged data were calculated from the monthly averaged and hourly data.Topographic data are provided by the National Aeronautics and Space Administration (NASA, https://www.nasa.gov/(accessed on 16 January 2023)) as global 0.

Data Sources
The climate data used in this study are the 2 m temperature, precipitation, and relative humidity data from the ERA5 monthly averaged dataset and hourly dataset (https://cds.climate.copernicus.eu/cdsapp#!/search?type=dataset(accessed on 23 February 2023)).ERA5 is the fifth-generation ECMWF reanalysis for the global climate and weather for the past 8 decades available from 1940 onwards, which replaces the ERA-Interim reanalysis.Reanalysis combines model data with observations from across the world into a globally complete and consistent dataset using the laws of physics, and there are various observations (both satellite and in-situ) used as input to ERA5 [62].We divided a year into four quarters, with March-May as the first quarter (S1); June-August as the second quarter (S2); September-November as the third quarter (S3); and December-February as the fourth quarter (S4), and the seasonal averaged data were calculated from the monthly averaged and hourly data.

HI and Heatwave Risk
In meteorology, it is usually defined that a heatwave occurs when the temperature exceeds a certain threshold for three or more consecutive days, but recently, there are studies using the HI adopted by the National Oceanic and Atmospheric Administration (NOAA) in its operations instead of the traditional heatwave definition [63][64][65].In this study, HI (also often referred to as body temperature in practical applications) was chosen as a quantitative indicator to analyze heatwaves and their associated risks as it takes into account the synergistic effect of temperature and humidity and can more intuitively reflect the comfort level of the human body in different climatic environments.Its algorithm is given by Rothfusz [66] and Steadman [67] and has wide applicability in the conditions of temperature 20-50 • C and water vapor pressure 0-4.6 kPa.The calculation of HI is first performed using the equation proposed by Steadman, as follows: where T is the temperature ( • F) and RH is the relative humidity (%), the HI is calculated in • F. If the calculated HI is 80 • F or higher, i.e., in the case of more extreme temperature and relative humidity, the formula proposed by Rothfusz is used to calculate: Rothfusz's regression calculation does not apply to extreme temperature and relative humidity conditions beyond the range of data considered by Steadman.Additionally, in the case of partial temperature-humidity combinations, we made some adjustments to the calculations based on the methodology provided by the NOAA National Climate Service Climate Prediction Center (https://www.wpc.ncep.noaa.gov/html/heatindex_equation.shtml(accessed on 6 July 2023)).In practice, the simplified formula (Equation ( 1)) is computed first, and if this HI value is 80 degrees F or higher, the full regression equation (Equation ( 2)) along with related adjustment is applied to re-estimate HI as the final result.Upon completing the calculation of the HI, we proceeded to convert the units from Fahrenheit ( • F) to Celsius ( • C) following the corresponding conversion relationship.For the calculated HI, the risk level of heatwaves can be classified according to the value of the daily average HI (Table 2) [68].

Random Forests
RF is an ensemble learning technique that combines the predictions of multiple individual models to achieve high accuracy and generalization performance [69], which are convenient for large datasets, insensitive to multicollinearity, and robust to missing data and unbalanced data, and can provide predictions with relatively high accuracy [70,71].It can be used not only for classification but also for regression analysis and further exploration of the relative importance of influencing factors by constructing multiple models using the available data and integrating their results through voting (for classification) or averaging (for regression).This ensemble approach enhances the overall predictive power of the model and improves its ability to handle complex patterns in the data.
When tackling regression problems, each decision tree in the RF ensemble serves as a regression tree.These trees are built by iteratively splitting the data using the criterion of minimizing the mean squared deviation.To be more specific, when a splitting feature A is selected, an arbitrary splitting point s is employed to divide the dataset into two subsets, namely D1 and D2.This splitting point is carefully determined to minimize the mean squared deviation of the two subsets after the division, thus ensuring an optimal partitioning of the data.min where c 1 and c 2 are the mean sample output values of D1 and D2, respectively; y i is the measured value.
After the initial split, the splitting process continues at the resulting nodes following the same principle.This recursive splitting procedure is performed until a termination criterion is met.Once all the decision trees are constructed, the final prediction result is obtained by taking the mean of the predictions from all the trees in the RF ensemble.In this study, R language is used for the operation of RF regression.We construct regression models of global heatwaves with each relevant influencing factor, two-two interaction factors, and all factors, respectively, and analyze the driving factors of global heatwaves based on the model fit (R 2 ) and the importance ranking results generated by the RF model.

Global HI
The global average daily HI exhibited a persistent upward trend since 2000 as shown in Figure 2, surpassing regular fluctuations and registering an approximate increase of 1.5 • C by 2020.Moreover, both the annual mean maximum HI and minimum HI over global land areas indicated a gradual ascent.Global HI change between 2000 and 2020 was initially obtained by rasterizing the difference between the HI calculation results and its spatial distribution is shown in Figure 2. Global HI varied significantly from season to season over the 20-year period but increased in the majority of land areas.There were similarities in the distribution of HI changes in different seasons, and the degree of HI changes from season to season was closely related to latitude, with most land areas near the equator, such as southern China, Southeast Asia, southern Mexico, and northern South America, showing small increases (0-1.The central and eastern regions of Russia showed a moderate decrease in the HI in S1 (0-2.6 • C) and a moderate or large increase in S2, S3, and S4 (1.4-4.2 • C).The eastern regions of Europe show a decrease in the HI in S4 (0-1.4 • C) and an increase in S1, S2, and S3 (0-1.4 • C).In most of the regions in Antarctica, the HI decreased in S1 and S3 (0-1.4 • C) and increased in S2 and S4 (0-1.4 • C).The HI increased significantly in S3 in northern Canada (1.4-5.9 • C) and decreased in S4 in the western region (0-1.4• C).In the northern region of the United States, the HI decreased in S3 (0-1.4 • C) and increased in S1, S2, and S4 (0-2.8 • C).Thus, the change in global HI from season to season is more and more obvious with increasing latitude.

Global Heatwave Risk
After reclassifying global HI according to the criteria, the global average number of days under each of the four different heatwave risk levels for each year from 2000 to 2020 is shown in Figure 3A; these time series were fitted by ordinary least squares regression with the highest R-squared fit chosen, and all passed the significance test with p-values less than 0.05.S2 (northern hemisphere summer) and S4 (southern hemisphere summer) were selected as representatives to obtain the global heatwave risk distribution during 2000-2020 as shown in Figure 4, and the number of pixels of each risk level and the area share are shown in Figure 3B.During the 20-year duration, a noteworthy inclination was observed towards a rise in the frequency of days characterized by an elevated risk level, with the exception of NRL days.The rate of decline in NRL days was the greatest (approximately 78.5 days Mkm −2 ), followed by the rate of increase in LRL days (approximately 42.6 days Mkm −2 ), the rate of increase in MRL days (approximately 31.5 days Mkm −2 ), and the rate of increase in HRL days was the slowest but also reached about 1-day Mkm −2 .Within the global land area, the area under different risk levels of heatwaves in S2 and S4 varied greatly, among which the area under NRL was the largest, reaching more than 70% in S2 and more than 80% in S4.Among the remaining land areas, the proportion of high-risk areas was relatively small, less than 1% in S4 and only about 3% in S2.Meanwhile, in S2, the area of medium-risk region was slightly larger than low-risk, while in S4 the area of medium-risk region gradually increased from smaller than the area of low-risk region to larger than the area of low-risk.From 2000 to 2020, the proportion of no-risk and low-risk areas tended to gradually decrease, with the weakest change between 2005 and 2010, while the proportion of medium-risk and high-risk areas tended to gradually increase, showing the overall trend that global heatwave events are gradually worsening.

Global Heatwave Risk
After reclassifying global HI according to the criteria, the global average number of days under each of the four different heatwave risk levels for each year from 2000 to 2020 is shown in Figure 3A; these time series were fitted by ordinary least squares regression with the highest R-squared fit chosen, and all passed the significance test with p-values less than 0.05.S2 (northern hemisphere summer) and S4 (southern hemisphere summer) were selected as representatives to obtain the global heatwave risk distribution during 2000-2020 as shown in Figure 4, and the number of pixels of each risk level and the area share are shown in Figure 3B.During the 20-year duration, a noteworthy inclination was observed towards a rise in the frequency of days characterized by an elevated risk level, with the exception of NRL days.The rate of decline in NRL days was the greatest (approximately 78.5 days Mkm −2 ), followed by the rate of increase in LRL days (approximately 42.6 days Mkm −2 ), the rate of increase in MRL days (approximately 31.5 days Mkm −2 ), and the rate of increase in HRL days was the slowest but also reached about 1-day Mkm −2 .Within the global land area, the area under different risk levels of heatwaves in S2 and S4 varied greatly, among which the area under NRL was the largest, reaching more than 70% in S2 and more than 80% in S4.Among the remaining land areas, the proportion of highrisk areas was relatively small, less than 1% in S4 and only about 3% in S2.Meanwhile, in S2, the area of medium-risk region was slightly larger than low-risk, while in S4 the area of medium-risk region gradually increased from smaller than the area of low-risk region In S2, global heatwave risk areas were concentrated in the northern hemisphere at low latitudes and near the equator, while in S4 the opposite was true, with global heatwave risk areas concentrated in the southern hemisphere at low latitudes and near the equator.Some mid-and high-latitude land masses were still covered by risk-free areas globally, including Antarctica, which is always risk-free, Russia, which is closer to the Arctic, and most of Europe and northern North America, high-altitude areas in China such as the Qinghai-Tibet and Yunnan-Guizhou plateaus, and Australia and areas in southern Africa and southern South America, which were also risk-free.The areas under HRL are small in size and scattered in longitude, but they are all concentrated between the equator and the Tropic of Cancer in latitude.In S2, the high-risk areas were mainly located in the northern part of Africa, such as Saudi Arabia, Egypt, Sudan, and Mali, and the southwestern part of Asia, such as Pakistan and India; in S4, the high-risk areas were mainly located in the southern part of Africa, the Oceania region, such as the western part of Australia, and the central part of Brazil in South America.Most of these areas are controlled by the subtropical high-pressure zone, with prevailing sinking air currents, low precipitation throughout the year, hot and dry, prone to extreme heat disasters, and distributed with a large number of high-risk areas.Similarly, areas near the equator in S2 and S4 were distributed in high-risk areas, such as the northern regions of South America, Southeast Asia, and central Africa, as most of these areas are controlled by the tropical depression, humid and rainy, high temperatures throughout the year, the risk of extreme heatwave disasters is higher.Around these high-risk areas, there was a relatively continuous distribution of medium-risk areas, such as almost all the remaining land areas in central to northern Africa except Ethiopia, southwestern Asia, the Middle East, southern India, southeastern China, Southeast Asia, and parts of southern North America and northern South America.Low-risk areas of comparable size to medium-risk areas were distributed around the medium-risk areas.The change over the period 2000-2020 shows that the high-risk areas were gradually expanding in size, and the medium-risk areas were expanding outward while some of them were transformed into high-risk areas, and the area expanded was larger than the area transformed into high-risk areas.The low-risk areas were also expanding in the same way as the medium-risk areas, while the area of the no-risk areas was gradually decreasing due to the expansion of the peripheral areas of the low-risk areas.In S2, global heatwave risk areas were concentrated in the northern hemisphere at low latitudes and near the equator, while in S4 the opposite was true, with global heatwave risk areas concentrated in the southern hemisphere at low latitudes and near the equator.Some mid-and high-latitude land masses were still covered by risk-free areas globally, including Antarctica, which is always risk-free, Russia, which is closer to the Arctic, and most of Europe and northern North America, high-altitude areas in China such as the Qinghai-Tibet and Yunnan-Guizhou plateaus, and Australia and areas in southern Africa and southern South America, which were also risk-free.The areas under HRL are small in size and scattered in longitude, but they are all concentrated between the equator and the Tropic of Cancer in latitude.In S2, the high-risk areas were mainly located in the The results of the change and shift analysis of global heatwave risk level for the period 2000-2020 are shown in Figure 5.As displayed in Figure 5A, the risk level did not change significantly in most regions during the 20-year period, while almost all of the regions that did change experienced an increase in risk level.Only a very small portion of western China and central Africa experienced a decrease in risk level in S2.In Asia and northcentral Africa, the areas with upgraded risk levels were more scattered, while in central to southern North America and central to northern South America, the areas with upgraded risk levels were more concentrated and continuous, and larger in size.In S4, only a very small part of southern Africa and western Australia had a decrease in risk level, while central, western, and southern Africa had an increase in risk level with a more scattered distribution; Australia had a smaller area with an increase in risk level, and South America had a more concentrated area with a larger area with an increase in risk level.At the same time, central Africa, southern India, Southeast Asia, and South America all had areas with upgraded risk levels in S2 and S4.The shift in the global heatwave risk level between S2 and S4 of 2000-2020 provides a better visualization of the change in risk level by region (Figure 5B).Antarctica and the land areas near the North and South Poles remained free of heatwave risk for 20 years, with some areas near the edges of the region converting from no-risk to low-risk.In S2, the distribution of risk areas extended to high latitudes in the northern hemisphere, and the high-risk areas in northern Africa and parts of southern Asia remained under HRL, with almost no reduction in risk level, and only a certain area on the periphery of the area from MRL to HRL.Compared with S2, the distribution of risk areas in the northern hemisphere was greatly reduced in S4, but the risk areas in the southern hemisphere at low latitudes and the shift in risk level did not change significantly compared with S2.In both seasons, there were almost no areas around the perennial medium-risk areas that were downgraded to low-risk areas, and some areas that were upgraded from low-risk to medium-risk areas were distributed in the periphery, and the area of such grade-changing areas accounted for the largest area among the grade-changing areas, i.e., the majority of the areas with upgraded risk levels had an excessive shift from LRL to MRL, and in both seasons, the shift was most obvious in the northern part of South America.The most significant transition occurred in the northern region of South America.

Single Driving Factor
The regression of the RF model for each driving factor as the explanatory variable of the global heatwaves was performed separately and the results are shown in Figure 6.All the results of the factor detection passed the significance test except for slope and NTL, and the explanatory power of different driving factors, i.e., the fit of the RF regression model (R 2 ), differed significantly between 0 and 0.8.The shift in the global heatwave risk level between S2 and S4 of 2000-2020 provides a better visualization of the change in risk level by region (Figure 5B).Antarctica and the land areas near the North and South Poles remained free of heatwave risk for 20 years, with some areas near the edges of the region converting from no-risk to low-risk.In S2, the distribution of risk areas extended to high latitudes in the northern hemisphere, and the high-risk areas in northern Africa and parts of southern Asia remained under HRL, with almost no reduction in risk level, and only a certain area on the periphery of the area from MRL to HRL.Compared with S2, the distribution of risk areas in the northern hemisphere was greatly reduced in S4, but the risk areas in the southern hemisphere at low latitudes and the shift in risk level did not change significantly compared with S2.In both seasons, there were almost no areas around the perennial medium-risk areas that were downgraded to low-risk areas, and some areas that were upgraded from low-risk to medium-risk areas were distributed in the periphery, and the area of such grade-changing areas accounted for the largest area among the grade-changing areas, i.e., the majority of the areas with upgraded risk levels had an excessive shift from LRL to MRL, and in both seasons, the shift was most obvious in the northern part of South America.The most significant transition occurred in the northern region of South America.

Single Driving Factor
The regression of the RF model for each driving factor as the explanatory variable of the global heatwaves was performed separately and the results are shown in Figure 6.All the results of the factor detection passed the significance test except for slope and NTL, and the explanatory power of different driving factors, i.e., the fit of the RF regression model (R 2 ), differed significantly between 0 and 0.8.The explanatory power of each factor for the heatwaves distribution during 2000-2020 shows a certain periodicity on the seasonal scale, with some factors having the greatest explanatory power in the first quarter (land use, precipitation) and others in the second quarter (slope direction, population density, elevation, slope, nighttime lights).With the increase of years, the explanatory power of each factor for the distribution of heatwaves does not vary much, and the R 2 only changes very slightly, but the overall performance is slightly reduced, which indicates that the explanatory power of the seven abovementioned factors for the distribution of heatwaves tends to gradually decrease with time.The spatial distribution of heatwaves does not tend to be random, so there are still other factors that have more control over the distribution of global heatwaves to be explored, and the dominant factors affecting the spatial distribution of heatwaves also need to be explored.Among all factors, the aspect had the greatest explanatory power (0.69-0.76) for the spatial distribution of the global heatwaves, followed by LUCC, with model fits ranging from 0.48 to 0.55.As the direction a slope faces, aspect affects the amount of sunlight it The explanatory power of each factor for the heatwaves distribution during 2000-2020 shows a certain periodicity on the seasonal scale, with some factors having the greatest explanatory power in the first quarter (land use, precipitation) and others in the second quarter (slope direction, population density, elevation, slope, nighttime lights).With the increase of years, the explanatory power of each factor for the distribution of heatwaves does not vary much, and the R 2 only changes very slightly, but the overall performance is slightly reduced, which indicates that the explanatory power of the seven abovementioned factors for the distribution of heatwaves tends to gradually decrease with time.The spatial distribution of heatwaves does not tend to be random, so there are still other factors that have more control over the distribution of global heatwaves to be explored, and the dominant factors affecting the spatial distribution of heatwaves also need to be explored.Among all factors, the aspect had the greatest explanatory power (0.69-0.76) for the spatial distribution of the global heatwaves, followed by LUCC, with model fits ranging from 0.48 to 0.55.As the direction a slope faces, aspect affects the amount of sunlight it receives, and consequently, the temperature and humidity of the slope.North-facing slopes in the northern hemisphere (or south-facing slopes in the southern hemisphere) receive less direct sunlight compared to their counterparts, which results in lower temperatures and potentially lower relative humidity.Conversely, south-facing slopes in the northern hemisphere (or north-facing slopes in the southern hemisphere) receive more direct sunlight and tend to have higher temperatures and higher relative humidity.LUCC is the second most influential factor primarily because it directly affects the energy balance and microclimate characteristics of the Earth's surface.Changes in land use, such as deforestation, agricultural expansion, or urbanization, can alter surface albedo, soil moisture, and vegetation cover, and create urban heat island effects.These alterations can lead to increased surface temperatures and changes in humidity, thus influencing the distribution of heatwave risks.
The explanatory power of precipitation fluctuated more with the season, reaching a maximum of 0.43.Precipitation significantly influences the global spatial distribution of heatwave risks as it affects atmospheric moisture, cloud cover, and evapotranspiration rates, which in turn impact temperature and humidity conditions.Higher precipitation can moderate temperatures through increased cloud cover and evapotranspiration, while lower precipitation can lead to higher temperatures and lower relative humidity, increasing heatwave risks.
The explanatory power of elevation and population density were comparable to each other, and both had some influence on the distribution of heatwaves, but the influence was relatively small.Although higher elevations generally experience cooler temperatures due to adiabatic cooling and lower atmospheric pressure, this relationship can vary significantly depending on other local and regional factors.For instance, regional climate, latitude, and local weather patterns can all play a role in determining the temperature at a specific elevation.Additionally, the influence of elevation on heatwave risks may be less uniform and more complex than other factors, making it harder to establish a consistent global pattern.Population density, on the other hand, is an indirect factor affecting heatwave risks.Higher population density could exacerbate the consequences of heatwaves, as more people are exposed to extreme heat, and urban heat island effects can intensify temperatures in densely populated areas.However, population density itself does not directly influence the formation or frequency of heatwaves; it primarily affects the vulnerability of communities to heat-related hazards.
The explanatory power of slope and NTL on the spatial distribution of global heatwaves was the smallest, especially NTL, which is almost zero, and could be judged to have nearly no effect on the distribution of global heatwaves.Slope can affect local temperature and humidity conditions to some extent.However, its influence on heatwave risks is less significant compared to factors such as aspect, land use, and precipitation, which have more direct and pronounced effects on temperature and humidity variations.The slope may play a more critical role in other hazard risks such as landslides or water runoff, but its connection to heatwave risks is generally weaker.Nighttime lights primarily serve as an indicator of human activity, energy consumption, and urbanization levels.While nighttime lights may have some indirect connections to phenomena such as the urban heat island effect and anthropogenic heat emissions, their role in causing or exacerbating heatwaves is relatively minor.Additionally, the spatial resolution of nighttime light data on a global scale might be limited, potentially failing to accurately capture local-scale temperature and humidity variations.Consequently, when analyzing the spatial distribution of heatwave risks, the influence of nighttime lights is considered negligible compared to other more critical factors.
From the seasonal perspective, the relative magnitude of the driving forces of each factor on the global heatwave distribution remained constant under different seasons, but there was still some influence of season on the explanatory power of each factor on the heatwave distribution.The average level of the explanatory power of each factor was relatively high in S1 and S2, followed by S3, while S4 was the lowest, probably because the spatial heterogeneity of global heatwaves is also less significant in that season, and the control power of each factor on its spatial distribution is also lower because it cannot be reflected.For example, the explanatory power of LUCC and precipitation factors were highest in S1 and lowest in S2, while the explanatory power of DEM, slope, aspect, and popdens was highest in S2 and lowest in S4.In numerous regions, S1 signified the transition from winter to spring, characterized by accelerated vegetation growth due to rising temperatures and precipitation.The influence of LUCC on heatwave risks may be more pronounced during this period, as alterations in land cover can directly affect local temperature and humidity conditions.Precipitation often exhibited greater variability during S1, with certain regions encountering elevated rainfall levels as they enter the wet season.This variability could engender a more robust association between precipitation and heatwave risks during S1 compared to other quarters.Conversely, S2 generally represents the zenith of the warm season in many areas, accompanied by more stable weather patterns and potentially less pronounced LUCC effects.The diminished variability in precipitation and land cover could account for the lower explanatory power of these factors during S2.Moreover, during S2, heightened solar radiation and predominantly clear skies can intensify the influence of topographical factors, such as DEM, slope, and aspect, on local temperature and humidity conditions.These factors may have a more substantial impact on heatwave risks during S2 due to more pronounced variations in solar exposure and shading effects.The elevated temperatures in S2 could culminate in an increased likelihood of heatwaves, potentially resulting in a stronger association between population density and heatwave risks, as higher population densities frequently correlate with urban heat island effects and anthropogenic heat emissions.In stark contrast, S4 was typified by colder temperatures and reduced solar radiation across various regions, which can attenuate the influence of topographical factors and population density on heatwave risks.Snow cover in certain areas may further mitigate the impact of slope and aspect on temperature and humidity conditions.
The explanatory power of each factor for the spatial distribution of the heatwaves varied little from year to year, with only minor fluctuations from 2000 to 2020, and the overall driving force of each factor showed a less significant decreasing trend.From the perspective of factors, although the seasonal variability of the explanatory power was large, the aspect was always the factor with the greatest explanatory power, followed by LUCC and precipitation.The relative magnitude of the explanatory power of popdens and DEM on the spatial distribution of heatwaves fluctuated, and the explanatory power of NTL and slope was always relatively small, almost close to zero, and the detection results were not significant.
After regressing all driving factors simultaneously as explanatory variables for heatwaves in an RF model, the importance ranking of each factor as an explanatory variable was obtained as sorted in Figure 7.In line with the conclusion of the regression of each factor in Figure 6, the aspect had the highest importance among all factors for the spatial distribution of the global heatwaves (34-43%), followed by LUCC (23-31%).The importance of precipitation fluctuated seasonally, reaching a maximum of about 21%, while DEM and popdens were of equal importance for the distribution of the heatwaves, both of which have some influence on its distribution, but both have small effects (5.7-11% and 3-18.8%).Slope and NTL had the least importance on the spatial distribution of the global heatwaves, especially NTL, which was almost 0%, and it could be indicated that NTL almost has no influence on the distribution of heatwaves compared with all the factors studied in this paper.

Interaction Driving Factors
The RF regression of the interaction factors of global heatwaves was carried out on a time scale of years, and the R 2 of each RF regression model obtained is shown in Figure 8, listing the explanatory power of the distribution of global heatwaves after two-by-two interactions between the seven factors, and the factors represented by X1 to X7 can be found in Table 1.The interaction factor imputation results obtained for each year differed very little, and the R 2 was in a fluctuating state with no significant trend.The explanatory power of most of the factors for global heatwave distribution did not decrease after an interaction.

Interaction Driving Factors
The RF regression of the interaction factors of global heatwaves was carried out on time scale of years, and the R² of each RF regression model obtained is shown in Figure 8 listing the explanatory power of the distribution of global heatwaves after two-by-tw interactions between the seven factors, and the factors represented by X1 to X7 can b found in Table 1.The interaction factor imputation results obtained for each year differed very little, and the R 2 was in a fluctuating state with no significant trend.The explanatory power of most of the factors for global heatwave distribution did not decrease after an interaction.Among them, the explanatory power of DEM increased significantly after interacting with precipitation and LUCC; especially after interacting with aspect, the explanatory power increased from 0.15 to 0.80.The interaction between DEM and precipitation can result in variations in temperature and humidity across different elevations.Higher elevations generally experience lower temperatures and increased precipitation due to oro- Among them, the explanatory power of DEM increased significantly after interacting with precipitation and LUCC; especially after interacting with aspect, the explanatory power increased from 0.15 to 0.80.The interaction between DEM and precipitation can result in variations in temperature and humidity across different elevations.Higher elevations generally experience lower temperatures and increased precipitation due to orographic lift.The combined effect of elevation and precipitation can create localized microclimates that influence heatwave risks.Besides, the interaction between DEM and land use and LUCC can lead to variations in vegetation types and density across different elevations.Changes in land cover, such as deforestation or urbanization, can modify local temperature and humidity conditions and thus influence heatwave risks.When combined with the elevation data provided by DEM, these LUCC effects can be more accurately characterized, and their influence on heatwave risks is better understood.Aspect, or the direction a slope faces, affects the amount of solar radiation received by a given area, which influences temperature and humidity conditions.When DEM is combined with aspect, the resulting information can better represent the variations in solar exposure and potential microclimates that contribute to heatwave risks.
The single factor explanatory power of slope was only 0.03, which indicates almost no effect on the distribution of heatwaves, and reached 0.65 after interacting with aspect and LUCC, increasing the explanatory power.Slope, by itself, may have a limited impact on the distribution of heatwaves, as it does not take into account the direction the slope faces (i.e., aspect).When slope interacts with aspect, the combined information can better represent variations in solar exposure and shading effects across different terrains.Areas with a specific slope and aspect may receive more or less solar radiation, leading to differences in local temperature and humidity conditions.This interaction can help explain the distribution of heatwaves more effectively, resulting in an increase in explanatory power.Besides, when slope interacts with LUCC, it can better represent how different land use and land cover types are distributed across various terrains.For example, steeper slopes may have less vegetation or be less susceptible to urbanization compared to flatter areas, leading to different microclimates and heatwave risks.This interaction enables a more accurate representation of the spatial variability in microclimates and other environmental factors that contribute to heatwave distribution, thus enhancing the explanatory power.
The explanatory power of the aspect interacted with precipitation and LUCC for global heatwave distribution reached the maximum of all the interactions (above 0.85), but because the single-factor explanatory power of these two factors is already larger, the increase of the explanatory power of this interaction was still not as great as that of aspect and LUCC interacted with DEM and slope.However, the interaction of each factor with popdens and NTL has little effect on enhancing the explanatory power, and the model fit after the interaction has only a very small to negligible improvement compared with the single-factor model fit of the interacting factors.Popdens and NTL data often capture similar aspects of human activity and urbanization.Both factors can be indicative of the urban heat island effect, where built-up areas exhibit higher temperatures than surrounding rural areas.Since these factors provide overlapping information, their interaction with other factors might not add significant new insights, resulting in only a small improvement in explanatory power.The relationship between heatwave risks and factors such as popdens and NTL might not be consistent across different regions, resulting in spatial heterogeneity.In some areas, the urban heat island effect might be more pronounced, while in others, topographical and land cover factors might dominate.This spatial heterogeneity might weaken the overall increasing impact of interactions with population density and NTL on the explanatory power.

Discussion
In this paper, we analyzed the global seasonal-scale HI and heatwave risk from 2000 to 2020 and used RF model regression to analyze the explanatory power of seven factors on the global seasonal-scale heatwave distribution, including three topographic factors (DEM, slope, and aspect), one climatic factor (precipitation), and three human factors (NTL, popdens, and LUCC).

Global HI and Heatwave Risks
Over the past 20 years, global variations in seasonal HI have been observed, with most land areas experiencing an increase.Lower-latitude regions, such as southern China, Southeast Asia, southern Mexico, and northern South America, see smaller HI increases due to higher solar radiation and less seasonal impact.In contrast, the western and southern US, central Brazil, the Arabian Peninsula, and the Middle East show moderate or slight increases across all seasons, possibly related to atmospheric and oceanic circulation.Equatorial regions, influenced by the low-pressure belt, experience higher humidity and precipitation, limiting HI increases.Mid-to-high latitude regions exhibit more significant seasonal HI variations, affected by factors such as topography, land use, vegetation cover, and soil moisture.For instance, in northern China, HI decreases in the first and fourth quarters, while it increases in the second and third quarters due to the monsoon climate's influence.Glaciers and snow cover also impact HI variations, particularly in polar regions and high mountain areas.In northern Canada's third quarter and Alaska's fourth quarter, substantial HI increases may result from seasonal climate changes, daylight hour variations, radiation intensity, polar amplification [72,73], monsoons, and climate system changes affecting temperature and precipitation distribution [74], as well as topography and land use type differences impacting climate characteristics and surface energy balance [12].
In scenarios S2 and S4, global heatwave risk areas displayed unique patterns, which could be influenced by factors such as atmospheric and oceanic circulation, climate zones, regional climate features, and climate change.Notably, in scenario S2, risks seemed to be concentrated in the northern hemisphere, specifically at low latitudes and near the equator.Conversely, scenario S4 appeared to predominantly focus on the southern hemisphere.Equatorial regions, typically characterized by humid and high-temperature environments, may face heightened heatwave risks.In contrast, subtropical high-pressure zones, known for their hot and dry conditions, could be more susceptible to extreme heat events.Climate change may play a role in the expansion of high-and medium-risk areas, causing regions that were previously considered low-risk to experience more frequent and intense heatwaves.A thorough and objective understanding of these factors is crucial for developing effective adaptation and mitigation strategies to address the growing challenges posed by heatwave risks.

Attribution of Global Heatwaves
Among the seven factors, all of them had some explanatory power for the spatial distribution of global heatwaves, except for slope and NTL, among which aspect had the greatest explanatory power, followed by LUCC and precipitation.The weakest explanatory power was from NTL (approximately 0), and there was almost no pattern between the global NTL distribution and the distribution of heatwaves after a comparative analysis of research data.On the one hand, some of the regions where extreme heatwave events occur, i.e., regions with high HI values, have harsh natural environments, such as the Sahara Desert in northern Africa, where very few people are able to adapt to survive and develop [75], and the NTL data used in this paper also showed that these regions have extremely low nighttime light value distributions; at the same time, the popdens and LUCC data used in this paper also showed that some of the regions with high HI values have extremely low population density and urban land area, resulting in only very weak or almost undetectable NTL distribution [76,77].On the other hand, combined with the LUCC data used in this paper, some urban areas located in hot and humid climates and developed to a certain extent also have a higher risk of heatwaves due to their rapid urbanization [78], which corresponds to a distribution of high HI, and usually such areas also have a distribution of relatively high NTL values [79].Therefore, the combination of the study data and the attribution results could indicate the possible reasons for the NTL data to be large without any driving force on the spatial distribution of global heatwaves.
Aspect demonstrates the greatest explanatory power for the global heatwave distribution, followed by LUCC and precipitation.Aspect plays a significant role in heatwave distribution due to its influence on solar radiation received by different slope directions, which in turn affects local temperature and humidity conditions [80,81].In the Northern Hemisphere, south-facing slopes generally receive more solar radiation than north-facing slopes, which may lead to higher temperatures and lower relative humidity, thereby increasing the risk of heatwaves [82].Similarly, east-and west-facing slopes may also exhibit different temperature and humidity characteristics, but further data analysis is needed to confirm this.LUCC also contributes to the heatwave distribution, as different landcover types exhibit varied surface properties, such as albedo, evapotranspiration rates, and surface roughness, which can impact local microclimates and temperature patterns [83].Urban areas, for instance, are known to experience higher temperatures and heatwave risks due to the urban heat island effect [59].Precipitation has a weaker explanatory power for heatwave distribution, likely because it is more dependent on atmospheric circulation patterns and regional climate dynamics [84].However, precipitation can still influence heatwave risks by modulating soil moisture and evaporative cooling, which in turn affects air temperature and humidity [85].No significant trends in the time series were found in the R 2 of the regression model for each influencing factor, indicating that the driving mechanism of each factor for the spatial distribution of the global heatwaves is relatively stable in time.

Possible Targeted Measures
This study described the spatial and temporal patterns of global heatwaves since the beginning of the 21st century.Through the analysis of meteorological data, the average body temperature in different regions and seasons was derived, and the changes in heatwave risk were analyzed, which could help people understand the current climate change situation and provide basic data support to promote policy design and risk reduction for climate change adaptation and disaster prevention and mitigation in various regions.Meanwhile, this study also analyzed the driving forces of different factors on global heatwave distribution, which could help the government and society to better understand the factors affecting climate change so that they can take corresponding preventive measures to mitigate the impact of heatwaves on human beings under climate change and promote the sustainable development of society.Based on the research findings, the following are some possible targeted suggestions for addressing heatwave risks: Focus on changes in high-latitude areas: HI increase was found to be more pronounced in high-latitude areas, so greater attention should be given to heatwave risks and response measures in these regions; Bolster land use and urban planning: As LUCC had significant explanatory power for the distribution of heatwaves, incorporating green infrastructure and low-carbon development strategies within urban planning and land use initiatives could effectively mitigate heatwave risks; Consider the interplay between precipitation and heatwaves: Precipitation exhibited some explanatory power for the distribution of heatwaves.Thus, the relationship between precipitation and heatwaves should be taken into account to better predict and cope with heatwaves.
Utilize topographic information to optimize prevention measures: Since the results identified aspect as the primary explanatory factor for global heatwave distribution, employing topographic information when devising heatwave prevention measures can facilitate optimized resource allocation and response strategies; Address seasonal variations: Tailoring response strategies to heatwave risks across different seasons is vital.For instance, during seasons with elevated heatwave risks, enhancing warning and emergency response systems can effectively raise public awareness and preparedness.

Limitations in Present Study
Following the definition and global study of Meehl et al. [12], Robinson et al. [68], Russo et al. [26], and Smith et al. [86], for the calculated HI, the risk levels of global extreme heatwaves were classified according to the value of the daily average HI as shown in Table 2, which is widely used and was derived based on historical data and climate models [87], as well as considerations of climate change and human adaptability in determining the thresholds [88].This method directly reflects the combined impact on the human body, making it easy for the public to understand.Besides, by setting clear HI thresholds, the heatwave risk levels of different regions can be compared intuitively, which helps in formulating response measures.However, the absolute thresholds may still have limitations in terms of their applicability across different regions.These thresholds may provide an adequate division of heatwave risks in some areas but may not be as precise in others.Looking forward, our research plans include investigating the validity of the classification for different regions and considering comparative analyses that incorporate relative threshold divisions to refine and improve the heatwave risk classification method, making it more versatile and adaptable to a wider range of situations and regions.
Given the constraints posed by the limited data available at the spatial and temporal scales of this study, this paper only considered the influence of seven factors from topography, climate, and human aspects on the global seasonal heatwaves over a 20-year period from a geographic perspective.The underlying causes of the occurrence of heatwaves are still global warming, the influence of atmospheric circulation, the control of interplanetary elements, etc.There are other factors of greater mechanistic relevance available to perform the analysis of the driving forces at other spatiotemporal scales.In examining climatic factors, we primarily concentrated on the prevalent climatic variables that may exhibit associations with heatwaves.Initially, we selected relative humidity as a potential climatic factor, alongside precipitation, and obtained substantial explanatory power.However, considering that relative humidity was utilized as a variable in the calculation of the HI, we ultimately excluded it, presenting the final results with a single climatic factor.Nevertheless, it is crucial to acknowledge the potential influence of additional climatic variables.In future investigations, we will undoubtedly delve deeper into this subject, incorporating a broader range of climatic data to enhance our understanding.Additionally, human activities could also have an impact on heatwaves through their influence on surface temperature, urban heat island effect, etc., which is more complex, including urbanization, industrialization, transportation, etc., and also more difficult to accurately identify, quantify, directly access, and analyze.For example, the impact of transportation may be manifested in road traffic flow, pollutant emission, etc., and the impact of environmental aspects may be manifested in urban green coverage, building height and density, water bodies and forest area, etc.In the future, based on this study, we will try to overcome the above difficulties, obtain richer data sources, and analyze more comprehensively the mechanisms of the influence of more climatic meteorological and human factors on the spatiotemporal distribution and changes of heatwaves at different scales so as to propose more refined and regionally targeted interventions and policies for the prevention and control of extreme climate disasters.

Conclusions
Drawing upon near-surface air temperature and relative humidity data from ERA5 and a series of remote sensing products as driving factors, this study investigated the spatiotemporal variability of global human-perceived heatwaves on seasonal scales during 2000-2020.Considering the synergistic effect of temperature-humidity, HI was employed to characterize human health risks.An RF model regression was utilized to quantify the impact of various driving factors in topography, climate, and human activities, as well as the interactions between factors on the spatial distribution of global heatwaves.The key outcomes from the present study are summarized as follows: 1.
Since the 21st century, changes in HI have varied significantly worldwide, with the majority of regions witnessing an increase, particularly at higher latitudes.The area of the HI-increasing region was larger in S2 and S4, and the largest HI-increasing area was observed in S2, while the overall HI increase peaked in S3; 2.
Except for the decreasing area of no-risk regions, regions under all other risk levels expanded (the proportion of high-risk regions increased from 2.97% to 3.69% in S2, and from 0.04% to 0.35% in S4); 3.
Aspect exhibited the most substantial influence on the spatial distribution of global heatwaves (0.69-0.76), followed by LUCC (0.48-0.55) and precipitation (0.16-0.43).In contrast, slope and NTL exhibited negligible effects, essentially having no impact on the global heatwave distribution; 4.
From 2000 to 2020, the explanatory power of the factors underwent a minor decrease without a significant trend but showed seasonal periodicity.The overall explanatory power of each factor was relatively high in S1 and S2 and low in S4.The explanatory power of climatic and land use factors was the highest in S1, and that of topographic and other human factors was the highest in S2; 5.
There was no significant trend in the attribution results of the interaction factors over the years, but the explanatory power of DEM and slope increased notably when interacting with the climate factor, aspect, and LUCC, respectively.The explanatory power of aspect and LUCC interacting with precipitation reached the maximum value (above 0.85) under all interactions.
Overall, these findings provide valuable insights into the changes in HI, humanperceived heatwave risks, and some underlying driving factors at a global scale, highlighting the interplay between climatic, topographic, and human factors in shaping heatwave patterns and risks.
Topographic data are provided by the National Aeronautics and Space Administration (NASA, https://www.nasa.gov/(accessed on 16 January 2023)) as global 0.1° DEM data, which is static over the 20 years; population density data are Gridded Population of the World Version 4 (GPWv4) at a spatial resolution of 0.125° from NASA's Socioeconomic Data and Applications Center (SEDAC, https://sedac.ciesin.columbia.edu/data/set/gpw-v4-data-quality-indicators-rev11(accessed on 27 December 2022)); nighttime light data are available on the Harvard Dataverse
4 • C) and less influenced by seasonal changes.The HI increased moderately (0-2.8 • C) in all four quarters in central Brazil, the Arabian Peninsula, and the Middle East, but increased slightly (0-1.4• C) in S1, S2, and S4.In southeastern China, the HI decreased slightly (0-1.4• C) in S3.The HI displayed a slight decrease (0-1.4 • C) in S1 and S3 in most of Australia (0-1.4 • C) and a small increase (0-1.4 • C) in S2 and S4.Change in HI in the middle and high latitudes was most influenced by the seasons, with the most significant difference in the change in northern China, Russia, northern Canada, the northern United States, and the Antarctic continent, and the HI decreased (0-1.4 • C) in S1 and S4 in northern China, and increased slightly (0-1.4• C) in S2 and S3 (0-1.4 • C).

24 Figure 3 .
Figure 3.The change in heatwave risks from 2000 to 2020.(A) Change in the average days under each risk level within a year from 2000 to 2020: (a) NRL; (b) LRL; (c) MRL; and (d) HRL.(B) Change in percentage of the area of each risk level in S2 and S4 from 2000 to 2020: (a) in S2; and (b) in S4.

Figure 3 .
Figure 3.The change in heatwave risks from 2000 to 2020.(A) Change in the average days under each risk level within a year from 2000 to 2020: (a) NRL; (b) LRL; (c) MRL; and (d) HRL.(B) Change in percentage of the area of each risk level in S2 and S4 from 2000 to 2020: (a) in S2; and (b) in S4.

Figure 5 .
Figure 5. Change (A) and shift (B) in the risk level of global heatwaves in S2 and S4 from 2000 to 2020: (a) in S2; and (b) in S4.

Figure 5 .
Figure 5. Change (A) and shift (B) in the risk level of global heatwaves in S2 and S4 from 2000 to 2020: (a) in S2; and (b) in S4.

Figure 6 .
Figure 6.R 2 of RF regression model of each single driving factor for global heatwaves in each season from 2000 to 2020.(A) Comparison of R 2 of each single driving factor for global heatwaves in each season from 2000 to 2020: (a) in S1; (b) in S2; (c) in S3; and (d) in S4. (B) Change in R 2 of each single driving factor for global heatwaves in each season from 2000 to 2020.

Figure 6 .
Figure 6.R 2 of RF regression model of each single driving factor for global heatwaves in each season from 2000 to 2020.(A) Comparison of R 2 of each single driving factor for global heatwaves in each season from 2000 to 2020: (a) in S1; (b) in S2; (c) in S3; and (d) in S4. (B) Change in R 2 of each single driving factor for global heatwaves in each season from 2000 to 2020.

Figure 7 .
Figure 7. Importance of each driving factor for global heatwaves of the total RF regression model i each season from 2000 to 2020: (a) in S1; (b) in S2; (c) in S3; and (d) in S4.

Figure 7 . 24 Figure 8 .
Figure 7. Importance of each driving factor for global heatwaves of the total RF regression model in each season from 2000 to 2020: (a) in S1; (b) in S2; (c) in S3; and (d) in S4.Remote Sens. 2023, 15, x FOR PEER REVIEW 16 of 24

Figure 8 .
Figure 8. R 2 of RF regression model of each pair of interaction driving factors for global heatwaves from 2000 to 2020.

Table 1 .
Factors selected to be detected in this study.

Table 2 .
HI, the corresponding risk, and the health problems they could trigger.