1. Introduction
Mountains, as one of the primary landform types on the Earth’s surface, account for nearly a quarter of the global terrestrial areas [
1,
2]. Their intricate topographic features, characterized by steep inclines, significant elevation gradients, and diverse landscapes, play a crucial role in influencing the circulation of energy and materials within regional climate systems [
3,
4,
5]. Generally, the topographical forcing effects of mountainous terrain result in considerable non-uniformity within the atmospheric boundary layer. Localized circulations, such as valley winds and orographically induced uplift airflows, interact with large-scale meteorological systems, resulting in pronounced spatial gradients in meteorological variables (e.g., temperature, precipitation, wind speed, and humidity) across both vertical and horizontal scales [
6,
7,
8,
9]. This inherent complexity poses substantial challenges for conventional meteorological monitoring networks, which struggle to accurately capture the spatiotemporal dynamics of meteorological elements in mountainous areas. In particular, the insufficiency of infrastructure in these regions, such as power supply, site accessibility, and network transmission capabilities, further exacerbates the difficulties associated with meteorological monitoring in mountainous environments. Consequently, the scarcity of high-resolution meteorological data has become a critical impediment to effective disaster prediction, ecological conservation, and sustainable resource management [
10,
11].
ERA5 is the fifth generation of global reanalysis datasets released by the European Centre for Medium-Range Weather Forecasts (ECMWF). This dataset synthesizes a variety of observational data sources, including in situ measurements, airborne observations, and satellite remote sensing, alongside advanced numerical weather prediction products through assimilation techniques [
12]. Nowadays, the ERA5 database is widely applied in multiple disciplines. For example, Liu et al. conducted an assessment of the accuracy of precipitation and near-surface air temperature data derived from the ERA5 reanalysis dataset over the past 70 years, analyzing the spatiotemporal distributions of temperature and precipitation on a global scale [
13]. Similarly, Keune et al. utilized the comprehensive ERA5 climatology from 1940 to the present to explore droughts, highlighting the dataset’s utility for environmental agencies and its relevance to sectors such as water management and agriculture, thereby aiding in the monitoring of water and food security [
14]. Li et al. [
15] proposed an innovative AI method through integrating the Global Multi-Factor Cross-Attention (GMFCA) module, the Feature Fusion Residual Dense Block (FFRDB) connection module, and the U-Net module, in conjunction with a Bayesian global optimization algorithm. This approach achieved a significant reduction in the RMSE by 32–51% for the downscaling of ERA5 land surface temperature data when compared to the random forest method. Zhai et al. [
16] assessed the applicability of ERA5 wind and wave products in the South China Sea, finding that the bias between ERA5 and observed significant wave heights ranged from −0.24 to 0.28 m. The analysis indicated a positive bias in the ERA5 data, suggesting an overall underestimation in most areas, with the exception of the Beibu Gulf and central-western South China Sea, where overestimation was observed.
In addition to the aforementioned direct applications, a substantial body of literature has assessed the accuracy and applicability of ERA5 datasets. For instance, Zhi et al. [
17] examined the relevance of the high-altitude temperature fields within various reanalysis datasets in China by integrating sounding data. Their findings indicated that the ERA5 high-altitude temperature data provide a more accurate representation of the upper troposphere in the northern part of China. Jiang et al. evaluated the ERA5 precipitation over mainland China from 2003 to 2015, utilizing gridded gauge precipitation data. They concluded that ERA5 exhibits larger relative biases in its precipitation estimates, tending to overestimate light precipitation events while underestimating moderate and heavy precipitation events [
18]. Yilmaz validated the trends in ERA5 and ERA5-Land temperature during 1951–2020 against ground station-based observations in Turkey. The results revealed a high degree of consistency between the trends in ERA5/ERA5-Land and the observations, demonstrating that ERA5/ERA5-Land can serve as a viable alternative to observational data [
19]. Suo et al. [
20] assessed the applicability of ERA5 wind speed data over the Bohai baseline, based on three years of observations. Their results demonstrated that the ERA5 wind speed is strongly consistent with observations from the surface up to about 5 km in the troposphere, with R² values ranging from 0.5 to 0.85.
Despite these extensive validations of various ERA5 parameters across different regions and spatiotemporal scales, the use of ERA5 in complex mountainous areas faces greater challenges. The (sub-)hundred-meter-scale undulating terrain characteristic of these regions is far finer than the horizontal resolution of ERA5 (~30 km), so the reanalysis may fail to adequately capture the complex physical processes and atmospheric dynamics induced by the rolling topography at such scales [21,22,23,24,25], which can result in large uncertainties. This issue is particularly pronounced in the mountainous regions of central China, a climatic transitional zone between northern and southern China, where a systematic assessment of the ERA5 datasets is urgently needed for both mountain meteorology and climate change research [26].
In this study, we conducted three-month continuous measurements using ground-based Doppler lidar and a microwave radiometer in a valley in the Jigong mountains, Xinyang city, Henan province, central China. These measurements aimed to detect the vertical profiles of key boundary layer meteorological variables and to use the observational data to evaluate the accuracy of the ERA5 reanalysis. The findings contribute to a deeper understanding of the vertical meteorological characteristics inherent to mountainous regions. Moreover, given the critical role of the atmospheric boundary layer in regulating the exchange of momentum, heat, and mass between the Earth’s surface and the free atmosphere—processes which influence cloud formation and precipitation—the results also hold significant implications for providing essential reference data for meteorological predictions and climate modeling in mountainous central China.
3. Results
3.1. Horizontal Wind Speed Assessment
As a transitional zone between northern and southern China, Xinyang exhibits climatic characteristics that reflect a shift from the clearly defined four-season cycle of the north to the dry–wet seasonal pattern prevalent in the south. Therefore, it is essential to analyze the vertical monthly performances of the ERA5 datasets, particularly at different representative altitudes within the mountainous region, including the valley (50 m), mountain slope (100 m), near the summit (200 m), middle boundary layer (500 m), and the top of the boundary layer (1000 m).
From the absolute error box plots (
Figure 2a–c), it is evident that there are distinct differences in error distributions across different months and heights. In general, the absolute errors are confined within ±5 m/s for the majority of instances, although extreme values occur with a low probability. In terms of relative error (
Figure 2d–f), fluctuations in the wind speed baseline result in several notable outliers. Certain altitudes exhibit particularly large deviations, especially in February 2025, where some points reach as low as −15 m/s or as high as 10 m/s. These anomalies may be attributed to extreme weather processes or to the model’s insufficient representation of the boundary layer. A comparative analysis of the three months indicates that the overall median error in December 2024 (
Figure 2a,d) tends to be closer to 0, while January 2025 (
Figure 2b,e) and February 2025 (
Figure 2c,f) display larger outliers at specific heights, especially at 200 m, 500 m, and 1000 m. This suggests that in January and February, the weather conditions are more complicated, leading to greater discrepancies between the modeled and observed wind speed. Another plausible explanation is the highly stable boundary layer in winter, which can cause differences between the model and actual observations. Additionally, strong convective activity or frontal systems in January and February could significantly impact the high-altitude wind field, whereas the ERA5 model simulation still maintains a relatively smooth trend.
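As an illustration of how the error statistics behind such box plots could be derived, the following minimal sketch computes absolute and relative errors for paired samples and summarizes them. The sign convention (ERA5 minus lidar), the percentage form of the relative error, the calm-wind cutoff `min_speed`, and the 1.5 × IQR outlier rule are all assumptions made for illustration and are not taken from the study.

```python
import numpy as np

def wind_speed_errors(era5, lidar, min_speed=0.5):
    """Return (absolute error, relative error in %) for paired samples.

    Assumed convention: error = ERA5 - lidar. The relative error is not
    meaningful for near-calm observations, so samples with an observed
    speed below `min_speed` (hypothetical cutoff) are set to NaN.
    """
    era5 = np.asarray(era5, dtype=float)
    lidar = np.asarray(lidar, dtype=float)
    abs_err = era5 - lidar
    rel_err = np.full_like(abs_err, np.nan)
    ok = np.abs(lidar) >= min_speed
    rel_err[ok] = 100.0 * abs_err[ok] / lidar[ok]
    return abs_err, rel_err

def box_stats(values):
    """Median, quartiles, and 1.5*IQR outliers, ignoring NaN samples."""
    values = np.asarray(values, dtype=float)
    values = values[~np.isnan(values)]
    q1, med, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = values[(values < low) | (values > high)]
    return {"median": med, "q1": q1, "q3": q3, "outliers": outliers}

if __name__ == "__main__":
    # Synthetic values, for demonstration only
    abs_err, rel_err = wind_speed_errors([4.0, 6.0, 2.0], [5.0, 5.0, 0.1])
    print(abs_err, rel_err)
    print(box_stats(abs_err))
```

Masking near-calm samples keeps a weak observed wind from inflating the relative error, which is one possible reason for the outliers noted above.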
From the perspective of horizontal wind speed errors at different altitudes, we can observe that ERA5 underestimates the wind speed below 500 m. This discrepancy may arise from the model’s insufficiently detailed representation of the terrain. Especially for the range of 50–200 m, atmospheric flow is highly sensitive to strong disturbances caused by complex underlying terrain, leading to more abrupt changes in both horizontal and vertical wind components. The coarse grid resolution of reanalysis data generally fails to capture these fine-scale features, thereby causing larger deviations. In addition, terrain-induced forcing and thermal heterogeneity enhance low-level turbulent transport, introducing further uncertainties into simulations. At altitudes of 500–1000 m, the wind field is predominantly governed by large-scale circulation, with reduced influences from turbulence and surface roughness. Consequently, the wind speed distribution becomes relatively stable, and the model simulation aligns more closely to the average state, naturally reducing outliers and extreme deviations.
3.2. Vertical Wind Speed Assessment
In terms of the median, the vertical wind speed errors behave differently from the horizontal ones. Specifically, ERA5 tends to overestimate vertical wind speed, especially at altitudes below 500 m, and this bias varies significantly from month to month (
Figure 3). This indicates that the model’s simulation results lack consistency. Similarly to horizontal wind speed, extreme errors are more prevalent at lower altitudes, with significantly fewer extreme discrepancies observed at 1000 m. The blue scattered points tend to cluster more densely at low levels (e.g., 50 m and 100 m), occasionally including extreme outliers. For example, in
Figure 3b,c, some points fall below −1.0 m/s. This shows that in the near-surface layer, the difference between the observed and modeled vertical speeds is more pronounced, potentially because ERA5 cannot finely resolve the subtle and rapid disturbances caused by surface roughness in mountainous terrain, leading to frequent outliers.
The height of 200–500 m represents a critical transition zone between the boundary layer and the transition layer. The slight reduction or shift in the box plot suggests that the deviation within this height segment may be lower during certain months (e.g., January 2025) or that the deviation distribution is more concentrated around −0.5 to 0.2 m/s. Occasionally, the median approaches 0, indicating that in most cases, ERA5 and observations are relatively consistent in resolving the vertical wind speed at this height. However, a notable number of outliers below 1.0 m/s can still be observed, suggesting that under conditions of strong convection or jet streams, ERA5 struggles to adequately reproduce vertical motion, resulting in significant differences between the two datasets. At higher levels, the overall box tends to be closer to 0, with fewer outliers. This is because high-altitude wind fields are predominantly influenced by large-scale circulation, and ERA5’s average characterization for rising/sinking motion is often more accurate than that for the near-ground layer. Additionally, the signal quality of lidar at high altitudes may improve or remain relatively stable. Nevertheless, occasional significant negative deviations (−1 m/s level) can still be observed in some months (e.g., 1000 m in
Figure 3a), indicating that some extreme weather processes may have occurred during this period. Therefore, there are substantial differences in the intensity estimates of the upper-level rising/sinking flows by the instruments or models.
Compared with the horizontal wind speed, the vertical wind speed is typically on the order of 0 to 1 m/s (or less), and even small measurement or simulation errors become prominent in absolute terms. For instance, a deviation of 0.5 m/s may be negligible for horizontal flow but could represent a considerable proportion for vertical flow. Consequently, during intense weather events or active turbulence, the observed data are more susceptible to short-term spikes. If ERA5 employs relatively smooth parameterizations, this can lead to significant discrepancies.
3.3. Horizontal Wind Direction Assessment
The primary wind directions recorded at the Xinyang site are northeasterly and southwesterly, consistent with the valley orientation. The ERA5 dataset also predominantly reflects these two directions, albeit with lower frequencies. As shown in
Figure 4a, the wind direction petals of the lidar at 50 m display nearly equal proportions for both northeasterly and southwesterly winds, each constituting approximately 20%. In contrast, ERA5 shows a slightly elevated frequency of northeasterly winds, accounting for around 15%. With increasing heights (from 50 m to 200 m), the observed wind directions become predominantly shifted towards the northeasterly direction, a trend that is similarly observed in the ERA5 wind direction data. At greater altitudes, such as 500 m, the lidar continues to demonstrate a stronger distribution intensity of northeasterly winds, while ERA5 exhibits a more uniform distribution within the northern and eastern ranges. At 1000 m, both datasets exhibit comparable frequency patterns, with peak frequencies of approximately 15%, yet they differ in their peak directions. The lidar measurements indicate a peak around NNE, whereas the ERA5 wind direction peaks around NNW.
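The directional frequencies discussed above can be reproduced from a time series of wind directions by binning into compass sectors. The sketch below assumes a 16-point compass (suggested by the NNE and NNW labels in the text) with sectors centered on the compass points; the function name and the sector layout are illustrative assumptions.

```python
import numpy as np

SECTORS = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
           "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]

def rose_frequencies(directions_deg):
    """Percentage of samples in each 22.5-degree sector (N centered on 0)."""
    d = np.asarray(directions_deg, dtype=float) % 360.0
    # Shift by half a sector so that 348.75..11.25 degrees maps to "N"
    idx = np.floor((d + 11.25) / 22.5).astype(int) % 16
    counts = np.bincount(idx, minlength=16)
    return dict(zip(SECTORS, 100.0 * counts / counts.sum()))

if __name__ == "__main__":
    # Synthetic directions, for demonstration only
    freq = rose_frequencies([0, 350, 45, 225])
    print({k: v for k, v in freq.items() if v > 0})
```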
In general, within the lower atmospheric layer (50–200 m), terrain obstructions and local circulations driven by thermal heterogeneity tend to channel winds into specific directions. Lidar can effectively capture these small-scale effects owing to its high spatial resolution, whereas ERA5 relies on a global grid with limited resolution, making it difficult to reproduce such microscale variations. Although the horizontal resolution of ERA5 (about 30 km) is relatively high for a global reanalysis, it still smooths complex terrain features such as mountain ridges and valleys. In addition, the boundary layer parameterization scheme may not adequately resolve the vertical shear of the low-level wind direction, leading ERA5 to “smooth” the wind direction field.
To further quantitatively analyze the differences in wind direction, we added both the overlap integral and the mean wind direction metrics for all five height levels (50–1000 m). The overlap integral ($I_{\mathrm{overlap}}$) is defined as follows:
where $A_{\mathrm{lidar}}$, $A_{\mathrm{ERA5}}$, and $A_{\mathrm{overlap}}$ denote the red (lidar), blue (ERA5), and purple (overlap) areas in Figure 4. The three areas are obtained from the polar polygons via the shoelace formula, and the overlap area is approximated by integrating the minimum radius every 1°.
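The area calculations described above can be sketched as follows. This is a minimal illustration under stated assumptions: the rose radii are given at sector-center angles, the 1° resampling uses periodic linear interpolation of the radius, and the overlap area is evaluated as the sector integral of ½·r_min² per degree. The function names are hypothetical, and the normalization that turns the three areas into the overlap integral is deliberately not reproduced here.

```python
import numpy as np

def polar_area(radii, theta_deg):
    """Shoelace area of the closed polygon with polar vertices (r, theta)."""
    t = np.deg2rad(np.asarray(theta_deg, dtype=float))
    r = np.asarray(radii, dtype=float)
    x, y = r * np.cos(t), r * np.sin(t)
    return 0.5 * abs(np.sum(x * np.roll(y, -1) - np.roll(x, -1) * y))

def resample_1deg(radii, theta_deg):
    """Assumed resampling: periodic linear interpolation onto a 1-degree grid."""
    grid = np.arange(360.0)
    return np.interp(grid, theta_deg, radii, period=360.0)

def rose_areas(r_lidar, r_era5, theta_deg):
    """Return (A_lidar, A_ERA5, A_overlap) for two wind roses."""
    a_lidar = polar_area(r_lidar, theta_deg)
    a_era5 = polar_area(r_era5, theta_deg)
    r_min = np.minimum(resample_1deg(r_lidar, theta_deg),
                       resample_1deg(r_era5, theta_deg))
    # Integrate 0.5 * r_min^2 over 360 one-degree steps
    a_overlap = 0.5 * np.sum(r_min ** 2) * np.deg2rad(1.0)
    return a_lidar, a_era5, a_overlap

if __name__ == "__main__":
    theta = np.arange(0.0, 360.0, 22.5)          # 16 sector centers
    print(rose_areas(np.ones(16), 2.0 * np.ones(16), theta))
```

Because the polygon areas use straight edges between sector centers while the overlap uses a 1° integral, the overlap estimate can slightly exceed a polygon area when the sectors are coarse. Evaluating all three areas on the same 1° grid would avoid this.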
The mean wind direction $\theta_{\mathrm{average}}$ is computed as the argument of the vector sum. The calculation results of the two indices are shown in Table 1.
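A minimal sketch of the vector-sum mean direction is given below. It assumes unit vectors by default (optional weights, e.g., wind speed, are a hypothetical extension), and it adds a helper that wraps the difference between two directions into [−180°, 180°) so that biases across the 0°/360° boundary are handled consistently. Neither the weighting nor the sign convention of the difference is specified in the text.

```python
import numpy as np

def mean_direction(directions_deg, weights=None):
    """Argument of the (optionally weighted) vector sum, in degrees [0, 360)."""
    d = np.deg2rad(np.asarray(directions_deg, dtype=float))
    w = np.ones_like(d) if weights is None else np.asarray(weights, dtype=float)
    s = np.sum(w * np.sin(d))
    c = np.sum(w * np.cos(d))
    return np.rad2deg(np.arctan2(s, c)) % 360.0

def direction_difference(a_deg, b_deg):
    """Signed difference a - b wrapped to [-180, 180) degrees."""
    return (a_deg - b_deg + 180.0) % 360.0 - 180.0

if __name__ == "__main__":
    # An arithmetic mean of 350 and 10 degrees would give 180; the vector mean is north
    print(mean_direction([350.0, 10.0]))
    print(direction_difference(10.0, 350.0))
```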
The mean wind direction difference at the 50 m height reaches +91.1°, indicating that ERA5 misinterprets the southeasterly wind (110.8°) observed by lidar as a northeasterly wind (≈20°). The 30 km spatial resolution of ERA5 is insufficient to resolve local terrain channels and thermal circulation, leading to amplified errors in the near-ground layer over this mountainous region. The overlap integral peaks at 200 m (0.68) but remains below the threshold of 0.85 for high similarity, while the mean direction bias ranges from 21.5° to 91.1°, suggesting that ERA5 underrepresents local directional patterns, particularly within the surface layer below 200 m. Between 200 m and 1000 m, the directional underestimations in ERA5 stabilize around 22°, with values of 21.5° at 200 m, 22.1° at 500 m, and 22.4° at 1000 m, demonstrating minimal influence from the underlying rolling topography.
Figure 5 presents the scatter plots of lidar-measured (OBS) and ERA5 wind directions. At every height, the points form a concentrated cluster but also show considerable deviation and dispersion. A red high-density cluster lies near the diagonal (especially around the northeast), showing that the most common wind directions fall in similar sectors (e.g., near NE) in both the observations and the ERA5 data. Beyond this main cluster, many points are scattered away from the diagonal, some differing by more than 100°. These outliers indicate that ERA5 and the measurements disagree substantially during certain periods. The discrepancies are primarily attributed to changes in wind direction induced by topographic effects, which cannot be accurately captured by simple bilinear interpolation. From 50 m to 200 m, the correlation coefficient varies little with height, staying between roughly 0.40 and 0.45. At 500 m and 1000 m, the correlation coefficients are 0.42 and 0.29, respectively, with a marked drop at the uppermost level. Because ERA5 is a reanalysis dataset with a coarse resolution (the horizontal resolution is on the order of tens of kilometers), it cannot accurately depict the topography, underlying surface, and local meteorological conditions at the observation point.
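For reference, bilinear interpolation of a gridded field to a single site can be sketched as below. The grid coordinates and site location in the demonstration are hypothetical placeholders, not the actual station coordinates, and the function name is an assumption. For wind direction, such interpolation would normally be applied to the u and v components rather than to the angle itself.

```python
import numpy as np

def bilinear(lons, lats, field, lon, lat):
    """Bilinearly interpolate field[lat_index, lon_index] to (lon, lat).

    `lons` and `lats` must be strictly ascending and bracket the point;
    reanalysis latitudes stored north-to-south should be flipped first.
    """
    lons = np.asarray(lons, dtype=float)
    lats = np.asarray(lats, dtype=float)
    f = np.asarray(field, dtype=float)
    # Index of the lower-left corner of the enclosing grid cell
    i = int(np.clip(np.searchsorted(lons, lon) - 1, 0, lons.size - 2))
    j = int(np.clip(np.searchsorted(lats, lat) - 1, 0, lats.size - 2))
    tx = (lon - lons[i]) / (lons[i + 1] - lons[i])
    ty = (lat - lats[j]) / (lats[j + 1] - lats[j])
    return ((1 - tx) * (1 - ty) * f[j, i] + tx * (1 - ty) * f[j, i + 1]
            + (1 - tx) * ty * f[j + 1, i] + tx * ty * f[j + 1, i + 1])

if __name__ == "__main__":
    # Hypothetical 0.25-degree cell and synthetic field values
    lons, lats = [114.0, 114.25], [32.0, 32.25]
    field = [[0.0, 1.0], [2.0, 3.0]]
    print(bilinear(lons, lats, field, 114.125, 32.125))
```

The interpolated value is a weighted average of the four surrounding grid points, so any terrain-induced turning of the wind inside the cell is absent by construction.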
3.4. Temperature Assessment
Figure 6 illustrates the vertical profile of the absolute temperature deviation (T Abs. Dev) between MWR observations and ERA5 reanalysis data. There is minimal deviation between the observed temperatures and those from ERA5 within the lower atmospheric layer (0–500 m), indicating that ERA5 accurately represents near-surface temperature conditions. In the altitude range of 500–1500 m, the deviation gradually increases from approximately 0 K to 6.8 K. This suggests that the accuracy of ERA5 in reconstructing temperature decreases progressively with increasing altitude.
The black horizontal bars and red shaded areas in
Figure 6 denote the ranges of deviations and the density distribution of the point cloud at different levels, respectively. It is evident that deviations within the near-surface layer (0–300 m) are highly concentrated and symmetric, further validating the strong consistency between ERA5 and MWR observations near the surface. In contrast, at higher levels, particularly above 1000 m, the dispersion of deviations becomes substantially larger, demonstrating that the temperature differences between the two datasets increase with altitude.
Temperature variations in the boundary layer are typically strongly influenced by ground effects, such as diurnal variations in sensible heat flux caused by radiation. As a result, ERA5 can more accurately represent temperature variations in the lower atmospheric layers, leading to smaller deviations. With increasing altitude, especially in the upper boundary layer or lower free atmosphere, ERA5 data may be affected by factors such as complex terrain and local thermal gradients, resulting in increased temperature deviations. Furthermore, at higher altitudes (e.g., above 1000 m), the vertical structure of the atmosphere becomes more complex, particularly in mountainous regions or areas with complex terrain. ERA5 may not fully capture these localized thermal differences, which explains why temperature deviations tend to be greater at higher levels compared to lower levels. This phenomenon also contributes to the larger wind speed deviations observed at lower levels and the larger temperature deviations at higher levels in ERA5. As another aspect, uncertainties in MWR temperature measurements also increase with height due to the saturation of oxygen absorption lines, which reduces the brightness temperature sensitivity to atmospheric temperature. This issue warrants further discussion in future.
Figure 7 depicts the scatter density relationships between observed temperatures (MWR T) and ERA5 temperatures (ERA5 T) at various heights, quantifying the linear consistency and deviation characteristics between the two datasets across different atmospheric layers. In the lower layers of 2–200 m (
Figure 7a–d), scatter points are tightly clustered around the 1:1 line (y = x), and the highest-density regions exhibit a strong linear trend. The fitted slopes are 0.83, 0.90, 0.91, and 0.83, and the R² values are 0.63, 0.66, 0.69, and 0.65, for the heights of 2 m, 50 m, 100 m, and 200 m, respectively. These results indicate high consistency between ERA5 and observed temperatures near the surface in mountainous central China.
From the middle to near the top of the atmospheric boundary layer (i.e., 500 m, 1000 m, and 1500 m in
Figure 7e–g), the fitted slopes gradually decrease from 0.70 to 0.62. Although the R² values remain above 0.60, this suggests that the consistency between ERA5 and MWR observations weakens in the middle and upper atmospheric layers, and systematic deviations become more pronounced. This degradation may be attributed to the reduced reliance on observational data in ERA5 at higher altitudes, the decreased sensitivity of the MWR, and changes in atmospheric stratification characteristics.
3.5. Relative Humidity Assessment
Figure 8 displays the vertical profiles of the absolute relative humidity deviation (RH Abs. Dev) between microwave radiometer (MWR) observations and ERA5 reanalysis data, highlighting the height-dependent discrepancy patterns. The overall absolute deviations between observed and ERA5 relative humidity are more pronounced compared to temperature deviations. This discrepancy can be attributed to factors such as reduced observation density in the reanalysis assimilation at higher levels and limitations in ERA5 moist parameterizations. As altitude increases, ERA5 tends to diverge more significantly from MWR observations. Near the surface (0–300 m), the deviations are relatively small and concentrated around 0%, suggesting a moderate agreement between MWR observations and ERA5 data. However, starting from 300 m upward, the deviation range noticeably broadens, with the spread of deviations increasing progressively with height. Notably, at levels above 800 m, the distribution of absolute deviations becomes substantially wider, with individual values reaching beyond ±50%. In the near-surface layer (0–300 m), the deviation distribution remains relatively symmetric and narrow, indicating that ERA5 provides reasonably reliable estimates of relative humidity near the ground. Conversely, above 1000 m, the deviation patterns become increasingly asymmetric and scattered, with broader ranges on both the positive and negative sides. This suggests that ERA5 encounters challenges in accurately capturing the variability in humidity in the free atmosphere, likely due to the combined effects of weaker observational constraints and more complex moisture structures aloft.
Figure 9 presents scatter density plots of observed relative humidity (MWR RH) versus ERA5 relative humidity (ERA5 RH) across different heights ranging from 2 m to 3000 m, quantitatively illustrating their linear relationship and deviation characteristics at various atmospheric levels. In the lower atmospheric layers of 2–200 m (
Figure 9a–d), the scatter plots reveal moderate to weak linear relationships. The fitted slopes decrease from 0.65 at 2 m to 0.60 at 200 m, with relatively low R² values (R² = 0.44 at 2 m, 0.38 at 50 m, 0.39 at 100 m, and 0.36 at 200 m), indicating considerable dispersion between ERA5 and observed RH even near the surface. Unlike temperature, which exhibits stronger correlations at lower altitudes, relative humidity shows notable inconsistencies even close to the ground. As the altitude increases to 500–1000 m (
Figure 9e,f), the fitted slopes further decrease to values of approximately 0.55–0.62, and the R² values drop significantly, reaching as low as 0.30–0.32. In
Figure 9g, we examined the relationship between ERA5 RH and MWR-retrieved RH at 1500 m using both linear and exponential regression models. The results indicate that the exponential fit (y = 11.52·exp(0.02x); R² = 0.31) offers a more accurate representation of the observed relationship compared to the linear fit (y = 0.65x + 1.87; R² = 0.27). This improvement can be largely attributed to the nonlinear increase in ERA5 RH under moderately to highly humid conditions, where ERA5 tends to overestimate RH more rapidly than MWR observations. A noticeable change in the distribution density and slope of the fitted curve occurs around an MWR RH value of 60%, suggesting the presence of a “turning point”. This phenomenon may be associated with the proximity of this layer to the top of the planetary boundary layer or the base of low-level clouds, where significant moisture gradients and cloud formation processes are prevalent. In such conditions, ERA5 RH is more strongly influenced by model parameterizations (e.g., condensation and vertical mixing), whereas the MWR provides more localized and sensitive measurements. Consequently, the exponential function better captures the accelerated increase in ERA5 RH under high-humidity conditions, thereby offering a more realistic depiction of the relationship between the two datasets.
Moreover, ERA5 tends to underestimate relative humidity at low-to-moderate RH values while overestimating it at higher RH levels, especially above 70%. This systematic bias is reflected in the fitted regression slopes, which are significantly less than 1, and in the positive intercepts at most heights, indicating that ERA5 RH values are systematically shifted relative to the observations.
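The comparison of linear and exponential regressions can be illustrated with the sketch below. It assumes that the exponential model y = a·exp(bx) is fitted by ordinary least squares on ln(y), which is a common shortcut and not necessarily the fitting procedure used in this study, and that R² is evaluated in the original (untransformed) units. The function names and the synthetic data are illustrative only.

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination in the original units of y."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_linear(x, y):
    """Least-squares fit of y = slope * x + intercept."""
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept

def fit_exponential(x, y):
    """Fit y = a * exp(b * x) via regression of ln(y) on x (requires y > 0)."""
    b, ln_a = np.polyfit(x, np.log(y), 1)
    return np.exp(ln_a), b

if __name__ == "__main__":
    # Synthetic, noise-free data following an exponential law (demonstration only)
    x = np.linspace(10.0, 100.0, 50)
    y = 11.52 * np.exp(0.02 * x)
    slope, intercept = fit_linear(x, y)
    a, b = fit_exponential(x, y)
    print("linear R2:", r_squared(y, slope * x + intercept))
    print("exponential R2:", r_squared(y, a * np.exp(b * x)))
```

With real, noisy data the log-transform weights low values more heavily, so a nonlinear least-squares fit in the original units may give somewhat different coefficients.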
3.6. The Applicability of the ERA5 Dataset During Extreme Cold Events
A cold event refers to the rapid intensification of cold air originating from high-latitude regions and its subsequent movement into mid- and low-latitude areas, resulting in sharp temperature drops (defined as a decrease of more than 8 °C within 24 h, with the minimum temperature falling below 4 °C), accompanied by strong winds and precipitation such as rain or snow. As cold events represent the most important large-scale extreme weather events during the winter season affecting this region, it is of critical significance to evaluate the accuracy of ERA5 datasets in this area under cold event conditions.
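The thresholds in this definition can be applied to an hourly temperature series as sketched below. The thresholds (a drop of more than 8 °C within 24 h and a temperature below 4 °C) come from the definition above; how they are evaluated is an assumption made here for illustration: the drop is measured from the maximum of the preceding 24 hourly values, and the minimum-temperature condition is checked on the current hourly value. The function name is hypothetical.

```python
import numpy as np

def cold_event_mask(temp_c, window=24, drop=8.0, t_min=4.0):
    """Flag samples that satisfy both cold-event criteria.

    Assumed evaluation: drop = max of the previous `window` samples minus
    the current sample; the current sample must also be below `t_min`.
    """
    t = np.asarray(temp_c, dtype=float)
    mask = np.zeros(t.size, dtype=bool)
    for k in range(1, t.size):
        start = max(0, k - window)
        prior_max = t[start:k].max()
        mask[k] = (prior_max - t[k] > drop) and (t[k] < t_min)
    return mask

if __name__ == "__main__":
    # Synthetic hourly series: steady 12 degC followed by a sharp drop
    temps = [12.0] * 24 + [10.0, 6.0, 3.0, 1.0]
    print(np.flatnonzero(cold_event_mask(temps)))
```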
During 7–8 February 2025, a severe cold event hit this region, causing abrupt changes in various meteorological variables. During this event, a strong closed low-pressure system was present at the 500 hPa level over eastern Inner Mongolia, with a transverse trough extending in the east–west direction. At the surface, a cold high-pressure system stretched from Lake Baikal to central Mongolia, while the pressure over North China was relatively low, forming a large pressure gradient. This combination of the upper-level trough and the surface cold high led to the occurrence of this cold wave [
42].
Figure 10 illustrates the discrepancies between the observations and the ERA5 reanalysis. ERA5 generally underestimates wind speeds by more than 5 m/s relative to the observations across all vertical levels. For vertical wind speed, ERA5 consistently underestimates values below around 500 m while overestimating them above this height. In addition, during the cold event, ERA5 slightly underestimates temperature by 2–3 °C, whereas ERA5 RH is markedly underestimated by over 20%. These results demonstrate that the ERA5 data require further refinement to remedy their shortcomings in the mountainous regions of central China. Various methodologies could potentially be employed to enhance their accuracy. For instance, machine learning models have proven effective in mitigating biases in ERA5 data.
4. Discussion
In terms of dynamic variables, both ERA5 horizontal and vertical wind speed exhibit decreasing deviations along with increasing altitude when compared to the lidar observations, with notable biases in the near-surface layer below 500 m and minimal biases near the top of the boundary layer. Regarding wind direction, both ERA5 and observations indicate a predominance of northeasterly and southwesterly winds at the near-surface level, which aligns with the valley orientation. However, ERA5 exhibits lower frequencies in these wind directions. At an altitude of 1000 m, both datasets display similar frequency patterns, peaking at approximately 15%, but differ in their dominant wind directions (NNE in observations and NNW in ERA5). These findings emphasize the considerable influence of underlying heterogeneous mountainous terrains on the low-level dynamic field.
In contrast to the dynamic variables, the ERA5 thermal variables of temperature and relative humidity exhibit relatively small deviations within the lower atmospheric layers of 0–500 m, suggesting that ERA5 effectively represents near-surface thermal conditions. Between 500 and 1500 m, the temperature deviations increase but remain within a relatively narrow range of ±5 K. In comparison, the deviations in relative humidity are more pronounced, demonstrating the limited capability of ERA5 in capturing moisture variability.
Compared to other studies on the applicability of ERA5 datasets over mountainous terrains, similar discrepancies have been reported in mountainous areas in Zhejiang province in southeastern China, where it is found that ERA5 deviations in both horizontal wind speed and temperature increase along with altitude, with temperature biases reaching approximately 5 K at elevations between 2000 and 3000 m [
37]. Zhao et al. reported a very high correlation coefficient of 0.98 between ERA5-Land temperature and ground observations in the Qilian mountains in northwestern China [
43]. This strong correlation can be attributed to the fact that land surface temperatures in ERA5 are primarily derived from satellite remote sensing, which offers high accuracy. However, the focus of our study is on temperatures at different vertical levels rather than surface values. In ERA5, temperature data at various altitudes are mainly obtained through model simulations, which are more prone to introducing considerable errors and uncertainties. Additionally, differences in topography, landforms, and geographical climate zones contribute to significant variations in the applicability of ERA5 reanalysis data. Consequently, the agreement between modeled and observed temperatures is significantly weaker compared to the land surface comparisons. Over the Central Andes Altiplano (2500–5650 m), Birkel et al. found that ERA5 2 m temperature exhibits a high correlation with observations, reaching up to 0.93, demonstrating the great ability of ERA5 to resolve near-surface meteorological variables [
44].
The limitations of ERA5 data become particularly apparent during extreme cold events, highlighting the need for further improvements to address its deficiencies in this region. In light of these limitations, machine learning models offer a promising approach to improving the accuracy of ERA5 data. Future work will explore ML-based downscaling and bias correction of ERA5 forcing using neural network and Swin Transformer models. For example, Beck et al. employed convolutional neural network super-resolution models to sharpen reanalysis winds to 1 km grids over the Tibetan Plateau [
45]. Peng et al. constructed a physics-informed Swin Transformer model that improved wind speed forecasts in complex terrain by up to 25% [
46]. In addition, multiple mathematical correction algorithms, such as bias adjustment [
47], may also prove effective in reducing biases in the ERA5 dataset and could be applied in future studies.
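As an illustration of what such a bias adjustment might look like, the sketch below applies empirical quantile mapping, one common statistical correction, to synthetic wind speeds. It is not the algorithm of the cited reference and has not been applied in this study; the function names, the number of quantiles, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fit_quantile_mapping(model_train, obs_train, n_quantiles=101):
    """Estimate matching quantiles of model and observed training samples."""
    q = np.linspace(0.0, 1.0, n_quantiles)
    model_q = np.quantile(model_train, q)
    obs_q = np.quantile(obs_train, q)
    return model_q, obs_q

def apply_quantile_mapping(model_values, model_q, obs_q):
    """Map model values onto the observed distribution.

    np.interp expects increasing x-coordinates, which holds for quantiles
    of continuous data. Values outside the training range are clamped to
    the lowest or highest observed quantile.
    """
    return np.interp(model_values, model_q, obs_q)

# Synthetic demonstration only: the "model" damps and offsets the "truth".
rng = np.random.default_rng(1)
truth = rng.gamma(shape=2.0, scale=2.0, size=20000)  # hypothetical wind speed (m/s)
model = 0.7 * truth + 2.0 + 0.3 * rng.standard_normal(truth.size)

# Fit on the first half, evaluate on the second half.
half = truth.size // 2
model_q, obs_q = fit_quantile_mapping(model[:half], truth[:half])
corrected = apply_quantile_mapping(model[half:], model_q, obs_q)

bias_before = model[half:].mean() - truth[half:].mean()
bias_after = corrected.mean() - truth[half:].mean()
print(f"mean bias before: {bias_before:.2f} m/s, after: {bias_after:.2f} m/s")
```

Quantile mapping corrects the distribution of values rather than individual errors, so in practice it would probably need to be fitted separately for each height level and variable, given the altitude dependence of the deviations reported here.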
Another limitation worth emphasizing is that measurements were taken at only a single site, owing to challenges related to power supply, site accessibility, and network transmission capabilities in such mountainous regions. This limitation may introduce uncertainties regarding the general applicability of ERA5 data in these mountainous areas. Nevertheless, this study represents the first quantitative evaluation of ERA5 data in central China’s mountainous regions. More intensive observations across various synoptic situations, in different seasons, and at different target points within the mountainous terrain are recommended for future research in order to supplement the currently lacking mountain meteorological data and enable a more comprehensive assessment of ERA5’s applicability.
5. Conclusions
This study presents a comprehensive assessment of the vertical accuracy of ERA5 datasets and their applicability in the mountainous regions of central China, based on continuous measurements of boundary layer meteorological variables collected from December 2024 to February 2025. These measurements were acquired using a high-precision remote sensing platform, including a Doppler wind lidar and a microwave radiometer, deployed at a representative mountainous valley site in Xinyang, central China. The key findings are summarized as follows:
Owing to the strong influence of the underlying topographic forcing, the deviations of ERA5 horizontal and vertical wind speed decrease with altitude. With regard to wind direction, ERA5 effectively captures the dominant northeasterly and southwesterly winds shaped by the near-surface rolling terrain, consistent with the measurements, albeit with lower frequencies of occurrence. At higher altitudes, both datasets exhibit similar frequency patterns; however, observations show a stronger prevalence of NNE wind, whereas ERA5 indicates a greater frequency of NNW. Unlike the dynamic fields, the temperature and relative humidity of ERA5 datasets display increasing deviations with altitude, showing minimal biases near the surface but significant errors in the upper boundary layer. Notably, ERA5 demonstrates limited capability in accurately representing the humidity conditions in mountainous areas.
During extreme cold events, ERA5 is also unable to accurately resolve either the thermal or the dynamic meteorological variables, which greatly limits its applicability in this important climatic transitional zone between northern and southern China. Looking ahead, we recommend the site-specific development and application of machine learning techniques, such as neural network and Swin Transformer models, and mathematical correction algorithms, such as bias adjustment, to enhance the accuracy and reliability of ERA5 data in this area.
The conclusions drawn in this study offer valuable insights into the applicability of ERA5 in the mountainous regions of central China, thereby serving as a scientific resource for the investigation of mountain meteorology. Furthermore, these findings hold significant implications for climate modeling efforts aimed at understanding the interactions between the northern and southern climatic systems across mainland China.