1. Introduction
Two key challenges of modern wind technology have long been recognised: the optimal specification of electricity-generating wind turbines and the variability of wind resources [1]. Consequently, wind energy resource assessment and forecasting are widely acknowledged as grand challenges shaping the future of wind science [2]. Forecasting is directly linked to grid stability, as managing large quantities of variable wind power requires accurate short-term and long-term predictions to balance electricity supply and demand. Precise characterisation of the wind climate is also a requirement for cost-effective wind farm development, directly influencing turbine selection, wind farm layout, and overall financial viability.
Given the spatial and temporal limitations of observational networks, high-resolution atmospheric reanalysis models have become indispensable tools for estimating wind resources over large regions. These models provide detailed datasets for various atmospheric parameters, including air temperature, pressure, wind speed and wind direction. One such model is the Norwegian reanalysis NORA3, a dynamically downscaled, non-hydrostatic reanalysis with a horizontal resolution of 3 km, developed to deliver detailed wind climate information for the Baltic Sea, North Sea, Norwegian Sea, and Barents Sea [3]. However, the usefulness of any reanalysis dataset is fundamentally linked to its accuracy and representativeness when evaluated against independent, ground-truth observations.
Several studies have investigated the validation of wind profiles derived from reanalysis models. In [4], the NORA3 dataset was validated using wind measurements from six offshore sites. The results showed that model-based median energy production and capacity factor values are slightly underestimated, with the capacity factor being on average about three percentage points lower, primarily due to the model’s tendency to underestimate high-wind-speed events. In [5], the NORA3 and NEWA wind atlases were compared with offshore platform reference data, and both datasets demonstrated good agreement with in situ observations, with NORA3 slightly outperforming NEWA. In [6], three wind atlases (NORA3, NEWA, and ERA5) were evaluated against wind profiles obtained from Doppler wind lidar measurements. The study found that all three atlases perform well at offshore locations, with ERA5 and NORA3 showing the closest agreement with lidar data. However, performance was less consistent in coastal and complex terrain environments, where significant deviations were observed, particularly for ERA5 and NEWA.
Furthermore, the assessment of high-resolution atmospheric reanalysis models is evolving beyond simple dataset validation. As shown by [7], deep learning and spatiotemporal modelling can mitigate wind speed biases by as much as 11% in challenging maritime conditions. These findings suggest that similar data-driven corrections could enhance the utility of the NORA3 dataset within the Baltic Sea region. Despite the high fidelity of reanalysis products, additional uncertainties often arise during the downscaling of mesoscale datasets. Study [8] investigates how different meso–micro coupling methodologies can lead to variations in wind climate and annual energy production estimates, particularly under stable atmospheric conditions. The results indicate the need for dataset-specific evaluation and correction rather than reliance on a single downscaling strategy. Building on these advancements, Ref. [9] demonstrated that NORA3’s precise handling of convective precipitation and hub-height wind speeds enables the creation of detailed rain erosion atlases. Such tools for assessing leading-edge erosion (LEE) of turbine blades are now critical, allowing for more proactive and strategic maintenance in the challenging environments of Scandinavia and the Baltic Sea.
Therefore, this study aims to assess the accuracy of wind speed and vertical wind profiles derived from the NORA3 reanalysis through validation against independent wind measurements obtained from lidar and meteorological masts. The combined use of lidar and mast data allows for a robust evaluation across a range of heights and atmospheric conditions. The study focuses on the Baltic Sea region, particularly Latvian coastal environments, providing performance characteristics of the NORA3 dataset within this specific geographical context.
2. Materials and Methods
2.1. Observation Datasets
Observation datasets for the analysis and comparison were obtained from various sources and locations. In Riga, a Vaisala WindCube lidar was deployed and provided wind parameter measurements at altitudes from 52 m to 212 m in 20 m increments. At the other locations, NRG Systems data loggers and sensors were installed on dedicated masts at altitudes ranging from 11 m to 84 m. At least two anemometers were placed at each altitude to mitigate airflow disturbances produced by structural elements of the mast itself. Lidar technology is considered more reliable because it does not depend on mast-mounted physical components, such as anemometers and wind vanes, but its cost is a limiting factor for widespread use.
Quality control procedures were applied to both types of observation data sources. The lidar device performs self-assessment of measurements based on carrier-to-noise ratio metrics, which are calibrated and validated by the manufacturer. NRG devices are calibrated during the manufacturing process and installed according to the IEC 61400-50-1 [10] and IEC 61400-12-1 [11] standard specifications. Information about observation locations, coordinates and timestamps is summarised in Table 1, Table 2 and Figure 1.
The spatial distribution of the observational sites in Figure 1 was mapped using the Google Earth web platform. Markers were positioned by inputting the coordinates of the observational sites from Table 2, with the base cartographic data provided by Google and GeoBasis-DE/BKG (2009) [12].
2.2. The NORA3 Dataset
The NORA3 dataset provides high-resolution records of weather patterns for Northern Europe, utilising a 3 km horizontal grid and 65 vertical levels based on a terrain-following pressure system that provides data for altitudes from 10 m to 750 m [3]. It is based on the non-hydrostatic, convection-permitting (CP) HARMONIE-AROME model, which solves the fully compressible Euler equations. By resolving these dynamics, NORA3 enables high-quality modelling of surface temperature, precipitation, and wind fields within topographically complex regions [13].
The dataset was retrieved through the METEO object of the WindPro software (version: 4.1.292), matching the timestamps of the observational data. It is important to note, however, that NORA3 data are provided at specific grid nodes that may not exactly align with the coordinates of the observation stations. This spatial mismatch can introduce representativeness errors, with the resulting differences depending on the distance between a station and the given node.
For this study, the NORA3 grid nodes closest to the observation sites were selected; their precise coordinates are listed in Table 2, together with the corresponding observation points and the distances between them.
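The selection of the closest grid node can be illustrated with a short sketch. It assumes a great-circle (haversine) distance on a spherical Earth; the coordinates, the function names `haversine_km` and `nearest_node`, and the candidate node list are hypothetical and are not the values from Table 2 or the procedure used inside WindPro.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    r = 6371.0  # mean Earth radius in km (spherical approximation)
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearest_node(site, nodes):
    """Return (node, distance_km) for the grid node closest to the site."""
    best = min(nodes, key=lambda n: haversine_km(site[0], site[1], n[0], n[1]))
    return best, haversine_km(site[0], site[1], best[0], best[1])

# Hypothetical coordinates (lat, lon), for illustration only.
site = (57.00, 24.00)
nodes = [(57.00, 24.05), (57.02, 24.00), (56.97, 23.96)]
node, dist = nearest_node(site, nodes)
print(node, round(dist, 2))
```

On a 3 km grid, the distance to the nearest node found this way should not exceed roughly half the grid diagonal, which gives a quick sanity check on the distances reported for each site.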
The WindPro software (version: 4.1.292) was also used to adjust the raw data for site-specific altitudes using the WAsP component (external version 12) before further analysis.
2.3. Dataset Comparison and Analysis
The observation datasets were resampled from 10 min measurement intervals to hourly mean values and joined by timestamp with the NORA3 dataset (hourly intervals), producing pairs of observed and reanalysis values of wind parameters for all available altitudes. Several comparison and analysis techniques were then applied to the synchronised datasets.
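A minimal sketch of the resampling and joining step is given below. It assumes that each hourly value is labelled by the start of the hour and is a plain arithmetic mean of the 10 min wind speeds falling within it; the labelling convention, the function names and the sample values are assumptions made for illustration, not details taken from the study.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import fmean

def hourly_means(records):
    """Average (timestamp, wind_speed) samples into bins keyed by the hour start."""
    bins = defaultdict(list)
    for ts, speed in records:
        bins[ts.replace(minute=0, second=0, microsecond=0)].append(speed)
    return {hour: fmean(vals) for hour, vals in bins.items()}

def join_by_timestamp(obs_hourly, model_hourly):
    """Keep only hours present in both series; return sorted (hour, obs, model) tuples."""
    common = sorted(obs_hourly.keys() & model_hourly.keys())
    return [(h, obs_hourly[h], model_hourly[h]) for h in common]

# Hypothetical data: two hours of 10 min observations and three hourly model values.
start = datetime(2023, 1, 1, 0, 0)
obs_10min = [(start + timedelta(minutes=10 * i), 5.0 + 0.1 * i) for i in range(12)]
model = {start + timedelta(hours=h): 5.5 + 0.5 * h for h in range(3)}
pairs = join_by_timestamp(hourly_means(obs_10min), model)
for hour, obs, mod in pairs:
    print(hour, round(obs, 2), mod)
```

The inner join drops hours that are missing from either source, so gaps in the observations do not produce unmatched pairs.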
The comparison of raw wind speed values between the datasets is based on correlation analysis at distinct altitudes. It shows how well the values match at each timestamp and demonstrates the overall trend of the datasets.
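The correlation between paired values can be sketched as follows. The text does not name the correlation measure, so the Pearson coefficient is assumed here; the function name `pearson_r` and the sample values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical hourly wind speeds (m/s) at one altitude.
observed = [4.0, 6.0, 8.0, 5.0, 7.0]
modelled = [4.5, 6.2, 8.4, 5.6, 7.1]
r = pearson_r(observed, modelled)
print(round(r, 3))
```

A high coefficient indicates that the two series vary together; it does not by itself reveal a constant offset between them, which is why the distribution comparison is carried out separately.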
Wind speed distribution analysis provides insights into the expected probability of certain wind speeds occurring at specific altitudes. The Weibull distribution, which is widely applied in wind analysis [14,15], was used as the probability density function (PDF). It smooths out the noise inherent in wind speed measurements and produces a compact description suitable for further analysis.
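As an illustration of how Weibull coefficients may be obtained from a wind speed series, the sketch below uses a moment-based estimate with the common empirical approximation k ≈ (σ/μ)^(−1.086) for the shape parameter. The fitting method used in the study is not stated, so this choice, the symbols k (shape) and a (scale), and the synthetic sample are assumptions.

```python
import math
import random
from statistics import fmean, pstdev

def fit_weibull_moments(speeds):
    """Estimate Weibull shape k and scale a from the sample mean and standard deviation."""
    mu = fmean(speeds)
    sigma = pstdev(speeds)
    if mu <= 0 or sigma <= 0:
        raise ValueError("need a positive mean and a non-zero spread")
    k = (sigma / mu) ** -1.086          # empirical approximation for the shape
    a = mu / math.gamma(1.0 + 1.0 / k)  # scale from the mean: mu = a * Gamma(1 + 1/k)
    return k, a

def weibull_pdf(v, k, a):
    """Weibull probability density at wind speed v (m/s); zero for v <= 0 in this sketch."""
    if v <= 0:
        return 0.0
    return (k / a) * (v / a) ** (k - 1) * math.exp(-((v / a) ** k))

# Synthetic sample drawn from a Weibull distribution with scale 8 m/s and shape 2.
rng = random.Random(0)
sample = [rng.weibullvariate(8.0, 2.0) for _ in range(5000)]
k_fit, a_fit = fit_weibull_moments(sample)
print(round(k_fit, 2), round(a_fit, 2))
```

The fitted coefficients should land close to the values used to generate the sample; maximum-likelihood fitting is a common alternative when more accuracy is needed.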
Wind rose comparison gives insights into the distribution of prevalent wind directions, which is especially valuable in coastal regions, where wind direction and energy production are significantly influenced by smoother, low-turbulence air flow from the sea. This observation is supported by the existing literature [16,17], which demonstrates that offshore wind conditions typically exhibit lower turbulent kinetic energy (TKE) and more organised flow patterns than those occurring over land.
An annual energy production (AEP) estimate was performed for a typical wind turbine based on the operational characteristics of the Vestas V172-7.2 MW (see Table 3). The same estimation procedure was applied to both the NORA3 and observation datasets, giving a basis for comparison in terms of modelling results. In addition, a wind resource extrapolation technique [14,15,18] was applied to estimate energy production at altitudes beyond the available observations. The technique involves calculating the PDF coefficients (Weibull in our case) for each observed altitude, extrapolating these coefficients logarithmically to higher altitudes, estimating the PDF at those altitudes, and finally using this PDF to estimate energy production. The same technique was applied for interpolation purposes at the Riga site, where wind observations are available for the whole range of altitudes. This allows comparison of the AEP based on actual observed PDFs with that based on predicted PDFs and helps to validate the extrapolation technique.
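The final step, turning a Weibull PDF into an energy estimate, can be sketched as a numerical integration of power curve times probability density over wind speed, scaled by the number of hours in a year. The power curve below is a hypothetical cubic ramp with assumed cut-in, rated and cut-out speeds; it is not the manufacturer's curve from Table 3, and the Weibull coefficients are illustrative.

```python
import math

HOURS_PER_YEAR = 8760

def weibull_pdf(v, k, a):
    """Weibull probability density with shape k and scale a (m/s)."""
    if v <= 0:
        return 0.0
    return (k / a) * (v / a) ** (k - 1) * math.exp(-((v / a) ** k))

def power_kw(v, cut_in=3.0, rated_v=12.0, cut_out=25.0, rated_kw=7200.0):
    """Hypothetical power curve: cubic ramp from cut-in to rated speed, flat to cut-out."""
    if v < cut_in or v > cut_out:
        return 0.0
    if v >= rated_v:
        return rated_kw
    return rated_kw * (v ** 3 - cut_in ** 3) / (rated_v ** 3 - cut_in ** 3)

def aep_gwh(k, a, dv=0.1, v_max=40.0):
    """AEP in GWh/yr: midpoint-rule integral of P(v) * f(v) dv times hours per year."""
    n = int(round(v_max / dv))
    mean_kw = sum(
        power_kw((i + 0.5) * dv) * weibull_pdf((i + 0.5) * dv, k, a) * dv
        for i in range(n)
    )
    return mean_kw * HOURS_PER_YEAR / 1e6  # kWh -> GWh

# Illustrative coefficients: the same shape, two different scale values.
aep_low = aep_gwh(2.0, 8.0)
aep_high = aep_gwh(2.0, 9.0)
print(round(aep_low, 1), round(aep_high, 1))
```

Because the same power curve is applied to both datasets, differences in the resulting values reflect differences in the fitted distributions only, which is what makes the estimate usable as a comparison metric.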
3. Results and Discussion
3.1. Correlation Analysis
The datasets feature high correlation between wind speed values at all levels (see Figure 2, Figure 3, Figure 4 and Figure 5). The highest correlation between the NORA3 and observation datasets is found in Riga, where the lidar device was deployed: the correlation coefficient ranges from 0.86 to 0.91 across different altitudes. At locations with meteorological masts, the correlation is generally lower, namely 0.75–0.82 in Ainazi, 0.83–0.86 in Pavilosta, and 0.87–0.89 in Ventspils. The correlation coefficient also increases with altitude, so the lower values at the mast sites can be explained by the relatively low altitudes of the mast measurements; at corresponding altitudes, the coefficients are comparable with those at the Riga location.
3.2. Wind Speed Distribution
The wind speed distributions of the NORA3 and observation datasets match closely at higher altitudes, while at lower altitudes, the NORA3 dataset demonstrates a bias towards stronger wind speeds in contrast to actual observations (see Figure 6, Figure 7, Figure 8 and Figure 9).
Mean absolute errors (MAEs) between the observed and modelled wind speed distributions range from 0.4% to 2.7% across altitudes and locations. Considering the wind speed distributions at altitudes of about 85 m, the closest match between the datasets is found at the Ventspils location (0.4%), followed by Pavilosta (1.1%) and Ainazi (1.5%). In Riga, at the same altitude, the datasets show a moderate discrepancy (about 2%), reaching comparable agreement only at altitudes above 112 m. This can be explained by landscape features, such as coastal plains in Ventspils and forested and urban areas at the other locations.
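One possible way to compute an MAE between two wind speed distributions is sketched below: both series are binned into equal-width wind speed classes, the class frequencies are expressed in percent, and the absolute differences are averaged over the classes. The exact definition used in the study is not given, so the binning, the bin width and the function names are assumptions.

```python
def binned_frequencies(speeds, bin_width=1.0, v_max=30.0):
    """Relative frequency (in percent) of wind speeds per bin; the last bin catches overflow."""
    if not speeds:
        raise ValueError("empty wind speed series")
    n_bins = int(v_max / bin_width)
    counts = [0] * n_bins
    for v in speeds:
        idx = min(max(int(v / bin_width), 0), n_bins - 1)
        counts[idx] += 1
    return [100.0 * c / len(speeds) for c in counts]

def distribution_mae(observed, modelled, bin_width=1.0, v_max=30.0):
    """Mean absolute difference between binned frequencies, in percentage points."""
    fo = binned_frequencies(observed, bin_width, v_max)
    fm = binned_frequencies(modelled, bin_width, v_max)
    return sum(abs(a - b) for a, b in zip(fo, fm)) / len(fo)

# Hypothetical series: the modelled speeds are shifted by 1 m/s towards stronger winds.
observed = [1.5, 2.5, 2.7, 3.5]
modelled = [2.5, 3.5, 3.7, 4.5]
mae = distribution_mae(observed, modelled, bin_width=1.0, v_max=5.0)
print(mae)
```

With this definition the result depends on the bin width and on the upper wind speed limit, so both would need to be kept fixed when comparing sites.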
3.3. Wind Rose Comparison
The main discrepancies between the NORA3 and observation datasets are found in the wind roses. While wind speeds are comparable, wind directions differ significantly at all altitudes and locations. The mean absolute error (MAE) between the observed and modelled wind directions is used as a comparison metric that gives a quantitative indication of the differences between the wind roses. Long-term observations in this region report southwest (SW) winds as the most prevalent [20,21]. Overall, the wind roses show the same feature, although the details differ (see Figure 10, Figure 11, Figure 12, Figure 13 and Figure 14).
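When an MAE is computed for wind directions, the circular nature of the angle has to be respected: 350° and 10° are 20° apart, not 340°. The sketch below shows one way to handle this, assuming the directions are paired by timestamp and expressed in degrees; the function names and sample values are hypothetical, and the study does not state how the wrap-around was treated.

```python
def angular_difference(a, b):
    """Smallest absolute difference between two directions in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return 360.0 - d if d > 180.0 else d

def direction_mae(observed_dirs, modelled_dirs):
    """Mean absolute angular error between paired wind directions, in degrees."""
    if len(observed_dirs) != len(modelled_dirs) or not observed_dirs:
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [angular_difference(o, m) for o, m in zip(observed_dirs, modelled_dirs)]
    return sum(diffs) / len(diffs)

# Hypothetical paired directions (degrees, meteorological convention).
observed_dirs = [350.0, 180.0, 45.0]
modelled_dirs = [10.0, 200.0, 15.0]
mae_deg = direction_mae(observed_dirs, modelled_dirs)
print(round(mae_deg, 1))
```

Without the wrap-around step, pairs that straddle north would inflate the error and could dominate the average.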
Figure 10 and Figure 11 show a short-term comparison between lidar-based measurements and the NORA3 dataset at altitudes of 72 m and 172 m. Although these wind roses appear to differ the most, each of them features a single dominant wind direction, and the offset is about 30 degrees on average. Notably, no significant height dependency was observed, as both altitudes show similar deviations from the NORA3 wind patterns.
As illustrated in Figure 12 and Figure 13, in the longer-term dataset comparisons for the meteorological masts in Ainazi and Pavilosta, the wind roses are more uniform and feature a variety of wind directions. Despite this closer visual alignment, the average differences in wind direction are the largest among the locations, at about 60 degrees. In the case of Pavilosta, the southeast (SE) wind direction, identified as one of the prevailing directions, is notably underrepresented in the NORA3 wind rose.
As shown in Figure 14, the comparison between the Ventspils meteorological mast data and the NORA3 dataset yields the smallest deviation, on average only about 25 degrees for long-term observations. Altogether, a systematic bias in the wind roses is observed at all locations; thus, it is not related to a specific measurement device (e.g., a wrongly oriented lidar). This suggests that the NORA3 dataset reflects long-term wind conditions over past decades, whereas the observation datasets used in this research are relatively short-term, covering only a few years. As a reanalysis dataset, NORA3 does take actual wind conditions into account, but short-term deviations have a lower impact than existing long-term trends.
3.4. Annual Wind Energy Production Estimates
Both the NORA3 and observation datasets were used to estimate annual wind energy production for a typical wind turbine, the Vestas V172-7.2 MW. Overall, the estimates based on the NORA3 dataset are 3–6 GWh/yr higher than those based on the observation datasets (see Figure 15).
The energy production estimates based on the logarithmic wind resource extrapolation technique match the NORA3 and observation datasets at higher altitudes, except for Ainazi, where they diverge. The extrapolation is based on three points at low altitudes, so small biases result in large deviations at high altitudes. Thus, the divergence can be explained by biases in both data sources rather than by systematic discrepancies in the technique.
The Riga location is used to validate the approach, as actual wind speed readings and PDF coefficients are available there for the higher altitudes. The trend for the observation dataset shows a high coefficient of determination (R² = 0.945). Notably, the trend for NORA3 fits the dataset very closely (R² = 0.999), which suggests that the reanalysis model incorporates similar logarithmic relations.
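The logarithmic trend and its coefficient of determination can be illustrated with a least-squares fit of a coefficient against the natural logarithm of altitude, y = a + b·ln(z). The altitudes and Weibull scale values below are hypothetical, and the function names are introduced only for this sketch; the study's own fitting procedure may differ.

```python
import math

def fit_log_trend(heights, values):
    """Least-squares fit of values = a + b * ln(height); returns (a, b)."""
    x = [math.log(z) for z in heights]
    n = len(x)
    mx = sum(x) / n
    my = sum(values) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, values)) / sxx
    a = my - b * mx
    return a, b

def r_squared(heights, values, a, b):
    """Coefficient of determination of the fitted logarithmic trend."""
    my = sum(values) / len(values)
    ss_res = sum((y - (a + b * math.log(z))) ** 2 for z, y in zip(heights, values))
    ss_tot = sum((y - my) ** 2 for y in values)
    return 1.0 - ss_res / ss_tot

# Hypothetical Weibull scale coefficients (m/s) at four measurement altitudes (m).
heights = [52, 72, 92, 112]
scale = [6.1, 6.6, 7.0, 7.3]
a, b = fit_log_trend(heights, scale)
r2 = r_squared(heights, scale, a, b)
predicted_152 = a + b * math.log(152)  # extrapolation beyond the fitted range
print(round(b, 2), round(r2, 3), round(predicted_152, 2))
```

With only a few low-altitude points, a small error in any one of them changes the slope b noticeably, which is consistent with the sensitivity of the extrapolated estimates noted for the mast sites.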
AEP estimates provided in this study are meant as quantitative metrics for comparing the datasets. Actual AEP estimates for a specific wind turbine depend on a variety of factors, such as installation location, surrounding terrain, wind wake effects from other turbines and many others; thus, dedicated analysis should be performed.
4. Conclusions
In this study, NORA3 was validated against observational data in the eastern Baltic Sea region, Latvia. Four different sites with varying terrain types were used for obtaining wind profile data. By incorporating wind profile data from both lidar-based WindCube devices and meteorological masts, the analysis provides a multi-dataset comparison. This approach establishes the specific performance characteristics of NORA3 within the Latvian geographical context, allowing for a localised assessment of the dataset.
Correlation analysis shows a close match between the NORA3 and observation datasets for all locations (correlation coefficient ranges from 0.75 to 0.91). The same match is true for the wind speed distribution comparisons; however, at lower altitudes, the NORA3 dataset exhibits a bias towards stronger wind speeds compared to the observed data. The most variable results were revealed in the wind rose analysis. Long-term datasets confirm the southwest (SW) direction as the predominant wind in the Latvian climate, although shorter-term observation datasets exhibit deviations which are not reflected in the NORA3 dataset for the corresponding periods.
Annual energy production (AEP) estimates were compared for both types of datasets. Overall, NORA3 AEP estimates are 3–6 GWh/yr higher than the estimates from observational datasets for the given turbine model. The same trend is observed at almost all locations and altitudes, while the relative difference between the NORA3 and observational estimates decreases to 10–15% at higher altitudes.
NORA3 is a robust tool for assessing the wind resource at wind turbine installation altitudes in the Baltic Sea region, although the primary modelling targets of NORA3 are the North Sea, the Norwegian Sea, and the Barents Sea. Nevertheless, its energy production estimates are more optimistic than those based on actual observations. For higher assessment quality, long-term observational data for the specific regions are essential, together with integration with other reanalysis datasets (such as ERA5 and NEWA) and insights from landscape and climate-related studies for the quantitative assessment of terrain types.