1. Introduction
Air pollution remains one of the most significant environmental health risks globally, contributing to a wide range of adverse health effects even at low concentrations [1]. During specific pollution episodes (such as wildfires, dust storms, or elevated local anthropogenic emissions combined with stagnant meteorological conditions in the planetary boundary layer), pollutant levels can increase dramatically, resulting in higher mortality rates and hospital admissions [2]. Therefore, accurate air quality forecasts are important, especially to protect sensitive population groups by issuing health advisories and mitigation measures in advance.
Chemical transport models (CTMs) are essential tools for simulating the emission, transport, chemical transformation, and deposition of atmospheric pollutants [3]. These models are widely used for both air quality forecasting and regulatory assessments [4,5]. Different modelling systems are applied depending on the spatial scale: global CTMs provide estimates of pollutant transport on intercontinental scales, while regional models focus on smaller domains with higher spatial resolution, allowing for a more detailed representation of atmospheric processes [6].
To operate effectively, regional air quality models require a range of input data, including meteorological fields, emission inventories, land use data, initial conditions, and boundary conditions. While initial conditions are necessary only for the first time step of a simulation, boundary conditions must be provided continuously along the edges of the modelling domain throughout the simulation period to ensure a consistent solution of the model’s three-dimensional equations [7].
In operational air quality forecasting, boundary conditions are usually derived from global model outputs. In contrast, for retrospective policy assessments or scenario modelling, boundary conditions may incorporate observationally constrained data to increase accuracy [8]. One of the most widely used global data sources is the Copernicus Atmosphere Monitoring Service (CAMS), part of the European Union’s Earth observation programme. The CAMS provides daily data on atmospheric composition using the Integrated Forecasting System (IFS), which is also used in ECMWF’s numerical weather prediction. The CAMS extends the IFS by including modules for aerosols, reactive gases, and greenhouse gases developed in the precursor projects GEMS and MACC [9,10].
The CAMS products, available both as forecasts and reanalyses, are widely used in regional air quality modelling, particularly as boundary conditions for European-scale and national models [11,12,13,14]. However, uncertainties remain, arising from both input data (e.g., emissions, meteorology, boundary conditions) and the model formulation itself (e.g., chemical mechanisms, parameterisations) [12]. Among these, boundary conditions play a critical role in influencing pollutant levels within regional domains, particularly for pollutants with strong long-range transport components.
Several studies have investigated the impact of boundary conditions on regional model performance. Im et al. [12] showed that long-range transport of ozone from global models significantly influences surface ozone levels in Europe and North America, whereas PM10 and PM2.5 are predominantly affected by local emissions. Their study also highlighted a greater dependency on boundary conditions during spring, when transboundary transport tends to be more active. Makar et al. [15] demonstrated improvements in ozone forecasts by applying a tropopause-height-based dynamic adjustment to climatological boundary data. Similarly, Katragkou et al. [16] found, based on a series of sensitivity studies, that the impact of chemical boundary conditions on surface ozone is comparable to the changes caused by different meteorological forcing.
Jiménez et al. [7] examined the relative influence of initial and boundary conditions over time, concluding that the effects of initial ozone concentrations diminish after a 48 h spin-up period, whereas the influence of boundary conditions persists, especially near domain edges where short- to medium-range pollutant transport is significant. Akritidis et al. [17] tested three regional chemical transport model simulations using three different lateral boundary condition setups. In the first run, the boundary conditions were invariant in both space and time. In the second run, they were represented by monthly averages from a global model, allowing for seasonal variability. In the third run, the boundary conditions were also derived from a global model, but included both seasonal and inter-annual variability. The results showed that using boundary conditions with spatial and temporal variability improved the representation of ozone variability. However, incorporating inter-annual variability did not enhance the correlation between modelled and observed concentrations. Additionally, both the normalised standard deviation and the normalised mean bias were improved on a seasonal basis when time- and space-dependent boundary conditions were applied.
At HungaroMet, the Hungarian Meteorological Service, the CHIMERE chemical transport model is used operationally to generate daily air quality forecasts and to assess the previous year’s air quality over Hungary. In our earlier research, we explored the impact of meteorological uncertainties on simulated pollutant concentrations [18,19]. In the present study, we focus on another key source of uncertainty: the effect of boundary conditions on modelled concentrations.
Our approach involved a comparative analysis between the current operational configuration of CHIMERE, which uses climatological averages from the LMDz-INCA (Laboratoire de Météorologie Dynamique—INteraction with Chemistry and Aerosols) global database, and two test configurations incorporating the CAMS global forecast data. In the first test configuration, all boundary condition data—both gases and aerosols—were derived from the CAMS global forecast. In the second, only the aerosol-related species were replaced by the CAMS real-time data, while gas-phase pollutant boundary conditions remained based on climatological averages.
The primary aim of this study was to quantify the impact of using real-time boundary conditions, as opposed to climatological ones, on the predicted concentrations of key pollutants (NO2, O3, PM10, and PM2.5) over Hungary. To evaluate the model performance, we compared simulated results against measurements from the Hungarian Air Quality Monitoring Network using a set of statistical performance metrics. The findings contribute to a better understanding of how dynamic boundary data can enhance regional-scale air quality forecasting, particularly in the context of episodic events such as Saharan dust transport or wildfire plumes, which are expected to increase in frequency under changing climate conditions.
This study provides a systematic evaluation of the effects of real-time chemical boundary conditions on regional air quality modelling over Central Europe using the CHIMERE model. While previous studies have primarily focused on seasonal or idealised sensitivity experiments, our work presents an operationally relevant comparison based on the realistic, daily-varying CAMS forecast inputs versus climatological boundary conditions.
3. Results
3.1. Comparison of the Test Runs with the Operational CHIMERE Simulation
First, the modelled yearly concentrations simulated with the operational CHIMERE configuration were compared to those from the test runs. The concentrations were derived from the lowest model layer.
Figure 3, Figure 4, Figure 5, and Figure 6 display the differences between Test1 and the operational data in panel (a), and the differences between Test2 and the operational data in panel (b), for the pollutants NO2, O3, PM10, and PM2.5. Test1 shows larger deviations from the operational run. In the case of NO2 (Figure 3), both test cases produced lower concentration values compared to the operational yearly average. The Test1 configuration shows a larger concentration reduction, with a difference of about 20% near the domain boundaries.
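The percentage changes quoted for these maps can be illustrated with a minimal sketch. It assumes that the relative difference is computed cell by cell as (test − operational) / operational × 100; the exact definition is not stated in this section, and the function name and the flat-list representation of the grid are illustrative only.

```python
def percent_difference(test, operational):
    """Relative difference of a test run to the operational run, in percent.

    Both arguments are equally long sequences of yearly mean concentrations,
    one value per grid cell. Cells where the operational value is zero
    yield None, because the relative difference is undefined there.
    """
    result = []
    for t, o in zip(test, operational):
        if o == 0:
            result.append(None)
        else:
            result.append(100.0 * (t - o) / o)
    return result


# Hypothetical yearly means (μg/m3) for three grid cells.
changes = percent_difference([8.0, 12.0, 5.0], [10.0, 10.0, 0.0])
```

With these made-up values, the first cell corresponds to a 20% reduction and the second to a 20% increase relative to the operational run.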
During periods of unusually low pollutant concentrations (e.g., due to favourable weather conditions), real-time boundary conditions could result in lower NO2 concentrations compared to the average values. Weaker pollutant transport from neighbouring areas causes lower NO2 concentrations near the boundaries. If lower NO2 concentrations enter the model domain from surrounding regions, this may reduce the concentrations within city areas compared to those expected under constant climatological boundary conditions.
No significant change in annual O3 concentrations was observed when only the aerosol-type pollutants were replaced (Test2). In contrast, the Test1 configuration produced up to 20–40% higher O3 concentrations (Figure 4). The impact of real-time boundary conditions on surface ozone concentrations is larger within the inner parts of the domain. Global models can include long-range transport of ozone from distant regions, often from the upper troposphere or stratosphere. Therefore, the use of CAMS global forecasts as boundary conditions can lead to higher concentrations of ozone in the regional model.
Figure 4 shows a more significant increase in O3 concentration in the areas around larger cities. Near cities, even if the decrease in NO2 is less significant, the increase in O3 can still be more pronounced. This may be because the local emissions of VOCs (from vehicles or the industry sector) remain high, providing the necessary reactants for O3 formation while less NO is available to destroy it.
Regarding PM10, Figure 5 shows that both test configurations caused significant changes (increases of up to 40%) in the yearly averages in the southwestern part of the domain. The area of increased concentrations is more extensive in the case of Test2, although in most parts of the country the change is below 15%. A similar spatial distribution of concentration changes can be seen for PM2.5; however, their magnitude is lower. The activity of some natural PM sources is integrated into the CAMS global system in a timely manner. Hence, sudden increases in PM10 concentrations at the domain boundaries due to regional dust storm or wildfire events could be captured with the two test configurations.
It can be concluded that the use of CAMS global forecasts as boundary conditions has a significant impact on the predictions made with the CHIMERE model. The extent of the deviation from the operational model varies depending on the area and the pollutant.
3.2. Model Performance Evaluation
The purpose of the evaluation is to compare the simulated concentrations with the measurements of the Hungarian Air Quality Monitoring Network and to see whether the modelling system can accurately represent the surface concentrations of the pollutants. The analysis allows us to determine which configuration best matches the measured values. The assessment was conducted using hourly concentration data from 12 stations for a whole year. Due to the data availability criteria (75% for each time interval used for averaging, such as hourly values, daily averages, or daily maxima), the final dataset used for evaluation included 10 stations for NO2, 11 for O3, 11 for PM10, and 9 for PM2.5. In order to visualize the main aspects of model performance, we adapted the target diagram from FAIRMODE’s recommended benchmarking methodology.
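The 75% data availability criterion can be sketched as follows. This is only an illustration of the idea under stated assumptions: missing measurements are represented by None, a day consists of 24 hourly values, and the function names are hypothetical rather than taken from the processing chain used in the study.

```python
from statistics import mean


def daily_mean(hourly, min_fraction=0.75):
    """Daily mean of hourly values, or None if too few values are valid.

    `hourly` holds the hourly concentrations of one day (normally 24
    entries); missing measurements are given as None.
    """
    if not hourly:
        return None
    valid = [v for v in hourly if v is not None]
    if len(valid) < min_fraction * len(hourly):
        return None  # fewer than 75% of the hours are available
    return mean(valid)


def station_is_usable(daily_values, min_fraction=0.75):
    """True if at least `min_fraction` of the days have a valid daily value."""
    if not daily_values:
        return False
    valid_days = sum(v is not None for v in daily_values)
    return valid_days >= min_fraction * len(daily_values)
```

For a 24 h day, the threshold corresponds to at least 18 valid hourly values; the same test applied to the daily series decides whether a station is retained for a given pollutant.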
Figure 7 shows the target diagrams of the four pollutants. Each diagram offers a qualitative summary of the model’s performance, visually illustrating its accuracy in terms of BIAS and CRMSE (unbiased root mean square error) for each station. Each symbol represents the MQI value of that station, which is equal to the distance of the symbol from the origin. If a station falls within the green area, the MQI is less than 1. Additionally, the diagram also displays the 90th percentile of the MQI values for the stations in the top left corner and the parameters necessary for the derivation of the MQI values in the top right corner.
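The geometry of the target diagram can be illustrated with a short sketch. It assumes the commonly used FAIRMODE form, in which BIAS and CRMSE are normalised by β·RMS_U, where RMS_U is the root mean square of the measurement uncertainty and β = 2. The function name, the argument names, and the way the uncertainties are supplied (one value per observation) are illustrative assumptions, and the sign convention that places a symbol on the left or right side of the diagram is not reproduced here.

```python
import math


def target_coordinates(model, obs, obs_uncertainty, beta=2.0):
    """Return (x, y, mqi) for one station of a target diagram.

    x   = CRMSE / (beta * RMS_U)   (returned as a non-negative value)
    y   = BIAS  / (beta * RMS_U)
    mqi = RMSE  / (beta * RMS_U) = sqrt(x**2 + y**2)
    """
    n = len(obs)
    mean_m = sum(model) / n
    mean_o = sum(obs) / n
    bias = mean_m - mean_o
    # Centred RMSE: error that remains after removing the mean bias.
    crmse = math.sqrt(
        sum(((m - mean_m) - (o - mean_o)) ** 2 for m, o in zip(model, obs)) / n
    )
    rms_u = math.sqrt(sum(u * u for u in obs_uncertainty) / n)
    scale = beta * rms_u
    x = crmse / scale
    y = bias / scale
    # RMSE^2 = BIAS^2 + CRMSE^2, so the distance from the origin is the MQI.
    mqi = math.hypot(x, y)
    return x, y, mqi


# Hypothetical values (μg/m3) for a single station.
x, y, mqi = target_coordinates(
    model=[12.0, 14.0, 10.0, 16.0],
    obs=[10.0, 10.0, 10.0, 10.0],
    obs_uncertainty=[1.0, 1.0, 1.0, 1.0],
)
```

Because RMSE² = BIAS² + CRMSE², a station whose symbol lies inside the unit circle has an MQI below 1, which is the green area described above.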
The target diagram of NO2 shows that the MQI values are below 1 for all stations and for all three model runs. The majority of the BIAS values are negative; only Budapest and Sarród lie in the upper half of the diagram. The performance of the three configurations is similar at the stations; there are no outstanding differences in the position of the symbols representing the configuration types on the chart. Regarding O3, the Test1 configuration does not meet the requirement that 90% of the stations must have an MQI value lower than 1. All model runs overestimated the measured concentrations. The difference between the performance of the operational and Test2 configurations is small. While the symbols showing Test1 results are noticeably distant from those of the other configurations (shifted upwards), the relative positioning of the stations remains consistent. This indicates that the BIAS increased significantly with the use of near-real-time boundary condition data and that the stations were similarly affected. The target diagram of PM10 shows that none of the runs meets the model performance criteria; only in the operational run do some stations fall inside the green area. The BIAS is negative overall, i.e., CHIMERE underestimates the measured concentrations. The only exception with positive BIAS is the Nyírjes station. For PM10, the Test1 and Test2 configurations produced similar performances; nevertheless, the stations were again shifted together. The CRMSE values increased; thus, the symbols indicating Test1 and Test2 are located to the left of the operational symbols. A higher CRMSE means that, although the average error remained almost the same, the spread of the errors around that average is greater. The PM2.5 target plot shows that the 90th percentile of the MQI values is above 1 for all three model configurations. As with PM10, the Test1 and Test2 configurations achieved similar model performances. The Budapest and Nyírjes stations, with larger positive BIAS values, stand out from the rest. Again, we found that the CRMSE changed more than the BIAS when compared to the operational model.
Figure 8 shows the correlation coefficient (R) values valid for the stations. The green bars show the R values of the operational configuration, the red bars those of Test1, and the orange bars those of Test2. The correlation coefficients are generally higher for O3 than for the other pollutants. The test configurations did not substantially change the correlation between measured and modelled NO2 compared to the operational setup. The R values of PM10 are below 0.6, and the differences between the two test configurations are small. The correlation increased at stations in the southwestern part of the country (Pécs, Szeged, Szombathely, and Veszprém) and at Nyírjes. In the case of PM2.5, the correlation increased only at Nyírjes when using near-real-time boundary conditions.
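For reference, a minimal sketch of the correlation coefficient, assuming that R denotes the Pearson correlation between paired modelled and measured values at one station (the function name is illustrative):

```python
import math


def pearson_r(model, obs):
    """Pearson correlation coefficient between modelled and observed values."""
    n = len(obs)
    mean_m = sum(model) / n
    mean_o = sum(obs) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, obs))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    if var_m == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_m * var_o)
```

R measures only how well the timing and shape of the variations agree; a constant offset or a scaling of the modelled series leaves it unchanged, which is why it is evaluated alongside BIAS and the standard deviation metrics.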
Figure 9 shows the normalised mean standard deviation (NMSD) values valid for the stations. The colouring of the three model runs is the same as in Figure 8. This metric provides insight into the variability of a model’s predictions relative to the observed data. For the majority of the cities, the modelled NO2 concentrations exhibit lower variability than observed. CHIMERE captures the observed variability best at Budapest, K-puszta, and Szeged. Regarding O3, the NMSD values are positive for the Test1 case and negative for the other two cases at the stations, except for Miskolc, where the modelled standard deviation is larger than the observed one in all three runs. The NMSDs of PM10 changed considerably with the introduction of the test configurations. The operational model underestimates the observed PM10 values, but the usage of new boundary data increased the variability in the modelled concentrations. With the two test configurations, the NMSDs moved closer to 1. This suggests that the model is able to capture episodic pollution events, but it seems to be overestimating the spread of the pollutant concentrations relative to the observed data. For PM2.5, it can also be seen that the test configurations improved the NMSD values. However, at Nyírjes and Budapest, the NMSD is above 1, which means that the model is too sensitive or overly influenced by small fluctuations in the input data at these locations.
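A small sketch may help when reading the NMSD values. The definition is not given in this section, so the code assumes the form (σ_M − σ_O) / σ_O, which is consistent with the positive and negative values mentioned for O3; under this assumption, 0 indicates matching variability. If the metric were instead defined as the ratio σ_M / σ_O, a perfect match would correspond to 1. The function name is illustrative.

```python
from statistics import pstdev


def nmsd(model, obs):
    """Normalised mean standard deviation, assumed as (sigma_M - sigma_O) / sigma_O.

    Negative values: the model varies less than the observations.
    Positive values: the model varies more than the observations.
    """
    sigma_o = pstdev(obs)
    if sigma_o == 0:
        raise ValueError("NMSD is undefined when the observations are constant")
    return (pstdev(model) - sigma_o) / sigma_o
```

With this assumed definition, a modelled series with half the observed standard deviation gives −0.5, and one with twice the observed standard deviation gives 1.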
The evaluation yields some important findings. First, the statistical metrics that describe the accuracy of the NO2 predictions did not vary significantly between the different model configurations. One possible explanation is that the lifetime of NO2 is short, so even if the amount of pollutants that cross the borders differs from that of the operational model, there is very little change in the inner parts of the model domain. Second, the model performance of Test1 regarding O3 differs from the operational and Test2 results. Using real-time boundary conditions strengthened the overestimation of the O3 concentrations. Although the variability of the modelled concentrations increased, the correlation decreased. Using the global model output as boundary conditions can lead to higher ozone levels being imported into the regional model, as it does not account for smaller-scale local processes that might reduce ozone concentrations at the boundaries. Additionally, if the global model captures more stratosphere–troposphere exchange events, the amount of ozone in the lower troposphere can be higher than the climatological average. The Test1 configuration allows more ozone to enter the model domain, and, as the lifetime of ozone in the troposphere is long enough to allow it to be transported over long distances, the ozone concentration is increasingly overestimated throughout the whole country. Regarding PM10, the overall underestimating behaviour of CHIMERE did not change with the introduction of real-time boundary data, but the degree of variability around this average error grew. Real-time boundary data may reflect short-term pollution events or spikes in PM10 concentrations that are not captured by climatological averages. As some of the episodic events that can significantly raise PM10 levels (e.g., natural dust events) are represented in the CAMS global forecasts, the cross-border pollution could be reflected in our regional model results, and this improved the correlation and the match between the modelled and measured standard deviation at some stations. In the case of PM2.5, all but one station showed a decrease in correlation with the Test1 and Test2 model configurations, and the NMSD values increased in all cases. When the model is exposed to more fluctuations in the boundary input data, it captures the overall range of variability better, but the error variability increases too, and this reduces the overall correlation strength.
3.3. Modelling Episode Situations: An Example of a Saharan Dust Event
From late March to early April 2024, warm air with a southerly flow brought significant amounts of Saharan dust towards Europe, resulting in high concentrations of aerosol particles across Hungary. On 1 April, the air quality was classified as very poor according to the Hungarian Air Quality Index, and the daily average concentration exceeded 50 μg/m3 at all the PM10-measuring stations involved in this study. The elevated amount of desert dust could be modelled with the use of the CAMS forecasts as boundary conditions. The difference between the daily PM10 concentrations of the test configurations and the operational model is shown in Figure 10. The differences are higher in the southern part of the model domain.
In Figure 11, the modelled and measured daily PM10 averages between 27 March and 2 April are displayed using bar charts. Two sites, Pécs and Szombathely, were selected to show the concentration-raising effect of the episode. Starting from these two cities, backward trajectories were drawn with the web version of the Hybrid Single-Particle Lagrangian Integrated Trajectory (HYSPLIT) model [31,32]. The two 72 h trajectories were plotted from a height of 1000 m above ground level to determine the dust transportation pathway. The air parcels originated from North Africa. The measured daily average PM10 concentrations (black bars) were above 140 μg/m3 at these two stations on 1 April. The operational model (green bars) could not follow the trend in the measured concentrations. The long-range transport of desert dust was captured by the two test configurations (red and orange bars). However, CHIMERE predicted the maximum concentrations within the period one day earlier than observed. The maximum values of the test configurations’ daily averages were above 100 μg/m3, while the operational model’s maximum was significantly lower than 100 μg/m3.
The CAMS integrates satellite data and observation-based emission estimates to describe the state of the atmosphere accurately. By using the global CAMS forecasts as boundary conditions, the effect of long-range transport could be more realistically represented in our model system. CHIMERE can adjust immediately to extreme pollution events in the global model and provide more accurate forecasts. If there is a possibility of a Saharan dust intrusion, providing more accurate forecasts becomes especially important, not only to keep the public informed, but also to help experts better estimate solar power generation.
4. Conclusions
The impact of using the CAMS global forecasts as boundary conditions in the CHIMERE model run at HungaroMet has been investigated. The operational model, which uses climatological averages from the LMDz-INCA database for boundary conditions, was compared with two test configurations. In one configuration, the boundary conditions for all pollutant types were replaced with the data from the CAMS forecasts, while in the other, only the aerosol boundary conditions were replaced and the gaseous pollutants kept their climatological averages. The year 2024 was selected for the analysis. Initially, 12 stations providing O3, NO2, PM10, and PM2.5 concentrations were chosen to evaluate the model. Unfortunately, not all station data could be included for every pollutant due to data gaps found in the measured time series. The model results were compared using maps, basic statistical metrics, and FAIRMODE’s MQI values to determine whether the test configurations predicted the measured values more accurately.
The map comparison of the model results revealed that the test configurations led to lower NO2 concentrations and higher O3, PM10, and PM2.5 concentrations. No improvement in the MQI fulfilment was achieved with the test configurations, although some aspects of the model performance were improved. The effect of the choice of boundary conditions is smallest for NO2. The NO2 concentration is influenced mainly by the activity of local emission sources, and its short lifetime limits the impact of transport from distant sources. However, Test2 has lower MQIs and slightly higher correlation coefficients compared to the other two model setups. Replacing the gaseous pollutants’ boundary conditions with the CAMS data (Test1) significantly degraded the model’s performance in predicting O3 concentrations. The overestimation of O3 increased significantly. This might be because the CAMS global model and our regional CHIMERE model have different spatial and temporal resolutions. The CAMS model, with its coarser resolution, may introduce errors that are then propagated during the interpolation process when downscaling to the finer resolution of the CHIMERE model. These propagated errors can affect the accuracy of the final forecast. Other shortcomings of using the CAMS global forecasts as boundary conditions in our system include differences in the representation of fluxes and the usage of different chemical schemes and reaction constants.
The BIAS of PM10 was reduced by the use of CAMS as the boundary data. However, the MQI values aggregated from station values were further from 1 in the test runs. Replacing the boundary conditions for aerosol-type pollutants with real-time predictions improved the correlation coefficients of PM10 at some stations. The benefits of using the CAMS data are significant when the meteorological situation favours the large-scale transport of aerosol particles from an area where Saharan dust disperses or a wildfire episode is occurring. During dust storm events, the concentration of the natural dust fraction increases within PM10, while during wildfires, the amount of carbonaceous aerosols rises. The increasing frequency of natural dust storms reaching Europe, along with the growing demand for a greater proportion of solar power in electricity generation, necessitates the development of more accurate forecasts of particulate matter (PM) concentrations. Since both test configurations offer advantages (through real-time consideration of the effects of dust, organic, and black carbon mixing ratios) over the operational model in terms of PM10 forecasts, and Test2 caused the least degradation in modelled NO2 and O3 values, the implementation of the Test2 configuration is recommended for operational usage.
Future efforts will focus on identifying the dominant sources of systematic error within the modelling system. While the CAMS-based boundary conditions improve preparedness for extreme pollution episodes, the evaluation revealed that modelled concentrations may still significantly overestimate observed values during such events. Therefore, the additional components of the modelling framework require investigation. Planned developments include an evaluation of how the choice of meteorological input from different numerical weather prediction (NWP) models influences the performance of the CHIMERE regional chemical transport model, as well as an assessment of revised planetary boundary layer (PBL) height calculations. A detailed analysis of vertical mixing processes may help explain the persistent overestimation of O3 and the model’s inability to reproduce observed PM10 peaks during specific episodes.