AIR POLLUTION DISPERSION MODELLING USING SPATIAL ANALYSES

Air pollution dispersion modelling via spatial analyses (Land Use Regression, LUR) is an alternative to standard air pollution dispersion modelling techniques for air quality assessment. Its main advantages are a much simpler mathematical apparatus, quicker and simpler calculations, and the possibility of incorporating other factors that affect pollutant concentrations. The goal of the study was to model PM10 dispersion via spatial analyses in the Czech-Polish border area of the Upper Silesian industrial agglomeration and to compare the results with those of the standard Gaussian dispersion model SYMOS'97. The results show that the standard Gaussian model gives better results than the LUR model built on the same data (coefficient of determination 71% for the Gaussian model versus 48% for the LUR model). When land cover factors were included in the LUR model, its results improved significantly (coefficient of determination 65%), to a level comparable with the Gaussian model. The hybrid approach combining the Gaussian model with the LUR model gives the best results (coefficient of determination 86%).


Particulate pollution
Particulate matter (PM) is a mixture of solid or liquid, organic and inorganic substances suspended in the air. It consists mainly of sulfates, nitrates, ammonia, salts, soot, mineral particles, metals, bacteria, pollen and water. Particles with a diameter smaller than 10 µm (PM10) have severe health effects because they can enter the lungs or even the bloodstream [19], [20], [16]. Natural PM10 sources are forest fires, dust storms, volcanic processes, erosion and sea water [3], [11]. A large part of PM10 is of anthropogenic origin [22]. Anthropogenic sources comprise combustion processes (thermal power plants, heating, internal combustion engines), industrial processes such as coking, blast furnaces, steelworks, sinter plants, cement production and mineral extraction, and dust resuspension from roads and agriculture (soil erosion) [16], [17], [9], [8].
Recent research indicates that there is no minimal threshold concentration below which human health is unaffected [3]. The factors influencing health effects are the size and geometry of the particles, their chemical composition, their physical properties, their concentration and the time of exposure. Particles larger than 10 µm are caught by the ciliated epithelium of the upper respiratory tract and have a low health impact. Particles smaller than 10 µm accumulate in the bronchi and lungs and cause health issues. Particles smaller than 1 µm pose the biggest health threat because they can reach the alveoli and frequently contain adsorbed carcinogenic substances. Inhalation of PM10 mainly damages the heart and lungs; it is a cause of premature death in people with heart or lung disease, and of cancer, fibrosis, allergic reactions, asthma, lung insufficiency, heart attacks, respiratory tract irritation and cough [19], [20], [16], [3], [11].
There are two legal pollution limits for PM10. The 24-hour average limit is 50 µg/m³, which may be exceeded no more than 35 times per year. The annual average concentration limit is 40 µg/m³ [13], [14].
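The two limits can be checked against a year of daily mean concentrations with a few lines of code. The following is a minimal sketch; the function name `check_pm10_limits` and the sample data are hypothetical and only illustrate the arithmetic of the limits stated above.

```python
DAILY_LIMIT = 50.0     # 24-hour average limit [µg/m³]
MAX_EXCEEDANCES = 35   # allowed number of daily exceedances per year
ANNUAL_LIMIT = 40.0    # annual average limit [µg/m³]


def check_pm10_limits(daily_means):
    """Evaluate both PM10 limits for one year of daily mean concentrations.

    Returns (exceedance_days, daily_limit_met, annual_mean, annual_limit_met).
    """
    exceedance_days = sum(1 for c in daily_means if c > DAILY_LIMIT)
    annual_mean = sum(daily_means) / len(daily_means)
    return (
        exceedance_days,
        exceedance_days <= MAX_EXCEEDANCES,
        annual_mean,
        annual_mean <= ANNUAL_LIMIT,
    )


# Hypothetical year: 325 days at 30 µg/m³ and 40 days at 60 µg/m³.
sample_year = [30.0] * 325 + [60.0] * 40
days, daily_ok, mean, annual_ok = check_pm10_limits(sample_year)
print(days, daily_ok, round(mean, 2), annual_ok)
```

In this made-up example the annual limit is met while the daily limit is not, which shows that the two limits are independent criteria.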

Land Use Regression modelling
Land Use Regression (LUR) modelling is an empirical modelling approach based on multivariate linear regression. It combines pollution monitoring data with spatial variables describing the vicinity of the monitoring sites, which are typically obtained via spatial analyses in Geographic Information Systems (GIS). The result of the analysis is a linear model in which [Factor_*] are the selected spatial factors and [Coef_*] are the regression coefficients obtained from the linear regression analysis at the pollution monitoring sites. The empirical model can then be used to estimate the spatial distribution of PM10 pollution in the area of interest. The LUR model was first used for air pollution monitoring in the SAVIAH (Small Area Variations in Air quality and Health) project, where it was used to study NOx concentrations in three European cities: Amsterdam, Huddersfield and Prague. The successful application of the LUR model in the SAVIAH project spurred its use in further studies in European countries and in the rest of the world [10], [15], [12].
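The regression step can be illustrated with ordinary least squares. This is a minimal sketch that assumes a model of the form PM10 = Coef_0 + Coef_1 · Factor_1 + ... + Coef_n · Factor_n, where the intercept Coef_0 is an assumption. The function names `fit_lur` and `predict_lur` and all numbers are hypothetical and are not taken from the study.

```python
import numpy as np


def fit_lur(factors, pm10):
    """Fit a linear LUR model by ordinary least squares.

    factors: array of shape (n_sites, n_factors) with the spatial factors
             computed at the monitoring sites.
    pm10:    array of shape (n_sites,) with measured concentrations.
    Returns the coefficients; coef[0] is the (assumed) intercept.
    """
    factors = np.asarray(factors, dtype=float)
    pm10 = np.asarray(pm10, dtype=float)
    design = np.column_stack([np.ones(len(pm10)), factors])
    coef, *_ = np.linalg.lstsq(design, pm10, rcond=None)
    return coef


def predict_lur(coef, factors):
    """Estimate concentrations at arbitrary locations from their factors."""
    factors = np.atleast_2d(np.asarray(factors, dtype=float))
    return coef[0] + factors @ coef[1:]


# Hypothetical factors at four monitoring sites (two factors per site).
site_factors = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [2.0, 3.0]])
site_pm10 = 10.0 + 2.0 * site_factors[:, 0] + 3.0 * site_factors[:, 1]
coefficients = fit_lur(site_factors, site_pm10)
print(coefficients, predict_lur(coefficients, [1.0, 2.0]))
```

Once fitted at the monitoring sites, the same coefficients can be applied to the factors computed for every cell of a grid to map the estimated concentrations.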

Gaussian dispersion modelling
Gaussian dispersion models assume emission transport from continuous pollution sources in a homogeneous wind field without spatial limits. In the model, the transport itself is provided by the wind (advection) and by turbulent diffusion, which is described statistically by a Gaussian distribution. Spatial limitations, mainly the terrain, are included in the model through correction coefficients. Gaussian dispersion models are commonly used for modelling long-term (e.g. annual) average concentrations. The dispersion is calculated for a set of standard meteorological conditions and the results are summed, weighted by the probability of occurrence of each condition. The most commonly used Gaussian dispersion models are CALINE3 (Benson, 1979) and ADMS-Urban [1]. The SYMOS'97 model [18] is a reference pollution dispersion model in the Czech Republic. It is a Gaussian model which calculates the dispersion of both gaseous and particulate pollutants from point, line and area pollution sources. The model takes into account both dry and wet deposition as well as chemical reactions during transport.
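For orientation, the textbook Gaussian plume formula for a continuous point source with ground reflection, and the probability-weighted long-term average, can be sketched as follows. This is not the SYMOS'97 formulation: terrain corrections, deposition and chemical reactions are omitted, and the dispersion parameters `sigma_y` and `sigma_z` are assumed to be supplied by the caller for the receptor's downwind distance. All function and parameter names are hypothetical.

```python
import math


def gaussian_plume(q, u, sigma_y, sigma_z, y, z, h):
    """Concentration [g/m³] downwind of a continuous point source.

    q: emission rate [g/s]; u: wind speed [m/s];
    sigma_y, sigma_z: horizontal and vertical dispersion parameters [m];
    y: crosswind offset from the plume axis [m]; z: receptor height [m];
    h: effective source height [m].
    """
    lateral = math.exp(-y ** 2 / (2.0 * sigma_y ** 2))
    # The second term mirrors the source below ground (reflection).
    vertical = (math.exp(-(z - h) ** 2 / (2.0 * sigma_z ** 2))
                + math.exp(-(z + h) ** 2 / (2.0 * sigma_z ** 2)))
    return q / (2.0 * math.pi * u * sigma_y * sigma_z) * lateral * vertical


def long_term_mean(conditions):
    """Average over meteorological conditions.

    conditions: iterable of (probability, concentration) pairs, where the
    probabilities of all conditions are expected to sum to 1.
    """
    return sum(p * c for p, c in conditions)


# Hypothetical receptor evaluated under two meteorological conditions.
c_calm = gaussian_plume(q=5.0, u=1.5, sigma_y=40.0, sigma_z=20.0, y=10.0, z=2.0, h=30.0)
c_windy = gaussian_plume(q=5.0, u=6.0, sigma_y=25.0, sigma_z=12.0, y=10.0, z=2.0, h=30.0)
print(long_term_mean([(0.3, c_calm), (0.7, c_windy)]))
```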

Data sources
The study area was selected to match the area of the Air Silesia project [2]. All air pollution source and monitoring data used in the study were obtained from the published results of the Air Silesia project and relate to the year 2010. The Air Silesia project focused on collecting air pollution data and assessing air quality in the border area of the Upper Silesian industrial region. Fig. 2 shows the study area and the annual mean PM10 concentrations [µg/m³] at the pollution monitoring stations.

Air pollution data
The air pollution data, yearly averages of PM10 concentrations, were obtained from the yearbooks [5] of the Czech Hydrometeorological Institute and of the Voivodship Inspectorate of Environmental Protection of the Silesian Voivodship [21]. There were 27 air pollution monitoring stations measuring PM10 concentrations in the study area (Fig. 2).
Peer-reviewed version available at ISPRS Int. J. Geo-Inf. 2018, 7, 489; doi:10.3390/ijgi7120489

Pollution source data
The pollution source data were obtained from the pollution source database provided by the Air Silesia project. The data were divided by country of origin (Czech or Polish) and by the kind of pollution source (industrial, domestic heating, car traffic). Brief emission statistics are presented in Tab. 1 and as emission squares (Fig. 4).

Land Use data
The land use data were obtained from the CORINE Land Cover dataset [7] as vector datasets. Four kinds of land cover were selected for the analysis: built-up areas, forested areas, areas with grass cover, and open soil (agricultural) areas.

METHODOLOGIES AND RESULTS
Two basic groups of factors were considered in the study: factors of pollution sources and factors of land cover. Each factor (except the distance to the nearest major road) was calculated in a similar fashion. A buffer of the selected distance was created around each pollution monitoring station. The factor was then calculated as a sum, percentage or length-weighted average of the vector data clipped by the buffer. Factors were calculated either uniformly (U) or as a weighted average based on wind direction probability (W). For the weighted variant, the modelling area was split into 14 areas according to the terrain configuration, and the meteorological conditions in each area were represented by its own dataset (Fig. 5). The buffer zones were split into 8 slices representing 8 wind directions. Factors were calculated for each slice, and the final weighted factor was calculated as the weighted average of the slice factors, where the weights were the probabilities of wind blowing from the corresponding directions. For the purposes of the study, all factors were encoded. For example, the code [FL_500_W] denotes the forested land cover factor calculated for a buffer distance of 500 m and weighted by the wind direction probability distribution.
The regression consisted of two steps: first, the statistical significance of each factor was evaluated; then, the regression coefficients were calculated using only the statistically significant factors. The best result of the statistical analysis was regression model (2), with an R² of 48% and a mean quadratic error of 10.59 µg/m³. When the land cover factors were taken into account, the resulting best linear model had an R² of 65% and a mean quadratic error of 8.34 µg/m³.
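The uniform (U) and wind-weighted (W) variants of a buffer factor can be sketched as follows. The sketch makes several simplifying assumptions that are not taken from the study: pollution sources are treated as points with an emission value, coordinates are planar (metres), the 8 slices are centred on the compass directions N, NE, ..., NW, and the factor is a sum of emissions. The function names are hypothetical.

```python
import math


def sector_index(dx, dy, n_sectors=8):
    """Index of the compass sector (0 = N, counted clockwise) that contains
    the offset (dx to the east, dy to the north) from the station."""
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0
    width = 360.0 / n_sectors
    # Shift by half a sector so that sector 0 is centred on north.
    return int(((bearing + width / 2.0) % 360.0) // width)


def buffer_factor(station, sources, radius, wind_prob=None):
    """Sum the emissions of point sources inside a buffer around a station.

    station:   (x, y) coordinates of the monitoring station [m].
    sources:   iterable of (x, y, emission) tuples.
    radius:    buffer distance [m].
    wind_prob: optional probabilities of wind blowing from each sector;
               None gives the uniform variant (U), a list the weighted one (W).
    """
    n_sectors = len(wind_prob) if wind_prob is not None else 8
    per_sector = [0.0] * n_sectors
    sx, sy = station
    for x, y, emission in sources:
        dx, dy = x - sx, y - sy
        if math.hypot(dx, dy) <= radius:
            per_sector[sector_index(dx, dy, n_sectors)] += emission
    if wind_prob is None:
        return sum(per_sector)
    return sum(p * f for p, f in zip(wind_prob, per_sector))


# Hypothetical sources: one to the north, one to the east, one outside the buffer.
sources = [(0.0, 100.0, 10.0), (100.0, 0.0, 20.0), (0.0, -1000.0, 99.0)]
wind = [0.5, 0.0, 0.25, 0.0, 0.25, 0.0, 0.0, 0.0]  # N, NE, E, SE, S, SW, W, NW
print(buffer_factor((0.0, 0.0), sources, 500.0))
print(buffer_factor((0.0, 0.0), sources, 500.0, wind))
```

With polygon or line data, as used in the study, the per-slice values would come from a GIS overlay of the buffer slices with the vector layers instead of from a point-in-buffer test.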
The results of the SYMOS'97 model were statistically evaluated against the measurements at the monitoring sites. The R² of the model was 71% and the mean quadratic error was 7.44 µg/m³. Both approaches, the LUR and the dispersion model, can be combined into a hybrid, two-step process in which the dispersion model results are used as input data for the construction of the LUR model. Two possible inputs were tested: the land cover data were combined both with the partial results of the dispersion model and with the sum of all partial results.
The R² of the hybrid model was 86% and the mean quadratic error was 5.56 µg/m³. The dispersion model provides more accurate information about pollution dispersion from pollution sources, and the LUR model allows the incorporation of additional variables.
[Figure: observed versus predicted values; panels: Pollution sources + Landcover model, Hybrid dispersion-LUR model]
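The two-step hybrid construction and the two quality measures can be sketched as follows. The sketch assumes that the "mean quadratic error" corresponds to the root mean square error (it is reported in µg/m³) and that the hybrid model is an ordinary least squares fit with an intercept; both are assumptions. The function names and the data are hypothetical.

```python
import numpy as np


def r_squared(observed, predicted):
    """Coefficient of determination of predictions against observations."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def rmse(observed, predicted):
    """Root mean square error, in the units of the observations."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))


def fit_hybrid(dispersion_pm10, land_cover, observed):
    """Second step of the hybrid: regress measurements on the dispersion
    model output at the stations plus land cover factors.

    Returns (coefficients, fitted values); coefficients[0] is the intercept.
    """
    observed = np.asarray(observed, dtype=float)
    design = np.column_stack([np.ones(len(observed)), dispersion_pm10, land_cover])
    coef, *_ = np.linalg.lstsq(design, observed, rcond=None)
    return coef, design @ coef


# Hypothetical stations: dispersion model output and a forest cover factor.
dispersion = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
forest = np.array([0.0, 1.0, 0.0, 1.0, 2.0])
measured = np.array([21.0, 32.0, 49.0, 63.0, 77.0])
coef, fitted = fit_hybrid(dispersion, forest, measured)
print(coef, r_squared(measured, fitted), rmse(measured, fitted))
```

Passing the partial dispersion results (for example one column per source category) instead of their sum only changes the number of columns supplied as `dispersion_pm10`.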

DISCUSSION
With similar input data, the LUR model gives much worse results than the Gaussian dispersion model (R² 48% versus 71%). The LUR model provided good estimates of air pollution when the monitoring station was located in an urban environment in the vicinity of air pollution sources. On the other hand, predictions at rural and natural sites were inaccurate because the model, as constructed, cannot take long-distance pollution transport into account. The LUR model also did not take into account other parameters of pollution sources used in Gaussian models, mainly the source height, the exhaust gas speed and temperature, and the speed and fluency of the traffic flow. In addition, Gaussian models describe pollution dispersion more accurately, in the form of non-linear dispersion formulas. The LUR model with added land cover factors gives better predictions of PM10 concentrations (R² 65%), which are comparable with, but still worse than, those of the Gaussian model. The addition of land cover factors greatly improved the quality of the predictions at natural and rural monitoring sites. The Gaussian model provided better results at industrial and urban-background monitoring sites, while the LUR model with land cover factors slightly outperformed it at rural and natural monitoring sites. The LUR model was also able to explain the significant underestimation of PM10 concentrations by the Gaussian model at three monitoring sites (Opava-Kateřinky, Věřňovice, Studénka).
The LUR model showed that all three sites, which are located close to the edge of urban areas, are heavily influenced by nearby agricultural activities and/or by wind-caused re-emission and erosion, represented in the LUR model by the open soil factor. When the dispersion and the LUR models were combined, the resulting hybrid dispersion-LUR model retained the more accurate information about pollution dispersion from pollution sources and was able to incorporate the effect of land cover on PM10 concentrations. This resulted in a much improved model quality compared with the dispersion model alone (R² 86%). The hybrid model formula also gives additional information. The road traffic emissions seem to be underestimated by a factor of 2, which may be caused by the re-emission of particulates, which was not accounted for in the model. The hybrid model formula also demonstrates the positive effect of trees on particulate pollution. According to the model, tree cover reduces PM10 pollution by up to 12.5 µg/m³ in the urban environment and by up to 28.14 µg/m³ in the rural environment.

CONCLUSION
LUR modelling is an alternative approach to standard dispersion models. The biggest advantages of the LUR approach are the relative simplicity of the calculations, compared with dispersion modelling, which is demanding in time and computational power, and the ability to incorporate factors not included in dispersion modelling. Although the results of the LUR models in this study did not match the quality of the Gaussian model, the LUR approach should not be dismissed, because it can incorporate phenomena that are usually omitted by standard dispersion models. A hybrid dispersion-LUR model combining both approaches was also developed; it gives significantly more accurate modelling results than either approach alone.

Figure 4. The PM10 distribution in the study area. Basemap: OpenStreetMap

Figure 6. Observed to predicted comparison of results

Figure 7. Observed to predicted comparison of results

Table 2. Factors of pollution sources, factors of land cover