The Neural Network Assisted Land Use Regression

Land Use Regression (LUR) is one of the modelling techniques used in air quality assessment. Its main advantages are a much simpler mathematical apparatus, quicker and simpler calculations, and the possibility to incorporate more factors affecting pollutant concentration than standard dispersion models. The goal of the study was to build an LUR model for the Polish-Czech-Slovakian Tritia region, to test two sets of pollution input factors, i.e., factors based on emission data and on pollution dispersion model results, and to test regression via neural networks and compare it with standard linear regression. With standard linear regression, both input datasets provided a similar quality of results: the R² of the models was 0.639 for emission data and 0.652 for pollution dispersion model results. Neural network regression provided a significantly higher quality of the models: their R² was 0.937 and 0.938 for the factors based on emission data and pollution dispersion model results, respectively.


Air Pollution
Air pollution is an undesirable state of the environment caused by the emission of pollutants from various sources into the air. A pollutant is defined as any substance that negatively affects human health, ecosystems or property [1,2]. The overall imbalance in the environment has increased over the years. Sources of pollution can be classified as natural and anthropogenic. Concentrations caused by natural sources appear as a natural background and are not influenced by human activities [1]. Anthropogenic pollution is caused by human activities, and its origin is bound to human settlements. Unlike natural pollution, anthropogenic pollution can be influenced. Common air pollutants are particulate matter (PM10 and PM2.5), nitrogen oxides, carbon monoxide, sulphur dioxide, benzo[a]pyrene, persistent organic pollutants (POPs), etc. [3]. This article deals with PM10. PM10 are solid or liquid (aerosol) particles of 10 µm or less in diameter, composed of various organics, sulfates, nitrates, ammonium salts, soot, mineral particles, metals, bacteria, pollen and water [1].

Air Pollution Modelling
Air pollution monitoring is a standard tool for air pollution assessment. In the Czech Republic, it is intended to be the main tool for air monitoring and research in accordance with Czech Act no. 201/2012 Coll. Air pollution concentrations are regulated with respect to both the acute effects of pollution exposure (short-term averages) and the effects of chronic air pollution exposure represented by yearly averages. This approach was adopted from the EU legislation on ambient air quality and cleaner air for Europe [4] and is universal for the EU countries. This study focuses on the chronic part of the air pollution effect represented by yearly averages.
The disadvantage of pure air pollution monitoring is that measurements can provide information about concentrations only at specific measurement points. For effective air pollution management, it is better to have continuous information about the distribution in the study area. Mathematical air pollution dispersion modelling is a suitable method to acquire the concentration distribution of air pollution in the study area.
Mathematical models utilized in air quality assessment and management can be classified by model type as [5]:
• empirical models,
• Gaussian models,
• numerical models,
• physical models.
Physical models are scaled-down physical representations of a real situation that allow it to be studied. Numerical models are a category of models based on general formulas and algorithms of computational fluid dynamics. These model categories are not considered in the study.
Empirical models are based on in situ measurements and observations. They use various data analysis techniques to create a model describing the phenomenon under study. Empirical models are usually much simpler, require less computational power, and are easier to work with. Their biggest disadvantage is that they are site and time specific. Their scope is limited by the data they are based on: data from a different location or time period result in a different model. Land Use Regression (LUR) models belong to the empirical model category.

Gaussian Models
Gaussian dispersion modelling is based on the assumption of continuous leakage of pollution from a pollution source and subsequent dispersion of pollutants in a constant homogeneous wind speed field without spatial limiting conditions. The dispersion of pollutants occurs in the wind direction due to convection and in the direction perpendicular to it due to diffusion, which is caused by atmospheric turbulence and expressed statistically using the (Gaussian) normal distribution [6]. Spatial limiting conditions, such as the effect of terrain, are included in the calculation using coefficients [7]. Input data of Gaussian models are terrain data, meteorological data, the characterization of pollution sources, and a mesh of reference points. Input data characterize an average situation during the modelled period [8].
Gaussian models work with a reasonable degree of abstraction of the air pollution dispersion process. Therefore, it is possible to describe the relationship between the concentration at the reference point and the source using an easily evaluated mathematical formula. At the same time, the abstraction imposes smaller requirements on the input data and on the evaluation time. Therefore, Gaussian models are widely used for air pollution modelling in large areas. Among the Gaussian air pollution models preferred by the EPA are Industrial Source Complex (ISC) [9], CALPUFF [10], CALINE3 [11], AERMOD [12], and others. Gaussian models preferred by the Czech legislation are SYMOS'97 [13], AEOLIUS [14] and ATEM [15]. These models provide steady-state results describing air pollution concentrations in the study area and time interval.
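The kind of formula meant here can be illustrated with the textbook Gaussian plume equation for a continuous point source with ground reflection. This is only a sketch: it is not the SYMOS'97 formulation, and the function name, parameters and sample values are assumptions chosen for illustration.

```python
import math

def gaussian_plume(q, u, sigma_y, sigma_z, y, z, h):
    """Textbook Gaussian plume concentration (g/m^3) with ground reflection.

    q        emission rate (g/s)
    u        wind speed (m/s)
    sigma_y  horizontal dispersion coefficient at the receptor distance (m)
    sigma_z  vertical dispersion coefficient at the receptor distance (m)
    y        crosswind distance from the plume axis (m)
    z        receptor height above ground (m)
    h        effective source height (m)
    """
    lateral = math.exp(-y**2 / (2.0 * sigma_y**2))
    # The second term mirrors the source below ground to model reflection.
    vertical = (math.exp(-(z - h)**2 / (2.0 * sigma_z**2))
                + math.exp(-(z + h)**2 / (2.0 * sigma_z**2)))
    return q / (2.0 * math.pi * u * sigma_y * sigma_z) * lateral * vertical

# Hypothetical values: 10 g/s source, 3 m/s wind, ground-level receptors.
c_axis = gaussian_plume(q=10.0, u=3.0, sigma_y=80.0, sigma_z=40.0, y=0.0, z=0.0, h=50.0)
c_off = gaussian_plume(q=10.0, u=3.0, sigma_y=80.0, sigma_z=40.0, y=100.0, z=0.0, h=50.0)
```

In this form the concentration is highest on the plume axis and decays with crosswind distance, which is why a single closed-form evaluation per source and reference point is sufficient.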

Land Use Regression
Models based on Land Use Regression belong to the category of empirical models. LUR can be treated as an independent air pollution modelling method. This kind of model is based on the principle that concentrations at a specific location depend on the characteristics of the surrounding environment, especially the characteristics that affect the intensity of emissions and rate of dispersion and deposition. Modelling itself is performed on the basis of the regression model describing the influence of relevant environmental and spatial characteristics, i.e., input factors [16,17]. The model is composed of regression equations that include the relationship between input factors and substance concentrations at monitoring sites. The resulting formula can be used to predict concentrations over the whole area represented by point measurements.
LUR models are not tied to any time or spatial resolution. They are used in a variety of time intervals from short-term campaigns to long-term averages at fixed pollution monitoring sites. The spatial resolution of LUR models can vary from hyper-local urban areas to regional or country level models.
Most LUR models use linear regression [18]. Linear regression describes the relationship between the variables x and y, where x (a scalar or a vector representing a set of input values) is treated as the independent variable and assumed to be measured without error. The two variables are connected by a linear function [19]: the dependent random variable y, i.e., the quantity under investigation, is assumed to be a linear function of x. The coefficient of determination (R²) is used to evaluate how well the model describes the relationship between these variables [20].
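As a minimal illustration of these definitions, the following sketch fits a line with one input variable by ordinary least squares and computes R². The data are made up and the function names are hypothetical; real LUR models use several input factors.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x for scalar x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Made-up data: share of built-up area (%) vs. annual mean PM10 (ug/m^3).
built_up = [5.0, 10.0, 20.0, 30.0, 40.0]
pm10 = [22.0, 25.0, 27.0, 33.0, 36.0]
a, b = fit_line(built_up, pm10)
r2 = r_squared(pm10, [a + b * x for x in built_up])
```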
LUR itself is especially useful in areas that are difficult to monitor or have an unsatisfactory density of the monitoring station mesh. Therefore, LUR air pollution modelling is used for assessment all over the world, especially in Europe, America, Australia, and Asia.
Van den Bossche et al. [21] used mobile monitoring to gather data at a high spatial resolution in order to build LUR models for the prediction of annual average concentrations of black carbon (BC). The overall predictive performance was low due to input uncertainty and a lack of predictive variables. The authors highlighted the need to validate with independent data that are excluded from variable selection during model building, and the importance of using an appropriate cross-validation scheme to estimate the predictive performance of the model. LASSO, a regularized linear model, performed slightly better than the classical supervised approach, and the nonlinear SVM technique did not show significant improvement over the linear model. The generalization of the LUR model to areas where no measurements were made was limited, especially when predicting absolute concentrations.
Lee et al. [22] developed LUR models for particulate matter in the Taipei Metropolis, which has a high density of roads and strong industrial, commercial, and construction activity. It was possible to achieve R² values of 95% (PM2.5), 96% (PM2.5 absorbance), 87% (PM10), and 65% (PMcoarse). Local traffic, construction, residential land use, and industrial sources were identified as the causes of PM2.5 pollution. A variable representing the river vicinity decreased PM2.5 pollution. PM2.5 absorbance levels were increased by local traffic and by commercial and industrial land use. Increased concentrations of PMcoarse were caused by elevated motorways.
LUR input data have a spatial character. Geographic information systems are a suitable tool for their processing and management, especially when using remote sensing data. Hsu et al. [23] used GIS and remote sensing to develop ten regression models for the concentrations of PM2.5-bound compounds based on measurements from a six-year period. The regression models covered NH₄⁺, SO₄²⁻, NO₃⁻, OC, EC, Ba, Mn, Cu, Zn, and Sb. The explained variance (R²) of the LUR models ranged between 0.60 and 0.92. In the course of the study, the authors were able to successfully estimate the fine spatial variability of PM2.5 and its compounds in Taiwan. Traffic distribution, industrial areas, greenness, and culture-specific PM2.5 sources, such as temples, were used as inputs. The main variables determined by the LUR model that affect PM2.5-bound concentrations are traffic, industrial areas, and greenness.
Wu et al. [24] assessed the influence of surrounding greenness on the concentrations caused by local culture-specific emission sources (Chinese restaurants and temples) within the city of Taipei using LUR modelling. A correlation analysis of the LUR PM2.5 model was carried out. A strongly negative correlation (r: −0.71 to −0.77) between NDVI and PM2.5 concentrations was detected. Temples (r: 0.52 to 0.66) and Chinese restaurants (r: 0.31 to 0.44) were positively correlated with PM2.5 concentrations. The result was confirmed by cross-validation with an R² of 0.90 and by external validation with an R² of 0.83; the adjusted model R² was 0.89.

Artificial Neural Networks
Artificial neural networks (ANNs) are a mathematical abstraction of the biological processes that take place in animal brains. An ANN consists of a set of interconnected nodes (neurons) that loosely simulate the neurons of animal brains; the connections represent synapses. Each neuron receives signals via its input connections, processes them, and sends a signal further via its output connections. The signal is a real number. In each neuron, the weighted input signals are summed and transformed using a nonlinear function (activation function). The result is the output signal of the neuron. Each input connection has a weight, which increases or decreases the importance of the signal. The neuron can also have a bias, which represents the sensitivity of the neuron to the signal; the bias value is added to the summed input signal. Neurons are usually aggregated into layers, and the layers can have different activation functions. In each neural network, the signal is propagated from the input layer through one or more layers of neurons to the output layer, which provides the output signal. The signal flow can be one-directional or contain loops (McLachlan et al. [25]). Neural networks can be trained on sets of examples consisting of pairs of inputs and desired results. From a mathematical point of view, training is an optimization process that optimizes the performance of the neural network. The performance is usually defined as the difference between the desired results and the neural network outputs. The independent variables of the optimization are the weights and biases of the neural network. Successive adjustments of the weights and biases give increasingly accurate results. The training of the network is terminated after a sufficient number of adjustments. This process is called supervised learning. There are also types of neural networks that are trained on data not containing the desired results. This process is called unsupervised learning.
Network training is a time-consuming process demanding a lot of computational power. Once the training process is completed, the neural network is able to compute the output value when provided with an input dataset. Mathematically, this is the evaluation of an explicit mathematical formula, a quick and simple operation.
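The evaluation of a trained network can be sketched as follows for a network with one hidden layer and a linear output neuron. The weights, biases and function names are hypothetical and serve only to show that the forward pass is a short explicit formula.

```python
import math

def logistic(s):
    """Logistic (sigmoid) activation function."""
    return 1.0 / (1.0 + math.exp(-s))

def mlp_forward(x, hidden_weights, hidden_biases, out_weights, out_bias,
                activation=math.tanh):
    """Forward pass of a one-hidden-layer perceptron with a linear output."""
    hidden = []
    for weights, bias in zip(hidden_weights, hidden_biases):
        s = sum(w * xi for w, xi in zip(weights, x)) + bias  # weighted sum plus bias
        hidden.append(activation(s))                          # nonlinear transform
    return sum(w * h for w, h in zip(out_weights, hidden)) + out_bias

# Hypothetical network: 2 inputs, 2 hidden neurons, 1 output.
W1 = [[0.5, -0.2], [0.1, 0.3]]
b1 = [0.0, 0.1]
W2 = [1.0, -1.0]
b2 = 0.5
y_tanh = mlp_forward([1.0, 2.0], W1, b1, W2, b2)
y_logistic = mlp_forward([1.0, 2.0], W1, b1, W2, b2, activation=logistic)
```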

Gaussian Models
The Analytical Dispersion Modelling Supercomputer System (ADMOSS) was developed at the VSB-Technical University of Ostrava (VSB-TUO) to perform air pollution modelling in large areas [26]. The ADMOSS is based on a combination of geographic information systems (GIS), a mathematical model of air pollution dispersion, and parallel computer clusters. The modelling methodology implemented in the ADMOSS is the Gaussian model SYMOS'97 recommended by the Czech legislation [27]. The ADMOSS is independent of the modelling methodology and able to run with other models of the same class. Studies focusing on air pollution modelling and assessment in the Czech-Polish-Slovak borderland (Air Silesia [28], AIR PROGRES CZECHO-SLOVAKIA [29], AIR TRITIA [30]) were carried out using the ADMOSS. This study is based on the Gaussian modelling results of the AIR TRITIA project.

Input Factors of the Land Use Regression
Three groups of factors, i.e., emission factors, results of air pollution dispersion modelling, and land cover factors, were considered in the study. Each modelling result factor was read out as a value at pollution monitoring sites. Factors of land cover and factors representing emission were calculated in a similar manner. Based on the experience from Bitta et al. [31], all factors representing land cover and emission were enumerated as weighted averages, where the weights were defined by the estimated probability of the wind direction.
The whole area of interest was divided into 46 areas with respect to the terrain configuration. Each area had its own unique meteorological dataset. Buffer zones around the monitoring stations were divided into eight slices representing eight wind directions. The factors were calculated as averages of the values calculated in the slices, weighted by wind direction probability. The outlines of the division polygons and the wind direction probability graphs are visualized in Figure 1. All factors considered in the study are listed in Table 1. Tests determining which buffer perimeters are the most representative for the LUR are described in Section 4.1.
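The weighting described above can be sketched as follows. The slice values and probabilities are invented, the function name is hypothetical, and normalizing by the sum of the probabilities (for example, when calm conditions are excluded) is an assumption rather than a detail given in the text.

```python
def wind_weighted_factor(slice_values, direction_probabilities):
    """Average of per-slice factor values weighted by wind direction probability."""
    if len(slice_values) != len(direction_probabilities):
        raise ValueError("one probability is needed per slice")
    total = sum(direction_probabilities)
    weighted = sum(v * p for v, p in zip(slice_values, direction_probabilities))
    return weighted / total

# Hypothetical share of forest (%) in the eight slices N, NE, E, SE, S, SW, W, NW.
forest_share = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
# Hypothetical wind direction probabilities for the area containing the station.
wind_prob = [0.05, 0.10, 0.05, 0.05, 0.10, 0.35, 0.20, 0.10]
forest_factor = wind_weighted_factor(forest_share, wind_prob)
plain_mean = wind_weighted_factor(forest_share, [1.0] * 8)
```

With the prevailing south-westerly wind in this invented example, the weighted factor is higher than the plain mean of the slices, because the heavily forested slices receive the largest weights.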

Neural Network-Based Regression
Neural networks with supervised learning can be utilized as a universal regression technique, an alternative to standard statistical methods. In the case of the LUR model, the set of input factors represents input data, and the pollutant concentration is the desired output. The neural network is trained to provide the desired results from a given list of input factor values.
When neural networks are used for such a task, there is a severe risk of overfitting. Overfitting occurs when a model corresponds too closely or exactly to a particular set of data and may, therefore, fail to fit additional data or reliably predict future observations (Figure 2). To avoid overfitting, we limited the maximum number of input factors to five and used five-fold cross-validation to assess the suitability of the current dataset selection and neural network configuration. k-fold cross-validation is a technique that splits the initial data sample into k equally sized subsamples. One subsample is kept for model testing, and the remaining k − 1 subsamples are used as training data. The training process is repeated k times, so that each subsample is used once as test data. The performance of the neural network is represented by the average performance of the k trained neural networks [25].
For the analysis, we used multilayer perceptron (MLP) neural networks. An MLP is a class of feedforward ANNs. It is a one-directional neural network consisting of three layers of neurons: the input layer, the hidden layer, and the output layer [32]. In the experiments, we used MLPs in which the number of neurons in the hidden layer ranged from 1 to 30, with two possible activation functions: the logistic function and the hyperbolic tangent.
To determine the best neural network configuration for each group of selected factors, we tested all possible neural network configurations with five different random five-fold cross-validations. With 30 hidden layer sizes, two activation functions, five cross-validation repetitions and five folds each, this required training 30 × 2 × 5 × 5 = 1500 neural networks per group of factors and comparing the performance of each combination of hidden layer size and activation function. The performance parameter was the R² estimate based on the average R² of the cross-validation neural networks. When the optimal neural network configuration was determined, the final network with those parameters was trained on the whole dataset. The R² performance value of this neural network represents the quality of prediction based on the selected input factors.
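The splitting step of k-fold cross-validation can be sketched as follows. The function name and the fixed random seed are assumptions; since 47 stations cannot be divided into five exactly equal parts, the folds in this sketch differ in size by at most one.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists; each sample is used for testing exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most one
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test

# 47 monitoring stations split into five folds.
splits = list(k_fold_indices(47, 5))
fold_sizes = [len(test) for _, test in splits]
```

A model would be trained on each `train` list and scored on the matching `test` list, and the k scores would then be averaged.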
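The size of this search can be reproduced by enumerating the configuration grid. The sketch below only counts the networks and does not train them; the variable names are hypothetical, and the activation labels follow the names used by scikit-learn's MLPRegressor, which is an assumption about the implementation.

```python
from itertools import product

hidden_sizes = range(1, 31)            # 1 to 30 neurons in the hidden layer
activations = ("logistic", "tanh")     # the two activation functions tested
n_repeats = 5                          # different random cross-validation splits
n_folds = 5                            # folds per cross-validation

configurations = list(product(hidden_sizes, activations))
cv_networks = len(configurations) * n_repeats * n_folds
networks_per_factor_set = cv_networks + 1   # plus the final network trained on all data
```

The 60 configurations, each evaluated on 25 train/test splits, give the 1500 cross-validation networks; the final network trained on the whole dataset brings the count to 1501 per set of input factors.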
A neural network-based LUR model was developed to estimate spatial and temporal variability of nitrogen dioxide (NO2). First, the standard LUR model was elaborated to identify significant variables. Second, a deep neural network algorithm was applied to the LUR results to fit the model for predicting concentrations. Lautenschlager et al. [33] focus on the development of the OpenLUR platform. This platform combines the LUR modelling technique with machine learning and open datasets.
Alam and McNabola [34] extended the typical LUR approach with ANNs. First, the average daily PM10 concentrations in Vienna and Dublin were estimated using multiple linear regression (MLR) modelling. Second, an ANN was used to handle the nonlinearity of the input variables. The best results, R² = 66% for Vienna and 51% for Dublin, were reached owing to the ANN.

Data Sources
2015 data were used in the study. The study area is the area of interest of the Air Tritia project [35]. The Air Tritia project focuses on air quality modelling and assessment. Emission and pollution monitoring data were collected and published in this project. The Tritia region consists of two Polish voivodships (Silesian, Opole), the Moravian-Silesian region in the Czech Republic and the Žilina region in Slovakia. Figure 3 demonstrates the position and size of the study area.
The level of air pollution was determined by the vast hard coal deposits of the Upper Silesian Basin, which have been mined since the 18th century. The presence of coal enabled the growth of industries using coal as a source of energy or as feedstock (coal mining, coal processing, steel production, industrial chemistry), downstream industries (e.g., machine industry) and coal-based heat and electricity production (utility, C&I and domestic scale). The combination of industrial production, services required by the densely populated surrounding urban settlements, and the unfavorable basin-like terrain configuration is the cause of severe air pollution. The EEA marks the Tritia region, with a population of approximately 7.5 million inhabitants, as one of the most air polluted regions of the EU (Figure 4). The most problematic pollutants are particulates (PMx), PAH (polyaromatic hydrocarbons), and heavy metals (Hg, Cd, As) [36].

Air Pollution and Meteorological Data
Pollution monitoring data consist of yearly averages of PM10 (Figure 3) in 2015, collected from all pollution monitoring sites in the Tritia region. The values and site locations were obtained from yearbooks of the Czech Hydrometeorological Institute [38], the Slovakian Hydrometeorological Institute [38] and the Inspectorates for Environmental Protection of the Silesian and Opole Voivodships [39,40]. There were 47 air pollution monitoring stations measuring the PM10 concentration in the study area. Yearly average values are presented in Table A1.
Meteorological data were obtained from the Air Tritia project dataset [41].

Emission Data
Emission data were obtained from the emission database provided by the Air Tritia project. The data were divided by the country of origin (Czech Republic, Poland, Slovakia) and type of pollution source (industrial, domestic heating, car traffic). Industrial sources were represented as point sources, and the input data contained the following parameters: position, average emission flow, height, exhaust diameter, speed, volume flow, and the temperature of exhaust gases. Domestic heating was represented by area sources, which were squares with a 200 m side described by a position, average emission flow, and height. Car traffic sources were modelled as linear sources with a length ≤50 m described by a line description, average emission flow, and height. Brief emission statistics are presented in Table 2, and the emission squares are shown in Figure 5 [30].

Gaussian Model Results
The results of the Gaussian model were obtained from the Air Tritia project. The long-term SYMOS'97 model was used in the project. Meteorological conditions were standardized by the wind speed (low, medium, strong), wind direction (eight directions + calm), and atmospheric stability (Bubník-Koldovský classification, five classes) in the long-term model. Each combination of the meteorological parameters was calculated separately, and the annual average concentration was calculated as a mean of those values weighted by the probability of occurrence of such weather conditions. The emission data entered modelling in the form described in the section above [13,42].

Land Use Data
Land use data were taken from vector topographical datasets, namely, ZABAGED (CZ), BDOT (PL), and ZBGIS (SK), which are all available at a 1:10,000 map scale (CUZK [43], GUGiK [44] and GKU [45]). Five kinds of land cover were used for the analysis, i.e., built-up areas, forested areas, grass-covered areas, water bodies, and open soil agricultural areas.

Experiments and Results
The modelling experiment was structured according to the flowchart in Figure 6. The first step of the modelling experiment was to calculate possible input factors based on emission data, land cover data, and Gaussian model results. The number of possible input factors for LUR needed to be reduced. For this reason, the second step of the experiment was factor preselection. When a proper set of possibly eligible factors was selected, LUR modelling followed.
Four sets of LUR models were constructed. Standard LUR models using linear regression were constructed for each combination of five or fewer input factors. The first set of input factors consisted of emission factors and land cover factors, and the second set of input factors comprised Gaussian model factors and land cover factors. Neural network-assisted LUR models were also developed for the same two groups of input factors. Those models replaced the linear regression of LUR models with nonlinear regression provided by the neural network.
The best model was selected from each of the four calculated model groups; it represented the best available result for the given combination of input factor group and regression technique. The R² value of the models was chosen as the quality parameter for the result of each model.
All calculations and data analysis were performed in the Python 3.4 programming language. The pandas module was used for tabular data handling, the statsmodels module for basic statistical analysis, the matplotlib module for graph generation, and sklearn (scikit-learn) for neural network analysis. All the modules above are part of the Anaconda distribution of the Python language [46], which was utilized for the calculations. arcpy, the Python API of the ArcGIS Pro 2.6 software, was used for spatial analysis and map generation in the Python environment.
The modelling experiment entailed a large number of mathematical computations, which were significantly accelerated by parallel computing on the CESNET Metacentrum [47] and the "Govorun" supercomputer at JINR [48], where hundreds of neural networks could be trained simultaneously.

Factor Preselection
It was necessary to determine the best buffer zone perimeter for each emission and land cover factor. An analysis based on Spearman correlation coefficients was used. Factor values were calculated for distance perimeters ranging from 200 to 10,000 m with a 200 m step. Spearman's correlation between each factor and the 2015 yearly average PM10 concentration at the monitoring sites was calculated (Figure 7). The best perimeters used in further analysis were the distances at which Spearman's correlation had its local or global maximum. Table 3 lists the selected factors and the best perimeters. It also includes the factors representing the Gaussian model results, which were read out from the model results at the locations of the monitoring stations.
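The perimeter selection can be sketched as follows. Spearman's coefficient is computed as the Pearson correlation of ranks; the factor values are invented, the function names are hypothetical, and picking the single perimeter with the largest absolute correlation is a simplifying assumption (the study also considered local maxima).

```python
def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def best_perimeter(factor_by_perimeter, pm10):
    """Return the perimeter whose factor values correlate most strongly with PM10."""
    return max(factor_by_perimeter,
               key=lambda r: abs(spearman(factor_by_perimeter[r], pm10)))

# Made-up factor values at five stations for three perimeters (m).
pm10 = [20.0, 25.0, 30.0, 35.0, 40.0]
factor = {
    200: [3.0, 1.0, 4.0, 2.0, 5.0],
    400: [1.0, 2.0, 3.0, 5.0, 4.0],
    600: [1.0, 2.0, 3.0, 4.0, 6.0],
}
chosen = best_perimeter(factor, pm10)
```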

Linear Regression-Based Land Use Regression
Linear regression LUR models were developed using three different input factor groups, namely, emission factors, land cover factors (Table 3), and Gaussian model result factors. All the following models were limited to a maximum of five input factors to avoid overfitting during linear regression calculations on the dataset containing 47 monitoring sites. First, models containing any subset of emission and land cover factors were tested (Figure 8). There were 1585 linear models tested. The quality of the models was determined by the coefficient of determination of the model (R²). Second, the best linear model was selected among the linear models constructed using land cover factors and the Gaussian model results for industrial sources (MOD_IN), domestic heating (MOD_DH), and car traffic (MOD_CAR). A total of 465 models containing up to six input factors were constructed. The quality of the models was also determined by their coefficient of determination (R²) (Figure 10). The best model found contained the following factors: built-up areas within a distance of 8000 m, forested areas within a distance of 400 m, grass-covered land within a distance of 5000 m, and all three Gaussian model results (Figure 11). This model has R² = 0.652 and the formula:
$$\mathrm{PM}_{10} = 28.7064 + 0.2457\,\mathrm{BLD}_{8000} - 0.0870\,\mathrm{FRS}_{400} + 0.0459\,\mathrm{SOIL}_{5000} + 2.9565\,\mathrm{MOD}_{IN} + 2.6991\,\mathrm{MOD}_{CAR}$$
Figure 11. Observed PM10 concentrations vs. best model predictions.
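Applying such a model at a new location amounts to evaluating the formula. The sketch below uses only the terms and coefficients that appear in the printed formula; the function name, the argument names and the sample factor values are hypothetical, and the units of the factors are not specified here.

```python
def pm10_best_linear_model(bld_8000, frs_400, soil_5000, mod_in, mod_car):
    """Evaluate the printed regression formula for the annual mean PM10."""
    return (28.7064
            + 0.2457 * bld_8000
            - 0.0870 * frs_400
            + 0.0459 * soil_5000
            + 2.9565 * mod_in
            + 2.6991 * mod_car)

# Hypothetical factor values for one location.
estimate = pm10_best_linear_model(bld_8000=20.0, frs_400=10.0, soil_5000=30.0,
                                  mod_in=0.5, mod_car=1.0)
```

The signs of the coefficients can be read directly from the function: built-up area and modelled concentrations raise the estimate, while forest cover within 400 m lowers it.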

Neural Network-Assisted Land Use Regression
Two experiments with neural networks were conducted. In the first experiment, all possible combinations of emission and land cover factors were tested, with the number of selected factors being ≤5. A total of 1585 combinations were tested. In Figure 12, each point represents the performance of the neural network models, and the most efficient models for each number of input parameters are highlighted. In the second experiment, all possible input factor combinations of Gaussian model results and land cover factors were tested. The number of input factors was again limited to five, and 381 combinations were tested. In Figure 14, each point again represents the performance of the neural network models, and the most efficient models for each number of input parameters are highlighted. For each tested combination of input factors, 1501 neural networks had to be trained: 1500 cross-validation networks and one final network trained on the whole dataset. In total, approximately 2.95 million neural networks needed to be trained.
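The reported counts can be checked with a short calculation. The numbers of combinations are consistent with pools of 12 and 9 candidate factors; these pool sizes are inferred from the counts and are an assumption, not values stated in the text.

```python
from math import comb

def n_factor_subsets(n_factors, max_size):
    """Number of non-empty subsets with at most max_size factors."""
    return sum(comb(n_factors, k) for k in range(1, max_size + 1))

emission_land_cover = n_factor_subsets(12, 5)   # assumed pool of 12 factors
gaussian_land_cover = n_factor_subsets(9, 5)    # assumed pool of 9 factors
networks_per_combination = 1501
total_networks = (emission_land_cover + gaussian_land_cover) * networks_per_combination
```

The same assumed pool of 9 factors with up to six inputs gives 465 subsets, which matches the number of linear models reported for the Gaussian-based factor group.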

Discussion
Two sets of input data were selected for the LUR model construction: the first dataset consisted of emission factors and land cover factors, while the second dataset comprised Gaussian model results and land cover factors. The LUR model construction was performed via two techniques: the first one was standard linear regression, and the second one was the multilayer perceptron neural network. A set of LUR models was created for different selections of input factors (≤5) in each of the four experiments, determined by the input dataset and model construction technique. The best model in each experiment was defined by the R² score of the result (Table 4). A comparison of different model performance measures of all four best models is presented in Table 5. Nearly all model performance measures indicate a better performance of the neural network models. The only exception is the normalized mean bias, which shows a slight overestimation or underestimation by the neural network models, while the linear regression models show almost zero values. This is the natural behaviour of linear models: a least-squares fit with an intercept has residuals that sum to zero on the data it was fitted to.
All the best models provide significantly improved results over the pure Gaussian model. The R² of the Gaussian model was 0.485, and its results compared to the measurements are shown in Figure 16. Both linear regression-based LUR models provided a similar performance: the R² was 0.639 for the emission factor linear model and 0.652 for the linear model based on the Gaussian model results. These results are similar to those of other LUR models, e.g., 0.65 in Bitta et al. [31], 0.58 in Liu et al. [49] or 0.66-0.76 in Masiol et al. [50]. A more detailed comparison of these studies with the model of this study is not possible, since each model used a different set of variables, different time scales, etc. Gaussian model results were more accurate at the monitoring sites representing urban or industrial sites. The LUR model, which included land cover factors, provided better results in rural and natural environments.
Both neural network-based LUR models performed significantly better than their linear regression-based counterparts. Both models showed similar results with R² of 0.937 and 0.938. A better performance of neural network-based models is the expected result, since nonlinear regression provided by neural networks can better reflect the generally nonlinear nature of pollution dispersion.
One can also compare the performance of linear regression and neural network models with the same input data. This comparison is provided in Tables 6 and 7. It is clear that neural network models provide significantly more accurate results than linear regression models. Nonlinear regression provided by neural network models performed best with input factors that had slightly worse results in the linear regression model than the best possible linear regression model inputs. Conversely, inputs that provided the best linear regression model results slightly underperformed when they were used as inputs of the neural network model. This suggests that some factors better reflect the nonlinear behaviour of air pollution dispersion.
Such high R² values can have several explanations. The selected method of model construction may genuinely provide high quality results, or the results may be distorted by overfitting (even though we tried to avoid it) or by the monitoring station mesh being dominated by urban background stations (42 out of 47 stations). Higher variability among monitoring stations would definitely bring higher confidence in the results.
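For reference, the normalized mean bias can be computed as in the following sketch, which uses the common definition (sum of prediction errors divided by the sum of observations). The data and the function name are invented, and the exact definition used for Table 5 is an assumption.

```python
def normalized_mean_bias(observed, predicted):
    """NMB = sum(predicted - observed) / sum(observed)."""
    return sum(p - o for o, p in zip(observed, predicted)) / sum(observed)

observed = [20.0, 30.0, 40.0]
unbiased = [22.0, 27.0, 41.0]        # errors +2, -3, +1 cancel out
overestimating = [22.0, 33.0, 44.0]  # every value is 10 % too high
nmb_unbiased = normalized_mean_bias(observed, unbiased)
nmb_over = normalized_mean_bias(observed, overestimating)
```

A model whose errors cancel out has a normalized mean bias of zero even when individual predictions are inaccurate, so this measure should be read together with R² and the other measures.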
For each LUR model construction technique, both input factor data sets provided a similar quality of results. The question arises whether Gaussian dispersion modelling is a necessary step. The Gaussian dispersion model in this case was highly computationally intensive. It took several processor years of processing, which could have been avoided. This hypothesis also requires further investigation on different datasets.
LUR models are empirical models that are constructed as statistical models based on the data provided. Each LUR model represents only a specific time and space interval, which is reflected in the input data. The model formula and coefficients implicitly represent the effects of general phenomena that affect air pollution. Meteorological factors that demonstrably influence pollution dispersion, such as precipitation, temperature, thermal stability, etc., are typical examples of factors that are implicitly reflected in LUR model coefficients. It also means that a different time period or a different area of interest produces its own set of LUR model coefficients. The goal of LUR modelling should not be to provide one universally applicable model. The goal is to provide an algorithm that generates the LUR model that best fits the selected time period and area of interest.

Conclusions
We managed to significantly improve the performance of the standard linear regression-based LUR model by replacing the linear regression algorithm with the multilayer perceptron neural network. This is an innovative approach with the potential to further improve LUR modelling. The modelling technique described in the study requires further investigation and development. The goal should be a standardized modelling algorithm and standardized input datasets that provide credible results for decision makers.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The