Predicting the Spatial Distributions of Elements in Former Military Operation Area Using Linear and Nonlinear Methods Across the Stavnja Valley, Bosnia and Herzegovina

: This study has the purpose of developing a realistic soil prediction maps of the spatial distribution of elements by evaluating and comparing different modelling techniques: Kriging, artificial neural network ‐ multilayer perceptron (ANN ‐ MLP) and multiple polynomial regressions (MPR). The Stavnja Valley was selected as a test area due to the following reasons: (1) intensive metal ore mining and metallurgical processing; (2) peculiar geomorphological natural features; (3) regular geological setting, and (4) the remaining minefields. Geospatial parameters from digital elevation models (DEM) are used as an input to advanced prediction modelling techniques: ANN ‐ MLP and MPR. Soil measurements, land use data, and remote sensing are applied, developed, and finally incorporated into the models of spatial distribution in the form of 2D or 3D maps. In order to reconstruct the different processes that influenced the entire study area simultaneously, we used novel approaches to modelling. This comprehensive approach not only represents an innovation in contamination mapping, but developed prediction models also help in the reconstruction of main distribution pathways, assess the real size of the affected area, and improve the data interpretation.


Introduction
The dynamic equilibrium between erosion, transport, and deposition that occurs in most natural river systems [1], can be interrupted by human activities. Many worldwide rivers are significantly affected by multiple metals contaminants due to past and present mining activities, [2]. Metal containing particles are quite often transported over large distances by water affecting the alluvial sediments, even several hundred kilometres away from the main sources of contamination [3][4][5][6]. Metal ore mining and processing lead to some of the largest releases of trace elements causing a considerable impact on the surrounding environment. Obsolete metal production technologies used for mass production generated an enormous quantity of mining waste which is considered hazardous due to the presence of toxic metals [7,8]. Important sources of trace elements in the metal industry include different stages, starting from mining and milling operations, concentrating, ore transporting, disposal of tailings, wastewater, followed by the smelting process that includes concentrating, sintering, atmospheric discharges, and blowing dust [9]. It is known that many ore minerals accommodate various other trace metals as minor inclusions, causing contamination with major and minor ore metals in the areas surrounding mines and smelters [10][11][12].
The environmental concern in mining areas is primarily related to physical disturbance of the surrounding landscape, mine tailings spill, emitted dust and acid mine drainage (AMD) transported into rivers. The adverse effects of mining and smelting activities in the environment have been addressed by many authors in many parts of the world [13][14][15][16][17][18][19][20].
Great potential for improvement of predictive soil mapping (PSM) techniques and understanding of the relationships between soils and the environment has been created after the increase in computer efficiency and capacity, geo-information technology, data availability, and demand accurate and reliable maps [21][22][23][24].
This requires the creation of spatial soil information based on numerical models, which account for the spatial and temporal variations in soil properties based on soil information and related environmental variables [25].
The Stavnja river valley has been known for intensive mining and metallurgical activities for more than 100 years. Besides these activities, the valley has been selected for the study due to several other reasons such as (1) unique geological setting, especially the metallogenic belt, (2) the geomorphological characteristics that affect the contamination halo, and (3) remaining minefields from the last war, 1992-1995. The anthropogenic impact in the municipality of Vareš can be reflected through the three abandoned iron ore deposits (Smreka, Brezik, and Droškovec), abandoned Pb-Zn-Ba deposit Veovača and abandoned ironwork Vareš, and brown coal mining in the southern part of the study area. High concentrations of particular elements are released into the environment not only by anthropogenic activities but also by natural erosion and weathering reactions of parental rocks that add to the complexity of the environmental assessment [26]. Problems of contamination along the river can be demonstrated as anisotropic appearance between the layered (isotropic) lithological units, which can be solved using advanced linear and nonlinear modelling techniques instead of denser sampling grid which is restricted by the remaining minefields at 5%-6% of the study area.
The multivariate techniques such as linear and nonlinear are widely used in geology. For example, mapping soil organic matter, mapping presence/absence elements or mapping physical property of model, asses hazard and contaminated area [27,28]. The main purpose of this study was to develop a visualization model of spatial distribution using linear and nonlinear mathematical methods that combine a sparse chemical analysis and various geospatial parameters in Stavnja Valley (Bosnia and Herzegovina) but in the same time to evaluate and compare the performance of different modelling techniques: Kriging, artificial neural network-multilayer perceptron (ANN-MLP), and multiple polynomial regressions (MPR). For this reason, five specific purposes of the present study were: (1) to include various geospatial parameters in linear and nonlinear modelling; (2) to identify main distribution pathways and change in concentrations according to the distance from the source of contamination; (3) model verification; (4) to assess the real size of the affected area; and (5) to provide an optimal methodology for geochemical researchers in areas with restricted sampling conditions.

Description of the Study Area
The physiography of the region displays a high variety. Contrasting landscapes are found from the mountain environment in the northern part of the study, more than 1000 m altitudes and in the southern part about 400 m. The river Stavnja has a length of about 35 km, located in Central Bosnia and Herzegovina, northern from the capital Sarajevo ( Figure 1). Approximately 30,000 inhabitants live in this Valley, mostly settled into two small cities Vareš and Breza. The study site has been the place of military operation during the last war (1992)(1993)(1994)(1995). In the valley, three ethnic groups, namely Orthodox, Catholics, and Bosniaks, lived close to each other, which resulted in massive conflict, destruction, killing, and burning. According to the available maps, between 5 and 6 km 2 of the study area is covered by the minefields. The valley is known for intensive mining and metallurgical activities for more than 100 years. In the municipality of Vareš, three abandoned iron ore deposits (Smreka, Brezik, and Droškovec) are located, in which estimated reserves and resources were approximately 170 million tonnes in 1991. Apart from the iron deposits, Eastern from the city Vareš, the multi-element (Lead-Zinc-Barite) open pit Veovača is located. Southern from the urban zone sits the ironworks Vareš, which construction began in 1889, after Austrian-Hungarians arrivals to Bosnia and Herzegovina. A city Breza with surrounding settlements developed in the South of the study area that is known as brown coal mining (open pits and underground). This coal basin belongs to the Central Bosnian coal basin that lies along the river Bosna.
The outcropping stratigraphic sequence exposes rock formations spanning from the Palaeozoic ages embedded into Upper Triassic to more recent times. Ten major lithological units are present in the study (Figure 1). The river Stavnja perpendicularly crosses the lithological units, from the oldest ones in the north to the youngest in the south. Younger Quaternary layers are developed along a present watercourse or something higher in gravel-conglomerate terraces [29]. The study area is a part of the Durmitor nappe. The most important geotectonic unit of Vareš metallogenic district belongs to the Central ophiolite zone. Vareš, siderite-hematite sedimentary exhalative (SEDEX) deposits at Smreka, Droškovac, and Brezik are locus typicus mineralization of the Mid-Triassic. The deposits contain hydrothermal, stratiform siderite-hematite-chert beds. The mineralization forms part of the Anisian and Ladinian sequences and displays distinct vertical zoning, reflecting a gradual change of redox conditions in the depositional environment. The sequence starts with bituminous, thinly bedded shales with pyrite and base metal sulphides, overlain by barite and siderite, deposited under reducing conditions. Overlying clastics and oolithic limestone are succeeded by hematite shale, hematite ± chert beds, deposited in an oxidizing environment. Major minerals are siderite, manganese-rich hematite, barite, pyrite, marcasite, chalcopyrite, galena, sphalerite, tetrahedrite, and Pb-sulphosalts. Veovača Pb, Zn, Ba deposit contains ore-breccia or ore conglomerates with dm-to m-sized clasts cemented by barite and Pb-Zn sulphides. Microcrystalline dark barite is accompanied with galena and sphalerite. The barite from Veovača is typical for Triassic SEDEX deposits elsewhere in the Dinarides [30].

Sampling Design, Sample Preparation, and Analytical Methods
According to the primary purpose of research, previous sampling experience [18,19,26,[31][32][33][34] and facing other difficulties and challenges (such as the narrow gorge, inaccessible terrain, and minefields) any regular sampling grid couldn't be used. Considering all the above-mentioned, the sampling grid had been designed to cover the entire Valley with a sampling grid in order to determine: a change in altitude, distance from the main sources of pollution, a transport mode (atmospheric or river transport), and shape of the contamination halo areole. In the urban and industrialized areas, the sampling grid is denser (Figure 2). At the same time, the sampling grid has been created so that all main lithological units are covered with a significant number of samples which is necessary for the determination of bedrock influence. Each sampling site is recorded by the central point position with a GPS device. Whereas the reliability of the mine maps is relatively low (perhaps between 70-80%), we put special emphasis on integration and communication with local people who lived there before and after the war.
At places of intensive military operations, many minefields still remain, especially parts of strategic importance such as roads, industrial objects, bridges, etc. According to our experience, before sampling important following steps should be considered: (1) to learn as much as possible about the sampling site (political situation, ethnic groups); (2) to have a map with remaining minefields and possible minefields; (3) to talk to the local people, and (4) be aware of any object that looks like a mine, such as the wire or small tins, etc. According to our experience, such sampling should be done in a group of two or three people maximum, if possible, with one local. The distance between two people should be about 40 m, in case that one activates the mine, another can help. One route should be used in both directions: to the sampling site and on the way back. If possible, existing routes or hard surfaces should be used for walking.
The following sampling sites shouldn't be selected for sampling: near burned houses; near the strategic objects; near bridges which were often used as a border between two military sides/ conflict groups; sampling in bushes, high grass, and forests. In the case of soil, sampling is safer to choose the meadows and pastures because they are used for livestock grazing (cows, sheep, goats, etc.). During the digging of soil profiles, everyone should be aware that unexploded mine can be activated. Sampling in such areas is very risky because a spade is stabbed into the depth of 30 cm where it should be kept in mind that one sampling site is composed of a minimum of five subsamples that increase the risk. Similar advice can be applied for the stream sediment sampling. Here is a problem that some unexploded projectiles can be transported far from the minefields and can be embedded in deeper layers. In 2014, Bosnia and Herzegovina suffered tremendously from flood, and in many cases the landmines were dislodged. Dislodging is not affecting only Bosnian people but can create an international problem if floodwaters carry the explosives downstream through southeast Europe. Even the stabilization force (SFOR) several years after the war organized anonymous action for collecting any kind of weapons, but unfortunately, some people decided to cast off, dig, or hide them in surrounding rivers, forests, etc.
Soils are the most widespread sampling material used for their environmental assessment due to their availability and relatively undemanding sampling method. High variability of soil chemistry can be dispatched by taking composite samples. In this study, one soil sample represents the composite material collected at the central sample point itself at five points within a radius of 50 m around it. At each sampling point, topsoil (0-5 cm) and subsoil (20-30 cm) are collected and the mass of such a composite was about 1 kg [26,32]. According to the sampling sites, the collected soils can be genetically divided into two main groups, the automorphic soil, and alluvial soil, respectively. The automorphic soils [35] belong to the Cambisol, type of soil developed on meadow and pastures. Soils developed on the youngest Quaternary material (alluvium) are called fluvisol (FAO classification).
To begin with, the soil samples (n = 222) were air-dried and later completely dried in a fan drier at 40 °C. The dry material was disaggregated in a ceramic mortar and sieved through a 2 mm nylon screen and pulverized to a fine powder [34]. All samples were analysed at the ACME, Ltd. laboratory in Vancouver, Canada [36]. Determination of 36 chemical elements (Ag, Al, As, Au, B, Ba, Bi, Ca, Cd, Co, Cr, Cu, Fe, Ga, Hg, K, La, Mg, Mn, Mo, Na, Ni, P, Pb, S, Sb, Sc, Se, Sr, Th, Ti, Tl, U, V, W, and Zn) was performed by ICP-MS after aqua regia digestion (mixture HCl, HNO3 and water at 95 °C).

Data Processing
For this purpose various spatial data were acquired by digitalization of existing topographic maps (provided by the Geodetic Survey of Bosnia and Herzegovina), Google Earth maps [37], geological maps (provided by Geological Survey of Slovenia and Geological Survey Federation of Bosnia and Herzegovina) and SRTM 90 m digital elevation data (DEM) [38]. Because the topographic maps were amended in the late 1970s, they were combined with the Google Earth maps. The isolation of major lithological units is also an important step for the determination of natural and anthropogenic background. Digitalization, vectorization, additional modification, standardization of databases and integration to GIS environment were performed using Surfer 11 (Golden Software) and QGIS (GNU General Public Licence).
A database includes the following information: (1) the basic descriptive parameters such as the Identification number, Sample label, Sampling material, (2) analytical data (concentrations of elements); and (3) geospatial data such as landuse unit, lithological unit, defined zones, latitude, longitude, absolute distance from the ironworks chimneys, elliptical distance from the ironworks chimneys, distance from the river Stavnja, altitude, altitude above the bottom of Stavnja valley, terrain slope, aspect, plan curvature, profile curvature, tangent curvature, Landsat spectral bands 1-7. For the spatial distribution modelling of particular elements using artificial neural network and multiple polynomial regression, a system of grid cells has been used. For this purpose, the whole study area is divided into a 50 × 50 m grid cell. The total number of grid cells is 41.471, which has been described by the aforementioned attributive parameters [26].

Data Transformation
Transforming data means performing the same mathematical operation on each piece of original data which the main purpose is to reduce the difference between extreme values [39]. It is often observed that environmental variables are Log-normal [40][41][42] or positively skewed [43,44] and data transformation is necessary to normalize such data sets. The Box-Cox transformation represents one of the most frequently used transformations in environmental sciences and geosciences where power transformation is needed [45][46][47][48][49].
The Box-Cox transformation is given by: where y is the transformed value and x is the value to be transformed. The coefficient varies from -5 to 5. All values of λ are considered and the optimal value is the one which results in the best approximation of a normal distribution curve.

Kriging
Kriging is a term that covers a range of methods used to describe spatial optimal linear prediction with minimum errors. The method most used is ordinary kriging. It estimates the value at a region for which a variogram is known, using data in the neighbourhood. The semivariance function used in ordinary kriging is based on the assumption that nearby objects tend to be more similar than those that are farther apart.
The accuracy of values estimated with kriging lies in the fact that every estimated point comes with its estimation variance. Additionally, the spatial structure of the parameters is being observed. The geostatistical method of kriging is widely applied in the mapping of soil attributes in unsampled areas [50]. At the same time, this enables quantification of the main spatial characteristic of soil attributes and spatial interpolation methods considering only the neighbouring points of estimation data [51,52].
The first step of kriging is calculating an unbiased estimator of the semivariance function that is half of the average squared difference between paired data values: is the semivariance at a given distance h; and N(h) is the number of sample pairs in h, and z(xi + h) and z(xi) are sample values at two points separated by h. The semivariogram model is combined with various theoretical models like Gaussian, exponential, spherical, linear and power by the weighted least square method. The spatial variability of the data set and the resulting grid file are also mathematically specified. The interpolation weights, which are applied to the data points during the grid node calculations, are direct functions of the variogram model [53]. Graphical displaying is a final goal of many environmental scientists in the chain of steps that begins with sampling which prime objective is to utilize the power of visual appearance [54].
In this paper, the application of kriging to the spatial distribution of elements has been shown.

Artificial nNeural Network-Multilayer Perceptron
Artificial neural networks (ANN) have received considerable attention over recent years with a wide range of applications, accordingly in soil science research. The method is suitable for dealing with nosy data and modelling the non-linear behaviour of data. The term neural network is inspired by the biological nervous systems. It is composed of many highly interconnected processing elements (neurons or nodes) working synchronously to solve specific problems. An ANN is configured for a specific application, such as pattern recognition or data classification, through a learning process. Learning in biological systems involves adjustments to the synaptic connections that exist between the neurons [55]. The synapses transmit signals to the other neurons which then process it and also signal the neurons nearby.
Rosenblatt [56] with his perceptron, as the first model for learning with a teacher, laid the foundation for the development of a layered feed-forward ANN. Rosenblatt`s algorithm is referred to as a single-layer perceptron. The ANN uses a multilayer perceptron (MLP) which is identified by its architecture, that performs tasks such as function fitting and pattern recognition problems [57]. The learning logarithm as a part of MLP is used for the backpropagation of error, i.e., resilient propagation variation. MLPs are feedforward neural networks (FNNs) with one or more layers of units between the input and output layers [58][59][60].
The output of a neuron is given by: where is the ith input, wi is the link weight from the ith input, w = (wi ... wn) T , x = (x1... xn) T , b is a threshold or bias, and is the number of inputs. The activation function is usually some continuous or discontinuous function mapping the real numbers into the interval (-1,1) or (0,1) [56]. A different function can be used as an activation function but the most used is the sigmoidal activation function. The standard sigmoidal activation function has the following form: 1 1 The success of the method can be laid down for the following reasons: (1) They can model extremely complex systems and due to their nature can be used to model nonlinear natural systems (linearity in the sense of mathematical properties of additivity and homogeneity); when using linear algebra (such as most of the multivariate statistics) to describe nonlinear systems we always have to make an approximations; (2) There is no limitation with the dimensionality of the problem; it can be arbitrary, depending on the CPU speed and memory; and (3) simple and successful operation due to well-developed learning algorithms [61].

Multiple Polynomial Regressions
Multiple regression the term was first used by Pearson 1908 [62], refers to regression application which is designed to handle multiple inputs; predictions are made with a single dependent variable and multiple independent variables. Polynomial regression is a type of multiple regression analysis used when the variables show a curvilinear trend. It is modelled as an m th order polynomial [63]. The general polynomial regression model is given by: ⋯ where, for a set of I observations, Yi is the dependent variable, β0 is a regression constant, β1,β2,…,βm are the coefficients of the X, X 2 ,…, X m independent variables (predictors), m is order of the polynomial, and ƐI is the model error (the difference between observations and predicted values).
Although polynomial regression fits a nonlinear model to the data, the estimation problem is linear because the regression coefficients are linear in the parameters [64]. The regression analysis aims at establishing the values of the parameters of the regression equation and to find the best fit and quantify it with respect to the dependent variable (Y) [59]. Two values, R 2 and adjusted R 2 , describe how well the model fits the data. All regression models provide information based on the influence of the combined interactions of the estimator variables on the response. However, the major conceptual limitation of the regression techniques is that one can only ascertain relationships, but never be sure about the underlying causal mechanism.

Linear Method (MPR) vs. Nonlinear Method (ANN-MLP)
Due to the high cost and time-consuming nature of soil sampling, research in developing methods for the creation of soil maps from sparse soil data is becoming increasingly important. In recent years, the development of prediction methods (linear and nonlinear) that use secondary attributes sourced from the DEM, land use, and remote sensing in combination with sparse and expensive soil measurements has sharpened the focus of the research. Various modelling techniques applied for soil prediction were compared, and the best combination of prediction method and secondary information was chosen. Various modelling techniques help us reconstruct different processes that influenced the entire study area. Simultaneously, they distinguish natural and anthropogenic influences as well as transportation patterns (such as atmospheric or water transport). Their main purpose is not only to isolate the hotspots with the highest concentrations but also to provide a spatial distribution pattern of particular trace elements.
All aforementioned geospatial data were used in two prediction methods, multiple polynomial regression (MPR) and artificial neural network-multilayer perceptron (ANN-MLP), respectively. For both modelling methods, a recall grid was used. Both methods were treated by the same conditions and same software packages, Statistica 11 [65] and Surfer 11 [66]. Modelling by ANN-MLP was done by using a large number of input data, 240 hidden units and 25 training networks. Changes in some input data, a number of neurons, as well as the number of training networks was performed in order to obtain the best models. More neurons and more architecture will acquire better results. Each particular model is trained to 25 networks but only five logical networks were retained, and finally, an average model of five retained networks was calculated. Each training model contains a summary table with following parameters: training perfection, test perfection, validation perfection, all perfection, training error, test error, validation error, training algorithm, hidden activation, and output activation. Modelling by MPR is much more difficult than ANN-MLP. To get a useful model, so many data conversion, corrections, and transformations have to be performed and one must be chosen for the best optimization algorithm [67]. The MPR is a very demanding method, comparing to the ANN-MLP, and much more input data is needed. This method is developed as a controlling method. Their main goal is confirmation of its success, certainty, and superiority over the other methods.
The use of various geospatial parameters is developed over recent years and its use is necessary for the development of the prediction method. But there are two problems that must be considered when ANN-MLP is applied: (1) The use of secondary attributes and calculated parameters in some cases could imply at subjectivity in data processing; (2) Much subjectivity can lead to inappropriate data interpretation due to the fact that the learning process is overloaded. When referring to overloading, it doesn't mean that the numbers of input data are large, but that the final results can only follow one particular parameter such as the slope, aspect, distance from the source of contamination, etc. In such cases, the spatial distributions are completely useless and cannot be considered in data interpretation. Sometimes it is necessary to repeat the whole process of learning or in some extreme cases to remove the parameter from further analysis.

Model Stability and Model Signification
The essence of the whole approach is to create stable and repeatable results under the same conditions. Procedures, approaches, data preparation, and calculations have been developed so long until the spatial distributions that represents the geochemical characteristics of the landscape for both methods become stable, repeatable and realistic. The estimation of model applicability and reliability of prediction lies in the fact that all created models are repeatable. This means that the developed procedures are not random or specific to a particular item or an isolated area, but such a model can be used without a limitation. The method has been critically evaluated according to the significance of the applied transformations, the similarity between models, as well as the stability of predictions. Simultaneously, the MPR method was developed in order to prove the results of artificial intelligence and its superiority. However, the results obtained by linear methods are extremely useful and our theory about the superiority of the nonlinear method was wrong. Comparing the performances of these two different machine learning algorithms, it is distinguished that ANN-MP already provides highly reliable results with a small number of input data, whereas the MPR is a very demanding method requiring more mathematical knowledge and geospatial data.
The regression has been performed for both prediction methods before data transformation (raw data) and after Box-Cox transformation. Model stability and model signification have been tested on the distribution of Pb which represents a dominant anthropogenic pattern in the chosen study area and Ti that represents a natural distribution, mainly related to the one particular lithological unit. Table 1 presents some basic statistical parameters between normal and transformed data for the two selected elements (Pb a Ti) that represent the two most common patterns in the study area, within the raw data and predicted data. It can be observed that the work with raw data doesn't bring predicted results and compositional data in such studies must be transformed. Suitability of using the Box-Cox transformation is reflected in the fact that the distributions of natural values are significantly asymmetric. The negative values for Pb and Ti already appear up to 25 percentiles. Also, the mean and median values are higher than predicted values, but in the case of Pb, it could be explained by the fact that the sampling grid had been denser in areas with dominant human impact. The prediction assurance is measured by the cubic polynomial regression between the predicted and observed values. It should be clear that this measurement has been performed only on ANN-MP and MPR because they depend on (geo) spatial parameters. Contrary to these two modelling techniques, the Kriging method is affected by changes in concentration and distance between the observations. In order to test the normality with skewness, kurtosis, Kolmogorov-Smirnov test, and Chi-Square test between raw data and Box-Cox transformation have been performed. The model quality for MPR is described with the following coefficients: R (correlation between the observed values and predicted values), Multiple R 2 (coefficient of determination), and Adjusted R 2 (adjusted coefficient of determination) ( Table 2). Those parameters determine how well the model fits the data. It can be observed that the difference between raw data and transformed data for Pb is very low, which is conditioned by several extremely high concentrations. For Ti, this difference is higher. The penalize tendency of increasing the R squared when additional predictors are included in the model, the R2-adjusted is used. The ANN-MP model quality is described with different parameters than MPR. Training perfection, test perfection, validation perfection, all perfection is high for both selected elements after the data transformation. Regressions between transformations within the two predictive methods are provided in scatterplots with fitted regression lines in Figure 3. It is evident that Pb distribution (A-D) is tending to show exponential form, but this is affected by several samples with very high levels. The scatterplots for both predictive methods, ANN-MP and MPR (A and C) are showing quite similar tendencies, representing a result of the strong asymmetric distribution. It is typical for anthropogenic elements, where extreme levels deviate from the low levels of parental rocks. Differences between untransformed and transformed data (Box-Cox) are almost negligible for both modelling techniques (B and D). The data are transformed by applying a single mathematical function to all the raw data values, leading to symmetric distribution, stabilizing variance, and making the distribution closer to normal. Similar results are obtained for the model stability evaluation for Ti ( Figure 3E-H). The correlation diagrams for raw data are showing that there is practically no real effective connection between observed and predicted values for both advanced modelling techniques (E and G). After applying the Box-Cox transformation is achieved a significant improvement in both methods (F and H). This can be explained by the fact that geochemical and pedological data arrive in several batches at different levels and a systematic relationship between spread and level is often found increasing level usually brings increasing spread. When this relationship is strong, there are several reasons for transforming the data in a way that reduces or eliminates the dependence spread on level: the transformed data will be better suited for comparison and visual exploration, and may be better suited for common confirmatory techniques, while individual batches become more nearly symmetric and have fewer outliers [68].
The scatterplots between the three applied methods vs. normal and Box-Cox transformation are provided in Figure 4. The scatterplots for normal values between various prediction methods are providing disperse distributions for Pb (A and C), concentrated in the area of lower values. The fitted regression line connects only a small number of calculated values in the area of higher values. But the complete opposite situation is after applying the Box-Cox transformations (B and D). Similar scatterplots are obtained for Ti (E-H). The data cloud is representing the real correlations between data obtained by three different prediction methods. These results are basically a confirmation for the previous claim that methods for its validation must provide similar results. In this case, we are talking about the distribution of chemical elements, but it can be applied for the distribution of any kind of observation.

Spatial Distribution of Pb
Lead is a common anthropogenic chemical element; which concentrations depend on human activities. Distribution of lead within the determined zones and lithological units according to the raw data and prediction methods is presented in Figure A1. The high concentrations in alluvial soil are consequences of long-lasting mining and smelting activities in the upper part of Valley, around the city Vareš. A fine-grain material is transported by the Stavnja and its tributaries from higher to lower landscapes and embedded to alluvial planes. A study on river Taxco has shown that two main processes are responsible for total Pb variation throughout the year; erosion with discharge processes and dilution connected with different grain-size distribution processes. [69]. It is very interesting for an anthropogenic element that concentrations are higher in the subsoil, but it can be explained by the fact that almost all activities stopped more than 20 years ago [70].
Its concentrations are diverse through the study area, but the evident increase is in the industrial zone and alluvial planes. It can be observed that the difference in predicted values within the raw data and prediction method is present. MPR and ANN-MP are giving very similar results. Kriging, on the other hand, is giving values even lower than 50% in contaminated areas and higher concentrations are found in some lithological units where they are not expected. This could be explained with the appearance of circles in spatial distributions maps (called Bull Eye's effect). The Box-Cox transformation used in all three prediction methods brings more accurate results but also improves the symmetry of data distribution and variance stabilization. It could be observed that the anomaly is on quaternary alluvium, where maximal values reach 610 mg/kg (raw data). Along the entire valley, natural enrichment is about 50 mg/kg. This means that human impact increased its concentration between four and 12 times in this lithological unit according to all prediction methods. Predicting the spatial distribution of Pb in soil and identifying hotspots with its high concentration using three prediction methods: (A) Kriging, (B) multiple polynomial regression (MPR), and (C) artificial neural network-multilayer perceptron (ANN-MLP) are presented in Figure 5. These prediction models are showing an arrangement in concentrations across the study area after using Box-Cox transformation and average values of surface soil and subsoil. For a graphical display of spatial distribution, the maps with percentile distribution have been used, where different colours represent different concentration arrangements. The spatial distribution pattern is reconstructing its main pathway in the study area. These models are providing a similar pattern of Pb concentration hotspots, indicating the same source of contamination. In the upper part of the study area, all three models are providing the same shape of contamination areola. This part of the study area is surrounded by steep hills, and the contamination of areola is influenced by the main wind direction, which is N-S. The river transport of clay fine grain is transported down into alluvial sediments. The eroded material has been transported and embedded in the alluvial plain of urban zone Breza, where intensive agriculture occurs. Two advanced methods, MPR and ANN-MLP include lots of geospatial data and those methods are showing a more realistic arrangement in concentrations than models obtained by Kriging, where only sparse soil measurements are included. On the other hand, their similarities are reconfirmation, refinement and at the same time validation of different models. The models are repeatable, what is another success in such modelling.
Determination of the real impact of Pb in the study area can be obtained by comparing the measured data recommended by VROM 2009. According to the VROM 2009, a target value of Pb is 85 mg/kg and the intervention value 530 mg/kg. Its maximum concentrations (880-1700 mg/kg) exceed the intervention values in Zone 1 and alluvial sediments more than three times and target values for 20 times.

Spatial Distribution of Ti
Contrary to the Pb anthropogenic distribution, the level of Ti in soil depends only on a natural background. Distribution of Ti within the lithological units according to the raw data and prediction methods after using the Box-Cox transformation is presented in Figure A2. It is clearly visible that a significant increase in concentrations is related to the specific lithological unit, the Triassic Clastites, spilites, and tuff. Such parental material contains much more Ti than other isolated units in the study area. Considering its tendency of incorporation into the finest clay size particles, Ti has been transported far from the source of contamination by water. The significant increase is present in alluvial deposits fined in the morphologically lower part of the study area. Prediction values obtained by kriging are much lower than values obtained by MPR and ANN-MP in the units with higher Ti level and its main source in the soil. In these two units, the predicted values obtained by MPR are higher than ANN-MP values. As was expected, the artificial intelligence offers the best results.
Prediction models of Ti spatial distribution isolate the highest concentrations in Triassic Clastites, spilites and tuffaceous sandstones, but also along with the entire alluvial sediments (JK clastic carbonate series) ( Figure 6). ANN-MP and MPR provide significant information according to their transportation downriver. In the upper part of the study area, the high concentrations of Ti are conducted to almost all tributaries. The spatial distributions show that materials with Ti particles concentrate on the lower parts, which means, the finest grains are transported and embed into alluvial sediments. The Southern part of the study area (Oligocene Clastite complex, Miocene Clastitc series and Miocene carbonates with coal layers) is enriched too. At several places, kriging provides its concentration in the form of concentric circles, which is incorrect comparing with the other two models.

ANN-MP Model Validation and Applicability in Various Case Studies
From the very beginning of the method development, we wanted to achieve three following attributes: (1) to construct a very stable ANN-MP model; (2) model repeatability, and (3) model application within geochemically and morphologically diverse study areas. This means that once the model is prepared, it must be applicable in other case studies under similar conditions (such as input data, number of hidden units, number of training networks), solving a problem of the spatial distribution of particular trace elements or their geochemical groups. In order to show a validation and ANN-MP model applicability, one test site in Macedonia had been selected.
The study area is in the south-central part of Macedonia, about 100 km south of Skopje. The city is well known by Ferro-Nickel smelter FENI, but also famous for its vineyards and it is the main vine production region in Macedonia. The complete investigated region (360 km 2 ) was covered by 172 sampling sites [71,72].
Construction of geochemical maps using universal kriging methods ( Figure 7A) is quite useful in the determination of distribution patterns, but around the low and high points appear the circles so-called Bull's eye effect. This can only be solved by a denser sampling grid. Applying the ANN-MP ( Figure 7B), the above effect has been avoided and geochemical contamination maps show more realistic distribution, from the geological point of view. According to the presented results, the model stability and applicability could be confirmed, representing a milestone in geochemical map construction.

Conclusion
Along the Stavnja Valley, intensive mining and smelting activities have occurred for more than 100 years. Diverse mineral occurrence, specific geological settings, and geomorphological features make this study area interesting for applying advanced mathematical methods. The study area has been a place of intensive military operation during the last war, 1992-1995, which resulted in numerous remaining minefields, and made this research more complex. According to our experience, such sampling should be done in a group of two or three people maximum, if possible, with one local. The distance between two people should be about 40 m, in case that one activates the mine, another can help. One rout should be used in both directions: to the sampling site and on the way back. If possible, existing routes or hard surfaces should be used for walking. It can be concluded that there is no one rule or several rules for safe sampling in the remaining minefields but only recommendations. It can be concluded; the only rule is that there is no rule. Due to the obvious reasons, several site positions intended for this survey had to be removed accordingly.
Detailed inspection of Pb which is a typical anthropogenic element and Ti typical geogenic element helped in the reconstruction of its main pathway in the study area. The spatial distribution obtained by linear (MPR) and nonlinear method (ANN-MPR) successfully solved a problem of contamination along the river represents an anisotropic appearance between the isotropic lithological units and cannot be solved completely by standard interpolation kriging methods, based only on sparse soil measurements. Their arrangement in concentration across the Valley are more visually realistic and imagery, because they include more geospatial and geomorphological data such as geological background, land use, aspect, slope, altitude, etc. Such models will help us to understand processes that happened in a certain period and improve the data interpretation.
Various modelling techniques help us in the reconstruction of different processes that influenced the entire area. Their main purpose is not only the isolation of hotspots with the highest concentrations but simultaneously they distinguish the distribution pathways of a particular element or group of elements. The applied models are repeatable, which means that very similar spatial distribution can be obtained under the same conditions unrestrictedly. Also, a very important characteristic of each particular modelling is that one model isolates all hotspots simultaneously. In another hand, their similarities are reconfirmation, refinement and at the same time validation of different models.
Advanced modelling represents a milestone in geochemical investigations and mapping itself. The main advantages of those methods are, first of all, that we can construct very eventful and complete maps with the spatial distribution of particular elements or geochemical association, sampling at high-risk sites can be avoided (such as the Stavnja valley), and a number of sampling sites can be reduced while the maps can still remain very qualitative. Such models have a significant contribution to understanding the decomposing mechanism of mining waste and its environmental impact as well as used in remediation processes of the affected areas.