UAS-Based Hyperspectral Environmental Monitoring of Acid Mine Drainage Affected Waters

Abstract: The exposure of metal sulfides to air or water, either produced naturally or due to mining activities, can result in environmentally damaging acid mine drainage (AMD). This needs to be accurately monitored and remediated. In this study, we apply high-resolution unmanned aerial system (UAS)-based hyperspectral mapping tools to provide a useful, fast, and non-invasive method for the monitoring aspect. Specifically, we propose a machine learning framework to integrate visible to near-infrared (VNIR) hyperspectral data with physicochemical field data from water and sediments, together with laboratory analyses to precisely map the extent of acid mine drainage in the Tintillo River (Spain). This river collects the drainage from the western part of the Rio Tinto massive sulfide deposit and discharges large quantities of acidic water with significant amounts of dissolved metals (Fe, Al, Cu, Zn, amongst others) into the Odiel River. At the confluence of these rivers, different geochemical and mineralogical processes occur due to the interaction of very acidic water (pH 2.5–3.0) with neutral water (pH 7.0–8.0). This complexity makes the area an ideal test site for the application of hyperspectral mapping to characterize both rivers and better evaluate contaminated water bodies with remote sensing imagery. Our approach makes use of a supervised random forest (RF) regression for the extended mapping of water properties, using the samples collected in the field as ground-truth and training data. The resulting maps successfully estimate the concentration of dissolved metals and related physicochemical properties in water, and trace associated iron species (e.g., jarosite, goethite) within sediments. These results highlight the capabilities of UAS-based hyperspectral data to monitor water bodies in mining environments, by mapping their hydrogeochemical properties, using few field samples. Hence, we have demonstrated that our workflow allows the rapid discrimination and mapping of AMD contamination in water, providing an essential basis for monitoring and subsequent remediation.


Introduction
Acid mine drainage (AMD) is an environmental phenomenon that can occur either by the natural exposure of sulfide minerals to weathering conditions or as a consequence of certain mining activities. Lottermoser [1] defines AMD as a process whereby low pH mine water is formed from the oxidation of sulfide minerals. These acidic and metal-enriched waters can negatively affect the quality of natural ecosystems and aquatic life. The areas mainly impacted are rivers, lakes, estuaries, and coastal waters. AMD can take years or decades to develop and can persist for centuries [1]. Therefore, such an environmental problem needs to be carefully monitored and, ideally, remediated. Several efforts have been made to monitor the spatial distribution and intensity of contamination by AMD, commonly involving systematic sampling and laboratory analysis of stream sediment followed by interpolation of the results into distribution maps [2,3]. However, such approaches can be time-consuming and costly, and offer limited spatial coverage.
Previous studies have shown the benefits of remote sensing for many environmental monitoring purposes. For AMD detection and monitoring, multispectral and hyperspectral sensors have been widely used, owing to the distinctive spectral absorption features of iron minerals in the visible to shortwave infrared region of the electromagnetic spectrum [4]. These studies have covered a wide range of spatial scales depending on the platform used for data acquisition, from satellite studies [4] for the protection of water reservoirs [5] and indirect pH estimation by mapping iron-bearing minerals precipitated on the stream bed [6], to airborne surveys over mine tailings [7][8][9][10]. Laboratory-scale spectral studies of AMD minerals have been performed to build reference spectral libraries [11] and to study the spectral signatures of surface waters [12]. The emerging use of unmanned aerial systems (UAS), such as multicopters coupled with lightweight hyperspectral sensors, has made it possible to collect data at a higher spatial resolution than most aircraft and satellite counterparts, with pixel sizes down to a few centimeters [13]. Most recently, UAS-based hyperspectral imaging was used for high-resolution, multi-temporal mapping of proxy minerals for AMD in the Sokolov lignite region, Czech Republic [14].
Our study focuses on mapping hydrogeochemical properties to assess the extent of AMD in waters. In this paper, we integrate UAS-borne visible to near-infrared (VNIR) hyperspectral data with physicochemical parameters of water (field and laboratory measurements) to map acidity, redox potential, and metal concentrations in surface water. We propose a novel machine learning approach to map gradual changes occurring in the water properties based on their spectral signature. A multivariate regression problem is solved by a random forest (RF) algorithm which uses constrained training data to predict the properties for the entire surface water area. We use water quality samples and field spectroscopy from ground-control points to fully assess the accuracy of the method, by deriving from analytical studies and in-situ measurements the training data for the regression model. The obtained hydrogeochemical maps can be used to monitor water bodies surrounding mining ecosystems, by targeting sources and/or acidic contamination in water, promoting its continuous supervision or assisting in the selection of the most adequate remediation treatment.

Test Site
To test the approach, the confluence between the Odiel River and the Tintillo River in the Huelva Province (Iberian Pyrite Belt, Southern Spain) (Figure 1a-c) was selected with the aim of monitoring different water behaviors in the analysis. The Tintillo River emerges in the mining area of the Rio Tinto mines and flows downstream for 10 km before discharging into the Odiel River. Downstream of this junction, the Odiel River is affected by the large amounts of acidity and dissolved metals transferred by the Tintillo River. At the confluence of these rivers (Figure 1e), different geochemical and mineralogical processes occur due to the interaction of very acidic water (pH 2.5-3.0) with neutral water (pH 7.0-8.0). The hyperspectral survey covers the three branches of the confluence as detailed in Figure 1d.

Geological Framework
The Iberian Pyrite Belt (IPB) is located in a north-vergent fold and thrust belt of late Variscan age. It extends from Setubal (Portugal) to the north of Seville (Spain) [15]. The typical geological facies include phyllites/quartzites, followed by slates, basalt sills, felsic volcanics (rhyolites and dacites), and the Culm series (greywackes and slates) [16], with an absolute lack of carbonate or alkaline materials [17]. The IPB belongs to the South Portuguese Zone of the Hercynian Iberian Massif and, according to [18], is formed by upper Palaeozoic materials that can be divided into three lithological groups: (i) the phyllite-quartzite group (PQ), formed by a thick sequence of shales and sandstones, (ii) the volcano-sedimentary complex (VSC), including a mafic-felsic volcanic sequence interstratified with shales, and (iii) the Culm group, where shales, sandstones, and conglomerates prevail. The southern part of the Odiel River flows across Neogene marly sediments of the Guadalquivir depression. In the northern part of the basin, plutonic and metamorphic rocks from the Ossa-Morena Zone also outcrop [19]. The stratabound, volcanogenic massive sulfide lenses are hosted in felsic volcanics of Upper Devonian to Lower Carboniferous age [15]. Zones of chloritic and argillitic alteration are correlated with those lenses. Stockwork zones occur underneath the lenses in the vicinity of faults. A gossan usually forms in the cap-rock above [18].
The IPB region has been mined for copper and smaller amounts of manganese, iron, and gold since the Bronze Age [20]. Many massive polymetallic sulfide deposits (>80) are related to the VSC. Pyrite (FeS₂) is the most abundant mineral in these deposits, with sphalerite (ZnS), galena (PbS), chalcopyrite (CuFeS₂), and arsenopyrite (FeAsS) in minor quantities, and other sulfides combined with Cd, Sn, Ag, Au, Co, and Hg [19]. Therefore, pyrite is abundant both on mine waste and rock surfaces, and AMD is present in most streams throughout the region [19]. The natural geological conditions and the long mining history of the IPB, added to the lack of natural alkaline materials, make the capacity to neutralize the acidity of the streams very limited [21].

Hydrology and Climatology
The hydrological characteristics of the Huelva region are typical of a semi-arid climate [17]. The precipitation is characterized by intense, short rain events in autumn, not so intense dry winters, and dry springs and summers. The largest drainage basin in Huelva is the Odiel River Basin, which covers an area of about 2300 km². The Odiel River begins in the Sierra de Aracena and, like the Tinto River, flows down to the Ría of Huelva estuary [19].
The length of the Odiel river is about 140 km (from the headwaters to the mouth) with a 600 m drop along its way. Most of the annual precipitation (812 mm) occurs in the winter months (between October and January) [19]. These climate variations affect the hydrological behavior of the Odiel River throughout the year. The geochemistry is influenced by inter-annual water flows dependent on fluctuations in rainfall and evapotranspiration [17].
The Tintillo River is located in the north of the Huelva province and drains an area of about 57 km². The river is about 10 km long and is mainly fed by acid sulfate waters coming from the base of a large, sulfide-bearing waste-rock pile (∼1 km long, ∼40 m high) situated in the surrounding area. This river feeds the Odiel River with acidity and dissolved metals. At this confluence, several geochemical and mineralogical processes typical of acid water mixing with neutral water occur [21]. These acid waters affect the chemical balance of the Odiel and Tinto fluvial systems through the transfer of large amounts of acids, dissolved metals (Fe, Al, Mn, Cu, Zn, Cd, Pb), As, and SO₄²⁻ [20]. Even though the Odiel River receives small discharges of acid mine water emanating from several abandoned mines of the IPB (i.e., Concepcion, San Platon, Esperanza, and La Poderosa-El Soldado) [22], the vast amount of acidity and metals from the Tintillo consumes all the alkalinity and keeps the Odiel waters hyperacidic downstream until they reach the Atlantic Ocean (70 km downstream) on the coast of Huelva [21,22].
Sánchez [17] report a main annual flood in autumn, which carries AMD from solubilized iron sulfide into the river. The sediments decrease in water content and are continuously oxidized (Figure 2a) during winter, spring, and summer. The metals produced by the mine waste stay as particulate matter along the river watershed. The dissolved metals increase during dry seasons and decrease during more humid-wet seasons [19]. There is a considerable difference in the pH ranges of the water along the Odiel River that can be used to monitor its quality [23].
During summer conditions, the low precipitation rate causes a decrease in the pH values and an increase of AMD contamination in areas closer to the mine waste heaps. The intense dark red color of acidic waters is related to their geochemistry and depends on the precipitation of dissolved iron and aluminum in the mine water, which is mainly controlled by pH [17]. Colloidal Fe³⁺ is responsible for the red color of very acid waters (see Figure 2b), whereas Fe²⁺ gives a more greenish color, and a white color appears when aluminum precipitates [23]. Transitions between the three hydro-geochemical facies (see Figure 2c) can happen within very small distances and are mapped in this study.
Several efforts have been made to reduce the environmental impact of acid mine drainage in the region, including geotechnical stabilization and revegetation of waste piles, construction of rainwater drainage systems, and sealing of mine adits, as well as passive treatments such as Anoxic Limestone Drainage and anaerobic compost wetlands. Nevertheless, these tests have not been enough due to chemical and climatic effects such as high acidity and metal contents with seasonal variability of water discharge [17].

UAS-Borne Hyperspectral Data
The hyperspectral data used in this study were acquired with the frame-based Senop Oy 'Rikola' Hyperspectral (HS) Camera (Figure 3a). The size and weight of this HS camera allow it to be gimbal-mounted on a customized multi-copter (Tholeg THO-R-PX8) (Figure 3b). This setup provides a stable platform at the time of image acquisition and allows for a sufficient integration time for the spectral imaging sensor. The Rikola sensor provides frame-based images in the VNIR spectral range between 504 and 900 nm. The specific parameters used for this survey are detailed in Table 1. Raw data are acquired as digital numbers as a function of wavelength and are subsequently converted to radiance using the Rikola hyperspectral imager software [24].

Flight Set-Up
All the hyperspectral images were acquired in two flights: the first along a line parallel to the Odiel River (before and after the confluence) and the second scanning the Tintillo River. All acquired images are assumed to be in nadir position.
The flight parameters were defined in the field (see Table 2) and a pre-defined flight path was uploaded to the UAS. Before the flight, calibration panels (black, gray, and white; PVC) with known spectral characteristics were placed underneath the flight line for the subsequent conversion from radiance to reflectance during the pre-processing stage. The hyperspectral data were pre-processed using the Python-based MEPHySTo toolbox [25]. The toolbox performs a set of steps on each single image in order to co-register it, correct geometric effects (lens and topographic correction), and fit each HS snapshot to the high-precision geo-referenced orthophoto generated with a structure-from-motion multi-view-stereo (SfM-MVS) workflow. Quality control of the pre-processed single images was performed to remove defective scenes (e.g., distorted or out-of-size images) for which keypoints for matching were not found. After this verification, the mosaic was constructed.
As the final step in the pre-processing chain, the hyperspectral radiance mosaic was converted to reflectance. Here, a simple empirical line method was applied to the dataset [26] using the known spectra of the ground-reference PVC calibration panels. A gray panel proved successful for the radiometric correction since all the HSI snapshots showed low reflectance. The Rikola camera uses two detectors to cover the wavelength range. This causes a spectral shift (a flat section) in the spectral data; bands in the spectral range 624-671 nm were therefore removed and not considered in the processing chain of this research. The resultant reflectance data were smoothed using a Savitzky-Golay filter with a window size of five and a second-degree polynomial in order to remove noise [27].
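The radiance-to-reflectance conversion, band removal, and smoothing described here can be sketched as follows. This is a minimal illustration only, not the MEPHySTo implementation: it assumes a single gray panel and an empirical line forced through zero (a per-band gain), and the array shapes, panel values, band centers, and function names are hypothetical placeholders. The smoothing uses the standard 5-point quadratic Savitzky-Golay weights (-3, 12, 17, 12, -3)/35.

```python
import numpy as np

# Standard Savitzky-Golay smoothing weights: window 5, polynomial degree 2.
SG_5_2 = np.array([-3.0, 12.0, 17.0, 12.0, -3.0]) / 35.0

def empirical_line_single_panel(radiance, panel_radiance, panel_reflectance):
    """Per-band gain from one panel (assumes the line passes through zero)."""
    gain = panel_reflectance / panel_radiance        # shape: (bands,)
    return radiance * gain                           # broadcast over pixels

def drop_band_range(cube, wavelengths, lo=624.0, hi=671.0):
    """Remove bands inside [lo, hi] nm (the detector transition region)."""
    keep = (wavelengths < lo) | (wavelengths > hi)
    return cube[..., keep], wavelengths[keep]

def savgol_5_2(spectra):
    """Smooth along the last (spectral) axis; ends are edge-padded."""
    pad = [(0, 0)] * (spectra.ndim - 1) + [(2, 2)]
    padded = np.pad(spectra, pad, mode="edge")
    n = spectra.shape[-1]
    out = np.zeros_like(spectra, dtype=float)
    for k, w in enumerate(SG_5_2):
        out += w * padded[..., k:k + n]
    return out

# Hypothetical inputs: a 4 x 4 pixel cube with 50 bands between 504 and 900 nm.
wavelengths = np.linspace(504.0, 900.0, 50)
radiance = np.random.default_rng(0).uniform(1.0, 5.0, size=(4, 4, 50))
panel_rad = np.full(50, 10.0)     # radiance measured over the gray panel
panel_refl = np.full(50, 0.2)     # assumed known panel reflectance

refl_full = empirical_line_single_panel(radiance, panel_rad, panel_refl)
refl, wl = drop_band_range(refl_full, wavelengths)
smooth = savgol_5_2(refl)
```

With more than one panel, the gain and offset per band would instead come from a least-squares line through the panel radiance/reflectance pairs.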

Ground-Truth Data
A total of 15 control points were established within the scanned area for reference. The distribution of these points was designed to be equidistant (Figure 4a), as far as personal safety conditions allowed. Field measurements (Figure 4b-e) and analytical techniques were the basis for producing and validating the maps delivered in this research.

Field Measurements and Sampling
Two sets of water samples were taken for chemical analyses. The first set, for measuring major ions (Ca²⁺, Mg²⁺, K⁺, SO₄²⁻, Cl⁻, Fe, amongst others), was taken with 60 mL syringes, immediately filtered with 0.45 µm cellulose acetate membrane filters, stored in 50 mL polyethylene bottles, and acidified down to pH < 2 with concentrated HNO₃. The second set, for ionic measurements, was taken directly in 50 mL polyethylene bottles (flushed three times before sampling), which were fully filled. All samples were stored in covered containers in the dark and were refrigerated during transport until analysis in the laboratory. Sediment samples of 200 to 500 g were obtained using a PVC shovel and kept in transparent polyethylene bags. The PVC shovel was washed with distilled water between sampling points. Samples and measurements were collected at a spacing of approximately 15 m, in the center of the channel, as far as the characteristics of the site allowed.
On site, pH and temperature (T) were measured with a VWR pH110 meter, which incorporates a temperature sensor, and the redox potential was measured with a Redox-One-Bar Measuring Chain (Ahlborn FY96RXEK) with a Pt-Ag/AgCl electrode system. The mV measurement was corrected for temperature and adjusted to a potential relative to that of the standard hydrogen electrode (see Method [28]). The pH meter was calibrated using Hanna standard solutions (pH 4.01 and pH 7.01) and the redox meter was checked using Hanna standard solutions (240 mV and 470 mV). The probes were placed directly in the river water (approximately 10 cm deep) with gentle agitation, and the measurement was taken once the reading had stabilized (Method 2550: [29]). Back in the laboratory, the electrical conductivity (EC) of each sample was measured with a WTW 3320 Multimeter prior to the elemental concentration analyses.
For the spectral point measurements on the water surface, a Spectral Evolution PSR-3500 high-resolution full-range hand-held spectroradiometer was used. It provides spectral information in the wavelength range between 350 nm and 2500 nm (covering the visible (VIS), near-infrared (NIR), and shortwave infrared (SWIR) parts of the electromagnetic spectrum) with a spectral resolution of 3 nm in the VNIR and 8 nm in the SWIR. A total of three flexible white PVC panels with known spectra (reflectance >97.0 ± 0.3%) were submerged in the river water, spatially distributed to cover the three parts of the confluence. These are monochrome synthetic 50 × 50 cm panels which are spectrally homogeneous and have nearly Lambertian reflectance. Water depth measurements at the 15 sampled points and at the positions of the PVC panels showed a similar depth (10-12 cm). Water depth along the scanned river area was approximately constant.
Calibration of the device was done using a near-Lambertian reflector (PTFE, >99% in the VNIR, >95% in the SWIR). This calibration was done under sun/ambient illumination, meaning that the signal is calibrated on the actual irradiance (thus, atmospheric influences are ignored). The sensor tip was placed as close as possible to the surface of the water where each panel was submerged, using sunlight illumination (Figure 4d). For each spectral measurement, the integration time was set to 60 ms and 10 individual spectra were consecutively averaged.

Analytical Techniques
Spectral and mineralogical characterization of solid and water samples was performed at the Helmholtz Institute Freiberg in the Spectroscopy and X-ray laboratories, respectively. Water sample analyses were carried out by the Water Laboratory of the Department of Mining and Special Civil Engineering in the Technical University Bergakademie Freiberg.
For the SO₄²⁻ and Cl⁻ content of the water, a spectrophotometer (Hach DR3900) was used [30]. The method determines concentration via the absorbance of light when it interacts with the compounds present in the sample solution. Test strips were used regularly to dilute the samples correctly so that the measurements fell within the detection limits of the equipment. Microwave Plasma Atomic Emission Spectroscopy (4200 MP-AES from Agilent) was used to quantify the concentration of the dissolved metals (Al, Cd, Co, Cu, Fe, Mg, Mn, Ni, Pb, Zn). After a microwave plasma atomizes the aerosolized sample, electrons are excited and the device is able to (1) detect the emitted light at specific wavelengths (spectral emission lines) characteristic of each element and (2) measure the intensity of the emitted light, which is compared to the known emission-concentration lines of the elements in calibration curves [31].
The mineralogical characterization of the solid samples was achieved through powder X-ray diffraction (XRD) using a PANalytical Empyrean diffractometer equipped with a Co tube, Fe filter, and a PIXcel 3D-Medipix area detector. Samples were first ground with an agate mortar and then split into representative subsamples (3-4 g) for the next grinding step in a McCrone mill (Retsch), where the required sample grain size (4 µm) is achieved for analysis. Using the backloading technique, the sample was filled into the sample holder for measurement. Qualitative mineral phase identification was based on the ICDD PDF-4+ 2019 database [32] and the PANalytical HighScore 3.0.4 software. Raw data were analyzed using the package BGMN/Profex v. 3.10.2 [33], using the Rietveld method to obtain quantitative mineralogical data.
X-ray fluorescence (XRF) was used for elemental composition analysis. Additional steps were required for preparing fused beads. After the sample was ground to 4 µm, it was dried overnight at 105 °C before calcination at 950 °C in ceramic crucibles. The samples in the crucibles were weighed before and after calcination for the loss on ignition calculation. After this step, a 1 g fraction of the calcined sample was mixed with 8 g of lithium tetraborate (LiT) in a platinum crucible for the preparation of fused beads. Prior to this, the platinum crucible went into the Electrical Fusion Instrument at 1050 °C for 40 min for the lithium and sample fusion.
Furthermore, a batch of spectral measurements on the dried sediment samples was carried out with a Spectral Evolution PSR-3500 spectroradiometer in order to characterize the spectral mineralogy of the samples and to compile an endmember library for validation of the hyperspectral data acquired with the Rikola camera. The spectral measurements were done using the contact probe (8 mm spot size) with artificial illumination, directly on the sample surface. Each spectrum consists of 10 individual measurements, consecutively taken and averaged. Calibration and conversion to reflectance were carried out using a pre-calibrated PTFE panel (>99% in VNIR, >95% in SWIR). Three spots were measured per sample over a 5 × 5 cm area to account for lithological heterogeneities. Calibration and conversion to radiance were carried out every 20 measurements with the Spectralon panel.

Methodological Framework
Adapted processing steps were implemented for the integration of hyperspectral images with the hydrogeochemical and mineralogical data available. The workflow in Figure 5 shows the proposed machine learning-based system in which the first step is splitting the hyperspectral image into the two main surface components to map: the river flow path (water pixels) and the exposed sediments in the river borders (soil pixels). Subsequently, training data are derived out of the chemical analysis results from water samples and mineralogical results from sediment samples. After masking the HSI data for each surface component and preparing the training data, a different supervised machine learning method is applied over each mask: A regression model for the pixels belonging to the river flow path to get a continuous output where gradual changes in water properties can be mapped, and a classification model for the pixels belonging to the exposed soil to get a discrete mineral distribution map on the river borders.

Surface Classification
The first step in the proposed approach consists of identifying the main surface components in the hyperspectral image. This is achieved by using a Support Vector Machine (SVM) classification based on features extracted by the novel Orthogonal Total Variation Component Analysis (OTVCA; [34]). OTVCA is an unsupervised feature extraction method which uses a total variation optimization of a non-convex cost function with an orthogonal restriction. The dimensions of the hyperspectral data are reduced and the spatial information is extracted while edges and structures are preserved [35]. This allows the differences in the main surface materials of the study area to be highlighted, discriminating the river flow path from the exposed soil, and also vegetation pixels, which are later removed. The accuracy of the Support Vector Machine algorithm is increased when classifying spectral data after feature extraction [35].

Training Data
The training dataset was collected by manually labeling a small number of pixels within the image, based on both spectral image inspection and field observations, using the main distinctive land cover classes (water, vegetation, and soil). In order to increase the number of training pixels per class, 75 to 100 extra pixels surrounding each of the georeferenced training pixels were included in each of the regions of interest created per class.

Support Vector Machine (SVM)
SVM fits a separating hyperplane (defined as the class boundary) that segregates the feature space in two classes with the largest margin for each class [36]. An optimization problem is solved by structural risk minimization for identifying the aforementioned hyperplane. Only the samples that are closest to the class boundaries are required to train the classifier, the so-called support vectors. The classification results of SVM can have high accuracy even when only a small number of training samples are available [35][36][37]. The first conception of an SVM classifier was a binary linear method for the separation of two classes [38].
For complex classification scenarios where the class boundaries are non-linear, kernel methods have been widely implemented to extend SVM to nonlinear separation. The input data are transformed by a kernel function into a high-dimensional feature space, where the samples can be linearly separated. The kernel function employed in this work was the Gaussian radial basis function (RBF) kernel [36]. To obtain the optimal hyperplane, the only parameters needed are γ (the spread of the RBF kernel) and the regularization parameter C (which sets the penalty applied during the SVM optimization). To reduce a multiclass classification problem to several binary subproblems, strategies such as one-against-one (OAO) or one-against-all (OAA) are commonly used [38]. In this workflow, SVM is implemented using LibSVM [39] with an RBF kernel, fed with the features extracted by OTVCA, to isolate river flow path pixels from sediments and vegetation.
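To make the role of γ and the multiclass decomposition concrete, the following sketch computes the Gaussian RBF kernel matrix, K(x, x') = exp(-γ ||x - x'||²), and counts the binary subproblems produced by each strategy. It is illustrative only and does not reproduce LibSVM; the feature array stands in for OTVCA features, and the function names and the γ value are assumptions.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    sq_dist = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))   # clip tiny negative round-off

def n_binary_problems(n_classes, strategy):
    """Number of binary SVMs needed for a multiclass problem."""
    if strategy == "OAO":                    # one classifier per pair of classes
        return n_classes * (n_classes - 1) // 2
    if strategy == "OAA":                    # one classifier per class
        return n_classes
    raise ValueError(f"unknown strategy: {strategy}")

# Hypothetical feature vectors for six pixels (four extracted features each).
features = np.random.default_rng(0).normal(size=(6, 4))
K = rbf_kernel(features, features, gamma=0.5)
```

A larger γ makes the kernel decay faster with distance, so each support vector influences a smaller neighborhood. For the three land cover classes used here (water, vegetation, soil), both strategies happen to require three binary classifiers.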

Hydrogeochemical Maps
A regression has been applied to the water-masked pixels of the HS image. For parameters that change gradually within an area, a classification could yield maps that show properties only as clusters and over- or underestimate values for the remaining pixels, depending on the supervised training dataset. It is therefore more convenient to perform a regression over the hyperspectral data, since the output values will be continuous for each pixel where the property is unknown. A random forest (RF) algorithm is used to solve the multivariate regression problem and predict the values for five different water parameters (pH, redox potential, Fe, SO₄²⁻, and Al). The approach aims to use data from the ground-validation points to predict the spatial distribution of the water properties over the entire river flow path.

Training Data
The training data for the regression model were selected manually, using the coordinates where the field measurements were performed (Figure 6a). A total of 15 ground-truth points were available for the data integration. Around 400 pixels surrounding each of these points and an additional supporting set of pixels (black colored) were selected as the initial training dataset for the prediction model. For this selection, characteristic features apparent in the image spectra and the analysis of validation spectra in regions that are uniform within the extracted OTVCA features were considered. Subsequently, the physicochemical data (field and laboratory based) associated with each station were assigned to the training dataset, creating a multi-variable matrix for feeding the regression algorithm. Examples of spectra from the training spots of the UAS hyperspectral data are shown in Figure 6. Only the spectra of three groups of pixels are shown for simplicity, one from each part of the river confluence. Groups of pixels corresponding to more acidic pH (Figure 6b) show a stronger absorption feature in the 500-650 nm range, as do the slightly less acidic ones (Figure 6c). Unlike the acidic water pixels, the spectra related to neutral pH values (Figure 6d) lack this feature and are relatively flat. Overall, reflectance values are below 20%, as expected from the dark-red color of the waters and the river bed sediments.
To assess the performance of the regression algorithm, two validation phases were implemented. The first, or internal validation, used the same initial training dataset, with each class randomly re-sampled into a training and a test dataset (75% for training and 25% for test/validation). Additionally, a final validation (referred to in this work as the external validation) was performed using a less populated test dataset. This second test dataset was created considering only around 100 pixels neighboring the original georeferenced pixels where the field measurements were taken. R-squared (R²) is calculated on both test datasets to assess the performance of the regression map.
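The internal validation step can be sketched as a per-station 75/25 split followed by an R² computation. This is a minimal sketch under stated assumptions: the station labels, the exact count of 400 pixels per station, and the function names are hypothetical, and R² is taken as the usual coefficient of determination, 1 - SS_res / SS_tot, computed per predicted parameter.

```python
import numpy as np

def stratified_split(groups, test_fraction=0.25, seed=0):
    """Randomly split the pixels of every group into training and test indices."""
    rng = np.random.default_rng(seed)
    groups = np.asarray(groups)
    train, test = [], []
    for g in np.unique(groups):
        idx = rng.permutation(np.flatnonzero(groups == g))
        n_test = int(round(len(idx) * test_fraction))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return np.array(train), np.array(test)

def r_squared(y_true, y_pred):
    """Coefficient of determination, one value per output column."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = ((y_true - y_pred) ** 2).sum(axis=0)
    ss_tot = ((y_true - y_true.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot

# Hypothetical layout: 15 stations with 400 training pixels each.
station = np.repeat(np.arange(15), 400)
train_idx, test_idx = stratified_split(station)
```

Because neighboring pixels of one station share the same assigned field values, an internal R² obtained this way is likely to be optimistic, which is one reason a separate external test set is useful.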

Random Forest (RF)
RF is a learning method often used in classification and regression problems, in which a set of decision trees is trained and their individual results are then combined through a voting process [35]. The idea of a decision tree predictive model is to break up a complex decision into a union of several simpler decisions, in the expectation that the final solution obtained this way resembles the desired solution [40]. While the predictions of a single tree are highly sensitive to noise in its training set, the average of many trees is much less sensitive as long as the trees are not correlated [35]. Random forests for regression are created by growing trees depending on a random vector, so that the tree predictor takes numerical values rather than class labels [41]. The general technique of bootstrap aggregating is used for training the model: training sets are created by randomly resampling the original dataset with replacement, leading to a more efficient performance of the model [35]. Unlike most machine learning algorithms, the random forest process needs only two parameters to be set to generate a prediction model: the number of regression trees and the number of features evaluated at each node to grow the regression trees [42].
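The ideas above (bootstrap resampling, a random feature subset at each node, and averaging over trees) can be illustrated with a deliberately small NumPy sketch. It is not the implementation used in this study: candidate split thresholds are restricted to the quartiles of each feature for brevity, the depth and leaf-size limits are arbitrary, and all names and the synthetic data are hypothetical. In practice a library implementation would normally be preferred.

```python
import numpy as np

def _grow(X, Y, rng, n_feat, depth, max_depth=6, min_leaf=5):
    """Grow one regression tree; a leaf stores the mean target vector."""
    if depth >= max_depth or len(Y) < 2 * min_leaf:
        return Y.mean(axis=0)
    best = None
    # Random subset of features considered at this node.
    for j in rng.choice(X.shape[1], size=n_feat, replace=False):
        for t in np.quantile(X[:, j], [0.25, 0.5, 0.75]):
            left = X[:, j] <= t
            if left.sum() < min_leaf or (~left).sum() < min_leaf:
                continue
            # Sum of squared errors around the mean of each child node.
            sse = sum(((Y[m] - Y[m].mean(axis=0)) ** 2).sum() for m in (left, ~left))
            if best is None or sse < best[0]:
                best = (sse, j, t, left)
    if best is None:
        return Y.mean(axis=0)
    _, j, t, left = best
    return (j, t,
            _grow(X[left], Y[left], rng, n_feat, depth + 1, max_depth, min_leaf),
            _grow(X[~left], Y[~left], rng, n_feat, depth + 1, max_depth, min_leaf))

def _predict_one(node, x):
    """Follow the splits until a leaf (an array, not a tuple) is reached."""
    while isinstance(node, tuple):
        j, t, lower, upper = node
        node = lower if x[j] <= t else upper
    return node

def fit_forest(X, Y, n_trees=50, n_feat=None, seed=0):
    """The two main parameters: number of trees and features tried per node."""
    rng = np.random.default_rng(seed)
    n_feat = n_feat or max(1, X.shape[1] // 3)
    trees = []
    for _ in range(n_trees):
        boot = rng.integers(0, len(X), size=len(X))    # sample with replacement
        trees.append(_grow(X[boot], Y[boot], rng, n_feat, depth=0))
    return trees

def predict_forest(trees, X):
    """Average the predictions of all trees for every sample."""
    return np.mean([[_predict_one(t, x) for x in X] for t in trees], axis=0)

# Synthetic stand-ins: 8 spectral features, 2 target water parameters.
rng = np.random.default_rng(1)
X = rng.uniform(size=(200, 8))
Y = np.column_stack([2.0 * X[:, 0], X[:, 1] - X[:, 2]])
forest = fit_forest(X, Y, n_trees=20)
Y_hat = predict_forest(forest, X)
```

Each leaf stores a vector, so a single forest predicts several water parameters at once, which mirrors the multivariate regression described in the text.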

Mineral Maps
A supervised classification using Spectral Angle Mapper (SAM) has been applied for mineralogical mapping of the pixels extracted as exposed soil after the surface classification (dry sediments and border crust of the confluence area). The aim of this exercise is to identify the mineralogical changes in the soil related to the varying composition of the proximal water bodies.

Endmember Spectral Library
Reference spectra to be used by the SAM classifier are obtained from laboratory spectral point-measurements over field sediment samples (Figure 7a). The collection of endmember spectra is then subjected to wavelength analyses in the available range (Figure 7b) in order to identify the main occurring VNIR-active iron-bearing minerals such as hematite, goethite, jarosite, and schwertmannite. Spectral interpretation of the mineralogy is validated with the XRD results and compared to reference mineral spectral libraries (USGS [43] and AMD-minerals [11]). The resultant endmember spectral library is then resampled to the available range of the Rikola HS sensor (504-900 nm) (Figure 7c).
Two regions are gray-shaded in Figure 7b to analyze the shapes and wavelength positions of the features of each mineral in the charge transfer region (ligand to metal charge transfer) and in the region governed by crystal field effects (transitions of electrons from lower to higher energy states) [44]. Hematite characteristically has a narrower absorption at wavelengths surrounding 880 nm, while goethite has a broader feature with wavelengths around 920 nm or greater [44]. This feature associated with crystal field absorption around 900 nm is also found in the jarosite and schwertmannite spectral curves. However, the spectral shoulders around 650 nm, associated with the electronic charge transfer of Fe³⁺ and Fe²⁺ [45], allow further distinction for schwertmannite, which has no known inflection point at 650 nm and a spectral peak located at 738 nm [11]. The peak location at 720 nm and a small distinctive absorption feature at 2264 nm confirm the spectral identification of jarosite [11].
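The resampling of the laboratory endmember spectra to the sensor range can be sketched as follows. This example assumes simple linear interpolation at the sensor band centers; the actual procedure may instead use the sensor's spectral response functions, which are not described in the text. The function name and the band centers in the usage note are hypothetical.

```python
def resample_spectrum(wavelengths, reflectance, band_centers):
    """Linearly interpolate a spectrum at the given band centers (nm).

    wavelengths must be strictly increasing and cover every band center.
    """
    resampled = []
    for center in band_centers:
        if center < wavelengths[0] or center > wavelengths[-1]:
            raise ValueError("band center outside the measured range")
        for i in range(len(wavelengths) - 1):
            w0, w1 = wavelengths[i], wavelengths[i + 1]
            if w0 <= center <= w1:
                t = (center - w0) / (w1 - w0)
                r0, r1 = reflectance[i], reflectance[i + 1]
                resampled.append(r0 + t * (r1 - r0))
                break
    return resampled
```

For instance, a laboratory spectrum measured from 350 to 2500 nm could be reduced to a list of band centers between 504 and 900 nm before being passed to the classifier.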

Spectral Angle Mapper (SAM)
The spectral angle mapper is a supervised classification algorithm that measures the similarity between a known reference spectrum and the unknown image spectra by calculating the angle (in radians) between the two spectra, expressed as vectors in an n-dimensional coordinate system, where n is the number of available bands [46]. The smaller the angle, the closer the match to the reference spectrum. After the comparison, each pixel is assigned to the class that exhibits the smallest SAM angle. The threshold range for the similarity analysis in this work therefore relied on very small SAM values (0.05-0.08 rad) for performing the classification.
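The angle computation and the class assignment described above can be sketched in a few lines. The function names and the layout of the `library` dictionary are assumptions made for illustration; the default threshold of 0.08 rad is the upper bound of the range quoted in the text.

```python
import math


def spectral_angle(pixel, reference):
    """Angle (radians) between two spectra treated as n-dimensional vectors."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cos_angle)


def classify_sam(pixel, library, max_angle=0.08):
    """Assign the pixel to the endmember with the smallest angle.

    library: dict mapping an endmember name to its reference spectrum.
    Returns (name, angle); name is None if no angle is below max_angle.
    """
    best_name, best_angle = None, float("inf")
    for name, reference in library.items():
        angle = spectral_angle(pixel, reference)
        if angle < best_angle:
            best_name, best_angle = name, angle
    if best_angle > max_angle:
        return None, best_angle
    return best_name, best_angle
```

Because the angle depends only on the direction of the spectral vector, a spectrum and a brighter or darker scaled copy of it yield an angle of essentially zero.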

Hydrogeochemical Maps
The analytical results from water samples can be split according to each branch of the river confluence: (1) Tintillo, (2) Odiel before, and (3) Odiel after the confluence with the Tintillo River. The Tintillo River is characterized by very acidic pH values (2.5-2.6) and high electrical conductivity (EC) values (9000-9500 µS/cm). The redox conditions are mainly governed by the Fe²⁺ to Fe³⁺ oxidation rate; values around 575 mV were found, which are characteristic of oxidized aqueous environments. The chemical behavior of the Odiel River samples changed drastically from neutral pH values (6.8-7.9) before the confluence, passing through a transition phase (4.5-4.7), and finally decreasing to pH values of 2.8-3.9 after the confluence. The detailed water parameters for each of the samples can be found in Table A1 in Appendix A.
The redox conditions vary strongly in the samples from the Odiel River, from around 200 mV in the initial branch to around 520-570 mV after the mixing. In general, the EC values are lower than in the Tintillo waters before the confluence (373-645 µS/cm) and increase up to 9460 µS/cm after it. Samples from the Tintillo River were brownish to light-brown in color, samples from the Odiel River before the confluence were colorless, and samples from the Odiel River after the confluence were yellow-brown in color. The most significant contents are those of SO₄, Fe, Zn, Pb, Mn, Mg, Al, Cl, Cd, Cu, Ni, Co (Figure 8a). Other trace elements such as Pb, Cd, Ni, and Co were always close to or below the detection limits. The detailed elemental composition for each sample can be found in the appendices (Table A2).
Although iron speciation could not be quantified, the highly oxidizing and acidic conditions (pH 2.5-2.8) of the Tintillo and Odiel waters indicate the prevalence of Fe³⁺. Dissolved Fe²⁺ can be rapidly oxidized to Fe³⁺, which is characteristic of bacterially catalyzed oxidation and leads to the appearance of both dissolved and particulate Fe³⁺, turning the acidic waters a deep red color [17]. A pH value around 3 favors the hydrolysis and precipitation of Fe³⁺ ions, which usually takes place in the form of very fine-grained schwertmannite (with traces of jarosite) [17]. These minerals are mineralogically meta-stable and are usually transformed into goethite or hematite [47].
The spectral curves for the in-situ water point measurements are shown in Figure 8b. Each measurement corresponds to a different river branch of the studied confluence. The DI water spectral line was acquired under laboratory conditions and was used as a reference to describe the spectral features of pure water. The DI spectrum is featureless until the appearance of two broad absorption features at 763 nm and 975 nm, which are characteristic of the water molecule in the liquid phase [48,49]. These absorption features are produced by the combination of three fundamental vibrational modes of the water molecule [49]. After these features, the spectral curve decreases to virtually 100% absorption. Thus, water spectral analysis in this study focused on the VNIR range (350-1000 nm).
The spectral behavior at the three river locations was characterized by an increasing absorption feature between 350 and 650 nm and a peak reflectance around 640 nm. From 700 nm onwards, the absorption features of the three river water curves matched those of the pure water line (DI). The main differences found in the field water spectral measurements are in the 400-600 nm range, probably associated with iron species [11]. The Odiel River before the confluence showed the smallest absorption, with a very slight inflection point at 436 nm. In the Tintillo River spectral curve, the absorption feature is broader and the inflection point at 436 nm is slightly less pronounced, followed by a second absorption feature at 494 nm. The curve for the third point, in the Odiel River after the influence of the Tintillo, showed absorption features at 436 nm and 494 nm similar to those of the Tintillo curve, but the absorption between 450 and 520 nm is higher.
The computed SVM classification (Figure 9b) efficiently identified the water path and the exposed muddy and dry soil based on features extracted by the OTVCA (Figure 9a). The parameters for finding the optimal hyperplane were determined using five-fold cross validation, resulting in C = 6.4980 and γ = 0.0442. The HSI mosaic was cropped to the water mask (Figure 9c) and OTVCA was computed again. For the training dataset, five variables (pH, redox, and Fe, Al, and SO₄²⁻ concentration) were considered in the training matrix (Figure 9d). Figure 9e shows the obtained pH regression map, which clearly shows the gradient in the pH value of the water and reveals the strong influence of the Tintillo, which reaches the cleaner part of the Odiel, probably through transport of the AMD contamination via groundwater flow or flooding during the heavy rain events of the region. The derived collection of maps for assessing the river water in Figure 10 makes it possible to see the fluctuations in its quality by combining different criteria in the environmental assessment.
Table 3 shows the resultant R² values for the five variables that fed the RF algorithm. The so-called internal validation shows an R² close to 1 for all the variables, indicating a good correlation between the predicted values and the pixels used only as the validation set. Although the R² values decreased during the external validation of the regression, especially for variables with a wider range of values (Al, Fe, and SO₄²⁻), the performance of the regression remained good.

Mineral Maps
The major visual feature of the collected sediment samples was the characteristic yellow to reddish-brown color, a product of the AMD influence in the region. The mineralogy of this precipitate consists mainly of Fe-phases precipitated from the Fe dissolved in AMD originating from pyrite oxidation in the mine site. Figure 11 shows the mineral composition of four of the main representative samples from the study area. The matrix is dominated by tectosilicates ranging from 36 to 58 wt%; the mass fractions are associated with quartz and feldspars (microcline and albite), and some sheet silicates such as muscovite or chlorite, while the iron-bearing minerals (hematite, goethite, jarosite, and schwertmannite) are lower in mass fraction since they occur mainly on top of the layer in the form of fine-grained coatings. Highly oxidized minerals controlled by acidic pH (2.5-3), such as goethite (P13; 83 wt%) or hematite (P03; 32 wt%), cover the sediments on top of the river bank, while less oxidized minerals, such as jarosite (P04; 3 wt%), cover sediments in association with goethite, in the transition stage from a metastable mineral to a stable form. More hydrated minerals, such as schwertmannite (P05; 6 wt%), grow over fine-grained sediments closer to less acidic waters at the beginning of the confluence, and are expected in pH ranges from 3.0 to 4.5 [5]. The detection of schwertmannite can be challenging since it is poorly crystalline in comparison to jarosite and goethite [17]. The schwertmannite fraction in P05 is identified from its specific absorption features in the spectral measurement of the sample. The specific conditions for the precipitation of iron-bearing secondary products can vary widely according to the pH, the SO₄ concentration, and paragenetic relationships [5].
XRF results of elements and compounds match the minerals revealed by the XRD analysis. Results are given in their oxide forms. Aluminum  (Figure 11), only a small selection of them can be detected with VNIR spectroscopy. This is because quartz, feldspar, or pyrite show no characteristic absorption features in this range. However, spectral curves of the hand specimens (Figure 7b) revealed distinctive absorption features of the iron minerals (goethite, jarosite, hematite, and schwertmannite) present in the samples. Figure 12 presents the classification result at an angle of 0.08 radians for the occurring endmembers. The resultant map shows that goethite and hematite coatings cover the coarser clasts on top of the Tintillo River bars. Jarosite tends to develop over the pebbles and sands of the side river bars, on the border crust close to the Odiel River after the confluence. Schwertmannite is present along the sides of the river bars, over fine-grained sands closer to the water on the left border of the upper part of the Odiel River.

Discussion
In this paper, we propose a framework to monitor water pollution produced by AMD integrating UAS-HSI data, field measurements, and geochemical analyses. Compiled geochemical information from water samples is used as reference data to train a regression model to up-scale the detailed information coming from the sampled spots to the entire area.

Results Assessment
As the Tintillo-Odiel area is a rapidly changing environment, imaging spectroscopy of natural water bodies could be affected by differences in water depth, which change constantly with climatic parameters, water dynamics, and seasonal conditions. This study was performed under spring conditions (April) and the measured water depth was approximately constant (10-12 cm) over the sampled spots. Hence, the strong absorption features found in the water spectra between 400 and 700 nm might be mostly influenced by the presence of Fe³⁺ ions, which turn the acidic waters a deep red color. This absorption was deeper where the iron content was greater. However, it is possible that predicted values towards the borders of the river are influenced by the slight decrease in water depth. The catalog of maps created in this research allows evaluation of the degree of contamination in the area covering water and sediments. While the hydrogeochemical maps enable visualization of the gradual variations of the water properties, the mineralogical maps show the spatial distribution of the iron minerals, which can be used as a proxy to detect the contamination and the degree of acidity. The maps resulting from this study should be understood as simplified images of hydrogeochemical and mineralogical trends in an environment that is spatially very heterogeneous at all scales and changes rapidly over time, not only annually or seasonally but even weekly.
This study involves steps and processes that may inevitably be affected by biases, as in most analytical studies that require human interaction. The quality of the original hyperspectral data may be affected by parameters set on site (e.g., set-up geometry, frame rate, integration time, spectral sampling interval, band range, and bandwidth). Then, during data processing, various parameters must be entered manually, such as (1) the number of features to be extracted for OTVCA and (2) the quantity (radius) of the sampled pixels to be included as training data for classification or regression algorithms. Precision errors during laboratory analysis are a further source of uncertainty. Regardless of these constraints, the information provided by the data integration portion of this study represents a significant benefit for the better evaluation of AMD-affected ecosystems. Mine waters may vary strongly in composition and be dominated by diverse dissolved constituents and thermodynamic conditions according to the geological environment in which they originate [1]. Similar to optical studies of natural water bodies, the intrinsic and apparent optical properties of AMD are influenced by a combination of physical, chemical, and biological processes, which may affect their spectral response. The application of this method in other study areas, with a lower degree of AMD and deeper water conditions, is encouraged in upcoming studies.

Relevance
Overall, the field acquisition was fast, taking about one day. Hyperspectral image acquisition took only 12 min approximately and the rest of the time was used for taking the validation measurements. Ground-truth sampling was demonstrated to be vital to provide as accurate and precise maps as possible. The integration of spectral information combined with further analytical procedures contributed to better interpretation of the remote-spectral information, increasing the reliability of the results.
Mapping water quality properties, based on their spectral signatures, represents a robust basis for an innovative water quality monitoring system. The maps derived from the UAS-borne hyperspectral data help to identify patterns and physicochemical conditions of natural waters. Even though they are not as prominent as the absorption features observed between 400 and 700 nm due to Fe, more subtle absorptions and slope changes in the NIR range provide information about the other chemical compounds and parameters affecting AMD waters. These subtle variations can be detected over the whole available VNIR range thanks to the many contiguous bands provided by the hyperspectral camera, which make it possible to relate the unique spectral shape of each pixel to the measured parameters. A map collection for pH (Figure 13), redox, and metal concentration (Fe, Al, and SO₄) makes it possible to detail changes in water quality by combining different criteria in the environmental assessment. Undeniably, the AMD effects in the region are considerable: the contamination of the Tintillo even reaches the Odiel River before the confluence, probably through groundwater interaction or heavy rainfall events.
It is feasible to estimate the gradient of variables affecting the water quality, discriminating the most affected zones studied. The level of accuracy is remarkable, allowing risk assessors and scientists to take appropriate preventive measures according to the level of contamination by mapping changes in AMD through time, which can occur after sudden rainfall events or seasonal changes. The remediation of ecosystems polluted by mining leachates can represent a high-cost investment for mining companies, whereas the proposed approach offers a cost-effective solution during the reclamation or closure planning stage of mining activities by decreasing the time of data acquisition and enhancing the sampling strategy when assessing the most adequate reclamation technologies or treatment methods to apply.

Innovation
In terms of environmental interpretations and risk assessment, the study has shown that there are several advantages to using UAS-borne hyperspectral data as a complement to traditional environmental monitoring studies. Traditional monitoring of river water quality is mainly based on the chemical analysis of water samples routinely collected over the year, and on the physical parameters of the groundwater measured by instruments located in the flow path. These tasks can be expensive, time-consuming, and constrained by access. In addition, a major advantage of UAS mapping compared to ground surveying is a reduction in the time spent acquiring data. Furthermore, UAS surveys enable analysis of locations that may be difficult to access, are under protected status, or involve personal security risks for terrestrial sampling. Environmental monitoring by means of unmanned aerial systems has distinct advantages: (i) low flying heights for high resolution and seamless coverage in fragile areas, (ii) cost-effectiveness after the initial purchase cost, (iii) easy flight preparation after flying permission has been given, and (iv) versatility in the sensors that can be attached to the aircraft/drone. These advantages allow repeatable and recurrent data acquisition. Therefore, multi-temporal analyses are feasible and may allow constant and repeated monitoring of ecosystems.
Although many instruments with greater spectral resolution and wider wavelength range have been developed, the majority of this equipment tends to be too heavy, fragile, and costly to be mounted on UAS on an operational basis. Renewed efforts have been made in satellite development to increase spatial resolution by enhancing band acquisition efficiency and to make data available in open-source systems, but the meter-scale spatial resolution of satellites remains a limiting factor.

Outlook
In order to increase the spatial coverage of this approach, a fixed-wing UAS (e.g., the senseFly eBee [50]) with lightweight sensors may be useful, since a longer flight duration (50 min) can be achieved. Although the flight time is longer, the payload is lower, so a lighter sensor must be mounted, for example, a Parrot Sequoia camera [51]. This sensor is multi-spectral and provides information in only four bands (near-infrared [790 nm], red-edge [735 nm], red [660 nm], and green [550 nm]). A band ratio could be calculated by dividing reflectance-calibrated band 3 (735 nm) by band 4 (790 nm). This ratio may be useful for detecting iron-related absorption, as it maps part of the crystal-field absorption produced by Fe²⁺/Fe³⁺ [45] around 800 nm. Moreover, the full width at half maximum (FWHM) of the 735 nm band is 10 nm and that of the 790 nm band is 40 nm, so some of the reflectance peaks of iron minerals (e.g., from hematite, goethite, jarosite, schwertmannite) might be detected. A second option would be to divide band 2 (660 nm) by band 1 (550 nm) to pick up the differences in part of the charge transfer region of the iron oxides. Nevertheless, this type of band ratio should be interpreted carefully, since narrow spectral features are combined or left out when only two single bands are used. Hence, multi-spectral data cannot provide as detailed a result as a classification using hyperspectral data, but overall they could be used for mapping the abundance of iron-bearing minerals and targeting more acid-affected areas (using AMD minerals as a proxy) for subsequent surveys with higher spectral resolution.
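The two band ratios suggested above can be sketched as follows, assuming reflectance-calibrated bands stored as flat lists of per-pixel values. The function names, the dictionary keys, and the `eps` guard against division by near-zero reflectance are illustrative assumptions and not part of any sensor software.

```python
def band_ratio(numerator, denominator, eps=1e-6):
    """Per-pixel ratio of two reflectance bands.

    Pixels whose denominator is (almost) zero are returned as None.
    """
    return [n / d if abs(d) > eps else None
            for n, d in zip(numerator, denominator)]


def iron_ratios(bands):
    """Compute the two ratios discussed in the text.

    bands: dict with keys 'green_550', 'red_660', 'red_edge_735', 'nir_790',
    each a list of reflectance values (hypothetical key names).
    """
    return {
        # band 3 / band 4: part of the crystal-field absorption near 800 nm
        "red_edge_over_nir": band_ratio(bands["red_edge_735"], bands["nir_790"]),
        # band 2 / band 1: part of the charge transfer region
        "red_over_green": band_ratio(bands["red_660"], bands["green_550"]),
    }
```

The resulting ratio images could then be thresholded or compared with field samples, keeping in mind the interpretation caveats stated above.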
The use of the red-edge and NIR bands could also be useful in vegetation health analyses by detecting chlorophyll intensity in the acidic environment. Regardless of the spectral limitations, the multi-spectral sensor has the advantage of being lighter and can be mounted on fixed-wing systems, covering much broader areas than multi-copters.
In summary, the workflow developed by this study for environmental monitoring, based on the integration of remote sensing and geochemical techniques, was successful for the field campaign at the Odiel-Tintillo test site. The acquired mapping data are easy to include in GIS or 3D mining software. This is an advantage for companies, which can build a data catalog in which initially ignored parts of the data can be processed to obtain a complete picture of the monitored system rather than information from isolated points only. The high spatial resolution of UAS-borne hyperspectral data and precise orthorectification and georeferencing allow the creation of a three-dimensional, distortion-free framework for intuitive surface visualization and interpretation of the site conditions. Although the topography of the region is basically flat, the Digital Surface Model (DSM) allows us to observe small topographic contrasts and structures such as the hill that exists between the two rivers (Figure 13). Additionally, the 3D data may make it possible to integrate differences in surface texture and to improve classification results by using elevation information.

Conclusions
UAS-borne hyperspectral imaging represents a fast and non-invasive tool which provides high resolution maps for different hydrogeological applications. We combined the acquired VNIR hyperspectral data cube with constrained hydrogeochemical data using machine learning techniques, aiming to assess surface water quality.
The proposed methodological framework has been tested in the Odiel-Tintillo River confluence. Even though the test-site represents a very variable, confined and mixed environment, it was possible to detect gradual changes in the parameters that control water quality and the spatial distribution of secondary-iron minerals on the borders of the river flow path.
We proposed to up-scale and spread detailed hydrogeochemical information coming from discrete ground-truth points to the whole covered area by discovering the relationship of the water chemistry with the spectral behavior using the supervised machine learning regression model: random forest (RF). Hence, gradual changes in physicochemical properties (pH, Eh) and elemental concentration (Fe, Al, SO₄²⁻) in surface water have been mapped for assessing the AMD extent.
There are three main outcomes of this study: (1) Data acquisition times have been demonstrated to be fast in comparison with traditional environmental monitoring approaches for producing high-pixel-resolution maps that covered 14,000 m² of the rivers. Less than one hour was needed for the UAS survey and half a day for the ground-sampling campaign to acquire reference samples. (2) Laboratory and in-situ analytical data proved fundamental for the validation of the spectral features and for the construction of robust and trustworthy mineralogical and hydrogeochemical training data. (3) Fifteen reference points proved to be sufficient for a good performance of the RF regression model. Initial validation showed an R² close to 1 for all the variables. After narrowing the number of samples per class during the final validation, the performance of the regression remained good for variables with smaller value ranges (pH, R² = 0.72; Eh, R² = 0.81) compared to variables with a wider range of values (chemical composition of Fe, Al, and SO₄²⁻, R² = 0.65).
Overall, the results of this paper emphasize the capabilities of UAS-borne HSI data as a valuable support for environmental monitoring of surface water affected by acid mine drainage. The framework followed may be used for the monitoring of other mining environments potentially affected by acid mine drainage, by targeting sources and/or contamination in water, promoting continuous supervision or providing guidance for adequate remediation treatments.
Table A1. Field parameters of natural stream waters of the Odiel and Tintillo River confluence. Em, redox potential measured with platinum/reference electrode; pe, redox corrected and expressed as electron activity; (**) EC, laboratory-based electrical conductivity measurements; (*) S14 is located on the left border of the Odiel River, which shows Tintillo waters already affecting its composition.