Prediction of Optical and Non-Optical Water Quality Parameters in Oligotrophic and Eutrophic Aquatic Systems Using a Small Unmanned Aerial System

Abstract: The purpose of this study was to create statistically reliable predictive algorithms of trophic state or water quality for optical (total suspended solids (TSS), Secchi disk depth (SDD), and chlorophyll-a (Chl-a)) and non-optical (total phosphorus (TP) and total nitrogen (TN)) water quality variables or indicators in an oligotrophic system (Grand River Dam Authority (GRDA) Duck Creek Nursery Ponds) and a eutrophic system (City of Commerce, Oklahoma, Wastewater Lagoons) using remote sensing images from a small unmanned aerial system (sUAS) equipped with a multispectral imaging sensor. To develop these algorithms, two sets of data were acquired: (1) in-situ water quality measurements and (2) the spectral reflectance values from sUAS imagery. Reflectance values for each band were extracted under three scenarios: (1) value to point extraction, (2) average value extraction around the stations, and (3) point extraction using kriged surfaces. Results indicate that multiple variable linear regression models in the visible portion of the electromagnetic spectrum best describe the relationship between reflectance and TSS (R² = 0.99, p-value < 0.01), SDD (R² = 0.88, p-value < 0.01), Chl-a (R² = 0.85, p-value < 0.01), TP (R² = 0.98, p-value < 0.01), and TN (R² = 0.98, p-value < 0.01). In addition, this study concluded that ordinary kriging does not improve the fit between the different water quality parameters and reflectance values.


Introduction
The United States Geological Survey (USGS), in their National Water Quality Assessment Program (NAWQA), defines water quality monitoring as a continuous period of data collection (in lakes, streams, rivers, reservoirs, wetlands, or oceans), in order to evaluate the chemical, physical, and biological characteristics of the body of water with respect to its ecological conditions and designated water uses [1]. Monitoring water quality typically involves a series of in-situ observations, measurements, and water sample collections that are analyzed for various parameters depending on the individual project goals, such as temperature, phosphorus (P), nitrogen (N), total solids, pH, fecal bacteria, conductivity, dissolved oxygen (DO), biochemical oxygen demand (BOD), hardness, alkalinity, suspended sediments, other nutrients, trace metals, and water clarity. Traditionally, water quality indicators are determined by the collection, field examination, and laboratory analyses of water samples, following consistent protocols and guidelines [2].
Although properly collected and analyzed in-situ measurements are highly accurate, these measurements can be time-consuming, susceptible to errors (especially visual subjectivity), and can only be related to a specific point in time and space [3,4]. Due to these potential problems, water [...]
The oligotrophic system (the Duck Creek Nursery Ponds) was developed as an aquatic plant nursery and receives runoff from surrounding grasslands. The site is located in northeast Oklahoma (36.5691° N, 94.9676° W) (Figure 1a). The nursery ponds (NP) are a series of small ponds of varying surface area, located at the upstream part of the Duck Creek arm of Grand Lake O' the Cherokees (Grand Lake). The ponds are situated in a watershed dominated by pasture/hay land use. Yearly temperatures in the region range from −4 °C (in winter) to 33 °C (in summer) and yearly precipitation ranges from 4.78 cm (in winter) to 13.77 cm (in fall) [40]. The land is owned and managed by the Grand River Dam Authority (GRDA). These ponds are not hydrologically connected to Grand Lake and are mainly recharged by surface runoff; however, when the water is excessively high in the reservoir (Grand Lake), these ponds serve a flood control function. Two adjacent ponds were included in this study.
The eutrophic system (the City of Commerce Wastewater Lagoons) is located in Commerce, Oklahoma (36.9334° N, 94.8730° W) (Figure 1b). The wastewater lagoons (CL) were reconstructed by the city of Commerce in 2014 and their purpose is to provide primary treatment for the city's municipal wastewater. The primary input of untreated wastewater to these lagoons has excess nitrogen (N), phosphorus (P), and carbon. Domestic wastewater from Commerce enters the system via a clarifier. After a hydraulic retention time (HRT) of 24 h, the clarifier effluent splits into two flow paths. One half of the wastewater goes to the north wastewater lagoon, while the other half goes to the south wastewater lagoon. At each lagoon, wastewater is exposed to sunlight for a period of 3-4 days. The exposure to sunlight contributes to the growth of algae, and the algae build biomass that promotes bacterial growth. These bacteria break down the waste present in the water [41]. After adequate HRT, both lagoons discharge their effluent into a third wastewater lagoon that serves as an environmental buffer before discharging the treated effluent to the nearest tributary, located at the northeast part of the parcel, in the Grand Lake watershed. Because the site is close to the Nursery Ponds (which are approximately 40 km south-southwest), yearly temperatures and precipitation are similar. The north lagoon was included in this study.

Water Quality and Multispectral Imagery Data Collection
In total, 36 water samples were collected (24 at the Nursery Ponds and 12 at the Wastewater Lagoons) (Figure 1). At each sampling location, a Secchi disk depth (SDD) reading was taken using a 30-cm Secchi disk attached to a measuring tape. After the reading, one water sample was collected using a 4.2-L PVC depth-discrete horizontal water sampler submerged 0.5 m below the water's surface. Before sample collection, the sampler was rinsed three times with sample water. Once collected, the water was divided into four portions. A first portion was transferred to a 250-mL high-density polyethylene (HDPE) bottle for field analyses (turbidity). A second portion was transferred to another 250-mL HDPE bottle to be analyzed for total nitrogen (TN) and total phosphorus (TP). A third portion was transferred to a 1-L dark bottle to be analyzed for chlorophyll-a (Chl-a). Finally, the remaining portion of the sample was transferred to a 1-L bottle to be analyzed for total suspended solids (TSS). Once the samples were bottled, they were placed into a cooler with ice at 4 °C for later analysis. A properly calibrated YSI 600 multiparameter data sonde [42] was then deployed to obtain dissolved oxygen (DO), temperature, specific conductance, salinity, and pH data. Calibration checks were performed using pH 7 buffer, 1000 μS/cm conductivity solution, and water-saturated air (for DO) during and after sample collection. All samples were collected and preserved according to procedures from the U.S. Environmental Protection Agency (EPA, Washington, DC, USA) [43].
Multispectral imagery was collected using an ATI AgBot sUAS (Aerial Technology International, Oregon City, OR, USA). To georeference the multispectral images taken with the RedEdge sensor, information from the GPS was transferred to the images via the UBX binary protocol using the NAV and RXM data classes [40]. Differential GPS (DGPS) was obtained using Mission Planner 1.3.68 [45]. No ground control points (GCPs) were defined before the mission (because the mission was flown over water); however, the inertial navigation system (INS) onboard the sUAS provided continuous position, orientation, and velocity of the aircraft. All of this information was transmitted to a ground control station using Mission Planner 1.3.68. Two multiple-waypoint missions were designed in Mission Planner 1.3.68. All missions were flown at an altitude of 100 m with a flying speed of 5 m/s and an estimated flight time of 10 min. For the Nursery Ponds, a total of 164 images with a ground resolution of 6.20 cm were obtained. For the Wastewater Lagoons, a total of 46 images with a ground resolution of 6.20 cm were obtained. Figure 3a,b presents the flight paths for imagery collection at the Nursery Ponds and Wastewater Lagoons, respectively. Multispectral imagery was acquired the same day as the in-situ water samples.

Methodology
The workflow for this study was divided into four phases: (1) data collection, (2) data processing, (3) model development, and (4) validation (Figure 4). Table 1 presents the time window between multispectral data collection and water sample acquisition, using the ending time of the sUAS missions as the reference point.

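[Table 1: time window between multispectral data collection and water sample acquisition, reported for the Nursery Ponds and the Wastewater Lagoons]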

Laboratory Analysis
After collection, all water samples were analyzed for TSS, Chl-a, TP, and TN following the methods presented in Table 2. Chl-a samples were filtered through individual glass fiber filters in order to perform pigment extraction using 90% acetone, then Chl-a concentrations were measured using a Trilogy laboratory fluorometer at 460 nm. TP samples were individually mixed with a solution of 5 N sulfuric acid (H2SO4), antimony potassium tartrate (C8H10K2O15Sb2), ammonium molybdate ((NH4)2MoO4), and 0.1 M ascorbic acid (C6H8O6), and TP concentrations were measured using a Cole Parmer 2800 UV-VIS spectrophotometer at 650 nm. TN samples were individually mixed with a solution of 3 N sodium hydroxide (NaOH), potassium persulfate (K2S2O8), nicotinic acid p-toluenesulfonate (C13H13NO5S), and adenosine triphosphate (C10H16N5O13P3), and TN concentrations were measured with a Lachat QuikChem 8500 Series 2 flow injection analysis system. Finally, TSS samples were filtered through individual glass fiber filters (Gelman type A/E), then dried at 105 °C for at least one hour. TSS concentrations were determined by mass difference for the volume of filtered sample.
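The mass-difference calculation for TSS described above can be sketched as follows. This is only a minimal illustration of the arithmetic, not the laboratory procedure itself; the function name, the argument names, and the example masses and volume are hypothetical.

```python
def tss_mg_per_l(filter_tare_mg, filter_dried_mg, volume_filtered_ml):
    """Total suspended solids (mg/L) from the mass gained by a dried filter."""
    if volume_filtered_ml <= 0:
        raise ValueError("filtered volume must be positive")
    residue_mg = filter_dried_mg - filter_tare_mg
    # Convert the filtered volume from mL to L before dividing
    return residue_mg / (volume_filtered_ml / 1000.0)


# Hypothetical example: 2.5 mg of dried residue from 250 mL of sample
example_tss = tss_mg_per_l(120.0, 122.5, 250.0)
```

In this hypothetical case, 2.5 mg of residue retained from 0.25 L of sample corresponds to 10 mg/L.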

Reflectance Extraction
To extract the reflectance values from the multispectral imagery, three scenarios were evaluated (Figure 5): (1) value to point extraction (Figure 5a), where the georeferenced position of each sampling station was used to extract the reflectance; (2) average buffer value extraction (Figure 5b), where a buffer zone of 3 m was created around the georeferenced position of each sampling station to extract the average reflectance (the distance of this buffer zone was defined based on the offset distance of the GPS units); and (3) kriging extraction (Figure 5c), where kriged surfaces (using ordinary kriging) were developed for each of the analytes and the reflectance values were extracted at 319 points (oligotrophic system) and 162 points (eutrophic system) inside each created surface. This last scenario was created to simulate a hypothetical situation where water samples could be collected from the entire surface of the systems and to determine if the "collection" of more samples (represented by the centroids of each pixel inside the systems) could improve the predictive capability of the models. Ordinary kriging was selected as the statistical interpolation method because concentration estimates (for the different water quality parameters) were to be determined at unsampled locations with minimal error. All imagery stitching and preprocessing were performed in Pix4Dmapper 4.4.9 [46], while the reflectance extractions were performed using ArcMap 10.6 [47].
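The first two extraction scenarios can be illustrated with a short sketch. The study performed these steps in ArcMap; the NumPy version below is only an approximation of the idea on a synthetic raster. The function names, the raster origin, the station coordinates, and the reflectance values are all hypothetical; only the 3 m buffer and the roughly 6.2 cm pixel size come from the text.

```python
import numpy as np


def world_to_pixel(x, y, origin_x, origin_y, pixel_size):
    """Convert map coordinates (m) to (row, col) in a north-up raster.

    (origin_x, origin_y) is the upper-left corner of the raster.
    """
    col = int(np.floor((x - origin_x) / pixel_size))
    row = int(np.floor((origin_y - y) / pixel_size))
    return row, col


def point_value(band, row, col):
    """Scenario 1: reflectance of the single pixel under the station."""
    return float(band[row, col])


def buffer_mean(band, row, col, radius_m, pixel_size):
    """Scenario 2: mean reflectance of pixels within radius_m of the station pixel."""
    radius_px = radius_m / pixel_size
    rows, cols = np.ogrid[:band.shape[0], :band.shape[1]]
    inside = (rows - row) ** 2 + (cols - col) ** 2 <= radius_px ** 2
    # nanmean ignores pixels flagged as no-data (NaN)
    return float(np.nanmean(band[inside]))


# Synthetic 200 x 200 band whose reflectance increases from west to east
pixel_size = 0.062  # m, close to the ground resolution reported above
band = np.tile(np.linspace(0.02, 0.10, 200), (200, 1))

# Hypothetical station located 6.231 m east and south of the raster origin
row, col = world_to_pixel(6.231, -6.231, 0.0, 0.0, pixel_size)
p = point_value(band, row, col)
b = buffer_mean(band, row, col, 3.0, pixel_size)
```

On a smooth gradient such as this synthetic band, the point value and the buffer mean coincide; on real imagery they differ wherever reflectance varies within the 3 m buffer, which is what the comparison between scenarios is designed to expose.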

Model Development and Validation
To ensure equal spatial distribution between both systems, the overall data generated in each system were randomly divided into two subsets. Each subset was then merged with its counterpart. As a result, 50% of the data were used for model development, while the other half was used for model validation.
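The split described above, in which each system is halved separately and the halves are then merged, can be sketched as follows. This is a minimal illustration using Python's standard library; the station identifiers and the random seed are hypothetical, and the study's own randomization procedure is not specified in the text.

```python
import random


def split_half(station_ids, rng):
    """Randomly split one system's stations into two halves."""
    shuffled = list(station_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


rng = random.Random(42)  # fixed seed so the split is reproducible

# Hypothetical station identifiers: 24 Nursery Pond and 12 Lagoon samples
nursery = [f"NP{i:02d}" for i in range(1, 25)]
lagoon = [f"CL{i:02d}" for i in range(1, 13)]

np_dev, np_val = split_half(nursery, rng)
cl_dev, cl_val = split_half(lagoon, rng)

# Merge the per-system halves so both subsets represent both systems
development = np_dev + cl_dev
validation = np_val + cl_val
```

Splitting within each system before merging keeps the 2:1 ratio of Nursery Pond to Lagoon samples in both the development and validation subsets.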
The models were developed using single variable and multiple variable linear regression approaches. The untransformed data from in-situ TSS, Chl-a, SDD, TP, and TN values were used as the dependent variables, while the untransformed reflectance of each band, and different ratios between them, were used as the independent variables. The best fit was determined using the coefficient of determination (R²) and the small-sample corrected Akaike information criterion (AICc) [48,49]. Once the best fit for each analyte was determined, the remaining 50% of the data were used for validation. Statistical differences between calculated and measured values were assessed using a paired-sample t-test (no significant difference at p-value > 0.05), given the normal distribution of the entire dataset (Shapiro-Wilk test [50] p-value > 0.05). All statistical analyses were conducted in R 3.5.1 [51].
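The model selection step can be sketched as an ordinary least squares fit scored by R² and AICc. The study used R 3.5.1; the Python version below is only an illustrative equivalent on synthetic data. The predictor names, the simulated coefficients, and the noise level are hypothetical, and the parameter count k (slopes, intercept, and residual variance) is one common convention assumed here rather than one stated in the source.

```python
import numpy as np


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns (coef, R2, AICc)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    rss = float(resid @ resid)
    tss = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - rss / tss
    # Assumed convention: k counts slopes, intercept, and residual variance
    k = A.shape[1] + 1
    aic = n * np.log(rss / n) + 2 * k
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)
    return coef, r2, float(aicc)


# Synthetic development set of 18 stations (hypothetical values)
rng = np.random.default_rng(0)
n = 18
blue_red = rng.uniform(0.5, 2.0, n)
green_red = rng.uniform(0.5, 2.0, n)
tss = 5.0 + 20.0 * blue_red - 8.0 * green_red + rng.normal(0.0, 0.5, n)

coef_1, r2_1, aicc_1 = fit_linear(blue_red, tss)
coef_2, r2_2, aicc_2 = fit_linear(np.column_stack([blue_red, green_red]), tss)
```

Because R² cannot decrease when a predictor is added, AICc is the more informative of the two criteria when comparing single and multiple variable models: it penalizes the extra parameter, and the penalty grows as the sample size shrinks.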

Water Quality
Two systems located at opposite ends of the biological productivity spectrum were selected for this study. Figures 6 and 7 present the water quality results for the oligotrophic and eutrophic systems, respectively. For the analyzed water quality parameters, substantial differences can be observed between the two systems. Table 3 shows the statistical summary (and comparisons) between the water quality parameters measured at the oligotrophic and eutrophic systems.
Table 3. Descriptive statistics and comparison between water quality parameters at the oligotrophic and eutrophic systems. SD refers to standard deviation. Units are μg/L for Chl-a, mg/L for TN, TP and TSS, and cm for SDD.

Reflectance Extraction
Three scenarios were evaluated under the reflectance extraction procedures: (1) point extraction, (2) buffer extraction, and (3) kriging extraction. From the reflectance values as a function of sampling stations, it can be observed that each system provides different relationships, indicating distinct compositions and characteristics. In the oligotrophic system, light reflects more than in the eutrophic system, indicating clearer waters.

Models Development and Extraction Scenarios Evaluation
Using a single variable linear regression approach, a total of 315 models (single bands and band ratios) were developed under the three extraction scenarios (105 models per extraction scenario). For this purpose, the independent variable was defined as the reflectance values from the different bands available from the multispectral sensor and the dependent variable was the in-situ measurements for the different water quality parameters. Considering the predictive capabilities of each model (under each evaluated scenario), it was determined that the point extraction scenario had stronger predictive capabilities (maximum R² and minimum AICc values) for all the analyzed parameters (Figure 10). In order to improve the predictive capabilities of each model developed under the point extraction scenario, a multiple variable linear regression approach was used. Table 4 presents the best performing (highest R²) models for all evaluated water quality parameters, using single and multiple variable linear approaches. From this table, it can be observed that for all water quality parameters, a multiple variable model yielded a higher R². Table 5 presents the estimated statistical coefficients for the multiple linear regression analysis.
Table 4. Best predictive water quality models using single and multiple variable linear approaches. WQP refers to the specific water quality parameter, while m and b are estimated coefficients fitting the regression analysis.

Validation and Spatial Distribution Maps
The developed models and their respective equations were validated using the remaining 50% of the data. Using this validation dataset as input for the models (Table 4), calculated water quality values were generated and compared against the actual in-situ measurements. Because the R² values calculated for each selected model were so close to unity, bias and scatter (both of which evaluate the difference between the values predicted by the model and the real values to be predicted) were very small. Figure 11 presents a comparison between the actual and predicted optical and non-optical water quality parameters. Further evaluation of the calculated water quality data indicates that the calculated data followed a normal distribution (Shapiro-Wilk test p-value > 0.05 for all parameters) and that statistically there was no difference between the calculated values and the collected in-situ values (t-test p-value > 0.05 for all parameters). Once it was determined that there was no statistical difference between the calculated values and the collected in-situ values, a band arithmetic function (following the models' equations) was applied to the collected multispectral images in order to establish spatial distribution maps for all of the water quality parameters in the oligotrophic (Figure 12) and eutrophic (Figure 13) systems. From the generated spatial distribution maps, it can be observed that both systems (oligotrophic and eutrophic) present fully mixed waters (which is consistent with the water quality analyses presented in Section 3.1). In addition, the distribution maps for the oligotrophic system show that aquatic features present during image capture (aquatic vegetation, located at the north and northeast portion of the pond) can be clearly identified from the different optical water quality measurements. Finally, Figure 13 shows whited-out sections in all distribution maps. These sections are due to the inability of the pre-processing software to properly stitch images at those locations.
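The band arithmetic step can be sketched as a per-pixel evaluation of a fitted regression equation. The example below assumes a model built on the Blue/Red, Green/Red, and Green/Blue ratios discussed in this paper; the coefficients and the reflectance arrays are hypothetical placeholders rather than the fitted values from Table 5. Pixels without valid reflectance (such as sections that could not be stitched) are left as NaN, which is an assumed way of representing the whited-out areas.

```python
import numpy as np


def predict_map(blue, green, red, coef):
    """Apply y = b0 + b1*(B/R) + b2*(G/R) + b3*(G/B) to every valid pixel."""
    blue, green, red = (np.asarray(a, dtype=float) for a in (blue, green, red))
    valid = np.isfinite(blue) & np.isfinite(green) & np.isfinite(red)
    valid &= (blue > 0) & (red > 0)  # avoid division by zero in the ratios
    out = np.full(blue.shape, np.nan)
    b, g, r = blue[valid], green[valid], red[valid]
    out[valid] = coef[0] + coef[1] * (b / r) + coef[2] * (g / r) + coef[3] * (g / b)
    return out


# Hypothetical 2 x 2 reflectance tiles; NaN marks an unstitched pixel
blue = np.array([[0.04, 0.05], [0.03, np.nan]])
green = np.array([[0.06, 0.06], [0.05, np.nan]])
red = np.array([[0.02, 0.03], [0.02, np.nan]])

hypothetical_coef = (1.0, 2.0, 3.0, 4.0)  # placeholder values only
wq_map = predict_map(blue, green, red, hypothetical_coef)
```

Keeping invalid pixels as NaN prevents the map from showing a misleading concentration where no imagery was available.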

Discussion
The main purpose of this study was to develop models capable of reliably estimating optical (TSS, Chl-a, and SDD) and non-optical (TP and TN) water quality parameters in two extremes of the aquatic biological productivity spectrum (oligotrophic and eutrophic systems), using in situ data and images collected with a multispectral sensor attached to an sUAS. In order to develop these algorithms, linear approaches using single and multiple variables were used. As a result, it was determined that linear models using multiple variables had stronger predictive capabilities for all water quality parameters. These algorithms have the capability of generating data that are not statistically different from the collected in-situ data for optical and non-optical water quality parameters.
In the paper "Comprehensive Review on Water Quality Parameters Estimation Using Remote Sensing Techniques", Gholizadeh et al. [52] note that different authors have found that the visible and near infrared bands of the EM spectrum from multispectral sensors can be used to obtain strong correlations between reflectance and optical water quality parameters. However, when exploring correlations for non-optical parameters, direct inference of these measurements had low predictive capabilities. Lim and Choi [53] used Landsat 8 to correlate spectral bands with in-situ water quality measurements and to establish water quality models capable of estimating optical (TSS and Chl-a) and non-optical (TN and TP) parameters in the Nakdong River in Korea. As a result, they obtained algorithms that strongly estimated TSS and Chl-a (R² = 0.74 and 0.71, respectively), but were not as strong when estimating TP and TN (R² = 0.50 and 0.48, respectively). Due to this limitation, some authors have taken an indirect estimation approach in order to develop strong correlations that relate TP and TN to Chl-a concentrations and SDD [54,55].
When examining the multiple variable models determined by this study (Table 4), it can be observed that the combination of the ratios Blue/Red, Green/Red, and Green/Blue provides the strongest correlation between reflectance and most of the optical and non-optical water quality parameters (except for Chl-a, for which the highest correlation was obtained with the Green and Red bands). These findings are in accordance with Gholizadeh et al. [52]; however, it was determined by this study that these ratios not only have the capability of estimating optical parameters, but also non-optical values. Lui et al. [56] determined that with the use of high-resolution imagery, linear (multiple linear regression) and non-linear (artificial neural network) models with strong predictive capabilities could be developed for TN and TP. The basis of these relationships is explained by the high spectral correlation that TN and TP have with SDD, TSS, and Chl-a [57].
It is imperative to begin this discussion with this information because the study presented herein deviates from the traditional approach of using multispectral sensors attached to satellite platforms. Instead, this study uses a more compact multispectral sensor, attached to an sUAS. By doing this, not only can direct methods of estimating non-optical water quality parameters be derived, but the use of this tool enhances spatial and temporal resolutions while eliminating the cloud coverage issues.
Earth observation satellites are the most common platforms to monitor and collect information about the Earth [58]. Table 6 presents some of the most common remote sensing satellites used for estimating water quality parameters, along with their respective spatial and temporal resolutions. From this table, it can be determined that the spatial resolution obtained by any of these platforms is much coarser when compared to the spatial resolution (6-8 cm) obtained with the sensor used in this study. To illustrate this concept, Figure 14 presents a visual comparison between images taken from two commonly used remote sensing satellites (Landsat 8 and Sentinel-2A) versus the images captured by the sUAS in the eutrophic system used in this study. In these images, the picture taken with the sUAS shows a much finer pixel resolution than either satellite product.
Table 6. Spatial and temporal resolutions of some of the most commonly used remote sensing satellites for water quality estimation, compared to the sUAS used in this study.
The use of satellite remote sensing tools helps to expand the limited discrete sampling point coverage of traditional monitoring plans [3,6,26,59,60]. However, in addition to spatial resolution, two major drawbacks when using these tools are: (1) the longer revisit time (temporal resolution) of these platforms and (2) cloud coverage limitations. Zhang and Kovacs [61] point out that the longer temporal resolutions of some of these platforms present a major difficulty when trying to monitor systems that are in a constant state of change. At the same time, other authors mention that the number of images that they are unable to use due to cloud coverage accounts in some cases for up to 97% of the available captured imagery for a particular region in a 25-year period [3]. With the use of sUASs, these issues are no longer a concern. First, with an sUAS, the operator has the flexibility of deciding how often they want to capture multispectral imagery. Second, because sUASs fly below the clouds, all the imagery is free of cloud coverage.
In order to determine optical and non-optical water quality measurements from multispectral sensors, in-situ measurements are needed to develop and calibrate the different models [26]. However, due to the above limitations for use of satellite imagery, selecting images for these types of correlations can become a non-trivial task. Hicks et al. [62] suggest that ideal imagery for these types of studies should not be more than one day apart from the in-situ data collection. However, in most cases, this is not possible due to the temporal resolution of the platform or cloud coverage present in the imagery [3,6,26,60,61]. Furthermore, Barrett and Frazier [63] mention that water quality parameters can be directly influenced by rapidly changing environmental conditions in the study site, and as a result, the utility of predictive models developed from imagery that is generated days or weeks from the day of the in-situ sampling can be detrimentally impacted. As shown in Table 1, with the use of an sUAS, the time window between water sample acquisition and multispectral imagery collection can be reduced to minutes to hours. In theory, and due to the flexibility that these portable platforms provide, decreasing the time window between water sample acquisition and multispectral imagery collection translates to stronger and more reliable water quality models.
A secondary objective of this study was to evaluate whether using a statistical interpolation method improved the algorithms relating the different optical and non-optical water quality parameters to the reflectance values. To do so, three scenarios were evaluated under the reflectance extraction procedures: (1) point extraction, (2) buffer extraction, and (3) kriging extraction. Results indicated that models created from the first scenario (point extraction) presented stronger predictive capabilities. Mu et al. [64] note that in spatial sampling, collected samples are not independent from each other, and for that reason the number of samples that need to be taken in order to develop or validate remote sensing products can be decreased in order to improve accuracy. Considering the water quality results and the spatial distribution maps generated in this study, it makes sense that for fully mixed systems (such as the ponds used in this study), fewer sample stations led to more accurate models.
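For readers unfamiliar with the interpolation behind the third scenario, the ordinary kriging estimate at an unsampled location can be sketched as below. This is a minimal NumPy illustration, not the ArcMap workflow used in the study. The exponential semivariogram, its sill and range, the station coordinates, and the concentrations are all assumptions made for the example; the source does not state which variogram model was fitted.

```python
import numpy as np


def exp_variogram(h, sill, range_m):
    """Exponential semivariogram without nugget (assumed model); gamma(0) = 0."""
    h = np.asarray(h, dtype=float)
    return sill * (1.0 - np.exp(-3.0 * h / range_m))


def ordinary_krige(xy, z, xy0, sill=1.0, range_m=50.0):
    """Ordinary kriging estimate at xy0 from samples (xy, z); returns (estimate, weights)."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    n = len(z)
    # Pairwise distances between samples, and from samples to the target
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    d0 = np.linalg.norm(xy - np.asarray(xy0, dtype=float), axis=1)
    # Kriging system: semivariances bordered by the unbiasedness constraint
    K = np.ones((n + 1, n + 1))
    K[:n, :n] = exp_variogram(d, sill, range_m)
    K[n, n] = 0.0
    rhs = np.ones(n + 1)
    rhs[:n] = exp_variogram(d0, sill, range_m)
    sol = np.linalg.solve(K, rhs)
    weights = sol[:n]  # the last entry of sol is the Lagrange multiplier
    return float(weights @ z), weights


# Hypothetical stations (m) and concentrations (mg/L)
stations = [(0, 0), (30, 0), (0, 30), (30, 30), (15, 15)]
values = [4.1, 4.3, 3.9, 4.2, 4.0]

est_new, w_new = ordinary_krige(stations, values, (10.0, 20.0))
est_at_sample, w_sample = ordinary_krige(stations, values, (15.0, 15.0))
```

The weights always sum to one, and without a nugget the estimate at a sampled location reproduces the measured value. In a fully mixed pond the interpolated surface therefore adds little information beyond the stations themselves, which is consistent with the result reported above.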
From the points discussed above, one can conclude that the use of sUAS offers benefits beyond those of the traditional satellite remote sensing approach. However, it is important to point out that sUAS, just like any other remote sensing tool, have their limitations. The first and perhaps the most important limiting factor when using this technology is the weather. When planning missions with sUAS, the operator must be aware that these platforms are unable to fly under wet conditions (rain) and elevated wind speeds (higher than 5 m/s or as stipulated by the platform manufacturer). For the study presented above, these issues were not a concern. However, it is necessary to point this out, because even though sUAS offer more advantages when it comes to obtaining imagery capable of estimating optical and non-optical water quality parameters, there is a tradeoff that needs to be considered and evaluated by the user.

Conclusions
This study aimed to create different statistical water quality models for optical (TSS, SDD, and Chl-a) and non-optical (TP and TN) water quality parameters in oligotrophic and eutrophic aquatic systems using remote sensing images from an sUAS equipped with a multispectral sensor. From the results of this study, it can be concluded that: (1) when using a multiple linear regression approach, models capable of predicting optical and non-optical parameters (with strong prediction capability, R² > 0.80) can be created, (2) multiple variable linear regressions in the visible portion of the electromagnetic spectrum (blue, green, and red) best described the relationship between reflectance and TSS (R² = 0.99, p-value < 0.01), Chl-a (R² = 0.85, p-value < 0.01), TP (R² = 0.98, p-value < 0.01), TN (R² = 0.98, p-value < 0.01), and SDD (R² = 0.88, p-value < 0.01), (3) the use of statistical interpolation (ordinary kriging) does not improve the statistical relationship between the different water quality parameters and the reflectance values, (4) 100% cloud-free imagery can be collected with the use of sUAS, (5) the use of sUAS for water quality monitoring allows the user more flexibility in terms of temporal and spatial resolution, and (6) future research should evaluate if the use of this technology improves the predictive capabilities of water quality models that rely on satellite imagery and if the models developed in this study have the capacity of determining water quality in reservoirs that fall in other portions of the biological productivity spectrum.