Characterizing Leaf Nutrients of Wetland Plants and Agricultural Crops with Nonparametric Approach Using Sentinel-2 Imagery Data

In arid environments of the world, particularly in sub-Saharan Africa and Asia, floodplain wetlands are a valuable agricultural resource. However, the water reticulation role of wetlands and crop production can negatively impact wetland plants. Knowledge of the foliar biochemical elements of wetland plants enhances understanding of the impacts of agricultural practices in wetlands. This study thus used Sentinel-2 multispectral data to predict seasonal variations in the concentrations of nine foliar biochemical elements in the leaves of key floodplain wetland vegetation types and crops in the uMfolozi floodplain system (UFS). Nutrient concentrations in different floodplain plant species were estimated using vegetation indices derived from Sentinel-2 multispectral data in concert with random forest regression. The results showed a mean R² of 0.87 and 0.86 for the dry winter and wet summer seasons, respectively. However, copper, sulphur, and magnesium were poorly correlated (R² ≤ 0.5) with vegetation indices during the summer season. The average relative root mean square errors (RMSE%) of the seasonal nutrient estimates for crops and wetland vegetation were 15.2% and 26.8%, respectively. There was a significant difference in nutrient concentrations between the two plant types (R² = 0.94 for crops and R² = 0.84 for vegetation). The red-edge position 1 (REP1) and the normalised difference vegetation index (NDVI) were the best nutrient predictors. These results demonstrate the usefulness of Sentinel-2 imagery and random forest regression in predicting seasonal nutrient concentrations as well as the accumulation of chemicals in wetland vegetation and crops.


Introduction
Wetlands in South Africa cover about 2.9 million hectares, or about 2.4% of the country's land area [1]. They are recognised as highly valuable natural resources that sustain the livelihoods of local communities by providing a wide range of ecosystem goods and services, including wild fruits, vegetables, rice and water purification [2,3]. They also mediate the adverse effects of extreme weather conditions [4] by attenuating floods and slowing down the speed of water movement [5]. Despite this recognition of their value, the rate of wetland degradation worldwide and in South Africa remains high [6,7].
In South Africa, wetlands are some of the most threatened ecosystems, with about 50% of them in a critically endangered state [1]. Although they are vital for the livelihoods of rural communities [8], their sustainability is being threatened by the increasing incidence of climate change-driven droughts and frequent rainfall failures [9]. In sub-Saharan countries including Ethiopia, Kenya, Malawi, Tanzania, and Zambia, wetland cultivation is a recognised and common practice in arid environments [10] that generates about 37% of consumed food and 55% of cash income per household [11]. In Zimbabwe, for example, Nyamadzawo, et al. [12] compared wetland gardens against upland fields and observed higher yields from wetland gardens (2-3 t ha⁻¹) than from upland fields (1 t ha⁻¹). Despite this higher productivity, wetland ecosystems are being adversely impacted by water extraction, urbanisation, infrastructure development, pollution, poor farming practices as well as droughts and climate variability.
Given their high agricultural productivity, and their water reticulation and other ecological roles, there is a need to explore spatially explicit techniques that can be used to assess and monitor nutrient enrichment in these ecosystems. This is essential because nutrient loading and contamination by different chemicals can disrupt the ecological functioning of these vital sub-systems. An example of this disruption is provided by Zhu, et al. [13], who report that an increase in foliar copper content is often associated with shrinkage of mesophyll cells and concomitant destruction of the internal structure of a plant's leaves and of its physiological and structural characteristics. Although a detailed discussion of similar adverse effects is beyond the scope of this paper, it is apparent that a better understanding of the foliar nutrient composition of wetland vegetation and crops is critical for the informed adoption of eco-friendly management strategies. As vegetation links the physical environment and the upper levels of the food chain [14,15], these linkages need to be properly maintained because they determine the dynamics of nutrient circulation and the physiological traits of plants in these systems.
Plants require at least fourteen mineral elements for adequate nutrition [16], a lack of which reduces plant growth [17]. Six of these, namely nitrogen (N), phosphorus (P), potassium (K), calcium (Ca), magnesium (Mg), and sulphur (S), are critically required by plants in adequate amounts [17]. Their presence at tolerable concentration levels in the plant can now be determined using remote sensing techniques [18][19][20][21][22][23][24][25][26][27]. However, considerable research still needs to be conducted on how the concentrations of these elements influence plant growth, as most of the research has tended to be narrowly focused on foliar nitrogen, phosphorus and potassium [18][19][20][21][22][23][24][25][26][27]. This bias can be explained by the dominant use of point-based sampling techniques and laboratory assaying, which are time-consuming and expensive. These limitations can be overcome by drawing on the ability of remote sensing techniques to cost-effectively provide quantitative estimates of foliar nutrient concentrations in wetland plants.
Nitrogen, phosphorus and potassium have been widely investigated using remote sensing techniques [28][29][30]. Although the literature indicates that these elements can be easily characterised using various techniques, other elements that are found in wetlands such as magnesium (Mg), copper (Cu), boron (B) and sulphur (S) are difficult to quantify. Remote sensing-based techniques provide a convenient way to overcome this limitation. This versatility partly explains why we decided to use both in-situ and remote sensing (RS) techniques to assess the foliar concentrations of these chemical elements and others.
The decision to couple these techniques was considered helpful because other researchers have previously combined RS techniques and laboratory-based spectroscopy in order to cost-effectively maximise their usefulness [31][32][33][34]. For example, Li, et al. [32] used hyperspectral data and laboratory analysis to characterise nitrogen and phosphorus concentrations in wetland plants. The majority of these studies have, however, focused on grassland and dryland woody species [35]. Less effort has been directed towards foliar nutrients like Cu, B, and S in wetland settings [36]. Although traditional laboratory methods have been routinely used to provide the information that is required to guide wetland conservation and management [37], these methods are constrained by their reliance on labour-intensive, time-consuming and costly field collection and analysis of samples [38,39].
Hyperspectral RS data have proven capable of addressing these constraints and of facilitating detailed characterisation of wetland plant nutrient concentrations [34]. Although these datasets are useful, their usefulness is compromised by restricted temporal and spatial coverage, with airborne hyperspectral sensors offering little for large-scale applications compared to their space-borne equivalents because they are prohibitively expensive. This limitation makes hyperspectral data inflexible, inefficient, difficult to access and inappropriate for large-scale mapping. To adequately monitor nutrient concentrations in wetland plants, a monitoring system that can be applied over optimum spatial and temporal scales is required. The use of freely available satellite imagery is a viable option as it enables cost-effective mapping and regular monitoring of inaccessible and extensive wetland areas [40,41]. Overall, the literature underscores the immense potential of medium resolution (M-res) datasets such as WorldView-2, WorldView-3 and Sentinel-2 (S-2) in detecting variations in plant nutrient concentrations [1,42,43,44] compared to the costly high-resolution sensors [45].
In the last few years, S-2 has been one of the M-res datasets that has attracted a lot of interest from researchers. Although launched as recently as 2015, it has already proven to be highly useful for the monitoring of vegetation quality and macronutrients such as nitrogen and phosphorus [46][47][48]. Its short revisit period of 5 days, spatial resolutions of 10-60 m, systematic global acquisition and open access policy have made it the workhorse of local-level, near-real-time plant nutrient monitoring. In addition to the visible and near-infrared (NIR) wavelengths, S-2 includes three bands in the red edge region centred at 705, 740, and 783 nm, which are suitable for the characterisation of various plant traits and for vegetation monitoring, including macronutrients [19,26,48,49,50]. For instance, S-2 has been used to characterise the seasonal and spatial variations of the nitrogen and phosphorus ratio in the alpine grasslands of China at optimal accuracies of R² = 0.49 and 0.59 and root mean square errors (RMSE) of 2.27 and 3.11 for the dry and wet seasons, respectively [50]. Specifically, S-2 has 13 spectral bands comprising four bands, six bands and three bands with spatial resolutions of 10, 20 and 60 m that are centred at (1) 496, 560, 665 and 835 nm, (2) 703, 740, 783, 865, 1610 and 2202 nm, and (3) 443, 945 and 1373 nm, respectively [51]. Because S-2 data can be accessed free of charge (https://scihub.copernicus.eu/25, accessed on June 2020), it has offered substantial opportunities for the mapping and monitoring of vegetation micronutrients and macronutrients, especially in countries with limited access to spatial data due to financial and logistical constraints. However, most of the studies that have sought to characterise foliar nutrients using S-2 remotely sensed data have tended to focus selectively on a narrow range of the primary nutrients, notably N, P, and K.
This drawback argues for the need to explore ways through which the mapping of other foliar nutrients, such as Mg, Cu, B and S, can be improved.
Evaluation of the nutritional requirements of different plants is becoming increasingly necessary because the scientific community is obliged to continue providing better means of supporting the realisation of those Sustainable Development Goals (SDGs) that have a bearing on the sustainable utilisation of the finite resources at our disposal. This position is supported by the emergence of a community of practice that is informed by the potential benefits of systematically utilising the rich RS datasets at our disposal together with the techniques that science has so far provided [50,52]. Accomplishing this is possible because RS provides robust techniques with demonstrated capabilities of providing the information required for the meaningful realisation of the SDGs, e.g., machine learning algorithms (MLAs) such as the random forests (RF) ensemble [53][54][55]. The use of MLAs, such as RF amongst others, in estimating nitrogen is a novel advancement that involves a fusion of several spectral vegetation indices in mapping leaf nutrient variations [56].
Although there is no machine learning method that is universally appropriate for estimating vegetation quality, several studies have evaluated the performance of the RF regression model (RF) in predicting the leaf nitrogen content of wetland vegetation [36,57].
RF has been demonstrated to be one of the most robust and widely used algorithms for this purpose [58][59][60][61][62]. The RF ensemble algorithm has been widely used in numerous studies to estimate plant foliar nutrient variations. This is so because, apart from being able to discern subtle variations in numerous variables, RF is able to identify the complex relationships between auto-correlated descriptors [63,64]. When implemented properly, RF offers an additional advantage in that, regardless of the sample size of the dataset used, it has a bootstrapping mechanism that is applied when drawing the training data points used to build the trees of each model [65]. It is in this regard that RF was perceived to be a suitable algorithm for estimating foliar nutrient concentrations in the floodplain wetlands of northern KwaZulu-Natal, South Africa. The underlying research question of the study was: can Sentinel-2 data in concert with RF be used to characterise the concentrations of N, K, Ca, Mg, P, S, Zn, B and Cu in the leaves of wetland natural vegetation? We attempted to answer this question by calculating Sentinel-2-derived vegetation indices for different herbaceous species in the study area. The results showed that there is an urgent need to explore techniques that can be used to provide unitary perspectives on how these and other challenges can be addressed. This investigation attempts to do this by providing a case illustration of how RF regression modelling can be used to (1) characterise the N, K, Ca, Mg, P, S, Zn, B and Cu foliar concentrations in different wetland vegetation species, (2) determine the key wavelengths that are important predictors of the foliar variations in these chemicals and (3) characterise seasonal variations in wetland plant leaf nutrient content.

Study Area
The uMfolozi floodplain system (UFS) is located in St Lucia, a town that is situated at 28°22′S and 32°25′E in Mtubatuba Local Municipality, South Africa. The uMfolozi River consists of two main tributaries: the Black uMfolozi, which rises at around 1500 m asl. in the north, and the White uMfolozi, which rises at an altitude of 1620 m asl. These two tributaries converge around 50 km west of the mouth of the uMfolozi River. Because the catchment falls within the uMfolozi-Hluhluwe Nature Reserve, the bulk of the catchment comprises natural vegetation cover. Grasslands make up about 60% of the natural vegetation, with 21% and being classified as thicket and bush respectively. In less than a quarter of the uMfolozi catchment, natural vegetation has given way to the dominant agricultural and commercial forestry land uses. The uMfolozi floodplain is predominantly used for sugar cane cultivation (65%), with the remainder designated as a protected area under the iSimangaliso Park (previously known as St. Lucia Wetlands Park).
The uMfolozi River catchment drains 11,068 km² of northern KwaZulu-Natal on Southern Africa's eastern seaboard. The river's surface geological layout consists of Lebombo rhyolite rock in the east and Zululand/Maputaland rocks calcarenite, calcareous, limestone, and conglomerate formations in the north and south of the Indian Ocean respectively [66]. About 80% of the precipitation occurs in the summer months, between November and April [67]. Mean annual catchment precipitation ranges between 1288 mm/a in the coastal town of St. Lucia and 667 mm/a in the mid-upper catchment of the uMfolozi Game Reserve, with about 914 mm/a occurring in the upper catchment in Nongoma. Mean annual potential evapotranspiration is normally more than double the precipitation, with average amounts approximating 1800 mm/a [68]. Figure 1 shows the geographic location of the study area in KwaZulu-Natal province, South Africa.

Plant Leaf Sampling
The datasets comprised leaf samples of major crops (Musa acuminata, Ipomoea batatas, sugarcane and Colocasia esculenta) and dominant wetland vegetation species (Phragmites australis, Cyperus papyrus and Cynodon dactylon). Sampling points were systematically generated to cover the study area's footprint, and leaves were likewise selected following a sampling procedure that was designed to provide representative coverage of the sampling universe. Leaf samples were collected on 12 March 2017 (wet/summer season) and 22 July 2017 (dry/winter season). During the field investigation, the geographic positions of identifiable herbaceous species were recorded in spreadsheet inventories for further analysis and detailed verification of their geographic positions and species type. A Garmin Montana 650 global positioning system (GPS) with a rated positional accuracy of ±3 m was used to measure the location of each plant from which leaves were harvested. This was followed by randomised sampling of leaves at different crown levels (top, middle, and bottom) to avoid bias. Overall, a total of 76 leaf samples (38 for crops and 38 for vegetation) were collected in summer and 85 samples (40 for crops and 45 for vegetation) were collected in winter, with about 150 g of leaf material collected from each crop/plant. Leaf samples were then packed in labelled ziplock plastic bags and stored in a cooler box to preserve them during transportation to the laboratory. In the laboratory, the samples were oven-dried at 70 °C for at least 24 h and milled to particle sizes of <0.5 mm.

Chemical Analyses
In this study, six major elements required in large amounts by plants (nitrogen, phosphorus, potassium, calcium, magnesium and sulphur) [17] and three micronutrient elements required in trace amounts (boron, zinc and copper) [69] were chosen. The dry oxidation (Dumas) method was used to determine nitrogen [67] by igniting each leaf sample in oxygen at 950 °C to produce carbon dioxide, nitrogen gas and oxides of nitrogen.
These gases were passed through silvered cobalt oxide and a column of copper at 650 °C to reduce the nitrogen oxides to nitrogen gas and to remove excess O₂. After removal of water vapour and CO₂, the N₂ gas was finally separated from other gases using gas chromatography, based on a helium carrier gas and detection by a thermal conductivity detector. The instrument was calibrated against a pure compound of known composition following standard procedures that have been recommended for this purpose. The compound chosen as the calibration standard was phenylalanine, which contains 8.48% N.
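The 8.48% N content quoted for the calibration standard can be checked from the molecular formula of phenylalanine (C9H11NO2). The following is a minimal sketch of that arithmetic; the rounded atomic masses and the helper name `mass_percent` are assumptions of this illustration and are not part of the laboratory procedure described above.

```python
# Standard atomic masses in g/mol, rounded to three decimals (assumed values).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# Phenylalanine, C9H11NO2: element -> number of atoms per molecule.
PHENYLALANINE = {"C": 9, "H": 11, "N": 1, "O": 2}


def mass_percent(formula, element):
    """Return the mass percentage of `element` in a compound."""
    total = sum(ATOMIC_MASS[el] * count for el, count in formula.items())
    return 100.0 * ATOMIC_MASS[element] * formula[element] / total


n_percent = mass_percent(PHENYLALANINE, "N")
print(f"N in phenylalanine: {n_percent:.2f}%")
```

The same helper can be applied to any other pure compound that might be used as a calibration standard.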
An aliquot of the digest solution was analysed on an inductively coupled plasma optical emission spectrometric (ICP-OES) instrument (Agilent Technologies, United States) for the determination of K, Ca, Mg, P, S, Zn, B and Cu [70]. This is an Agilent 725 (700 series) simultaneous instrument, which determines all the elements and wavelengths simultaneously. Thus, several of the elements may be determined at more than one wavelength, allowing confirmation of the values with no increase in analysis time or consumption of digest solution. However, S was determined separately from the other elements after purging the optics of the instrument with Ar gas. This is due to the low wavelengths (<190 nm) used for detecting S and the problems caused by oxygen in the air at these very low wavelengths if the system is not purged. Each element was measured at one or two appropriate emission wavelengths that were chosen for their high sensitivity and lack of spectral interferences.

Table 1 describes the characteristics of the S-2 images that were used and the numbers of sample sites from which leaf samples were collected during the wet (12 March 2017) and dry (22 July 2017) seasons. The S-2 images were preferred because of their optimum spatial and spectral resolutions for leaf-nutrient characterisation and the availability of footprint coverage of the study area. The gaps between sample collection dates and image acquisition dates were restricted to 15 and 4 days for the wet and dry periods, respectively (Table 1), so that the images and the samples represented the same phenological stages and therefore similar leaf nutrient concentrations. The images were atmospherically corrected to surface reflectance using the Sen2Cor plug-in tool provided in the Sentinel Application Platform (SNAP) toolbox, as illustrated in the literature [71][72][73].
The S-2 MSI has 13 spectral bands: four at 10 m (blue (band 2), green (band 3), red (band 4) and NIR 1 (band 8)), six at 20 m (RE 1 to 3 (bands 5, 6 and 7), NIR 2 (band 8A), SWIR 1 (band 11) and SWIR 2 (band 12)) and three at 60 m spatial resolution. All of the 10 m and 20 m bands were used in mapping foliar nutrients in this study, whereas bands 1, 9 and 10, which have a spatial resolution of 60 m, were excluded because they are not suitable for vegetation applications. The 10 m bands were all resampled to a spatial resolution of 20 m before estimating the foliar nutrients. The images were classified in ArcGIS 10.6 by using 70% of the data that was collected during the field investigation for signature compilation, with the remaining 30% being reserved for classification accuracy assessment. Vegetation indices, including several red edge-based indices, were then computed and used with the spectral bands to estimate leaf nutrient concentrations in wetland vegetation and crops (Table 2).
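The band selection and resampling step described above can be sketched as follows. This is only an illustration: the band-to-resolution table follows the text, but the 2x2 block averaging in `aggregate_2x2` is an assumed resampling method (the study does not state which one was used), and the function and variable names are hypothetical.

```python
# Native spatial resolution (m) of each S-2 MSI band, as listed in the text.
S2_RESOLUTION = {
    "B1": 60, "B2": 10, "B3": 10, "B4": 10, "B5": 20, "B6": 20, "B7": 20,
    "B8": 10, "B8A": 20, "B9": 60, "B10": 60, "B11": 20, "B12": 20,
}

# Keep only the 10 m and 20 m bands; the 60 m bands (1, 9 and 10) are excluded.
used_bands = [band for band, res in S2_RESOLUTION.items() if res != 60]

# The 10 m bands must be brought to the common 20 m grid.
bands_to_resample = [band for band in used_bands if S2_RESOLUTION[band] == 10]


def aggregate_2x2(grid):
    """Average each 2x2 block of a 10 m grid to obtain a 20 m grid.

    Assumed method for illustration; `grid` is a list of equal-length rows
    with an even number of rows and columns.
    """
    rows, cols = len(grid), len(grid[0])
    return [
        [
            (grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4.0
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


# Hypothetical 4x4 patch of 10 m reflectance values.
patch_10m = [
    [0.10, 0.10, 0.20, 0.20],
    [0.10, 0.10, 0.20, 0.20],
    [0.30, 0.30, 0.40, 0.40],
    [0.30, 0.30, 0.40, 0.40],
]
patch_20m = aggregate_2x2(patch_10m)
print(used_bands, bands_to_resample, patch_20m)
```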
Although a large and growing body of literature has illustrated that vegetation indices outperform individual wavebands in estimating vegetation attributes [80][81][82], in this study we combined the vegetation indices with spectral bands, considering that very few of the aforementioned studies were conducted in wetlands. Vegetation indices were used in this study because of their robustness, as illustrated in the literature [50,56,57,80,83,84,85]. They derive their robustness from combining two or more wavebands, often from two different regions of the electromagnetic spectrum. They have also been observed to be capable of circumventing the effects of atmospheric noise, view/sun angle, soil background and topography while remaining sensitive to vegetation spectral and temporal attributes [86]. This and other considerations explain why vegetation indices derived from Sentinel-2 wavebands were used in this investigation. The processes that were used for this purpose are summarised in Figure 2, which shows how satellite image data and field data were collected and combined in RF regression modelling.

Estimation of Nutrient Concentrations Using Random Forests Regression
RF was performed in the R statistical package to estimate the concentrations of nutrients in crops and wetland vegetation. RF is an ensemble model that is characterised by a large number of trees [59,87]. Each tree works by repeatedly splitting the data (remotely sensed data in this study) into increasingly homogeneous subsets at each node to produce a series of terminal nodes. In this study, the training of the regression model was based on 70% of the field data, with new estimates determined by sending the input down each tree and taking the means of the response variables (nutrient contents); the remaining 30% was used for accuracy assessment, as explained earlier. Each regression tree in the RF algorithm is built using a subset of training samples that are independently selected with replacement from the original samples [88]. A subset of a few variables was randomly selected to determine each split in order to increase the robustness of the model by increasing diversity amongst trees and avoiding overfitting [88], and the RF predictor was finally constructed by taking the average over all trees. The samples that are not utilised to grow a tree are referred to as out-of-bag (OOB) data [59], which the algorithm uses to estimate accuracy by using the difference in the mean square errors to compute the OOB error estimate [71]. The explanatory power of each variable is determined by a Gini coefficient, which measures the total decrease in node impurity (weighted by the probability of reaching that node) averaged across all trees. In this study, two hyperparameters were used to tune the models: the number of trees (ntree) and the number of variables randomly sampled as candidates at each split (mtry).
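The sampling mechanics described above (bootstrap sampling with replacement, out-of-bag samples, random selection of mtry candidate variables, and averaging over trees) can be illustrated with a short sketch. This is not the R implementation used in the study: the function names, the feature list, the sample size and the mtry value are all hypothetical, and no trees are actually grown here.

```python
import random


def bootstrap_sample(n_samples, rng):
    """Draw n_samples indices with replacement; return (in-bag, out-of-bag)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    in_bag_set = set(in_bag)
    # Samples never drawn for this tree form its out-of-bag (OOB) set.
    oob = [i for i in range(n_samples) if i not in in_bag_set]
    return in_bag, oob


def candidate_features(features, mtry, rng):
    """Randomly pick the mtry variables considered at one split."""
    return rng.sample(features, mtry)


def forest_prediction(tree_predictions):
    """The RF regression estimate is the mean of the per-tree predictions."""
    return sum(tree_predictions) / len(tree_predictions)


rng = random.Random(0)          # fixed seed so the sketch is reproducible
n_train = 53                    # hypothetical number of training samples
features = ["NDVI", "REP1", "B7", "B2", "SR1", "MTCI1"]  # hypothetical inputs

in_bag, oob = bootstrap_sample(n_train, rng)
split_candidates = candidate_features(features, mtry=2, rng=rng)
print(len(set(in_bag)), "distinct in-bag samples;", len(oob), "OOB samples")
print("candidates at this split:", split_candidates)
```

Because each tree sees a different bootstrap sample, the OOB samples of that tree provide an internal error estimate without a separate validation set.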
It should be pointed out that adjusting these hyperparameters did not significantly change the results; hence they were held constant across the various models, considering the variability in the number of samples in stages 1 to 3 illustrated in Table 3. Selection of the most important model variables needed for accurate nutrient estimation was accomplished by implementing the backward feature elimination method [89,90]. RF provides various measures of variable importance, such as importance based on the Gini coefficient and importance based on permutation. The permutation method is considered superior to other approaches because it uses OOB assessments [59]. RF assesses the importance of different variables by using the mean decrease in accuracy. A larger mean decrease indicates greater importance for that particular variable, while low values indicate a lower influence in the model. The backward elimination method works by starting with all the input predictor variables and gradually removing the variables with the least relative effect. Table 3 describes the experiments that were performed with S-2 MSI data to estimate nutrient concentrations in crops and wetland vegetation. More details on the RF regression ensemble are provided elsewhere [50,59,91,92].
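The backward feature elimination procedure described above can be sketched generically as follows. This is a minimal illustration under stated assumptions: `importance_fn` and `error_fn` are hypothetical callables standing in for a refitted RF model's variable importance and its OOB error, and the toy scores below are invented solely to make the example run.

```python
def backward_elimination(features, importance_fn, error_fn, min_features=1):
    """Iteratively drop the least important variable and keep the best subset.

    importance_fn(subset) -> dict mapping each variable to an importance score.
    error_fn(subset)      -> model error for that subset (lower is better).
    """
    current = list(features)
    best_subset, best_error = list(current), error_fn(current)
    while len(current) > min_features:
        scores = importance_fn(current)
        weakest = min(current, key=lambda name: scores[name])
        current.remove(weakest)
        error = error_fn(current)
        if error <= best_error:
            best_subset, best_error = list(current), error
    return best_subset, best_error


# Toy stand-ins (hypothetical values, not results from the study).
TOY_IMPORTANCE = {"NDVI": 0.90, "REP1": 0.80, "B7": 0.50, "B2": 0.10, "B3": 0.05}
INFORMATIVE = {"NDVI", "REP1", "B7"}


def toy_importance(subset):
    return {name: TOY_IMPORTANCE[name] for name in subset}


def toy_error(subset):
    # Losing an informative variable costs 10; carrying a noise variable costs 1.
    chosen = set(subset)
    return 10 * len(INFORMATIVE - chosen) + len(chosen - INFORMATIVE)


selected, selected_error = backward_elimination(
    list(TOY_IMPORTANCE), toy_importance, toy_error
)
print(selected, selected_error)
```

In practice the two callables would refit the model on each candidate subset, which is why the procedure is costly when many indices are screened.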
To estimate nutrient concentrations, data analysis was done in three stages (stages 1, 2 and 3). Model input variables varied in each analytical stage. For stage 1, nutrient concentrations were estimated using Sentinel-2 MSI bands only (bands 2, 3, 4, 5, 6, 7, 8, 8A, 11 and 12), and additional vegetation indices were computed and used as a stand-alone dataset to estimate nutrient concentrations. The following vegetation indices were computed: chlorophyll green (Cl.green), chlorophyll red edge, red edge position (REP 1, 2 and 3), simple ratio (SR 1, 2 and 3), the NDVI, MERIS terrestrial chlorophyll index 1 (MTCI.1), modified normalised difference vegetation indices (nDVI, i.e., nDVI_Bγi_Bγj), as well as modified simple ratio vegetation indices (sR, i.e., sR_Bγi_Bγj), where Bγi and Bγj are different Sentinel-2 MSI spectral bands. The modified indices were computed based on all possible Sentinel-2 band combinations (10 spectral bands). The S-2 MSI bands and the vegetation indices were each used as standalone model input variables for the wet (March) and dry (July) seasons (Tables 4 and 5). This was undertaken to determine the ability of Sentinel-2 MSI data to detect the seasonal differences in leaf nutrient concentrations. For stage 2, we detected and characterised the year-round nutrient concentrations using pooled seasonal data, which was categorised into two datasets (wetland vegetation and crops). In addition, spectral bands and vegetation indices were used separately to estimate the foliar nutrients (Table 6). For stage 3, wet and dry season datasets were pooled into one dataset to assess the robustness of S-2 MSI data in detecting and characterising nutrient concentrations across all seasons (Figure 3). Specifically, vegetation indices were used as standalone model input variables (Figure 3).
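Computing the modified normalised difference (nDVI) and simple ratio (sR) indices over all band combinations can be sketched as below. The order of the two bands in each index is an assumption of this sketch (the longer-wavelength band is placed first, so that nDVI_B8_B4 matches the conventional NDVI); the exact definitions used in the study are those of Table 2. The pixel reflectances are hypothetical.

```python
from itertools import combinations

# The ten S-2 MSI bands used in the study, ordered by increasing wavelength.
BANDS = ["B2", "B3", "B4", "B5", "B6", "B7", "B8", "B8A", "B11", "B12"]


def normalised_difference(a, b):
    """(a - b) / (a + b), or NaN when the denominator is zero."""
    return (a - b) / (a + b) if (a + b) != 0 else float("nan")


def simple_ratio(a, b):
    """a / b, or NaN when the denominator is zero."""
    return a / b if b != 0 else float("nan")


def pairwise_indices(reflectance):
    """Compute nDVI and sR for every unordered pair of bands.

    Assumption: the longer-wavelength band (bj) is the first operand.
    """
    indices = {}
    for bi, bj in combinations(BANDS, 2):
        long_wave, short_wave = reflectance[bj], reflectance[bi]
        indices[f"nDVI_{bj}_{bi}"] = normalised_difference(long_wave, short_wave)
        indices[f"sR_{bj}_{bi}"] = simple_ratio(long_wave, short_wave)
    return indices


# Hypothetical surface reflectance for one vegetated pixel.
pixel = {
    "B2": 0.03, "B3": 0.06, "B4": 0.05, "B5": 0.10, "B6": 0.28,
    "B7": 0.36, "B8": 0.40, "B8A": 0.41, "B11": 0.20, "B12": 0.10,
}
vi = pairwise_indices(pixel)
print(len(vi), "indices; NDVI-like nDVI_B8_B4 =", round(vi["nDVI_B8_B4"], 3))
```

Ten bands give 45 unordered pairs, so this produces 45 nDVI and 45 sR candidates per pixel, which is why a variable selection step is needed before modelling.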
All estimation models were evaluated by using the variance explained by the mean squared residuals (MSR) (Var Expl (%)), the root mean square error (RMSE) and the RMSE%. The RMSE% and the R² were scaled between 0 and 100. To compute the RMSE%, all RMSEs from each model were normalised using the mean of each variable and then expressed as percentages [93,94]. The RMSE% has been widely used in the literature to compare different variable estimations [94][95][96]; hence it was adopted and used in this study. The accuracies (RMSE% and, in some instances, R²) of the training datasets were presented and used to conduct the Mann-Whitney U and the Student's t-tests. The Mann-Whitney U independent samples test was used to test whether there were significant differences at α = 0.05 between the estimation accuracies (R² and RMSE%) derived in summer and those derived in winter, for crops and wetland vegetation, respectively (Table 6). The Mann-Whitney U test was used because these data deviated significantly from the normal distribution at α = 0.05 based on the Kolmogorov-Smirnov test. Similarly, a Student's t-test of independent samples was used to assess whether the estimation accuracies derived from crops were significantly different from those derived from wetland vegetation (Table 7) at α = 0.05 (Table 8). The Student's t-test was used because these data did not deviate significantly from the normal distribution based on the Kolmogorov-Smirnov test. The raster calculator tool in ArcGIS 10.6 was used to map the spatial and temporal distributions of nutrients by utilising the RF regression model outputs and essential variables (NDVI, REP1, and band 7) to characterise the studied nutrients. Figure 2 summarises the methods that were used to determine the accuracies of these calculations.
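The accuracy measures described above can be written out as a short sketch: the RMSE%, i.e., the RMSE divided by the mean of the observed variable and expressed as a percentage, alongside an R² computed here as 1 - SSres/SStot. The R² definition is an assumption of this sketch (the study may have used the squared correlation or the variance explained reported by the RF software), and the observed and predicted values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_percent(observed, predicted):
    """RMSE normalised by the mean of the observed values, as a percentage."""
    mean_observed = sum(observed) / len(observed)
    return 100.0 * rmse(observed, predicted) / mean_observed


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SSres / SStot (assumed definition)."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical nutrient concentrations (mg/kg): laboratory vs. model estimate.
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]

print(f"RMSE  = {rmse(observed, predicted):.3f} mg/kg")
print(f"RMSE% = {rmse_percent(observed, predicted):.2f}")
print(f"R2    = {r_squared(observed, predicted):.3f}")
```

Normalising by the mean is what makes errors comparable between nutrients whose concentrations differ by orders of magnitude, such as calcium and copper.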
Tables 4 and 5 show the seasonal accuracy levels (RMSE, R², and RMSE%) for the different nutrients that were investigated based on the raw bands and vegetation indices. The RMSE, which depicts the standard deviation of the residuals, shows that the regression models based on vegetation indices performed better than those based on raw bands. Using raw spectral bands, all of the RMSE values for both crops and vegetation in summer and winter were greater than those of the vegetation indices (Tables 4 and 5). As a result, the vegetation index results were used for further analysis in this study because of their lower prediction errors (RMSE values).

Comparison of Single Bands and Vegetation Indices in Estimating Different Nutrients in Summer and Winter Seasons
For both vegetation and crops, high R² and low RMSE% were observed for all nutrients in winter (Tables 4 and 5). For instance, calcium (Ca) estimation in crops in summer exhibited an R² of 0.73 and an RMSE of 5889.75 mg/kg based on spectral bands, and was optimally estimated at an R² of 0.95 and an RMSE of 39284.56 mg/kg (RMSE% = 12) based on vegetation indices (Table 4). Meanwhile, in winter Ca exhibited an R² of 0.72 and an RMSE of 2289.59 mg/kg (RMSE% = 36) based on bands only. These accuracies improved to an R² of 0.81 and an RMSE of 1582.82 mg/kg (RMSE% = 28) when vegetation indices were used. A similar trend was observed when estimating nutrients in wetland vegetation (Table 5). Specifically, poor accuracies (R² = 0.73, RMSE = 5889.75 mg/kg, RMSE% = 52) were attained when estimating Ca in wetland vegetation using only spectral bands during the summer. These accuracies improved when vegetation indices were used, with an R² of 0.95, an RMSE of 3924.56 mg/kg and an RMSE% of 12. A similar trend was also observed in estimating Ca during the winter season, when bands only yielded an R² of 0.56, an RMSE of 1684.49 mg/kg and an RMSE% of 29. Again, an improvement was observed when estimating Ca using the vegetation indices (R² = 0.98, RMSE = 1498.356 mg/kg and RMSE% = 25) during winter. The summer-winter RMSE percentages were higher in vegetation compared to crops, and the differences in average values for both were significant (Table 6). Tables 6-8 show seasonal mean comparisons of R² and RMSE% for the nutrient estimations and seasonal nutrient estimation accuracies for crops and vegetation. Table 8 shows R² values and RMSE% for vegetation and crop types using pooled data.

Dry and Wet Season Crop and Vegetation Nutrient Estimation Using Pooled Data
In addition, the annual estimations of nutrient concentrations presented in Table 7 were used to calculate the statistics provided in Table 8. A higher mean R² for nutrients across plant types was observed for vegetation (0.94) when compared to a mean R² of 0.84 for crops (Table 8). The differences in the mean R² values were significant. A higher RMSE% (27) was observed for crops when compared to what was observed for vegetation (15). The difference between the vegetation and crop RMSE values was significant. In all cases, measured and estimated nutrient concentrations were strongly correlated, with R² ranging from 0.98 for boron to 0.75 for magnesium. Although boron and magnesium showed a good correlation, both had high RMSE percentages. Figure 2 is a composite summary of the performance of the statistical models that were used to relate foliar nutrients measured in the laboratory and data from S-2 images. Figure 3 shows the best fits for the model (in terms of R² and RMSE%) arranged by performance. As can be noted, the estimation performances for the studied nutrients were strong and very consistent, with R² values ranging from 0.81 to 0.93. The RMSE%, on the other hand, varied considerably between nutrients. Lower RMSE values suggest a better model estimation fit. The RMSE percentages for different nutrients ranged from 14% to 29%. Zinc, potassium, sulphur, copper, and nitrogen were found to have low RMSE percentages ranging from 14% to 17%. The RMSE percentages of calcium, boron, and magnesium were all high, with boron having the highest at 29%. The overall findings revealed that several nutrients, namely calcium, boron, and magnesium, were poorly estimated, as evidenced by their RMSE% values.

Variable Importance Selection
The RF model ranked the variables by their estimation capabilities. Figure 3 shows the important variables derived in estimating N, K, Ca, Mg, P, S, Zn, B and Cu. REP1 and NDVI were the vegetation indices that most frequently yielded optimal models for estimating foliar nutrients, as indicated by their high %IncMSE in Figure 4. Figure 5 illustrates the estimated spatial distributions of sulphur, potassium, magnesium, copper, nitrogen, phosphorus, zinc and boron. The lowest values correspond to areas with crop farming (with dark red indicating waterbodies), whereas higher values (dark green) represent the most developed vegetation (forest and grassland). Nutrients have high concentrations in the eastern section of the study area, except for magnesium, which showed high concentrations in the sugarcane-dominated area.
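The idea behind a permutation-based importance such as %IncMSE can be sketched as follows: shuffle one predictor, re-estimate, and record how much the mean squared error grows. This is a simplified illustration and not the implementation used in the study, which is not described in this section; the helper names, the toy predictor and the sample data are all hypothetical, and the raw increase in MSE is returned rather than a scaled percentage.

```python
import random


def mean_squared_error(predict, rows, targets):
    """MSE of a prediction function over a list of predictor rows."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def increase_in_mse(predict, rows, targets, column, n_repeats=10, seed=0):
    """Average rise in MSE after shuffling one predictor column.

    A large value means the model relies heavily on that predictor.
    """
    rng = random.Random(seed)
    baseline = mean_squared_error(predict, rows, targets)
    increases = []
    for _ in range(n_repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        # Rebuild each row with only the chosen column replaced.
        permuted = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)]
        increases.append(mean_squared_error(predict, permuted, targets) - baseline)
    return sum(increases) / n_repeats


# Hypothetical stand-in for a fitted model: it uses column 0 (say, REP1)
# and ignores column 1 (say, an uninformative band).
def toy_model(row):
    return 2.0 * row[0]


rows = [[1.0, 9.0], [2.0, 3.0], [3.0, 7.0], [4.0, 1.0], [5.0, 5.0], [6.0, 2.0]]
targets = [toy_model(r) for r in rows]

for name, col in (("column 0", 0), ("column 1", 1)):
    print(name, increase_in_mse(toy_model, rows, targets, col))
```

Because the toy model ignores column 1, shuffling it leaves the error unchanged, whereas shuffling column 0 cannot lower the error below the baseline. In a real RF workflow the same comparison is usually made on out-of-bag or held-out samples.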

Discussion
In this study, we used the RF algorithm and S-2 data on both wetland vegetation and crops and across seasons to estimate the concentrations of N, K, Ca, Mg, P, S, Zn, B and Cu. The results show that: (a) the RF model using S-2 can estimate foliar concentrations of several nutrients and (b) S-2 can estimate nutrients across plant types and seasons. S-2-derived vegetation indices such as NDVI, together with bands 2 and 7, performed well in estimating nutrients in crops and vegetation. Seasonal characterisation of nutrients was also successful, which could be attributed to the variability of photosynthetic pigments such as chlorophyll [36]. However, the RF model performed poorly in estimating magnesium and sulphur in the summer season. It also performed poorly in estimating calcium, magnesium, phosphorus and boron in wetland vegetation across the seasons, as demonstrated by high RMSE% values. This can be attributed to the easy leaching and increased mobility of these nutrients, which might have decreased their concentrations during the high rainfall period, hence the weak correlation with S-2 data. Phenology is also an important factor in this result because most vegetation indices, such as NDVI and especially red edge-based indices, rely on the vigour and greenness of the vegetation. Osco, et al. [97] also found that nutrients like Mg, S, P, K and Ca presented inferior performances compared to nutrients such as nitrogen and zinc. Another contribution of this work is that it was possible to identify wavelengths and spectral regions that contributed most to nutrient estimation.
A combination of vegetation indices and spectral bands was found to be more robust than the raw spectral bands alone. As outlined above, the inclusion of vegetation spectral indices proved to be important in the evaluation of most of the studied nutrients [98][99][100]. The indices and bands that were key to estimating nutrient concentrations include the red edge position, NDVI and band 2 (Figure 2). This observation implies that different nutrients can be estimated by different variables. In this study, the key variable for estimating most of the nutrients, including nitrogen, potassium, calcium, boron and copper, was the red edge position 1 computed from the NIR and red edge bands, followed by NDVI for estimating P and S. Similar findings were reported for savannah grass in Bushbuckridge, Mpumalanga, South Africa, where canopy nitrogen was correlated with the NIR spectral region [101]. Similar findings were also reported for North American forests, where canopy nitrogen was correlated with both the NIR spectral region and NIR-based vegetation indices including NDVI [102,103].
The sensitivity of red edge bands to potassium, calcium, copper and nitrogen was not surprising. Studies by other researchers confirm the strong correlation between nitrogen concentrations and red edge bands [18,51,104,105]. The red edge is considered a surrogate measure of vegetation chlorophyll content [51,106,107]. It was therefore reasonable in this study to expect magnesium concentration to be well estimated from the red edge bands. Magnesium is located at the centre of the chlorophyll molecule and is regarded as the activator of some enzymes in plants [108]. Previous studies [109,110] found the NIR region to be the best for estimating magnesium concentrations. NDVI, computed from the red and NIR bands, was the key variable in estimating phosphorus and sulphur concentrations. This is similar to the findings of Lisboa, et al. [111], who found NDVI to be a useful tool in estimating nitrogen and phosphorus concentrations in sugarcane crops. Several studies have used the NIR spectral region and NDVI to study heavy metals in plants [112][113][114].
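For readers unfamiliar with the two indices discussed above, the following sketch shows how they are commonly computed from Sentinel-2 surface reflectance. NDVI uses the red (B4) and NIR (B8) bands. For the red edge position, the linear interpolation form often applied to Sentinel-2 (B4, B5, B6 and B7) is assumed here; the exact REP1 formulation used in the study is not given in this section, so this should be read as an illustration only. Function names and reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalised difference vegetation index from NIR (B8) and red (B4) reflectance."""
    return (nir - red) / (nir + red)


def red_edge_position(b4, b5, b6, b7):
    """Red edge position in nm by linear interpolation (assumed formulation).

    705 nm is the approximate centre of B5 and 35 nm is the spacing
    between the B5 (705 nm) and B6 (740 nm) band centres.
    """
    inflection_reflectance = (b4 + b7) / 2.0
    return 705.0 + 35.0 * (inflection_reflectance - b5) / (b6 - b5)


# Hypothetical reflectance values for a healthy green canopy.
print(round(ndvi(nir=0.5, red=0.1), 4))
print(red_edge_position(b4=0.05, b5=0.10, b6=0.30, b7=0.35))
```

A shift of the red edge position towards longer wavelengths is generally associated with higher chlorophyll content, which is consistent with the sensitivity of red edge bands to nitrogen noted above.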

Estimation of Nutrients Using the RF Model
This study has shown the utility of the RF model using S-2 in estimating concentrations of N, K, Ca, Mg, P, S, Cu, Zn and B. The technique yielded high coefficients of determination ranging between 0.75 and 0.98. The technique also exhibited low RMSE% for most nutrients except for magnesium (38%), boron (43%) and calcium (30%), which emerged as difficult to detect using the RF model and S-2. The usefulness of RF regression with remotely sensed images has also been demonstrated by its ability to estimate sugarcane leaf nitrogen levels from Hyperion images [104]. Multiple studies have demonstrated that RF models often perform remarkably well in different fields of scientific research, including the estimation of nitrogen content [115][116][117][118]. In this study, combining the NIR spectral region and NIR-based vegetation indices (NDVI) with RF proved to be a suitable approach for estimating the studied nutrients. RF also proved to be suitable in relation to NDVI, REP1 and band 7 in developing a relationship between nutrient variations and land use land cover types (Figure 4) except for magnesium, which exhibited high concentrations in sugarcane farms where the land use land cover-nutrient effect variation was consistent. This is attributed to the antagonistic effects between Ca and Mg on the one hand and K on the other in sugarcane, where soils with high Ca and Mg can lower leaf K and vice versa.

Crops and Vegetation Nutrients Seasonal Estimations
Most nutrients exhibited significant relationships (R² > 0.7) between measured and estimated concentrations across seasons and plant types, with a few showing a weak relationship (R² < 0.5), i.e., calcium, magnesium, phosphorus and boron. This implies that S-2 has potential for estimating concentrations of selected chemical elements across the seasons and plant types. Generally, in terms of R² accuracies in estimating foliar nutrient concentrations, there were no significant differences within nutrients from crops and vegetation between the summer and winter seasons.
Similar findings by Gama, et al. [119] confirmed a weak relationship between leaf reflectance and concentrations of phosphorus, potassium and calcium. Poor estimations of magnesium, copper and sulphur were observed in summer, which implies difficulties in estimating such nutrients in the wet season. Magnesium, copper and sulphur are nutrients that are highly affected by reduction and oxidation (redox) processes in concert with shifting water levels, which determine their seasonal concentration levels in wetland soils and water. Their poor detection in this study could therefore be explained by seasonal variations in the amount of available water, which regulates the redox processes and thereby their concentration levels in soil, water and different plant systems [120][121][122]. For instance, a related study [121] also concluded that seasonally occurring processes such as redox and seasonal shifts in water abundance regulate the type and amount of trace elements that are available in the water.
Moreover, poor performance across the seasons, as revealed by the model's high RMSE%, was also observed for calcium, magnesium, phosphorus and boron in wetland vegetation. Season-specific analyses showed that dry season models performed better than their wet season counterparts, as shown by their respective coefficients of determination of 0.87 in winter and 0.86 in summer (Tables 4 and 5). This might be attributed to the higher reflectance in the visible spectrum (bands 2, 3 and 4) in the dry season compared to the wet season. Plants are generally less green in dry seasons due to lower moisture content and less chlorophyll, which leads to lower energy absorption and greater reflection. Contrary to these observations, however, Ramoelo, et al. [123], Cho, et al. [124] and Skidmore, et al. [125] found models such as RF to perform better in the wet season than in the dry season.

Implications of Remote Sensing on Wetland Plant Nutrients
Over the past years, several studies [126,127] have used remote sensing and chemical analyses in estimating foliar nutrient concentrations in plants. However, these studies mostly concentrated on seasonal estimations of nitrogen in grasses. This study has attempted to build on these initiatives by broadening the application of these techniques to the estimation of different nutrients. This initiative is helpful because it offers opportunities for enhanced understanding of vegetation health which has been previously regarded as a complex research area [128]. Although limited in geographical reach, our findings show that S-2 imagery could be a significant additional source of valuable information on seasonal variations in plant nutrient content.
This study has demonstrated that RF using S-2 can be useful for monitoring and estimating various plant nutrient quantities and the quality of floodplain vegetation. S-2 images yielded significant relationships between nutrient content and both the NIR spectral region and NIR-based vegetation indices (NDVI). S-2 also provided an opportunity to seasonally characterize the studied nutrients. The findings of this study present a great opportunity and technique for mapping and monitoring nutrient enrichment in wetlands beyond the characterization of plant macronutrients (NPK) across both crops and natural vegetation species. This is a step towards a time-efficient and affordable technique for mapping and monitoring wetlands from local to regional scales, based on the spatial resolution and swath width of S-2.
Laboratory chemical analysis of plants is spatially limited for many reasons. Our approach returned high coefficients of determination for most of the nutrients in crops and vegetation. By implication, remotely sensed data make up for the shortcomings of the traditional methods, with the advantages of real-time and rapid observations. As a result, this technique would be a useful alternative for estimating plant health conditions and nutritional status under different environmental settings.

Conclusions
This study demonstrated the considerable potential of Sentinel-2 data for estimating leaf nitrogen, potassium, calcium, magnesium, phosphorus, sulphur, zinc, boron and copper concentrations. Different vegetation indices and bands, including the red edge position, NDVI and band 2, can be important in estimating different nutrients across the seasons, even though estimation using Sentinel-2 did not show a strong relationship for leaf magnesium, copper and sulphur in the wet season. The major conclusion of this work is that the RF model was able to estimate nutrients with reasonable accuracies. The study therefore recommends that researchers elsewhere use the methodology provided in this article as a tool for assessing and monitoring the quality of crops and vegetation in floodplain wetlands, with at least two scenes (satellite images) per season to achieve representative estimates over these periods. Considering that there is no consensus with regard to the performance of different machine learning techniques in characterising foliar chemical elements such as leaf nitrogen, potassium, calcium, magnesium, phosphorus, sulphur, zinc, boron and copper, future studies need to compare and assess their performance.

Data Availability Statement:
The data belong to Witwatersrand University and will be kept by the authors. The data will be made available upon request for research purposes only, and the disclosure shall include the following: the title of the research or paper for which the data will be used; the details of the organization and the supervisory body or persons concerned; and the assurance that the research would not yield any commercial benefit. Our contact details are provided in this paper for data requests.