A Comparative Estimation of Maize Leaf Water Content Using Machine Learning Techniques and Unmanned Aerial Vehicle (UAV)-Based Proximal and Remotely Sensed Data

Abstract: Determining maize water content variability is necessary for crop monitoring and in developing early warning systems to optimise agricultural production in smallholder farms. However, spatially explicit information on maize water content, particularly in Southern Africa, remains elementary due to the shortage of efficient and affordable primary sources of suitable spatial data at a local scale. Unmanned Aerial Vehicles (UAVs), equipped with light-weight multispectral sensors, provide spatially explicit, near-real-time information for determining the maize crop water status at farm scale. Therefore, this study evaluated the utility of UAV-derived multispectral imagery and machine learning techniques in estimating maize leaf water indicators: equivalent water thickness (EWT), fuel moisture content (FMC), and specific leaf area (SLA). The results illustrated that both NIR and red-edge derived spectral variables were critical in characterising the maize water indicators on smallholder farms. Furthermore, the best models for estimating EWT, FMC, and SLA were derived from the random forest regression (RFR) algorithm with an rRMSE of 3.13%, 1%, and 3.48%, respectively. Additionally, EWT and FMC yielded the highest predictive performance and were the most suitable indicators of maize leaf water content. The findings are critical towards developing a robust and spatially explicit monitoring framework of maize water status and serve as a proxy of crop health and the overall productivity of smallholder maize farms.


Introduction
Water stress is one of the most severe limiting factors of maize crop production [1]. Maize (Zea mays L.) is mostly grown under rain-fed conditions and is consumed as a staple food by the majority of the inhabitants of Southern Africa [2]. Due to high population growth and the increase in food and nutrition insecurity, smallholder farmers now play a critical role in maize production and in fostering food security, particularly in developing nations such as those in Southern Africa [3,4]. Despite their key role, smallholder farms constantly face intermittent water stress and drought, resulting in significant yield losses [5]. This is especially so when stress occurs from the pre-flowering to the late grain-filling stages, as it is often difficult to detect the onset and magnitude of intermittent water stress [6]. In addition, spatial and temporal crop management, cultivar selection, soil, and topography affect the extent of water stress and its impacts on maize yield [6]. As such, there are no clear-cut spatially explicit methods of quantifying water stress in near real time on the resource-limited smallholder farms of the global south. It is therefore imperative to develop optimal methods for quantifying maize water stress in a spatially explicit manner. This provides a key pathway towards effectively monitoring drought impacts and deriving useful information that can be used to inform irrigation decisions.
When maize crops are in a state of water deficit, there is a decrease in leaf photosynthesis, stomatal conductance, and leaf expansion and transpiration, subsequently resulting in impaired growth [7]. The lack of water molecules results in the loss of turgor-driven cell expansion and the primary productivity of maize crops as this has detrimental impacts on its growth [8,9]. Crop water deficits result in a decline in the quantity and quality of maize produce [10]. Water stress considerably affects the phenotype, reproductive system, and seed set [10]. Strong and positive correlations have been observed between grain yield and leaf water content [7,10]. Therefore, knowledge on the accurate estimation of maize leaf water content is necessary for crop monitoring and in developing early warning systems to optimise agricultural production in exclusively rain-fed smallholder farms [11,12].
A variety of physiological indicators have been developed to quantify crop water content as a proxy for crop water stress. They include equivalent water thickness (EWT), fuel moisture content (FMC), and specific leaf area (SLA) [13][14][15]. EWT is the quantity of water per unit of leaf area [16]. EWT is an improvement on dry matter content as it takes into account the thickness and area covered by the canopy. FMC represents the quantity of water per unit mass of leaf dry matter. It is an effective indicator of water stress or drought conditions and is commonly used in wildfire monitoring [9]. SLA is the ratio of leaf area to leaf dry mass [17]. SLA is a fundamental indicator of crop physiology and of the variability of a crop's photosynthetic capacity and growth rate [18]. Although various studies have monitored crop water status [7,15], there is still disagreement on the best-suited indicator for maize water content prediction at the leaf level in small fields.
Previously, variations in crop water status were measured through conventional methods such as visual assessment or in situ measurements conducted by trained experts [9]. However, such techniques are laborious, costly, and time-consuming, and hence not feasible for continuous and time-efficient crop monitoring [19]. Over the decades, satellite-borne earth observation technologies have proven effective in monitoring plant water status and variations in the physiology of water-stressed vegetation, and in indicating crop water requirements for improved irrigation efficiency [20]. For instance, Xu et al. [21] used multispectral data derived from Landsat OLI and MODIS datasets to quantify crop water content with an optimal R² of 0.78. Additionally, Sibanda et al. [20] utilised Sentinel-2 MSI to estimate canopy water content using EWT and FMC to an rRMSE of 20.8% and 18.45%, respectively, while Krishna et al. [22] combined hyperspectral sensors and partial least squares regression to estimate rice crop water stress with an R² of 0.94. However, despite these successes, the application of satellite data in characterising water indicators at farm scale is restricted by their relatively coarse spatial and temporal resolutions [23]. Although there are sensors that provide very-high-resolution (VHR) remotely sensed data, such as QuickBird and WorldView imagery, these are often costly and not ideal for monitoring maize water content on resource-constrained smallholder farms [9].
In recent years, unmanned aerial vehicles (UAVs), commonly known as drones, have received increased attention in precision agriculture [24]. UAVs, mounted with lightweight multispectral sensors have the capacity to provide spatially explicit near-real-time information for the monitoring of crop water content [23]. Additionally, UAV proximal sensors with a sub-metre resolution deliver rapid, cost-effective, and accurate measurements required for the detection of the maize water status at a plot level [9]. Compared with satellite imagery, UAV-based sensors can provide datasets with exceptionally high spatial and temporal resolutions. In addition, UAV platforms can hover over a specific area of interest and can acquire imagery at lower altitudes, allowing for a finer ground sampling distance, hence being suitable for better quantification of maize water content at a field scale [9]. Various studies have utilised UAV-based proximal sensing in environmental applications [25][26][27]. For example, Han, et al. [28] used a DJI Spreading Wings UAV mounted with an RGB camera to estimate the plant height of maize crops and attained an RMSE of 14.1 cm. Zhang, Basso, Price, Putman and Shuai [27] utilised a Phantom 3 UAV-based RGB image to investigate the optimal flight height for the discrimination of maize varieties. Additionally, studies have demonstrated the utility of UAV remote sensing approaches in maize yield prediction [29], maize pest and disease detection [25], and crop physiology monitoring [26]. However, these studies were conducted in controlled experimental plots in the global north. Very few studies have been conducted in the global south, particularly in smallholder croplands with rain-fed maize and other crops. As a result, the potential application of UAVs equipped with high-resolution sensors for monitoring crop dynamics such as maize water content needs to be further investigated, especially in the small, fragmented croplands of Southern Africa.
The prediction of maize water content using proximal remote sensing approaches is based on the reflectance behaviour of water molecules and dry vegetation matter in the near-infrared (NIR) and the shortwave infrared (SWIR) sections of the electromagnetic spectrum [26]. However, much of the available drone sensors that have been widely used in assessing crop water content and health have either covered the visible section of the electromagnetic spectrum or included the NIR. Very few of these studies have assessed the utility of drone sensors covering the red edge, the NIR, and the thermal sections of the electromagnetic spectrum in characterising crop water content. Furthermore, a large and growing body of literature has demonstrated the optimal performance of vegetation indices (VIs) derived from the water-sensitive sections of the electromagnetic spectrum as an instrument for retrieving crop water status [7,8,30]. For example, the Normalised Difference Water Index (NDWI), Normalised Difference Vegetation Index (NDVI), Green Chlorophyll Index (CIgreen), and the Red-Edge Chlorophyll Index (CIrededge) have demonstrated significant correlations with crop water indicators [12,30]. It is in this regard that the combination of the drone-derived red-edge, NIR, and thermal bands in conjunction with optimal vegetation indices were anticipated to yield accurate estimations of maize water content in smallholder farms.
A range of regression techniques has been proposed for the prediction of vegetation parameters using remotely sensed data. These may be broadly categorised into two groups: conventional regression methods and machine learning techniques [31]. A major limitation of conventional techniques, such as multiple linear regression (MLR), is that they assume an explicit relationship between measured biophysical parameters and spectral observations, thus limiting their applicability to spatially complex datasets [32]. Recently, machine learning regression techniques such as support vector machines (SVM), random forest (RF), artificial neural network (ANN), partial least squares (PLS), and decision trees (DT) have gained popularity for their high performance in computing, quantifying, and understanding complex processes in agricultural applications [33]. Jin et al. [34], for instance, applied the SVM model to estimate the leaf water content of maiden grass and achieved an exceptional model accuracy (R² = 0.98). Sibanda et al. [20] implemented the RF ensemble to predict the canopy water content of grasslands, obtaining an R² of 0.98 and an RMSE of 9.8 g m⁻², while Yue et al. [19] applied machine learning techniques, including DT, PLS, and ANN, in estimating the above-ground biomass of winter wheat. The studies above illustrate the robustness and prediction capabilities associated with machine learning regression ensembles based on remotely sensed data. Although other algorithms have been used in remote sensing applications, a large and growing body of literature shows that SVM, RF, ANN, PLS, and DT are the most widely adopted. This is attributed to their ease of implementation, their robustness, especially in dealing with small sample sizes, their optimal feature selection abilities, and the high accuracies they yield. However, the literature indicates that no single algorithm is best suited to every context.
There is, therefore, a need to assess and identify the most efficient algorithm that could accurately estimate maize foliar water content using UAV-derived data in the context of smallholder croplands.
In this regard, this study sought to investigate the potential of UAV-derived multispectral imagery and machine learning techniques in the remote estimation of maize water content from smallholder croplands. The main objectives of this study were to conduct a comparative analysis in order to (1) evaluate the performance of five regression techniques in predicting maize water content, and (2) determine the most suitable indicator of smallholder maize water content variability based on multispectral UAV data. The anticipated results will help provide a technical approach for the quick and accurate monitoring of changes in either EWT, FMC, or SLA as a result of water variability in order to inform irrigation decisions and the planning of smallholder maize crops.

Description of the Study Area
This study was conducted at Swayimane (29°52′S, 30°69′E), a communal area located within the uMshwathi Municipality, northeast of the city of Pietermaritzburg in South Africa (Figure 1). Swayimane is situated within the moist midlands mistbelt bioresource area, characterised by an average temperature ranging between 11.8°C and 24°C, and a mean annual temperature of 17°C. The climate in the area is relatively hot with wet/cool summers and dry winters. The area receives an annual rainfall that varies between 600 and 1100 mm. Swayimane experienced an average air temperature of 23.94°C and an average rainfall of 86.56 mm during the maize growing season of 2020-2021 (Table 1). Swayimane is distinguished by arable clay loam soils and is ranked within the top 2% of high-potential land in South Africa. Such environmental conditions support the production of various grain and legume crops. Common crops produced within the study area are beans, sweet potato, sugarcane, spinach, and maize. Swayimane is dominated by smallholder maize farms cultivated by the local community. Maize farmers in the area depend primarily on traditional methods of farming such as the use of manual labour and livestock manure for fertilizer. Maize in Swayimane is cultivated both at a subsistence scale and for additional income generation. Moreover, Swayimane is a good example of a rural setup where organic farming is conducted on a semi-subsistence scale. This highlights the success of utilising organic farming methods for the optimisation of maize yield at a minimal cost. Maize experimental plots were cultivated in summer, which is the optimal maize growing season. The maize plot covered a spatial extent of 250 m² and was primarily rain-fed. The maize crop was sown in mid-November 2020. At the time the project commenced, the crop was 86 days old and in the reproductive phase of the growth cycle. Specifically, the maize plants were intermediate between the kernel blister stage (growth stage R2) and the kernel milk stage (growth stage R3). This stage was selected because the literature confirms that the early reproductive stages of maize are the most sensitive to water deficits [35,36].

Field Sampling and Water Content Measurements
Field data collection was conducted on 11 February 2021 at the study site. An automatic weather station (AWS) was installed in proximity to the maize fields to acquire the bioclimatic data of the maize crops. The AWS measured air temperature, relative humidity, and wind speed. Wind direction sensors and a rain gauge measured the daily wind direction and rainfall within the experimental plot. A stratified random sampling approach was used to generate a total of 104 random sample points within the maize field. This technique was selected as it could provide a representative sample of the study area. A Trimble handheld Global Positioning System (GPS) with a sub-metre accuracy was used to navigate to the randomly generated sample points within the field. Sampling fully developed leaves from the top of the maize canopy ensures reliable measurements of plant physiological characteristics, especially since these leaves receive direct sunlight and have maximum spectral reflectance [37]. The sampling of young emerging leaves was deemed unsuitable for plant analysis as it could exacerbate plant stress, leading to plant mortality [38,39]. In this regard, the first fully developed leaf (first leaf below the whorl) was collected from the top of the maize canopy to measure leaf water content indicators. A LI-3000C Portable Area Meter combined with an LI-3050C Transparent Belt Conveyer Accessory with a 1 mm² resolution (Li-Cor, USA) was used to measure the leaf area (A) of sampled maize leaves. The fresh weight (FW) of sampled maize leaves was obtained using a calibrated scale with a 0.5 g measurement error. Field measurements were conducted between 12:00 noon and 14:00 as this is the optimal period of the day for crop photosynthetic activity [40]. The sampled maize leaves were then dried in an oven at 70°C until a constant dry weight (DW) was reached (approximately 48 h).
The A, FW, and DW were then used as input variables to compute the three maize leaf water indicators: EWT (g m⁻²), FMC (%), and SLA (m² g⁻¹). The computed data for each crop water indicator were integrated with the GPS location and converted into a point map that was overlaid with the UAV multispectral images of the study area.
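The indicator equations are not reproduced in this text. As a hedged reconstruction from the standard definitions and the variables named above (A, FW, DW), they could plausibly be written as follows. The fresh-weight basis for FMC is an assumption: it is the form consistent with the mean values reported later for FW, DW, and FMC ((37.06 − 6.94)/37.06 ≈ 81.3%), whereas a dry-weight basis would give values well above 100%.

```latex
\begin{align}
\mathrm{EWT} &= \frac{FW - DW}{A} && [\mathrm{g\,m^{-2}}] \\
\mathrm{FMC} &= \frac{FW - DW}{FW} \times 100 && [\%] \quad \text{(assumed fresh-weight basis)} \\
\mathrm{SLA} &= \frac{A}{DW} && [\mathrm{m^{2}\,g^{-1}}]
\end{align}
```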

The UAV Platform, Image Acquisition, and Processing
The DJI Matrice 300 series (M300) and the MicaSense Altum imaging sensor were used to acquire images covering the maize field considered in this study (Figure 2a). The M300 UAV specifications are further detailed in Table 2. The Altum camera integrates a radiometrically calibrated thermal sensor with five spectral channels that measure reflectance from the visible to the non-visible part of the spectrum (i.e., blue (475 nm), green (560 nm), red (668 nm), red-edge (717 nm), NIR (840 nm), and thermal (8-14 µm)) at a ground sampling distance of 9.6 cm per pixel (Figure 2b). The main advantage of this imaging platform is its ability to capture synchronised thermal and multispectral data in an automated manner. A shapefile of the study area was created in Google Earth Pro and exported to the M300's handheld console to develop a UAV flight plan (Figure 2c). Before and after the flight, a calibrated reflectance panel was used to compensate for incident light conditions by using known reflectance values across the spectrum to radiometrically calibrate the Altum sensor (Figure 2d). An automated flight mission was conducted at a flight height of 100 m, with an image overlap of 80%. An orthomosaic of the imagery derived from the imaging platform was generated and pre-processed in order to enhance image features using the Pix4D Fields photogrammetry software.

Selection of Vegetation Indices
The UAV imaging platform used in this study measures reflectance in the visible, red-edge, and NIR regions of the spectrum; hence, we sought to evaluate all possible combinations of UAV spectral bands to accurately predict crop leaf water indicators. In this study, the reflectance data obtained from the Altum multispectral and thermal bands were used to derive vegetation indices (VIs). Table 3 shows a list of VIs that were selected for this study based on their direct and indirect correlation with plant water status indicators. As mentioned earlier, the prepared spectral data were then overlaid with the point data associated with measured maize water indicators to derive data that was used for the statistical prediction of maize water content.
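As an illustration of how such indices are derived from band reflectances, the following Python sketch computes several of the indices named in this paper for a single pixel. The formulas are the commonly published forms and are an assumption here, since the definitions in Table 3 are not reproduced in this text; in particular, NDWI is written in its green/NIR form because the sensor has no SWIR band. The function names and the reflectance values are hypothetical.

```python
def _ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator != 0 else float("nan")


def vegetation_indices(green, red, red_edge, nir):
    """Compute vegetation indices from band reflectances (floats in 0-1).

    Formulas are the commonly published forms (assumed, not taken from Table 3).
    """
    return {
        "NDVI": _ratio(nir - red, nir + red),
        "NDRE": _ratio(nir - red_edge, nir + red_edge),
        "NGRDI": _ratio(green - red, green + red),
        "NDWI": _ratio(green - nir, green + nir),  # green/NIR form (assumed)
        "CIgreen": _ratio(nir, green) - 1.0,
        "CIrededge": _ratio(nir, red_edge) - 1.0,
    }


# Hypothetical reflectances for one healthy-vegetation pixel.
pixel = vegetation_indices(green=0.08, red=0.05, red_edge=0.20, nir=0.45)
for name, value in pixel.items():
    print(f"{name}: {value:.3f}")
```

In practice the same arithmetic would be applied to whole raster bands rather than to single values, but the per-pixel form keeps the relationship between bands and indices easy to read.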

Spatial Analysis
The sampled data were randomly split into training (70%) and validation data (30%). The former was used in model development and the latter in assessing the accuracy of predictive models. A comparative analysis was conducted between the support vector regression, random forest regression, decision trees regression, artificial neural network regression, and the partial least squares regression algorithms in predicting the leaf water content indicators (i.e., EWT, FMC, and SLA). According to Lary et al. [43], RF, SVM, DT, ANN, and PLS are the most widely used machine learning algorithms in the geosciences. These non-parametric algorithms are robust, efficient, and can be parameterised and implemented with ease [31,33]. Above all, these algorithms have been used in the literature and are renowned for their accuracy, which is facilitated by their ability to optimally select spectral features for accurate predictions [43,44]. It is in this regard that these algorithms were chosen for this study. Then, variable selection was performed for each prediction model to identify the variables that are most influential in the prediction of the named indicators. Variable selection reduces issues associated with variable redundancy and multicollinearity, which affect the performance of regression models [45]. Details on how each algorithm was used in this study are provided below.
Support vector regression (SVR): Three parameters were tuned for the SVR model, namely the penalty parameter (C), the precision parameter (ε), and the kernel parameter (γ). In this study, the grid search and 10-fold cross validation method recommended by Shafiee et al. [46] was performed on the training data, and the SVR model performed optimally at a C value of 8, an ε of 0.5, and γ kept at its default of 1.
Random forest regression (RFR): The quality of the RFR model depends on the proper setting of the RFR hyperparameters. The RFR model is generally optimised based on two parameters, namely Ntree, which is the number of decision trees to be generated, and Mtry, the number of predictor variables tested for the best split when growing the trees [47]. After numerous iterations, the optimal hyperparameter values for the prediction of maize water content in this study were determined to be an Ntree of 500 and an Mtry of 11.
Decision tree regression (DTR): In this study, the fine-tuning process of the DTR algorithm was performed until no improvements were observed and the model parameters were specified. The minimum split, which is the minimum number of values that must exist at a node before the split is attempted [48], was fixed at 20. The maximum depth to which the tree is allowed to grow was set at 30. Finally, the termination criterion for the regression tree was set at 0.01. These parameters were identified after numerous iterations, with the errors noted at each iteration.
Artificial neural network regression (ANNR): The hyperparameters of the ANNR algorithm were fine-tuned to decrease the error of prediction and improve the accuracy of the model. As such, the hyperparameters of the optimal ANNR model were determined to be 10 nodes and 2 hidden layers.
Partial least squares regression (PLSR): The optimal PLSR model was obtained from the model that yielded the minimum relative error. The forward selection and backward elimination technique was used to select the optimal predictor and PLSR model that was used in this study. The optimal PLSR model was characterised by the least impact of multicollinearity and the greatest prediction accuracy.
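The data partitioning and cross-validation described above can be sketched in Python as follows. This is a minimal illustration using only the standard library, not the authors' R workflow: the function names, the fixed seed, and the rounding rule (which gives 73 training and 31 validation samples for 104 points) are assumptions introduced here.

```python
import random


def train_validation_split(n_samples, train_fraction=0.7, seed=42):
    """Randomly split sample indices into training and validation sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = round(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]


def k_fold_indices(indices, k=10):
    """Partition indices into k near-equal folds; yield (fit, held_out) pairs."""
    folds = [indices[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, held_out


# 104 field samples, as reported in the sampling description.
train_idx, valid_idx = train_validation_split(104)
folds = list(k_fold_indices(train_idx, k=10))
print(len(train_idx), len(valid_idx), len(folds))
```

During a grid search, each candidate hyperparameter combination would be fitted on the `fit` indices of every fold and scored on the corresponding `held_out` indices, with the validation set left untouched until the final accuracy assessment.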
To optimise the outputs of the abovementioned models, the variable importance scores were used to determine the most influential bands and indices for estimating leaf water content indicators [49]. The least important predictor variables were progressively removed and the model was re-developed [49,50]. The Caret Package was used to develop the regression models in the RStudio software version 1.4.1564.
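The progressive removal of the least important predictors can be sketched as a simple loop. The Python sketch below is an illustration under stated assumptions rather than the authors' procedure: `backward_elimination` and `toy_fit_and_score` are hypothetical names, and the toy scorer uses made-up importance values and an arbitrary error formula purely so that the loop can be run end to end. In a real analysis the scorer would refit the chosen regression model and return its cross-validated error and variable importance scores.

```python
def backward_elimination(predictors, fit_and_score, min_predictors=1):
    """Drop the least important predictor each round; keep the best subset seen.

    fit_and_score(subset) must return (error, {name: importance}).
    """
    current = list(predictors)
    best_subset, best_error = None, float("inf")
    while len(current) >= min_predictors:
        error, importance = fit_and_score(current)
        if error < best_error:
            best_subset, best_error = list(current), error
        if len(current) == min_predictors:
            break
        weakest = min(current, key=lambda name: importance[name])
        current.remove(weakest)
    return best_subset, best_error


# Hypothetical importance scores, for demonstration only.
TOY_IMPORTANCE = {"NDVI": 0.9, "NIR": 0.8, "NDWI": 0.6, "blue": 0.1, "green": 0.05}


def toy_fit_and_score(subset):
    """Stand-in for model refitting: informative variables lower the error."""
    importance = {name: TOY_IMPORTANCE[name] for name in subset}
    useful = sum(v for v in importance.values() if v >= 0.5)
    noise = sum(1 for v in importance.values() if v < 0.5)
    return 10.0 - 3.0 * useful + 1.0 * noise, importance


subset, error = backward_elimination(list(TOY_IMPORTANCE), toy_fit_and_score)
print(subset, round(error, 2))
```

Keeping the best subset seen, rather than simply the last one, guards against removing a variable whose loss makes the model worse.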

Accuracy Assessment of Derived Maize Water Content Models
An accuracy assessment was conducted to evaluate the performance of regression models in predicting leaf water content indicators. The coefficient of determination (R²), the root mean square error (RMSE), and the relative root mean square error (rRMSE) were used to compare the accuracy of different models. More specifically, the R² was used to measure the variation between measured and predicted maize leaf water content, and the RMSE was used to assess the magnitude of error between the field measurements and the modelled water content. The rRMSE was used to compare the performance of regression models across different algorithms and maize water indicators. To compute rRMSE, the RMSEs from each model were normalised using the mean of each variable and then expressed as a percentage [51].
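The three accuracy metrics described above can be sketched in Python as follows. The function names and the sample values are hypothetical. R² is computed here as 1 − SS_res/SS_tot, which is an assumption: some tools instead report the squared correlation between measured and predicted values, and the two can differ.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(measured, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed form)."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def rrmse(measured, predicted):
    """RMSE normalised by the mean of the measured values, in percent."""
    return 100.0 * rmse(measured, predicted) / (sum(measured) / len(measured))


# Hypothetical EWT values (g m^-2), for illustration only.
measured = [340.0, 350.0, 360.0, 370.0]
predicted = [345.0, 348.0, 365.0, 366.0]
print(rmse(measured, predicted), r_squared(measured, predicted), rrmse(measured, predicted))
```

Because rRMSE is unit-free, it allows indicators with very different magnitudes, such as EWT in g m⁻² and SLA in m² g⁻¹, to be compared on a common scale.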

Descriptive Analysis of Maize Crop Water Indicators and Measured Biophysical Variables
A wide range of variations was recorded in both the biophysical variables and the crop water indicators of maize crops. Table 4 presents the descriptive statistics of leaf FW, DW, leaf area, EWT, FMC, and SLA. Averages for FW, DW, and leaf area were 37.06 g, 6.94 g, and 0.09 m², respectively, while the averages for crop water indicators, particularly EWTleaf, FMCleaf, and SLAleaf, were 356.52 g m⁻², 81.27%, 29.86 g m⁻², and 0.01 m² g⁻¹, respectively. A Kolmogorov-Smirnov normality test revealed that none of the crop water indicators deviated significantly from the normal distribution.

Evaluation of Maize Water Indicators and Optimised Regression Models
Table 5 illustrates the model accuracies obtained in predicting leaf EWT, FMC, and SLA based on the RFR, DTR, ANNR, PLSR, and SVR regression techniques. The accuracies of the prediction models varied greatly for the crop water indicators. For example, when estimating EWTleaf, the DTR yielded the poorest model, with an RMSE of 25.16 g m⁻² and an R² of 0.73. The prediction of EWTleaf improved slightly for the PLSR model (RMSE = 17.1 g m⁻² and R² = 0.74). The SVR and the ANNR models improved the prediction of EWTleaf further (RMSE = 15.05 g m⁻², R² = 0.76, and RMSE = 14.29 g m⁻², R² = 0.84, respectively). The optimal algorithm for estimating EWTleaf was derived from the RFR model with an RMSE of 10.28 g m⁻² and an R² of 0.89 (Table 5).
For FMCleaf, the ANNR model exhibited the lowest prediction accuracy (RMSE = 1.54% and R² = 0.34). This was followed by the PLSR, with an RMSE of 0.48% and an R² of 0.45. The prediction accuracy increased considerably with the DTR and the SVR models, with an R² = 0.65 and an R² = 0.69, respectively. The RFR model optimally predicted the FMCleaf with the lowest RMSE = 0.45% and R² = 0.76 (Table 5).
When predicting SLAleaf, the lowest R² of 0.6, with an RMSE of 0.0008 m² g⁻¹, was obtained using the PLSR model. The ANNR model raised the R² by 0.08, to 0.68. The DTR and SVR differed only slightly in their ability to predict SLA, with an RMSE = 0.0009 m² g⁻¹ and R² = 0.7, and an RMSE = 0.0005 m² g⁻¹ and R² = 0.71, respectively. The optimal model for estimating SLAleaf exhibited an RMSE of 0.0004 m² g⁻¹ and R² of 0.73 (Table 5).

Optimal Models for Estimating Maize Water Content Indicators
Figure 3 illustrates the results obtained when all maize water content indicators were estimated based on the optimal regression models. The EWTleaf performed optimally as an indicator of maize water content, with an rRMSE of 3.13% and an R² of 0.89. The most important variables selected for estimating EWTleaf were NDVI, NIR, NDWI, CIgreen, NDVI rededge, Red, CIrededge, NDRE, and NGRDI, in order of importance (Figure 3A).
Meanwhile, the FMCleaf based on the PLSR model outperformed EWTleaf by 2.53 percentage points, with an rRMSE of 0.6%. The most suitable predictor variables included NDRE, NIR, NDWI, CIrededge, NDVI rededge, red-edge, CIgreen, blue, thermal, NDVI, red, and the green band (Figure 3B). Additionally, the FMCleaf SVR model produced a similarly low rRMSE of 0.89%. However, although these FMCleaf models had low rRMSE values, they explained less of the variation in measured FMCleaf, with R² values of 0.45 and 0.69, respectively. In comparison, the FMCleaf based on the RFR model exhibited the highest R² of 0.76 and an acceptable rRMSE of 1%, making it the optimal FMCleaf model.
The optimal model for the prediction of maize SLAleaf exhibited an rRMSE of 3.48% and an R² = 0.73. The variables that had the highest influence in the SLA model were the NDVI, Thermal, NIR, NDRE, CIgreen, red-edge, NDVI rededge, CIrededge, NGRDI, and the NDWI, in order of descending importance (Figure 3C).
The results revealed that the optimal indicators of maize water content based on the RFR models were FMCleaf and EWTleaf, followed by SLAleaf. Additionally, the UAV multispectral bands and derived VIs were successful in predicting all maize water content indicators.

Mapping the Spatial Distribution of Maize Leaf Water Content Indicators
The spatial distribution of leaf EWT, FMC, and SLA was estimated based on the optimal models. Figure 4 illustrates the spatial distribution of maize water content indicators. It can be observed that the water content of maize is relatively high throughout the maize fields and seems to decrease towards the edge of the maize plot, with the exception of FMC, which revealed small patches of lower maize water content within the maize fields.

Discussion
Smallholder farmers are frequently faced with the need to optimise maize production; therefore, an assessment of maize water status through the monitoring of EWT, FMC, and SLA could provide essential information for the improvement of crop-water use efficiency and the enhancement of maize productivity under water-limited conditions [52]. The essence of this study was to assess and identify a suitable indicator for maize water content and to evaluate the predictive performance of robust algorithms in predicting maize water status. Thus, this study sought to investigate the use of UAV-derived remotely sensed data and machine learning techniques in estimating maize EWT, FMC, and SLA.

Estimating Maize Water Content Indicators
Results in this study indicate that when estimating maize equivalent water thickness, an optimal prediction (rRMSE = 3.13% and R² = 0.89) can be obtained based on spectral variables derived from the NIR section of the electromagnetic spectrum (NDVI, NIR, NDWI, and NDRE). The literature confirms that the quantity of water in crop leaves is statistically correlated with leaf reflectance across the spectrum [26]. More specifically, the variation in water molecules present in the leaf cell strongly influences the reflectance of solar radiation in the NIR region; hence, this section of the spectrum is commonly used to quantify leaf water status [8,52]. The variation in leaf reflectance is related to plant vigour, which is primarily controlled by the changes in leaf cuticles, mesophyll thickness, and intercellular air spaces as a result of leaf water and nutrient availability [20,53]. Furthermore, the NIR section has been widely shown to be related to the leaf water absorption zone, hence its strong influence in estimating the leaf EWT of maize in smallholder farms. Correspondingly, studies by Mobasheri and Fatemi [54] and Riaño et al. [55] successfully illustrated the use of leaf optical reflectance in the NIR section of the electromagnetic spectrum for predicting EWT with an R² of 0.95 and 0.75, respectively. EWT also displayed high sensitivity to chlorophyll-based indices, especially CIgreen and CIrededge. This could be explained by the fact that changes in the level of chlorophyll in leaves, which alter crop greenness and leaf pigmentation, are closely related to water status [12,56]. As in this study, Zhang and Zhou [30] noted that these chlorophyll-based indices presented a higher sensitivity to crop water indicators.
Fuel moisture content (FMC) was optimally predicted to an rRMSE of 1% and an R² of 0.76. The results of this study show that FMC is particularly sensitive to the red-edge waveband and its derivatives. For instance, there was a significant influence of the red-edge, NDRE, NDVIrededge, and CIrededge in the prediction of maize FMC. Such sensitivity of the red-edge band in predicting FMC can be explained by its positive association with crop biomass as well as chlorophyll content, which is also positively correlated with FMC [20]. Generally, the variations in crop water content are largely associated with chlorophyll activity and leaf area index, which influence the reflectance of leaf tissue in the red-edge section of the electromagnetic spectrum [57]. This was the case in studies by Bar-Massada and Sviri [58] and Cao and Wang [59], which confirmed a variation in the reflectance of green leaves under water-stressed conditions in the red-edge band, making this wavelength a significant predictor of FMC.
Furthermore, NDWI, which is primarily derived from the NIR band, has a significant influence on the prediction of FMC. This VI is particularly important in predicting water content as it is sensitive to the variations of leaf reflectance induced by water molecules and dry matter content, and hence strongly correlates with plant water stress [12]. A study by Sow et al. [60] demonstrated the importance of the NDWI in predicting FMC by achieving an R² of 0.85. In this regard, the literature supports the relationship between FMC and the red-edge as well as the NIR sections of the electromagnetic spectrum [20,57].
Finally, the results in this study show that SLA could be estimated to an rRMSE of 3.48% and an R² of 0.73. SLA was particularly sensitive to the UAV-derived thermal, NIR, and red-edge wavelengths. When crops are in a state of water deficit, there is an overall increase in crop surface temperature due to the closure of leaf stomata, which decreases the evaporative cooling effect [61]. In this regard, the literature notes that the thermal band is well established as a key wavelength for early plant water stress detection [61,62]. Again, NDVI was the most influential predictor of maize SLA in this study. This could be explained by the fact that NDVI is proportional to chlorophyll content, which is sensitive to the changes in crop water content [63]. Furthermore, when crops are water-stressed, there is a decrease in the absorption of chlorophyll at the red wavelength and a decrease in reflectance at the NIR region due to the shrinkage of leaf thickness during the wilting process [64]. In a similar study, Ali et al. [65] noted that NDVI was very effective in estimating SLA (R² = 0.73 and RMSE = 4.68%). Wijewardana et al. [26] confirm that the combination of both the NIR and red wavelengths allows NDVI to be an invaluable predictor of photosynthetic activity and long-term water stress. Additionally, SLA was sensitive to the NDRE and NDVIrededge as well as chlorophyll-based VIs. The influence of these red-edge-based VIs in predicting SLA stems from the fact that the variations in leaf thickness and area, as well as leaf pigmentation due to water stress, are promptly detected at the red-edge section [66]. In this regard, the variations in leaf photosynthetic capacity provide essential information pertaining to maize leaf water vapour and water content [65].
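For readers less familiar with the spectral variables discussed here, the sketch below computes several of them from band reflectances using their widely used formulations. The exact formulas applied in this study are given in its methods and are not reproduced here, so the definitions are assumptions. In particular, the green/NIR form of NDWI is assumed because the sensor has no SWIR band. The function names and reflectance values are hypothetical.

```python
def normalised_difference(a, b):
    """(a - b) / (a + b), the form shared by NDVI, NDRE, NDWI, and NGRDI."""
    return (a - b) / (a + b)

def vegetation_indices(green, red, red_edge, nir):
    """Return common vegetation indices from band reflectances (0 to 1)."""
    return {
        "NDVI": normalised_difference(nir, red),
        "NDRE": normalised_difference(nir, red_edge),
        # Assumed green/NIR formulation; other NDWI variants use a SWIR band.
        "NDWI": normalised_difference(green, nir),
        "NGRDI": normalised_difference(green, red),
        "CIgreen": nir / green - 1.0,
        "CIrededge": nir / red_edge - 1.0,
    }

# Hypothetical reflectances for a healthy maize canopy pixel.
indices = vegetation_indices(green=0.08, red=0.05, red_edge=0.30, nir=0.50)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

All of the indices except NGRDI include the NIR band, which is consistent with the prominence of NIR-derived variables among the predictors reported above.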
Furthermore, the results illustrate that all maize leaf water content indicators were optimally predicted using UAV-derived data. Accordingly, FMC and EWT yielded the highest predictive power of water content, while SLA was effectively estimated. In comparison, the FMC and EWT are the most ideal crop water indicators for monitoring water stress using field spectroscopy techniques [13,16].
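As a reminder of how the three indicators relate to one another, the sketch below derives them from leaf fresh weight, dry weight, and leaf area using definitions that are common in the literature. These definitions are assumptions here: FMC in particular is sometimes expressed relative to fresh weight rather than dry weight, and the formulation used in this study should be taken from its methods. The function name and sample values are hypothetical.

```python
def leaf_water_indicators(fresh_weight_g, dry_weight_g, leaf_area_cm2):
    """Compute EWT, FMC, and SLA for a single leaf sample (assumed definitions)."""
    water_mass_g = fresh_weight_g - dry_weight_g
    return {
        # Equivalent water thickness: water mass per unit leaf area (g cm^-2).
        "EWT": water_mass_g / leaf_area_cm2,
        # Fuel moisture content: water mass relative to dry mass (%).
        "FMC": 100.0 * water_mass_g / dry_weight_g,
        # Specific leaf area: leaf area per unit dry mass (cm^2 g^-1).
        "SLA": leaf_area_cm2 / dry_weight_g,
    }

# Hypothetical leaf sample, not data from the study.
sample = leaf_water_indicators(fresh_weight_g=5.0, dry_weight_g=1.0, leaf_area_cm2=200.0)
print(sample)
```

Under these definitions, EWT and FMC both depend directly on the water mass of the leaf, whereas SLA depends only on area and dry mass. This offers one possible reason why SLA behaves as a less direct indicator of water content.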

The Performance of Machine Learning Algorithms in Predicting Maize Water Content Indicators
Results in this study show that the RFR approach is the most suitable explorative tool for predicting all maize water content indicators. For instance, RFR optimally predicted FMC, EWT, and SLA, producing the highest prediction accuracy (rRMSE = 1%, 3.13%, and 3.48%, respectively). The RFR algorithm can effectively establish the relationship between leaf reflectance and maize water at farm scale. The strength of RFR could be explained by the fact that the algorithm is not highly affected by noise in the data; hence, there is a reduced risk of producing overfitted models [67,68]. In a similar study, Sibanda et al. [20] confirmed the robustness of the RFR model in modelling water content elements, particularly the FMC, by achieving R² values as high as 1 and an RMSE of 16.4%.
The SVR approach was also optimal in predicting maize leaf EWT, FMC, and SLA. The strength of the SVR lies in its ability to circumvent outliers and exhibit a high generalisation capacity to handle unseen patterns [69]. The results in this study reveal that the SVR is similar to the RFR in predictive power. This could be explained by the fact that the SVR and RFR ensembles optimally operate with a relatively small number of training samples, which is often the case for data acquired at the field scale after avoiding spatial autocorrelation [63,68]. Therefore, the results of this study demonstrate that the model properties of RFR and SVR are well suited for the estimation of smallholder maize water content. Generally, DTR did not perform well in predicting maize water indicators. This could be explained by the fact that DTR does not have features such as the bootstrapping in RFR and hyperplanes in SVR for effectively encompassing all the samples during the prediction procedure [69]. This can result in the DTR algorithm being conservative in its prediction procedure, hence exhibiting lower prediction accuracies. In this regard, there are very few studies that have evaluated its predictive performance in the context of canopy and leaf water content. In comparison, the ANN and PLSR exhibited poorer performance in predicting maize water content. This could be due to the fact that both the ANN and PLSR are best suited for large training datasets in order to produce credible results [63,70]. Thus, this study prompts future studies to investigate the optimal sample size required to produce accurate predictions of smallholder maize water content when using a combination of UAV imagery and machine learning techniques. Additionally, there are prospects to evaluate the ability of other empirical models and deep learning methods to accurately model maize water variability.
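The role of bootstrapping mentioned above can be illustrated with a minimal sketch of bagging, the resampling-and-averaging step on which random forests build. This is not the RFR implementation used in the study: it uses one-split trees (stumps) on a single predictor and omits the random feature selection of a full random forest. All names and data are hypothetical.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-split regression tree on 1-D data by minimising squared error."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical predictor values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all predictor values identical: predict the mean
        mean_y = sum(ys) / len(ys)
        return (float("inf"), mean_y, mean_y)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def fit_bagged_stumps(xs, ys, n_estimators=50, seed=0):
    """Fit one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps

def predict_bagged(stumps, x):
    """Average the predictions of all stumps."""
    return sum(predict_stump(s, x) for s in stumps) / len(stumps)

# Hypothetical predictor (e.g., a vegetation index) and response values.
xs = [i / 10 for i in range(1, 11)]
ys = [2.1, 2.4, 2.2, 2.9, 3.1, 3.0, 3.6, 3.5, 4.1, 4.0]

single = fit_stump(xs, ys)
ensemble = fit_bagged_stumps(xs, ys)
for x in (0.25, 0.55, 0.85):
    print(x, round(predict_stump(single, x), 2), round(predict_bagged(ensemble, x), 2))
```

A single stump can only return one of two values, whereas the average over stumps fitted to different resamples can vary more gradually with the predictor. This gives an intuition for why an ensemble may generalise better than a single tree on small field datasets.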

Conclusions
The present study tested the utility of unmanned aerial vehicle (UAV)-based multispectral data for estimating the equivalent water thickness (EWT), fuel moisture content (FMC), and specific leaf area (SLA) of maize crops in smallholder farms. It compared five machine learning techniques: random forest regression (RFR), support vector regression (SVR), decision tree regression (DTR), artificial neural network regression (ANNR), and partial least squares regression (PLSR). Based on the findings of the study, it can be concluded that:

• The EWT, FMC, and SLA water content indicators of maize could be optimally predicted using the NIR and red-edge-derived spectral variables;
• The RFR and SVR modelling techniques have a more robust capacity for predicting water content indicators of maize in comparison to the DTR, ANNR, and PLSR;
• FMC and EWT, in concert with the RFR approach, exhibited the highest predictive performance, and are therefore valid indicators of maize water content.
This study demonstrates that UAV-derived multispectral data can predict maize water variations on smallholder farms with high accuracy and can hence complement existing monitoring and inform farmers on drought-related water stress. However, there are research gaps that demand further inquiry, particularly on smallholder maize farms. Future studies should aim to evaluate the utility of UAV-derived data and the optimal water indicators for characterising the variation of maize water content across different phenological stages. Furthermore, a key limitation of this study is the lack of the SWIR spectrum, which would be valuable as it contains essential water absorption bands. Therefore, additional studies are necessary to evaluate whether UAV sensors that also measure spectral reflectance in the SWIR section of the electromagnetic spectrum can improve the prediction of smallholder maize water content. Finally, this study was site- and crop-specific; therefore, studies should be conducted across various climates, different smallholder crops, and at a multitemporal scale to draw broader conclusions in the characterisation of crop water stress.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to authorisation restrictions from the funder that limit the distribution of data as the article is part of an ongoing project where other manuscripts are still being prepared.