Assessing the Efficiency of Remote Sensing and Machine Learning Algorithms to Quantify Wheat Characteristics in the Nile Delta Region of Egypt

Abstract: Monitoring the growth performance of strategic agricultural crops with accurate, cost-effective, and quick tools is crucially important in site-specific management to avoid crop reductions. The availability of commercial satellite images with high spatial and spectral resolution, as well as in situ spectral measurements, can give decision makers deep insight into crop stress in a given region. This research examines remote sensing datasets for forecasting wheat crop (Sakha 61) characteristics, including the leaf area index (LAI), plant height (plant-h), above-ground biomass (AGB), and Soil Plant Analysis Development (SPAD) value, across non-stress, drought, and salinity-induced stress conditions in the Nile Delta region. In this context, the ability of in situ spectroradiometry measurements and QuickBird high-resolution images was evaluated. The efficiency of Random Forest (RF) and Artificial Neural Network (ANN) models was assessed for estimating the four measured wheat characteristics based on vegetation spectral reflectance indices (V-SRIs) extracted from both approaches and their interactions. Field surveys were carried out to collect in situ spectroradiometry measurements concomitant with the acquisition of QuickBird imagery. The results demonstrated that several V-SRIs extracted from in situ spectroradiometry data and the QuickBird image correlated with the LAI, plant-h, AGB, and SPAD value of wheat across the study site. The determination coefficient (R²) values of the association between V-SRIs of in situ spectroradiometry data and the determined wheat characteristics varied from 0.26 to 0.85. The ANN-GSIs-3 model was found to be the optimum predictive model, demonstrating a stronger relationship between the advanced features and LAI. The three V-SRI features comprised in this model were strongly significant for the prediction of LAI, and the results indicated high R² values of 0.94 and 0.86 for the training and validation phases, respectively. The ANN-GSIs-3 model constructed for the determination of chlorophyll in the plant had higher performance (R² = 0.96 and 0.92 for the training and validation datasets, respectively). In conclusion, the results of our study revealed that high-resolution remote sensing images, such as QuickBird or similar imagery, and in situ spectroradiometry measurements can feasibly provide the necessary crop monitoring data across non-stressed and stressed (drought and salinity) conditions when V-SRIs are integrated with ANN and RF algorithms.


Introduction
The scarcity of freshwater resources is an essential consideration in both arid and semi-arid environments, and thus the accessibility of low-quality water resources (e.g., drainage water, wastewater, and brackish water) has become increasingly important in supplementing supply [1][2][3][4]. Broadly, irrigation efficiency within cultivated sectors in many regions worldwide is low, and remarkable water savings could be accomplished via more precise, robust and efficient management of available irrigation water resources. Jones [5] confirmed that water stress is the main factor limiting crop production and that crop water requirements should therefore be satisfied to avoid low crop productivity.
In the Nile Delta region, grain crop production is mainly hindered by water availability and salinity as a result of limited water resources [6]. Detecting stress in agricultural crops across vast areas by traditional methods (e.g., point sampling) is time consuming and costly, and is sometimes unrepresentative of the spatial pattern of stress [4,7,8,9]. Precise and fast methods for assessing and monitoring crop health status and quantifying crop characteristics can enhance site-specific management and yield higher crop productivity than traditional monitoring techniques. In this regard, remote sensing from different platforms may offer a reliable tool in precision agriculture [10][11][12][13]. Drought and salinity stress are considered major inhibitors of strategic crop production (e.g., wheat, corn and rice), and more effort to detect their effects is needed for irrigation management strategies; several studies have quantitatively evaluated the potential of remote sensing to identify cultivated areas that are suffering from water and/or salinity stress [14]. Irrigation and water salinity must be carefully managed to maximize water use efficiency and to avoid higher salinity levels in the root zone by applying the optimum rates of water [6,15,16]. Observing crop health status normally depends on destructive sampling, which, as mentioned earlier, is time consuming, costly, and unrepresentative. A reliable alternative is remote sensing from different platforms, a fast and robust tool that integrates the crop response to the negative effects of drought and salinity.
Expected climate change and global warming over the present century will increase evaporation and, as a result, cause a lack of freshwater resources linked to water stress, which is an obstacle to the world's food security [17][18][19]. Basically, water stress and/or soil salinity stress disrupt the main plant functions, resulting in significant decreases in several morphological and physiological crop features, such as chlorophyll concentration, plant water status, photosynthetic activity, stomatal conductance and aerial fresh biomass, and causing significant losses in crop productivity [20][21][22][23]. In this context, the accurate and instantaneous detection of the impact of drought and salinity stress at both the local and regional scales is fundamentally important for achieving high crop yields and maximizing water productivity.
Agriculture 2022, 12, 332
Remote sensing from different platforms has been used to monitor the status of agricultural crops over the past few decades. It is able to provide instantaneous information over vast cultivated regions of crops of high economic importance, such as wheat and corn [6,24,25]. Such instantaneous information related to crop productivity is crucial for experts and decision makers, from small-scale growers to national authorities, offering a step forward for maximizing crop yield whilst using available water resources more efficiently. Recently, a new generation of high-capacity satellites (QuickBird, GeoEye 1, 2, and 3) has been utilized in precision farming, where it is useful for site-specific management. Many previous studies have revealed the feasibility of using satellite images to quantify various crop characteristics (e.g., biochemical and biophysical). These include, for example, the identification of chlorophyll concentration [6,26,27,28]; the prediction of yield [29][30][31][32]; measuring leaf area index (LAI) [33][34][35]; the identification of nitrogen status [36]; estimating above-ground biomass [37]; quantifying evapotranspiration [38]; and the detection of crop disease [39,40]. These studies used satellite-based V-SRIs for the identification of crop characteristics, showing the robustness of high-resolution satellite imagery (<2 m) for assessing crop growth performance and avoiding stress on crops at a local scale. A good number of satellite-based indices (V-SRIs) have been utilized as indicators of crop growth performance and yield. For instance, LAI and above-ground biomass were found to be responsive to the Green Normalized Difference Vegetation Index (GNDVI) and the simple ratio (SR) [41,42]. The sensitivity of the Red-edge Triangle Vegetation Index (RTVI) and other red-edge based indices to LAI and biomass has also been demonstrated [41].
Intra-field variability can also be mapped using QuickBird at broad scales. Monitoring crop growth performance using ground-based remote sensing data based on spectral reflectance has also been used in making effective agricultural decisions [43][44][45][46]. Ground-based remote sensing is beneficial because the spectra are recorded at a close range to the plant canopy, removing the influence of environmental variables such as clouds and shadows that sometimes hinder the acquisition of clear images.
Recent machine-learning models such as Artificial Neural Networks (ANN) and Random Forest (RF) can enhance the efficiency of predicting different crop characteristics based on spectral reflectance data. ANNs have proven highly efficient as a regression approach, especially when utilised for pattern recognition and function determination. Compared to traditional methods, an ANN can tolerate and interpret an incomplete dataset and approximate results, and it is less vulnerable to outliers [47]. As a result of their massively parallel processing architecture, ANN models are able to handle complicated calculations effectively and are therefore among the most preferred techniques for high-speed processing of massive datasets [48]. ANNs can generalize non-linear patterns within a dataset and resolve sophisticated problems. The main advantages of the RF model are its flexibility with respect to variable distributions, its robustness to abnormal outputs and noise, and its suitability for high-dimensional data. The RF model is also robust against over-fitting and has been utilized efficiently in solving regression problems [49]. These two methods combine a great number of spectral or band-ratio indices into a single index to improve the detection of measured crop characteristics. Many studies have pointed out that SRIs integrated with these machine learning models can achieve precise estimation of various biophysical crop properties [50][51][52]. For example, Yang et al. [52] found that integrating optimized SRIs with an RF model performed better for estimating the AGB of potato, both at individual growth stages and across all growth stages together. Wang et al. [52] and Niu et al. [53] found that linking SRIs with RF and ANN improved the quantification of the AGB of wheat and maize at various growth phases.
The present research hypothesized that coupling ground-based and satellite-based remotely sensed data with ANN and RF models can enhance the estimation of wheat growth performance and yield.
There is limited evidence evaluating the ANN and RF approaches for predicting the LAI, plant-h, AGB, and SPAD value of wheat based on a combined approach of V-SRIs extracted from both the in situ spectroradiometry dataset and QuickBird images. Hence, the specific objectives of the current investigation were to (i) investigate the effects of better irrigation practices and of water and salinity stress conditions on the four measured wheat characteristics (LAI, plant-h, AGB, and SPAD value); (ii) having mapped different crops via remote sensing, quantify wheat characteristics through remotely sensed data across healthy, moisture-stressed and salinity-stressed conditions; (iii) evaluate the efficiency of different classification algorithms for mapping the various crops throughout the entire study site; (iv) evaluate the efficiency of the V-SRIs extracted from in situ spectroradiometry measurements and QuickBird imagery for estimating the four measured wheat characteristics; and (v) assess the effectiveness of ANN and RF models based on V-SRIs obtained from in situ spectroradiometry data, QuickBird images, and their combination for detecting the measured wheat characteristics.

Study Site Description
The study was conducted in a region located in the southwest Nile Delta, Egypt (latitude 30.93° N, longitude 29.89° E). Field campaigns were conducted in wheat fields in March of the 2007 winter season and were planned to coincide with the satellite image acquisition. March was chosen for detecting stress in wheat crops since the climate starts to become warmer over this period and wheat crops grow quickly. To capture spatial variation in crop health status, three study areas covering a large area were selected, namely elnaser, elkahr and elbangar. The predominant soil of these study areas is a sandy loam, since these lands have been reclaimed and cultivated recently. The climate in this region can be described as having slightly hot summers and moderate winters, with average minimum and maximum temperatures of 16.6 °C and 24.3 °C; a rainfall of 0 mm and 28.3 mm per year; humidity of 69 and 68%; and wind speed of 3.7 and 3.94 m s⁻¹. To cover varying growing conditions of wheat, such as healthy, water-stressed and salinity-stressed, twelve fields were selected across the entire study site (Figure 1). The field campaigns aimed to gather ground reference data for wheat crops, which were collected randomly at different sites. All common types of irrigation systems are used across the study site (traditional surface, sprinkler and drip irrigation). Farmers sometimes use alternatives to fresh water (e.g., agricultural drainage water) to irrigate crops at the tail end of the irrigation canal system, even with the risk of an increased salinity level. Delivering water through the main irrigation canals sometimes takes longer than usual for various reasons, which subjects crops to water stress and ends in reduced crop productivity.
Wheat is usually planted during the first and second weeks of November throughout the entire region and is often harvested at the beginning of May.

In Situ Spectroradiometry Measurements
In situ spectroradiometry measurements were collected from different fields across the whole study site during the 2007 winter season (7, 8 and 9 March), concomitant with the collection of the QuickBird satellite image. Furthermore, spectral measurements were collected in 2014 and 2015 to be used for modelling. Spectral measurements were collected from random fields across the entire study region, taking into consideration the size of the field and the crop health status. An ASD FieldSpec mobile spectroradiometer with a spectral range of 325-1075 nm was employed to collect reflectance from wheat leaves and canopies. To keep changes in the solar zenith angle to a minimum, spectral reflectance acquisition was restricted to between 11:00 and 15:00 h GMT. The spectra were smoothed using the ASD software to eliminate noise at both ends of the spectrum; this was done for all datasets acquired by the instrument. During spectra collection, a metal stand of 2 m height was used to keep the instrument at a constant distance from the ground at all sampling locations.

Remote Sensing Imagery Acquisition, Processing and Analysis
One QuickBird image was captured covering wheat crops within the study site. QuickBird provides four multi-spectral bands at a high spatial resolution of 2.4 m. The technical characteristics of the QuickBird image of the wheat fields are presented in Table 1. The QuickBird image of wheat and other classes (e.g., clover, bare soil and water) was captured at 09:13 h GMT on 7 March 2007 and was radiometrically corrected by the image supplier (Infoterra Group, Newcastle upon Tyne, UK). The image-to-image technique in the ENVI software was run to geo-correct the image, using ground control points (GCPs) previously collected at known locations during the field campaigns and covering a large area of the study site. The FLAASH (Fast Line-of-sight Atmospheric Analysis of Hypercubes) module was used to atmospherically correct the image. Various unsupervised and supervised classification algorithms were then run on the corrected image in ENVI v5.1 to choose the most efficient algorithm for identifying wheat and other classes across the whole study area. Unlike unsupervised algorithms, supervised algorithms require a pre-prepared dataset collected during field campaigns. A validation dataset was constructed manually on the QuickBird image, comprising more than 2000 clear pixels for every class to exclude interference between pixels of different classes (different spectral signatures). Pure pixels were picked carefully to avoid misleading data. As a post-classification technique, a confusion matrix was extracted for both the k-means and maximum likelihood classification (MLC) algorithms.

Sampling Strategy of Wheat Crop
During field campaigns, three random vegetative samples were taken within each field after collecting reflectance spectra from the wheat canopies to determine the wheat characteristics (LAI, plant-h, AGB and SPAD value). An area of 1 m² was sampled at the soil level, and this was repeated three times within each field. AGB was recorded promptly for each sample. Wheat plant height was measured using a measuring tape. LAI is the ratio between the total leaf area of a given number of sampled plants and the ground area they occupy. Following the calculation of the leaf area for individual samples, the LAI was calculated by the following formula:

$$\mathrm{LAI} = \frac{LA}{OA}$$

where LAI refers to the leaf area index, LA refers to the leaf area per sample, and OA is the land area occupied by the plants.
A hand-held SPAD 502 chlorophyll meter (Minolta, Osaka, Japan) was employed during the field surveys to measure chlorophyll concentration as SPAD values. Care was taken to measure chlorophyll in apical leaves. Chlorophyll was recorded at different locations on the same leaf to keep variability to a minimum, and the average of these records was then calculated.

Calculating Vegetation Spectral Indices
QuickBird has four spectral bands: blue, green, red, and near-infrared. Previously used V-SRIs were extracted from both the in situ spectroradiometry data and the 7 March QuickBird image to assess the efficiency of remote sensing for estimating the wheat characteristics. The V-SRIs were selected based on their sensitivity to changes in biomass, leaf pigmentation, leaf/tissue structure and plant water content. The indirect effects of stress, which are manifested by changes in reflectance in the visible (VIS) and near-infrared (NIR) ranges, are linked to leaf and canopy properties, such as leaf pigments, leaf structure, and scattering, which change as a result of stress factors. Table 2 lists the formulae used for calculating the selected V-SRIs, with associated references.
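As an illustration of how broadband indices of this kind are computed, the following minimal Python sketch evaluates a few indices named elsewhere in this paper (NDVI, GNDVI, SR, RDVI and IPVI) from green, red and NIR reflectance. The formulas are the commonly published definitions and are assumed, not confirmed, to match those in Table 2; the function name and the reflectance values are hypothetical.

```python
import math

def vegetation_indices(green, red, nir):
    """Common broadband vegetation indices from surface reflectance (0-1).

    Standard textbook definitions; assumed (not confirmed) to match Table 2.
    Inputs must satisfy nir + red > 0 and nir + green > 0.
    """
    return {
        "NDVI": (nir - red) / (nir + red),            # normalized difference
        "GNDVI": (nir - green) / (nir + green),       # green normalized difference
        "SR": nir / red,                              # simple ratio
        "RDVI": (nir - red) / math.sqrt(nir + red),   # renormalized difference
        "IPVI": nir / (nir + red),                    # infrared percentage
    }

# Hypothetical reflectance values for a dense and a sparse canopy.
dense = vegetation_indices(green=0.08, red=0.05, nir=0.45)
sparse = vegetation_indices(green=0.10, red=0.12, nir=0.28)
for name in dense:
    print(f"{name}: dense={dense[name]:.3f} sparse={sparse[name]:.3f}")
```

Under these definitions IPVI is a rescaling of NDVI, IPVI = (NDVI + 1)/2, so the two indices give identical R² values in a simple linear regression against any trait.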

Back-Propagation Neural Network (BPNN)
One of the commonly used artificial neural networks is the BPNN [64], which is a neural network with three layers. The input layer stores the basic data of the network, the hidden layer connects the input and output layers, and the output layer delivers the outcome, i.e., the predicted value of the measured parameter. The network has a single hidden layer with several nodes; the number of nodes is determined by the regression's accuracy. The hidden layer, whose connections are described by weights, contains the "activation" nodes. ANN models are developed as extended mathematical models that replicate human cognition in prediction and pattern recognition based on a sequence of neurons or nodes that are connected by weighted connections [65,66].
The network was trained for at least 1000 iterations or until the error value dropped below a threshold of 10⁻⁴. On the training dataset, leave-one-out validation (LOOV) was used to determine the number of neurons in the hidden layer. The limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) optimizer was chosen to implement the algorithm efficiently [67]. A weight-based importance measure [68] was used to determine the most informative features in order to improve the predictive efficiency of the regression model and to reduce the dimensionality of the hyperspectral data.
In this measure, M is the importance measure for an input variable, $n_p$ is the number of input variables, $n_H$ is the number of nodes in the hidden layer, $|I|_{pj}$ is the absolute value of the hidden-layer weight linking the pth input variable to the jth hidden node, and $|O|_j$ is the absolute value of the output-layer weight for the jth hidden node.
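The symbols defined above (absolute input-to-hidden weights, absolute hidden-to-output weights, and sums over the input variables and hidden nodes) are consistent with Garson's weight-partitioning measure of input importance. The sketch below implements that measure in plain Python on the assumption that it is the one intended; the function name and the example weights are hypothetical.

```python
def garson_importance(hidden_weights, output_weights):
    """Relative importance of each input under Garson's weight partitioning.

    hidden_weights[p][j]: weight from input p to hidden node j (|I|_pj).
    output_weights[j]:    weight from hidden node j to the output (|O|_j).
    Assumption: this is the measure M referred to in the text.
    """
    n_p = len(hidden_weights)    # number of input variables
    n_h = len(output_weights)    # number of hidden nodes
    # Total absolute incoming weight of each hidden node.
    col_sums = [sum(abs(hidden_weights[p][j]) for p in range(n_p))
                for j in range(n_h)]
    # Share of each hidden node attributable to input p, scaled by |O|_j.
    raw = [sum(abs(hidden_weights[p][j]) / col_sums[j] * abs(output_weights[j])
               for j in range(n_h))
           for p in range(n_p)]
    total = sum(raw)
    return [r / total for r in raw]   # importances sum to 1

# Hypothetical weights: two inputs, two hidden nodes.
importance = garson_importance([[1.0, -2.0], [3.0, 2.0]], [0.5, -1.0])
print(importance)
```

Because only absolute values are used, the measure ranks inputs by the magnitude of their contribution and carries no information about its direction.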

Random Forest Regression (RF)
The RF model, which is based on regression trees, is a useful strategy for determining the relationship between a large number of independent variables and a single dependent variable. RF uses recursive partitioning to split the dataset into increasingly homogeneous subsets within each regression tree, builds a number of such trees (ntree), and averages the results of all the trees. Bootstrap sampling of the training dataset is used to build each tree to its maximum size without interrupting the selection of input variables at each node. While growing each tree, RF introduces randomization by selecting a random subset of variables (mtry) to determine the split at each node [69]. The leave-one-out validation approach (LOOV) is then used to optimize the two major model parameters (mtry and ntree) by minimizing the root mean squared error of validation (RMSEV). The ntree value was searched within 1 to 25, while mtry was evaluated with different feature numbers. Once the model had been trained using the optimal parameters, all the features were ranked; the selection of the optimal features was based on variable importance statistics [70]. During all iterations, the results were collected, and several options were analyzed to determine the best feature combinations, which are the ones with the lowest cost.
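The leave-one-out selection step can be sketched independently of the model being tuned. The Python sketch below holds out each sample once, fits on the remainder, and keeps the candidate setting with the lowest RMSEV. To stay self-contained it uses a nearest-neighbour mean as a stand-in predictor, which is not a random forest; with an RF, the grid would instead hold (ntree, mtry) pairs. All names and data values are hypothetical.

```python
import math

def loo_rmse(xs, ys, fit_predict, **params):
    """Leave-one-out RMSE: predict each sample from a model fitted on the rest."""
    sq_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        pred = fit_predict(train_x, train_y, xs[i], **params)
        sq_errors.append((ys[i] - pred) ** 2)
    return math.sqrt(sum(sq_errors) / len(sq_errors))

def knn_predict(train_x, train_y, x, k):
    """Stand-in model (NOT a random forest): mean target of the k nearest samples."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def select_by_loo(xs, ys, fit_predict, grid):
    """Return (best_params, best_rmsev) over a list of candidate parameter dicts."""
    scored = [(loo_rmse(xs, ys, fit_predict, **p), p) for p in grid]
    best_rmse, best_params = min(scored, key=lambda item: item[0])
    return best_params, best_rmse

# Hypothetical index values and LAI measurements.
ndvi = [0.20, 0.35, 0.50, 0.62, 0.71, 0.80]
lai = [0.7, 1.4, 2.3, 2.9, 3.4, 4.0]
grid = [{"k": 1}, {"k": 2}, {"k": 3}]
print(select_by_loo(ndvi, lai, knn_predict, grid))
```

Leave-one-out validation suits small field datasets such as the one described here because every sample is used for both fitting and validation, at the cost of refitting the model once per sample and per candidate setting.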

Model Evaluation
The root mean square error (RMSE) and the determination coefficient (R²) were used to evaluate the effectiveness of the regression models [71,72]. They are defined as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(F_{act,i} - F_{p,i}\right)^{2}}$$

$$R^{2} = 1 - \frac{\sum_{i=1}^{N}\left(F_{act,i} - F_{p,i}\right)^{2}}{\sum_{i=1}^{N}\left(F_{act,i} - F_{ave}\right)^{2}}$$

where $F_{act}$ refers to the true value determined from laboratory measurements, $F_{p}$ refers to the predicted or simulated value, $F_{ave}$ represents the mean value, and N represents the total number of data records.
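A minimal Python sketch of the two metrics follows, assuming the standard definitions: RMSE as the square root of the mean squared difference between true and predicted values, and R² as one minus the ratio of the residual sum of squares to the total sum of squares about the mean of the true values. The function names and sample values are hypothetical.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between true (F_act) and predicted (F_p) values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def r_squared(actual, predicted):
    """Determination coefficient, assuming R2 = 1 - SS_res / SS_tot.

    SS_tot is taken about the mean of the true values (F_ave).
    """
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical measured and predicted LAI values.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 2.0, 3.0, 3.0]
print(rmse(measured, predicted), r_squared(measured, predicted))
```

Reporting both metrics is useful because RMSE is expressed in the units of the trait, while R² is unitless and comparable across LAI, plant-h, AGB and SPAD.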

Statistical Analysis
Data were statistically analyzed using SPSS v. 12.0 (SPSS Inc., Chicago, IL, USA) and were tested for normality with the Anderson-Darling test at the 95% confidence level. The relationship between the tested SRIs and the wheat characteristics was also investigated using simple linear regression performed in Sigma Plot v. 11.0 (SPSS, Chicago, IL, USA) to identify the optimum SRIs based on the greatest value of the determination coefficient (R²). The significance level for these relationships was set at 0.05.
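The index-selection step described here, fitting a simple linear regression of a trait on each index and keeping the index with the greatest R², can be sketched in a few lines of Python. This is an illustrative ordinary least squares fit, not the SPSS or Sigma Plot procedure itself; the function names and data are hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least squares fit y = slope * x + intercept; returns (slope, intercept, r2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)   # squared Pearson correlation
    return slope, intercept, r2

def best_index(indices, trait):
    """indices: dict of name -> values; returns (name, r2) with the highest R2."""
    scores = {name: linear_fit(values, trait)[2] for name, values in indices.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]

# Hypothetical index values for four fields and a measured trait.
candidates = {"index_A": [1.0, 2.0, 3.0, 4.0], "index_B": [1.0, 3.0, 2.0, 4.0]}
trait = [3.0, 5.0, 7.0, 9.0]
print(best_index(candidates, trait))
```

For a single predictor, R² equals the squared Pearson correlation, which is why the sketch can compute it directly from the sums of squares without forming the residuals.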

Effect of Well Irrigated and Varying Stress Conditions on the LAI, Plant Height, Biomass and SPAD Value of Wheat
Data collected over different field visits demonstrated significant differences in the wheat crop characteristics. Salinity stress clearly had the greatest negative impact on the crop characteristics, as presented in Table 3. Across the three years, LAI values ranged from 2.85 to 4.09 under non-stress, from 2.30 to 2.94 under drought stress, and from 0.67 to 2.85 under salinity stress (Table 3).
The results further showed that salinity stress had the highest impact on the crop characteristics, followed by water stress, as detailed in Table 3. The noticeable variations in LAI, plant-h, AGB, and SPAD values between non-stressed, drought-stressed, and salinity-stressed wheat may be attributed to differences in photosynthetic activity, which is closely related to the development of leaves and aerial fresh biomass. The adverse effects of both stresses can impair essential developmental processes. Salinity is among the factors that most negatively influence the productivity and quality of wheat by changing the main biochemical and physiological activities in crops [73]. It also delays germination and inhibits seedling growth and metabolism, which leads to decreased plant growth and final crop yield [74]. Additionally, salinity negatively impacts the growth and thus the final yield of crops by reducing the available soil moisture in the root zone and through the toxic effects of high concentrations of chloride and sodium ions on plants [75]. Moreover, salinity reduces the number of fertile tillers, the kernel weight and the number of spikelets per spike (Abass et al., 2013), decreases the number of grains per spike and the 1000-grain weight [76], and finally leads to lower productivity [77]. The salinity stress remarkably affected the measured characteristics of corn in comparison to the other factors. Salinity inhibits leaf initiation and elongation, as well as internode advancement, and accelerates leaf abscission, resulting in lower BFW and LAI [20,21]. Salinity stress reduces photosynthetic pigments such as chlorophylls a and b, as well as carotenoids, which is linked to a poor net photosynthesis rate in maize [21]. These findings support the need for a regular and timely evaluation of the assessed crop traits in order to improve wheat tolerance to changing stress situations.
As a result, boosting wheat tolerance necessitates large-scale techniques (e.g., remote sensing from different platforms) that are dependable, rapid, and non-destructive. Values sharing the same letter are not significantly different (p ≤ 0.05) among fields.

Satellite-Based NDVI for Well-Irrigated and Stressed Wheat Fields
During fieldwork campaigns across the three study sites, twelve fields were chosen randomly, taking crop health status into consideration so as to cover a range of healthy wheat fields and varying stress conditions (e.g., drought and salinity stress) and to capture the spatio-temporal variations in the LAI, plant-h, AGB and SPAD values of the wheat crop. Obvious differences in the values of the measured wheat characteristics were noticed. Figure 2 shows an example of satellite-based NDVI for non-stressed and stressed wheat fields. The healthy wheat fields (red in color) received the same agricultural practices (e.g., planting date, fertilization dose, wheat cultivar and irrigation regime). The stressed wheat fields, seen in the top right corner of the image, lack freshwater resources since they are at the tail end of the irrigation network, and therefore the wheat crops suffer from either water stress or salinity stress caused by using agricultural drainage water as an alternative to fresh canal water. Generally, this area is influenced by high salinity stress since it has been irrigated with highly saline water (water salinity > 4 dS m⁻¹) for more than two decades. Figure 2 also shows a gradient in NDVI values in the healthy location, which may have been a result of moderate moisture stress. Moreover, the efficiency of the drainage system may vary from one field to another among neighboring fields (small-field system) depending on how far away the ditches are, which in turn affects the availability of water for plants.

Classifying Wheat and Other Crops across the Study Area
As presented in Table 4, the MLC algorithm produced four classes: wheat, water surfaces, clover, and soil. As a post-classification technique, a confusion matrix was built to assess the classification algorithms, providing the total accuracy, the accuracy for each class, the kappa coefficient, and the producer's and user's accuracies. The results demonstrated high per-class accuracies, ranging from 86.0% for clover to 97.8% for water. The same trend was observed for both producer's and user's accuracies, which were also high for the MLC algorithm (>0.84). The accuracy for identifying wheat using MLC was greater than 90%, along with high accuracies for the other classes (Table 4). Unlike supervised algorithms, k-means does not require a reference dataset. Although k-means gave high classification accuracies for wheat and clover, the two main crops cultivated in winter across the study area (97.6% and 94.6%, respectively; Table 5), its accuracy for identifying water and soil surfaces was clearly lower, which led to a greater number of misclassified pixels and thus a lower overall accuracy. In other words, k-means classified wheat with high accuracy, but its overall accuracy across the individual classes was lower than that of MLC. Furthermore, the kappa coefficient for MLC (0.90) was higher than that obtained from the k-means algorithm (0.70), which can be attributed to the great number of pixels misclassified by k-means. One source of misclassified pixels is the confusion between dry and wet bare soils, which differ in how they reflect solar energy. The results therefore suggest that the MLC classifier produced higher total and single-class accuracies for classifying the various crops throughout the entire study area.
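The accuracy measures named above follow directly from the confusion matrix. The Python sketch below computes overall accuracy, the kappa coefficient, and producer's and user's accuracies, assuming rows hold the reference classes and columns the classified classes. The function name and the two-class matrix are hypothetical and are not taken from Tables 4 or 5.

```python
def accuracy_metrics(matrix):
    """Accuracy measures from a square confusion matrix.

    matrix[i][j]: number of pixels of reference class i assigned to class j
    (rows = reference, columns = classified; an assumed layout).
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    diag = [matrix[i][i] for i in range(k)]
    row_tot = [sum(matrix[i]) for i in range(k)]
    col_tot = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    overall = sum(diag) / total
    # Agreement expected by chance, from the marginal totals.
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / total ** 2
    return {
        "overall": overall,
        "kappa": (overall - expected) / (1 - expected),
        "producers": [diag[i] / row_tot[i] for i in range(k)],  # omission side
        "users": [diag[i] / col_tot[i] for i in range(k)],      # commission side
    }

# Hypothetical two-class example (e.g., wheat versus bare soil).
print(accuracy_metrics([[45, 5], [10, 40]]))
```

Kappa discounts the agreement expected by chance, which is why a classifier can reach a high accuracy for one dominant class, as k-means did for wheat, and still obtain a clearly lower kappa than a classifier that performs evenly across classes.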

Assessment of Various Vegetation-SRIs Derived from Both In Situ Spectroradiometry and Satellite Based Remote Sensing Data under Non-Stress and Stress Conditions
The SRIs calculated from different fields throughout the study site were highly effective in assessing wheat crop characteristics, including LAI, plant-h, AGB and SPAD value. The spectra collected on different dates were combined to choose the optimum indices for the detection of wheat characteristics. The coefficient of determination (R²), used as an indicator of the relationship between wheat characteristics and SRIs, showed strong relationships. Most of the SRIs derived from in situ spectral measurements produced high correlations with the measured wheat characteristics, with R² values reaching up to 0.85 (Table 6). The SRIs showed moderate to strong relationships with LAI and AGB, with R² varying from 0.54 to 0.84 and from 0.47 to 0.85, respectively. Interestingly, RDVI extracted from in situ measurements was the most sensitive index for assessing LAI and AGB, with respective R² values of 0.84 and 0.85 across three years. The SRIs showed low to moderate relationships with plant-h, with R² varying from 0.16 to 0.59. The highest relationships with plant-h were found for IPVI and NDVI, with R² = 0.59 in the second year. The SRIs showed moderate to strong relationships with chlorophyll (SPAD values), with R² varying from 0.39 to 0.82. The IPVI, NDVI, and SLAVI were the optimum indices for the detection of chlorophyll (SPAD values), with an R² of 0.82 in the first year (Table 6). The results further showed the effectiveness of QuickBird satellite images in detecting wheat characteristics. The SRIs derived from the satellite image showed good relationships with three wheat characteristics (AGB, LAI and SPAD value), with R² values varying from 0.51 to 0.61, 0.52 to 0.67, and 0.41 to 0.61, respectively. The NDVI derived from the satellite data was the most sensitive index for assessing AGB, LAI and chlorophyll (Table 7).
The NDVI, which combines VIS and NIR wavebands, has been widely recommended for assessing biophysical properties (biomass, LAI, and green vegetation cover) [78,79]. All the SRIs derived from the satellite image showed weak relationships with wheat plant height. The difference in R² values produced from in situ spectra and satellite data could be due to the time difference between collecting the vegetation samples and the acquisition of the satellite image. The satellite image was captured a few days prior to the field surveys, while the spectroradiometry campaigns were concurrent with the collection of vegetation samples, which may explain why the SRIs extracted from in situ data had higher correlations with the measured wheat characteristics. Many previous studies that employed ground and satellite-based vegetation-SRIs agree with our findings in terms of assessing plant growth and SPAD value performance [78][79][80][81][82][83][84]. For example, Gao et al. [80] found positive correlations between the ratio vegetation index (RVI) and four maize growth parameters: LAI (r = 0.47), biomass (r = 0.59), height (r = 0.59), and leaf water area index (LWAI) (r = 0.54). For these variables, Towers et al. [85] found that the NDVI and Enhanced Vegetation Index (EVI) showed similar correlation coefficients. The NDVI showed a closer relationship with LAI than the perpendicular vegetation index (PVI), the mixed soil-adjusted vegetation index 2 (SAVI2), the modified soil-adjusted vegetation indices (MSAVIs), and the chlorophyll index (CIrededge). The findings of this study revealed that ground remote sensing and satellite imagery have the potential to provide critical crop monitoring data in irrigated and stressed areas.

Table 6. Coefficient of determination for the association between various vegetation-SRIs obtained from in situ spectroradiometry and Egyptian wheat characteristics in three years, collected from the study site.
The differences observed between the R² values in Tables 6 and 7 may be due to (1) the time difference between collecting the in situ spectra and the satellite acquisition; (2) the viewing geometry, since in situ spectra are collected at the nadir position while satellite images are not; and (3) the difference in resolution, since spectroradiometers have an accuracy of 1 nm while the QuickBird satellite has a spatial resolution of 2 m, which can affect the derivation of the indices.
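The index-versus-trait comparison reported in this section can be illustrated with a minimal Python sketch. It uses the standard two-band definitions of NDVI, RDVI and IPVI and computes R² as the squared Pearson correlation of a simple linear fit. The reflectance and LAI values are invented for illustration, and the exact wavebands used for each index in this study are not assumed here.

```python
import math


def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return (nir - red) / (nir + red)


def rdvi(nir, red):
    """Renormalized difference vegetation index."""
    return (nir - red) / math.sqrt(nir + red)


def ipvi(nir, red):
    """Infrared percentage vegetation index."""
    return nir / (nir + red)


def r_squared(x, y):
    """R² of a simple linear regression of y on x (squared Pearson r)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical plot-level reflectances (0-1) and measured LAI values.
nir_band = [0.32, 0.38, 0.41, 0.47, 0.52, 0.55]
red_band = [0.12, 0.10, 0.09, 0.07, 0.06, 0.05]
lai = [1.4, 2.1, 2.3, 3.2, 3.9, 4.1]

for name, index in (("NDVI", ndvi), ("RDVI", rdvi), ("IPVI", ipvi)):
    values = [index(n, r) for n, r in zip(nir_band, red_band)]
    print(name, round(r_squared(values, lai), 3))
```

Since IPVI equals (NDVI + 1) / 2, the two indices are linearly related and give identical R² values against any trait, which is consistent with IPVI and NDVI sharing the same R² for plant-h and SPAD in this section.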

Performance Evaluation of Various Models to Detect the Measured Wheat Characteristics
In the present investigation, the two groups of SRIs (group 1, SRIs of in situ spectrometry, and group 2, SRIs of satellite imagery) were used individually or in combination as input variables in the ANN and RF models to assess the four measured wheat parameters (Tables 8-11). These indices showed a high performance for identifying the tested crop characteristics. The ANN and RF models based on in situ spectroradiometry were tested using the two-year datasets, split into calibration (n = 72) and validation (n = 36) datasets. The ANN and RF models based on high resolution satellite images were tested using the two-year datasets, split into calibration (n = 24) and validation (n = 12) datasets. In addition, the ANN and RF models combining in situ spectroradiometry and high resolution satellite image data were tested using the two-year datasets, split into calibration (n = 96) and validation (n = 48) datasets.
The ANN and RF were trained using the V-SRIs of ground-based remote sensing, satellite-based remote sensing, and all indices combined as independent variables for detecting the tested characteristics of wheat (dependent variables), as presented in Tables 8 and 10. The results revealed that the combined ground-satellite indices (GSIs) were the best-performing input set for selecting the top-ranked variables, as shown in Tables 9 and 11. A comparison was conducted between the predicted values and the reserved values, which had not been used in training the ANN. Our investigation aimed to evaluate the performance of the machine-learning approaches and to compare their outputs clearly, since the use of such models can remarkably enhance predictability. Independent validation is the most reliable way of evaluating the efficiency of a regression model because the validation datasets are not used in the model construction process. According to its performance, the ANN-GSIs-8 was the best predictive model, showing the strongest relationship between the advanced features and LAI (Table 9, Figure 3). The three features of V-SRIs of both methods comprised in this model are of considerable significance for estimating LAI. It generated R² values of 0.99 and 0.97 for the training and validation datasets, respectively. Regarding plant-h, the ANN-GSIs-4 model outperformed the other models. The R² value for the training dataset was 0.94, while that of the validation dataset was 0.72 (Table 9). With R² values of 0.94 and 0.87 for the training and validation datasets, respectively, the ANN-GSIs-10 was the most accurate AGB prediction model in this study. Table 9 also shows the ANN-GSIs-8 model that was developed for determining chlorophyll in plants; the performance of this model exceeded expectations, as it achieved R² values of 0.90 and 0.86 for the training and validation datasets, respectively.

Elsherbiny et al. [48] suggested that despite the improvement in the expected performance, more actions are needed during the training process to upgrade the regression models for better predictions; such actions may include the separation of the high-level features and the optimization of the model hyperparameters. Similar to our results, Kizil et al. [86] found that ANNs based on the NDVI, SR, green NDVI, chlorophyll green (CLg), red NDVI, and chlorophyll red edge (CLr) indices have a high potential for predicting lettuce yield under drought stress.

Table 8. Outcomes of calibration and validation models of ANN for the association between V-SRIs extracted from in situ spectrometry and satellite imagery and leaf area index, plant height, above ground biomass and SPAD value prior to selecting the best features. Levels of significance: **, ***: at p < 0.01, and p < 0.001, respectively. Group 1, SRIs extracted from in situ spectrometry. Group 2, SRIs extracted from satellite imagery.

Table 9. Outcomes of calibration (n = 96) and validation models (n = 48) of ANN for the association between the best V-SRIs extracted from in situ spectrometry and satellite imagery and leaf area index, plant height, above ground biomass and SPAD value. Levels of significance: *, **, ***: at p < 0.05, p < 0.01, and p < 0.001, respectively. Group 1, SRIs extracted from in situ spectrometry. Group 2, SRIs extracted from satellite imagery.

Table 11. Outcomes of calibration (n = 96) and validation models (n = 48) of random forest for the association between the best V-SRIs extracted from in situ spectrometry and satellite imagery and leaf area index, plant height, above ground biomass and SPAD value.

The performance of the RF models was compared based on the ground-satellite indices, integrating all the studied indices, as shown in Table 10. This table lists the best proposed indices, the best parameters, and the model outputs (RMSE and R²) for the training and validation sets.
The outcomes of the advanced models after adopting the best features and optimizing them are given in Table 11. The RF-GSIs-6 model was superior in the prediction of LAI and was built with ntree = 3 (number of trees) and mtry = 4 (number of features). This model improved R² and RMSE to 0.93 and 0.161, respectively. The RF-GAI-3 model achieved high performance at ntree = 2 and mtry = 11 for forecasting plant-h (Table 11), with R² and RMSE reaching 0.64 and 0.044, respectively. The RF-GSIs-7 model was established with ntree = 4 and mtry = 10 to estimate AGB; R² increased to 0.81, while RMSE decreased to 0.126. For the RF-GSIs-14 model, built from 14 indices, validation was optimized at ntree = 2 and mtry = 16, and model behavior improved to an R² of 0.75 and an RMSE of 1.605 (Table 11). The proposed RF models achieved high performance through two procedures: optimizing the hyperparameters and selecting the best indices [87]. Recently, some investigations have concluded that combining SRIs with different machine-learning models improves the detection of plant characteristics in comparison with the use of individual SRIs [48,50,52,53,88]. For example, Yang et al. [52] found that the optimized V-SRIs were more reasonable as input parameters in the RF models than some previously published SRIs (e.g., NDVI, MSAVI, and SRVI) for estimating the biomass of potato and corn crops. These findings show that selecting the appropriate SRIs as input variables is critical to the models' performance in determining crop characteristics. Both ANN and RF seem to be potential tools for predicting the measured crop traits using the integrated SRIs from ground-based remote sensing and satellite imagery. The ANN model goes a step further, increasing the predictability of the measured crop characteristics more than RF or individual SRIs.
The results suggested that robust prediction accuracy for the proposed variables could be achieved if a suitable algorithm and high-level variables were assigned. In previous research, Song et al. [89] explained that ANNs and SVMs do not differ significantly in classification accuracy, although the SVM usually performs slightly better. The SVM classifier has a greater tolerance of a small training set and avoids the problem of insufficient training of ANN classifiers. ANNs and SVMs can also vary greatly with regard to training time. Salas et al. [90] showed that the Maximum Entropy (MaxEnt) and Generalised Linear Model (GLM) approaches had strong discriminatory abilities for image classification, with area-under-the-curve (AUC) values ranging between 0.75 and 0.93 for MaxEnt and between 0.73 and 0.92 for GLM. The ensemble model resulted in improved accuracy scores compared with the individual models. In this work, the advanced models performed more accurately than those reported in previous research due to the optimization of hyperparameters and the selection of different high-level features.
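The evaluation protocol used throughout this section (fit on the calibration set, then report R² and RMSE on a validation set that played no part in model construction) can be sketched as follows. This is an illustration only: a straight-line least-squares fit stands in for the ANN and RF models, the data are synthetic, and R² is computed as 1 - SS_res/SS_tot, which may differ from the exact definition used for Tables 8-11. The 72/36 split mirrors the in situ calibration and validation sizes given above.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return slope, mean_y - slope * mean_x


def r2_and_rmse(observed, predicted):
    """R² (1 - SS_res / SS_tot) and root-mean-square error of predictions."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot, math.sqrt(ss_res / n)


# Synthetic stand-in data: one spectral index and a noisy LAI-like trait.
rng = random.Random(0)
index_values = [rng.uniform(0.2, 0.9) for _ in range(108)]
trait = [6.0 * v + rng.gauss(0, 0.3) for v in index_values]

# Calibration (n = 72) and validation (n = 36) subsets.
cal_x, val_x = index_values[:72], index_values[72:]
cal_y, val_y = trait[:72], trait[72:]

slope, intercept = fit_line(cal_x, cal_y)  # the model sees calibration data only
cal_r2, cal_rmse = r2_and_rmse(cal_y, [slope * v + intercept for v in cal_x])
val_r2, val_rmse = r2_and_rmse(val_y, [slope * v + intercept for v in val_x])

print(f"calibration: R2 = {cal_r2:.2f}, RMSE = {cal_rmse:.3f}")
print(f"validation:  R2 = {val_r2:.2f}, RMSE = {val_rmse:.3f}")
```

A validation R² well below the calibration R², as reported for plant-h (0.94 versus 0.72), suggests that a model fits its training data more closely than it generalizes.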

Conclusions
This research aimed to assess the effectiveness of in situ spectroradiometry and satellite-based remote sensing data as robust and reliable approaches for site-specific management in both arid and semi-arid environments. We evaluated the performance of V-SRIs obtained from both in situ spectroradiometry measurements and high resolution QuickBird images to quantify the LAI, plant-h, AGB, and SPAD values of wheat across healthy, water-stressed and salinity-stressed conditions in the Nile Delta region. The main results indicated that most of the vegetation-SRIs extracted from the two platforms could effectively assess wheat characteristics. Generally, the vegetation-SRIs of spectrometry data demonstrated a greater R² with the measured wheat traits than the vegetation-SRIs obtained from QuickBird data. Overall, both in situ spectroradiometry and satellite-based indices showed a high performance for detecting the measured wheat characteristics. Both ANN and RF seem to be potential tools for predicting the measured crop traits using the integrated SRIs from in situ spectroradiometry and remote sensing satellite images. The ANN model offers an advantageous tool that increases the prediction efficiency of the measured crop characteristics more than RF and spectral indices. For example, the ANN-GSIs-3 model constructed for the determination of chlorophyll in the plant had the highest performance (R² = 0.96 and 0.92 for the training and validation datasets, respectively). In conclusion, both ANN and RF models are simple, accurate and reliable, and would be highly efficient as non-destructive, large-scale methods for detecting different plant morpho-physiological characteristics quickly and precisely, especially with the new advances in satellite imagery technology.