1. Introduction
Precision agriculture (PA) is a management technique that selectively applies crop farming resources such as fertilizer, water, pesticides, and herbicides based on plant needs within a field [1,2,3]. Nitrogen is an essential plant macronutrient: it is a major constituent of organic material, including chlorophyll, and is involved in enzymatic processes and oxidation-reduction reactions; nitrogen levels in plant tissue can indicate yield potential and crop health [4]. However, nitrogen is one of the most expensive nutrients to supply, and studies have found that nitrogen recovery efficiency by annual crops was, on average, less than 50% of the fertilizer applied [5,6]. Excess fertilizer can leach from the soil and contaminate waterways, disrupting local ecosystems and causing denitrification that results in greenhouse gas emissions [7]. Nutrients added beyond the critical level for maximum growth can continue to accumulate in plant tissue without any further yield increase [4]. In grain crops such as wheat, excessive nitrogen commonly causes stems to grow so tall that they lodge: the stems bend over, which makes harvesting difficult, increases the chances of grain moisture and disease, and often reduces yield significantly [8]. Nitrogen deficiency can usually be recognized from chlorosis, the condition in which leaves yellow as the plant’s chlorophyll content drops [9]. With reduced photosynthetic activity, the plant will not reach peak health and yield will be low. Water is also key to the transportation of nutrients from the soil to a plant. The availability of water to a plant depends on the weather conditions during the growing season, the soil moisture, and the field micro-topography, which affects water flow and accumulation [10,11]. Understanding a field’s characteristics and monitoring plant biophysical characteristics, including height, leaf area, and leaf colour, can provide useful information for nitrogen fertilizer applications.
In PA, remote sensing imagery is useful because it does not require physical or destructive contact with plants to gather valuable crop information [12]. Vegetation indices (VIs), which are mathematical combinations or transformations of spectral bands, can be derived from the spectral information provided by the imager and have been widely used in agricultural research. VIs allow specific plant properties such as chlorophyll or nutrient content to be derived by taking advantage of the differential spectral properties of plants in the visible and near-infrared (NIR) wavelengths [13,14,15]. The VI information can then provide timely knowledge of crop conditions, allowing inputs to be applied at a suitable rate, at the right time and location, according to the variations within a field.
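As an illustration of how such indices are formed from band reflectances, the following minimal Python sketch computes four indices using their commonly cited definitions. The function names and the reflectance values are hypothetical, and the exact formulations used in this study may differ.

```python
def normalized_difference(band_a, band_b):
    """Return (a - b) / (a + b), or None when the denominator is zero."""
    denom = band_a + band_b
    if denom == 0:
        return None
    return (band_a - band_b) / denom


def vegetation_indices(blue, red, red_edge, nir):
    """Compute a few common VIs from surface reflectances (0-1 scale)."""
    return {
        "NDVI": normalized_difference(nir, red),        # NIR vs. red
        "NDRE": normalized_difference(nir, red_edge),   # NIR vs. red edge
        "BNDVI": normalized_difference(nir, blue),      # NIR vs. blue
        "CI_RE": (nir / red_edge - 1.0) if red_edge else None,  # red-edge chlorophyll index
    }


# Hypothetical reflectances for a single vegetated pixel (not study data).
pixel = vegetation_indices(blue=0.03, red=0.05, red_edge=0.20, nir=0.45)
for name, value in pixel.items():
    print(f"{name}: {value:.3f}")
```

Applied per pixel or per plot, each index becomes one candidate predictor variable alongside the raw band reflectances.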
Optical satellite imagery has been researched and applied in crop monitoring for several decades [16]. Examples of recently launched optical satellites in operation include RapidEye (since 2008), Landsat 8 (since 2013), and Sentinel-2 (since 2015), all frequently used in studies on crop nutrient, yield, and growth management [3]. RapidEye has five spectral bands with 6.5 m resolution. Depending on the location, the revisit time of its five-satellite constellation is between one and five days. The Landsat 8 Operational Land Imager has nine spectral bands with spatial resolutions of 15 to 30 m. It has a 16-day revisit time and acquires over 700 scenes a day. Sentinel-2 has 13 spectral bands with spatial resolutions of 10 m, 20 m, or 60 m depending on the band. The Sentinel-2 constellation is composed of two satellites, allowing for a five-day revisit time over the same area. A limitation of optical satellite imagery is that the spatial resolution may be too coarse for small-scale crop fields [12]. The temporal resolution can also be low, as with the 16-day revisit time of Landsat 8, during which crops can change significantly, so valuable information on the different stages of growth would not be obtained. Sentinel-2 and RapidEye have higher temporal resolutions of one to five days, but the revisit time can vary by location and not all images may be useful because cloud cover can obscure the land. Additionally, factors such as cloud cover, geometric distortion, and atmospheric distortion may require advanced processing expertise to ensure sufficient image quality [17].
New satellite systems, such as the PlanetScope satellite constellation, are improving in spatial and temporal resolution [18]. Designed to collect information for land-change detection, crop monitoring, climate monitoring, and more, the PlanetScope constellation is composed of over 130 satellites called Doves, allowing for spatial resolutions of 3 to 5 m and daily revisit. Since the first group of Doves was launched in March 2016, over 10 more groups have been launched to improve revisit time as well as spatial and spectral resolutions. PlanetScope imagery products are also available in multiple asset forms with different radiometric processing and rectification, such as the “surface reflectance” product downloaded for this study. Currently, a select portion of Planet data is available for free download under an open data access policy. PlanetScope imagery has been used in studies of wheat yield, biomass, and leaf area index (LAI) monitoring and modelling with promising results [19,20,21]. However, few studies have focused specifically on nitrogen management using PlanetScope data, a gap that this study aims to fill.
With the rapid advancement of unmanned aerial vehicle (UAV) technology in recent years, there is much research interest in UAV-based crop canopy nitrogen retrieval [3]. UAV-based remote sensing can provide low-cost data with higher spatial and temporal resolution for crop management. Individuals with basic training can operate a UAV using programmed routes and collect images with resolutions of <10 cm [22]. UAVs can be flown to capture image data more frequently and offer the flexibility to operate when the weather is most suitable [23]. Overall, UAV-based systems are often lower in cost than satellites for data collection and processing. Studies have shown significant correlations between crop spectral variables derived from UAV imagery and crop nitrogen content [24,25,26]. Many studies are based on the relationships between single spectral indices, or combinations of them, and crop nitrogen content, and note that these relationships vary across stages of crop growth [24,27,28]. The spectral indices with the strongest relationship to crop nitrogen were noted to occur during early wheat growth stages, before and up to flowering. Studies estimating nitrogen were often conducted under controlled experimental conditions, and more studies are needed under real field conditions.
Wheat was selected for this study because it is among the most widely grown crops in Ontario [16]. With the development of new remote sensing technologies, processing methods, and computing capabilities, estimation models for crop nitrogen can be improved. Machine learning is an area of research interest as it can be used to develop accurate crop monitoring models for large, nonparametric, nonlinear datasets [29]. Recent studies have tested linear regression, Random Forest (RF), and Support Vector Regression (SVR) for UAV-based prediction of canopy nitrogen weight (CNW) [25,26,27,29]. Although linear regression is commonly used to predict nitrogen, some VIs (e.g., NDVI) may saturate beyond the early crop growth stages, and some models may have reduced accuracy due to multicollinearity [26,30]. By contrast, machine learning-based regression methods such as RF and SVR were found to produce more accurate models than classical linear regression methods, as they are not bound by the assumptions of linear regression [30]. However, most of the current literature on remote sensing data and machine learning has considered only spectral information for crop nitrogen modelling [28]. As a crop’s nitrogen status can be affected by many factors, including fertilizer application, soil characteristics, water availability, and field micro-topography, nitrogen prediction models may be improved if these plant physiological and environmental variables are considered [31].
With better management of nitrogen fertilizers, not only can costs and negative environmental impacts be minimized, but yield and quality can also be increased. This study aims to evaluate machine learning modelling methods with plant spectral, biophysical, and field environmental variables to predict CNW in wheat crops using UAV and satellite-based imagery. The objectives of this study are (i) to study the relationship between the spatial variation of CNW and factors such as plant height, LAI, soil moisture, and topographic metrics within wheat fields in Southwestern Ontario using multispectral UAV or PlanetScope imagery; (ii) to determine the optimal combination(s) of spectral variables, crop variables, and/or environmental conditions (soil, water, and topographic data) for wheat canopy nitrogen estimation and prediction; and (iii) to evaluate the temporal variation of nitrogen estimation and prediction during the early growth stages of wheat using UAV or PlanetScope images and related variables.
4. Discussion
For this study, the RF and SVR methods were used to predict the CNW of wheat using UAV MicaSense band reflectances, PlanetScope band reflectances, selected VIs, plant height, LAI, soil moisture, and topographic metrics. The models created were grouped according to UAV-based and satellite-based data.
For the UAV RF and SVR models, calibration was conducted with 28 variables from single-date and multi-date datasets. In validation, the UAV single-date models performed poorly, with R² values of, at most, 0.25 and overall non-significant results. Combining UAV multi-date data yielded better results, with the best performance from the RF three-date model of 12, 20, and 27 May, which resulted in an R² of 0.74 and an RMSE of 2.76 g/m². For the PlanetScope RF and SVR models, calibration used 24 variables for single-date and multi-date datasets. Of the validation models, the single-date models of 20 May and 4 June had the lowest performance. However, the other PlanetScope single-date models performed much better overall than the UAV single-date models. In general, the PlanetScope multi-date models did not have significantly better results than the PlanetScope single-date models. The best-performing PlanetScope model was the RF model based on three dates, 12, 20, and 27 May, with an R² of 0.83 and an RMSE of 1.77 g/m². Both the best UAV and best satellite models were from 12, 20, and 27 May data, during which the wheat crops were in the BBCH 23–41 growth stages, mainly defined by tillering, stem elongation, and the beginning of the booting stage. As noted by Hawkesford (2017), the application of nitrogen fertilizer during these early growth stages before flowering is most conducive to efficient nitrogen use and yield response [67]. The ability to accurately estimate the nitrogen levels of crops during early growth stages would therefore be most beneficial for farmers.
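For reference, the two validation statistics reported above can be computed as in the following minimal sketch. It assumes R² is the coefficient of determination (1 − SS_res/SS_tot); some studies instead report the squared Pearson correlation, so this may not match the exact definition used here. The sample CNW values are placeholders, not data from this study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the units of the observations (e.g., g/m2)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Placeholder CNW values (g/m2) for four validation plots.
observed_cnw = [10.0, 12.0, 14.0, 16.0]
predicted_cnw = [11.0, 12.0, 13.0, 17.0]

print(f"R2   = {r_squared(observed_cnw, predicted_cnw):.2f}")
print(f"RMSE = {rmse(observed_cnw, predicted_cnw):.2f} g/m2")
```

Because RMSE is expressed in the same units as CNW, it translates more directly into a fertilizer rate error than R² does.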
In the RF variable importance plot of the best-performing UAV model, plant height was the most important predictor of CNW. Song and Wang (2019) also noted that plant height is useful in estimating phenology, biomass, and yield in addition to nitrogen uptake in wheat [68]. On the plot, LAI was the second most important predictor of CNW. LAI has been used extensively in studies to successfully predict crop chlorophyll content, biomass, and yield [69,70]. The study by Zhao et al. (2014) found a significant positive relationship between LAI and differences in crop nitrogen content across wheat growth stages [71]. The gap fraction method of calculating LAI is more accurate during the earlier growth stages of a crop, when the canopy is less dense, allowing for contrast between the vegetation and soil or between the vegetation and sky in images [68]. Among the VIs used in the model, the red-edge VIs (NDRE, RVI2, and CI_RE) were among the most important. The red-edge region (680–800 nm) has been shown to encompass sharp changes in canopy reflectance and can be used to identify important biophysical parameters of the crop. Studies of nitrogen levels have shown the red-edge region to be sensitive to leaf chlorophyll content, owing to the high absorption of red radiation and high reflectance of NIR radiation [69,72]. Of the individual MicaSense bands, the NIR band was of highest importance in the model, while the other bands had little effect. Soil moisture also appeared as a variable of high importance, and subsequent variables on the plot had noticeably lower importance. Studies have noted the importance of soil moisture in soil nitrogen mineralization and in crop nitrogen uptake and utilization [73,74]. Of the topographic metrics, the topographic wetness indices were most important, while the remaining metrics had little effect.
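The gap fraction method mentioned above is commonly based on inverting the Beer-Lambert law for light penetration through a canopy. The sketch below is a simplified illustration and not necessarily the procedure used in this study: it assumes a spherical leaf angle distribution (projection coefficient G = 0.5), ignores foliage clumping, and uses hypothetical function and parameter names.

```python
import math


def lai_from_gap_fraction(gap_fraction, view_zenith_deg=0.0, g=0.5):
    """Effective LAI from canopy gap fraction: LAI = -ln(P) * cos(theta) / G.

    gap_fraction    -- fraction of non-vegetation (soil or sky) pixels, in (0, 1]
    view_zenith_deg -- view zenith angle in degrees (0 = nadir); assumed input
    g               -- leaf projection coefficient (0.5 assumes spherical leaves)
    """
    if not 0.0 < gap_fraction <= 1.0:
        raise ValueError("gap_fraction must be in the interval (0, 1]")
    theta = math.radians(view_zenith_deg)
    return -math.log(gap_fraction) * math.cos(theta) / g


# Hypothetical gap fractions for a sparse and a dense canopy viewed at nadir.
for gap in (0.50, 0.05):
    print(f"gap fraction {gap:.2f} -> LAI {lai_from_gap_fraction(gap):.2f}")
```

Because LAI depends on the logarithm of the gap fraction, the same absolute error in gap fraction produces a larger LAI error when the gap fraction is small, which is in line with the method being more reliable before the canopy closes.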
In the RF variable importance plot of the best-performing satellite model, as in the UAV model, plant height was the most important predictor of CNW, with LAI following closely. Interestingly, the PlanetScope blue band was the second most important variable. As the PlanetScope blue band is wider than the MicaSense blue band, the wider bandwidth may have captured a change in canopy reflectance in the blue-edge region (480–517 nm), which was previously noted by Wei et al. (2008) to be related to crop nitrogen [75]. Other PlanetScope bands in the model had varying levels of importance, interspersed among the 11 VIs used. On the plot, the other non-spectral variables, soil moisture and the topographic metrics, were of least importance to the model.
From the UAV-based RF variable importance plot, groups of variables were selected for testing in models. Groups of the top 7, 8, 9, 13, and 16 variables and a spectral-only group were modelled, with the SVR model of the top seven variables performing the best with an R² of 0.80 and an RMSE of 2.62 g/m². The top seven variables included plant height, LAI, all three red-edge VIs, BNDVI, and the MicaSense NIR band. By comparison, the best models from studies by Astaoui et al. (2021), Jiang et al. (2019), and Zheng et al. (2018), which used UAV-based spectral variables to estimate wheat nitrogen content, had lower R² values, ranging from 0.63 to 0.76, and greater RMSE values [24,26,27]. The significantly lower RMSE of the best UAV-based model in this study is a major advantage in terms of reducing the costs of nitrogen fertilizer.
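The grouping step described above amounts to ranking variables by importance and keeping the first k. A minimal sketch follows; the importance scores are placeholders chosen only to be consistent with the reported top-seven membership, not the study’s values, and the function name is hypothetical.

```python
def top_k_groups(importances, group_sizes):
    """Rank variables by descending importance and return the top-k names per k."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    return {k: ranked[:k] for k in group_sizes}


# Placeholder importance scores (NOT the values from the study's plot).
uav_importances = {
    "plant_height": 0.30,
    "LAI": 0.22,
    "NDRE": 0.12,
    "RVI2": 0.10,
    "CI_RE": 0.09,
    "NIR": 0.07,
    "BNDVI": 0.05,
    "soil_moisture": 0.03,
    "TWI": 0.02,
}

groups = top_k_groups(uav_importances, group_sizes=[2, 7])
for k, names in groups.items():
    print(f"top {k}: {', '.join(names)}")
```

Each group would then be used to refit and validate the RF and SVR models, so that the smallest group with acceptable R² and RMSE can be identified.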
The satellite-based variable importance plot was used to select variable groups for model testing, including the top 6, 10, 13, and 17 variables and a spectral-only group. The RF model of the top 17 variables had the best performance, with an R² of 0.92 and an RMSE of 1.75 g/m². The variables in the best-performing model were plant height, LAI, all four PlanetScope bands, and all 11 VIs. For both the UAV and satellite best-performing selected-variable models, plant height and LAI were the only non-spectral variables. With methods for deriving the height and LAI of a wheat crop field from the UAV Phantom 4 RTK imagery, all variables in the top models can be obtained from in-situ, non-destructive remote sensing data.
For both the UAV and satellite spectral-only variable groups, the results were poor, with R² values <0.50 and significantly higher RMSE values than the other tested variable groups. This is consistent with the studies by Astaoui et al. (2021) and Schirrmann et al. (2016), which noted that, within wheat crops, UAV imagery was limited for the observation of nitrogen status but performed well in the monitoring of biophysical parameters [27,28]. In UAV validation, the RMSE of the best-performing top seven variable model was 32% lower than that of the spectral-only model. In satellite validation, the RMSE of the best-performing top 17 variable model was 45% lower than that of the spectral-only model.
In the final validation of canopy nitrogen models with variable combinations, the UAV SVR models mostly had greater R² values but also greater RMSE values than the RF models. In the UAV spectral-only variable group models, RF had better results than SVR. This is consistent with studies using spectral-only variables for crop nitrogen models, in which RF yielded better nitrogen level prediction than SVR [25,29,76]. Only in the best-performing UAV model of the top seven variables did SVR perform better than RF in terms of both a higher R² value and a lower RMSE. Of the satellite variable combination models, SVR performed better than RF except for the top 13 and top 17 variable groups. The best-performing satellite model was RF with the top 17 variable group. Although it is difficult to determine whether RF or SVR models are better when built with non-spectral and spectral variables together, ultimately the ideal result is a model that can most accurately predict canopy nitrogen in wheat. In both the UAV and satellite models with different variable combinations, the top variable groups had good overall performances. Compared with studies using spectral-only variable models, the variable combination models in this study all had lower RMSE values. In the context of nitrogen estimation and practical application, a lower RMSE (g/m²) is most beneficial for fertilizer management recommendations.
5. Conclusions
In this study, machine learning regression methods were tested to predict wheat CNW using UAV MicaSense band reflectances, PlanetScope band reflectances, associated VIs, plant height, LAI, soil moisture, and topographic metrics. For UAV models using 28 variables, the combination of 12, 20, and 27 May data with the RF validation model produced the best results with an R² of 0.74 and an RMSE of 2.76 g/m². From the model’s variable importance plot, the top 7, 8, 9, 13, 16, and spectral-only variable groups were tested. The best validation model used SVR with the top seven variables, which included plant height, LAI, four VIs, and the MicaSense NIR band. For the PlanetScope models using 24 variables, the best-performing model was RF with 12, 20, and 27 May data, resulting in an R² of 0.83 and an RMSE of 1.77 g/m². Based on the model’s variable importance plot, the top 6, 10, 13, 17, and spectral-only variable groups were tested. The validation model with the best performance was RF using the top 17 variables, including height, LAI, all four PlanetScope bands, and 11 VIs.
A common limitation of in-situ agricultural models, including those developed in this study, is their empirical nature: their applicability can be limited to the dataset on which they are built and validated. Each field and growing season has different conditions and factors that affect plant growth, so the models will need further testing to determine their effectiveness in precision agriculture methods. In addition, a third generation of PlanetScope sensors, known as SuperDove, was introduced in 2020, with the potential to capture imagery with eight spectral bands including a red-edge band. Future work can consider further testing of satellite-based nitrogen prediction models that include red-edge VI variables.