The majority of India’s population (60%) depends on the agricultural sector for its livelihood [1]. Agriculture depends mainly on monsoon rainfall and on surface water and groundwater irrigation. Because monsoon rainfall is highly variable, south Indian farmers are forced to adapt their irrigated areas to local water availability [2]. Irrigated crop production, accompanied by fertiliser application and other inputs in semi-arid parts of India, is a major contributor to the green revolution, which has enabled the country to become self-sufficient [3]. Timely fertiliser application together with water supply is essential for a successful crop. Spectral data from Remote Sensing (RS) have been studied for many years to assess nutrient and water variability for yield optimisation [4].
RS can be an effective tool for monitoring crop production [7] and estimating yield [10]. Early estimation of yield may allow better planning and forecasting of market prices and may support food security based on regional, national and global demand and supply. RS allows information about crop production to be collected with non-destructive methods [12] on a large scale and for many fields at the same time. Hyperspectral (HS) RS provides continuous narrow-band spectral data from 400 to 2500 nm and has been shown to capture variations in the spectral response of crops for the detection of nitrogen (N) content [13], biomass [15] and water stress [6]. The development of HS sensors and their application in estimating crop biomass from multi-year data [17] has gained increasing attention in recent years. Multi-temporal images provide more information on vegetation phenology under wet and dry conditions than a single image [18]. Many studies on multi-temporal hyperspectral imaging have been published for crops such as rice (Oryza sativa L.), wheat (Triticum aestivum L.) and maize (Zea mays L.). Besides maize, lablab (Lablab purpureus L.) and finger millet (Eleusine coracana L.) are primary crops in the semi-arid region of South India. The state of Karnataka accounts for the major share of India’s lablab (90%) [22] and finger millet (62.01%) [23] production. However, well-defined multi-year studies on the estimation of biomass for maize, lablab and finger millet using multi-temporal hyperspectral data under varying N fertiliser and water supply levels are still lacking.
The aim of this study was to assess the potential of terrestrial hyperspectral imaging for the estimation of monsoon crop biomass based on data from three years (2016–2018). The specific objectives were: (1) to develop statistical models to predict the fresh matter biomass (FMB) of three crops (lablab, maize and finger millet); (2) to assess the effect of different levels of N and water supply on the predicted FMB values across a wide range of crop phenology over the complete growing period; and (3) to evaluate the importance of spectral regions in the resulting models and to understand the causal relationships underlying the models.
In the rainfed experiment, the range of FMB (S sub-plot) over the three years 2016–2018 was 0.16–14.6 t/ha for lablab, 0.76–67.71 t/ha for maize and 0.89–59.39 t/ha for finger millet (Figure 3). Similarly, in the irrigated experiment, it was 0.22–44.33 t/ha for lablab, 2.28–79.38 t/ha for maize and 0.91–69.63 t/ha for finger millet. In all crops and in all three years, crop biomass increased continuously until S3 or S4 and decreased at later stages. FMB was higher in the irrigated (I) than in the rainfed (R) experiment, except for finger millet at Y1S1 and Y2S2.
To gain an impression of the spectral variation for each crop, the minimum, average and maximum spectral reflectance from the images of the rainfed and irrigated experiments were determined for the three crops lablab, maize and finger millet during the three monsoon seasons (Figure 4).
3.1. Crop Specific FMB Models
To develop a prediction model for FMB that is valid under varying conditions, individual FMB models were developed for the two irrigation regimes (i.e., R and I) and for a combination of the datasets of both regimes (i.e., the Generalised model). The prediction accuracy of the FMB models varied between crops and depended on the dataset (R, I or Generalised) used for model development (Figure 5). The lower the rRMSEP value (i.e., the closer to zero), the better the model was considered to be. When the random forest regression (RFR) models were built separately for the R and I treatments, the lowest median rRMSEP for lablab was found in I with 17.9% (R²val = 0.34), and for maize and finger millet in the R experiment with 18.5% (R²val = 0.60) and 19.8% (R²val = 0.46), respectively. With the combined dataset, the rRMSEP was 13.9% for lablab (R²val = 0.53), 18% for finger millet (R²val = 0.46) and 18.7% for maize (R²val = 0.53). Overall, compared to the experiment-wise modelling approach, model accuracy (in terms of rRMSEP) was higher for all crops when models were built with data from both water supply levels.
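The two accuracy measures reported above can be illustrated with a minimal sketch. It assumes that rRMSEP is the root mean square error of prediction expressed as a percentage of the mean observed FMB and that R²val is the usual coefficient of determination on the validation set; the normalisation actually used in the study is not stated in this section, and the FMB values below are hypothetical.

```python
import math

def rmsep(observed, predicted):
    """Root mean square error of prediction on a validation set."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)

def rrmsep(observed, predicted):
    """RMSEP as a percentage of the mean observed value (assumed normalisation)."""
    mean_obs = sum(observed) / len(observed)
    return 100.0 * rmsep(observed, predicted) / mean_obs

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical FMB values in t/ha, for illustration only
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

print(f"rRMSEP = {rrmsep(observed, predicted):.1f}%")
print(f"R2val  = {r_squared(observed, predicted):.3f}")
```

Because rRMSEP is relative to the observed values, it allows errors to be compared between crops with very different biomass ranges, which an absolute RMSEP would not.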
In RFR modelling, the mtry parameter indicates the number of input variables randomly chosen at each node. Optimum mtry values (best tune values) were found to be 13, 7 and 7 for lablab; 8, 12 and 13 for maize; and 8, 2 and 7 for finger millet, respectively, for the Rainfed, Irrigated and Generalised models.
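The role of mtry can be sketched in a few lines. In a random forest, each node draws a fresh random subset of mtry predictors and searches for the best split only within that subset. The function name, the number of bands and the seed below are hypothetical and serve only to illustrate the sampling step; this is not the implementation used in the study.

```python
import random

def candidate_features(n_features, mtry, rng):
    """Draw the mtry feature indices that one node may use for its split."""
    return sorted(rng.sample(range(n_features), mtry))

rng = random.Random(42)
n_bands = 20  # hypothetical number of spectral bands used as predictors

# mtry values taken from the range of best tune values reported above
for mtry in (2, 7, 13):
    print(mtry, candidate_features(n_bands, mtry, rng))
```

A small mtry makes the individual trees less correlated with each other, whereas a large mtry lets each split choose among more bands, which is why the value is usually tuned per dataset.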
The plots of fit for the 100 randomised Generalised models of the three crops are shown in Figure 6. The randomised models were based on calibration and validation datasets drawn by stratified random sampling (stratified by sampling date and fertilisation rate). With these random effects taken into account in the RFR modelling, it becomes apparent that the predictions substantially underestimate FMB at higher FMB values (Figure 6).
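The repeated stratified splitting can be sketched as follows. The strata are formed from sampling date and N level, as described above. The 75% calibration fraction, the sample layout and all names are assumptions made for illustration, since the split ratio is not given in this section.

```python
import random
from collections import defaultdict

def stratified_split(samples, key, calibration_fraction, rng):
    """Split samples into calibration and validation sets within each stratum."""
    strata = defaultdict(list)
    for sample in samples:
        strata[key(sample)].append(sample)
    calibration, validation = [], []
    for members in strata.values():
        members = members[:]  # copy so the grouped lists stay unshuffled
        rng.shuffle(members)
        n_cal = round(calibration_fraction * len(members))
        calibration.extend(members[:n_cal])
        validation.extend(members[n_cal:])
    return calibration, validation

# Hypothetical samples: (sampling date, N level, plot id)
samples = [(date, n_level, plot)
           for date in ("S1", "S2", "S3")
           for n_level in ("low", "medium", "high")
           for plot in range(4)]

rng = random.Random(0)
# 100 randomised splits, one per randomised model
splits = [stratified_split(samples, key=lambda s: (s[0], s[1]),
                           calibration_fraction=0.75, rng=rng)
          for _ in range(100)]
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Stratifying keeps every combination of sampling date and N level represented in both sets, so differences between the 100 models reflect the random choice of plots rather than missing growth stages or treatments.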
3.2. Performance of the Generalised Models Considering N Application Rates, Sampling Dates and Water Supply
The normalised deviation between predicted and measured biomass was used to check whether the prediction accuracy of the Generalised models varied among the three levels of N application (Figure 7). Overall, only minor deviations were found among the low, medium and high levels of N supply for all crops between 2016 and 2018.
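A minimal sketch of such a comparison is given below. It assumes that the normalised deviation is the difference between predicted and measured FMB divided by the measured FMB; the exact normalisation used in the study is not defined in this section, and the records are hypothetical.

```python
from collections import defaultdict
from statistics import median

def normalised_deviation(predicted, observed):
    """Deviation relative to the observed value (assumed definition)."""
    return (predicted - observed) / observed

def median_deviation_by_group(records):
    """records: iterable of (group, predicted, observed) tuples."""
    groups = defaultdict(list)
    for group, predicted, observed in records:
        groups[group].append(normalised_deviation(predicted, observed))
    return {group: median(values) for group, values in groups.items()}

# Hypothetical (N level, predicted FMB, measured FMB) records in t/ha
records = [
    ("low", 9.0, 10.0), ("low", 11.0, 10.0), ("low", 10.0, 10.0),
    ("high", 24.0, 20.0), ("high", 18.0, 20.0), ("high", 20.0, 20.0),
]
result = median_deviation_by_group(records)
print(result)
```

A median close to zero in every group indicates no systematic over- or underestimation for that factor, even if individual deviations are large. The same grouping can be applied to sampling dates or water supply levels.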
The prediction accuracy of the Generalised models varied strongly among the sampling dates (Figure 8). In Y1, the normalised deviation for lablab showed an irregular pattern, with overestimation (Y1S1 and Y1S3), underestimation (Y1S2) and good concordance (Y1S4 and Y1S5), whereas in Y2 a decreasing trend of deviation was observed with increasing crop maturity. With maize, there was a general decline across sampling dates in both Y1 and Y2. With finger millet, there was overestimation for the early sampling dates (Y1S1–Y1S2 and Y2S1–Y2S2), followed by decreasing underestimation for the later sampling dates in 2016 (Y1S3–Y1S5). From these patterns, it can be concluded that crop phenology influenced model performance, with a tendency towards overestimation at early stages and underestimation at later stages of crop growth.
No systematic over- or underestimation was found for the biomass prediction of the three crops at the two levels of water supply (Figure 9). Hence, model prediction was rather robust, with slightly larger deviations for lablab at both water supply levels compared to the other crops.
3.3. Importance of Wavelengths
The spectral regions of the most important wavelengths help to differentiate and identify crop traits. For each crop, the best model was identified among the 100 Generalised models based on the lowest RMSE value. From the best model, the wavelengths contributing more than 75% to the prediction of FMB were identified, as shown in Figure 10. For lablab, a multitude of spectral bands from the green, red and near infrared (NIR) regions (546–910 nm) contributed significantly to the estimation of biomass. In contrast, only wavelengths in the NIR region (750–794 nm) were important for maize, and wavelengths in both the red and NIR regions (686, 694 and 774–814 nm) were important for finger millet.
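The selection of important wavelengths can be sketched as a simple threshold on variable importance. The sketch assumes that importance scores are rescaled so that the most important wavelength equals 100 and that the 75% criterion is applied to this rescaled value; both the scaling and the scores below are hypothetical.

```python
def important_wavelengths(importance, threshold=75.0):
    """Return wavelengths whose importance, rescaled to 0-100, exceeds threshold."""
    top = max(importance.values())
    scaled = {wl: 100.0 * value / top for wl, value in importance.items()}
    return sorted(wl for wl, value in scaled.items() if value > threshold)

# Hypothetical raw importance scores keyed by wavelength in nm
importance = {686: 0.95, 694: 0.8, 750: 0.3, 774: 1.2, 814: 1.0, 910: 0.2}
selected = important_wavelengths(importance)
print(selected)
```

Rescaling to the top-ranked variable makes the threshold comparable between crops, because raw importance scores depend on the scale of the response and on the individual model.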
The aim of the study was to estimate the monsoon crop biomass of three crops (lablab, maize and finger millet) based on terrestrial hyperspectral imaging during the crop growing season across three years (2016–2018). With a high number of samplings during three consecutive monsoon seasons, a wide range of phenological stages could be covered. This is important for the validity range of prediction models, since the harvest time of crops varies considerably in agricultural practice, for example due to nutrient and water availability and the moisture content of grains. Thus, our deliberate multi-temporal approach significantly broadened the validity range of the Generalised models, which was further enhanced by the integration of crop measurements under a wide range of N fertiliser and water supply levels.
The FMB models were validated by comparing the predicted FMB values with the observed FMB values. Overall, the results indicate that the Generalised models had higher estimation accuracy (with rRMSEP ranging from 13.9% to 18.7%) for all three crops compared to the rainfed and irrigated models. One reason may be that, with the combination of data from two experiments representing severe water limitation (rainfed experiment) and optimum water supply (irrigated experiment), the range of crop productivity became much broader, which may eventually have increased the robustness of the regression models.
Similar prediction errors were found in a previous study that estimated maize biomass from RGB images (relative error 16.66%, R² = 0.78) [40]. In contrast to our study, their models included a canopy height parameter in addition to RGB information, which shows the promising potential of structural data calculated with photogrammetric methods, particularly when they are combined with data from other sensor types [11]. Although spectral information was limited to the Red Edge Modified Ratio Index (REMRI), the combination of spectral data with LiDAR-derived metrics produced only a slightly smaller error in the estimation of maize biomass [10] compared to our study. However, as the sampling was done at only one date in a single year and because no defined N and water supply was applied, the transferability of such modelling approaches beyond the study area may be limited.
Although lablab is an important legume in the food and cattle production system in India, this plant has not been subjected to any remote sensing assessment thus far. The fact that it was the least productive crop in both experiments across all years strongly reduced the range of FMB values for model calibration. However, the highest prediction errors obtained were between those of the more productive crops maize and finger millet. Similarly, finger millet is a rarely researched crop in terms of remote sensing. In a single-year satellite-based study with pearl millet, which exhibits a growth pattern similar to that of finger millet, Lambert et al. [43] found a strong relationship between Sentinel-2 based LAI data and crop biomass (R² = 0.84), which is much higher than in our study (R² = 0.46). Although neither the sensors and platforms nor the range of crop phenology and management were comparable, their study highlights the scope of well-informed satellite-based hyperspectral imagery. Proximal imagery may make important contributions to such developments, e.g., by providing crop-specific spectral libraries as a source of reference spectra that can aid the interpretation of hyperspectral and multispectral images [44].
Although we observed considerable deviation between predicted and observed FMB, the median was close to zero at all levels of N and water supply when the Generalised models were used for all three crops. This indicates the robustness of the models, which allow biomass prediction irrespective of varying nitrogen and water management practices. However, the pronounced pattern of deviations along the sampling dates in Y1 and Y2 points to the limitations of models that are built solely on spectral information. Although soil-containing pixels were masked out of the images prior to model calibration, a substantial overestimation of biomass occurred at the initial sampling dates in the growing season, while biomass was frequently underestimated at later sampling dates. The overestimation of biomass at the initial sampling dates may be caused by weeds, as the effect of weeds could not be controlled in the estimation of biomass. Further, the prediction error increased in the order lablab (13.9%), finger millet (18%) and maize (18.7%), which indicates that spectral information captured at the top canopy layer is increasingly less representative of the biomass at lower layers of the canopy. This effect is also referred to as the “saturation constraint” and was regularly found in previous studies (e.g., [45]), particularly when vegetation indices, such as the Normalised Difference Vegetation Index (NDVI), were used. Evidently, this problem cannot be circumvented by using individual spectral wavelengths instead of vegetation indices, which stresses the need to develop multi-sensor approaches in which each sensor’s shortcomings are compensated by other sensors [10].
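The saturation constraint can be illustrated with the standard NDVI formula, NDVI = (NIR − Red) / (NIR + Red). The reflectance values in the sketch below are hypothetical and are chosen only to show the shape of the relationship; they are not data from this study.

```python
def ndvi(nir, red):
    """Normalised Difference Vegetation Index from NIR and red reflectance."""
    return (nir - red) / (nir + red)

# Hypothetical canopy reflectances: red stays low once the canopy closes,
# so further increases in NIR change NDVI only slightly.
for nir in (0.30, 0.40, 0.50, 0.60):
    print(f"NIR={nir:.2f}  NDVI={ndvi(nir, red=0.03):.3f}")
```

In this example, doubling the NIR reflectance moves NDVI by less than 0.1, because the index is bounded by 1 and the red reflectance is already close to its minimum. Additional biomass below a closed canopy therefore leaves little trace in the index.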
As a common trait of all three crops, wavelengths in the red-edge region were of utmost importance for the estimation of crop biomass. The Generalised model for lablab further comprised several wavelengths in the green, red and NIR regions, indicating a larger number of variables in this model. Similar important bands were found by Manjunath et al. [50] in the discrimination of chickpea, pea and lentils. While in maize the most important variables were found in the red-edge region, the model for finger millet also contained wavelengths in the red region as important variables. For lablab, several bands in the visible part of the spectrum (450–750 nm) were identified as important for biomass prediction. These bands are known to be affected by plant pigments, especially chlorophyll [51]. The ability of lablab to fix atmospheric nitrogen may have resulted in longer greenness of the leaves over the growing period, which leads to a higher reflectance at the green peak (~550 nm) and a higher absorbance in the red (~650 nm). In general, the identified spectral bands confirm accepted knowledge about biomass-reflectance relationships [52].
It has been shown that random forest regression modelling based on multi-temporal hyperspectral imagery allows the prediction of the fresh matter biomass of three major food and feed crops, i.e., lablab, maize and finger millet, grown in the monsoon season on vast areas of southern India. The results of this study showed that Generalised models, which were built on crop data from both rainfed and irrigated conditions, are more robust than water management specific models. For all Generalised crop models, deviations between predicted and observed values were independent of N fertiliser and water supply, indicating a wide validity range of the models. However, an overestimation of crop biomass was detected at the initial growth stages along with an underestimation at the later stages of crop growth, which was particularly pronounced with the more productive crops maize and finger millet. While wavelengths in the red-edge region were important variables in all three Generalised crop models, several others in the visible and near infrared regions were important in the models for lablab and finger millet. The results of this study suggest that, for the tested monsoon crops at advanced maturity, even hyperspectral information is not sufficient for an accurate biomass prediction. Data fusion from a combination of sensors may improve the prediction performance, as complementary sensors can compensate for each other’s deficiencies. Although lablab and finger millet are important food and cattle crops in South India, surprisingly little research has been done to date. Further research in this field will therefore be of major importance considering the dynamic changes in societal and climatic conditions in this region.