1. Introduction
Modern plant breeding depends on phenotypic data to select high-yield and stress-tolerant plants quickly and efficiently [1,2]. Agricultural (Ag) remote sensing technologies have developed rapidly for many years. Currently, field phenotyping is conducted with satellites, airborne platforms (manned and unmanned), and ground-based vehicles [3,4]. These platforms carry various sensors, such as RGB (Red–Green–Blue), hyperspectral, and thermal cameras, to image the crop field. These technologies have proven effective in various Ag remote sensing projects [3,5]. However, the quality of Ag remote sensing data is still limited by various noise sources such as changes in daylight, wind speed, temperature, and sun angle [6,7,8,9]. Among these noise sources, diurnal variability is one of the major factors causing significant quality issues in Ag remote sensing.
The diurnal impact on plant reflectance characteristics is a complicated process that introduces strong noise into plant phenotyping results [8,10,11]. For example, the reflectance characteristics of the same plants at noon can be very different from those in the afternoon. This variability is caused by interactions among the camera’s sensitivity, the camera’s view angle, canopy geometry, solar zenith angle, solar azimuth angle, and shadows [12,13]. Meanwhile, the plant itself is endogenously sensitive to environmental conditions, with complicated interactions among the genetic background, the external environment, and the treatment (G (Genotype) × E (Environment) × T (Treatment)). All of these affect the final reflectance characteristics of plants throughout the day, causing the diurnal variability.
The impact of diurnal variance on phenotyping data has been reported as non-negligible in many plant studies. Gardener [14] stated that diurnal change was a major unresolved noise issue in using reflectance measurements to estimate leaf area, plant biomass, or phenology. In recent decades, diurnal variability has been documented in plant phenotyping studies of corn [15], soybean, and wheat [11] with both passive [7,16] and active sensors [15]. This variance is retained in the captured images, weakening the signal power of the data. Sometimes the resulting variance is even larger than the plant differences caused by biotic or abiotic stresses, which severely limits the accuracy of the phenotyping data [11,16]. However, most current remote sensing studies rarely consider the impact of diurnal variability, which introduces severe noise into the final analysis results. For example, NDVI (Normalized Difference Vegetation Index) values from raw remote sensing measurements of the same plant differed by over 10% between time points [11,16]. Similar diurnal variance is also found in many other plant features such as plant temperature, spectral features, and chlorophyll content [9,17]. Therefore, it is extremely important to reduce the diurnal variability in the data to improve remote sensing quality.
As plant phenotyping has developed, researchers have become more aware of diurnal variability, yet few studies address how to deal with it in Ag remote sensing. To reduce its impact, remote sensing operations, such as those using unmanned aerial vehicles (UAVs), usually follow strict rules on the sampling time window and weather conditions [18,19,20]. According to Bellvert et al. [21], thermal and multispectral images should be acquired around noon, when shadow effects are almost completely absent. Meanwhile, plant physiology changes in a cyclical diurnal manner, driven by photosynthetic activity and processes that depend on incident solar radiation [22]. Consequently, UAV data collection is restricted to a certain time range for accurate monitoring of crop physiology [20].
It is important to quantitatively model the diurnal variability of crop phenotyping features to improve Ag remote sensing quality. PROSAIL, which combines the PROSPECT leaf optical properties model with the SAIL canopy bidirectional reflectance model, has been widely used to predict how plant canopy spectral reflectance changes with environmental conditions such as solar angle [23,24]. However, the model does not meet the accuracy requirements of plant phenotyping remote sensing. For example, Berger et al. [25] compared PROSAIL simulations with spectra collected in the field and found that the predicted spectra drifted severely from the field measurements, especially at the green and red-edge (~700 nm) wavelengths. This was also confirmed in a field remote sensing experiment at Purdue University in 2019, in which the PROSAIL model predicted less than 5% of the diurnal variance of the NDVI observed with a hyperspectral camera. Another potential solution is to use regression methods to predict and compensate for the diurnal effects. For example, a polynomial regression correction model was developed to predict crop reflectance as a function of solar zenith angle, time of day, and ICI (instantaneous clearness index), and its ability to reduce diurnal variation was tested with the Green Normalized Difference Vegetation Index (GNDVI) and some individual bands [7]. However, the plant data were collected on a small portion of the leaf by a handheld radiometer with four bands, which may not properly represent airborne remote sensing platforms carrying hyperspectral or multispectral cameras. The simple polynomial regression model successfully described the changes in the data over three consecutive days, but it may fail to represent the general pattern on other days when the plants are at different growth stages. Therefore, a more accurate diurnal variability model is still urgently needed.
Modeling diurnal variability in plant phenotyping data is difficult because of the complicated interactions between plants and real-time environmental conditions such as cloud coverage and wind. Modeling the diurnal changes requires hundreds or thousands of time-series images of the same field, which has been difficult to obtain with current airborne platforms. For instance, UAV platforms have rarely been reported to image the same field more than once a day because of various environmental and logistical limitations [26]. To solve this problem, the Ag engineers at Purdue University deployed a fixed field VNIR hyperspectral gantry platform as a “mock drone” system at Purdue’s research farm in 2019, similar to the phenotyping system called Terra-Ref [27]. With this field gantry imaging system, hyperspectral images of the same field can be collected continuously, at intervals of minutes, under various weather conditions throughout the daytime. A local weather station and distributed soil sensors in the field provide environmental condition data synchronized with the images.
Recent advances in remote sensing technologies enable the measurement of various image-based phenotypic features, such as vegetation indices (e.g., NDVI), biochemical constituents (e.g., chlorophylls), water content, and raw spectral features, which has dramatically boosted modern crop studies [3,28]. However, as described above, the problem of diurnal variation in these phenotypic features remains unsolved because of both hardware and software limitations. To close this gap, we performed a comprehensive investigation of the diurnal variability in over 8000 repeated hyperspectral images of the same corn field, taken by the field imaging gantry at Purdue University over the 2019 growing season. Images of corn plants from vegetative stage V4 to reproductive stage R1 were captured. Hyperspectral image processing algorithms were applied to calculate the canopy reflectance spectra and predict plant physiological features such as NDVI and relative water content (RWC). A time series decomposition method was applied to obtain diurnal curves for these plant phenotyping features. The combined imaging results over 31 days clearly showed repeatable diurnal changing patterns for NDVI and other phenotyping features, and regression models were developed to describe these patterns. With the results of this work, Ag remote sensing users will be able to understand more precisely how the imaging time of day changes crop feature predictions. It will also be easier to decide on an acceptable imaging time range and to calibrate for diurnal factors with the developed models.
2. Methods
2.1. High-Throughput Field Imaging Acquisition System
The field VNIR hyperspectral gantry platform at Purdue University’s Agronomy Center for Research and Education (ACRE) was used to collect imaging data in this study. A weatherproof VNIR push-broom hyperspectral camera (MSV-101-W, Middleton Spectral Vision, Middleton, WI, USA, Table 1) was carried by the 7 m high gantry platform to scan a 50 m by 5 m field strip under a wide range of weather conditions. The VNIR images had a spectral range of 376 to 1044 nm with a spectral resolution of 1.22 nm and a ground sample distance (GSD) of 0.5 cm/pixel. The system could be configured to scan the crops in the field automatically and repeatedly. Scanning the full 250 m² field took 6.5 min, and the scanning frequency could be higher if only a sub-portion of the field needed to be scanned (Figure 1).
This single-sided imaging gantry stood on the north side of the field, and the length of the camera structure on top was restricted to a certain ratio of the gantry’s height. This design kept all of the shadow away from the crops, which enabled the system to simulate drone remote sensing in the field more realistically. The hyperspectral camera used sunlight as the lighting source, so the gantry system could operate at any time between sunrise and sunset. A white reference panel was installed 0.5 m underneath the hyperspectral camera and moved together with it. A local mini weather station and distributed soil sensors were installed in the same field to collect real-time environmental data such as temperature, solar irradiation, wind speed, and soil moisture when each image was taken.
2.2. Experiment Design and Data Collection
To study the diurnal variation in phenotyping data, two corn genotypes, B73 × Mo17 and P1105AM, were grown in the field underneath the gantry in 2019. Each genotype was treated with three nitrogen solutions: high nitrogen (HN) with 56 kg/ha (32 mL 28-0-0 in 1 L water), medium nitrogen (MN) with 28 kg/ha (16 mL 28-0-0 in 1 L water), and low nitrogen (LN) with 0 kg/ha (water only). Each genotype-by-nitrogen combination was replicated in five mini-plots of 2 rows by 3 m, giving 30 plots in total (2 genotypes × 3 nitrogen levels × 5 replicates) (Figure 2).
To capture the immediate effects of changes in cloud coverage, wind speed, and other environmental conditions on the images, the team chose an imaging interval of 2.5 min. This allowed us to scan up to six plots per cycle, considering the extra time needed for data transfer, real-time image processing, and homing the gantry cart. The six plots were selected to cover all three nitrogen levels and both genotypes.
The continuous imaging started when the corn plants reached the V4 stage and lasted 31 days, until the plants were at the R1 stage on average (Figure 3). On each of the 31 days, imaging started at 8:00 a.m. and ended at 7:30 p.m., so around 280 hyperspectral images were collected per day at the 2.5 min interval. The gantry was turned off only during extreme weather such as thunderstorms to protect the equipment. By the end of the 31 days, a total of 8631 hyperspectral images had been collected.
In the middle of the project, when the plants were at V9, we collected ground truth measurements of nitrogen content, RWC, and plant fresh weight. Two plants were randomly sampled from each plot. The plant shoot was cut to measure the fresh weight. A small section (2.5 × 5.0 cm²) of the top collared leaf was taken to measure the RWC using Equation (1) [29,30]. The remaining part of the top collared leaf was sent to A and L Great Lakes Laboratories (A and L Great Lakes Laboratories, Inc., Fort Wayne, IN, USA) to measure the nitrogen percentage.
$$\mathrm{RWC}\ (\%) = \frac{FW - DW}{TW - DW} \times 100 \quad (1)$$
where FW is fresh weight, DW is dry weight, and TW is the turgid weight.
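As a minimal sketch of the RWC calculation, assuming the standard definition RWC = (FW − DW) / (TW − DW) × 100, the function below computes RWC from the three weights. The function name and the sample weights are hypothetical and serve only as illustration.

```python
def relative_water_content(fw, dw, tw):
    """Return RWC in percent from fresh (fw), dry (dw) and turgid (tw) weights."""
    if tw <= dw:
        raise ValueError("turgid weight must be greater than dry weight")
    return (fw - dw) / (tw - dw) * 100.0


# Hypothetical weights in grams for one leaf section (illustrative only).
rwc = relative_water_content(fw=0.90, dw=0.20, tw=1.00)
print(f"RWC = {rwc:.1f}%")
```

All three weights must be in the same unit; the guard rejects inputs where the turgid weight does not exceed the dry weight, which would make the ratio undefined or meaningless.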
2.3. Image Segmentation and Feature Extraction
After data collection, standard image processing protocols were performed to extract the plant phenotyping features of interest [5,31,32]. The raw hyperspectral images were first calibrated with the real-time white reference, which ensured that each scan line from the push-broom sensor was calibrated against a reference captured under the same lighting conditions. The image calibration was performed with the following equation:
$$I_{cal} = \frac{I_{raw} - I_{dark}}{I_{white} - I_{dark}} \quad (2)$$
where $I_{cal}$ is the calibrated image, $I_{raw}$ is the raw hyperspectral image, $I_{dark}$ is the dark reference image, and $I_{white}$ is the hyperspectral image of the white reference. The calibrated images were then segmented with a convolution-based procedure [33,34]. A vector of sequential integers from −20 to 20 was multiplied by the reflectance intensity vector from the red-edge region (680–720 nm). With a threshold of 7 as the boundary, the plant tissue was segmented from the background (Figure 1h). Some images contained weeds that were negligible in size compared with the corn plants (see the red boxes in Figure 1h).
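The calibration and red-edge segmentation steps can be sketched for a single pixel as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes the usual flat-field form (raw − dark) / (white − dark), assumes the red-edge reflectance has been resampled to 41 values so that it matches the −20 to 20 integer vector, and assumes that a pixel is labeled as plant when the score exceeds the threshold. All function names are hypothetical.

```python
def calibrate(raw, dark, white):
    """Per-band reflectance for one pixel: (raw - dark) / (white - dark)."""
    return [(r - d) / (w - d) for r, d, w in zip(raw, dark, white)]


KERNEL = list(range(-20, 21))  # sequential integers from -20 to 20


def red_edge_score(red_edge):
    """Dot product of the integer kernel with 41 red-edge reflectance values.

    A steep rise in reflectance across the red edge gives a large score.
    """
    if len(red_edge) != len(KERNEL):
        raise ValueError("expected 41 red-edge samples (assumed resampling)")
    return sum(k * r for k, r in zip(KERNEL, red_edge))


def is_plant(red_edge, threshold=7.0):
    """Assumed decision rule: plant if the score is above the threshold."""
    return red_edge_score(red_edge) > threshold


# Synthetic spectra: vegetation rises sharply over the red edge, soil is flat.
plant_pixel = [0.05 + 0.01 * i for i in range(41)]
soil_pixel = [0.20] * 41
print(is_plant(plant_pixel), is_plant(soil_pixel))
```

Because the kernel is antisymmetric around zero, a flat spectrum scores close to zero regardless of its brightness, so the score responds to the red-edge slope rather than to overall intensity.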
The average reflectance spectrum of each plot was calculated. In total, 51,786 spectra (8631 images × 6 plots per image) were calculated for the plots with different genotypes and nitrogen treatments. These spectra were used to calculate crop remote sensing results such as NDVI and predicted RWC. NDVI was calculated as:
$$\mathrm{NDVI} = \frac{R_{800nm} - R_{650nm}}{R_{800nm} + R_{650nm}} \quad (3)$$
where $R_{800nm}$ and $R_{650nm}$ are the reflectance values at wavelengths of 800 nm and 650 nm, respectively [35,36]. A partial least squares regression (PLSR) model was used to predict RWC from the spectra. To avoid prediction drift between facilities [37,38,39], the team built a new PLSR model from the spectra and RWC ground truth data collected in this project instead of using an existing RWC prediction model developed at another facility. The new model predicted RWC with a cross-validation coefficient of determination (R²) of 0.722 and a root mean square error (RMSE) of 6.22% (Figure 4). This RWC model was then applied to all 51,786 spectra to predict RWC in those plots over the 31 diurnal cycles.
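As an illustration of the NDVI step, the sketch below picks the bands nearest to 800 nm and 650 nm from a plot-average spectrum and applies the normalized difference. The nearest-band lookup and the toy spectrum are assumptions for this example; the source does not state how the two bands were selected from the 1.22 nm sampled spectrum.

```python
def nearest_band(wavelengths, reflectance, target_nm):
    """Reflectance at the band whose wavelength is closest to target_nm."""
    idx = min(range(len(wavelengths)), key=lambda i: abs(wavelengths[i] - target_nm))
    return reflectance[idx]


def ndvi(r800, r650):
    """Normalized difference of the 800 nm and 650 nm reflectance values."""
    return (r800 - r650) / (r800 + r650)


# Toy plot-average spectrum (illustrative values only).
wavelengths = [600.0, 649.6, 700.2, 799.8, 850.1]
reflectance = [0.08, 0.10, 0.25, 0.50, 0.52]

value = ndvi(
    nearest_band(wavelengths, reflectance, 800.0),
    nearest_band(wavelengths, reflectance, 650.0),
)
print(round(value, 4))
```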
2.4. Data Quality Check
After the remote sensing results such as spectra, NDVI, and RWC were calculated, the data quality was checked and outliers were removed. For each feature on each day, only the measurements between the lower inner fence (Q1 − 1.5 IQR) and the upper inner fence (Q3 + 1.5 IQR) were kept [40], where IQR is the interquartile range, the difference between the 75th (Q3) and 25th (Q1) percentiles. This quality filtering removed blunders and gross errors from the image acquisition process. For example, some of the removed outlier images were taken at very high wind speeds, when the plant stems were bent severely and the images looked significantly different. The data before 10:00 a.m. and after 5:30 p.m. also showed severe variance and noise, which could be caused by dew on the leaf surfaces and dim lighting conditions [41]. Only the data collected between 10:00 a.m. and 5:30 p.m. were used in the diurnal pattern analysis (Table 2), as very few airborne remote sensing activities happen outside this time range. Note that times are given in Eastern Daylight Time, so the selected window was still centered on solar noon in West Lafayette, IN.
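The inner-fence filter can be sketched with the Python standard library as below. The quartile interpolation method is an assumption (the source does not say which percentile definition was used), and the function name and sample values are hypothetical.

```python
import statistics


def iqr_filter(values):
    """Keep values within [Q1 - 1.5*IQR, Q3 + 1.5*IQR] (the inner fences)."""
    # method="inclusive" interpolates linearly between data points;
    # other quartile conventions would shift the fences slightly.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if lower <= v <= upper]


# Illustrative one-day series with a single gross error.
day_values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(iqr_filter(day_values))
```

In the workflow described above, this filter would be applied separately to each feature on each day, so that a day with generally high or low values is not itself treated as an outlier.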
2.5. Evaluating the Impacts from Treatments, Stages and Genotypes to Diurnal Changing Patterns
Before modeling the diurnal changing patterns, the diurnal patterns between different nitrogen treatments, growth stages and genotypes were compared to decide if any of these factors had significant impact on the diurnal pattern. If not, the data from different treatments, stages and genotypes could be combined for the diurnal pattern modeling. Otherwise, the modeling should be done separately for each different case.
The changing patterns were compared by applying the dynamic time warping (DTW) method to calculate the similarity between the relative difference ratio (RDR) curves of the different plant plots [42]. An RDR curve describes the diurnal changing pattern on each day as the percentage change of the phenotyping feature value relative to the feature’s value at the reference time point (Equation (4)):
$$\mathrm{RDR}(t) = \frac{F(t) - F(t_{ref})}{F(t_{ref})} \times 100\% \quad (4)$$
where $F(t)$ is the feature value at time $t$ and $t_{ref}$ is the reference time. Solar noon was selected as the reference time because it is the center of the daytime and is when the lowest NDVI value was observed every day [11].
The DTW method was selected because it is commonly used to measure the similarity between two time series [42]. More specifically, DTW is a time series alignment algorithm that aligns two sequences of feature vectors by warping the time axis iteratively until an optimal match is found [43]. The alignment produces a distance score that can be used as the difference between two curves: a smaller distance score means higher similarity. DTW also allows non-linear mapping, which is appropriate for pattern matching. With the distance scores from DTW, the similarity of the RDR curves of the different plots was quantitatively compared and discussed.
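The RDR and DTW steps can be illustrated with a small pure-Python sketch. It assumes the RDR is the percentage change relative to the value at the reference time, and it uses the textbook dynamic-programming form of DTW with an absolute-difference local cost; the source does not specify the DTW variant, step pattern, or windowing, so these are assumptions, and the function names are hypothetical.

```python
def rdr_curve(values, ref_value):
    """Percentage change of each value relative to the reference-time value."""
    return [(v - ref_value) / ref_value * 100.0 for v in values]


def dtw_distance(a, b):
    """Classic DTW distance between two 1-D sequences (absolute-difference cost)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Two illustrative NDVI series for different plots, referenced to solar noon.
plot_a = rdr_curve([0.88, 0.84, 0.80, 0.82, 0.86], ref_value=0.80)
plot_b = rdr_curve([0.66, 0.63, 0.60, 0.62, 0.65], ref_value=0.60)
print(dtw_distance(plot_a, plot_b))
```

Converting to RDR first removes the difference in absolute level between plots, so the DTW score reflects the shape of the diurnal curve rather than the offset between treatments.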
2.6. Diurnal Patterns Calculation by Time Series Signal Decomposition
Inspired by the time series decomposition method [44], we decomposed the changing signal of each feature into two major parts: the day-to-day trend ($T$) and the diurnal pattern ($D$). $T$ was calculated with the LOESS (locally estimated scatterplot smoothing) method [45]. By fitting a non-parametric regression curve to the scatter plot of the data, the day-to-day trend can be clearly extracted from the raw signal [44]. This trend mainly reflects the changes in plant growth stage and general weather conditions over the 31 days of imaging, which would interfere with the diurnal pattern modeling if included. The diurnal component ($D$) was calculated by subtracting the day-to-day trend ($T$) from the raw signal. $D$ is also called the detrended data. $D$ contains the higher-frequency variance components, mainly caused by short-term effects such as the plant’s circadian behavior, sun angle, solar irradiation, and temperature changes during the day.
The mean curve of $D$ over the 31 days was calculated for the diurnal pattern fitting. Meanwhile, the 95% confidence interval was also calculated to evaluate the consistency of the diurnal patterns across the days.
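The decomposition can be sketched in pure Python as below. This is a simplified LOESS (one pass of tricube-weighted local linear regression, without robustness iterations), not necessarily the variant or smoothing span used in the study; the `frac` parameter, the function names, and the assumption that each day is sampled at the same time-of-day slots are all hypothetical choices for illustration.

```python
def loess_trend(x, y, frac=0.5):
    """Simplified LOESS: tricube-weighted local linear fit evaluated at every x."""
    n = len(x)
    if n < 2:
        return list(y)
    k = max(2, min(n, round(frac * n)))  # number of points in each local window
    trend = []
    for x0 in x:
        # Bandwidth = distance to the k-th nearest point (guard against zero).
        h = sorted(abs(xi - x0) for xi in x)[k - 1] or 1e-12
        w = [(1.0 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        trend.append(my + slope * (x0 - mx))
    return trend


def detrend(y, trend):
    """Diurnal component: raw signal minus the day-to-day trend."""
    return [yi - ti for yi, ti in zip(y, trend)]


def mean_diurnal_curve(days):
    """Average the detrended values slot by slot across days.

    `days` is a list of per-day lists sampled at the same time-of-day slots.
    """
    return [sum(slot) / len(slot) for slot in zip(*days)]
```

A larger `frac` gives a smoother trend that follows only slow changes such as growth stage, while a smaller `frac` starts to absorb the diurnal signal itself, so the span should cover clearly more than one day of samples.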
2.7. Diurnal Pattern Fitting
Based on the observed diurnal changing pattern of NDVI, a piecewise linear regression function was selected to fit the pattern. The critical lowest point, derived by calculating the first-order derivative of the mean curve, coincided with the time of the highest sun angle.
In the fitted function, t is the time offset (in hours) from solar noon; for example, t at 12:15 p.m. is −1.5 on a day and at a location where solar noon is at 13:45.
Non-linear diurnal changing patterns were observed for the other features such as RWC prediction and a few single spectral bands. We selected the 2nd order regression model (Equation (7)) for fitting the pattern of those features.
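Both model forms can be fitted by ordinary least squares, as in the sketch below. The piecewise model here is assumed to be continuous with its breakpoint fixed at solar noon (t = 0), which is one plausible reading of the description rather than the authors' exact parameterization; the helper names and the synthetic data are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def least_squares(rows, y):
    """Least-squares coefficients via the normal equations (A^T A) x = A^T y."""
    p = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(ata, aty)


def fit_piecewise_linear(t, y, t0=0.0):
    """Continuous two-segment line with breakpoint t0.

    Returns [value at t0, slope before t0, slope after t0].
    """
    rows = [[1.0, min(ti - t0, 0.0), max(ti - t0, 0.0)] for ti in t]
    return least_squares(rows, y)


def fit_quadratic(t, y):
    """Second-order polynomial; returns [c0, c1, c2] for c0 + c1*t + c2*t^2."""
    rows = [[1.0, ti, ti * ti] for ti in t]
    return least_squares(rows, y)


# Synthetic V-shaped curve with its minimum at solar noon (t = 0 hours).
t = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y = [0.80 - 0.01 * min(ti, 0.0) + 0.02 * max(ti, 0.0) for ti in t]
print(fit_piecewise_linear(t, y))
```

Fixing the breakpoint keeps the piecewise fit linear in its parameters, so the same small solver handles both models; estimating the breakpoint itself would require a search over candidate values.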
2.8. Model Performance Evaluation
The performance of the developed diurnal model was evaluated and compared with the R2 and RMSE between the prediction results and the desired diurnal changes. Moreover, to further assess the impacts from nitrogen treatments and genotypes, the general diurnal model was tested on each of the six plant plots, and the R2 and RMSE were obtained for each plot, respectively.
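The two evaluation metrics can be computed as in the following sketch, which uses the common definitions R² = 1 − SS_res / SS_tot and RMSE as the root of the mean squared residual. The source does not spell out its exact formulas, so these definitions and the function names are assumptions.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Illustrative values only.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
print(r_squared(obs, pred), rmse(obs, pred))
```

For the per-plot assessment, the same two functions would be applied to each plot's diurnal curve against the prediction of the general model.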
2.9. Diurnal Models’ Applications
These regression models quantitatively summarize the diurnal impacts. For example, the models can be used to calculate the acceptable imaging window for any specific diurnal variance requirement. The models can also be used to remove the diurnal impact: the NDVI measured at any other time of day can be converted to the NDVI at the time of the highest sun angle with the fitted model.
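Both applications can be illustrated with a short sketch. It assumes a continuous piecewise linear NDVI model with its minimum at solar noon and an additive correction (subtracting the modeled offset from the measurement); the source does not state whether the correction is additive or multiplicative. The slope values below are hypothetical placeholders, not fitted results from this study.

```python
def diurnal_offset(t, slope_before, slope_after):
    """Modeled NDVI change relative to solar noon at time offset t (hours)."""
    return slope_before * min(t, 0.0) + slope_after * max(t, 0.0)


def to_solar_noon(ndvi_value, t, slope_before, slope_after):
    """Convert an NDVI measured at offset t to its solar-noon equivalent."""
    return ndvi_value - diurnal_offset(t, slope_before, slope_after)


def imaging_window(tolerance, slope_before, slope_after):
    """Time offsets (hours from solar noon) within which the modeled
    deviation stays below `tolerance`. Assumes both slopes are non-zero."""
    return (-tolerance / abs(slope_before), tolerance / abs(slope_after))


# Hypothetical slopes in NDVI units per hour (placeholders for illustration).
SLOPE_BEFORE, SLOPE_AFTER = -0.01, 0.02

print(to_solar_noon(0.82, t=-2.0, slope_before=SLOPE_BEFORE, slope_after=SLOPE_AFTER))
print(imaging_window(0.01, SLOPE_BEFORE, SLOPE_AFTER))
```

With asymmetric slopes the acceptable window is not centered on solar noon: the side with the steeper slope reaches the tolerance sooner, so flights on that side must stay closer to noon.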