Radiometric Calibration of ‘Commercial o ﬀ the Shelf’ Cameras for UAV-Based High-Resolution Temporal Crop Phenotyping of Reﬂectance and NDVI

: Vegetation indices, such as the Normalised Di ﬀ erence Vegetation Index (NDVI), are common metrics used for measuring traits of interest in crop phenotyping. However, traditional measurements of these indices are often inﬂuenced by multiple confounding factors such as canopy cover and reﬂectance of underlying soil, visible in canopy gaps. Digital cameras mounted to Unmanned Aerial Vehicles o ﬀ er the spatial resolution to investigate these confounding factors, however incomplete methods for radiometric calibration into reﬂectance units limits how the data can be applied to phenotyping. In this study, we assess the applicability of very high spatial resolution (1 cm) UAV-based imagery taken with commercial o ﬀ the shelf (COTS) digital cameras for both deriving calibrated reﬂectance imagery, and isolating vegetation canopy reﬂectance from that of the underlying soil. We present new methods for successfully normalising COTS camera imagery for exposure and solar irradiance e ﬀ ects, generating multispectral (RGB-NIR) orthomosaics of our target ﬁeld-based wheat crop trial. Validation against measurements from a ground spectrometer showed good results for reﬂectance (R 2 ≥ 0.6) and NDVI (R 2 ≥ 0.88). Application of imagery collected through the growing season and masked using the Excess Green Red index was used to assess the impact of canopy cover on NDVI measurements. Results showed the impact of canopy cover artiﬁcially reducing plot NDVI values in the early season, where canopy development is low.


Introduction
In crop phenotyping, vegetation indices (e.g., NDVI) derived from canopy reflectance are commonly used to assess certain physiological traits of interest [1], including (i) plant vigour [2,3], (ii) plant biomass [4,5], (iii) plant nitrogen status [6], (iv) plant Leaf Area Index (LAI) [7,8] and (v) final crop yield [9]. However, these indices are typically influenced by both the target vegetation condition and variables such as background soil properties and canopy cover/density [10]. The combined influence of each variable quite often remains unacknowledged when associating vegetation indices (VIs) to traits of interest-a problem when there are multiple situations (e.g., low canopy cover and high vegetation vigour versus high canopy cover and low vegetation vigour) that may equate to similar VI measures. Such situations can provide significant uncertainty, and even false indications of plant status [11]. Traditional methods of measuring canopy spectral reflectance (e.g., ground spectrometers and/or satellite based remote sensing) offer insufficient spatial resolution to investigate and dissect 1.
Develop a method for full radiometric calibration of COTS camera imagery, with new methods for exposure normalisation and individual image incoming solar irradiance adjustment.

2.
Quantitatively assess the influence of the radiometric calibration steps and the final quality of the derived reflectance and NDVI datasets.

3.
Utilise the very high-resolution maps derived from the UAV imagery to analyse the influence of canopy cover on NDVI trends for a field-based wheat crop trial.

Field Site
All data were collected at the experimental farm operated by Rothamsted Research, UK (51 • 48 34.56"N, 0 • 21 22.68"W). We focused on the Defra-funded Wheat Genetic Improvement Network (WGIN) Diversity Field Experiment [24], whose aim is to test the influence of applying different nitrogen fertiliser treatments to different wheat cultivars. A total of 30 different cultivars were grown Remote Sens. 2019, 11,1657 3 of 20 at 4 different nitrogen application rates, with three replicates making a total of 360 plots. (Table 1) [25]. Each plot consisted of a 9 m × 3 m non-destructive plot and a 2.5 m × 3 m plot reserved for destructive sampling. This study focuses on the non-destructive part only.

UAV Imagery
A DJI S900 UAV [30] fitted with a DJI flight controller was flown on a pre-determined flight plan at 45 m altitude over the field site nine times between 7 March 2017 and 4 July 2017. The flight plan was designed to ensure 80% overlap between concurrent images was obtained. Two Sony (Tokyo, Japan) α5100 mirrorless digital cameras [31] mounted on the UAV were used for the image collection. These cameras contain 24.3 mega pixel complementary metal-oxide semiconductor (CMOS) sensors, and both were fitted with 20 mm F2.8 Sony prime lenses. One camera was left as standard to record RGB imagery, and one had had its internal NIR-blocking filter replaced with an 830 nm long pass filter to block visible light and enable recording of NIR waveband imagery. The 830 nm filter was selected to ensure minimal capturing of visible light in the imagery, as seen with the 660 nm filter used by Berra et al. [25,32].
All images were captured at 1-sec intervals and in Raw format, with focus set to 45 m to reflect the UAV flying height. Aperture and ISO were left on automatic, whilst shutter speed was fixed to 1/500sec to ensure minimisation of motion blur. The UAV and cameras were flown over the field site at a time relatively close to local solar noon, with actual recording times varying from 10:11 to 13.25. Twelve Ground Control Points, whose positions were measured with a Trimble Geo 7 DGPS [33], were used for georeferencing final orthomosaics. To provide measures of total incoming solar irradiance, a Tec5 HandySpec Field spectrometer (Oberursel, Germany) [34] fitted with a cosine corrected downwelling optic was deployed at a fixed location next to the field and set to measure at 1-second intervals. Spectral measurements were collected at 10 nm spectral resolution across the wavelength range 360-1000 nm.

Validation Data
Mean plot canopy reflectance, measured with the Tec5 HandySpec Field spectrometer, was used for validation of UAV derived canopy reflectance measures. To collect the spectrometer measurements, a single scan of each plot's canopy was collected with the spectrometer optic held approximately 1 m above the plot; the standard procedure employed by Rothamsted Research. Each scan produced one spectral reflectance measure for the plot at 10 nm spectral resolution across the wavelength range 360-1000 nm. This procedure was repeated for all 360 plots on three separate dates during the growing season between 19 April and 4 July 2017. The Tec5 HandySpec adjusts for changes in solar illumination between measurements using a downwelling optic fitted with a cosine diffuser; reflectance is calculated using proprietary software. Before comparing to UAV results, the Tec5 results Remote Sens. 2019, 11, 1657 4 of 20 were convolved to the same spectral wavebands as the cameras. These ground-based measurements were not always collected on the same days as UAV flights due to logistical constraints, but were within 3 days. Additional validation data was obtained by flying a Parrot Sequoia multispectral imager [13] simultaneously with the dual camera system for a single date (21 June 2017). The Sequoia was set to capture images every second and the Sequoia's downwelling sunshine sensor was mounted atop the UAV for collection of irradiance measurements. The Sequoia images were processed using Pix4D (Lausanne, Switzerland) (Version 4.3.1) [35] using standard recommended settings, downwelling light sensor data and manufacturer derived calibrations, producing Green, Red, and NIR reflectance orthomosaics at a ground sampling distance (GSD) of 5 cm.
Due to the differences between COTS camera and Parrot Sequoia spectral responses (Table 2), direct comparison between the cameras was not possible. Therefore, assessment of accuracy of the individual UAV-based imaging systems was conducted by comparing both against the Tec5.

Post-Processing of Captured Imagery
The processing of the dual-camera imagery followed the workflow outlined in Figure 1. Specific details on the main correction steps, including the novel exposure corrections are provided. measurements, a single scan of each plot's canopy was collected with the spectrometer optic held approximately 1 m above the plot; the standard procedure employed by Rothamsted Research. Each scan produced one spectral reflectance measure for the plot at 10 nm spectral resolution across the wavelength range 360-1000 nm. This procedure was repeated for all 360 plots on three separate dates during the growing season between 19 April and 4 July 2017. The Tec5 HandySpec adjusts for changes in solar illumination between measurements using a downwelling optic fitted with a cosine diffuser; reflectance is calculated using proprietary software. Before comparing to UAV results, the Tec5 results were convolved to the same spectral wavebands as the cameras. These ground-based measurements were not always collected on the same days as UAV flights due to logistical constraints, but were within 3 days. Additional validation data was obtained by flying a Parrot Sequoia multispectral imager [13] simultaneously with the dual camera system for a single date (21 June 2017). The Sequoia was set to capture images every second and the Sequoia's downwelling sunshine sensor was mounted atop the UAV for collection of irradiance measurements. The Sequoia images were processed using Pix4D (Lausanne, Switzerland) (Version 4.3.1) [35] using standard recommended settings, downwelling light sensor data and manufacturer derived calibrations, producing Green, Red, and NIR reflectance orthomosaics at a ground sampling distance (GSD) of 5 cm.
Due to the differences between COTS camera and Parrot Sequoia spectral responses (Table 2), direct comparison between the cameras was not possible. Therefore, assessment of accuracy of the individual UAV-based imaging systems was conducted by comparing both against the Tec5.

Post-Processing of Captured Imagery
The process Figure 1. Flow chart of key processing steps used to convert raw images to reflectance images. The blue circles indicate inputs, green squares indicate processing steps and yellow squares derived products.

Relative Spectral Response
Relative Spectral Responses (RSR), Figure 2, of both cameras were determined using a double monochromator fitted with an integrating sphere, using the method described by Berra et al. [32]. The unmodified RGB camera shows greatest sensitivity in the green channel, as expected from a Bayer matrix colour filter array [32]. For the modified NIR camera, overall sensitivity was similar in all three bands that originally measured Red, Green and Blue waveband light. Whilst now mostly sensitive to NIR wavelength light, all channels show some sensitivity to light below 830 nm (i.e., sensitivity to radiation outside the NIR spectral range remained), indicating that the modified internal filter was Remote Sens. 2019, 11, 1657 5 of 20 not performing at 100% at 830 nm. The 'Blue' channel of the NIR-adapted camera displayed the least sensitivity to light below 830 nm, therefore was best suited for use as the NIR channel. The wavebands determined for each channel of the RGB and NIR channels are presented in Table 3.
monochromator fitted with an integrating sphere, using the method described by Berra et al. [32]. The unmodified RGB camera shows greatest sensitivity in the green channel, as expected from a Bayer matrix colour filter array [32]. For the modified NIR camera, overall sensitivity was similar in all three bands that originally measured Red, Green and Blue waveband light. Whilst now mostly sensitive to NIR wavelength light, all channels show some sensitivity to light below 830 nm (i.e., sensitivity to radiation outside the NIR spectral range remained), indicating that the modified internal filter was not performing at 100% at 830 nm. The 'Blue' channel of the NIR-adapted camera displayed the least sensitivity to light below 830 nm, therefore was best suited for use as the NIR channel. The wavebands determined for each channel of the RGB and NIR channels are presented in Table 3.  Table 3. Sony α5100 camera band sensitivities. Sensitivities were measured using a double monochromator fitted with an integrating sphere.

RAW Conversion
Images were collected in RAW format before conversion to 16-bit Tagged Image File Format (TIFF) format using DCRAW 9.27 [36]. This was done using bilinear conversion algorithms and a dark current correction, to maintain original sensor DN measurements. The exact settings are presented in Table 4. Dark current correction images for each camera were captured in complete darkness (i.e., lens cap on and lights turned off), and used for the DCRAW processing.  Images were collected in RAW format before conversion to 16-bit Tagged Image File Format (TIFF) format using DCRAW 9.27 [36]. This was done using bilinear conversion algorithms and a dark current correction, to maintain original sensor DN measurements. The exact settings are presented in Table 4. Dark current correction images for each camera were captured in complete darkness (i.e., lens cap on and lights turned off), and used for the DCRAW processing. Table 4. Details of DCRAW settings used to convert images from raw to Tagged Image File Format (TIFF).

DCRAW Command Action
-v Print verbose messages -6 Write 16bit -W No automatic image brightening -g 1 1 Apply unadjusted gamma curve -T Write Tiff format -r 1 1 1 1 Set unadjusted white balance -t 0 Do not rotate image -q 0 Apply linear demosaicing -o 0 Raw output colour space -K darkimage.pgm Apply dark image correction using file specified

Exposure Corrections
To determine the relationships between DN and exposure settings (aperture and ISO), a series of images were collected of a Lambertian spectralon reflectance panel, set up indoors under constant illumination with a white incandescent bulb light. For each exposure setting (aperture, ISO and shutter speed), a series of images were captured under the settings full range (e.g., ISO100-ISO1000), whilst other settings remained fixed. As illumination remained constant; three image sets were produced, each modelling the influence of changing one exposure setting on image DNs.
Linear relationships between pixel DN and aperture and ISO were observed ( Figure 3). From these relationships, the aperture correction factor (CF app ) was derived to normalise images captured under varied aperture to an aperture value of 1 (Equations (1) and (2)).
where f-stop is the aperture value the image was captured with and Image RAW is the DCRAW converted TIFF image and Image app is the aperture corrected image. (TIFF).

DCRAW Command Action -v
Print verbose messages -6 Write 16bit -W No automatic image brightening -g 1 1 Apply unadjusted gamma curve -T Write Tiff format -r 1 1 1 1 Set unadjusted white balance -t 0 Do not rotate image -q 0 Apply linear demosaicing -o 0 Raw output colour space -K darkimage.pgm Apply dark image correction using file specified

Exposure Corrections
To determine the relationships between DN and exposure settings (aperture and ISO), a series of images were collected of a Lambertian spectralon reflectance panel, set up indoors under constant illumination with a white incandescent bulb light. For each exposure setting (aperture, ISO and shutter speed), a series of images were captured under the settings full range (e.g., ISO100-ISO1000), whilst other settings remained fixed. As illumination remained constant; three image sets were produced, each modelling the influence of changing one exposure setting on image DNs.

=
(1) where f-stop is the aperture value the image was captured with and ImageRAW is the DCRAW converted TIFF image and Imageapp is the aperture corrected image. For ISO, Equation (3) was used to normalise images to an ISO value of 100, the lowest and most commonly used setting on the cameras.  For ISO, Equation (3) was used to normalise images to an ISO value of 100, the lowest and most commonly used setting on the cameras.
where ISO image is the ISO setting used to capture the image; Image app is the aperture corrected image; and Image ISO is the ISO and aperture corrected image.

Vignetting Correction
An adapted version of the method outlined by LeLong et al. [16] was used in this study; such that camera, band and aperture-specific vignetting correction filters were generated for each data collection date. For each flight, the following steps were taken to produce vignetting filters: 1.
Images of matching camera, band and aperture settings were summed together and averaged.

2.
The radial vignetting profile of the averaged image was modelled using the median of evenly spaced concentric rings.

3.
A 2nd degree polynomial function interpolated the vignetting profile from the median of rings.

4.
The interpolation values were then divided by the minimum value to produce a multiplicative correction factor which brightened the corners. Unlike LeLong et al. [16], the vignetting filter was applied as a multiplicative filter rather than additive-this is in order to preserve the underlying patterns within the original images.
1. Images of matching camera, band and aperture settings were summed together and averaged. 2. The radial vignetting profile of the averaged image was modelled using the median of evenly spaced concentric rings. 3. A 2nd degree polynomial function interpolated the vignetting profile from the median of rings. 4. The interpolation values were then divided by the minimum value to produce a multiplicative correction factor which brightened the corners. 5. The concentric rings are given the value of the correction factor corresponding to its distance from the centre to produce the final vignetting filter (

Cross Calibration Factor and Reflectance Calibration
Before converting image DNs to reflectance, it was necessary to cross calibrate the Tec5 downwelling sensor and the cameras. To do this, the empirical line method was used to retrieve the relationship between Tec5 irradiance measures and exposure and vignetting corrected image DN, over 5 Lambertian reference targets ( Figure 5).

Cross Calibration Factor and Reflectance Calibration
Before converting image DNs to reflectance, it was necessary to cross calibrate the Tec5 downwelling sensor and the cameras. To do this, the empirical line method was used to retrieve the relationship between Tec5 irradiance measures and exposure and vignetting corrected image DN, over 5 Lambertian reference targets ( Figure 5). The camera and band specific calibration factors, Table 5, were applied to individual images, before Equation (4) was used to convert images from DN to reflectance using time matched Tec5 irradiance measurements.
where R b , t is the final reflectance image at time t and waveband b, Image b,t is the single image captured at time t and band b, and Tec5 irradiance b,t is the Tec5 irradiance measurement captured at the same time, t and convolved to the same band b as the image.  [37] was used to process final imagery to orthomosaics, including automatic lens correction. For each date, two orthomosaics were generated, RGB and NIR. Agisoft processing settings, Table 6, were kept consistent for all orthomosaics. In order to minimise the impact of geometric distortion and variation, the disabled blending mode was used to generate the orthomosaic [38]. This mode takes pixel data from the image whose view is closest to nadir. Orthomosaics were generated and exported at 1 cm Ground Sampling Distance (GSD). NDVI orthomosaics were generated using Equation (5), before mean values for each plot in each camera band and NDVI were extracted using custom Python-based processing tools. As in Holman et al. [12], a 50 cm buffer was applied to each plot before extracting mean values in order to prevent the influence of the plot edge effect.
where R NIR is measured reflectance in the NIR band and R r is measured reflectance in the red band.

Canopy Masking
To dissect green canopy from background variables, the Excess Green Red (ExGR) index was used (Equation (6)), with a threshold of > 0 to classify green vegetation [39,40]. Figure 6 shows an example of the produced mask, with reasonable agreement between visual green canopy and pixels classified as green by ExGR. The masks were used to extract mean plot NDVI of green pixels only. Figure 6. Example image of an RGB image of wheat trial plots and right the ExGR mask output. In the ExGR mask, white represents green classified pixels and black non-green pixels. Imagery is from the 21 June 2017 UAV data collection campaign.

Validation of Calibrations
The influence of the calibration steps applied to COTS camera imagery on precision of results was first assessed. For a single date (21 June 2017), camera imagery was processed with corrections applied cumulatively to understand their influence on results. For clarity, extracted mean plot results for the red and NIR bands have been scaled to a range 0-1.

Validation of Calibrations
The influence of the calibration steps applied to COTS camera imagery on precision of results was first assessed. For a single date (21 June 2017), camera imagery was processed with corrections applied cumulatively to understand their influence on results. For clarity, extracted mean plot results for the red and NIR bands have been scaled to a range 0-1.  For the NIR band (Figure 8), a more significant impact of correction steps is observed. Gains in both the linear fit and R 2 are achieved at each step, with vignetting indicating the most significant influence. Both nRMSE and bias decline in accuracy with the addition of correction steps. The influence of corrections on NDVI, calculated from non-scaled data (Figure 9), indicates high precision (R 2 = 0.91) but poor accuracy (nRMSE = 0.85, Bias = − 0.57) compared to the ground validation data. Addition of irradiance correction greatly improves accuracy, particularly nRMSE, bias and linear trend. Exposure corrections improve correlation, though drops in nRMSE and bias are also introduced. Finally, the addition of vignetting improves all statistics, indicating that the complete collection of calibration steps produces best results in terms of both accuracy and precision. The influence of corrections on NDVI, calculated from non-scaled data (Figure 9), indicates high precision (R 2 = 0.91) but poor accuracy (nRMSE = 0.85, Bias = −0.57) compared to the ground validation data. Addition of irradiance correction greatly improves accuracy, particularly nRMSE, bias and linear trend. Exposure corrections improve correlation, though drops in nRMSE and bias are also introduced. Finally, the addition of vignetting improves all statistics, indicating that the complete collection of calibration steps produces best results in terms of both accuracy and precision. Further investigation of camera settings (Figure 10), via calculation on Exposure Value (Equation (7)), highlights how the cameras adjusted exposure independently during UAV flight. This independence explains the poor accuracy of NDVI from raw images, where variable camera settings (which can vary between the independent cameras used to gather RGB and NIR data) artificially altering the red to NIR ratio. Inclusion of the varying solar spectral irradiance data corrects this, improving the data consistency greatly; inclusion of exposure and vignetting corrections removes all influence of variable exposure settings, producing even higher accuracy data.
where fi is the image aperture, ti is the image shutter speed and ISOi is the image ISO value. Further investigation of camera settings (Figure 10), via calculation on Exposure Value (Equation (7)), highlights how the cameras adjusted exposure independently during UAV flight. This independence explains the poor accuracy of NDVI from raw images, where variable camera settings (which can vary between the independent cameras used to gather RGB and NIR data) artificially altering the red to NIR ratio. Inclusion of the varying solar spectral irradiance data corrects this, improving the data consistency greatly; inclusion of exposure and vignetting corrections removes all influence of variable exposure settings, producing even higher accuracy data.
where f i is the image aperture, t i is the image shutter speed and ISO i is the image ISO value.

Accuracy Assessment of COTS Camera Reflectance
For three dates (19 April 2017, 21 June 2017, and 4 July 2017), the COTS camera-derived mean plot reflectance and calculated NDVI were assessed against Tec5 results. Results for the blue reflectance band ( ) show good fit against the Tec5 in the first two dates (R 2 > 0.79), with slopes close to 1 and intercepts close to 0. Small nRMSE and biases also indicate good agreement with the spectrometer. Poorer results in all statistics in the last date (4 July 2017) show reduced agreement with Tec5 reflectance measurements. At this later date, onset of senescence will increase the variability in canopy reflectance both within and between plots, as seen by the increase in vertical error bars. This spatial non-uniformity of senescence onset is better measured by the UAV data as opposed to the spectrometer, leading to poorer statistics at this time point.

Accuracy Assessment of COTS Camera Reflectance
For three dates (19 April 2017, 21 June 2017, and 4 July 2017), the COTS camera-derived mean plot reflectance and calculated NDVI were assessed against Tec5 results. Results for the blue reflectance band ( Figure 11) show good fit against the Tec5 in the first two dates (R 2 > 0.79), with slopes close to 1 and intercepts close to 0. Small nRMSE and biases also indicate good agreement with the spectrometer. Poorer results in all statistics in the last date (4 July 2017) show reduced agreement with Tec5 reflectance measurements. At this later date, onset of senescence will increase the variability in canopy reflectance both within and between plots, as seen by the increase in vertical error bars. This spatial non-uniformity of senescence onset is better measured by the UAV data as opposed to the spectrometer, leading to poorer statistics at this time point. The green (Figure 12) and red bands ( Figure 13) show very similar trends in accuracy to the blue band. Both bands show good fit (R 2 ≥ 0.84) and consistent small negative biases, indicating slight underestimation of reflectance from the cameras. The same trend between nitrogen treatments over Figure 11. Accuracy assessments of blue band reflectance for three dates. Tec5 reflectance is convolved to the spectral response of the COTS cameras for comparison. The points are coloured based on nitrogen treatment applied to the plot. Standard deviation of reflectance measured by the COTS cameras is presented by vertical error bars. The dashed line represents the 1:1 line. The green (Figure 12) and red bands ( Figure 13) show very similar trends in accuracy to the blue band. Both bands show good fit (R 2 ≥ 0.84) and consistent small negative biases, indicating slight underestimation of reflectance from the cameras. The same trend between nitrogen treatments over time is also present, as well as the greater within-plot variation for the last date compared to the earlier two. Figure 11. Accuracy assessments of blue band reflectance for three dates. Tec5 reflectance is convolved to the spectral response of the COTS cameras for comparison. The points are coloured based on nitrogen treatment applied to the plot. Standard deviation of reflectance measured by the COTS cameras is presented by vertical error bars. The dashed line represents the 1:1 line.
The green (Figure 12) and red bands ( Figure 13) show very similar trends in accuracy to the blue band. Both bands show good fit (R 2 ≥ 0.84) and consistent small negative biases, indicating slight underestimation of reflectance from the cameras. The same trend between nitrogen treatments over time is also present, as well as the greater within-plot variation for the last date compared to the earlier two.  The NIR band ( Figure 14) shows reduced fit in comparison to the visible bands (0.64 ≥ R 2 ≤ 0.7) and larger biases indicate lower accuracies achieved in the NIR band. In contrast to the visible bands, the NIR shows lowest accuracy in the first date before improving for the subsequent two dates. Trends between nitrogen treatments show that N1 treatments consistently have the lowest reflectance, indicating reduced vegetation in these plots, as expected. Standard deviations show the same increased variability in plot reflectance of the last date. Overall, the results of the NIR camera indicate lower sensitivity to higher canopy reflectance compared to the Tec5. The NIR band ( Figure 14) shows reduced fit in comparison to the visible bands (0.64 ≥ R 2 ≤ 0.7) and larger biases indicate lower accuracies achieved in the NIR band. In contrast to the visible bands, the NIR shows lowest accuracy in the first date before improving for the subsequent two dates. Trends between nitrogen treatments show that N1 treatments consistently have the lowest reflectance, indicating reduced vegetation in these plots, as expected. Standard deviations show the same increased variability in plot reflectance of the last date. Overall, the results of the NIR camera indicate lower sensitivity to higher canopy reflectance compared to the Tec5. and larger biases indicate lower accuracies achieved in the NIR band. In contrast to the visible bands, the NIR shows lowest accuracy in the first date before improving for the subsequent two dates. Trends between nitrogen treatments show that N1 treatments consistently have the lowest reflectance, indicating reduced vegetation in these plots, as expected. Standard deviations show the same increased variability in plot reflectance of the last date. Overall, the results of the NIR camera indicate lower sensitivity to higher canopy reflectance compared to the Tec5. Accuracy assessments of calculated NDVI ( Figure 15) show high correlations (R 2 ≥ 0.88) and low nRMSE. Additionally, biases indicating overall very good accuracy are achieved from the COTS Figure 14. Accuracy assessments of NIR band reflectance for three dates. Tec5 reflectance is convolved to the spectral response of the COTS cameras for comparison. The points are coloured based on nitrogen treatment applied to the plot. Standard deviation of reflectance measured by the COTS cameras is presented by vertical error bars. The dashed line represents the 1:1 line.
Accuracy assessments of calculated NDVI ( Figure 15) show high correlations (R 2 ≥ 0.88) and low nRMSE. Additionally, biases indicating overall very good accuracy are achieved from the COTS cameras. Temporal accuracy shows a similar drop in accuracy for the final date, as seen in the visible bands, but overall good stability is achieved. The lower accuracy of the NIR band appears to not impact the accuracy of calculated NDVI. cameras. Temporal accuracy shows a similar drop in accuracy for the final date, as seen in the visible bands, but overall good stability is achieved. The lower accuracy of the NIR band appears to not impact the accuracy of calculated NDVI. Additional assessment compared results from the COTS cameras with the Parrot Sequoia ( Figure 16), a commercially available multispectral imager whose data is processed using proprietary calibrations. Of the individual bands, green showed the strongest agreement between the two camera systems, with comparable R 2 , nRMSE and bias. The red band indicated poorer accuracy achieved by the Sequoia, with large nRMSE, negative biases and poorer linear agreement with the Tec5. In the NIR band, both camera systems showed comparable accuracy levels and precision. NDVI results show greater accuracy achieved by the COTS cameras, with the Sequoia overestimating NDVI compared to the TEC5 (as indicated by positive bias). The similarity in results between the COTS cameras and Parrot Sequoia, particularly in the NIR waveband, suggests discrepancy between imaging and non-imaging spectral monitoring technologies. Additional assessment compared results from the COTS cameras with the Parrot Sequoia ( Figure 16), a commercially available multispectral imager whose data is processed using proprietary calibrations. Of the individual bands, green showed the strongest agreement between the two camera systems, with comparable R 2 , nRMSE and bias. The red band indicated poorer accuracy achieved by the Sequoia, with large nRMSE, negative biases and poorer linear agreement with the Tec5. In the NIR band, both camera systems showed comparable accuracy levels and precision. NDVI results show greater accuracy achieved by the COTS cameras, with the Sequoia overestimating NDVI compared to the TEC5 (as indicated by positive bias). The similarity in results between the COTS cameras and Parrot Sequoia, particularly in the NIR waveband, suggests discrepancy between imaging and non-imaging spectral monitoring technologies.

Influence of Canopy on NDVI
The focus of this component of the study was to investigate the potential for high spatial resolution imagery to be used to dissect the influence of canopy cover on derived vegetation indices. For nine dates, the COTS camera imagery was calibrated, processed and NDVI calculated. For one date, 18 May 2017, significant shadowing impacted on the results of masking; as such, this date was removed from further processing. For the remaining eight dates, a subset of ten cultivars have been used. Examples of cropped NDVI orthomosaics for three dates highlight the temporal and spatial variation achieved from the COTS cameras ( Figure 17).

Influence of Canopy on NDVI
The focus of this component of the study was to investigate the potential for high spatial resolution imagery to be used to dissect the influence of canopy cover on derived vegetation indices. For nine dates, the COTS camera imagery was calibrated, processed and NDVI calculated. For one date, 18 May 2017, significant shadowing impacted on the results of masking; as such, this date was removed from further processing. For the remaining eight dates, a subset of ten cultivars have been used. Examples of cropped NDVI orthomosaics for three dates highlight the temporal and spatial variation achieved from the COTS cameras ( Figure 17). For all treatments, shallower temporal trends in NDVI are observed, with the N1 treatment displaying a close to horizontal trend with the peak in late-May no longer featuring. Comparing the difference between NDVIunmasked and NDVIExGR (Figure 18c, third row) shows the greatest influence of masking occurs early season where % green pixel is lowest (Figure 18d, bottom row). Figure 18. Temporal trends of ten wheat cultivars grown under four different nitrogen treatments for: (a) the standard unmasked mean NDVI; (b) mean NDVI derived from ExGR masked plots to remove the influence of background soil; (c) displaying temporal differences between masked and unmasked NDVI results; (d) percentage green pixel as calculated from the ExGR masks. Nitrogen application dates and quantities for the N2, N3 and N4 treatments are presented by the vertical lines. All data represent the means of three replicates. Assessment of NDVI unmasked (Figure 18a, top row) shows typical trends over time and between nitrogen treatments. All cultivars and treatment levels show a starting NDVI value around 0.4, increasing to a peak in late-May before dropping off at the end of the season. Comparison between nitrogen treatments shows clear differences between plots with (N2, N3 and N4) and without (N1) fertiliser application, with the N1 treatment showing lower maximum NDVI, despite similar initial NDVI values (~0.4). The drop in NDVI values at the end of the season is likely a result of senescence and the browning of crop canopy. Application of ExGR derived masks (Figure 18b, second row) to extract NDVI of green classified pixels only, produces new trends between treatments and over time. For all treatments, shallower temporal trends in NDVI are observed, with the N1 treatment displaying a close to horizontal trend with the peak in late-May no longer featuring. Comparing the difference between NDVI unmasked and NDVI ExGR (Figure 18c, third row) shows the greatest influence of masking occurs early season where % green pixel is lowest (Figure 18d, bottom row). Assessment of NDVIunmasked (Figure 18a, top row) shows typical trends over time and between nitrogen treatments. All cultivars and treatment levels show a starting NDVI value around 0.4, increasing to a peak in late-May before dropping off at the end of the season. Comparison between nitrogen treatments shows clear differences between plots with (N2, N3 and N4) and without (N1) fertiliser application, with the N1 treatment showing lower maximum NDVI, despite similar initial NDVI values (~0.4). The drop in NDVI values at the end of the season is likely a result of senescence and the browning of crop canopy. Application of ExGR derived masks (Figure 18b, second row) to extract NDVI of green classified pixels only, produces new trends between treatments and over time. For all treatments, shallower temporal trends in NDVI are observed, with the N1 treatment displaying a close to horizontal trend with the peak in late-May no longer featuring. Comparing the difference between NDVIunmasked and NDVIExGR (Figure 18c, third row) shows the greatest influence of masking occurs early season where % green pixel is lowest (Figure 18d, bottom row). Figure 18. Temporal trends of ten wheat cultivars grown under four different nitrogen treatments for: (a) the standard unmasked mean NDVI; (b) mean NDVI derived from ExGR masked plots to remove the influence of background soil; (c) displaying temporal differences between masked and unmasked NDVI results; (d) percentage green pixel as calculated from the ExGR masks. Nitrogen application dates and quantities for the N2, N3 and N4 treatments are presented by the vertical lines. All data represent the means of three replicates. Figure 18. Temporal trends of ten wheat cultivars grown under four different nitrogen treatments for: (a) the standard unmasked mean NDVI; (b) mean NDVI derived from ExGR masked plots to remove the influence of background soil; (c) displaying temporal differences between masked and unmasked NDVI results; (d) percentage green pixel as calculated from the ExGR masks. Nitrogen application dates and quantities for the N2, N3 and N4 treatments are presented by the vertical lines. All data represent the means of three replicates.

Discussion
This study has provided a quantitative assessment of commercial off the shelf (COTS) digital cameras for supporting the UAV-based remote sensing of field-based crop trials. COTS cameras provide very high spatial resolution imagery, potentially enabling the separation of canopy influences on vegetation indices from those of the background soil and thus making the derived information more relevant to crop health assessment and monitoring.
We have designed and tested a data processing workflow to radiometrically calibrate COTS camera imagery into reflectance units, in blue, green, red and NIR wavebands. Cameras were allowed to vary their exposure settings during flight, to cope with varying solar illumination conditions, whilst having a fixed shutter speed to avoid blurring from the UAV-motion. We find that our processing workflow can cope with this setup, and that the influence of the different pre-processing steps varies by band. In particular, the NIR band showed greater impact from vignetting corrections than the visible wavebands, agreeing with past studies which found that the modified internal filters used in NIR COTS cameras increase the impact of vignetting in images by up to 30% [16,25]. The influence of varied exposure settings as well as the success of the developed corrections was perhaps most clear during calculations of NDVI, because the separate visible and NIR COTS cameras used did not necessarily change their exposure settings in the same way at the same time, leading to artificial changes in derived NDVI. Even without corrections, good precision is observed in COTS camera calculated NDVI. However, NDVI values calculated from such raw camera data (or that calibrated into radiances rather than reflectances) are always significantly different to those derived from calibrated reflectances, and this difference is sensor specific [41]. Ultimately, this means VIs calculated from different sensors can only be intercompared in a fully meaningful way if calculated from calibrated reflectance measures.
Temporal consistency of the developed workflow was tested over three dates via comparison of COTS camera and Tec5 field-spectrometer-derived mean plot reflectance. Results showed good accuracy with NDVI results (R 2 ≥ 0.88, nRMSE ≤ 0.15) comparable to those achieved by other studies [25,42,43]. Consistency of results over this period indicates a good level of robustness in the developed methods for variable weather conditions both during and between data collection flights. Some variability occurred between time points in all bands and NDVI; likely a result of the UAV and Tec5 spectrometer obtaining measurements at different spatial resolutions [38] and datasets not being collected on the same date. Rossi et al. [44] demonstrated the impact of non-concurrent data collection when comparing different reflectance from different sensors, with a single day lag negatively impacting on correlation results. Variability in accuracy also occurred between bands, particularly for visible versus NIR, which was observed consistently over time. The same variations were also observed in the Parrot Sequoia results, as well as by Aasen and Bolten [38], who found that the varying field of views between cameras and spectrometers coupled with varying bidirectional reflectance factors in visible and NIR wavelengths impacted on correlations in the NIR band. Lack of influence of the NIR results on NDVI accuracy further indicates disparity between imaging and non-imaging measurement systems, as opposed to error in the NIR band. Investigation of this variability between bands and data sources should be a focus of future work.
Application of the very high resolution (GSD = 1 cm) reflectance imagery over time was used to investigate the impact of canopy cover and background soil on derived NDVI. Results of the unmasked NDVI presented temporal trends over a season in relation to differing nitrogen treatments and for different wheat cultivars. Masking of background soil pixels, via Excess Green Red, offered new insights into temporal NDVI trends in relation to canopy cover. Greatest differences between NDVI unmasked and NDVI ExGR (Figure 18d) occurred in the early season, where canopy cover is lowest [45], indicating that background soil and canopy cover can artificially influence measured vegetation indices. Isolation of the crop canopy for VI measurements should provide improved relationships between VI and traits of interest such as yield, canopy quality, senescence and canopy chlorophyll content [10,46], and therefore should be a focus of future studies.

Conclusions
This study has presented methods for radiometric calibration of commercial off the shelf digital camera imagery to reflectance for use in plant phenotyping. New calibrations for image exposure normalisation, combined with robust vignetting and irradiance corrections produced accurate reflectance and NDVI, comparable to a Parrot Sequoia multispectral camera and Tec5 ground spectrometer. The very high-resolution imagery obtained provided new insights into the influence of canopy cover and background soil on derived plot NDVI, especially in the early season. Future studies should look to incorporate additional UAV phenotyping methods such as 3D structure and thermal measurements to provide a more extensive low-cost phenotyping UAV-based system.