Spatial Co-Registration of Ultra-High Resolution Visible, Multispectral and Thermal Images Acquired with a Micro-UAV over Antarctic Moss Beds

In recent times, the use of Unmanned Aerial Vehicles (UAVs) as tools for environmental remote sensing has become more commonplace. Compared to traditional airborne remote sensing, UAVs can provide finer spatial resolution data (up to 1 cm/pixel) and higher temporal resolution data. For the purposes of vegetation monitoring, the use of multiple sensors such as near infrared and thermal infrared cameras is of benefit. Collecting data with multiple sensors, however, requires an accurate spatial co-registration of the various UAV image datasets. In this study, we used an Oktokopter UAV to investigate the physiological state of Antarctic moss ecosystems using three sensors: (i) a visible camera (1 cm/pixel), (ii) a 6-band multispectral camera (3 cm/pixel), and (iii) a thermal infrared camera (10 cm/pixel). Imagery from each sensor was geo-referenced and mosaicked with a combination of commercially available software and our own algorithms based on the Scale Invariant Feature Transform (SIFT). The validation of the mosaic’s spatial co-registration revealed a mean root mean squared error (RMSE) of 1.78 pixels. A thematic map of moss health, derived from the multispectral mosaic using a Modified Triangular Vegetation Index (MTVI2), and an indicative map of moss surface temperature were then combined to demonstrate sufficient accuracy of our co-registration methodology for UAV-based monitoring of Antarctic moss beds.


Introduction
In recent times, the increased development and availability of micro and small-sized Unmanned Aerial Vehicle (UAV) platforms in combination with lightweight and low-cost Inertial Measurement Units (IMUs), GPS receivers, and scientific imaging sensors has driven a proliferation in the civilian use of UAVs. Small fixed wings, helicopters, and multi-rotor UAVs with a total weight of 5 kg or less (typically known as Micro-UAVs or MUAVs) are increasingly being used for scientific purposes, in areas such as photogrammetry and environmental remote sensing [1,2]. The use of UAVs for vegetation monitoring has been demonstrated by Dunford et al. [3], who mapped riparian forests, and by Rango et al. [4], who mapped rangelands in New Mexico. UAVs have also been proven to be useful for mapping agricultural crops, for example, mapping of vineyards [5], monitoring of wheat trials [6], and quantitative remote sensing of orchards and vineyards [7,8]. However, research on the use of multiple sensors, which are expanding the remote sensing capabilities of UAV platforms, is rather limited. The use of multiple sensors presents unique challenges related, in particular, to the co-registration of the different image sensors.
UAVs offer particular advantages over other remote sensing platforms, especially if a fine spatial resolution (<10 cm/pixel) is required. In terms of agricultural crop monitoring or mapping of natural vegetation such as Antarctic moss, satellite imagery acquired at very high spatial resolution (e.g., pixel size ~0.5 m provided by sensors on-board platforms such as WorldView or GeoEye) often provides insufficient detail to monitor vegetation structure [9] and to extract detailed biophysical information, such as leaf size [10]. Although imaging systems carried by manned airborne platforms can provide high spatial and temporal resolution imagery, they are limited by high operational complexity and costs, particularly in Antarctica and other polar regions. UAVs can offer a cost-effective alternative to traditional airborne remote sensing, but it is essential that techniques used to process the large amount of high spatial resolution data collected by a UAV are accurate and efficient.
Adequate spectral resolution is another key factor, particularly when monitoring vegetation, which exhibits great variability in magnitude of the near-infrared (NIR) reflectance. The lack of NIR reflectance information imposes limitations on vegetation characterization and thus a multispectral sensor that can capture data over several, preferably narrow, spectral bands is required [11]. However, few lightweight multi-/hyper-spectral sensors suitable for UAV operations are currently available [10], which limits research progress in this area. A typical approach to acquire NIR data from a small UAV is to modify a conventional digital camera by removing the infrared filter. As demonstrated by several studies [12,13,14,15], a consumer-modified camera can collect useful Color Infrared (CIR) imagery. However, for detailed analysis of vegetation structural and biochemical parameters, narrow band multi- or hyper-spectral sensors are required. Kelcey and Lucieer [16] described a correction workflow for reducing noise and optical distortion of the 6-band Tetracam multispectral sensor UAV image data, also used in this study. Berni et al. [8] mapped olive orchards with a radiometrically calibrated multispectral sensor. They used images corrected for atmospheric effects to retrieve per-pixel leaf chlorophyll content and Leaf Area Index (LAI). Laliberte et al. [11] classified land cover types from atmospherically corrected multispectral imagery of rangelands in New Mexico using Object Based Image Analysis (OBIA) techniques. More recently, Zarco-Tejada et al. [7] used calibrated and fully corrected multispectral imagery to estimate content of leaf carotenoid pigments of Spanish vineyards.
Whereas satellite thermal imaging is currently limited to low spatial resolutions, for example 90 m per pixel from the TERRA-ASTER instrument [17], a high spatial resolution assessment of vegetation moisture can be obtained from Thermal Infrared (TIR) imagery acquired from a UAV platform [14]. Uncooled TIR sensors, based on microbolometer technology, are generally used on MUAVs, because they are smaller and have lower power consumption [18]. Unfortunately, uncooled sensors are less sensitive and have a lower spectral resolution than their cooled counterparts [18]. They also require a specific spectral calibration and geometric characterization to retrieve the "true" ground surface temperature [8]. Despite these challenges, several studies have successfully used uncooled TIR sensors to map plant surface temperature for the purposes of crop monitoring [8,18,19].
As demonstrated by many studies cited above, there is a significant potential for UAV-based remote sensing of vegetation in the visible, near infrared, and thermal infrared region of the electromagnetic spectrum. A typical limitation in using current MUAVs for this purpose is, however, the ability to carry only one sensor at a time, due to the current lack of lightweight sensors, in particular sensors capable of measuring reflectance in the NIR. Imagery of visible, NIR and TIR wavelengths is, therefore, collected on separate flights [12], which results in a requirement to spatially co-register the separate datasets. Berni et al. [8], Bryson et al. [15] and Bendig et al. [19] conducted their research using various multispectral and TIR sensors, but they did not discuss co-registration accuracy of their datasets. Although Nagai et al. [20] developed a method to co-register laser scanner data with data from visible and NIR cameras, most other UAV studies have focused on sensor calibration and correction [8], correlation of multi-sensor data with biophysical properties of the vegetation [7], or on object based classification of vegetation types [21]. Therefore, the objective of this study is to present a workflow for spatial co-registration of visible, multispectral, and TIR imagery acquired at different ultra-high spatial resolutions during separate UAV flights. We describe technical specifications of the sensors used in this study, their image recording systems, and the multiple-image georectification and co-registration workflows.
To demonstrate the applicability and accuracy of our methodology we present multi-sensor datasets collected over three moss study sites in Antarctica. Moss is the dominant form of vegetation in Antarctica, capable of preserving in its shoots up to a hundred-year long record of Antarctic climatic conditions [22]. Despite Antarctica's sensitivity to climate change, there have been only a few studies investigating the response of Antarctic vegetation to dynamic climatic conditions [23,24]. There is, therefore, a need for mapping methods allowing detailed inventory and subsequent spatial monitoring of the changes in these vulnerable ecosystems. These Antarctic moss beds are spatially highly fragmented and cover only small areas (<1 ha). It is, therefore, practically impossible to map their extent with even the highest spatial resolution satellite imagery currently available (0.5 m/pixel). Moreover, local logistical obstacles limit the acquisition of conventional aerial photography in Antarctica, which may not provide the required spatial resolution (<10 cm/pixel). UAVs are an ideal platform from which image data of moss beds can be collected with sufficient spatial detail [25].

Test Sites
In Turner et al. [26] we introduced a technique to georeference and mosaic multiple visible images collected by a MUAV. In the present study we are using two additional datasets of thermal and multispectral images, to demonstrate our multi-sensor spatial co-registration methodology. Input data were collected at three study sites in the Windmill Islands region, Antarctica (near the Australian base, Casey), where some of the most well-developed continental Antarctic vegetation is located.
The three study sites were named: Robinson Ridge, Red Shed, and Antarctic Specially Protected Area 135 (ASPA 135) (see Figure 1 for location overview). Robinson Ridge is located approximately 10 km south of Casey station, the Red Shed site is beside a melt lake behind the main accommodation building at Casey, and ASPA 135 lies approximately 500 m east of Casey. Further description of all three test sites can be found in Turner et al. [26] and in Lucieer et al. [25].

Platform
Multi-rotor UAVs are becoming more commonplace and are frequently used for commercial and recreational aerial photography. For this study we used an eight rotor Mikrokopter Micro-UAV called an "Oktokopter" supplied by HiSystems (GmbH, www.mikrokopter.com, Germany) (see Figure 2). The Oktokopter had a payload capacity of around 1 kg, a flight duration of 5 min (with a typical payload), and was equipped with a gimballed camera mount (i.e., self leveled during flight based on onboard gyroscopes that measure the roll and pitch of the airframe), to which we individually fitted each of the three sensors. The Mikrokopter flight electronic systems can be used to automatically maintain level flight, control the altitude, log system data, and to fly the UAV through a series of predefined, three-dimensional waypoints.

Figure 2.
Oktokopter fitted with FLIR Photon 320 Thermal Infrared camera with Ethernet module mounted below.

Visible Digital Camera
To collect visible imagery we used a Canon 550D Digital Single Lens Reflex (DSLR) camera (15 Megapixel, 5184 × 3456 pixels, with Canon EF-S 18-55 mm F/3.5-5.6 IS lens). The image capture rate was controlled by the UAV's flight control board, which was programmed to emit a pulse at a desired frequency. The flight control board was connected to a custom-made cable that triggered the remote shutter release of the camera. The Canon camera was operated in shutter priority mode (a fast shutter speed was required to minimize motion blur), in which the desired shutter speed (typically 1/1250-1/1600 sec) was set before flight and the exposure was adjusted automatically by varying the camera's aperture. Images were captured in RAW format and stored on the memory card in the camera for post-flight download.

Thermal Infrared Sensor
To collect TIR imagery we used a FLIR Photon 320 (FLIR Systems, Inc., USA, www.flir.com) uncooled thermal sensor (see Figure 2). The Photon 320 had a 14 mm lens providing a 46° field of view and acquired image frames of 324 × 256 pixels as raw 14-bit Digital Numbers (DNs) at the rate of 9 Hz. Image frames from the camera were converted into Ethernet data packets by the FLIR Ethernet module and these data were then stored on a Single Board Computer (SBC), a Gumstix Verdex Pro XM4-BT, equipped with netCF and console expansion cards. System time of the Gumstix SBC was set to GPS time prior to flight, so that the thermal data files could be synchronized with UAV GPS log files. The 9 Hz data rate was too fast for the data buses of the Gumstix SBC and thus images could only be collected at a rate of around 1 Hz, which was fast enough for our purposes and retained sufficient image overlap on the ground.
After the flight, the raw image data was downloaded from the SBC memory card and processed with code written in the IDL/ENVI image-processing environment (Exelis Visual Information Solutions, Inc., USA, www.exelisvis.com) to extract the image frames from the captured data packets. Extracted images were stored as 16-bit ENVI single band files containing the original 14-bit raw DNs as collected by the TIR sensor. A set of JPEG quick look images was simultaneously generated, allowing a visual check of image quality. As the DNs typically do not cover the full 14-bit dynamic range, a contrast stretch had to be applied to the data, so that subsequent image processing software was able to identify features within the images. To identify an appropriate stretch, we created a histogram of the DNs of all pixels in all images, chose an upper and lower threshold such that the full dynamic range of the scene was covered, and applied a linear stretch based on these minimum and maximum thresholds. The images were then stored as 16-bit TIFF files, keeping note of the thresholds used such that the pixel values could later be converted back to the original DNs. The DN-values in the thermal imagery represent at-sensor radiance. After mosaicking and co-registration the DN-values were converted to absolute temperature in °C based on an empirical line correction. Nineteen targets with similar emissivity (0.97 assumed for moss and dark rock) were marked with shiny aluminum disks. Due to the very low emissivity of shiny aluminum these targets were clearly visible in the thermal imagery. A temperature observation was collected with a thermal radiance gun (Digitech QM7226) in between two aluminum disks. A GPS coordinate was also recorded for these observations. Matching pixels were extracted from the thermal imagery and a linear regression was calculated between their DN-values and the corresponding reference temperatures. With this empirical relationship we converted the whole thermal mosaic into absolute temperature, assuming a constant emissivity of 0.97 (which is justifiable given our interest in the moss bed).
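The reversible contrast stretch and the DN-to-temperature regression described above can be illustrated with a minimal Python sketch. It is not the IDL/ENVI code used in the study; the function names and the thresholds `lo` and `hi` are hypothetical, and an ordinary least-squares fit is assumed for the empirical line.

```python
def stretch_to_uint16(dn, lo, hi):
    """Linearly map a raw DN in [lo, hi] onto the 16-bit range 0..65535.

    lo and hi are the histogram thresholds chosen for the whole scene
    (hypothetical names); values outside the thresholds are clipped.
    """
    clipped = min(max(dn, lo), hi)
    return round((clipped - lo) * 65535 / (hi - lo))


def unstretch(value, lo, hi):
    """Invert the stretch and recover the original DN from a stored pixel value."""
    return round(value * (hi - lo) / 65535 + lo)


def fit_line(x, y):
    """Ordinary least-squares fit of y = gain * x + offset.

    Here x would hold the DNs extracted at the reference targets and
    y the temperatures measured on the ground at the same locations.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    gain = sxy / sxx
    return gain, mean_y - gain * mean_x
```

With `gain` and `offset` fitted on the reference targets, each mosaic pixel would be converted as `gain * unstretch(value, lo, hi) + offset`. Because the stretch expands a sub-range of the 14-bit DNs onto 16 bits, rounding does not prevent exact recovery of the original DN.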

Multispectral Sensor
The multispectral sensor used in our study was a Tetracam (Tetracam, Inc., USA) mini-MCA (Multiple Camera Array) with an array of six individual image channels. Each channel had its own Complementary Metal Oxide Semiconductor (CMOS) sensor that could acquire 10-bit image data at an image size of 1280 × 1024 pixels. It was possible to fit customized waveband pass filters to each lens, allowing the user to define the desirable spectral band configuration. The data sets collected for this study had 530, 550, 570, 670, 700 and 800 nm optical filters fitted, with a Full-Width at Half Maximum (FWHM) of 10 nm.
The mini-MCA could be set to a "burst" mode in which it captured images continually from the time the shutter release was first pressed; we used the maximum rate of 0.5 Hz. Each of the six arrays stored the images in a proprietary raw format onto individual compact flash memory cards. After the flight the image data was downloaded from the cards, resulting in six files for each camera exposure. As with the TIR data, we designed processing code in the IDL/ENVI environment to read the raw format files and to merge the layers into a single six band, 16-bit (to store the 10-bit data) ENVI image file. The next stage was to correct mini-MCA imagery for sensor noise and other image distortions. A detailed description of these corrections can be found in Kelcey and Lucieer [27]. In short, the following three corrections were applied: (i) noise reduction using dark current imagery, (ii) lens vignetting correction based on spatially dependent correction factors, and (iii) lens distortion removal using a Brown-Conrady model [27].
Finally, the image bands had to be aligned, as the six mini-MCA camera lenses were spatially offset. Tetracam Inc. provides software and alignment equations to correct for these offsets, but similar to Laliberte et al. [11] we found this alignment correction inaccurate at our typical UAV flying heights. To reduce the band misalignment, we developed our own technique based on detecting geometric features within the imagery with the Scale Invariant Feature Transform (SIFT) keypoint detector [28]. The alignment process considered the first spectral band (Band 1) of each image to be the master and aligned the other bands to it by matching key points between the bands. SIFT was run on each band of an image to create a set of key files and a key matching algorithm was then run between Band 1 and each of the other bands. Extracting the x, y locations of the matching features allowed us to create a control point file that aligned a given band with Band 1, i.e., there were five control point files for bands 2 to 6. The control points were applied to each band with a Delaunay triangulation combined with a nearest neighbor resampling in order to perform a non-linear local transformation for alignment with the first band. The band alignment was highly dependent on the distance between the camera and the imaged surface. Provided that this distance remained relatively constant (within ±5 m), the technique would create a set of alignment parameters applicable to the current dataset. A final visual inspection of objects with sharp edges in the images showed the band alignment to be satisfactory. Moreover, the method had the added advantage of being fully automated.
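The local transformation implied by a Delaunay triangulation of control points can be illustrated for a single triangle. The Python sketch below is an illustration under stated assumptions rather than the implementation used in the study: it assumes that the triangles have already been computed (for example with a library routine such as `scipy.spatial.Delaunay`), and the function names are hypothetical. A point inside a triangle of control points in the band to be aligned is mapped to Band 1 coordinates by reusing its barycentric weights.

```python
def barycentric(p, a, b, c):
    """Barycentric weights of point p with respect to triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    w_a = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    w_b = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return w_a, w_b, 1.0 - w_a - w_b


def warp_point(p, src_tri, dst_tri):
    """Map p from the source triangle to the destination triangle.

    src_tri holds three control points in the band being aligned,
    dst_tri the matching three control points in Band 1.
    """
    weights = barycentric(p, *src_tri)
    x = sum(w * vertex[0] for w, vertex in zip(weights, dst_tri))
    y = sum(w * vertex[1] for w, vertex in zip(weights, dst_tri))
    return x, y
```

In a nearest neighbor resampling, each output pixel would then take the value of the input pixel closest to the mapped location, which avoids interpolating (and thereby altering) the recorded DNs.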

Mosaicking of Visible Imagery
During a typical flight of our UAV the Canon camera collected around 200 images. It was therefore necessary to join the images into a single mosaic of the whole study area. The image mosaic also needed to be georeferenced, such that the imagery from the different sensors could be co-registered. There are various methods for mosaicking UAV imagery; e.g., Berni et al. [8], Bryson et al. [29], Laliberte et al. [30], Turner et al. [26], and Turner et al. [31]. Recently, new commercial software packages for automatically georectifying UAV imagery have become available (a review of some of these packages can be found in Turner et al. [31]). Based on our previous research results we selected Photoscan Professional by Agisoft (Agisoft LLC, Russia) to georectify and mosaic the visible UAV imagery.
An overview of the Structure-from-Motion (SfM) workflow in Photoscan software can be found in Lucieer et al. [25]. Prior to processing, the images were geotagged with their approximate location as recorded by the UAV's on-board navigation-grade GPS. The internal time of the camera was set to GPS time prior to flight to ensure that the images could be easily synchronized with the position data in the UAV GPS log file. Blurry images were then detected with an algorithm that calculates an image blur metric according to Crete et al. [32]. A more detailed description of this image processing stage can be found in Turner et al. [31]. In the final step, images with excessive overlap were removed [31].
Once the set of images to be processed was finalized, they were imported into Photoscan, which then detected and matched thousands of features between the images. Using these matches it performed a bundle adjustment to estimate the camera positions, orientations, and lens calibration parameters. Based on this information, the geometry of the scene (in the form of a 3D model) was created by applying a dense, multiview stereo reconstruction to the aligned images. Once the 3D geometry of the scene was constructed, a Digital Surface Model (DSM) and an orthophoto mosaic could be exported [25].
To improve the absolute spatial accuracy of the mosaics, we manually located Ground Control Points (GCPs) distributed within the imagery. The GCPs were 30 cm diameter metal disks with a bright orange rim that were laid out in the study area prior to flight and measured with Differential RTK GPS (DGPS) with a typical accuracy of 2 cm in the horizontal and 4 cm in the vertical direction (relative to a local coordinated benchmark). Similar to the large GCPs, between 20 and 45 smaller (10 cm diameter) orange metal disks were randomly laid out across the study area and also coordinated with a DGPS. These small GCPs were used later as check points to verify the accuracy of the georeferenced mosaics.
Photoscan provided a simple interface to mark the location of a GCP on the 3D model and its location was then automatically marked on all the images that covered that part of the model. The user then needed to manually verify and, if necessary, adjust the location of the GCP in each image. Although this process was time-consuming, taking from 2 to 4 h to mark about 20 GCPs in a dataset of 200 photographs, the significant improvement in spatial accuracy justified the work required.

Mosaicking of Thermal Infrared Images
Similarly to the visible camera, the TIR sensor could collect hundreds of images in a single flight and it also required selection of the best images from the acquired dataset. The response time of the microbolometer in the FLIR Photon 320 is approximately 10 milliseconds (see www.flir.com), giving it an effective shutter speed of 1/100th of a second, resulting in motion blur in around 40% of the images. Blurriness was assessed with the algorithm described in Section 2.6 and images with a blur metric greater than 0.3 were automatically removed from the dataset (see Figure 3). TIR images were subsequently processed with Photoscan in a manner similar to the visible imagery. Photoscan was provided with initial estimations of camera position based on the time stamp of the image and the position as recorded in the on-board GPS log file. The TIR imagery was stored as a single band 16-bit file. Features in the images were enhanced with a linear contrast stretch, which was based on the minimum and maximum temperature DNs detected in the full scene (see Section 2.4). Once Photoscan had aligned the images, the GCPs needed to be identified within the imagery to allow accurate co-registration. It should be noted that only large aluminum trays (30 cm diameter) with an unpainted central part (see Figure 4b) were detectable in the lower resolution TIR imagery (10 cm/pixel for a typical UAV flight). The metallic aluminum surface of the trays with a low emissivity appeared as very cold (dark) pixels, which made the GCPs easy to visually identify. After the GCPs were marked, the scene's geometry was constructed, and an orthophoto was generated following the same approach as described in Section 2.6.

Mosaicking of Multispectral Images
Unlike the TIR and the visible imagery, it was not possible to process the multispectral imagery with Photoscan. Although multi-band 16-bit images could be imported into Photoscan, it was not feasible to reliably align the images of different spectral bands, no matter what parameters were selected. The CMOS sensors in the mini-MCA have a rolling shutter, which builds up each image as a scan from top to bottom rather than as a whole-frame snapshot, as is the case with a global shutter. Given the movements of the sensor during image acquisition, the rolling shutter led to unpredictable geometric distortions in each image. The SfM algorithm in Photoscan expects images to be acquired with a global shutter. Thus, the distorted image geometry of the mini-MCA led to very poor and false image matching results. Laliberte et al. [33] developed a method for rectifying, georeferencing, and mosaicking UAV visible imagery by matching the individual images with a pre-existing orthophoto of the study area using image correlation techniques. This method served as the basis for our approach; however, we implemented the SIFT algorithm, as we did in the mini-MCA band alignment workflow (see Section 2.5). Instead of using a low-resolution orthophoto, we matched each mini-MCA frame to the ultra-high spatial resolution visible orthophoto mosaic (Section 2.6). This allowed SIFT to select thousands of features per image to be used as control points, providing a denser basis for the transformation.
The mini-MCA collected a large number of overlapping images during flight. High overlap (80%-90%) is essential for the SfM algorithm, but for the mini-MCA imagery we only needed overlap of about 25%-30% to form a continuous mosaic. Therefore, the first processing step of the multispectral mosaicking workflow was determining an optimal subset of input images. The position of the airframe at the time of each image exposure, logged by the on-board GPS unit, was used to select the mini-MCA images based on their spatial distribution. Through process optimization we determined that a threshold of 7 m between the image positions achieved a mosaic with full coverage, but minimal seam lines. This selection reduced the number of images in a test dataset from 142 to 41, greatly reducing the number of seam lines within the final mosaic, whilst maintaining sufficient image overlap of around 25% (as determined via experimentation across all datasets). The blurriness of mini-MCA images was verified in the same manner as for the other two datasets. However, the blur factor of the mini-MCA images was, in general, so low that no images were excluded from the datasets.
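A distance-based image selection of this kind can be sketched in a few lines of Python. The text specifies only the 7 m threshold; the greedy, acquisition-order strategy below and the function name are assumptions made for illustration.

```python
import math


def select_by_spacing(positions, min_spacing_m=7.0):
    """Return indices of images whose exposure positions are well separated.

    positions is a list of (easting, northing) tuples in metres, in
    acquisition order. An image is kept only if it lies at least
    min_spacing_m from every image that has already been kept.
    """
    kept = []
    for index, (easting, northing) in enumerate(positions):
        far_enough = all(
            math.hypot(easting - positions[k][0], northing - positions[k][1])
            >= min_spacing_m
            for k in kept
        )
        if far_enough:
            kept.append(index)
    return kept
```

For exposures spaced 2 m apart along a straight flight line, this rule keeps every fourth image (8 m spacing), since 6 m is below the threshold.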
Before we matched features in a mini-MCA image, we had to identify all the features in the visible orthophoto. To overcome limitations of the conventional SIFT algorithm implementation that usually runs only on low-resolution images, we used a SIFT distribution called "libsiftfast" [34], which uses multiple CPU cores and can process large images. As a result, SIFT was able to detect 3,643,780 features in one of our visible mosaics, which measured 8690 × 17,215 pixels. Since the SIFT algorithm runs per single image band, the six mini-MCA spectral bands supply more geometrical features for the matching process than the standard RGB imagery. Among the millions of matches there were often many false matches, which were removed with the Random Sample Consensus (RANSAC) algorithm developed by Fischler and Bolles [35]. In the application of the RANSAC algorithm we made the assumption that we were working with a projective model and that the epipolar constraints would hold despite the possibility of rolling shutter distortions within the imagery. To ensure these distortions did not affect the results we applied a low distance threshold (0.01 in normalized space), which reduced the number of resultant matches, whilst ensuring there were no false positives. We did not require a large number of matches as the remaining feature matches (post RANSAC) were not going to be used in an SfM context but were instead used to create control point files.
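The principle of rejecting false matches with RANSAC can be shown with a deliberately simplified Python sketch. Note the substitution: the study assumed a projective model with epipolar constraints and a threshold of 0.01 in normalized space, whereas the sketch below uses a pure translation between the two images and a threshold in pixels, purely to keep the example short. The function name and parameters are hypothetical.

```python
import random


def ransac_translation(matches, threshold=1.0, iterations=200, seed=0):
    """Find the translation supported by the largest set of matches.

    matches is a list of ((x_src, y_src), (x_dst, y_dst)) pairs.
    Returns the best (dx, dy) shift and the indices of its inliers.
    A fixed seed keeps the random sampling reproducible.
    """
    rng = random.Random(seed)
    best_shift, best_inliers = (0.0, 0.0), []
    for _ in range(iterations):
        # A translation is fully defined by one match, so sample one.
        (sx, sy), (dx, dy) = rng.choice(matches)
        shift = (dx - sx, dy - sy)
        # Matches that agree with this shift form its consensus set.
        inliers = [
            i
            for i, ((ax, ay), (bx, by)) in enumerate(matches)
            if abs(bx - ax - shift[0]) <= threshold
            and abs(by - ay - shift[1]) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_shift, best_inliers = shift, inliers
    return best_shift, best_inliers
```

Lowering `threshold` shrinks the consensus set but makes it less likely that a false match survives, which mirrors the trade-off described above.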
Using the feature matches, and the feature key files containing the locations of each feature in each image, we automatically created a series of dense GCP files for each mini-MCA image. This was facilitated by matching the known x, y coordinates of mini-MCA image features with their corresponding x, y coordinates in the RGB orthophoto, which were converted to easting and northing coordinates as the RGB mosaic was already georeferenced. The GCPs were then used in a Delaunay triangulation, which transformed a mini-MCA image using a nearest neighbor resampling algorithm into the same coordinate system as the visible and TIR mosaics. The pixel size of the resulting georeferenced image was set based on the flying height during the image acquisition (3 cm/pixel for our test datasets). Finally, all the images were merged into a georeferenced mosaic with the IDL/ENVI mosaicking routine using a feathering of the seam lines to smooth the original image edges.
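The conversion of orthophoto pixel coordinates to easting and northing can be sketched as follows. This is a minimal Python illustration, assuming a north-up mosaic with square pixels whose georeference is given by the map coordinates of the upper-left corner; the function and parameter names are hypothetical.

```python
def pixel_to_map(col, row, origin_easting, origin_northing, pixel_size):
    """Return the map coordinates of the centre of pixel (col, row).

    origin_easting and origin_northing are the coordinates of the outer
    corner of the upper-left pixel; pixel_size is in metres. Rows
    increase downwards in the image, so northing decreases with row.
    """
    easting = origin_easting + (col + 0.5) * pixel_size
    northing = origin_northing - (row + 0.5) * pixel_size
    return easting, northing
```

Applying this to the orthophoto location of every matched feature yields, for each mini-MCA image, a list of (image x, image y, easting, northing) tuples that can serve as a dense control point file.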

Calculating MTVI2 from Multispectral Data
Raw images of the mini-MCA, recorded as 10-bit digital counts with maximal signal strength equal to 1024 DNs, were transformed into the physically meaningful relative reflectance by applying an empirical line correction [36]. Three spectrally flat calibration panels (40 × 40 cm) of white, grey, and black color with reflectance intensities ranging from 2 up to 80% were placed within the UAV flight path and captured in the mini-MCA imagery. The DN values of the targets (approx. 130-140 pixels per panel) were extracted from the airborne image and empirically related to their reflectance functions measured with a spectrally calibrated ASD HandHeld2 (HH2) spectroradiometer (Analytical Spectral Devices, PANalytical, Boulder, USA) on the ground immediately after completion of the mini-MCA acquisition. The absolute reflectance of the calibration targets measured with the ASD-HH2 between 325 and 1075 nm (751 spectral bands of 1 nm bandwidth) was spectrally convolved to resemble the six broader spectral bands of the mini-MCA instrument with FWHM of 10 nm. The empirical line correction coefficients established between the convolved ASD-HH2 and the six acquired mini-MCA spectral bands were then applied per pixel to the mini-MCA image mosaic to remove optical attenuation caused by scattering and absorption processes of atmospheric gases and aerosols between the sensor and observed surfaces and to standardize the multispectral signal as the relative reflectance function.
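The spectral convolution step can be illustrated with a short Python sketch that reduces a 1 nm spectrum to a single band value. The Gaussian shape of the band response is an assumption made here for illustration (the text specifies only the band centres and the 10 nm FWHM), and the function name is hypothetical.

```python
import math


def band_average(wavelengths_nm, reflectance, centre_nm, fwhm_nm=10.0):
    """Weight a fine-resolution spectrum by a Gaussian band response.

    The standard deviation follows from the FWHM via
    sigma = FWHM / (2 * sqrt(2 * ln 2)).
    """
    sigma = fwhm_nm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    weights = [
        math.exp(-0.5 * ((w - centre_nm) / sigma) ** 2) for w in wavelengths_nm
    ]
    weighted_sum = sum(wt * r for wt, r in zip(weights, reflectance))
    return weighted_sum / sum(weights)
```

Calling the function once per band centre (530, 550, 570, 670, 700 and 800 nm) would give six panel reflectance values, which could then be regressed against the panel DNs of the corresponding mini-MCA band to obtain the empirical line coefficients.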
An efficient way of detecting photosynthetically active vegetation in multispectral imagery and assessing its actual physiological state is to transform the relative reflectance function into an optical vegetation index. The Modified Triangular Vegetation Index 2 (MTVI2) was originally introduced by Haboudane et al. [37] to estimate green biomass density of spatially homogeneous agricultural crops. It is computed as:
$$\mathrm{MTVI2} = \frac{1.5\,\left[1.2\,(R_{800} - R_{550}) - 2.5\,(R_{670} - R_{550})\right]}{\sqrt{(2R_{800} + 1)^{2} - \left(6R_{800} - 5\sqrt{R_{670}}\right) - 0.5}}$$
where $R_{550}$, $R_{670}$ and $R_{800}$ are the reflectance values at 550, 670 and 800 nm. As a successor of the Triangular Vegetation Index [38], MTVI2 integrates an area delineated by the reflectance at 550, 670 and 800 nm, which is influenced by the changes in leaf and canopy structure, normalized by a soil adjustment factor that reduces the contamination effect of the bare soil background.
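A per-pixel computation of the index can be sketched in Python as follows. The sketch uses the commonly cited form of MTVI2 from Haboudane et al. [37], i.e., 1.5 [1.2 (R800 − R550) − 2.5 (R670 − R550)] divided by the square root of (2 R800 + 1)² − (6 R800 − 5 √R670) − 0.5; the function name is hypothetical and the inputs are assumed to be relative reflectances between 0 and 1.

```python
import math


def mtvi2(r550, r670, r800):
    """Modified Triangular Vegetation Index 2 for one pixel.

    r550, r670 and r800 are relative reflectances (0..1) at 550, 670
    and 800 nm, i.e., after the empirical line correction.
    """
    numerator = 1.5 * (1.2 * (r800 - r550) - 2.5 * (r670 - r550))
    denominator = math.sqrt(
        (2.0 * r800 + 1.0) ** 2 - (6.0 * r800 - 5.0 * math.sqrt(r670)) - 0.5
    )
    return numerator / denominator
```

A spectrally flat surface yields a value of zero because both differences in the numerator vanish, whereas a spectrum with low red and high NIR reflectance, as is typical of photosynthetically active vegetation, yields a clearly positive value.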
When stressed by an insufficient water supply and high photosynthetically active and ultraviolet irradiation, Antarctic moss turf can change in compactness and pigmentation within a few days. Over days to weeks it changes from a healthy, green, open-leaved form to a denser, stress-resistant, yellow-brown or red, closely packed turf. If dry conditions persist over longer periods (months to years) the turf will lose photosynthetic pigments, forming grey-black mounds of moribund (dormant/dead) moss. Since these physiological stress reactions systematically alter the moss reflectance at the wavelengths of 550, 670 and 800 nm, we could apply MTVI2 to separate photosynthetically active moss (health > 60%) from moribund moss, lichens, and the rocky surroundings.
To assess the ability of the MTVI2 index to determine moss health, the Robinson Ridge MTVI2 results were compared with field samples from the 2012 field season collected as part of a long term monitoring system for Australian State of the Environment Indicator 72 (SoE 72). Established in 2003, this monitoring system comprises a set of 30 permanent quadrat locations, across 10 transects spanning a water gradient across three community types: from the wettest community dominated by mosses (Bryophyte community) to the driest community dominated by moribund moss encrusted with lichens (Lichen community), with a Transitional community between [39]. For each quadrat, the percentage of live bryophytes was evaluated from 9 small samples, each containing approximately 20-50 moss shoots, at 9 intersections within a 20 × 20 cm grid. For a more detailed description of moss ground monitoring see Wasley et al. [39] and Lucieer et al. [25] and for a description of spectral properties of Antarctic moss see Lovelock and Robinson [40]. To replicate the ground based sampling scheme, a grid of 3 × 3 pixels was extracted from the MTVI2 map and averaged per sampled quadrat.
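The extraction and averaging of a 3 × 3 pixel window can be sketched in Python as below. The function name is hypothetical, the MTVI2 map is represented as a plain list of rows, and the window is assumed to lie entirely within the map.

```python
def window_mean(grid, row, col, half=1):
    """Mean of the (2 * half + 1) x (2 * half + 1) window centred on (row, col).

    grid is a list of rows of MTVI2 values; half=1 gives the 3 x 3
    window used to mimic the nine ground samples per quadrat.
    """
    values = [
        grid[r][c]
        for r in range(row - half, row + half + 1)
        for c in range(col - half, col + half + 1)
    ]
    return sum(values) / len(values)
```

The centre pixel (row, col) would be the map pixel containing the DGPS position of the quadrat, after converting that position from map to pixel coordinates.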

Accuracy Assessment
To achieve the best possible co-registration of the three different datasets, it was essential that each of the mosaics was georeferenced with the highest spatial accuracy possible. For the RGB orthomosaics we, therefore, measured the positional error of all small orange disk check points that were measured with DGPS, but not used by Photoscan to georeference the mosaics. These disks were, unfortunately, too small to be visible in the TIR imagery and to be accurately identified in the mini-MCA imagery. For these two datasets the larger GCPs were used as check points. The Root Mean Square Error (RMSE) was computed between check point coordinates measured in the field with DGPS and coordinates retrieved from georeferenced image mosaics to assess the overall spatial accuracy of each dataset.
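A minimal sketch of this check point comparison is given below. It assumes that the RMSE is computed on the horizontal (2D) distance between each DGPS coordinate and the corresponding mosaic coordinate, and that dividing by the pixel size expresses the result in pixels; the function names and the example coordinates are hypothetical.

```python
import math


def rmse(measured, reference):
    """Root mean square of the 2D distances between paired points.

    `measured` and `reference` are equal-length lists of (x, y) tuples
    in the same map units (e.g., metres).
    """
    if len(measured) != len(reference) or not measured:
        raise ValueError("point lists must be non-empty and of equal length")
    squared = [
        (xm - xr) ** 2 + (ym - yr) ** 2
        for (xm, ym), (xr, yr) in zip(measured, reference)
    ]
    return math.sqrt(sum(squared) / len(squared))


def rmse_in_pixels(measured, reference, pixel_size):
    """Express the RMSE in pixel units for a mosaic with the given pixel size."""
    return rmse(measured, reference) / pixel_size


# Hypothetical check points: mosaic coordinates vs. DGPS coordinates (metres).
mosaic_xy = [(100.03, 200.04), (150.00, 250.00)]
dgps_xy = [(100.00, 200.00), (150.00, 250.00)]
error_px = rmse_in_pixels(mosaic_xy, dgps_xy, pixel_size=0.01)
```

Reporting the error in pixels rather than metres makes mosaics of different resolution (1, 3 and 10 cm/pixel) directly comparable.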

Image Mosaics
The three test sites were flown with our manually navigated Micro-UAV on three separate days. Manual navigation was required due to failure of autopilot navigation, which was caused by the extreme magnetic declination of Eastern Antarctica (~100° West). A basic description of the acquired datasets is provided in Table 1. Flights were all carried out in good solar illumination conditions, light winds, and at an altitude of approximately 50 m Above Ground Level (AGL). Despite flying at the same height AGL during each UAV mission, the resulting image mosaics have different spatial extents and different spatial resolutions caused by differences in the technical parameters of each sensor. Applying the methods described in Section 2, we created georeferenced mosaics for each of the nine flights. An example of a visible mosaic from the Robinson Ridge site is shown in Figure 4a. Figure 4b illustrates the detail that can be seen in this RGB imagery with a pixel size of 1 cm. The same spatial subsets of the lower resolution thermal infrared mosaic (10 cm/pixel) and of the false colour multispectral mosaic (3 cm/pixel) are shown in Figure 4c,d, and finally Figure 4e gives an example of a typical spectral signature for the healthy moss.
Using the methods described in Section 2.10, the spatial accuracy of each orthomosaic was measured and a summary of the spatial errors and RMSE is provided in Table 2. The RMSE ranges in general from 1 to 2.6 pixels, which means that all the mosaics exhibit a comparable level of spatial accuracy. The ASPA135 visible mosaic has the lowest RMSE, which can be attributed to the fact that it is a relatively small site with limited geomorphological variability, whereas the other two test sites are larger in spatial extent and have more diverse terrain morphology.

Co-Registration Accuracy
Large GCPs were used as cross-comparison check points for the co-registration accuracy assessment, since they were the only features clearly identifiable in all three datasets. The location of the comparison points in the visible mosaic was considered to be the reference position. The co-registration errors of these GCPs in the mini-MCA and TIR mosaics are summarized in Table 3. The RMSEs are generally around 2 pixels, which matches the absolute RMSE that was obtained for all the mosaics listed in Table 2.

Assessing Health of Antarctic Moss from Multispectral Imagery
To demonstrate sufficient spatial accuracy of these geocoded mosaics and the ability of the multispectral imagery to assess the actual health state of Antarctic mosses, we computed the MTVI2 optical vegetation index from the mini-MCA imagery collected at the Robinson Ridge test site (see Section 2.9). The Robinson Ridge data was selected as the most suitable of the three available datasets, because it has the least snow cover, it was acquired under optimal light conditions (bright, but diffuse irradiation), and ground observations of actual moss health are available for this site. Since reflectance signatures of stressed mosses and agro-systems with low leaf density (i.e., low leaf area index) are spectrally similar, we could apply MTVI2 to assess the spatial distribution of the health state of moss bed at Robinson Ridge.
Comparing the moss health measured in summer 2012 and mean MTVI2 values derived from the mini-MCA mosaic we found a strong quadratic relationship between the two (see Figure 5). MTVI2 is insensitive to moss health in the driest lichen community quadrats due to the low abundance of photosynthetically active moss. MTVI2 does, however, show a statistically significant positive relationship for quadrats with a significant and/or dominant presence of bryophytes (R² = 0.636). Such a strong correlation provides evidence of a good co-registration agreement between mini-MCA and DGPS localization of quadrats and georectification of the mini-MCA image mosaic. The statistical relationship shown in Figure 5 allowed us to approximate per-pixel moss health of the whole moss bed captured in the mini-MCA mosaic. Unfortunately, a considerable number of erroneous pixels were identified after a close inspection of the moss health map. These were caused, for instance, by high sensor noise combined with low light in shadows that produced incorrect spectral signatures of rock surfaces, mimicking high MTVI2 values of healthy moss. To prevent false moss health estimates, we applied the following rules: (i) the calculated per-pixel value of the Normalized Difference Vegetation Index (NDVI) must always be positive, (ii) the reflectance value of the 800 nm spectral band must be positive and greater than the reflectance in the shorter wavelengths, i.e., at 550, 570 and 700 nm, and finally (iii) to eliminate negative estimates, moss health is assessed only if the MTVI2 value is greater than 0.4. The final map of moss health with these rules applied is displayed in Figure 6b.
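The three filtering rules can be expressed as a per-pixel test, as in the sketch below. It assumes that NDVI is computed from the 800 nm and 670 nm bands, which is the conventional choice but is not stated explicitly in the text; the function name `is_valid_moss_pixel` and the example reflectance values are hypothetical.

```python
def is_valid_moss_pixel(r550, r570, r670, r700, r800, mtvi2_value):
    """Return True when a pixel passes rules (i) to (iii).

    Reflectance values are assumed to be on a 0-1 scale; NDVI is assumed
    to use the 800 nm (NIR) and 670 nm (red) bands.
    """
    denominator = r800 + r670
    if denominator == 0.0:
        return False  # NDVI undefined, treat the pixel as invalid
    ndvi = (r800 - r670) / denominator
    rule_i = ndvi > 0.0                                     # (i) positive NDVI
    rule_ii = r800 > 0.0 and r800 > max(r550, r570, r700)   # (ii) NIR dominates
    rule_iii = mtvi2_value > 0.4                            # (iii) MTVI2 threshold
    return rule_i and rule_ii and rule_iii


# Hypothetical pixels, for illustration only.
moss_like = is_valid_moss_pixel(0.08, 0.07, 0.04, 0.15, 0.45, mtvi2_value=0.69)
noisy_shadow = is_valid_moss_pixel(0.05, 0.05, 0.04, 0.09, 0.06, mtvi2_value=0.55)
```

In this example the second pixel has a high MTVI2 value but fails rule (ii) because its 700 nm reflectance exceeds that at 800 nm, which is the kind of shadow-induced artefact the rules are meant to remove.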

Assessing Temperature of Healthy Moss from Thermal Infrared Image Mosaic
Simultaneously with the UAV flight over the Robinson Ridge study site, ground measurements of surface temperature were carried out with a handheld infrared thermometer at various locations recorded with DGPS. The corresponding DN value at each of the ground sample points was extracted from the georeferenced TIR mosaic and compared to the ground temperature measurements. A strong linear relationship established between both datasets was used to convert the TIR DN values to indicative surface temperatures (see Figure 7), providing a map of indicative surface temperature for the whole study site. Applying the moss health map as a mask for the TIR imagery produced the surface temperature map of only moss pixels. The resulting surface temperature map of moss with a relative health score greater than 60% is shown in Figure 6c.
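The calibration and masking steps can be sketched as follows, assuming an ordinary least squares line fitted to paired DN and ground temperature samples and a health threshold of 60% for the mask. The function names, the use of `None` for masked pixels, and all numerical values are hypothetical and only illustrate the procedure.

```python
def fit_line(x, y):
    """Ordinary least squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def masked_temperature(dn_grid, health_grid, slope, intercept, min_health=60.0):
    """Convert DN to indicative temperature, keeping only moss pixels.

    Pixels whose moss health is missing or not above `min_health` are
    set to None.
    """
    result = []
    for dn_row, health_row in zip(dn_grid, health_grid):
        row = []
        for dn, health in zip(dn_row, health_row):
            if health is not None and health > min_health:
                row.append(slope * dn + intercept)
            else:
                row.append(None)
        result.append(row)
    return result


# Hypothetical ground samples: TIR DN values and thermometer readings (deg C).
sample_dn = [100.0, 200.0, 300.0]
sample_temp = [1.0, 3.0, 5.0]
slope, intercept = fit_line(sample_dn, sample_temp)

# Hypothetical 1 x 2 pixel subsets of the TIR mosaic and moss health map.
moss_temp = masked_temperature([[100.0, 200.0]], [[80.0, 40.0]], slope, intercept)
```

The sketch presumes that both grids share the same pixel geometry, so the coarser TIR mosaic and the moss health map would first need to be resampled to a common grid.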

Discussion
The primary aim of this study is to demonstrate that our image co-registration workflow is able to produce sufficiently accurate georectified mosaics from images collected by three sensors during three separate flights at ultra-high spatial resolution (1-10 cm/pixel). It is important to note that each mosaic is produced with its own fully independent workflow and that none of them has been cross adjusted to improve the co-registration result. The mean accuracy of the co-registration (1.78 pixels) is regarded as satisfactory and acceptable for mapping of Antarctic moss beds, as verified by the statistically significant agreement with moss health ground observations (Figure 5). Unfortunately, there are currently no similar studies of co-registration of UAV imagery, to which our results could be compared. Nevertheless, the thematic maps of moss health and moss surface temperature could only be created because of accurate co-registration of the three independent datasets.
The actual moss health map does not provide estimates below 60% (Figure 6b), a range that is mostly populated by desiccated (or moribund) plants. The spectral signature of such moss is similar to signatures of surrounding rocks and bare soil in the area; therefore, MTVI2 is unable to distinguish between these surfaces. In Lucieer et al. [25] a correlation between moss health and water availability was established based on local point measurements. Occurrences of high moss health (health > 90%) were shown to coincide with areas of high water flow accumulation. Geostatistical analysis of the moss health map produced in this study has potential to quantify the statistical significance of this relationship spatially at the scale of the whole study area, but such an analysis is beyond the scope of this paper.
The map of moss surface temperature shows subtle variations across the moss beds. These variations do not seem to be related to moss health, but rather reflect the local micro-topography, shadowing effects, and local moss moisture (water availability) variability. The phenomenon can be clearly seen in the areas of cooler moss occurring south of the large boulders (Figure 6c). These areas are shadowed from the sun, which at the time of image collection had an azimuth of 350 degrees and an elevation of 35 degrees. Thus the areas to the south of the boulders are cooler due to the microclimatic differences between the sunlit and shaded sides of the boulders. It is unlikely that the apparent zones of lower moss temperatures are caused by inaccurate spatial co-registration of multispectral and thermal mosaics, as the marked thermal shadow of a large boulder in the center of Figure 6c is too large to be attributed wholly to co-registration errors.
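The scale of such a shadow can be checked with simple solar geometry. The sketch below assumes flat ground and uses the sun azimuth (350 degrees) and elevation (35 degrees) reported above; the boulder height of 1 m is a hypothetical value chosen only for illustration, as are the function and variable names.

```python
import math


def shadow(height_m, sun_azimuth_deg, sun_elevation_deg):
    """Return (length in metres, direction in degrees from north) of a shadow.

    Assumes flat ground; the shadow points directly away from the sun.
    """
    length = height_m / math.tan(math.radians(sun_elevation_deg))
    direction = (sun_azimuth_deg + 180.0) % 360.0
    return length, direction


# Hypothetical 1 m high boulder under the reported sun position.
length_m, direction_deg = shadow(1.0, sun_azimuth_deg=350.0, sun_elevation_deg=35.0)

# Shadow length expressed in pixels of the 10 cm/pixel TIR mosaic.
length_px = length_m / 0.10
```

For the hypothetical 1 m boulder this gives a shadow of about 1.4 m pointing towards 170 degrees (slightly east of south), or roughly 14 TIR pixels, which is several times larger than the co-registration RMSE of about 2 pixels.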
The creation of the visible and thermal mosaics was straightforward with the use of the Photoscan software; however, significant work was required to mosaic the multispectral imagery as it was not compatible with Photoscan. The SIFT algorithm provided the innovation required, allowing the multispectral imagery to be matched to the already georeferenced visible imagery. Matching of ultra-high resolution UAV multispectral imagery with visible imagery using the SIFT algorithm provides a new method to co-register UAV imagery. It should be noted that the methodology needs to be tested with more topographically variable terrain to validate its performance with datasets collected over steep terrain.
Although the co-registration techniques described in this study have proven to be robust and accurate, an elimination of GCPs and implementation of a direct georeferencing system such as described in Turner et al. [31] would certainly be a significant future improvement. There are also some limitations to the mini-MCA camera, in particular the rolling shutter and the sensor noise, which need further attention.

Conclusions
In this study, we have developed a semi-automated workflow for accurate spatial co-registration of image datasets acquired from a Micro Unmanned Aerial Vehicle (MUAV) platform equipped with three different sensors: visible, multispectral, and thermal. We demonstrated that the methodology can achieve a mean co-registration accuracy of 1.78 pixels. A significant achievement of this study was a method to georectify and mosaic a large number of multispectral images acquired by the TetraCam mini-MCA, which could not be previously achieved with available image processing software. The study is also the first to present a detailed analysis of co-registration accuracy of three UAV image datasets acquired at an ultra-high spatial resolution. The benefit of accurate co-registration of UAV sensors was demonstrated through a case study assessing moss plant health and estimating moss surface temperature at a permanent study site in Eastern Antarctica. Both thematic maps, created from the accurately co-registered image mosaics, provide important spatial insights into the dynamic environment and growing conditions of the Antarctic mosses. In this study, we have shown that UAVs carrying multiple sensors can be used to accurately map vegetation canopies. Although future applications will likely deploy all such sensors simultaneously in order to eliminate changes caused by flight time delays, a similar co-registration methodology will still be required.
Postgraduate Award. We thank Jane Wasley and Johanna Turnbull for setting up the original quadrats, Jessica Bramley-Alves and Rebecca Miller for helping to collect samples in 2012, and ANARE expeditioners in 2003, 2008 and 2010 for their field support. Luke Wallace is acknowledged for his assistance implementing the RANSAC algorithm and we thank Joshua Kelcey for applying radiometric sensor correction algorithms to the mini-MCA imagery.

Author Contributions
Darren Turner and Arko Lucieer acquired the UAV field data in Antarctica. Sharon Robinson and Diana King collected the moss samples and completed all labwork required to produce the moss health dataset. Darren Turner processed all the imagery and developed the new algorithms as required. Zbyněk Malenovský applied the empirical line correction to the multispectral data, implemented the MTVI2 vegetation index, and provided expertise on spectral signatures of Antarctic moss. Darren Turner, Arko Lucieer, and Zbyněk Malenovský wrote the manuscript with editorial contributions from Sharon Robinson and Diana King.