Surface Heat Monitoring with High-Resolution UAV Thermal Imaging: Assessing Accuracy and Applications in Urban Environments

Abstract: The urban heat island (UHI) effect, where urban areas experience higher temperatures than surrounding rural regions, necessitates effective monitoring to estimate and address its diverse impacts. Many existing studies on urban heat dynamics rely on satellite data with coarse resolutions, posing challenges in analyzing heterogeneous urban surfaces. Unmanned aerial vehicles (UAVs) offer a solution by providing thermal imagery at a resolution finer than 1 m. Despite UAV thermal imaging being extensively explored in agriculture, its application in urban environments, specifically for surface temperatures, remains underexplored. A pilot project conducted in Athens, Georgia, utilized a UAV with a FLIR Vue Pro R 640 thermal camera to collect thermal data from two neighborhoods. Ground data, obtained using a handheld FLIR E6-XT infrared imaging camera, were compared with UAV thermal imagery. The study aimed to assess the accuracy of the UAV camera and the handheld camera for urban monitoring. Initial testing revealed the handheld's accuracy but tendency to underpredict, while UAV camera testing highlighted considerations for altitude in both the rjpg and tiff image pixel conversion models. Despite challenges, the study demonstrates the potential of UAV-derived thermal data for monitoring urban surface temperatures, emphasizing the need for careful model considerations in data interpretation.


Introduction
The urban heat island (UHI) phenomenon, denoting elevated temperatures in urban areas compared to their rural counterparts [1], was documented as early as the 1810s [2]. This temperature difference is attributed to the heat-absorbing and radiating properties of materials such as concrete and steel, compounded by anthropogenic activities and a lack of green spaces, which impedes natural cooling processes [3,4]. The repercussions of UHI are profound, encompassing heightened heat stress with consequential health implications, including over 10,500 deaths from heat between 2004 and 2018 [5]. Particularly susceptible demographics, including the elderly and socioeconomically disadvantaged individuals, face increased risks [6,7]. Increased heat also drives electricity consumption for cooling, leading to pollution and health problems [8]. Additionally, UHI affects water bodies and ecosystems [9,10]. Cities are responding with heat mitigation plans, including reducing surface coverage, using cooler materials, and increasing green spaces [11].
Currently, the majority of urban heat island studies follow one of two methods: either the overall city surface temperature is estimated broadly from satellite imagery, or in situ measurements are collected [12][13][14][15]. Both of these methods face limitations and obstacles to linking ground measurements with overall city image estimates. In particular, urban surface temperature studies are limited to coarser resolutions because they are usually estimated from satellite imagery. Landsat 7 offers 60 m thermal bands and Landsats 8-9 provide 100 m bands. Images offered at 30 m resolution from Landsat have been resampled but are essentially derived from the 60 m or 100 m resolutions. These coarse resolutions become more disadvantageous due to the highly heterogeneous nature of urban environments. In addition, satellite temperature readings are limited to surface temperatures (not air temperatures) [16].
The second urban heat monitoring method involves microscale point data from single-point measurements at ground level or from weather stations. These measurements are limited to the temperature of that single point, and multiple recordings are needed to represent an area. The majority of these studies measure air temperature (not ground surface temperature). In one study, the air temperature was represented as a line because it was recorded continuously on a bike, a technique that was also employed in another study that used a car [15,17]. However, these measurements are still greatly restricted to a small space and require greater manual labor on the ground to cover more area. In other words, urban heat island research has been unable to scale fine-resolution temperature data to a larger area. While some studies have been able to acquire very-high-resolution (VHR) airborne thermal infrared data to address this need, airborne deployment can be costly and complicated [18].
UAVs present the opportunity for fine-resolution spatial data over larger areas and are currently implemented in other fields, such as precision agriculture and geology [19][20][21][22]. In contrast, a limited number of studies have leveraged the potential of UAVs in urban environments for any application, including addressing the gap in the urban surface temperature literature [23][24][25][26][27][28]. These range internationally from Chile to China, but none to our knowledge have been conducted in the United States. Of these, only a few were designed with the purpose of extensively testing the accuracy of a thermal camera in an urban environment [25,28].
Thermal UAV imagery presents its own complications, regardless of the environment. These issues include how to store temperature data in each image pixel, what type of file to store the data in, how to process the different image file types, and how to stitch multiple images together to cover a wider area [19,21,29]. In addition, while some studies, such as those in agriculture, have used highly accurate ground-truthing stations or thermal tiles, such equipment cannot be easily placed and maintained in an urban environment, meaning that alternatives, such as handheld infrared imagers, need to be explored. These issues together pose a major obstacle for any kind of larger-area urban application.
Several of the above questions and concerns arise from the flexibility to select different image formats per camera. For example, FLIR, a prominent thermal camera manufacturer, offers users the choice between an 8-bit radiometric jpg (rjpg) and a 14-bit tag image file format (tiff). The 8-bit rjpg format, despite lower radiometric resolution, yields per-pixel temperature values based on user-input environmental and emissivity parameters. The rjpg image format is easy to use in the FLIR software. However, challenges emerge when collectively analyzing large sets of images, particularly those acquired over a large area via UAVs, and when deciding how to stitch them together. In contrast, the 14-bit tiff format can be utilized with third-party software and integrated into larger images. Nevertheless, it presents raw digital numbers for pixel values, disregards user-input parameters, and necessitates independent conversion to temperature, requiring users to account for factors such as emissivity and environmental parameters. Given the significant tradeoffs between the two image types, it is unclear how comparable their accuracy is. To our knowledge, UAV thermal research has not compared the two options. Moreover, in studies utilizing the tiff format, researchers either devised their own models using a blackbody instrument or other ground-truth data or employed the single model provided by FLIR [19,21,25]. There is a lack of consensus on the relative accuracy of these approaches.
Therefore, due to the uncertainties in each of these areas, this study had multiple objectives (Figure 1). The three main objectives of this research were to (1) test the accuracy of the FLIR Vue Pro R camera as affixed to a UAV, (2) demonstrate UAV application in an urban environment, and more specifically, the potential for UAV thermal application in an urban environment, and (3) perform an initial analysis of surface temperatures in two neighborhoods in Athens, Georgia, comparing ground readings across different surface types with the Vue Pro R readings from the UAV and comparing temperatures of urban surface types in shaded and unshaded conditions. The results from each of these three overarching goals will address technical knowledge and experience gaps regarding larger-scale, fine-resolution thermal data collection from UAV applications in an urban environment.

Materials
The total materials included three thermal tiles, a 13 mm FLIR Vue Pro R 640 thermal camera (FLIR, Middletown, NY, USA), a handheld FLIR E6-XT infrared imaging camera (FLIR, Middletown, NY, USA), and an Autel Robotics EVO II Dual (Autel Robotics, Bothwell, WA, USA). The three thermal tiles were each composed of a 60 cm by 60 cm aluminum sheet, each 2.2 cm thick. Four thin-film platinum resistance temperature detectors (RTDs) with three conductors were affixed underneath the tiles to give accurate temperature readings (these sensors are classified as type A with an accuracy of ±0.15 °C), which were recorded on an SD card every few seconds. To avoid heat interference from the ground, the tiles were elevated above the ground via a PVC pipe frame, and a 25 mm layer of expanded polystyrene insulation foam rested between the PVC pipe frame and the aluminum plate (with the four sensors in between the foam and plate). Each tile was painted with a matte paint in either white, gray, or black to account for different absorptive properties. Additional details on the tile construction process can be found in Lacerda et al. (2022) [22]. To collect other ground-truth temperature data, the FLIR E6-XT handheld was set to record two images simultaneously: both the thermal image in the rjpg file format and the standard RGB image, which would aid in surface type interpretation alongside the thermal image. Specifications for both thermal cameras are given in Table 1. The internal temperature calibration configuration process for FLIR is proprietary. Publicly available information regarding the FLIR Vue Pro R camera is that it performs internal calibration using an internal shutter for about 1 s, which occurs automatically based on internal camera parameters. The camera's documentation does not refer to a stabilization period, and FLIR technical support claims that one is not required. However, there is a stabilization period of 5 min for the handheld E6-XT camera.

Overview
In order to determine the best flying height, thermal camera file type settings, and the accuracy of the FLIR UAV and handheld cameras compared to the most accurate temperature values from the thermal tiles, we conducted an initial test on an open, grassy field at the Intramural Fields on the University of Georgia campus on 19 August 2021, when the sky was clear and the wind was still at 0 m/s. Subsequent measurements in residential neighborhoods were planned for days exhibiting comparable meteorological conditions, as detailed in Table 2. As seen in Figure 1, Objectives 1a-c, our goals for the initial test were to: (1) determine the accuracy of both the handheld and UAV FLIR thermal imaging cameras as compared to the tile readings, (2) determine whether using rjpg or tiff files in the thermal camera provided more accurate temperature data, and (3) determine the optimal flying height at which we could obtain the maximum area while still reading the correct temperature of the 60 cm × 60 cm tile pixels. The results from this initial testing phase were used to inform decisions when flying in the neighborhoods.
Two total flights were conducted over the three tiles and two grass locations on either side of the tiles (Figure 2). The thermal camera was set to record rjpg files during the first flight and to record tiff files during the second flight. Per flight, six different height levels were investigated at 30.5 m (100 ft), 45.7 m (150 ft), 50.3 m (165 ft), 53.3 m (175 ft), 56.4 m (185 ft), and 61.0 m (200 ft). At each flight level, three of the clearest images were chosen that represented the full duration the UAV hovered at that height (i.e., one image was chosen near the beginning of the time the UAV moved to that height, then one in the middle of the time, and one more from near the end before it transitioned to the next height). This resulted in 180 total observations (5 ground-truth objects × 2 flights × 6 height levels × 3 images per height), or 36 observations per ground-truth object. At each height level, tile readings were recorded on an SD card, and the handheld FLIR camera was used to obtain independent readings of the top of each tile, plus readings of the grass on either side of the tiles. This totaled 60 handheld observations (5 ground-truth objects × 2 flights × 6 height levels).
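The sampling design above reduces to a few lines of arithmetic. The following is a minimal sketch that recomputes the metric flight heights and the observation counts from the quantities stated in the text; the variable names are illustrative and not taken from the study.

```python
# Minimal sketch of the sampling design; variable names are illustrative.
FT_TO_M = 0.3048  # exact definition of the international foot

heights_ft = [100, 150, 165, 175, 185, 200]
heights_m = [round(h * FT_TO_M, 1) for h in heights_ft]

n_objects = 5            # three tiles plus two grass locations
n_flights = 2            # one rjpg flight and one tiff flight
n_images_per_height = 3  # beginning, middle, and end of each hover

# UAV observations: objects x flights x height levels x images per height
uav_observations = n_objects * n_flights * len(heights_ft) * n_images_per_height
per_object = uav_observations // n_objects

# Handheld observations: one reading per object, flight, and height level
handheld_observations = n_objects * n_flights * len(heights_ft)

print(heights_m)
print(uav_observations, per_object, handheld_observations)
```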

Test Handheld Accuracy against Ground Data (Objective 1a)
After initial data collection, tile and handheld camera readings were compared to determine the handheld camera's temperature accuracy. All 60 images from the FLIR E6-XT handheld were in rjpg format, which allowed for analysis in the FLIR Thermal Studio software (version 1.9.40.0). Pixels comprising the center of each tile were averaged in the software. This average value represented the handheld temperature reading for the tile. Recorded tile temperatures were averaged around the time the handheld image was taken. To compare handheld readings versus tile readings, we examined (1) Pearson's correlation coefficient test results, (2) t-test results, and (3) Tukey's test results.
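As an illustration of the first two comparisons, the sketch below computes Pearson's r and a paired t statistic in plain Python. The temperature values are hypothetical and are not the study's data. Converting the t statistic to a p-value requires the t distribution (from a statistical table or a statistics library), which is deliberately left out here.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def paired_t(x, y):
    """Paired t statistic and degrees of freedom for the differences x - y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    sd = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
    return mean_d / (sd / math.sqrt(n)), n - 1

# Hypothetical readings in °C (NOT the study's data)
tile = [30.0, 35.0, 40.0, 45.0]
handheld = [29.5, 34.2, 39.6, 44.1]

r = pearson_r(tile, handheld)
t_stat, df = paired_t(tile, handheld)
print(round(r, 4), round(t_stat, 2), df)
```

In this toy example the two series are almost perfectly correlated, yet the handheld is consistently lower, so the paired t statistic is large. A high correlation and a systematic offset can coexist, which is why the two tests answer different questions.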

Testing the FLIR Camera on a UAV against Accurate Ground Data (Objectives 1b and 1c)
To determine whether the rjpg or tiff images from the FLIR Vue Pro R were more accurate, we first averaged the rjpg format's central tile values inside Thermal Studio, as was carried out for the handheld rjpg images. For the tiff files, we tested two methods.
First, we used the formula supplied by FLIR to transform the pixel values to radiant temperature (°C) (Equation (1)). Second, we created our own model for comparison against the FLIR model. We used linear regression to model the relationship between the recorded tile readings for all three tiles and the tiff pixel values. We also integrated grass temperature readings from the FLIR handheld after establishing that its readings were well-correlated with the tile values. We evaluated the models' residuals and outlying values. We then applied the better-performing model to transform all pixels in the image to radiative temperature values.
Next, emissivity for different surface types was included to obtain skin temperature estimates. Skin temperatures were estimated with Equation (2), where T_rad is the radiative temperature, T_skin is the skin temperature, and ε is the emissivity per surface type [30]. Vegetation emissivity was used as listed in Table 3, whereas the emissivity of the tiles was estimated using black electrical tape with a known emissivity value of 0.95. Emissivity values were derived by measuring the temperature of the tape with the emissivity parameter set to 0.95. Then, while measuring the adjacent tile surface temperature, the emissivity was adjusted until the tile's temperature matched that of the tape. Temperatures from all three methods were compared against the tile readings, and the root-mean-square error (RMSE) was calculated for each.
To determine the optimal flying height, we measured temperatures from both the rjpg and tiff files over the center of each tile and compared these values to the actual tile readings, as discussed in Section 2.2.3. The RMSE was calculated for the three tile and two grass locations at each height level. We looked for the maximum flying height that still showed temperature readings with low RMSE values. Based on a later examination of the raw tiff digital numbers, we further ran Tukey's HSD test to investigate which height levels, if any, were considered significantly different.
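The tiff workflow above can be sketched as three small steps: a fixed linear conversion from digital number to temperature, a least-squares fit of an alternative linear model, and an emissivity correction. The sketch below rests on two assumptions that are not confirmed by this text: the default scale of 0.04 K per count with a −273.15 offset is a commonly cited convention for FLIR high-resolution linear output, and the emissivity relation T_rad = ε^(1/4) · T_skin (in kelvin) is a standard form. Neither is presented here as the paper's exact Equation (1) or (2), and all function names are hypothetical.

```python
def dn_to_celsius(dn, gain=0.04, offset=-273.15):
    """Linear DN -> °C conversion.
    ASSUMPTION: the defaults follow a commonly cited 0.04 K per count scale."""
    return gain * dn + offset

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return slope, mean_y - slope * mean_x

def skin_temperature_c(t_rad_c, emissivity):
    """ASSUMPTION: T_rad = emissivity**0.25 * T_skin, with both in kelvin."""
    t_rad_k = t_rad_c + 273.15
    return t_rad_k / emissivity ** 0.25 - 273.15

# Hypothetical digital numbers and ground temperatures (NOT the study's data)
dns = [7600, 7800, 8000]
ground_c = [30.0, 40.0, 50.0]

slope, intercept = fit_line(dns, ground_c)
t_rad = dn_to_celsius(7500)
t_skin = skin_temperature_c(t_rad, 0.95)
print(slope, intercept, t_rad, t_skin)
```

Under these assumptions, an emissivity below 1 always raises the skin temperature relative to the radiative temperature, which is a quick sanity check on the direction of the correction.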

Demonstrating UAV and Thermal Applications in Urban Environments for Urban Surface Analysis (Objectives 2 and 3)

Demonstrating General UAV Applications and Larger Scale (Street-Level) Thermal Applications in Urban Environments (Objectives 2a and 2b)
A 13 mm FLIR Vue Pro R camera was mounted to a drone. The camera was set to record an image every two seconds, which was then stored on an SD card. While the airspace above a residence might be considered public, respecting resident privacy and upholding strict safety standards are key to promoting any urban UAV data collection program. To respect private neighborhood space while avoiding flying over road traffic, flight lines were established inside a single block, following along the sidewalk, right-of-way area, or along heavily forested areas where there were no residences (Figure 3). Thus, each flight and its thermal data were collected on a per-block basis. Data collection times were set to afternoon hours during the hottest time of day. The days selected had no or minimal cloud cover. Table 2 provides weather details for each data collection day.

Figure 3. Flight lines followed parallel to the street, in between the road and properties. The star represents the takeoff and landing pad. The height level chosen was a balance between the best results for thermal data and at least 80% overlap.

General Urban Surface Temperature Comparison (Objective 3a)
In order to collect ground-truth data, the three 2′ × 2′ aluminum tiles were placed within view of the thermal camera for every flight. During each flight, additional ground temperature measurements were taken of different surfaces (concrete, asphalt, grass, pine straw, and mulch) using the FLIR E6-XT handheld. At the same time each ground temperature was being taken, a GPS point was recorded to pinpoint each ground temperature reading. These ground temperature measurements were used as ground data to compare to the drone thermal data. Environmental parameters, such as the air temperature and humidity, were input into the handheld for the day's conditions. The emissivity values of each surface type were input into FLIR Thermal Studio.

Data Processing and Analysis
UAV thermal images were stitched in Agisoft Metashape to produce one overall image per street sampled. In order to properly convert pixel values to temperatures, the FLIR-provided linear regression model, which was the better-performing model for the neighborhood tiff images, was applied (Equation (1)). Standard emissivity values were applied per material type (Table 3) using Equation (2). The accompanying RGB images from the Evo camera were also stitched to provide a high-resolution image of the street. The RGB images contained coordinates, which allowed for easier stitching, whereas the thermal images did not. Therefore, the stitched thermal image for each street was georeferenced to the RGB image. GPS points were examined for accurate placement over the thermal images and repositioned if necessary. Some points that were ultimately concealed by vegetation cover were marked as not visible and removed from the sample. The remaining points were used to extract the thermal image temperature data, and this was compared to the ground-truth data for that same point.
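The final extraction step can be illustrated with a minimal sketch that samples a georeferenced temperature grid at ground-point coordinates and skips points flagged as not visible. It assumes a north-up raster with square pixels and an origin at the top-left corner. The grid, the coordinates, and the field names (`x`, `y`, `visible`, `ground_c`) are hypothetical and do not describe the software used in the study.

```python
def world_to_pixel(x, y, origin_x, origin_y, pixel_size):
    """Map coordinates -> (row, col) for a north-up raster with square pixels."""
    col = int((x - origin_x) // pixel_size)
    row = int((origin_y - y) // pixel_size)  # rows increase southward
    return row, col

def sample_points(grid, points, origin_x, origin_y, pixel_size):
    """Pair each visible ground point with the raster value beneath it."""
    pairs = []
    for p in points:
        if not p["visible"]:  # e.g. concealed by vegetation cover
            continue
        row, col = world_to_pixel(p["x"], p["y"], origin_x, origin_y, pixel_size)
        if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
            pairs.append((p["ground_c"], grid[row][col]))
    return pairs

# Hypothetical 2 x 2 temperature raster (°C) with 1 m pixels, top-left at (0, 2)
grid = [[41.0, 43.5],
        [38.2, 52.0]]
points = [
    {"x": 0.5, "y": 1.5, "visible": True, "ground_c": 40.1},
    {"x": 1.5, "y": 0.5, "visible": True, "ground_c": 50.3},
    {"x": 0.5, "y": 0.5, "visible": False, "ground_c": 37.0},  # removed
]
pairs = sample_points(grid, points, origin_x=0.0, origin_y=2.0, pixel_size=1.0)
print(pairs)
```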

Handheld Accuracy (Objective 1a)
In order to collect additional ground-truth data in the neighborhoods, we first had to evaluate the handheld's accuracy by comparing the handheld readings to the accurate tile readings. Table 4 shows the average tile value compared to the FLIR handheld reading, along with the average measurement errors and RMSE for each color tile. The measurement error of each observation was calculated by subtracting the predicted handheld reading from the tile temperature recorded. Each of the 36 observations summarized in Table 4 is listed in Appendix A. The average absolute measurement error was highest for gray with 1.2 °C of underprediction, followed by white with 0.5 °C. Black tile readings were the most accurate, with a measurement error of only 0.3 °C. Figure 4 shows the correlation between the handheld and tile values, as well as the linear relationship based on tile color. Pearson's correlation coefficients showed the handheld measurements were highly linearly related with the tile readings at the 99% confidence level (r = 0.998, df = 34, p-value < 0.001). Tukey's HSD test of the average handheld readings compared to the corresponding average tile readings showed that there was no significant difference at the 95% confidence level (p-value = 0.228). However, a pairwise t-test at the 95% confidence level comparing the handheld and tile readings showed that the difference in means was significantly different from zero (df = 35, p < 0.001).
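The error measures reported here follow directly from the stated definition (tile reading minus handheld reading). A minimal sketch with hypothetical values is shown below; the function name and the numbers are illustrative only.

```python
import math

def error_summary(reference, predicted):
    """Mean error, mean absolute error, and RMSE of reference - predicted."""
    errors = [r - p for r, p in zip(reference, predicted)]
    n = len(errors)
    return {
        "mean_error": sum(errors) / n,  # positive means underprediction
        "mean_abs_error": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
    }

# Hypothetical tile and handheld readings in °C (NOT the study's data)
tile = [40.0, 42.0, 44.0]
handheld = [39.0, 41.0, 42.0]
summary = error_summary(tile, handheld)
print(summary)
```

With this sign convention, a positive mean error indicates that the handheld reads lower than the tile. RMSE weights large misses more heavily than the mean absolute error does.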

Vue Pro R Accuracy for All Three File Format Methods (Objectives 1b and 1c)
To determine the accuracy of the Vue Pro R camera as applied from a UAV in an urban environment, as well as to decide which image file type to use in the neighborhoods, Vue Pro R images from the UAV were compared to true tile readings. The UAV was flown twice: once for the rjpg file type and again for the tiff file type. For the rjpg flight (the first flight), the average UAV-based tile values were taken from the rjpg images using FLIR Thermal Studio. Environmental and emissivity parameters were input before the flight and could be further modified in FLIR Thermal Studio. For the tiff flight (the second flight), the average tile values were taken from the tiff image using two methods: the tiff digital numbers were converted to temperature values using either the FLIR formula provided or the linear regression model developed from the data. After applying either model, emissivity values were applied per surface type. The root-mean-square error (RMSE) was estimated for each of the three methods against the tile values, which were averaged across the time the UAV was flying at that particular height level.
Table 5 shows the temperature values for all three temperature image approaches along with their RMSE. For the two grass readings listed, the handheld was used as the ground truth. The overall RMSE values for the rjpg format, the tiff format with the FLIR model, and the tiff format with our model were 3.1 °C, 5.0 °C, and 2.6 °C, respectively. The RMSE for the rjpg method ranged from 0.1 to 7.9 °C, with an average of 3.1 °C. Thirteen out of the thirty (43.3%) RMSE averages were outside of the ±5 °C accuracy specified by FLIR [33]. The highest standard deviation was seen at the 150 ft level. The RMSE for the tiff using the FLIR formula ranged more than the rjpg, from 0.5 to 16.0 °C, with a higher average RMSE of 7.2 °C. Eighteen out of the thirty (60%) RMSE values were outside of the ±5 °C accuracy. The highest standard deviation was seen at the 200 ft height. The RMSE for the tiff using the linear regression model ranged from 0.1 to 8.0 °C, with an average of 3.3 °C. Only seven of the thirty (23.3%) RMSE values were outside the ±5 °C accuracy. As with the other tiff approach, the highest standard deviation was seen at the 200 ft height. In the majority of the observations, the lowest RMSE belonged either to the tiff linear model or the rjpg approach, not the tiff FLIR formula approach.
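The share of values falling outside the ±5 °C specification is a simple count. The sketch below shows one way to compute it and also recomputes the percentages quoted above from the stated counts out of thirty. The example list is hypothetical and is not taken from Table 5.

```python
def share_outside(errors_c, tolerance_c=5.0):
    """Return (count, fraction) of values whose magnitude exceeds the tolerance."""
    outside = sum(1 for e in errors_c if abs(e) > tolerance_c)
    return outside, outside / len(errors_c)

# Hypothetical RMSE values in °C (NOT taken from Table 5)
example_rmse = [0.4, 2.1, 5.6, 7.9, 3.3, 4.9]
count, fraction = share_outside(example_rmse)

# Percentages recomputed from the counts stated in the text (out of 30)
stated_counts = {"rjpg": 13, "tiff_flir": 18, "tiff_regression": 7}
percentages = {name: round(100 * k / 30, 1) for name, k in stated_counts.items()}
print(count, fraction, percentages)
```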
Both the tiff FLIR formula and rjpg approaches showed the highest error for the two grass measurements across height levels, with one exception of a tiff FLIR formula value at a 200 ft height level. Among the three tiles, both approaches had the highest error for the white (coolest) tile, again with the exception of one tiff FLIR formula value at 200 ft. With few exceptions, both approaches tended to have the lowest error for the black (hottest) tile. For most observations, the error was due to overestimation of the actual ground temperature by both methods.
Results from the linear regression model overall varied from the other two approaches. Figure 5b illustrates the resulting linear regression model developed from the raw tiff digital numbers and observed ground temperatures. In comparison to the FLIR formula, the initial model was 0.04963x − 354.3, where x is the raw tiff digital number. Notably, there is a spread in the model's predicted temperature values despite similar or only slightly increased observed ground temperature values. The raw tiff digital numbers show the same trend, revealing it to be an underlying pattern and not a result of either tiff prediction formula (Figure 5a). Upon investigation, the tiff values and subsequent predicted temperatures were found to increase directly with every height level. Thus, the 100 ft level had the lowest tiff values and predicted temperatures, with each succeeding height showing an increase through 185 ft. This pattern was broken at the 200 ft height level, which had even lower values than those at 100 ft. Inspection of the residuals and outlying points revealed that the points with the highest error were all from the 200 ft level. Thus, all points from the 200 ft height were removed to create the final model (n = 75), which was 0.049x − 345.7.
When performing simple pairwise t-tests with Bonferroni correction, each of the three method types, compared to the true ground temperature, had a non-significant p-value (p > 0.10).Table 5. Summary of average values for ground-truth temperature versus average readings from the three methods using UAV thermal imagery (n = 270 UAV images total, with three images selected and averaged per image method type, per altitude).The three methods were: (1) using the tiff file type and applying the FLIR-supplied formula, (2) using the tiff file type and applying the linear regression model, and (3) using the RJPG file type, which gives the temperature per pixel directly in FLIR Thermal Studio software.For the grass surfaces, handheld readings were used as the ground-truth data.Standard deviations are included alongside averages to show variation among readings per flight altitude.lowest tiff values and predicted temperatures, with each succeeding height showing an increase through 185 ft.This pattern was broken at the 200 ft height level, which had even lower values than those at 100 ft.Inspection of the residuals and outlying points revealed that the points with the highest error were all from the 200 ft level.Thus, all points from the 200 ft height were removed to create the final model (n = 75), which was 0.049-345.7.


Optimal Flying Height (Objective 1b)
For the tiff digital numbers, Tukey's HSD test revealed a significant difference between height levels (df = 89, p-value < 0.001). Specifically, there was a significant difference between all height levels that differed from each other by more than 4.6 m (15 ft). The 30.5 m (100 ft) and 61.0 m (200 ft) height levels were significantly different from all other height levels but not from each other. Other pairs of height levels (in m) that differed significantly were 45.7-53.3, 45.7-56.4, and 56.4-50.3. In contrast, there was no significant difference between height levels for rjpg values (df = 89, p-value > 0.10). Figure 6 shows the progression of temperature values across all height levels and their estimated values from both tiff image sets and rjpg images. As seen in Table 5, values from both tiff approaches at 200 ft dropped sharply relative to the actual ground temperature, whereas the rjpg values did not.
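The text reports height levels in metres, while the figures and tables label them in feet. As a small aid for matching the two, the following Python sketch converts the five tested flight heights; the variable names are illustrative.

```python
FT_TO_M = 0.3048  # exact definition of the international foot

# Flight heights tested in the study, in feet.
heights_ft = [100, 150, 175, 185, 200]

# Metric equivalents rounded to one decimal place, as used in the text.
heights_m = {ft: round(ft * FT_TO_M, 1) for ft in heights_ft}

for ft, m in heights_m.items():
    print(f"{ft} ft = {m} m")
```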


Urban Surface Thermal Analysis (Objective 3)
Although the linear model developed from the ground-truth data in the initial testing phase produced the lowest RMSE, its application to the neighborhood images showed higher errors than when applying the FLIR-provided model. Further analysis showed that the FLIR model predicted the hotter manmade surfaces more accurately than the field-developed model. Therefore, the FLIR model was applied for the neighborhood images.
For most surfaces, both shaded and non-shaded, the average temperature prediction from the camera fell within the specified ±5 °C (Table 6). The three exceptions were for averages of shaded grass on two different days and for shaded asphalt on one day. Surfaces with RMSEs above 5 °C included both shaded and non-shaded grass, non-shaded pine straw, and shaded asphalt. In contrast, the hottest manmade surfaces (not including pine straw) showed the lowest average measurement errors and RMSEs. Following the trend evidenced in the initial testing phase, the vegetated and shaded surfaces (i.e., the cooler surfaces) showed higher measurement errors, up to an average 8.7 °C measurement error and 9.5 °C RMSE for shaded grass on the second day. While overall averages for cooler surfaces appeared to be more consistently inaccurate, the hotter, non-shaded surfaces showed a wider range in error. Based on the measurement error average (with direction included), the camera tended to overpredict the temperature relative to the handheld in every case but four, in which the hottest surfaces of asphalt, pine straw, and concrete were slightly underpredicted.
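The two error metrics reported in Table 6 can be sketched in a few lines of Python. The sketch follows the Table 6 definition of measurement error (handheld minus UAV prediction), so a negative mean error means that the UAV camera overpredicts relative to the handheld. The function names and the sample readings are hypothetical and are not data from the study.

```python
from math import sqrt


def signed_errors(handheld, uav):
    """Handheld minus UAV prediction; negative values mean the UAV overpredicts."""
    return [h - u for h, u in zip(handheld, uav)]


def mean_error(handheld, uav):
    """Average measurement error with direction included."""
    errors = signed_errors(handheld, uav)
    return sum(errors) / len(errors)


def rmse(handheld, uav):
    """RMSE with handheld values as observations and UAV values as predictions."""
    errors = signed_errors(handheld, uav)
    return sqrt(sum(e ** 2 for e in errors) / len(errors))


def within_tolerance(handheld, uav, tolerance_c=5.0):
    """Check the average error against the +/-5 deg C specification."""
    return abs(mean_error(handheld, uav)) <= tolerance_c


# Hypothetical readings in deg C for one surface type (not study data).
handheld_c = [30.0, 31.0, 29.0]
uav_c = [33.0, 35.0, 31.0]
```

Reporting both metrics is useful because a small mean error can hide large individual errors of opposite sign, whereas the RMSE cannot.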

Manmade surfaces were, as expected, hotter than live, natural surfaces (i.e., the grass). Pine straw, a natural but dead material, exhibited high temperatures near the level of asphalt (Table 6). In order to examine additional surface types and to extend measurements past the areas publicly accessible by the handheld camera on the ground, 50 random points were created for each of the 10 streets/thermal images (n = 500), and the UAV camera temperature readings were extracted. Figure 7 shows the distribution of readings by surface type for all types that totaled more than 15 random sample points. The results continue to support the findings in Table 6, in which live natural surfaces, even when non-shaded, tend to be cooler than manmade surfaces. Trees in forested or clumped conditions showed cooler temperatures than trees in isolated conditions.
Table 6. Average shaded and non-shaded surface temperatures per neighborhood and day. Thermal image temperatures were predicted using the FLIR-supplied formula. Measurement error is calculated as the handheld temperature minus the UAV thermal image temperature prediction. RMSE is calculated using the handheld values as the actual observations and the FLIR model temperature predictions from the UAV imagery as the predicted observations.
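The random-point sampling described above (50 points in each of 10 images, with sparsely sampled surface types dropped before plotting) could be organized as in the following Python sketch. Everything other than the counts is an assumption made for illustration: the image dimensions, the surface labels and their proportions, and the temperatures are synthetic placeholders. In a real workflow the surface label would come from inspecting each point and the temperature from the converted thermal image.

```python
import random
from collections import defaultdict

POINTS_PER_IMAGE = 50   # random points per street/thermal image
N_IMAGES = 10           # streets/thermal images
MIN_POINTS = 15         # surface types must total more than this to be summarized

# Hypothetical surface labels and mix (placeholders, not study data).
SURFACE_TYPES = ["asphalt", "concrete", "grass", "rooftop", "tree", "pine straw", "water"]
WEIGHTS = [25, 15, 25, 15, 15, 4, 1]

rng = random.Random(42)  # fixed seed so the draw is repeatable


def sample_points(image_id, width_px=640, height_px=512):
    """Draw random pixel locations in one image, with placeholder attributes."""
    return [
        {
            "image": image_id,
            "col": rng.randrange(width_px),
            "row": rng.randrange(height_px),
            "surface": rng.choices(SURFACE_TYPES, weights=WEIGHTS)[0],
            "temp_c": rng.uniform(20.0, 60.0),
        }
        for _ in range(POINTS_PER_IMAGE)
    ]


points = [p for image_id in range(N_IMAGES) for p in sample_points(image_id)]

# Group temperatures by surface type, then keep only well-sampled types.
by_surface = defaultdict(list)
for point in points:
    by_surface[point["surface"]].append(point["temp_c"])

kept = {s: temps for s, temps in by_surface.items() if len(temps) > MIN_POINTS}
```

The lists in `kept` are the per-surface samples that a boxplot such as Figure 7 would summarize.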


Discussion
Several limitations of this study should be acknowledged and considered for future research. Emissivity values were accounted for based on general reported values. However, values can vary even within the same material type. In addition, the urban environment offers a multitude of surface types, such as a variety of rooftop materials and mulch or plant bedding types. An inclusive survey of different urban materials and their emissivity values could be valuable for future thermal research. Atmospheric correction is the other pre-processing method common to thermal data collected from the space- or airborne level. However, the UAV-based thermal literature seems to have largely avoided this step. Atmospheric correction seems to be applied only when the user has selected the rjpg setting, which accepts various environmental input parameters. As such, we likewise did not apply atmospheric correction for the tiff images. A further analysis would be exceedingly useful to examine the effects of the atmosphere at levels as low as the 30.5 m (100 ft) to 61 m (200 ft) range used in this study, specifically for thermal data. The field data acquisition was performed on sunny summer days, when sky conditions were mostly clear. If small clouds were present, the UAV was always flying below them. While we attempted to measure on days with similar meteorological conditions, factors such as air temperature and relative humidity will inevitably vary. Additionally, our study did not consider potential interference from radiation reflected by objects surrounding the target of interest.
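To illustrate why emissivity and reflected radiation matter, the following Python sketch uses a simplified broadband (Stefan-Boltzmann) radiance balance. This is an assumption chosen for illustration and is not the procedure used in this study: it ignores atmospheric absorption and emission, and camera software typically relies on band-specific calibration instead of a pure fourth-power law. The function name and parameters are hypothetical.

```python
KELVIN_OFFSET = 273.15


def correct_for_emissivity(apparent_c, emissivity, reflected_c):
    """Estimate surface temperature (deg C) from an apparent blackbody reading.

    Assumed balance, with temperatures in kelvin:
        T_apparent^4 = e * T_object^4 + (1 - e) * T_reflected^4
    Atmospheric effects are ignored in this sketch.
    """
    if not 0.0 < emissivity <= 1.0:
        raise ValueError("emissivity must be in (0, 1]")
    apparent_k = apparent_c + KELVIN_OFFSET
    reflected_k = reflected_c + KELVIN_OFFSET
    object_k4 = (apparent_k ** 4 - (1.0 - emissivity) * reflected_k ** 4) / emissivity
    return object_k4 ** 0.25 - KELVIN_OFFSET
```

Under this approximation, a surface with emissivity below 1 that reflects cooler surroundings is hotter than its apparent temperature, and the size of the correction grows as emissivity falls.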
Testing the handheld with the tiles showed mixed results. While Tukey's test showed no significant difference between each tile average and handheld average, a pairwise t-test showed a significant difference (p-value < 0.001). Because carrying and protecting cumbersome sensors such as the tiles in urban settings presents challenges, and because handhelds allow the extension and proliferation of many urban surface temperature measurements, we relied on the handheld measurements to compare against those from the camera. Initial testing with the tiles showed that the handheld measurements for the hottest surfaces, such as asphalt and pine straw, are likely highly accurate, whereas measurements for the coolest surfaces have higher error. This might contribute, in part, to the higher error from shaded surfaces, as seen in Table 6, yet the fact that the handheld tends to underpredict, in contrast to the camera, which overpredicts, may also explain some of the error.
Testing of the three image file type methods (rjpg, tiff with the FLIR formula, and tiff with the linear model) shows that, while the tiff format with the linear model tended to have the lowest RMSE at most height levels during initial testing, its application to the wider urban environment revealed higher error than when applying the FLIR-provided formula to the tiff values. However, our own linear model is similar to other published regression models, including that of Sangha et al. (2020), who also used a ground-truthing station with tiles [21]. It is likely that the FLIR-given formula was derived using a blackbody instrument, as was done by Sagan et al. (2019), whose own formula, derived with this approach, was very similar to the FLIR formula [19].
Initial testing showed that tiff image values varied across flight heights despite barely any accompanying change in temperature on the ground. This suggests that model development and application will be partially dependent on the height level flown, which should be considered well before model development and application. While our results showed a drastic change in temperature values at the 61 m (200 ft) level, the camera is advertised as capable of flying up to 12,192 m (40,000 ft), and the same camera has been used in other research at levels higher than 61 m [20,25,28,33]. In addition, the error of the FLIR formula and the rjpg method actually decreased at that height. We might surmise that the rjpg format and FLIR formula were developed for application at flight altitudes greater than 61 m, which would be consistent with the decreased RMSE for both at that height. However, other studies that flew at low altitudes have not reported such inaccuracy. Based on the results of this study, those hoping to apply the FLIR Vue Pro R camera in an urban setting will need to decide if flying higher than 61 m will still provide enough of the desired detail and resolution for their particular project. Small targets and their temperatures will be increasingly altered with greater height due to coarser resolutions and the interference of temperatures from surrounding objects. Future research could investigate accuracy up to the US FAA legal limit of 400 ft above ground level.
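The trade-off between flight height and spatial detail can be made concrete with a ground sampling distance (GSD) estimate for a nadir-pointing camera. In the following Python sketch, the 17 µm pixel pitch and the 13 mm focal length are assumptions used for illustration, since neither value is stated in this section; they should be replaced with the specifications of the camera and lens actually flown.

```python
FT_TO_M = 0.3048  # exact definition of the international foot


def ground_sampling_distance_m(height_m, pixel_pitch_um=17.0, focal_length_mm=13.0):
    """Ground footprint of one pixel (m) at nadir, by similar triangles.

    pixel_pitch_um and focal_length_mm are assumed defaults, not values
    taken from the study.
    """
    return height_m * (pixel_pitch_um * 1e-6) / (focal_length_mm * 1e-3)


# Compare the lowest and highest tested heights with the 400 ft legal limit.
for height_ft in (100, 200, 400):
    height_m = height_ft * FT_TO_M
    gsd_cm = ground_sampling_distance_m(height_m) * 100
    print(f"{height_ft} ft ({height_m:.1f} m): about {gsd_cm:.1f} cm per pixel")
```

Because GSD scales linearly with height, doubling the altitude doubles the pixel footprint, so a small target covers one quarter as many pixels and mixes more readily with its surroundings.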
As seen consistently across all three methods and in the final results, temperature accuracy ranged from least accurate for the coolest surfaces to most accurate for the hottest. Given that FLIR products are known largely for both military and manufacturing/factory applications, it follows that such sensors would be more accurate at higher temperatures because of their original application purpose. While the average RMSE and the absolute average error still remained within the ±5 °C specified by FLIR, individual temperature readings from the camera sometimes strayed outside of 5 °C. However, the increased error linked to cooler or shaded surfaces has received minimal attention in previous reports, with one notable exception. Our findings support those of Song et al. (2020), who found that vegetation readings exhibited a high RMSE of 8.2 °C. Furthermore, they found that the points with the highest errors were those cast in shadow, concluding that surface temperatures cannot be accurately detected by the FLIR Vue Pro R when the surface is shaded. Thus, urban or other applications that are interested in obtaining cooler temperatures and surfaces in shade, especially at flight altitudes lower than 61 m (200 ft), must recognize that such readings will be less accurate.
While urban surface results support widely known findings that manmade surfaces are hotter than natural, green surfaces, our findings also point to potential spatial configuration effects for trees. Specifically, trees that are clustered together are known to have cooler surrounding air temperatures due to evapotranspiration, which may have an effect on canopy temperature [15,17]. However, this may also be due in part to taller trees, which will cast some shading on smaller trees [25]. Monitoring the tree canopy's temperature will likely become more important as climate change progresses, since higher temperatures lead to a decreased and eventually totally halted photosynthetic rate and thus to other impaired processes, including cooling processes [34–36]. This situation might be more acute in urban areas, which are already experiencing higher temperatures.

Conclusions
This research sought to identify the accuracy of both the FLIR Vue Pro R 640 thermal UAV camera and the handheld FLIR E6-XT infrared imaging camera, whose application could then be demonstrated in an urban environment. Initial testing of the handheld against thermal tiles showed that the handheld tended to underpredict, with a maximum RMSE of 1.3 °C for the gray tile and a low RMSE of 0.1 °C for the black (hottest) tile. The results show that the handheld can be used as ground-truth data in an environment where the use of other accurate thermal ground truth is challenging. Initial testing of the UAV camera, its image types, and its image pixel conversion methods showed that the rjpg image type had the highest standard deviation at the 100 and 150 ft altitudes but had a lower RMSE than using the FLIR-supplied formula with the tiff image type. Both methods showed the same trend, in which the error was highest for the cooler surfaces (the white tile and both grasses) and lowest for the hottest surfaces (the gray and black tiles). However, the use of the tiff image with the ground-truth-created model showed lower RMSE values for both grass surfaces than the use of the rjpg image and the tiff image with the FLIR formula. The application of the model to urban surfaces in both neighborhoods showed greater error relative to the handheld ground-truth data than when using the FLIR-supplied model for the tiff image. The FLIR-supplied model similarly showed lower RMSE values for the hotter surface types than for the cooler surfaces in the neighborhoods. Our results suggest that flight altitude should be carefully considered when creating a model for pixel conversion for the tiff image and that, at least in the case of the other two methods, readings will range from higher to lower error along the cool-to-hot temperature gradient. Finally, the use of the UAV in urban neighborhoods demonstrates a method of urban data collection that balances privacy considerations with safety and practical applications.

Figure 1. The main 3 objectives related to UAV and thermal urban surface analysis, with technical objectives listed under each.

Figure 2. Initial testing of the FLIR Vue Pro R, FLIR handheld, and tiles.

Figure 3. Flight lines from one sampled street on top of the stitched RGB image from the EVO camera. Flight lines ran parallel to the street, in between the road and properties. The star represents the takeoff and landing pad. The height level chosen was a balance between the best results for thermal data and at least 80% overlap.

Figure 5. (a) Raw tiff digital numbers compared with observed ground temperature (°C), revealing a spread in digital numbers despite similar ground temperatures. (b) Linear regression model predictions compared with observed ground temperature (°C).

Figure 6. Progression of image temperatures across all height levels against their corresponding ground-truth measurements (tile and handheld values during either flight). Height levels are 100, 150, 175, 185, and 200 ft. Shaded rectangles from left to right represent the 5 sampled objects: white, gray, and black tiles, and grass 2 (cooler grass) and grass 1 (warmer grass).

Figure 7. Boxplot of UAV camera temperature readings extracted from 500 random points. Surface types with n < 15 were excluded. Dots represent outliers.

Table 1. Camera specifications for the UAV FLIR Vue Pro R 640 and FLIR E6-XT (handheld).

Table 2. Weather conditions for all data collection days in both neighborhoods in Athens, Georgia.

Table 3. Emissivity values per surface type.

Table 4. For objective 1a: testing the accuracy of handheld readings against tile readings (n = 36). This is a summary table of average readings in degrees Celsius per color tile. For the average measurement error with direction included, a positive sign indicates underestimation by the handheld.
