Archaeological Remote Sensing Using Multi-Temporal, Drone-Acquired Thermal and Near Infrared (NIR) Imagery: A Case Study at the Enfield Shaker Village, New Hampshire

While archaeologists have long understood that thermal and multi-spectral imagery can potentially reveal a wide range of ancient cultural landscape features, only recently have advances in drone and sensor technology enabled us to collect these data at sufficiently high spatial and temporal resolution for archaeological field settings. This paper presents results of a study at the Enfield Shaker Village, New Hampshire (USA), in which we collect a time-series of multi-spectral visible light, near-infrared (NIR), and thermal imagery in order to better understand the optimal contexts and environmental conditions for various sensors. We present new methods to remove noise from imagery and to combine multiple raster datasets in order to improve archaeological feature visibility. Analysis compares results of aerial imaging with ground-penetrating radar and magnetic gradiometry surveys, illustrating the complementary nature of these distinct remote sensing methods. Results demonstrate the value of high-resolution thermal and NIR imagery, as well as of multi-temporal image analysis, for the detection of archaeological features on and below the ground surface, offering an improved set of methods for the integration of these emerging technologies into archaeological field investigations.


Introduction
The potential of thermal and near infrared (NIR) imagery to reveal otherwise invisible archaeological features has been well known to researchers for several decades [1][2][3][4][5][6]. However, the cost of collecting thermal and multi-spectral NIR data has generally been prohibitive while the spatial resolution of aircraft or satellite-acquired imagery is generally too coarse to reveal many archaeological features. Recent advances in thermal and multi-spectral sensor technology, dramatic improvements in the sophistication of commercial drones, and the development of powerful mission planning and photogrammetric image processing software have collectively revolutionized our ability to deploy thermal and multi-spectral imagery as an archaeological remote sensing tool [7][8][9][10][11]. Enabling researchers to collect high-resolution imagery over vast areas at very low-cost, this suite of technologies offers a powerful complement to conventional terrestrial geophysics or aerial imagery analysis, and a potentially transformative approach to explore archaeological landscapes.

Fieldwork Methods
A major goal of this research is to help understand temperature fluctuations across the diurnal cycle, a critical factor affecting archaeological visibility [7,8]. In order to investigate the effects of

Fieldwork Methods
A major goal of this research is to help understand temperature fluctuations across the diurnal cycle, a critical factor affecting archaeological visibility [7,8]. In order to investigate the effects of diurnal temperature change, long-term temperature flux, soil moisture, and other related environmental variables on archaeological visibility in aerial thermal imagery, we undertook repeated drone surveys at the Enfield Shaker Village across a range of seasonal and temperature conditions, visiting the site on 7 different occasions to complete 15 aerial surveys (see Table 1). All mapping flights were programmed and run via a third-party application (Pix4d Capture) for autonomous survey grid missions, in order to collect image sets with pre-calculated overlap and crosslap for photogrammetric post-processing. Flights with the FLIR Vue Pro R mounted (see discussion below) mounted on either a 3DR Solo or DJI Phantom 4 pro, either covered just the small area in front of the "Great Stone Dwelling" (approximately 130 × 40 m) at 30 m elevation or both the lawn area and a portion of the field across the street (approximately 180 × 180 m) at 60 m. Since all flights were recorded at one of two altitudes, the ground sample distance of the thermal images remained fairly constant at either~4 cm (30 m) or~8 cm (60 m). Visible light images were also recorded using a DJI Phantom 3 and Tuffwing UAV fixed wing drone. The Phantom 3 flights covered only the area surveyed by the thermal and terrestrial survey, while the fixed wing drone was flown at a higher altitude (90 m) over a wide area in order to get a high resolution basemap of the entire village (approximately 700 × 300 m at 2 cm ground resolution). Multi-spectral flights were recorded with the Parrot Sequoia mounted to both the 3dr Solo and fixed wing drone. Flights with the Solo were at recorded at 50 m altitude for a ground resolution of 4.7 cm and on the fixed wing at 90 m for a ground resolution of 8 cm. In order to accurately georeference the image sets recorded by the different sensors, Ground Control Points (GCPs) were placed across the site before every flight and recorded using either a Total Station or a Real-time Kinematic Global Navigation Satellite System (RTK GNSS) survey instrument, in this case a pair of Emlid Reach RS units [14]. For visible and multi-spectral mapping these targets Remote Sens. 2020, 12, 690 5 of 21 consisted of printed paper targets that could be easily identified in the resulting imagery. For thermal data, the GCPs consisted of aluminum sheeting cut in the shape of an "X" which show up clearly as much cooler objects in the thermal imagery, due to differences in emissivity with the surrounding ground [8]. Additionally, some flights included post processed kinematic (PPK) image geotags for additional spatial accuracy checks [15].
Although thermal data were recorded via drone across several nights and many temperature conditions, each flight only gives a brief window into the temperature fluctuations on the ground. To provide an additional perspective on changing surface temperatures, we mounted the FLIR Vue Pro R thermal camera on a small tripod on a 6th story window of the "Great Stone Dwelling" at the site. The camera was pointed toward one of the known features (the former administrative building foundation) and set on intervalometer mode to record an image every 20 s. The intervalometer was run during a late fall afternoon, 16 October 2018, from 4:31 p.m. to 9:40 a.m., collecting thermal data across an entire night. The temperature when the recording began was 53 • F, sunset was at 6:04 p.m., and sunrise the following day was at 7:05 a.m. Even though images from a single viewpoint could not be processed photogrammetrically, the resulting 3000+ images were converted into a time lapse video in order to visualize how the visibility of archaeological features changed with fluctuations in temperature and relative humidity. We analyzed the resulting dataset in FIJI/ImageJ software to document changing temperature profiles for particular areas of interest, such as the boundary between the buried foundation, surface features, and the surrounding soil.

Drones
The last decade has seen drones move from a relatively obscure and experimental device to a nearly ubiquitous tool that forms an essential part of many archaeological projects' recording strategies [16][17][18]. Archaeologists commonly use drones as cost-effective tools for producing oblique aerial photography, spatially accurate 3D models, and orthoimagery of excavations, sites, and landscapes. Although most archaeological implementations of drones utilize traditional visible-light cameras, many UAV systems are easily adapted to alternative sensor packages. For this project, we used both a 3DR Solo quadcopter and DJI Phantom 4 Pro for thermal and multi-spectral imaging. The Flir Vue Pro R was mounted to the 3dr Solo via a custom gimble, capable of passing GPS location data from the flight control computer on the 3dr Solo to the Vue Pro R to be written as Exif data. On the Phantom 4 pro, the Flir Vue pro was attached to a custom 3d printed fixed mount, received GPS geotag data from an external GPS Geotagger, and was powered directly from the drone. Missions were programmed via the Pix4Dcapture app, which allows simple rectangular survey grids to be captured easily. In addition to surveys using the Vue Pro R, several flights were also performed with the Parrot Sequoia multi-spectral camera mounted via a fixed nadir mount. Visible light missions were flown with a DJI Phantom 3 as well as a Tuffwing UAVmapper fixed wing drone carrying a Sony a6000 camera with a 16 mm lens [15]. Survey flights were flown at 40 m, 60 m or 90 m depending on the size of the area to be covered by each flight.

Thermal Imaging
One of the main limitations on archaeological applications for aerial thermography in the past was the large size and high cost of thermal cameras, often requiring full-scale aircraft to take them aloft and producing low-resolution analogue imagery [1,2]. The development of small, lightweight, uncooled thermal cameras enabled them to be fitted to drones and flown closer to the ground and under a wide range of weather conditions [7]. In the most recent generation of thermal cameras, companies like FLIR have begun producing thermal imaging systems specifically designed for incorporation into drone platforms. In this study, all thermal surveys were accomplished using the FLIR Vue Pro R, a drone-optimized camera system that records radiometric 14-bit still images. The Vue Pro R uses an uncooled microbolometer with a resolution of 640 × 512 pixels and a spectral band of 7.5-13.5 µm. The camera is powered directly by the drone, can be mounted on 3rd party gimbals to enable stabilization, can receive and record GPS position information from the drone as Exif info on each thermal image, and can record radiometric images in 14 bit tiff format or proprietary radiometric JPG format directly to an onboard SD card.
A second key improvement seen in the latest generation of drone-optimized thermal cameras is in their ability to record full spectrum radiometric imagery. Previous generations of thermal cameras had built-in automatic gain control (AGC), such that the camera would evaluate the total range of thermal values present in each image, and then process the image using a lookup table (LUT) to output an 8-bit (256 value) image. This presented problems for archaeological applications for a number of reasons. First, photogrammetry software, like Photoscan Pro and Pix4D, struggle to align images if adjacent images have wildly different exposure values due to AGC fluctuations. Second, the presence of features with high or low relative temperature will cause AGC processing to mask any subtle variation between warmer and cooler soil temperatures that might indicate buried archaeological features. For instance, the presence of a warm asphalt road surface will drive up the total range of temperature values in an image and might force all of the pixels that contain soil around the asphalt to be rendered as the same 8-bit value. New radiometric thermal cameras like the FLIR Vue Pro-R used in this study instead record a 14-bit image (16,384 values), in which the pixel value of any point in the image will be independent of all of the other temperature values in the same image, ostensibly recording the true thermal radiance value of that point on the ground.
Radiometric thermal cameras, however, can still suffer from problems with thermal drift. Most currently available thermal cameras for drones utilize uncooled bolometer cores. The radiometric data recorded from such sensors tend to drift over time due to several factors, including temperature fluctuations of the sensor [19]. The effect of this drift is often not visible in non-radiometric camera data, as it is likely masked by the application of the automatic gain control to counteract this drift, the camera will automatically and periodically perform a "Flat-Field Correction (FFC)" in which a uniform temperature shutter is placed in front of the sensor, and new corrections are applied to the images. In practice, this means that the apparent temperature of the ground can drift slowly over the course of a flight and then change suddenly when the FFC events occur. It is possible to turn off the FFC events, to ensure that they do not create sudden changes in apparent temperature, but this will not eliminate the underlying problem of apparent temperature drift. One way to mitigate this problem is to let the camera adjust to prevailing temperatures for a significant length of time before recording data. Berni et al. [19], for instance, found that their older FLIR thermal camera would drift less if it was allowed to adjust to ambient temperatures for up to an hour after it was switched on. Unfortunately, though, it is likely impossible to keep the camera temperature from fluctuating in real fieldwork conditions, due to the effect of flying the camera at altitudes and temperatures different from those on the ground and placing the camera in the downwash from the drone propellers.
However, since the goal of archaeological thermal survey is to identify buried archaeological features by the local difference in surface temperature relative to the immediately surrounding soil, slow drift of the apparent surface temperature is more annoying than prohibitive. The actual temperature of the ground is less important than our ability to resolve subtle local temperature differences. While it is true that sufficient temperature drifts can tend to mask portions of a survey area, this drift can be removed in post-processing to keep the apparent general temperature of the ground even throughout the survey area (see below). Similarly, the emissivity of the ground must be established (and entered into the camera) for the observed temperatures to match the real-world temperature. Emissivity is likely to change across time as soil moisture content changes. If we wanted to ensure true temperature measurements, we would need to establish the emissivity of the ground for each drone observation by establishing the temperature of the ground and adjusting the emissivity parameter in the camera until the correct temperature was displayed. However, since we were primarily interested in relative measurements, and wanted to be as efficient in the field as possible, we did not measure ground temperatures directly, and utilized a consistent emissivity parameter meant to approximate dry ground (.90) for all flights.

Multi-Spectral NIR Imaging
Although the value for archaeological survey of near infrared imagery and near infrared-derived vegetation indices such as NDVI (Normalized Difference Vegetation Index), has long been known and applied to satellite imagery [20][21][22] a new generation of affordable, lightweight near infrared sensors designed for low elevation drone-based imaging have recently become available. While archaeologists wishing to record low-elevation, multi-spectral data had previously been limited to custom modified visible light cameras [23,24], recently developed specialized multi-spectral sensors like the MicaSense RedEdge series, the Parrot Sequoia, the Sentera Quad Sensor, the Mapir Kernel, and the Slantrange 3p can now record individual spectral bands as separate images and are specifically designed for integration with commercial drones. Cheaper commercial multi-spectral options exist as well, like the Mapir Survey 3 and Agrocam Geo NDVI, which are more similar to the older DIY options and record multiple spectral bands on a single RGB (red, green, blue) sensor that has been specially filtered to record narrower bands that include the near infrared.
In this study we use a Parrot Sequoia to record the survey area from 30, 60, and 90 m altitudes. The Sequoia cam records Green (530-570 nm), Red (640-680 nm), Red Edge (730-740 nm), and Near Infrared (770-810 nm) spectral bands as separate images, along with a traditional RGB sensor recording visible light images simultaneously with the individual spectral band images. We found the RGB sensor on the Parrot Sequoia is particularly unsuitable for photogrammetry, as it suffers from significant "rolling shutter" effects, but individual spectral band images can be used to create false color, NDVI or other image products effectively. The Sequoia also records information about the amount of light falling on the current scene via a sunshine sensor, written as Exif information on the images, and thus can be used to calibrate the resulting NDVI values.

Terrestrial Geophysics
Remote sensing via drone, using thermal and multi-spectral data, can cover a much wider area, more rapidly, than traditional terrestrial geophysical methods. However, as discussed below, there are significant constraints on where and when these methods might identify buried features. For this survey we wanted to be able to compare the visibility of known buried features in thermal and multi-spectral data to the visibility of the same features using more traditional terrestrial geophysics. Magnetometry was collected using a Bartington Grad 601-2 magnetic gradiometer in 20 m 2 grids with a 0.5 m zig-zag transect pattern at 8 samples per meter. These data were processed for display using the open source software ArchaeoFusion. Ground Penetrating Radar (GPR) data were collected using a GSSI UtilityScan system configured with a 350 MHz frequency digital antenna. Surveys were undertaken in the same 20 × 20 m grid system as magnetic data also using bi-directional survey at 0.5 m intervals, and data were processed using RADAN software from GSSI. Grids for the geophysical survey were laid out with the same tools used for GCP recording (a combination of total station and GNSS survey instruments) which ensured that the geophysical results could be accurately compared to the drone-derived data.

Post Processing
All drone flights produced sets of overlapping images of either thermal, multi-spectral, or visible light images. All image sets were combined with terrestrial GCP data, and processed using Photoscan Pro software to create orthophotos and digital elevation models (DEMs) that could be further manipulated in ESRI ArcGIS.
In an effort to eliminate the apparent temperature drift caused by the uncooled thermal sensor in the FLIR Vue Pro-R, we exported ortho-image mosaics from Photoscan Pro as geotiffs that preserved the raw radiometric data from the sensor (Figure 3: left). We then de-trend the data by applying a low-pass filter to remove the macro-scale drift values. In ArcGIS, we accomplished this by running the Focal Statistics tool to calculate the mean thermal value for a large area around each cell in the Remote Sens. 2020, 12, 690 8 of 21 raster of temperature values, producing a raster data set that shows the major temperature drift trend from one end of the survey to the other (Figure 3: center). This trend raster is then subtracted from the original raster resulting in a visualization that preserves local contrast while removing the apparent temperature drift (Figure 3: right). While some experimentation may be needed to identify the appropriate size of the "window" needed to capture mean temperature drift without capturing local temperature variation when applying the focal statistics tool, our approach created thermal ortho-image mosaics that could be more easily combined and analyzed quantitatively (see below).
All drone flights produced sets of overlapping images of either thermal, multi-spectral, or visible light images. All image sets were combined with terrestrial GCP data, and processed using Photoscan Pro software to create orthophotos and digital elevation models (DEMs) that could be further manipulated in ESRI ArcGIS.
In an effort to eliminate the apparent temperature drift caused by the uncooled thermal sensor in the FLIR Vue Pro-R, we exported ortho-image mosaics from Photoscan Pro as geotiffs that preserved the raw radiometric data from the sensor (Figure 3: left). We then de-trend the data by applying a low-pass filter to remove the macro-scale drift values. In ArcGIS, we accomplished this by running the Focal Statistics tool to calculate the mean thermal value for a large area around each cell in the raster of temperature values, producing a raster data set that shows the major temperature drift trend from one end of the survey to the other (Figure 3: center). This trend raster is then subtracted from the original raster resulting in a visualization that preserves local contrast while removing the apparent temperature drift (Figure 3: right). While some experimentation may be needed to identify the appropriate size of the "window" needed to capture mean temperature drift without capturing local temperature variation when applying the focal statistics tool, our approach created thermal ortho-image mosaics that could be more easily combined and analyzed quantitatively (see below). The multi-spectral data recorded by the Parrot Sequoia camera was post processed to produce NDVI data directly in Photoscan Pro. As of version 1.4, Photoscan Pro is capable of loading all four independent spectral bands as a single multi-band file, photogrammetrically processing all of the bands simultaneously, and then utilizing the sunshine sensor data to correct for the amount of light The multi-spectral data recorded by the Parrot Sequoia camera was post processed to produce NDVI data directly in Photoscan Pro. As of version 1.4, Photoscan Pro is capable of loading all four independent spectral bands as a single multi-band file, photogrammetrically processing all of the bands simultaneously, and then utilizing the sunshine sensor data to correct for the amount of light falling on the scene in order calculate accurate NDVI values. It is then possible to utilize the internal "raster calculator" in Photoscan Pro to generate and export rasters with NDVI values using the formula "NDVI = NIR-Red/NIR+Red". The resulting raster can be further processed using a custom LUT (look-up table) to enhance local contrast within the NDVI image in order to increase archaeological feature visibility.
Since the thermal and multi-spectral temporal data is well georeferenced via accurate GCPs, it is possible to combine many discrete surveys into a single multi-band raster that can help draw out and identify features that are subtly visible across different times and sensors. This is accomplished in ArcGIS by clipping all overlapping thermal and multi-spectral raster files to the same extents and then using the "composite bands" tool.

Thermal and Multi-Spectral Survey
After post processing, subsurface features are clearly visible in the thermal and multi-spectral results (Figures 4 and 5). In Figure 4, comparing a wide area of thermal data to an orthophoto overlaid by a historical map of the village, it is possible to see several rectilinear patterns in the thermal image that correspond with historic features, or relate to more recent structures. For instance, there is evidence for at least two rectilinear buildings in the front lawn of the main building ( Figure 4A,B) that have been confirmed by field school excavations on site. A previous iteration of the current driveway is visible to the south and north ends of the main lawn ( Figure 4C), an earlier path leading from one of the existing structures to the road is visible in the middle of the lawn ( Figure 4D) and portions of the building located across the modern road (NH 4a) are hinted at ( Figure 4E). Additional features are also visible in the NDVI data, including rectilinear features that correlate well with structures in the historic map on the west side of the road ( Figure 5A,B), and modern pipes ( Figure 5C).
When thermal data collection was repeated on different dates and at different times, there was significant variation in thermal visibility. In several survey sessions, thermal recording was repeated at several different times (see Table 1). Visibility was often highest soon after dusk until several hours after sunset. Figure 6 shows four orthophotos of the same portion of the site recorded at 8 p.m., 11 p.m., 4 a.m., and 9 a.m. on 11-12 May 2017. Visibility of features is relatively similar at 8 p.m. and 11 p.m. At 4 a.m., features are still visible, but contrast is muted. This is almost certainly due to the effects of dew which mask thermal differences. Over several nights of survey, the appearance of dew on the ground always resulted in lower visibility. Although Casana et al. [7] recommend increasing benefits from later surveys due to increasing temperature contrasts from differing thermal inertias of different materials, in many practical cases this seems to be limited by when the dew point is reached. When the sun comes up and is directly warming the ground, the visibility of patterns in the thermal data disappear rapidly. Almost none of the patterns remain visible by 9 a.m.
Remote Sens. 2020, 12, x FOR PEER REVIEW 10 of 24 falling on the scene in order calculate accurate NDVI values. It is then possible to utilize the internal "raster calculator" in Photoscan Pro to generate and export rasters with NDVI values using the formula "NDVI = NIR-Red/NIR+Red". The resulting raster can be further processed using a custom LUT (look-up table) to enhance local contrast within the NDVI image in order to increase archaeological feature visibility.
Since the thermal and multi-spectral temporal data is well georeferenced via accurate GCPs, it is possible to combine many discrete surveys into a single multi-band raster that can help draw out and identify features that are subtly visible across different times and sensors. This is accomplished in ArcGIS by clipping all overlapping thermal and multi-spectral raster files to the same extents and then using the "composite bands" tool.

Thermal and Multi-Spectral Survey
After post processing, subsurface features are clearly visible in the thermal and multi-spectral results (Figures 4 and 5). In Figure 4, comparing a wide area of thermal data to an orthophoto overlaid by a historical map of the village, it is possible to see several rectilinear patterns in the thermal image that correspond with historic features, or relate to more recent structures. For instance, there is evidence for at least two rectilinear buildings in the front lawn of the main building ( Figure 4 A      p.m. At 4 a.m., features are still visible, but contrast is muted. This is almost certainly due to the effects of dew which mask thermal differences. Over several nights of survey, the appearance of dew on the ground always resulted in lower visibility. Although Casana et al. [7] recommend increasing benefits from later surveys due to increasing temperature contrasts from differing thermal inertias of different materials, in many practical cases this seems to be limited by when the dew point is reached. When the sun comes up and is directly warming the ground, the visibility of patterns in the thermal data disappear rapidly. Almost none of the patterns remain visible by 9 a.m.

Combination Processing
Kvamme [25] has suggested that data fusion, combining multiple data sets into a single interrogable composite, rather than side by side comparison of multiple data sets, is a powerful approach to remote sensing data that remains underutilized by archaeologists. One major benefit of multiple georeferenced aerial surveys, across multiple dates and times, and multiple sensors (thermal and multi-spectral) is the production of many comparable raster data sets that can be combined and analyzed together. Figure 7 shows the result of creating a multi-band raster of 12 different thermal surveys as well as all four bands of the multi-spectral data for the main lawn. This multi-band raster is constructed from all the thermal flights listed in Table 1 except flights 14 and 16 (due to shadows and objects in the daylight imagery), and from the multi-spectral flight 11. At the top of Figure 7 are

Combination Processing
Kvamme [25] has suggested that data fusion, combining multiple data sets into a single interrogable composite, rather than side by side comparison of multiple data sets, is a powerful approach to remote sensing data that remains underutilized by archaeologists. One major benefit of multiple georeferenced aerial surveys, across multiple dates and times, and multiple sensors (thermal and multi-spectral) is the production of many comparable raster data sets that can be combined and analyzed together. Figure 7 shows the result of creating a multi-band raster of 12 different thermal surveys as well as all four bands of the multi-spectral data for the main lawn. This multi-band raster is constructed from all the thermal flights listed in Table 1 except flights 14 and 16 (due to shadows and objects in the daylight imagery), and from the multi-spectral flight 11. At the top of Figure 7 are three of the bands of this multi-band raster visualized as a false color RGB (consisting of the Green band (530-570 nm) from flight 11, thermal data from flight 17, and the near infrared band (770-810 nm) from flight 11. The bottom figure shows the result of the first three principle components of that multi-band raster also displayed as a false color RGB image. This pair of images helps improve contrast in several areas, and highlights several of the features visible across all of the data sets. For instance visibility is increased for an old curved roadbed on the left side of the lawn ( Figure 7A,G), two of the structures visible in the historic map ( Figure 7B,C,D,F), a former path across the lawn (Figure 7E), and a modern pipe channel running parallel to the modern road ( Figure 7H). Additionally, one of the best ways of detecting subtle thermal variation may be simply calculating the temperature difference between thermal images recorded at different times of day or different times of year [26,27]. This was attempted for several different pairs of thermal images of the site, but the best results, with the clearest details, seemed to come from the images taken within the same day. Figure 8 includes a difference map showing the data from the 11 May, 11 p.m. flight subtracted from the 11 May, 8 p.m. flight. This difference map increases contrast in some features, similar to the increased contrast in the multi-band and PCA processing (Figure 7). Improved contrast highlights the two foundations ( Figure 8C,D) and the old path across the lawn and pipe bed ( Figure 8A,B).
the best results, with the clearest details, seemed to come from the images taken within the same day. Figure 8 includes a difference map showing the data from the 11 May, 11 p.m. flight subtracted from the 11 May, 8 p.m. flight. This difference map increases contrast in some features, similar to the increased contrast in the multi-band and PCA processing (Figure 7). Improved contrast highlights the two foundations (Figure 8: C and D) and the old path across the lawn and pipe bed (Figure 8: A  and B).

Oblique Time-Lapse Recording
The time-lapse thermal recording of the lawn in front of the Great Stone Dwelling provides a fine-grained visualization of how temperatures fluctuate across a single night (see Supplemental File). Figure 9 shows four images from the time lapse from different times through the night, each

Oblique Time-Lapse Recording
The time-lapse thermal recording of the lawn in front of the Great Stone Dwelling provides a fine-grained visualization of how temperatures fluctuate across a single night (see Supplemental File). Figure 9 shows four images from the time lapse from different times through the night, each with an inset temperature profile of the same section of the lawn (outlined in yellow) centered over the foundation of the buried building. Just after sunset ( Figure 9A) the stones on the surface marking the foundation are much warmer than the surrounding ground, but there are some areas in the lawn with discernible variation in temperature. After a few hours ( Figure 9B) the stones on the surface have cooled significantly, there is increased contrast across the lawn, and some features are apparent. At this point the warmer area over the foundation is wider than the stones and the stones themselves have become cooler, as indicated on the thermal cross section. This is possibly reflecting the actual buried foundation or disturbance in the soil around the foundation. Between 1:45 a.m. and 2:30 a.m. dew forms across the lawn. All temperature contrasts are reduced ( Figure 9C) though the stones remain slightly warmer than the surrounding ground. Just after dawn ( Figure 9D) when the sun is shining directly on the ground, the lawn heats up rapidly, and the stones on the surface are colder than their surroundings, the inverse of the pattern during the night. However, there is little discernible temperature contrast across the lawn.

Terrestrial Geophysics
The results from the gradiometer and GPR surveys identify many of the same features visible in the thermal and multi-spectral data sets, and in many cases provide increased visibility of these features. Figure 10 shows the results of the gradiometer survey and Figure 11 shows three slices of GPR data from approximately 15, 30, and 50 cm below surface. Many features are visible in the aerial data are visible in one or both of these terrestrial data sets. The gradiometer data shows some features visible in the thermal data such as the buildings in the main lawn ( Figure 10 C and F), the modern

Terrestrial Geophysics
The results from the gradiometer and GPR surveys identify many of the same features visible in the thermal and multi-spectral data sets, and in many cases provide increased visibility of these Remote Sens. 2020, 12, 690 14 of 21 features. Figure 10 shows the results of the gradiometer survey and Figure 11 shows three slices of GPR data from approximately 15, 30, and 50 cm below surface. Many features are visible in the aerial data are visible in one or both of these terrestrial data sets. The gradiometer data shows some features visible in the thermal data such as the buildings in the main lawn ( Figure 10C,F), the modern irrigation pipe (Figure 10D), and some features not clearly visible in the multi-spectral or thermal data such as the large pipe on the west side of the road ( Figure 10A) and some of the historic foundations in this area ( Figure 10B,E). The GPR data shows some of the features visible in the thermal data, such as the building foundations ( Figure 11A,B) and the enigmatic round feature visible near the surface in the GPR and multi-spectral processing ( Figures 11C and 6 (top)) while providing more details about the depth and internal features of the foundations (Figure 11 center and right).

Discussion
Our results demonstrate that under favorable environmental conditions, multi-temporal aerial thermal and multi-spectral imaging can reveal a wide range of archaeological features on and below the ground surface. Some of these features, including two building foundations in the main lawn and a pipe paralleling the road are visible in all of our datasets, including GPR, magnetic gradiometry, multi-spectral and thermal imagery. In some cases, such as the historic pathway and the building foundations west of the road, features appear more clearly in aerial imagery than in terrestrial geophysical data. In other instances, such as the buried water pipe and the remains of metal artifacts and installations, magnetic gradiometry and GPR reveal features that are not evident in aerial imagery. Because each of these methods are measuring distinct physical phenomena, and the thermal and multi-spectral imagery are surficial investigations versus subsurface investigation, it is to be expected that certain types of features will be readily apparent in one dataset but may not appear in another. Figure 12 provides side-by-side comparison of a few features in the thermal, magnetic

Discussion
Our results demonstrate that under favorable environmental conditions, multi-temporal aerial thermal and multi-spectral imaging can reveal a wide range of archaeological features on and below the ground surface. Some of these features, including two building foundations in the main lawn and a pipe paralleling the road are visible in all of our datasets, including GPR, magnetic gradiometry, multi-spectral and thermal imagery. In some cases, such as the historic pathway and the building foundations west of the road, features appear more clearly in aerial imagery than in terrestrial geophysical data. In other instances, such as the buried water pipe and the remains of metal artifacts and installations, magnetic gradiometry and GPR reveal features that are not evident in aerial imagery. Because each of these methods are measuring distinct physical phenomena, and the thermal and multi-spectral imagery are surficial investigations versus subsurface investigation, it is to be expected that certain types of features will be readily apparent in one dataset but may not appear in another. Figure 12 provides side-by-side comparison of a few features in the thermal, magnetic gradiometry, and NDVI data sets. The top row of Figure 11 shows the buried foundation in the main lawn, labeled "B" in Figure 4. The middle row shows the foundation on the west side of the road labeled "A" in Figure 5. The bottom row of Figure 11 shows the buried pipe labeled A in Figure 9. These results reinforce the importance of understanding various remote sensing methods as complementary to one another. However, the time investment to record the site using these 4 technologies varies dramatically. In our study, a single, 20-min drone flight recorded 6 hectares, while GPR and gradiometer surveys require days of effort per hectare. Thus, with relatively little time investment, archaeologists conducting geophysical surveys can add a valuable dataset by using aerial thermal and multi-spectral imaging.
Remote Sens. 2020, 12, x FOR PEER REVIEW 18 of 24 gradiometry, and NDVI data sets. The top row of Figure 11 shows the buried foundation in the main lawn, labeled "B" in Figure 4. The middle row shows the foundation on the west side of the road labeled "A" in Figure 5. The bottom row of Figure 11 shows the buried pipe labeled A in Figure 9. These results reinforce the importance of understanding various remote sensing methods as complementary to one another. However, the time investment to record the site using these 4 technologies varies dramatically. In our study, a single, 20-min drone flight recorded 6 hectares, while GPR and gradiometer surveys require days of effort per hectare. Thus, with relatively little time investment, archaeologists conducting geophysical surveys can add a valuable dataset by using aerial thermal and multi-spectral imaging.  Results of our investigations at Enfield Shaker Village support previous studies' [7,8] conclusions that diurnal heat flux is one of the most significant variables affecting the visibility of archaeological features in aerial thermal images. Building foundations and other features are most evident in thermal images collected after dark, while the same features are virtually invisible in imagery collected during the day. We also demonstrate the importance of the dew point, as features visible in thermal imagery collected after dusk disappear when surface temperatures reach the dew point. NDVI imagery reveals many of the same building foundations and other features that are visible in thermal imagery, including the two foundations in the main lawn, the old paved segments of driveway, modern pipe features, and possibly additional historic foundations to the west, because in each case subsurface archaeological features are impacting vegetation health.
One of the most important questions about the efficacy of thermography for identifying buried features is the depth to which it can be effective. Visibility on the surface, of thermal contrasts reflecting buried features, is affected by several key factors including the difference in thermal inertia between archaeological features and the soil in which they are buried [3], the volumetric heat capacity and thermal conductivity of the features and the surrounding matrix, and thermal emissivity of the surface [8]. There is significant literature, from researchers studying terrestrial heat flow and climate change, on temperature fluctuations of soil temperature, depending on depth, and the effect of these key factors e.g., [28][29][30][31][32]. While much of this literature focuses on a scale that is too large for the fine-grained thermal variation that can reveal archaeological features, there is significant literature on the detection of shallowly buried mines via thermography [33][34][35][36][37][38]. While much of the recent work with aerial thermography for archaeology has focused on the importance of the short term, diurnal heating cycle for creating differences in surface temperatures due to differences in thermal inertia between buried archaeological features and surrounding soils [7][8][9][10], Scollar et al. [5] argue that the diurnal heating cycle is too transient to show the effect of differing thermal inertias for more deeply buried features. Instead, significant warming or cooling of the ground over a period of several days is more likely to allow deep differences in thermal inertia to be reflected on the surface. They suggest, then, that only when there are consistent shifts in temperature over several days will more deeply buried features be visible, and that otherwise the variation of soil temperature on the surface will only reflect very near surface features.
Directly measuring buried soil temperatures is one critical way of understanding how deeply diurnal and longer-term temperature variation is affecting and reflecting soil temperature at the surface and thus the potential depth of features that might be visible. Unfortunately, we did not have access to local soil temperature data at the site during these surveys, and this is something that we hope to include in future work. Instead, in order to try to get some handle on the daily and longer-term soil temperature changes at different depths in the area, we looked at published soil temperature, soil moisture, wind speed, relative humidity, air temperature and dew point data collected on an hourly basis at the nearby Mascoma River Site (approximately 19 km away) by the Natural Resource Conservation Service Soil Climate Analysis Network (SCAN, https://www.wcc.nrcs.usda.gov/scan/). This data is presented here as an imperfect proxy for directly observing soil temperatures on site. The air temperature and diurnal heating experienced by both sites is likely to be similar, given their proximity, though since the soils and moisture content at the two sites are likely different, this comparison should be taken with a grain of salt. Figure 13 shows soil temperature data from 5 different depths at Mascoma River, for the two weeks prior to thermal surveys conducted at Enfield Shaker Museum on May 11th and October 22nd, 2017, while Figure 14 shows air temperature and precipitation data from a local weather station for the same period. Although the data show that in May 2017 temperatures were relatively constant, in October 2017 temperatures cooled rapidly in the two weeks leading up to our survey. In the week leading up to 22 October, there was a sudden drop in daily air temperatures (Figure 14), and this is shown in the soil temp data (Figure 13) with the temperature at 6 inches below the surface dropping below the temperature at 40 inches for several days. This is exactly the kind of multi-day temperature transition that Scollar et al. [5] were describing. However, these differences do not seem to produce a marked effect on the visibility of archaeological features in thermal imagery at the site, suggesting that daily temperature variations are in fact the more important variable to consider when planning archaeological thermal surveys. The Mascoma River soil temperature data also illustrate the depth to which soil in this environment is affected by diurnal temperature changes. In both May and October, soil temperature is only measurably changing during diurnal cycles in the top 12-15" (30-38 cm) of soil, suggesting that, in this particular environment, this is the maximum depth to which thermal imagery is likely to reveal subsurface features. However, because thermal conductivity of soils and sediments can vary by a factor of 20, the effective depth at which archaeological features will be visible in an aerial thermal image will vary considerably depending on soil moisture and composition.
Remote Sens. 2020, 12, x FOR PEER REVIEW 20 of 24 temperature variations are in fact the more important variable to consider when planning archaeological thermal surveys. The Mascoma River soil temperature data also illustrate the depth to which soil in this environment is affected by diurnal temperature changes. In both May and October, soil temperature is only measurably changing during diurnal cycles in the top 12-15" (30-38 cm) of soil, suggesting that, in this particular environment, this is the maximum depth to which thermal imagery is likely to reveal subsurface features. However, because thermal conductivity of soils and sediments can vary by a factor of 20, the effective depth at which archaeological features will be visible in an aerial thermal image will vary considerably depending on soil moisture and composition.  Our results also highlight the critical importance of soil moisture in determining the visibility of buried archaeological features as well as for environmental data collection [39]. The high thermal inertia of water means that surface moisture, from either dew or recent rainfall, will likely obscure archaeological features that might otherwise be visible. On the other hand, surveys undertaken in extremely dry environments may show little contrast between archaeological features and the soils in which they are buried [8]. The Mascoma River data provides some useful proxy information about soil moisture, listed in Table 1. The moisture percentage in the first 6" of soil at Mascoma River is significantly lower during the fall surveys in October 2016 and 2017, which may help explain some of the differences in feature visibility between datasets. Directly measuring the soil moisture content on site before each thermal mapping session is likely to be an important workflow addition for predicting and tracking successful outcomes. Our results also highlight the critical importance of soil moisture in determining the visibility of buried archaeological features as well as for environmental data collection [39]. The high thermal inertia of water means that surface moisture, from either dew or recent rainfall, will likely obscure archaeological features that might otherwise be visible. On the other hand, surveys undertaken in extremely dry environments may show little contrast between archaeological features and the soils in which they are buried [8]. The Mascoma River data provides some useful proxy information about soil moisture, listed in Table 1. The moisture percentage in the first 6" of soil at Mascoma River is significantly lower during the fall surveys in October 2016 and 2017, which may help explain some of the differences in feature visibility between datasets. Directly measuring the soil moisture content on site before each thermal mapping session is likely to be an important workflow addition for predicting and tracking successful outcomes.

Conclusions
At the Enfield Shaker Village both thermal and multi-spectral surveys are able to quickly and efficiently resolve historic anthropogenic features, depending on environmental conditions and the diurnal cycle. With a combination of correct timing and post processing, thermal and multi-spectral imaging can reveal some of the same features detectable via traditional terrestrial geophysics but at a much larger scale. These technologies are rapidly becoming critical tools for archaeological remote sensing. While they may be more limited by environmental restrictions than terrestrial geophysics, their cost effectiveness, ease of use, and efficiency makes these attractive solutions for larger surveys. Limiting factors can be overcome by selecting appropriate dates and times for surveys based on environmental conditions, and hardware limitations can be addressed via post-processing. These tools can fit well into a larger research strategy that incorporates historical records, traditional terrestrial geophysical survey, and excavation for validation and ground truthing.

Conclusions
At the Enfield Shaker Village both thermal and multi-spectral surveys are able to quickly and efficiently resolve historic anthropogenic features, depending on environmental conditions and the diurnal cycle. With a combination of correct timing and post processing, thermal and multi-spectral imaging can reveal some of the same features detectable via traditional terrestrial geophysics but at a much larger scale. These technologies are rapidly becoming critical tools for archaeological remote sensing. While they may be more limited by environmental restrictions than terrestrial geophysics, their cost effectiveness, ease of use, and efficiency makes these attractive solutions for larger surveys. Limiting factors can be overcome by selecting appropriate dates and times for surveys based on environmental conditions, and hardware limitations can be addressed via post-processing. These tools can fit well into a larger research strategy that incorporates historical records, traditional terrestrial geophysical survey, and excavation for validation and ground truthing.