Multispectral Contrast of Archaeological Features: A Quantitative Evaluation

This study provides an evaluation of spectral responses of hollow ways in Upper Mesopotamia. Hollow ways were used for the transportation of animals, carts, and other moving agents for centuries. The aim is to show how the success of spectral indices varies in describing topologically simple features even in a seemingly homogeneous geographic unit. The variation is further highlighted under the changing precipitation regime. The methodology begins with an exploration of the relationship between the date of a multispectral scene and the visibility of hollow ways. The next step is to evaluate the impact of rainfall levels on numerous indices and to quantify spectral contrast. The contrast between a hollow way and its background is evaluated with Welch’s t-test and the association between precipitation regime and spectral responses of hollow ways are investigated with Correspondence Analysis and Fisher’s test. Results highlight an intrinsic relationship between the precipitation regime and the ways in which archaeological features reflects and/or emits electromagnetic energy. Next, the categorization of spectral indices based on different rainfall levels can be used as a guidance in future studies. Finally, the study suggests contrast becomes an even more fruitful concept as one moves from the spatial domain to the spectral domain.


Introduction
In remote sensing, the detection of an archaeological feature within its surrounding matrix depends on the level of contrast [1]; a target is easier to decipher-both visually and computationally-when it has higher contrast.Within the context of satellite remote sensing, contrast is the difference between the electromagnetic energy emitted and/or reflected from a feature (or its proxy) and the background energy.The level of difference as registered by the sensor determines whether the feature mapping and analysis will be successful.
Contrast is an underlying concept of all remote mapping projects.In particular, the increasing access to Very High Resolution (VHR) satellite imagery (with sub-meter resolutions in panchromatic bands) and the widespread use of online imagery viewing platforms make it easier to manually or (semi-)automatically detect potential targets [2].While doing so, researchers tend to focus mainly on spatial manifestations of spectral contrast: that is, the separation boundary between two internally homogeneous spectral sets.Yet, spectral data in and of themselves also contain useful information aside from location and shape.
To date, a wide range of spectral data analyses have been performed in order to identify the relation between vegetation classes and human occupation patterns [3], solve the spectral mixture problem [4], or investigate human-environment relations [5].In particular, detection and documentation of archaeological features have been an important research domain [6][7][8].However, aside from a few notable examples [9][10][11] the temporal conditions of spaceborne spectral data and how archaeological features react spectrally to these conditions (especially for changing climatic conditions) have not fully attracted the attention of archaeologists.
Satellite remote sensing analysis is most successful when the electromagnetic energy emitted and/or reflected from an archaeological feature is directly recorded by the sensor (i.e., direct contrast).However, proxy data may be used if direct detection is not possible, such as when the feature is buried [1].In fact, this approach has been extensively discussed in the literature on aerial remote sensing [12,13].For instance, wall foundations, ancient water channels, and compacted road surfaces can be detected by proxy by studying the vegetation growth both above and around them.In another example, a buried structure replaces the soil and limits the amount of nutrients available to the vegetation above it.Thus, the biomass above buried archaeological features becomes less healthy in comparison to vegetation with access to deeper soils.Likewise, differential deposition (e.g., an ancient ditch later in-filled with loose topsoil) may create variations in moisture-retention capacity such that the buried feature may be still visible as soil mark proxies.Exploiting these relations, researchers can use multispectral analysis of vegetation and/or bare soils in order to document both buried and unburied archaeological features [14,15].
Vegetation and soil characteristics and their energy reflections on the electromagnetic spectrum are variable.For instance, optimal conditions for measuring agricultural crop characteristics via satellite sensors are distorted when local producers variably observe and react to seasonal climatic variations, and this may result in canopy differences even at short distances.Also, even if the biomass is compatible, different crop-management strategies-such as non-uniform access to irrigation, manuring, weed control, row spacing, and field orientation-have an impact on the relations between vegetation and how it is measured with a multispectral satellite sensor [16][17][18].To normalize for such external factors, remote sensing scientists use spectral indices.These indices are dimensionless radiometric measures that are calculated using different combinations of portions of the electromagnetic spectrum.Traditionally, archaeological remote sensing studies favor the near-infrared (NIR) portion of the spectrum and, therefore, spectral indices based on NIR [19].Among many other indices, the Normalized Difference Vegetation Index (NDVI) has been widely adopted by archaeologists [20][21][22].
Building a spectral index is a goal-oriented task.For instance, the Soil-Adjusted Vegetation Index (SAVI) is an improvement over NDVI and is less sensitive to soil background conditions [23].The Automated Water Extraction Index (AWEI) aims to map surface water levels with high accuracy when there is considerable environmental noise [24].The index choice also depends on the available sensor spectral resolution (i.e., the width and number of spectral intervals in the electromagnetic spectrum to which a sensor is sensitive).A coarser spectral resolution, such as in Landsat, will only allow for the calculation of broad-band indices [25].Therefore, the decision of which index to use is inherently a negotiation between the physical characteristics of a target and available multispectral dataset(s).
This negotiation is also influenced by external conditions of the sensor-feature relationship.For instance, one of the variables in Earth Observation is air/surface temperature [26].Variations in temperature may have a direct impact on soil moisture content, evapotranspiration cycle, vapor pressure, and many other factors.In return, these factors may influence the target spectra [27].An ideal remote sensing study would integrate all the related climatic, geographic, and geological factors which might alter the ways in which targets emit/reflect electromagnetic energy, but in reality, this is an impossible task.Therefore, researchers tend to focus on a limited number of variables when performing spectral analysis, and especially when the spectra is investigated in time-series form.Among other determinants, such as, understanding the relationship between the precipitation regime and spectral data is a primary concern among remote sensing specialists [28][29][30][31][32]. Nevertheless, this methodological Upper Mesopotamia has a rich archaeological record dating from the onset of the Neolithic to Medieval times.This paper focuses on the second half of the Early Bronze Age (ca.2500-2300 BCE; EJ III), when urbanism and intensification of agricultural production shaped the Upper Mesopotamian landscape significantly.In turn, these two socioeconomic factors contributed to the creation of a unique landscape of movement in the form of a dense transportation network between Early Bronze Age sites as well as between settlements and their pasture lands.
First, urbanism introduced a compact way of organizing space.Closely built structures around existing settlement mounds created a new style of habitation.The concentration of population (and Upper Mesopotamia has a rich archaeological record dating from the onset of the Neolithic to Medieval times.This paper focuses on the second half of the Early Bronze Age (ca.2500-2300 BCE; EJ III), when urbanism and intensification of agricultural production shaped the Upper Mesopotamian landscape significantly.In turn, these two socioeconomic factors contributed to the creation of a unique landscape of movement in the form of a dense transportation network between Early Bronze Age sites as well as between settlements and their pasture lands.
First, urbanism introduced a compact way of organizing space.Closely built structures around existing settlement mounds created a new style of habitation.The concentration of population (and animals) resulted in higher movement fluxes from settlements to outer lands, exposing paths to more trampling and compression.Second, Upper Mesopotamian urbanism was mainly possible in regions where dry farming was practiced [33].Relatively reliable and high levels of precipitation were needed to support dependable agricultural production and, thus, the urban settlement system as a whole.In fact, production was intensified in such a way that rural tributary communities supported extensive populations of large central settlements [34] (pp.312-314).

Features of Spectral Interest
Hollow ways are the markers of intensified agricultural production during the Early Bronze Age (for a detailed analysis of structure and dating, see [35]).These linear features, with widths ranging from 30 to 100 m, radiated from settlements and terminated after an average distance of 2.5 to 3.0 km.In some other cases, they extended further into the landscape and connected settlements to one another.Wilkinson [27] shows that hollow ways were used for the controlled transportation of flocks from settlements to open pastureland.As a result, of continuous foot traffic from animals and humans, linear depressions were formed around settlements.This evidence for repetitive ancient movement still survives today and is visible on satellite imagery (Figure 2a) [36].A complete digital inventory of hollow ways can be viewed at the Harvard WorldMap project [37] and vectorized hollow-way features can be downloaded for local viewing and analysis in a Geographic Information Systems (GIS) software [38].
Remote Sens. 2019, 11, x FOR PEER REVIEW 4 of 25 animals) resulted in higher movement fluxes from settlements to outer lands, exposing paths to more trampling and compression.Second, Upper Mesopotamian urbanism was mainly possible in regions where dry farming was practiced [33].Relatively reliable and high levels of precipitation were needed to support dependable agricultural production and, thus, the urban settlement system as a whole.In fact, production was intensified in such a way that rural tributary communities supported extensive populations of large central settlements [34] (pp.312-314).

Features of Spectral Interest
Hollow ways are the markers of intensified agricultural production during the Early Bronze Age (for a detailed analysis of structure and dating, see [35]).These linear features, with widths ranging from 30 to 100 m, radiated from settlements and terminated after an average distance of 2.5 to 3.0 km.In some other cases, they extended further into the landscape and connected settlements to one another.Wilkinson [27] shows that hollow ways were used for the controlled transportation of flocks from settlements to open pastureland.As a result, of continuous foot traffic from animals and humans, linear depressions were formed around settlements.This evidence for repetitive ancient movement still survives today and is visible on satellite imagery (Figure 2a) [36].A complete digital inventory of hollow ways can be viewed at the Harvard WorldMap project [37] and vectorized hollow-way features can be downloaded for local viewing and analysis in a Geographic Information Systems (GIS) software [38].
This study is based on the axiom that hollow ways have detectable differential moistureretention capabilities and soil-compaction levels due to their use histories.Roads carrying more traffic must have had more soil compaction (and thus lower moisture-retention capacities) and vice versa.Variations in soil moisture have an immediate impact on reflectance values, as well as soilproductivity levels: vegetation growing above and around hollow ways appears differently in spectral data because of variable access to soil moisture as well as differential root growth due to the compaction of ancient road surfaces (Figure 2b).Therefore, hollow ways provide a unique opportunity not only for feature mapping, but also for evaluating levels of spectral contrast.This is because their formation and structure are well understood, thanks to previous archaeological research [39].Second, when preserved, even sensors with coarse spatial resolutions can decipher these features.Third, the linear nature of hollow ways means that they have limited topological-spectral complexity.Fourth, they are situated in a relatively flat and homogeneous terrain, minimizing topographic effects on spectral readings.Finally, there is This study is based on the axiom that hollow ways have detectable differential moisture-retention capabilities and soil-compaction levels due to their use histories.Roads carrying more traffic must have had more soil compaction (and thus lower moisture-retention capacities) and vice versa.Variations in soil moisture have an immediate impact on reflectance values, as well as soil-productivity levels: vegetation growing above and around hollow ways appears differently in spectral data because of variable access to soil moisture as well as differential root growth due to the compaction of ancient road surfaces (Figure 2b).
Therefore, hollow ways provide a unique opportunity not only for feature mapping, but also for evaluating levels of spectral contrast.This is because their formation and structure are well understood, thanks to previous archaeological research [39].Second, when preserved, even sensors with coarse spatial resolutions can decipher these features.Third, the linear nature of hollow ways means that they have limited topological-spectral complexity.Fourth, they are situated in a relatively flat and homogeneous terrain, minimizing topographic effects on spectral readings.Finally, there is an immediate relationship between hollow way soils and differential vegetation growth.This paper is an attempt to show how variable precipitation conditions result in a dynamic response of hollow ways on spectral data.
As alternative functional descriptions, some researchers have suggested that a hollow way results from a hydro-compaction process in which where the rupturing of fine soil material initiated the formation of shallow gullies [40].It has been also argued that these depressions were components of man-made drainage systems [41].Therefore, it is true that the slope of the ground can be an important determinant for the hollow ways' spectral reflectance characteristics, especially since the remnants of hollow ways may continue to facilitate surface run-off.Nevertheless, previous research also suggests that hollow ways did not necessarily follow the hydraulic grade [39,42], and so their function in carrying run-off water may be considered unlikely.A systematic evaluation of a high-resolution Digital Elevation Model may further resolve this issue.

Visibility Conditions
The first step in the methodology is to explore the association between the acquisition date of the satellite data and the spectral visibility of features without considering any prior background information.To illustrate this relationship, Tell Beydar and its wider hinterland (#1 in Figure 1), which are located in the heart of Upper Mesopotamia, are presented as a sub-case study.Hollow ways are not evident in the basaltic environment to the west of the site.This suggests that the underlying geology led to differential conditions of preservation, or perhaps that it played a role in influencing people's decisions about movement.The existence of a detailed map of hollow ways from Tell Beydar [34] (p.321) increases the reliability of the visibility analysis (Figure 3).an immediate relationship between hollow way soils and differential vegetation growth.This paper is an attempt to show how variable precipitation conditions result in a dynamic response of hollow ways on spectral data.
As alternative functional descriptions, some researchers have suggested that a hollow way results from a hydro-compaction process in which where the rupturing of fine soil material initiated the formation of shallow gullies [40].It has been also argued that these depressions were components of man-made drainage systems [41].Therefore, it is true that the slope of the ground can be an important determinant for the hollow ways' spectral reflectance characteristics, especially since the remnants of hollow ways may continue to facilitate surface run-off.Nevertheless, previous research also suggests that hollow ways did not necessarily follow the hydraulic grade [39,42], and so their function in carrying run-off water may be considered unlikely.A systematic evaluation of a highresolution Digital Elevation Model may further resolve this issue.

Visibility Conditions
The first step in the methodology is to explore the association between the acquisition date of the satellite data and the spectral visibility of features without considering any prior background information.To illustrate this relationship, Tell Beydar and its wider hinterland (#1 in Figure 1), which are located in the heart of Upper Mesopotamia, are presented as a sub-case study.Hollow ways are not evident in the basaltic environment to the west of the site.This suggests that the underlying geology led to differential conditions of preservation, or perhaps that it played a role in influencing people's decisions about movement.The existence of a detailed map of hollow ways from Tell Beydar [34] (p.321) increases the reliability of the visibility analysis (Figure 3).Hollow ways are documented with the help of the underlying CORONA imagery.Photogrammetrically corrected images were acquired from the CORONA Atlas & Referencing System (http://corona.cast.uark.edu/)and they are orthorectified using SRTM 90m Digital Elevation Model.Next, two commonly used spectral indices in archaeology, the Normalized Difference Vegetation Index (NDVI) and Normalized Difference Water Index (NDWI) were calculated monthly for the years between 2014 and 2017, and annually for the same timeframe.Landsat 8 equipped with the OLI-TIRS sensor (30 m spatial resolution) provided the necessary spectral bands for the operation.Atmospherically corrected reflectance data were acquired from the United States Geological Survey (USGS) EarthExplorer portal.After the visual check for cloud coverage and image quality, spectral indices were calculated using near-infrared (NIR), red, and green bands (see Table A1 in the Appendix A for their formulas).
Spectral indices are calculated for each acquisition date.Monthly values are estimated by a single value if there is only one acquisition in the month, or by an average where there are two or more acquisitions in the month.Finally, monthly indices were averaged for across four years (2014, 2015, 2016, and 2017).The result was two spectral time series (one for NDVI and one for NDWI) from January to December.
Next, a representative sample of 30 hollow ways in the Tell Beydar hinterland was randomly selected from the main dataset (Figure 4).NDVI and NDWI datasets were manually checked for these hollow ways as there was a reference set showing where hollow way existed.Next, the amount of the hollow way that could be seen is digitized for each monthly set of NDVI and NDWI.Finally, vector lengths were calculated for time-series comparison (Figure 5).Next, a representative sample of 30 hollow ways in the Tell Beydar hinterland was randomly selected from the main dataset (Figure 4).NDVI and NDWI datasets were manually checked for these hollow ways as there was a reference set showing where hollow way existed.Next, the amount of the hollow way that could be seen is digitized for each monthly set of NDVI and NDWI.Finally, vector lengths were calculated for time-series comparison (Figure 5).Based on the inferences from the time-series comparison, the focus of analyses shifted to May and November in order to investigate how precipitation impacts the spectral contrast of the features while maximizing the likelihood of discriminating the hollow ways in length (Section 2.2.3).Furthermore, other sample areas were integrated into the study in order to evaluate different geographic conditions (Figure 6).Finally, a large set of spectral indices were calculated to represent most of the electromagnetic spectrum, and how these indices perform under different precipitation levels was investigated.
Figure 5.The total length of digitized hollow ways is represented in the y-axis (in meters).The variation between months can be attributed to differential vegetation growth and changes in soil moisture.For this sample, the May average detected the greatest proportion of the full extents of hollow ways for NDVI (11,672 meters).For NDWI the November average maximized the documentation of hollow ways (13,317 meters).
Based on the inferences from the time-series comparison, the focus of analyses shifted to May and November in order to investigate how precipitation impacts the spectral contrast of the features while maximizing the likelihood of discriminating the hollow ways in length (Section 4.2).Furthermore, other sample areas were integrated into the study in order to evaluate different geographic conditions (Figure 6).Finally, a large set of spectral indices were calculated to represent most of the electromagnetic spectrum, and how these indices perform under different precipitation levels was investigated.
Figure 5.The total length of digitized hollow ways is represented in the y-axis (in meters).The variation between months can be attributed to differential vegetation growth and changes in soil moisture.For this sample, the May average detected the greatest proportion of the full extents of hollow ways for NDVI (11,672 m).For NDWI the November average maximized the documentation of hollow ways (13,317 m).
Figure 5.The total length of digitized hollow ways is represented in the y-axis (in meters).The variation between months can be attributed to differential vegetation growth and changes in soil moisture.For this sample, the May average detected the greatest proportion of the full extents of hollow ways for NDVI (11,672 meters).For NDWI the November average maximized the documentation of hollow ways (13,317 meters).
Based on the inferences from the time-series comparison, the focus of analyses shifted to May and November in order to investigate how precipitation impacts the spectral contrast of the features while maximizing the likelihood of discriminating the hollow ways in length (Section 4.2).Furthermore, other sample areas were integrated into the study in order to evaluate different geographic conditions (Figure 6).Finally, a large set of spectral indices were calculated to represent most of the electromagnetic spectrum, and how these indices perform under different precipitation levels was investigated.

Precipitation Dataset
The precipitation dataset was obtained for the years between 1981 and 2010 from the Global Precipitation Climatology Center (GPCC) at the Deutscher Wetterdienst (Schneider et al. 2011).For the best spatial representation, grids with 0.5 degrees separation were converted to monthly precipitation surfaces using natural neighborhood interpolation.The sample areas were then clipped from these raster layers.To better describe the regime, cumulative rainfall statistics from areas were calculated from December to May (cumulatively called May) and from June to November (cumulatively called November) and averaged for the region (Figure 7).This average accurately describes the precipitation regime of the study region (the eastern portion of Upper Mesopotamia), as sample areas have an even distribution.Finally, four statistics (minimum, low, high, and maximum) are selected to represent the time-series.
Landsat mosaic imagery from the USGS.

Precipitation Dataset
The precipitation dataset was obtained for the years between 1981 and 2010 from the Global Precipitation Climatology Center (GPCC) at the Deutscher Wetterdienst (Schneider et al. 2011).For the best spatial representation, grids with 0.5 degrees separation were converted to monthly precipitation surfaces using natural neighborhood interpolation.The sample areas were then clipped from these raster layers.To better describe the regime, cumulative rainfall statistics from areas were calculated from December to May (cumulatively called May) and from June to November (cumulatively called November) and averaged for the region (Figure 7).This average accurately describes the precipitation regime of the study region (the eastern portion of Upper Mesopotamia), as sample areas have an even distribution.Finally, four statistics (minimum, low, high, and maximum) are selected to represent the time-series.

Building Spectral Indices
Following the calculation of precipitation statistics in sample areas, 25 indices were generated (Table A1) [43,44].This inclusive stance allowed for the creation of many different algebraic combinations of bands representing the electromagnetic spectrum fully in an undiscriminating manner.Following an inductive approach, the initial aim is to look for patterns where seemingly discrete indices converge and explain the differences in spectral contrast between the hollow ways and their immediate surroundings.Also, a high degree of similarity between different bands can cause redundancies in spectral data analysis.However, as the first comprehensive study of the spectral index characterization of hollow ways distributed in a vast study area, the methodology

Building Spectral Indices
Following the calculation of precipitation statistics in sample areas, 25 indices were generated (Table A1) [43,44].This inclusive stance allowed for the creation of many different algebraic combinations of bands representing the electromagnetic spectrum fully in an undiscriminating manner.Following an inductive approach, the initial aim is to look for patterns where seemingly discrete indices converge and explain the differences in spectral contrast between the hollow ways and their immediate surroundings.Also, a high degree of similarity between different bands can cause redundancies in spectral data analysis.However, as the first comprehensive study of the spectral index characterization of hollow ways distributed in a vast study area, the methodology investigates potential minute differences between the indices.In fact, data reduction is particularly avoided, since the proposed comparison is not within different index values, but rather between index values and precipitation levels.
The spectral data were taken from NASA's Landsat mission due to its consistency, overall reliability, and length of records.The mission provides spectral data for the available range of precipitation time-series .Using a single system also minimizes compatibility issues between band designations in different satellite missions.The Landsat 4-5 Thematic Mapper (TM) duo was in operation from 1982 to 2013.Landsat 7 with Enhanced Thematic Mapper Plus (ETM+) was launched in 1999 and continues its mission.However, due to the failure of the Scan Line Collector (SLC) in 2003, scenes collected after this date have data gaps.For this reason, Landsat 7 was not used for analysis of dates following this year.All spectral bands (blue, green, red, near infrared, shortwave infrared 1, shortwave infrared 2) have a resolution of 30 m and were obtained from the USGS as atmospherically corrected surface reflectance products.
Like any other passive satellite system, the Landsat mission suffers from cloud coverage.Also, some scenes are not usable due to instrumental error.In other cases, hazy conditions or atmospheric dust reduce the image quality, and the error is propagated from the original spectral bands to calculated index products.Therefore, it is not always possible to use a specific date in satellite remote sensing analysis which is best fitting the needs of the research question, and this type of study is prone to these problems; that is, the timing of a precipitation statistic date did not always match with the date of a high-quality Landsat scene.
In the case of low spectral data quality, the next available Landsat scene was picked.For instance, cumulated November produced the most rains in 2006 (Figure 7), but the matching Landsat scene was of low quality.The next best month describing maximum precipitation was in 1987, but again, there was no operational Landsat scene available.Ultimately, the November collection from 1988 was used instead.Table 1 shows the available high-quality eight Landsat captures with minimum cloud coverage.These images were used to calculate the 25 spectral indices mentioned below.Landsat bands were fed into a custom-built Python 3.5 code in batch-processing mode.Six spectral bands (blue, green, red, near-infrared, short-wave infrared 1, and shortwave infrared 2) produced 25 broad-band spectral indices for each sample area and for each of the eight dates with minimum, low, high, and maximum precipitation in May and November.25 indices for seven sample areas and for eight precipitation dates resulted in 1400 different indices.As an intermediary quality-control step, indices were visually investigated following their production and low-quality datasets were removed from the workflow.
While May scenes produced high-quality spectral indices ready for statistical analysis, November scenes failed to produce operational data other than GDVI, MNLI, NDMI and WET.It is possible that the previous generation Landsat sensors (and their calibrations) were not able compensate for the atmospheric effects as much as the Landsat 8.This is exemplified in Figure 8, the OLI sensor can produce very high-quality NDVI datasets (as used in Section 2.2.1), whereas previous-generation Landsat sensors produce data that does not allow for a detailed spectral analysis of archaeological features.
Remote Sens. 2019, 11, x FOR PEER REVIEW 10 of 25 Landsat bands were fed into a custom-built Python 3.5 code in batch-processing mode.Six spectral bands (blue, green, red, near-infrared, short-wave infrared 1, and shortwave infrared 2) produced 25 broad-band spectral indices for each sample area and for each of the eight dates with minimum, low, high, and maximum precipitation in May and November.25 indices for seven sample areas and for eight precipitation dates resulted in 1400 different indices.As an intermediary qualitycontrol step, indices were visually investigated following their production and low-quality datasets were removed from the workflow.
While May scenes produced high-quality spectral indices ready for statistical analysis, November scenes failed to produce operational data other than GDVI, MNLI, NDMI and WET.It is possible that the previous generation Landsat sensors (and their calibrations) were not able compensate for the atmospheric effects as much as the Landsat 8.This is exemplified in Figure 8, the OLI sensor can produce very high-quality NDVI datasets (as used in Section 4.1), whereas previousgeneration Landsat sensors produce data that does not allow for a detailed spectral analysis of archaeological features.

Generation of Vector Data
A total of 56 Landsat sub-scenes were analyzed (eight points in time for each of the seven sample areas; see Table 1) and the visible hollow ways were manually digitized in each scene.For the digitization, only the reflectance bands (blue, green, red, near infrared, shortwave infrared 1, and shortwave infrared 2) were used as references.If the hollow way was cutting through distinctly homogeneous spectral groups (such as two agricultural fields one vegetated and one barren), only the largest portion falling within a single spectral group was digitized in order to minimize withingroup variance.Next, the vectorized features were divided into 10-m intervals and corresponding nodes were stored in a single point vector file.Finally, values from each of the 25 indices were recorded at sample locations (i.e., the nodes) and data-arrays were formed for statistical analysis.

Generation of Vector Data
A total of 56 Landsat sub-scenes were analyzed (eight points in time for each of the seven sample areas; see Table 1) and the visible hollow ways were manually digitized in each scene.For the digitization, only the reflectance bands (blue, green, red, near infrared, shortwave infrared 1, and shortwave infrared 2) were used as references.If the hollow way was cutting through distinctly homogeneous spectral groups (such as two agricultural fields one vegetated and one barren), only the largest portion falling within a single spectral group was digitized in order to minimize within-group variance.Next, the vectorized features were divided into 10-m intervals and corresponding nodes were stored in a single point vector file.Finally, values from each of the 25 indices were recorded at sample locations (i.e., the nodes) and data-arrays were formed for statistical analysis.
To determine the level of spectral contrast between a feature and its immediate background, 100-m-wide rectangular buffer zones were created around each hollow way in order to sample from their local backgrounds.A random sample was selected within each buffer zone, dynamically matching the sample size of the corresponding hollow way nodes.Focusing only on local backgrounds minimizes pedological complexity and decreases the potential impact of differential land-use practices in areas away from hollow ways (Figure 9).Spectral readings of background buffers and hollow ways were merged for each sample area.Next, data falling outside of 3 standard deviations were removed to reduce potential sensor errors and to maximize between-group variations between hollow ways and their corresponding backgrounds.
To statistically evaluate the level of spectral contrast for each index, Welch's t-test (assuming unequal variances) was used.The null hypothesis was that the spectral values of hollow ways and the spectral values of their immediate backgrounds have equal mean values (for α = 0.01), and thus, hollow ways are not separable spectrally.To determine the level of spectral contrast between a feature and its immediate background, 100m-wide rectangular buffer zones were created around each hollow way in order to sample from their local backgrounds.A random sample was selected within each buffer zone, dynamically matching the sample size of the corresponding hollow way nodes.Focusing only on local backgrounds minimizes pedological complexity and decreases the potential impact of differential land-use practices in areas away from hollow ways (Figure 9).readings of background buffers and hollow ways were merged for each sample area.Next, data falling outside of 3 standard deviations were removed to reduce potential sensor errors and to maximize between-group variations between hollow ways and their corresponding backgrounds.To statistically evaluate the level of spectral contrast for each index, Welch's t-test (assuming unequal variances) was used.The null hypothesis was that the spectral values of hollow ways and the spectral values of their immediate backgrounds have equal mean values (for α = 0.01), and thus, hollow ways are not separable spectrally.
The results of this test indicate which spectral indices provide statistically significant feature contrast at sample areas (Appendix B1).Non-significant results are still valuable for mapping purposes, as the hollow ways are sometimes visible to the naked eye (as spatial manifestations of subtle contrast differences).However, further quantitative analysis cannot be performed using these indices.Therefore, the following steps were followed only using spectral indices which provided significant levels of contrast.
Since there was only a small number of high-quality indices dating to November (GDVI, MNLI, NDMI, and WET), it was not possible to conduct a comparative analysis between the two months.Furthermore, the discrepancy between their sample sizes necessitated the use of different statistical methods.For November, tabulating the spectrally significant indices from the seven sample areas resulted in a 4x4 contingency table with count data (Table 2).To investigate whether a proportion of one of the nominal variables (i.e., the spectral index) differs from the values of the other variable, the Fisher's exact test of independence was deployed in R 3.5.The results of this test indicate which spectral indices provide statistically significant feature contrast at sample areas (Table A2 in the Appendix B).Non-significant results are still valuable for mapping purposes, as the hollow ways are sometimes visible to the naked eye (as spatial manifestations of subtle contrast differences).However, further quantitative analysis cannot be performed using these indices.Therefore, the following steps were followed only using spectral indices which provided significant levels of contrast.
Since there was only a small number of high-quality indices dating to November (GDVI, MNLI, NDMI, and WET), it was not possible to conduct a comparative analysis between the two months.Furthermore, the discrepancy between their sample sizes necessitated the use of different statistical methods.For November, tabulating the spectrally significant indices from the seven sample areas resulted in a 4x4 contingency table with count data (Table 2).To investigate whether a proportion of one of the nominal variables (i.e., the spectral index) differs from the values of the other variable, the Fisher's exact test of independence was deployed in R 3.5.The spectral indices from May were sufficient in number.A Correspondence Analysis (CA) was also performed using tabulated indices.First, the significant indices were totaled for each of the seven sample areas and a summary table generated (Table 3).Out of the 25 indices, the Moisture Stress Index (MSI) and Non-Linear Index (NLI) did not produce any significant spectral contrast between the hollow ways and their immediate environments.The CA was applied to the count of significant indices in order to understand the relationship between the spectral responses and precipitation regime.
Table 3. Spectrally significant indices tabulated for the seven sample areas.No single index provided significant contrast for all of the areas.DWSI5 is the most optimal index, as it was able to capture contrast differences at different precipitation levels.GDVI displayed the worst performance.

Visibility Conditions in Time Series
The time-series investigation of NDVI values in comparison to the average spectral background suggests that vegetation heading and flowering took place during March and April, and crops rapidly transitioned into ripening stage in May (Figure 10).Therefore, what makes hollow ways more visible (i.e., spatially separable) is, in fact, not directly related to the vegetation health as measured (and commonly used) by NDVI, since May is the month when hollow ways are most easily detected by the naked eye.A similar interpretation is possible for the monthly NDWI values of hollow ways and their respective background values from each scene (Figure 11).Hollow ways are most visible to the naked eye in November, but again, the NDWI average spectral contrast between hollow ways and their backgrounds is highest during March and April.

Evaluation of Spectral Indices
The tabulated data highlight two indices that provide most of the spectral contrast for the sample areas in November: MNLI and NDMI (Table 3).Fisher's exact test for count data yields a simulated p-value of 0.9191 based on 1000 replicates.This is a highly insignificant result, so one can almost certainly conclude that there is no relation between the precipitation levels and the four spectral indices under consideration.

Evaluation of Spectral Indices
The tabulated data highlight two indices that provide most of the spectral contrast for the sample areas in November: MNLI and NDMI (Table 3).Fisher's exact test for count data yields a simulated p-value of 0.9191 based on 1,000 replicates.This is a highly insignificant result, so one can almost certainly conclude that there is no relation between the precipitation levels and the four spectral indices under consideration.
The spring season provides more room for exploration.The Correspondence Analysis (CA)

Evaluation of Spectral Indices
The tabulated data highlight two indices that provide most of the spectral contrast for the sample areas in November: MNLI and NDMI (Table 3).Fisher's exact test for count data yields a simulated p-value of 0.9191 based on 1,000 replicates.This is a highly insignificant result, so one can almost certainly conclude that there is no relation between the precipitation levels and the four spectral indices under consideration.
The spring season provides more room for exploration.The Correspondence Analysis (CA) explain 93 % of the variation in the data (Figure 12).Thus, the interpretation rests on a strong base.In May 1988 (the maximum precipitation) the spectral indices perform equally.Therefore, maximum The spring season provides more room for exploration.The Correspondence Analysis (CA) explain 93 % of the variation in the data (Figure 12).Thus, the interpretation rests on a strong base.In May 1988 (the maximum precipitation) the spectral indices perform equally.Therefore, maximum precipitation is a weak determinant for choosing an ideal index; the CA positions May 1988 very close to the origin, supporting this claim.The rest of the dates corresponding to minimum, low, and high precipitation values span the correspondence space equally and are farther away from the origin.
It can be concluded that these three dates explain most of the variation in the contrast performance of the spectral indices.DWSI5, SAVI, MVI, and SR are clustered around the origin, suggesting that these indices perform equally under variable precipitation conditions; they discriminate the least between spectral contrast and rainfall.In fact, DWSI5 performs the best, but it also evenly contributes to correspondence (min rainfall: 5, low: 4, high: 4, max: 5).
The precipitation dates relative to the index values and their groupings reveal further relations.For instance, the Enhanced Vegetation Index (EVI) is the most discriminating when rainfall is at minimum levels.The CA also suggests that there is considerable clustering for different precipitation conditions.Under high precipitation (in 1992) GDVI, DVI, MNLI, and NDMI perform similarly, and one may use GVI but avoid MCARI under low precipitation.
The analysis further reveals five distinct groups of spectral indices: (1) DVI, MNLI, NDMI; (2) ARVI, GEMI; (3) MVI, SR; (4) NDVI, OSAVI; and (5) GRVI, NDVI, DWSI1, GNDVI, SIWSI.Among these, Group 1 can be associated with high precipitation values in 1992, Group 4 with minimum precipitations in 1989 and Group 5 with low precipitation levels in 1990.Correlation matrices (Table A3 in the Appendix B) reveal high degree of correlations for all of the sample areas.For some of the index couples, such as NDWI and GNDVI, this is an expected result, as their algebraic band combinations mimic each other.In other cases, a different spectral combination can still demonstrate a strong (albeit negative) association, such as that between NDWI and DWSI1.
Remote Sens. 2019, 11, x FOR PEER REVIEW 14 of 25 precipitation is a weak determinant for choosing an ideal index; the CA positions May 1988 very close to the origin, supporting this claim.The rest of the dates corresponding to minimum, low, and high precipitation values span the correspondence space equally and are farther away from the origin.It can be concluded that these three dates explain most of the variation in the contrast performance of the spectral indices.DWSI5, SAVI, MVI, and SR are clustered around the origin, suggesting that these indices perform equally under variable precipitation conditions; they discriminate the least between spectral contrast and rainfall.In fact, DWSI5 performs the best, but it also evenly contributes to correspondence (min rainfall: 5, low: 4, high: 4, max: 5).
The precipitation dates relative to the index values and their groupings reveal further relations.For instance, the Enhanced Vegetation Index (EVI) is the most discriminating when rainfall is at minimum levels.The CA also suggests that there is considerable clustering for different precipitation conditions.Under high precipitation (in 1992) GDVI, DVI, MNLI, and NDMI perform similarly, and one may use GVI but avoid MCARI under low precipitation.
The analysis further reveals five distinct groups of spectral indices: (1) DVI, MNLI, NDMI; (2) ARVI, GEMI; (3) MVI, SR; (4) NDVI, OSAVI; and (5) GRVI, NDVI, DWSI1, GNDVI, SIWSI.Among these, Group 1 can be associated with high precipitation values in 1992, Group 4 with minimum precipitations in 1989 and Group 5 with low precipitation levels in 1990.Correlation matrices (Appendix B2) reveal high degree of correlations for all of the sample areas.For some of the index couples, such as NDWI and GNDVI, this is an expected result, as their algebraic band combinations mimic each other.In other cases, a different spectral combination can still demonstrate a strong (albeit negative) association, such as that between NDWI and DWSI1.

Discussion.
Contrast-as a concept-becomes even more fruitful as the interest shifts from the spatial domain (what is immediately visible as a hollow way, the "contrast") to the spectral domain (the average spectral separation of a hollow way from its background, the "level of contrast").Figures 10 and 11 suggest that higher average spectral separation of hollow ways from the overall scene does not necessarily translate into higher feature visibility.This is also to say that the assumed link between spatial contrast and spectral contrast does not always hold, especially as one moves from the larger scene to local conditions.This breaks the false pre-eminence of the global spectral scene

Discussion
Contrast-as a concept-becomes even more fruitful as the interest shifts from the spatial domain (what is immediately visible as a hollow way, the "contrast") to the spectral domain (the average spectral separation of a hollow way from its background, the "level of contrast").Figures 10 and 11 suggest that higher average spectral separation of hollow ways from the overall scene does not necessarily translate into higher feature visibility.This is also to say that the assumed link between spatial contrast and spectral contrast does not always hold, especially as one moves from the larger scene to local conditions.This breaks the false pre-eminence of the global spectral scene over local conditions, especially when date of satellite data acquisition is determined; the choice may not be determined by overall vegetation health or optimal climatic conditions.
When investigated in time-series form, the relationship between the acquisition date of a scene and the visibility of the hollow ways is most obvious in May and November.However, in November, NDWI clearly outperforms NDVI, probably due to the impact of first rains after a long and very dry period in the summer.It is also important to note that even during summer months, NDVI provides better background for mapping than in spring when the crop vegetation is the greenest.Two deductions can be made based on these observations: first, hollow ways are easier to detect with soil marks than with vegetation marks.This result also implicitly suggests that the differences in soil moisture retention capacities of hollow ways do not fully translate into differences in vegetation health.Second, despite their lower values, the vegetation index can still solve for hollow ways.In these cases, the index does not necessarily represent crop vegetation, but more probably the presence of droughtand salinity-resistant Prosopis type green shrubs [39] that grow along the axis of hollow ways and create distinctive alignments.Therefore, vegetation indices still remain useful for the detection of hollow ways.
Lagging is a concern when spectral data is investigated in a time-series [45,46].Scarcity or abundance of rainfall may not immediately affect the crop growth.Thus, the vegetation indices may carry a "lag", which offsets the relationship between precipitation and the spectral response characteristics.Lagging is not only specific to geographical areas, but also to different plant species: different types of vegetation growing in the same area may have different tolerance levels to variations in water availability.Furthermore, even for the same type of crop the lag lengths can differ between the early and late growing seasons.Finally, studies usually report lags only for a couple of indices [47].Considering the complexity of lagging calculations, the vast size of the study area under investigation, and the high number of spectral indices under consideration, integrating lagging into the analysis is beyond the scope of the project at this stage.Despite its coarser resolution (30 m), Landsat multispectral imagery is not only useful for mapping hollow ways, but also for discriminating between hollow ways with different levels of contrast.In fact, there are newer sensors which provide higher spatial (and spectral) resolution such as Sentinel 2, but Landsat is unique for its historicity.Providing Earth Observation as early as 1972, there is no other system which has been so instrumental in providing time-series analysis as the one used in this study.
Newer multispectral sensors with higher spatial resolutions may lack the temporal depth, but preserved hollow ways appear in greater detail.Publicly available Sentinel 2 data, for instance, provide higher spectral representation of Near Infrared portion of the electromagnetic spectrum represented in three near edge (vegetation) bands and two Near Infrared bands compared to Landsat missions.Furthermore, it has superior temporal resolution with five days revisiting period.This level of precision opens up new research paths for a detailed analysis of the relationship between climatic conditions and the spectral representation of archaeological features.The commercial satellite WorldView-3 has even higher spatial resolution (1.24 m multispectral, 3.7 m shortwave; Figure 13).In particular, the availability of eight shortwave-infrared bands brings the possibility of modeling soil moisture content in detail.
Increases in sensor spectral, spatial, and temporal resolutions may lead towards a theory of archaeological satellite remote sensing.In particular, empirical models of hydraulic conductivity and soil moisture storage capacity of cultural soils [48] coupled with spectral data analysis can highlight replicable and testable physical relationships between archaeological features and the ways in which they are reflected on electromagnetic spectrum.Ancient hollow ways are promising features in this respect.Detailed geomorphological evaluation of these linear features within the context of satellite remote sensing, e.g., how much they withstood erosion, or how much they facilitated/blocked surface runoff [49] might also highlight emergence, use, and abandonment, as well as post-depositional phases.After all, what is measured by a spectral sensor is not the actual road surface, but the relic of the road which has been transformed considerably [50].
missions.Furthermore, it has superior temporal resolution with five days revisiting period.This level of precision opens up new research paths for a detailed analysis of the relationship between climatic conditions and the spectral representation of archaeological features.The commercial satellite WorldView-3 has even higher spatial resolution (1.24 m multispectral, 3.7 m shortwave; Figure 13).In particular, the availability of eight shortwave-infrared bands brings the possibility of modeling soil moisture content in detail.

Conclusions
Publicly available multispectral sensor data with increasingly high spatial resolutions call for a wider use of spectral indices in archaeology.Due to these developments, it is also possible to conduct comparative quantitative performance analyses for large areas under varying climatic and geographic conditions.As increasing amounts of spatially accurate survey and excavation data becomes available to researchers in digital forms, it is a feasible task to systematically explore how the light behaves when reflected and/or emitted from archaeological features (or from their proxies) under variable background conditions.Therefore, the use of a specific spectral index in an archaeological investigation can and should be justified [51].
The reduction in prices of archived high-resolution commercial satellite images and the widespread use of online viewing platforms makes remote sensing from space more applicable to vast regions than before.Increases in the extent of study areas also call for the use of generic workflows in order to analyses large datasets.On the one hand, satellite remote sensing provides researchers the opportunity to conduct large-scale research in which a global understanding of an archaeological phenomenon is possible.On the other hand, the analysis heavily relies on local conditions, and thus, migrating workflows or carrying the same research parameters to other areas is never a trivial task.For these reasons, building robust methodologies for spectral data with anthropogenic origins remains a significant challenge [52].This paper is yet another quantitative reminder of how even seemingly homogeneous areas with similar features can produce great diversities in spectral responses.
This study affirms that the contrast is an important determinant in archaeological remote sensing studies; however, it further claims that it is a necessary-but not sufficient-condition if one wants to go beyond mapping projects only using the spatial manifestations of spectral contrast.Understanding the levels of separation makes it possible to categorize features and, thus, adds the necessary dynamism to the archaeological landscape, since every feature carries the potential to be unique according to one category or another.

Figure 2 .
Figure 2. (a) Hollow ways radiating from Tell Brak (#2 on Figure 1) are clearly visible in CORONA imagery (DS1102-1025DA013, CORONA KH-4B; December 1967); (b) Landsat TM true color imagery (LT51710351990149RSA04; Landsat 5; May 1990) can also capture hollow ways, albeit to a lesser extent due to Landsat's lower spatial resolution, but also because of the impact of the mechanized land-use practices after the 1980s on cultural material.

Figure 2 .
Figure 2. (a) Hollow ways radiating from Tell Brak (#2 on Figure 1) are clearly visible in CORONA imagery (DS1102-1025DA013, CORONA KH-4B; December 1967); (b) Landsat TM true color imagery (LT51710351990149RSA04; Landsat 5; May 1990) can also capture hollow ways, albeit to a lesser extent due to Landsat's lower spatial resolution, but also because of the impact of the mechanized land-use practices after the 1980s on cultural material.

Figure 3 .
Figure 3. CORONA (DS1105-1025DA063; Nov 5, 1968) historical satellite imagery is an excellent source for mapping hollow ways with high precision.The basaltic landscape to the west of Tell Beydar (represented by the white dot at the center) does not contain evidence of movement, which may indicate differential route formation conditions or selective choices about movement.Hollow ways are documented with the help of the underlying CORONA imagery.Photogrammetrically corrected images were acquired from the CORONA Atlas & Referencing System (http://corona.cast.uark.edu/)and they are orthorectified using SRTM 90m Digital Elevation

Figure 3 .
Figure 3. CORONA (DS1105-1025DA063; Nov 5, 1968) historical satellite imagery is an excellent source for mapping hollow ways with high precision.The basaltic landscape to the west of Tell Beydar (represented by the white dot at the center) does not contain evidence of movement, which may indicate differential route formation conditions or selective choices about movement.
Remote Sens. 2019, 11, x FOR PEER REVIEW 6 of 25 Model.Next, two commonly used spectral indices in archaeology, the Normalized Difference Vegetation Index (NDVI) and Normalized Difference Water Index (NDWI) were calculated monthly for the years between 2014 and 2017, and annually for the same timeframe.Landsat 8 equipped with the OLI-TIRS sensor (30 m spatial resolution) provided the necessary spectral bands for the operation.Atmospherically corrected reflectance data were acquired from the United States Geological Survey (USGS) EarthExplorer portal.After the visual check for cloud coverage and image quality, spectral indices were calculated using near-infrared (NIR), red, and green bands (see Table A1 in the Appendix A for their formulas).Spectral indices are calculated for each acquisition date.Monthly values are estimated by a single value if there is only one acquisition in the month, or by an average where there are two or more acquisitions in the month.Finally, monthly indices were averaged for across four years (2014, 2015, 2016, and 2017).The result was two spectral time series (one for NDVI and one for NDWI) from January to December.

Figure 7 .
Figure 7. Averaged precipitation from all sample areas.Purple bars indicate accumulated precipitation values for May and blue bars for November.The wettest spring accumulation (by the end of May) was in 1988 and the wettest fall accumulation 2006 (by the end of November).The driest May accumulation was in 2008 and the driest November accumulation 1998.Other values range in between these years without an apparent trend.

Figure 7 .
Figure 7. Averaged precipitation from all sample areas.Purple bars indicate accumulated precipitation values for May and blue bars for November.The wettest spring accumulation (by the end of May) was in 1988 and the wettest fall accumulation 2006 (by the end of November).The driest May accumulation was in 2008 and the driest November accumulation 1998.Other values range in between these years without an apparent trend.

Figure 8 .
Figure 8.(a) NDVI values calculated from Landsat 8 for the area around Tell Beydar in November (2017) are high quality and ready for further statistical analysis; (b) Landsat 7 produces a lowerquality NDVI November dataset (2010) for the same area.Both datasets are displayed with the same rendering parameters.

Figure 8 .
Figure 8.(a) NDVI values calculated from Landsat 8 for the area around Tell Beydar in November (2017) are high quality and ready for further statistical analysis; (b) Landsat 7 produces a lower-quality NDVI November dataset (2010) for the same area.Both datasets are displayed with the same rendering parameters.

Figure 9 .
Figure 9.A visual representation of three hollow ways superimposed over an image of the Wetness index (WET).Buffer zones contain randomly selected background samples (stars), dynamically matching the number of systematic spectral index readings along hollow way (dots).

Figure 9 .
Figure 9.A visual representation of three hollow ways superimposed over an image of the Wetness index (WET).Buffer zones contain randomly selected background samples (stars), dynamically matching the number of systematic spectral index readings along hollow way (dots).

Figure 10 .
Figure 10.Monthly whisker-box plots of NDVI values of hollow ways (green) and the index values of randomly selected points from the whole scene (red).It is clear that March and April have the highest index values, and in these months hollow ways-on the average-are more separated spectrally than the background in comparison to other months.Nevertheless, more hollow ways are visible to the naked eye in May, making this a better month for mapping purposes. .

Figure 11 .
Figure 11.Monthly whisker-box plots of NDWI values.As in the case of NDVI, March and April have the highest spectral separation (green: hollow ways; red: background), but the November dataset is the most suitable for mapping due to the hollow ways being more visible to the naked eye.

Figure 10 . 25 Figure 10 .
Figure 10.Monthly whisker-box plots of NDVI values of hollow ways (green) and the index values of randomly selected points from the whole scene (red).It is clear that March and April have the highest index values, and in these months hollow ways-on the average-are more separated spectrally than the background in comparison to other months.Nevertheless, more hollow ways are visible to the naked eye in May, making this a better month for mapping purposes.

Figure 11 .
Figure 11.Monthly whisker-box plots of NDWI values.As in the case of NDVI, March and April have the highest spectral separation (green: hollow ways; red: background), but the November dataset is the most suitable for mapping due to the hollow ways being more visible to the naked eye.

Figure 11 .
Figure 11.Monthly whisker-box plots of NDWI values.As in the case of NDVI, March and April have the highest spectral separation (green: hollow ways; red: background), but the November dataset is the most suitable for mapping due to the hollow ways being more visible to the naked eye.

Figure 12 .
Figure 12.The Correspondence Analysis (CA) of significant indices from May.

Figure 12 .
Figure 12.The Correspondence Analysis (CA) of significant indices from May.

Figure 13 .
Figure 13.(a) Very High Resolution WorldView-3 satellite imagery (06 January 2017) still contain detailed evidence of hollow ways despite the impact of modern land-use practices on archaeological landscapes.(b) Sentinel-2 (24 February 2017) might have coarser resolution in comparison to WorldView-3, but with European Space Agency's open-access policy, the system is a prime choice.The hollow ways still appear in great clarity on Sentinel2 Earth Observation satellites.

Table 1 .
Landsat acquisition dates and respective IDs for each sample area

Table 2 .
Spectrally significant indices from November and the count of represented sample areas for each precipitation statistic.

Table 2 .
Spectrally significant indices from November and the count of represented sample areas for each precipitation statistic.

Table A2 .
p-values of Welch's t-test for sub-regions.