Multitemporal 2016-2018 Sentinel-2 Data Enhancement for Landscape Archaeology: The Case Study of the Foggia Province, Southern Italy

: This paper is focused on the use of satellite Sentinel-2 data for assessing their capability in the identiﬁcation of archaeological buried remains. We selected the “Tavoliere delle Puglie” (Foggia, Italy) as a test area because it is characterized by a long human frequentation and is very rich in archaeological remains. The investigations were performed using multi-temporal Sentinel-2 data and spectral indices, commonly used in satellite-based archaeology, and herein analyzed in known archaeological areas to capture the spectral signatures of soil and crop marks and characterize their temporal behavior using Time Series Analysis and Spectral Un-mixing. Tasseled Cap Transformation and Principal Component Analysis have been also adopted to enhance archaeological features. Results from investigations were compared with independent data sources and enabled us to (i) characterize the spectral signatures of soil and crop marks, (ii) assess the performance of the diverse spectral channels and indices, and (iii) identify the best period of the year to capture the archaeological proxy indicators. Additional very important results of our investigations were (i) the discovery of unknown archaeological areas and (ii) the setup of a database of archaeological features devised ad hoc to characterize and categorize the diverse typologies of archaeological remains detected using Sentinel-2 Data.


Introduction
Earth observation (EO) technologies are increasingly recognized as extremely effective for the documentation, preservation, and management of cultural heritage (CH), and today considered a priority at European and international level with important cultural, social, and economic repercussions. This is clearly highlighted by the increasing number of papers, dedicated conferences, and special issues published in international scientific journals (see for example [1][2][3] and references therein quoted); so that, in 2017, the European Commission (EC) organized a specific workshop "to assess the potential of Copernicus in support of cultural heritage preservation and management, and to provide inputs for further research and/or operational implementation". So that subsequently, a "Copernicus Cultural Heritage Task Force" was established to support the development and Abate et al., 2019 evaluated the potential of Sentinel-2 data for preventive archaeology on two test areas in Southern Italy comparing diverse approaches to data enhancement, and semiautomatic and automatic feature extractions [12].
Today, the main scientific challenges to face are linked to the following questions, related to the general use of satellite Earth observation in archaeology and, more specifically, to issues related to the use of Sentinel-2 for archaeological investigations: (i) Is it possible to capture the spectral signatures of archaeological proxy indicators as soil and crop marks considering that the identification of archaeological features is a very complex issue because they are (a) very subtle anomalies linked to presence of buried remains and, moreover, (b) not permanent signal but only visible in specific observational conditions? (ii) Is the spatial resolution of Sentinel-2 a critical limitation being that, except for a few examples [1][2][3], the majority of archaeological investigations have been performed using very high-resolution satellite data?
To provide a contribution in this context, this paper focuses on the use of satellite Sentinel-2 data for assessing their capability at diverse scales of analyses from single site up to a landscape level, i.e., Landscape Archaeology. For the purposes of our investigations, the "Tavoliere delle Puglie", a plain in northern Apulia (Southern Italy), was selected as a test area due to its long human frequentation and, consequently, its high archaeological potential and variety of cultural heritage features, along with a huge amount of ancillary data and information also available in a GIS environment. In the past the area herein investigated has been widely studied also using very high resolution satellite data as QuickBird, World View [13], geophysical prospection [14,15], along with manned [16][17][18][19][20] and unmanned aerial archaeology prospection [21]. Archaeological recorders along with previous knowledge and information already available in GIS environment are herein adopted as ground truth [22][23][24][25][26][27] to assess the potentiality of Sentinel-2 data in known archaeological sites related to different periods and civilizations, from Neolithic era to Greek-Roman and Middle Ages [28][29][30][31][32][33]. A multitemporal (2016-2019) Sentinel-2 data set is herein analyzed to characterize the spectral signatures of soil and crop marks and capture their temporal behavior, being that they are not permanent but only visible in specific observational conditions. The paper is organized as follows, we described in Section 1 (1.1) the study area and (1.2) the Sentinel-2 data; in Section 2 the methodological approach adopted to (2.1) extract the spectral signatures from region of interest (ROIs), known archaeological areas and identify the best spectral indices along with the enhancement techniques to (2.2) further highlight the archaeological features; in Section 3 the results obtained for the whole study area the "Tavoliere delle Puglie" covering approximately 224 km 2 (in the Foggia province), and, finally, in Sections 4 and 5 discussion and conclusion, respectively.

Study Area
The investigations were carried out ( Figure 1) in "Tavoliere delle Puglie", corresponding to the current Foggia province, in Northern Apulia (Southern Italy), well known for its archaeological sites from prehistoric to industrial period. It is a plain area largely cultivated (olive trees and vines) since prehistorical times, being one of the areas where agriculture developed in Europe [22]. The Tavoliere was selected as a test area for a huge amount of ancillary data and archaeological records available in GIS environment and, therefore, represents an excellent starting point for the analysis and validation of outputs from satellite Sentinel-2 data. The WebGIS of the Italian Ministry of the Environment, along with the CartApulia WebGIS, and the WebGIS data of the Ministry and the Region of Puglia, provided: (i) high-resolution aerial images; (ii) high-resolution satellite images; (iii) LiDAR 1x1m; (iv) historical cartography; (v) modern and ancient toponymy; (vi) geo-localized points with references to well-known archaeological sites from archives, surveys and excavation, etc. [20,23,25,27,[34][35][36][37]. Moreover, the flat morphology of the area along with its agricultural vocation and, therefore, the availability of large spaces free by buildings, made it an ideal place to test satellite Sentinel-2 capability for landscape archaeology.
The area is widely recognized as one the most important regions in the world for the presence of Neolithic settlements, dated between the 6th and 4th millennium BC. In Italy the spread of farming is traditionally associated with Tavoliere where, since their discovery between the late 1940s and 1950s through aerial photographs, a great number of studies have been conducted [16]. These settlements are medium to large sized ditched villages, characterized by circular shape, composed of huts surrounded by circular ditches (small compounds). They are often found along the rivers which were a source of food (being rich of fish and other edible aquatic resources) and of course, a major source of fresh water (for drinking and irrigation), and also an effective communication system [23,25,34].
The Tavoliere is also very rich in archaeological remains not only related to the Neolithic period but also to other important periods from Bronze Age onwards. As an example, near the town of Arpi Nova, north-west of Foggia, there is the archaeological site of Arpi, datable to the 2nd millennium BC (Bronze Age) [38].
In this period, this area was inhabited by the Daunians, hence the eponymous region of Daunia, which extended from the Ofanto river in the southeast to the Gargano peninsula in the northwest. Daunia underwent the influences of Greek civilization and Magna Graecia from the end of the 5th century BC. During the Roman period, the most important town was Lucera which still preserves an amphitheatre of the Augustan period [39]. Later on, a rich cultural variety followed during the medieval period with the contribution of Byzantines, Longobards, and Saracens (8-11th century AD). [40]. The Tavoliere conserved its importance with the Normans (12th century) and reached its peak with the Emperor Frederick II of Swabia (1194-1250), who built new castles and fortified existing settlements, among which was Lucera [24].
The long and complex history of the Tavoliere is reflected in the very rich archaeological evidences found across the area, studied by means of numerous archaeological excavations and investigations also based on the use of aerial archaeology. In fact, shortly after World War II, John Bradford, of the British Royal Air Force (RAF), observing some anomalies in the vegetation (the so-called crop-marks) [16][17][18] from aerial photos of the Foggia territory, discovered one of the largest Neolithic settlements in Europe known as "Passo di Corvo". In the case of Neolithic settlements, they are circular or sub-circular features of different sizes, from a few meters to one kilometer diameter [41] and linked to traces of buried structures or anthropogenic transformations of landscape.
In more recent times, the area of Foggia has been investigated also using advanced remote sensing techniques as very high-resolution satellite data, drones, and geophysical prospection [13][14][15]. Over the years, past archaeological studies along with the interest of the Italian and worldwide scientific community for the Tavoliere area produced numerous papers and bibliographic references, very useful to define and date the archaeological features. In the last twenty years, in the whole region thanks to the activity of Soprintendenza and universities, over 1000 sites of archaeological interest were surveyed, and in some cases also excavated, and further investigated using Earth observation techniques [42,43]. So that in addition to the prehistoric remains (characterized by the circular features herein typical of the Neolithic settlements), studies based on aerial archaeology identified other geometric features, potentially linked to archaeological remains of different historical periods. For example, quadrangular shapes or parallel linear traces were refereed, on the basis of metric analysis, to the Roman period as confirmed by excavations which unearthed Roman rural farms, villas, and centuriations [44,45]. Moreover, complex subcircular shaped features, extended near ancient farmland and ruins dated to the medieval period were identified as abandoned villages or fortified settlements ("motte and bailey") in coherence with other studies conducted in other areas [24].

Sentinel-2 Data
Sentinel-2 satellite constellations have been developed in the framework of the European Commission Copernicus Programme for security and risk monitoring [4][5][6] and are managed by the European Space Agency (ESA).
In particular, the strengths points of Sentinel-2 are multiple: (i) the twin satellites (A and B) have a regular time acquisition of the Earth's surface and the same area is (on average) taken every five days; (ii) the data collected are free and made available to the user on the Copernicus Open Access Hub platform (https://scihub.copernicus.eu/dhus/#/home); (iii) the sensor resolution is very high and reaches a maximum of 10 meter; (iv) the sensor is multispectral and acquires in 12 separate bands (Table 1) [46,47].  The usefulness of medium and high-resolution satellite data is well established and the procedures for the identification of buried structures are well described in Earth observation studies applied to cultural heritage and aerial archaeology [7,41,48]. In particular, the presence of buried structures or anthropogenic transformation with high impact on the landscape (ditches, trenches, etc.) causes spatial and spectral anomalies and, in turn, differences in reflectance between the pixels related to archaeological features and their background, called crop-/soil-/weed-marks. In the presence of crop, these anomalies are evident from above as differences in the growth or health of the vegetation, or, for bare soil as differences in soil moisture and surface temperature dynamics (thermal gradient) [49]. Previous studies suggested that NIR (Near Infrared) channels and red generally increase the visibility of crop-and soil-marks, respectively [50]. Moreover, to enhance archaeological features and make their identification easier, the use of "spectral indices", i.e., mathematical combination of different bands is quite commonly adopted [11,[51][52][53].

Sentine-2 Data Processing to Characterize the Spectral Signatures of Soil and Crop Marks in Mediterranen Areas Using Region of Interest (ROIs)
The investigation was performed using multi-temporal (2016-2018) Sentinel-2 multispectral dataset, already used in the field of remote sensing-based archaeology [11,12]. They have been herein analyzed using Time Series Analysis and Spectral Un-mixing tools (available in SNAP: https: //step.esa.int/main/doc/tutorials/snap-tutorials/) to cope with the main challenges to face: (i) the fact that the archaeological features linked to buried remains are generally not permanent but only visible in specific observational conditions, and (ii) the need to identify sub-pixel anomalies linked to features of potential archaeological interest. These were further enhanced using (iii) Tasseled Cap Transformation and (iv) Principal Component Analysis which were selected because these were already used in Earth observation for archaeology [50]. All Sentinel-2 images were selected with a cloud coverage less than 5% and, above all, the clouds did not affect the area of interest (covering approximately 224 km 2 ) where most of the currently known archaeological sites were located. Unfortunately, very few data in spring season (during which generally the crop-marks are well visible), could be fruitfully used due to the high cloud coverage. The data were pre-processed using SNAP v.6 (software freely provided by ESA), which allows an XML script to automate the processes though a series of operations as re-sampling, subset, band math to combine different bands, etc. Subsequently, additional analyses were performed also using tools available in ENVI 5.3 and ArcMap 10.4. In particular, on the basis of know archaeological features, the data processing chain was made up of analyses addressed to the identification of: (i) the best indices for the recognition of archaeological features; (ii) the optimal season in which the phenomenon of soil, crop-marks are visible; (iii) the most effective approach to reduce the amount of data size preserving the information. All of these operations were followed by the identification of features and validation based on independent ancillary data (see flowchart in Figure 2). The XML script was applied as batch-process to all Sentinel-2 files automatically. The resampling function is a native function implemented in SNAP and allows pixel resolution re-scaling to benefit from all Sentinel-2 bands. Bands 2, 3, 4, and 8 (blue, green, red and NIR) were re-sampled at a 10 m spatial resolution [64][65][66].
The subset function was then used to reduce both data size and processing time focusing on the area of interest (224 km 2 ). SNAP software allows the combination of the different bands and, for this study, bands 4-3-2, 8-4-3, and 8-11-2 (Table 1) were used to obtain the following three compositions: (i) RGB, (ii) a False Infrared Color (FCIR), and (iii) Healthy Vegetation (HV) image, as showed in Figure 3. The RGB is similar to the human eye view and shows the visible spectrum, False Infrared Color, and Healthy Vegetation are useful to improve the identification of areas with different reflectance response to enhance the visibility of vegetation anomalies and their neighboring areas ( Figure 3) [67]. Subsequently, several indices (mathematical operations between bands) were calculated (using the function "Band Maths"), as suggested by the most recent papers on the subject [15,68], for all the images under investigation. The spectral indices herein considered are mainly of two types: (i) analysis of vegetation health status, and (ii) monitoring of soil and moisture level ( Table 2).

Index Equation Reference
Difference Vegetation Index (DVI) Green Normalized Difference Vegetation Index (GNDVI) Green Ratio Vegetation Index (GRVI) B8 B3 [60,61] Normalized Difference VegetationIndex (NDVI) Normalized Difference Water Index (NDWI)  [71][72][73][74][75][76] These spectral combinations are useful for the enhancement of crop-marks, soil-marks, and anomalies generated by the presence of buried structures of potential archaeological interest [60][61][62][76][77][78]. The NDVI (Normalised Difference Vegetation Index) is one of the most widely used indices also in archaeological applications. NDVI values range from −1 to +1 where negative values or close to 0 correspond to water or soil, whereas higher values correspond to vegetation, with increasing values from sparsely to dense vegetation as well as from stressed to healthy vegetation also depending on vegetation type, plant phenology, and season.
NDVI is sensitive to the absorption of radiation in the Red band, caused by chlorophyll, and the reflection of light in the Near Infrared [49,50,64,76]. Recently, several other indices have been also used for the identification of archaeological features, allowing the discrimination of pixels according to their spectral signature [76,77]. For example: (i) Albedo is an indicator of some surface features, such as brightness [50]; (ii) SR is useful for bare soil analysis; (iii) Green NDVI is a variant of the NDVI index but more sensitive to chlorophyll concentration [55,56]; (iv) SAVI and EVI indices, calculated using the blue, red, and near infrared bands, are useful to reduce background influence [49,50]; (v) GEMI index reduces the influence of the atmosphere [79,80]; (vi) SWIR (NDWI, NVMI) band-based indices are useful to assess the moisture content of soil and vegetation [81,82]; (vii) NAI and RN indices, as well as crop-and soil-indices, use wavelengths from 0.6 to 0.9 to enhance archaeological proxy indicators [82,83].
In order to capture and characterize the trend of the archaeological features within the different bands/indices over the time, the single channels and spectral index maps were further processed using "Time Series Analysis" (TSA). Moreover, in order to identify the best period of the features visibility, TSA was coupled with the Spectral Unmixing analysis, carried out (using the ENVI 5.3 software). Of course, the best period/season in which the phenomenon of soil and crop marks are evident can be different from one index to another and from one features to another even in the same image due effect of local factors [21,51]. Spectral Un-mixing is used to show single pixels values in the images as: (i) minimum value; (ii) maximum value; (iii) average; (iv) standard deviation.
The discrimination between regions of archaeological interest and regions of non-archaeological interest for TSA and Spectral Un-mixing analysis were carried out starting from the data of the WebGis CartApulia. CartApulia was produced by a multidisciplinary group involving: the four Apulian Universities, the Regional Directorate for Cultural Heritage and Activities of MiBACT (Ministero per i Beni e le Attività Culturali e per il Turismo), with the competent Soprintendenza, with the coordination and scientific validation of Prof. Giuliano Volpe. The data in CartApulia are reported as shapes that in some cases follow the contours of the archaeological remains, while in others are approximate. In addition, WebGis reports references to archaeological sites and if they have been verified or not by the public institutions. For this study we used only verified sites for the creation of regions of interest (ROIs) useful for spectral analysis, checking the presence of the selected sites also in other sources (bibliography, aerial photographs, Google, etc.). Archaeological sites to be used as a sample for the definition of ROIs were chosen on the basis of their visibility within the available sources, including Sentinel-2 images (Figure 4). 4000 pixels (2000 for each class) were selected as "positive" and "negative" within Sentinel-2 images. We have considered as "positive" the area occupied by archaeological remains and as "negative" the area immediately outside the "positive" one. This can be seen in Figure 3, as the separation between the archaeological features and its surrounding. The area occupied by the selected pixels corresponds to 0.4 km 2 . The ROIs were chosen in the area between Lucera and Foggia, which corresponds to Masseria Palmori and Masseria Melillo, and in the area of Arpi. These areas were already known for the presence of Neolithic settlements, Roman and Medieval remains, which are visible in aerial images and satellite data (Figure 4) [16][17][18][19]45,[84][85][86][87].  Figure 5 shows the spectral signature and the reflectance value of the pixels as a function of time, which was useful also to obtain the values that characterize the features of archaeological interest and the difference between the features and their neighboring pixels [21,51,64,76,88]. The analysis of the spectral signatures put in evidence a significant difference in reflectance value between archaeological features and their surroundings, from the band 6, 7, 8, 8A, and 11, covering the interval between Red edge and NIR bands, and SWIR band central wave length at 1.610 nm. The same process was subsequently applied to the different indices produced from the band maths operation (Table 2) ( Figure 6). The multitemporal analysis has been also carried out on the single calculated indices, with the aim to identify the best period of the year (i) to observe the archaeological markers, and, (ii) within the different indices, the highest discriminability of the archaeological features from the background constituted by cultivated fields or bare soils ( Figure 6). The multi temporal analysis put in evidence the best results in diverse periods of Summer (first half of July) and autumn (November and December).
Finally, due to the huge amount of data under investigation (single channels and spectral indices), additional improvements in data analysis were addressed at the (i) reduction of multispectral indices/data (one of the main issues in the case of big data analysis) to leave out the less informative multispectral data/indices (ii) further enhancement of the subtle features of interest. To this aim, in this study, Tasseled Cap Transformation (TCTs) and Principal Component Analysis (PCAs) were used, as detailed in Sections 2.2 and 2.3.

Data Processing to Enhance Features of Archaeological Interest: Tasseled Cap Trasformation (TCT)
Tasseled Cap Transformation (TCT) is one of the most used methods for the extraction of features of archaeological interest is [41,50,[89][90][91][92].
TCT is a conversion of the original images into a new data set obtained by "linear combinations" of the original bands (1) [41]: where (i) WTC depends on the sensor (the coefficients used to create the bands are statistically derived from images and empirical observations, and are sensor-specific) and the number of bands, (ii) DN Digital Number, and (iii) B Bias [41]. The first band of the TCT is related to "brightness". The second is related to "greenness" and is used as an index of healthy vegetation. The third is 'wetness' or 'yellowness' and is relevant to soil or canopy moisture.
The original formula was empirically defined by Kauth and Thomas for the Landsat MSS sensor, for the spectral analysis of wheat growth in fields based on Landsat TM [93][94][95]. Currently, TCT is a technique used in land cover mapping and classification today empirically defined for numerous sensor as Landsat TM, ETM, ASTER, MODIS, and IKONOS sensors [96][97][98][99][100][101]. Two different procedures were used for Sentinel-2 images, using the formulas described in the Index Data Base (IDB) portal TCT components were computed using both the six bands described in the formulas and only the components of Blue, Green, Red, and NIR.
The TCT components were then analysed individually and combined in RGB (R: brightness, G: greenness, B: wetness) (Figure 7).

Data Processing to Enhance Features of Archaeological Interest: Principal Component Analysis (PCA)
PCA is a transformation already used in the past to better enhance features of archaeological interest or assess the change over time for cultural heritage monitoring [92,[103][104][105][106]. Moreover, PCA is particularly useful to reduce redundant information and, in turn, the amount of data [41]. Therefore, PCA was herein adopted to reduce the data and facilitate the identification of individual features. To this aim, we analyzed both (i) the data produced during the first phase of image processing and (ii) their transformation as PCA components.
From the computational point of view, PCA, also known as Karhunen-Loéve transformation or Hotelling transformation, is an orthogonal decomposition widely used in multivariate statistics. PCA can be considered as a rotation of the cartesian axes of a new orthogonal system of variables, where the variables are represented without correlation [41]. The PCA operation consists of the following main steps: (i) averaging the pixels of the input images to calculate an average image, and subtracting the calculated average image from each input image; (ii) calculating the covariance (or variance) matrix on all input bands; (iii) decomposing the covariance matrix's own eigenvalue to obtain new images [41] called components.
As a whole, PCA is a linear transformation that transforms a number of related variables into a (smaller) number of unrelated variables, called main components, which still contains much of the information present in the input dataset. The set of outputs produced by the described methodologies has been analyzed in a GIS platform together with ancillary data from other sources.

Results
As detailed above, in Section 2 the first results, obtained from the analyses of the region of interests, were the characterization of the spectral signature of soil and crop marks (see Figures 5 and 6).
In this section we present the results, obtained for the whole study area related to the "Tavoliere delle Puglie" covering approximately 224 km 2 (in the Foggia Province), and successfully compared with independent data sources, so that this enabled the: (i) identification of the best indices to highlight the archaeological features and the best period of the year to capture soil and crop marks; (ii) discovery of unknown archaeological areas.
In particular, the data derived from WebGis CartApulia have been of great utility for the research as it is a GIS produced by the Region of Puglia, containing an "Archaeological Map of the Cultural Heritage of the Region of Puglia" based on the archives of the Soprintendenza, the Italian authority responsible for protecting and safeguarding of cultural heritage.
The GIS contains all the sites currently known from archaeological excavations, aerial photo analysis, and field reconnaissance activities, and proposes hypotheses of dating and characterization of contexts.
The ancillary data were of fundamental importance to understand the usefulness of the images produced by processing Sentinel-2 data.
The first outputs from analysis based on known archaeological features were the identification of (i) the indices that better enhance the archaeological proxy indicators, (ii) the period when the archaeological proxy indicators exhibited the best visibility, and in turn the discriminability. RGB and FCIR images show (see Figure 3) a complex pattern of overlapped traces related to settlements, roads, canals, and other buried structures belonging to different historical periods and, therefore, are quite difficult to characterize correctly.
In addition, the differences that indicate the presence of possible structures or effects of anthropogenic transformation of landscape were well delineated using the spectral separability (see Figures 5 and 6) applied to the ROIs within the individual bands and the spectral indices. The most performing indices are those involving the Green, Red, and NIR bands, such as (i) crop-band, (ii) soil-band, (iii) RN, (iv) NDVI (Table 2), as expected according to the past investigations [7,8,[11][12][13][14][70][71][72][73][74][75][76]. This is certainly due to the high resolution that the Sentinel-2 satellites have in these bands (10 m).
The analysis of spectral signatures, obtained on the basis of the Sentinel-2 bands, shows an irregular trend over time in the spectral separability between the archaeological and non-archaeological ROIs. In accordance with Figures 5 and 6, the analysis of ROIs in the same months of the year, one year later (e.g., July 2016-July 2017), shows totally different data, probably related to weather conditions, soil moisture and land use. Features are clearly visible in single bands, as well as in RGB and FCIR compositions ( Figure 9). However, some types of features are more visible in some images (compositions or single bands) than others. The comparison of the diverse spectral indices, showed that the archaeological features were well highlighted in the TCT first (brightness), and second (greenness) component, along with the RGB from the first three TCT components (as follow: R: 1st TCT, G: 2nd TCT, B: 3rd TCT).
It is important to highlight that the features were more visible from the individual products, in particular from the TCT brightness and greenness as obtained from Formulas (2) and (3), an example (a zoom of a subset of the study area) is shown in Figure 10. The image produced from the components of the PCA is definitely the one that provides the most information about the possible presence of buried remains and allows an easy identification and interpretation thanks to the sharp contrast between the features and the background. Inside the image, many archaeological features were clearly visible (i) traces of settlements and defensive moats, characterized by the circular or sub-circular shape typical of Neolithic settlements ( Figure 11) in the region; (ii) traces of roads, communication routes and defensive complex of diverse ages ( Figure 12); (iii) traces of paleo-channels, and other evidence of archaeological interest for the reconstruction of the ancient landscape. Of course, as expected, not all the images produced by PCA are useful for features analysis. The RGB image produced from the first three components (Figure 8a) is probably the worst for archaeological research because it shows a flattening of "ephemeral" elements such as archaeological features, in favor of persistent elements in all images such as buildings, modern roads and field boundaries. The RGBs (Figure 8b,c) are more suitable for archaeological studies as the two images can be defined as complementary: in each of them some elements are highlighted (Figures 11 and 12).
The analysis of the images allowed the identification of several features of possible archaeological interest. In many cases the hypotheses have been confirmed using the tools and the data sources described above (e.g., Masseria Palmori, Masseria Schifata, Masseria Villano, or Località Masseria Petrullo), whereas other hypotheses are not confirmed with the documentation currently available because they show unknown sites not yet identified or surveyed and therefore, not yet included into the CartApulia Database. The discovery of new sites appears to be one of the strengths of the elaborated PCAs. The validation of the data, however, plays a fundamental role in the analysis of the images and in the evaluation of their real potential. In fact, starting from the recognition of known sites, it is possible to understand the different types of features, and to attribute a specific connotation to most of them, creating a training database for the identification and characterization even of sites not yet discovered ( Figure 13).

Discussions
In this paper, the capability of satellite Sentinel-2 data in the identification of archaeological proxy indicators (soil and crop marks) has been assessed in the Mediterranean area of "Tavoliere delle Puglie", selected because it is characterized by a long human frequentation and, therefore, is very rich in archaeological remains, from the Neolithic Era to Bronze, Greek-Roman, and Middle Ages.
Results from the investigations performed using a multi-temporal 2016-2019 dataset of Sentinel-2 imagery enabled us to extract the spectral signatures of soil and crop marks thus highlighting the capability of Sentinel-2 for the detection of archaeological features. Moreover, outputs from our analyses enabled us to identify the spectral indices that performed better in the detection of soil and crop mark and the best period of the year to capture them.
The spectral separability of the archaeological evidence was analyzed considering both single channel and spectral indices. The single channels (i.e., spectral bands) perform better than the spectral indices for the identification of the archaeological features. The contrast of archaeological features and their surrounding is significant, as is well-defined in the spectral signatures obtained and shown in Figures 5, 6 and 9. Related to the spectral indices, the highest discrimination capability was observed from TCT (brightness and greenness component) and PCA outputs, and the best period of the year to capture soil and crop mark, was April and the autumn September and even December in the case of good quality image (i.e., no cloud cover). Of course, it is important to consider that the period of the best visibility is influenced by several factors that in the case of Foggia (that is an agricultural area) are mainly: (i) meteorological conditions; (ii) crop rotation; (iii) type of features. As a general approach, the characterization of the best period of the visibility of the archaeological features can be identified from investigations conducted in known sites using a multitemporal data set to assess the inter and intra year visibility. In particular, in the current case, the multitemporal 2016-2018 Sentinel-2 data was analyzed because the years considered were characterized by meteorological conditions significantly different from one year to another. As an example, the 2017 was a drought year with low rain compared to average rainfall, whereas the 2018 was characterized by abundant rain precipitation, which also occurred even during the summer period, which therein is generally a quite dry period. However, the visibility of the archaeological features inside the multi-temporal images is variable and related to phenomena that, as already mentioned, concern weather and land use, as well as the type of soils and buried archaeological remains.

Notes on the Multitemporal Analysis of Single Channels
The analysis of the individual channels and their combinations RGB and FCIR showed that Sentinel-2 data were very effective in the identification of large anthropogenic archaeological remains such as: trenches, ditches, holes, channels, etc., which tend to retain moisture as clearly visible in the spectral bands and images [41]. Clear evidence of this was observed in the images of July and August 2017 (see spectral signatures in Figure 5d-f), due to the rainfall occurred at the end of July in Apulia (http://www.arpa.puglia.it/web/guest/serviziometeo).
Another important conclusion of this study is that the features of archaeological interest are more visible on bare soil than in cultivated fields (in the case of the Tavoliere delle Puglie), or when the vegetation has an irregular growth/heath due to the weather conditions. However, some remains are visible due to anomalies in vegetation, as changes due to a higher or lower health status. This phenomenon is mainly visible during the winter months in dry and rainless conditions (November 2018, Figures 5m, 6 and 9).
As a whole, we can suggest that, as expected on the basis of the relationship between vegetation health and type of archaeological remains [7,14,[50][51][52][53], the visual analysis of each individual channel and of each Sentinel-2 image (of the multi-temporal data set) can provide a huge amount of useful information but of course is time consuming. Therefore, the use of spectral signature and the expected spectral separability is strongly recommended to reduce the data sets and further enhance archaeological features using TCT or PCA.
The TCTs were suitable for the identification of features of archaeological interest and very useful both individually ( Figure 10) and in RGB combination. However, the use of TCTs was also time consuming and did require the analysis of a large amount of data in the final phase of features detection. Therefore, in the case of big data processing, the PCA method was the one with the best quality/time value. However, as expected, not all PCAs were useful for the study. In the first case ( Figure 8a) the RGB image produced was lacking features that may indicate buried remains, in favor of elements in which the redundant information were concentrated (buildings, fields, roads etc.). The second case (Figure 8b) was more complex and derives from the use of the 2nd component in the Red channel and the 10th and 11th components as Green and Blue channels. The components produced by the PCA using all the data as input (see Section 2.3) had a useful content in terms of archaeological information from the 2nd component to the 15th, while the following ones showed excessive noise. This data was obtained after several output tests and may be different in other studies, being as known the interpretation of the PCAs dataset dependent. Nevertheless, as a general rule, we can argue that the best outputs from PCAs method were obtained only using as inputs the indices that had the highest separability ( Figures 5 and 6), as described in Section 2.3.

Notes on the Multitemporal Analysis of Spectral Indices
Indices were used to reduce the number of input data: they represent an enhancement of the differences in terms of spectral response of some elements in relation to the reflectance of Green, Red and NIR, which are the most suitable channels for the study of crop-and soil-marks. This method was evaluated as the most suitable for the purposes of the study: (i) the analysis was focused on the reduced to a single image; (ii) the image was of good quality; (iii) the archaeological ROIs are in sharp contrast to the surroundings, and both large shapes and smaller elements are visible (e.g., small circular compounds, rectangular-shaped farms, traces of ploughing etc.).
Of course, it is important to consider that we have in any case small features (<10 m) and there can be still additional problems to cope as the need: (i) to distinguish archaeological features from noise; (ii) to understand the nature of the features observed in order to identify and characterize new ones.
In both cases (TCTs and PCAs), the problem has been partly solved by using the knowledge acquired from archaeological investigations carried out in the past (drawings, aerial photos, GIS, etc.) which have proved to be a fundamental and essential tool for archaeological research.
However, one of the issues is precisely the correct interpretation of the large mass of data. This represents a challenge in the field of cultural heritage and the use of remote sensing for the identification, preservation and study of archaeological artefacts, where the data available from RS tools are relatively new and are based on sensors and physical parameters different from those used in the past (e.g., multispectral, hyperspectral, radar, LiDAR, GPR, etc.). Therefore, it is of fundamental importance for the research itself to couple the interpretation of the outputs of the data analysis with the archaeological documentation.

Conclusions
Today, in the era of big and open data, the large amount of data and information, available even for free to the scientific community, offers big opportunities and big challenges, also in the field of Earth observation techniques for cultural heritage. Big data and big challenges are particularly relevant for multi-and inter-disciplinary studies generally quite complex to be approached as in the case of archaeology [113][114][115].
Sentinel-2 data are certainly among the big data (available from 2014) that can offer big opportunities in many research fields and operational applications including cultural heritage. The use of high resolution multispectral data from the Sentinel-2 satellites of the Copernicus Program has shown all the strengths of this platform and the outputs from the current investigation clearly highlighted that these data can be fruitfully used in the future, for studies of the ancient landscape and the reconstruction of an extremely complex multi-layered context.
As a whole, one of main outputs of this research, based on open data and software (SNAP), was the methodological approach herein proposed that can be promptly replicate in other areas of the world. Moreover, the results obtained from the study of the Foggia province, along with the process carried out to obtain them, can be a good basis for some general considerations regarding the importance of Sentinel-2.

1.
For the province of Foggia, the use of Sentinel-2 satellite data coupled with the ancillary data from other sources has proved extremely effective for the quick identification, on a landscape scale, of dozens of archaeological sites, some already known, others unknown and herein discovered.

2.
The use of data enhancement and data reduction techniques applied to remote sensing data has proved to be extremely useful in the archaeological study. General considerations are related to the choice of multitemporal based analysis in a Mediterranean environment to cope with the different inter/intra annual visibility of soil and crops markers due to the vegetation phenology, meteorological conditions, etc.
Of course, as discussed, the main issue is to optimize the data to have a good compromise between processing and analysis in terms of effort, time, and results. To address this point, the PCA technique combined with previous use of ROIs discrimination and Spectral Un-mixing over time has proved to be very useful. The approach and the data processing chain herein proposed is an example of a perfect synthesis of the synergy needed between archaeology and remote sensing based on a virtuous circle of multidisciplinary skills that work together to address critical issues. In particular, the use of free data such as Sentinel-2 and those free available in the framework of the Copernicus initiatives, combined with archaeological knowledge of the area and ancillary data collected from other sources can produce remarkable results.
On the other hand, the use of remote sensing tools and Earth observation techniques based on Sentinel-2 images has proved to be useful in the identification of archaeological remains on a landscape scale, in the reconstruction of road systems and communication routes, and in the analysis of watercourses and their evolution over time. In particular, the identification of already known archaeological sites allows understanding of the shapes of features linked to a specific context within the images (e.g., concentric circles are linked to settlements of the Neolithic period). This is important because it allows the identification of new areas of archaeological interest (Figures 10 and 12) and their attribution to a specific historical period. The archaeological analysis on landscape scale (structures, communication routes, and watercourses) allows to reconstruct the network of settlements and to carry out new historical/archaeological considerations. At the same time, the identified areas can be investigated on a "site scale", with the normal practices of archaeology (excavation, survey) or close-range remote sensing (geophysics survey, drones, etc.) for a deep knowledge of the contexts, essential in the field of archaeological research. The use of Sentinel-2 images could also be included in the practices for the creation of the archaeological risk assessment map to be handed over to the Soprintendenza, for the protection of cultural heritage in the case of urban sprawl and land use change processes [116][117][118][119][120][121].