Detecting Neolithic Burial Mounds from LiDAR-Derived Elevation Data Using a Multi-Scale Approach and Machine Learning Techniques

: Airborne LiDAR technology is widely used in archaeology and over the past decade has emerged as an accurate tool to describe anthropomorphic landforms. Archaeological features are traditionally emphasised on a LiDAR-derived Digital Terrain Model (DTM) using multiple Visualisation Techniques (VTs), and occasionally aided by automated feature detection or classiﬁcation techniques. Such an approach offers limited results when applied to heterogeneous structures (different sizes, morphologies), which is often the case for archaeological remains that have been altered throughout the ages. This study proposes to overcome these limitations by developing a multi-scale analysis of topographic position combined with supervised machine learning algorithms (Random Forest). Rather than highlighting individual topographic anomalies, the multi-scalar approach allows archaeological features to be examined not only as individual objects, but within their broader spatial context. This innovative and straightforward method provides two levels of results: a composite image of topographic surface structure and a probability map of the presence of archaeological structures. The method was developed to detect and characterise megalithic funeral structures in the region of Carnac, the Bay of Quiberon, and the Gulf of Morbihan (France), which is currently considered for inclusion on the UNESCO World Heritage List. As a result, known archaeological sites have successfully been geo-referenced with a greater accuracy than before (even when located under dense vegetation) and a ground-check conﬁrmed the identiﬁcation of a previously unknown Neolithic burial mound in the commune of Carnac.


Introduction
Since the early 2000s, Airborne LiDAR remote-sensing technology has become an important tool for locating and monitoring cultural heritage sites in large or hard-to-access areas. High-density and high-precision LiDAR point clouds are processed to generate terrain models that are then used to detect and characterise archaeological evidence through the analysis of morphological or topographic anomalies [1][2][3].
To enhance the interpretation of subtle landforms from Digital Terrain Models (DTMs), archaeologists have relied on common visualisation techniques (VTs), including hillshade, slope, or colour-casted models derived from DTMs. More recently, advanced VTs, such as principal component analysis (PCA) of hillshades, sky-view factor, and openness models have been developed and successfully used to improve visual detection of archaeological remains. The main limitation Remote Sens. 2018, 10, 225 2 of 19 of these approaches is that they depend on empirical parameters (such as orientation or radius) and thus bias analyses of archaeological structures (or topographic anomalies) of specific sizes or morphologies [4]. The increasing development and complexity of these VTs can make the manipulation of LiDAR-derived models confusing and their interpretation subjective by non-expert users.
Along with the visual interpretation approach, an increasing number of automatic or semi-automatic feature-detection methods have been developed to address the issues involved in using LiDAR datasets to reveal archaeological remains. Object-based image analysis [5] and template matching [6] applied to DTMs or VTs has been used to automatically detect archaeological structures. While effective, this method requires defining a prototypical structure (for template matching) or characteristic spatial attributes (size, shape, orientation, specific VT value, etc.) to classify objects. This information is difficult to define for heterogeneous structures, especially for thousands of years old remains that have been altered throughout the ages.
In this article, we focus on the use of airborne LiDAR data to detect megalithic funeral structures in the region of Carnac, the Bay of Quiberon, and the Gulf of Morbihan (France). We developed an innovative approach that combines multi-scale topographic analysis of the LiDAR-derived model, as a new VT, with a machine learning method to produce a probability map to support archaeological prospections and accompanying field surveys.

Archaeological Context
The French Atlantic coast, especially the Gulf of Morbihan, maintains traces of some of the most emblematic evidence left by our forebears during the Neolithic period [7,8]. The Neolithic funeral structures in the Carnac region, dating from 4500-7000 years ago, have been particularly affected by time, especially through natural erosion and anthropomorphic modifications. Some features, of multiform nature (burial mounds or tumuli, dolmens, cairns), have been significantly reduced in size or even completely levelled. Although these megalithic structures in Morbihan can be seen as isolated individual elements, they have a complex organisation that structured the Neolithic landscape [9]. Monuments were deliberately erected in precise locations and reflect an organised signature [10]. Situated in specific topographic contexts and usually dominating their environment, the funeral structures were most likely positioned to be visible from long distances [11]. Topographic analysis, especially based on a high-resolution terrain model, thus becomes one way to examine individual monuments and also the potential spatial organisation of a wider archaeological landscape [12].
Based on a training data set containing observations of known categories, the model is trained and then used to predict the category in which a new observation belongs.
RF is especially adapted to solve high-dimensionality problems. As an ensemble classifier, RF provides a classification score calculated from classification results of the forest's individual and independent decision trees. Each tree is generated from a random selection of features/variables and observations. This output can be used to generate a probability map, which is preferred to binary classification for computer-aided feature detection in archaeology. In addition to the probability map, RF also identifies the relative importance of features during classification, which can be useful in understanding the classification process [22,23]. RF is robust and effective, even with unbalanced data, and easier to implement than SVM [23]. The progress of an observation as it is being classified is also easier to follow with RF than with MLP, which is often considered a black box.

Study Area
The megalithic landscape of Carnac extends 40 km along the southern coast of Brittany (France), from the Etel River in the west to the Rhuys Peninsula in the east (Figure 2). The area is home to 519 known Neolithic sites currently selected for inclusion on the UNESCO World Heritage List. These sites must not be perceived as individual or single monuments, but as an integral part of the landscape with inter-relations based on their topographical positions or co-visibility. The megalithic monuments (i.e., burial mounds, single standing stones, stone circles, alignments of standing stones) follow structural and functional patterns such as "alignment and stone circle" or "alignment and stone circle and burial mound". Thus, even though some are barely perceptible due to their degraded conditions, archaeologists have grouped some of the 519 sites into 26 sub-areas to understand these patterns. Two of these sub-areas, Kerlescan and Lann Granvillarec, both located in the Carnac region, were selected for detailed analysis for two reasons. First, they contain all of the types of Neolithic remains known in the study area and all of the main structural and functional patterns identified. Second, they have contrasting environments: Lann Granvillarec is a densely forested area with both deciduous and coniferous vegetation, while Kerlescan is mainly an open area with little vegetation.

Study Area
The megalithic landscape of Carnac extends 40 km along the southern coast of Brittany (France), from the Etel River in the west to the Rhuys Peninsula in the east (Figure 2). The area is home to 519 known Neolithic sites currently selected for inclusion on the UNESCO World Heritage List. These sites must not be perceived as individual or single monuments, but as an integral part of the landscape with inter-relations based on their topographical positions or co-visibility. The megalithic monuments (i.e., burial mounds, single standing stones, stone circles, alignments of standing stones) follow structural and functional patterns such as "alignment and stone circle" or "alignment and stone circle and burial mound". Thus, even though some are barely perceptible due to their degraded conditions, archaeologists have grouped some of the 519 sites into 26 sub-areas to understand these patterns. Two of these sub-areas, Kerlescan and Lann Granvillarec, both located in the Carnac region, were selected for detailed analysis for two reasons. First, they contain all of the types of Neolithic remains known in the study area and all of the main structural and functional patterns identified. Second, they have contrasting environments: Lann Granvillarec is a densely forested area with both deciduous and coniferous vegetation, while Kerlescan is mainly an open area with little vegetation.

LiDAR Data
The study was based on Airborne LiDAR data acquired with an Optech Titan bispectral LiDAR sensor. The survey was performed in March 2016, immediately before the end of the off-leaf season to ensure maximum signal penetration through the canopy.
For the survey, the system was set for multi-echo recording, and specifications were defined (Table 1) to reach a combined nominal point density of 14 points/m 2 (7 points/m 2 per channel). The nominal point density is estimated for a non-overlapping area, considering only the most recent echo. Full waveform information was not recorded during the flight. The data were delivered as an

LiDAR Data
The study was based on Airborne LiDAR data acquired with an Optech Titan bispectral LiDAR sensor. The survey was performed in March 2016, immediately before the end of the off-leaf season to ensure maximum signal penetration through the canopy.
For the survey, the system was set for multi-echo recording, and specifications were defined (Table 1) to reach a combined nominal point density of 14 points/m 2 (7 points/m 2 per channel). The nominal point density is estimated for a non-overlapping area, considering only the most recent echo. Full waveform information was not recorded during the flight. The data were delivered as an unclassified 3D point cloud in LAS format (v1.2), with files organised by channel and flight line and compressed into LAZ files using LASzip software.

The Method
The method for processing LiDAR data consisted of several steps ( Figure 3).

The Method
The method for processing LiDAR data consisted of several steps ( Figure 3).

Pre-Processing of LiDAR Data
An iterative method was used to pre-process the LiDAR data by filtering ground points and generating a high-resolution DTM to maximise the number of micro-relief features. All preprocessing tasks ( Figure 4) were performed with LASTools software [25].

Pre-Processing of LiDAR Data
An iterative method was used to pre-process the LiDAR data by filtering ground points and generating a high-resolution DTM to maximise the number of micro-relief features. All pre-processing tasks ( Figure 4) were performed with LASTools software [25]. Remote Sens. 2018, 10, x FOR PEER REVIEW 6 of 19 LAZ files, organised by channel and flight line, were then merged and reprocessed in a tiling (500 m × 500 m) system. A 50 m buffer area was used to avoid boundary effects between tiles during triangulation. Tiles were then spatially indexed to expedite the workflow.
Out-of-range values were excluded based on the Z coordinate value of each point: outside the valid range from −5 m to 60 m, the point was labelled as "Noise". Isolated points were detected using the lasnoise tool with a 1 m step (step_xy and step_z) and a minimum threshold of 10 neighbours. Thus, points with 10 or fewer neighbouring points in the surrounding 27 m 3 were labelled as noise and excluded from further processing.
A first iteration of ground filtering was performed using the lasground tool with the step parameter set to 25 m. This setting provided coarse ground filtering (without micro-relief) and excluded above-ground features (buildings and vegetation) well. Filtered ground points were labelled "Ground".
The lasclassify tool was used to classify non-ground points (above-ground features) into vegetation and building classes. All points more than 1.5 m from the ground and with a neighbouring planarity factor of at least 0.15 were labelled as buildings. All points more than 1.5 m from the ground and with a neighbouring ruggedness factor of at least 0.4 were labelled as vegetation.
A second iteration of ground filtering was performed using lasground with a smaller step (3 m) to identify smaller topographic details. This iteration was run on all points, including those already classified as buildings, vegetation, or noise. This approach helped identify additional bare-ground details (ground points) while ensuring that above-ground features did not interfere.
Final ground points were used to generate a surface based on triangulated irregular network interpolation and gridded to a DTM with a 0.25 m ground sampling distance. The optimum resolution was defined by calculating the nominal point spacing, equal to √(1/PointDensity) [26].

Multi-Scale Topographic Analysis
Multi-scale analysis of the LiDAR-derived DTM, the core of the method, was directly inspired by the integral image approach by Lindsay et al. [20]. Integral image transformation was applied to efficiently perform multi-scale analysis of DEV before integrating the results into a composite image ( Figure 5).
The processing, detailed hereafter, was performed using the open-source Whitebox Geospatial Analysis Tools [27] and the available Maximum Elevation Deviation (multiscale) tool, coupled with a custom R script [28] to create the RGB composite image. LAZ files, organised by channel and flight line, were then merged and reprocessed in a tiling (500 m × 500 m) system. A 50 m buffer area was used to avoid boundary effects between tiles during triangulation. Tiles were then spatially indexed to expedite the workflow.
Out-of-range values were excluded based on the Z coordinate value of each point: outside the valid range from −5 m to 60 m, the point was labelled as "Noise". Isolated points were detected using the lasnoise tool with a 1 m step (step_xy and step_z) and a minimum threshold of 10 neighbours. Thus, points with 10 or fewer neighbouring points in the surrounding 27 m 3 were labelled as noise and excluded from further processing.
A first iteration of ground filtering was performed using the lasground tool with the step parameter set to 25 m. This setting provided coarse ground filtering (without micro-relief) and excluded above-ground features (buildings and vegetation) well. Filtered ground points were labelled "Ground".
The lasclassify tool was used to classify non-ground points (above-ground features) into vegetation and building classes. All points more than 1.5 m from the ground and with a neighbouring planarity factor of at least 0.15 were labelled as buildings. All points more than 1.5 m from the ground and with a neighbouring ruggedness factor of at least 0.4 were labelled as vegetation.
A second iteration of ground filtering was performed using lasground with a smaller step (3 m) to identify smaller topographic details. This iteration was run on all points, including those already classified as buildings, vegetation, or noise. This approach helped identify additional bare-ground details (ground points) while ensuring that above-ground features did not interfere.
Final ground points were used to generate a surface based on triangulated irregular network interpolation and gridded to a DTM with a 0.25 m ground sampling distance. The optimum resolution was defined by calculating the nominal point spacing, equal to

Multi-Scale Topographic Analysis
Multi-scale analysis of the LiDAR-derived DTM, the core of the method, was directly inspired by the integral image approach by Lindsay et al. [20]. Integral image transformation was applied to efficiently perform multi-scale analysis of DEV before integrating the results into a composite image ( Figure 5).
The processing, detailed hereafter, was performed using the open-source Whitebox Geospatial Analysis Tools [27] and the available Maximum Elevation Deviation (multiscale) tool, coupled with a custom R script [28] to create the RGB composite image. Integral Image Transformation An integral image (or summed-area table) is a data structure and algorithm that efficiently calculates the sum of values in a rectangular subset of a grid [29]. The value at any pixel ( , ) in the integral image ( ) is the sum of all the pixels above and to the left of ( , ), inclusive, in the original image ( ). Once the integral image is calculated, summing pixel values in any rectangular window, regardless of its size, requires only four array references, presented in the following Equation (1): It allows for rapid calculation of the sum ( Figure 6) and thus the mean, especially within a large moving window (kernal) approach [20].
A single integral image can be used to sum values within an entire range of window sizes. This characteristic enables a multi-scale approach based on integral images to the analysis of topographic metrics. Integral Image Transformation An integral image (or summed-area table) is a data structure and algorithm that efficiently calculates the sum of values in a rectangular subset of a grid [29]. The value at any pixel (x, y) in the integral image (ii) is the sum of all the pixels above and to the left of (x, y), inclusive, in the original image (i). Once the integral image is calculated, summing pixel values in any rectangular window, regardless of its size, requires only four array references, presented in the following Equation (1): It allows for rapid calculation of the sum ( Figure 6) and thus the mean, especially within a large moving window (kernal) approach [20].
A single integral image can be used to sum values within an entire range of window sizes. This characteristic enables a multi-scale approach based on integral images to the analysis of topographic metrics. Remote Sens. 2018, 10, x FOR PEER REVIEW 8 of 19

Calculating Topographic Deviation at Multiple Scales
The topographic DEV equals the difference between the elevation of each grid cell and the mean elevation of the local neighbourhood, divided by the standard deviation [19], as shown in Equation (2): where is the size of the roving window (local neighbourhood), is the elevation of the central pixel of the window, ̅ is the mean elevation in the window, and is the standard deviation of elevations in the window. The multi-scale approach was based on calculating multiple raster layers from roving windows of different sizes. It was implemented efficiently using integral image transformation ( ̅ was calculated from the integral image, while was calculated from the secondpower integral image).
Three scales of analysis were defined: micro, meso, and macro. Intra-scale analysis was performed for each scale. The raster layer was calculated times with a window size ranging from to along the mathematical sequence = + ( − 1) , where equals the incremental step. The resulting raster layer at each scale was calculated using the maximum absolute DEV from the multiple intra-scale analyses performed within the window size range defined using Equation (3): The window size range and incremental step of each scale are input parameters. The window size ranges were defined based on the size of the archaeological structures to be studied. As burial mounds are usually 10-100 m in size, and scale boundaries were defined at those sizes ( Table 2):

Calculating Topographic Deviation at Multiple Scales
The topographic DEV equals the difference between the elevation of each grid cell and the mean elevation of the local neighbourhood, divided by the standard deviation [19], as shown in Equation (2): where D is the size of the roving window (local neighbourhood), Z 0 is the elevation of the central pixel of the window, Z D is the mean elevation in the window, and σ D is the standard deviation of elevations in the window. The multi-scale approach was based on calculating multiple DEV raster layers from roving windows of different sizes. It was implemented efficiently using integral image transformation (Z D was calculated from the integral image, while σ D was calculated from the second-power integral image). Three scales of analysis were defined: micro, meso, and macro. Intra-scale analysis was performed for each scale. The DEV raster layer was calculated n times with a window size D n ranging from D min to D max along the mathematical sequence D n = D min + (n − 1)k, where k equals the incremental step. The resulting raster layer at each scale was calculated using the maximum absolute DEV from the multiple intra-scale analyses performed within the window size range defined using Equation (3): The window size range and incremental step of each scale are input parameters. The window size ranges were defined based on the size of the archaeological structures to be studied. As burial mounds are usually 10-100 m in size, and scale boundaries were defined at those sizes ( Table 2): To avoid excessive calculations, the incremental step (k) of window size was set to create 11 intra-scale analyses for each scale using Equation (4) with k rounded to the nearest integer: For example, the maximumDEV raster for the macro scale (100-1000 m) was selected from 11 DEV rasters calculated with a 90 m step. The result of each maximumDEV was a raster layer with cell values that range from −3 to +3, which correspond to topographical positions that are lower or higher, respectively, than those in their neighbourhood (Figure 7).  To avoid excessive calculations, the incremental step ( ) of window size was set to create 11 intra-scale analyses for each scale using Equation (4) with rounded to the nearest integer: For example, the raster for the macro scale (100-1000 m) was selected from 11 rasters calculated with a 90 m step. The result of each was a raster layer with cell values that range from −3 to +3, which correspond to topographical positions that are lower or higher, respectively, than those in their neighbourhood (Figure 7).  (Figure 8). The composite image is composed of the macro-scale, meso-scale, and micro-scale results (in the red, green, and blue bands, respectively).

Implementing the Machine Learning Algorithm
While the absolute value of was used to create the composite MSTP image, the original value of (positive or negative) defined the actual topographic position signature of each pixel. The value indicates whether the position is dominating (positive) or dominated (negative) in the specified neighbourhood. Based on the three scale deviation signature, an RF algorithm was implemented in R software using the Caret package [30] to fit a model to automatically detect Neolithic burial mounds. The sampling plan, which was defined in a sub-area The three raster layers of maximum topographic deviation (maximumDEV micro , maximumDEV meso , maximumDEV macro ) were converted into absolute values and scaled to the range 0-254. Results were then combined into a single RGB image called a Multi-scale Topographic Position (MSTP) image for visualisation and interpretation (Figure 8). The composite image is composed of the macro-scale, meso-scale, and micro-scale results (in the red, green, and blue bands, respectively).

Implementing the Machine Learning Algorithm
While the absolute value of maximumDEV was used to create the composite MSTP image, the original value of maximumDEV (positive or negative) defined the actual topographic position signature of each pixel. The value indicates whether the position is dominating (positive) or dominated (negative) in the specified neighbourhood. Based on the three scale deviation signature, an RF algorithm was implemented in R software using the Caret package [30] to fit a model to automatically detect Neolithic burial mounds. The sampling plan, which was defined in a sub-area of Carnac (1 km 2 ), included 50 data samples (Figure 9). Each sample was a 100 m 2 square (1600 pixels at 0.25 m resolution) whose location was selected using the Latin hypercube sampling method to ensure adequate representation of variability in the data [31]. The sample size of 100 m 2 was defined empirically based on the burial mound's minimum size. The sampling plan contained two labels ("burial mound", "not burial mound") that were cross-checked with on-site verification and archaeological reference data from the French Regional Archaeological Service. Of the 50 samples, 12% were "burial mound" and 88% were "not burial mound". of Carnac (1 km 2 ), included 50 data samples (Figure 9). Each sample was a 100 m 2 square (1600 pixels at 0.25 m resolution) whose location was selected using the Latin hypercube sampling method to ensure adequate representation of variability in the data [31]. The sample size of 100 m 2 was defined empirically based on the burial mound's minimum size. The sampling plan contained two labels ("burial mound", "not burial mound") that were cross-checked with on-site verification and archaeological reference data from the French Regional Archaeological Service. Of the 50 samples, 12% were "burial mound" and 88% were "not burial mound".  A total of 80,000 pixel signatures were extracted from a stacked raster layer of three bands ( , , ). The signatures and associated labels ("burial mound", "not burial mound") were randomly divided into two parts to train and evaluate the model (70% training data, 30% evaluation data). The training data were used to construct a set of 120 decision trees as a model for binary classification, as well as for generating a probability map. This map indicates the probability (from 0 to 1) of the presence of topographic elements with characteristics similar to those of the Neolithic burial mounds (0.5 is the value that separates "burial mound" from "not burial mound"). of Carnac (1 km 2 ), included 50 data samples (Figure 9). Each sample was a 100 m 2 square (1600 pixels at 0.25 m resolution) whose location was selected using the Latin hypercube sampling method to ensure adequate representation of variability in the data [31]. The sample size of 100 m 2 was defined empirically based on the burial mound's minimum size. The sampling plan contained two labels ("burial mound", "not burial mound") that were cross-checked with on-site verification and archaeological reference data from the French Regional Archaeological Service. Of the 50 samples, 12% were "burial mound" and 88% were "not burial mound".  A total of 80,000 pixel signatures were extracted from a stacked raster layer of three bands ( , , ). The signatures and associated labels ("burial mound", "not burial mound") were randomly divided into two parts to train and evaluate the model (70% training data, 30% evaluation data). The training data were used to construct a set of 120 decision trees as a model for binary classification, as well as for generating a probability map. This map indicates the probability (from 0 to 1) of the presence of topographic elements with characteristics similar to those of the Neolithic burial mounds (0.5 is the value that separates "burial mound" from "not burial mound"). A total of 80,000 pixel signatures were extracted from a stacked raster layer of three bands (maximumDEV micro , maximumDEV meso , maximumDEV macro ). The signatures and associated labels ("burial mound", "not burial mound") were randomly divided into two parts to train and evaluate the model (70% training data, 30% evaluation data). The training data were used to construct a set of 120 decision trees as a model for binary classification, as well as for generating a probability map. This map indicates the probability (from 0 to 1) of the presence of topographic elements with characteristics similar to those of the Neolithic burial mounds (0.5 is the value that separates "burial mound" from "not burial mound").
The model was evaluated using the validation dataset. A confusion matrix including "burial mound" and "not burial mound" classes was generated for accuracy assessment and for calculating performance metrics such as Cohen's kappa coefficient [32] and precision and recall [33].
The model was then generalised to other regions in the study area to identify pixels considered as potential Neolithic burial mounds without prior knowledge of the area.

Training the Model to the Kerlescan Sub-Area
The known Neolithic funeral structures in the Kerlescan sub-area are clearly visible on its MSTP image ( Figure 10). This is confirmed by statistical analysis of the samples ("burial mound" vs. "not burial mound") ( Figure 11). The signature of burial mounds from the three scales is singular and clearly separated from other terrain positions, particularly at meso and macro scales. The model was evaluated using the validation dataset. A confusion matrix including "burial mound" and "not burial mound" classes was generated for accuracy assessment and for calculating performance metrics such as Cohen's kappa coefficient [32] and precision and recall [33].
The model was then generalised to other regions in the study area to identify pixels considered as potential Neolithic burial mounds without prior knowledge of the area.

Training the Model to the Kerlescan Sub-Area
The known Neolithic funeral structures in the Kerlescan sub-area are clearly visible on its MSTP image ( Figure 10). This is confirmed by statistical analysis of the samples ("burial mound" vs. "not burial mound") ( Figure 11). The signature of burial mounds from the three scales is singular and clearly separated from other terrain positions, particularly at meso and macro scales. The statistical results ( Figure 11) confirm that these monuments have a particularly dominating position within the neighbouring area at meso and macro scales (the topographical context of the structure), but no significant difference in signature at the micro scale (micro-topography within the structure itself).
Based on the ability to separate the two classes, as shown on the confusion matrix (Table 3), the RF model trained on the Kerlescan sub-area performed well, with a Cohen's kappa coefficient of 0.98 and precision and recall metrics of 0.98 for the "burial mound" class. The statistical results ( Figure 11) confirm that these monuments have a particularly dominating position within the neighbouring area at meso and macro scales (the topographical context of the structure), but no significant difference in signature at the micro scale (micro-topography within the structure itself).
Based on the ability to separate the two classes, as shown on the confusion matrix (Table 3), the RF model trained on the Kerlescan sub-area performed well, with a Cohen's kappa coefficient of 0.98 and precision and recall metrics of 0.98 for the "burial mound" class. Figure 11. Boxplots of the maximum topographic deviation of pixels "burial mound" and "not burial mound" sample areas at micro, meso, and macro scales. A total of 80,000 pixel signatures (1600 pixels × 50 samples) are used here. The class balance is 12% ("burial mound"), 88% ("not burial mound"). Table 3. Confusion matrix of the Random Forest classification, with true positive (tp), false negative (fn), and false positive (fp) values for the class "burial mound".

Predicted "Not Burial Mound" Predicted "Burial Mound"
Actual "not burial mound" 22,126 41 (fp) Actual "burial mound" 46 (fn) 2952 (tp) The results for Kerlescan are presented on the probability map ( Figure 12) that shows the detected position of know Neolithic burial mounds and reflects the strong separation between the two classes, with only a few pixels (<1%) ranging from 0.3 to 0.7.  . Boxplots of the maximum topographic deviation of pixels "burial mound" and "not burial mound" sample areas at micro, meso, and macro scales. A total of 80,000 pixel signatures (1600 pixels × 50 samples) are used here. The class balance is 12% ("burial mound"), 88% ("not burial mound"). Table 3. Confusion matrix of the Random Forest classification, with true positive (tp), false negative (fn), and false positive (fp) values for the class "burial mound".

Predicted "Not Burial Mound" Predicted "Burial Mound"
Actual "not burial mound" 22,126 41 (fp) Actual "burial mound" 46 (fn) 2952 (tp) The results for Kerlescan are presented on the probability map ( Figure 12) that shows the detected position of know Neolithic burial mounds and reflects the strong separation between the two classes, with only a few pixels (<1%) ranging from 0.3 to 0.7.
Remote Sens. 2018, 10, x FOR PEER REVIEW 12 of 19 Figure 11. Boxplots of the maximum topographic deviation of pixels "burial mound" and "not burial mound" sample areas at micro, meso, and macro scales. A total of 80,000 pixel signatures (1600 pixels × 50 samples) are used here. The class balance is 12% ("burial mound"), 88% ("not burial mound"). The results for Kerlescan are presented on the probability map ( Figure 12) that shows the detected position of know Neolithic burial mounds and reflects the strong separation between the two classes, with only a few pixels (<1%) ranging from 0.3 to 0.7.

Applying the Model to the Lann Granvillarec Sub-Area
MSTP analysis was applied to the Lann Granvillarec sub-area. As for Kerlescan, the MSTP ( Figure 13) image visually provides topographic information that was not visible or barely visible on the hillshaded DTM. Likewise, the funeral structures of Lann Granvillarec have a strong response at meso and macro scales (indicated by their bright yellow colour). This is also true for the most levelled monuments, which have subtle topographic prominence, such as Lann Granvillarec 3.

Applying the Model to the Lann Granvillarec Sub-Area
MSTP analysis was applied to the Lann Granvillarec sub-area. As for Kerlescan, the MSTP ( Figure 13) image visually provides topographic information that was not visible or barely visible on the hillshaded DTM. Likewise, the funeral structures of Lann Granvillarec have a strong response at meso and macro scales (indicated by their bright yellow colour). This is also true for the most levelled monuments, which have subtle topographic prominence, such as Lann Granvillarec 3. The RF model trained on Kerlescan was applied to Lann Granvillarec, generating a probability map ( Figure 14) without prior knowledge of the latter. Of the six Neolithic funeral structures in Lann Granvillarec known to the French Regional Archaeological Service (Drac/SRA), the model identified five with a probability higher than 0.9. The site that was not detected (Lann Granvillarec 5) was a nearly levelled barrow that was accidentally destroyed by recent house construction. Although it has completely disappeared, it is still recorded in the archaeological reference database of the Drac/SRA.
Other areas in the map show high probability values. Three are false positives, corresponding to recent anthropomorphic modifications of the terrain by construction in the past 30 years. One particular zone of high probability lies in the north-west section of Lann Granvillarec and has a morphology and orientation common to burial mounds ( Figure 15). On-site verification by the Drac/SRA confirmed it as a previously unknown Neolithic funeral structure. It is approximately 70 m long, 30 m wide, 2 m high, and covered by deep maritime pine forest. Its centre shows signs of two orthogonal excavations (visible on the LiDAR), which could indicate that it was once partially excavated, either by unauthorised activities (looting) or scientific operations whose archives have been lost to history. Declared as a new archaeological entity by the Drac/SRA, the monument (named "Tumulus of Coët Cougam") is now recorded in the archaeological reference database and thus protected by administrative regulations of the Drac/SRA. The RF model trained on Kerlescan was applied to Lann Granvillarec, generating a probability map ( Figure 14) without prior knowledge of the latter. Of the six Neolithic funeral structures in Lann Granvillarec known to the French Regional Archaeological Service (Drac/SRA), the model identified five with a probability higher than 0.9. The site that was not detected (Lann Granvillarec 5) was a nearly levelled barrow that was accidentally destroyed by recent house construction. Although it has completely disappeared, it is still recorded in the archaeological reference database of the Drac/SRA.
Other areas in the map show high probability values. Three are false positives, corresponding to recent anthropomorphic modifications of the terrain by construction in the past 30 years. One particular zone of high probability lies in the north-west section of Lann Granvillarec and has a morphology and orientation common to burial mounds ( Figure 15). On-site verification by the Drac/SRA confirmed it as a previously unknown Neolithic funeral structure. It is approximately 70 m long, 30 m wide, 2 m high, and covered by deep maritime pine forest. Its centre shows signs of two orthogonal excavations (visible on the LiDAR), which could indicate that it was once partially excavated, either by unauthorised activities (looting) or scientific operations whose archives have been lost to history. Declared as a new archaeological entity by the Drac/SRA, the monument (named "Tumulus of Coët Cougam") is now recorded in the archaeological reference database and thus protected by administrative regulations of the Drac/SRA.

Comparing MSTP and Other VTs of the Burial Mound of Kerlescan
The Tertre of Kerlescan, covered by vegetation, has been levelled mainly by time and anthropomorphic activities (agriculture/ploughing, archaeological excavations). Comparing VTs commonly used in archaeological contexts to the MSTP image (Figure 16), the latter highlights the topographical context and provides morphological information that was not visible or barely visible using standard VTs.

Comparing MSTP and Other VTs of the Burial Mound of Kerlescan
The Tertre of Kerlescan, covered by vegetation, has been levelled mainly by time and anthropomorphic activities (agriculture/ploughing, archaeological excavations). Comparing VTs commonly used in archaeological contexts to the MSTP image (Figure 16), the latter highlights the topographical context and provides morphological information that was not visible or barely visible using standard VTs.

Comparing MSTP and Other VTs of the Burial Mound of Kerlescan
The Tertre of Kerlescan, covered by vegetation, has been levelled mainly by time and anthropomorphic activities (agriculture/ploughing, archaeological excavations). Comparing VTs commonly used in archaeological contexts to the MSTP image (Figure 16), the latter highlights the topographical context and provides morphological information that was not visible or barely visible using standard VTs.

Advantages and Limitations of MSTP
The study has shown that the MSTP approach performs well for visual interpretation and semiautomatic feature detection. As it focuses on the topographical context of structures rather than the structures themselves, it seems particularly appropriate for the heterogeneous character of Neolithic barrows.
In addition to semi-automatic detection, the visual interpretation offered by MSTP images complements the common relief VTs developed for archaeology. The latter are usually based on fixed scales of analysis (e.g., using a single predefined radius to calculate relief metrics), while MSTP images show topographic details from local to broad perspectives due to their integral image transformation (an effective multi-scale approach). Even when MSTP images are not used alone, they can provide useful archaeological evidence, as demonstrated in Lann Granvillarec.
Despite their advantages for visualisation and interpretation, MSTP images, like most VTs, do not provide topographic measurements. Pixel values in an RGB array are not elevations or slopes. A DTM is still required to comprehensively explore the topographical context. In addition, DEV values (stored as positive and negative floating numbers) must be converted into absolute values before they can be scaled to 0-254 and used as red, green, and blue bands. This makes it impossible to distinguish bumps from holes based only on MSTP colour. Although this limitation can be avoided by blending other layers (Figure 17), it adds another processing step to the workflow.

Advantages and Limitations of MSTP
The study has shown that the MSTP approach performs well for visual interpretation and semi-automatic feature detection. As it focuses on the topographical context of structures rather than the structures themselves, it seems particularly appropriate for the heterogeneous character of Neolithic barrows.
In addition to semi-automatic detection, the visual interpretation offered by MSTP images complements the common relief VTs developed for archaeology. The latter are usually based on fixed scales of analysis (e.g., using a single predefined radius to calculate relief metrics), while MSTP images show topographic details from local to broad perspectives due to their integral image transformation (an effective multi-scale approach). Even when MSTP images are not used alone, they can provide useful archaeological evidence, as demonstrated in Lann Granvillarec.
Despite their advantages for visualisation and interpretation, MSTP images, like most VTs, do not provide topographic measurements. Pixel values in an RGB array are not elevations or slopes. A DTM is still required to comprehensively explore the topographical context. In addition, DEV values (stored as positive and negative floating numbers) must be converted into absolute values before they can be scaled to 0-254 and used as red, green, and blue bands. This makes it impossible to distinguish bumps from holes based only on MSTP colour. Although this limitation can be avoided by blending other layers (Figure 17), it adds another processing step to the workflow.

RF Classifier and Futur Perspectives
The RF technique, which has been used in remote sensing since the early 2000s, was chosen as the classification algorithm for this study, for its simplicity of implementation, robustness, and its capacity to predict probabilities [23,34,35]. The challenge we faced, which was to detect the features of the Neolithic burial mounds, once performed through the multiscale topographic analysis, appeared to be fairly simple. As shown in the results section, the monuments have a particularly dominating position within the neighbouring area at meso and macro scales (i.e., the topographical context of the structure). The class separability is so high that the RF classifier expectedly performed well with a Cohen's kappa coefficient of 0.98 and a precision and recall metrics of 0.98 for the "burial mound" class.
Recently developed Deep Learning (DL) techniques, such as the convolutional neural network (CNN), have shown to be more effective than other classifiers for solving complex image processing tasks [36,37]. However, they require high-performance computing and expensive hyperparameter searches, tuning, and testing [37]. Our objective was to develop a straightforward, yet efficient methodology combining visualisation techniques and supervised feature detection, easily reproducible by archaeologists. Considering the performance of the RF classifier in this specific study, and therefore the low potential of improvement of the classification accuracy, we did not evaluate the CNN's performance. Nevertheless, in a broader context (different archaeological structures and/or areas with different environmental conditions), the use of CNN could be considered and implemented for evaluation and comparison to RF. For example, future work could focus on the use of a deep convolutional autoencoder (DCAE). Such a strategy, successfully applied for synthetic aperture radar (SAR) image classification [38], could possibly be adapted to automatically extract features that represent morphological or topographic anomalies from LiDAR-based elevation data.

Conclusions
LiDAR data are widely used for archaeological prospection; however, the use of MSTP analysis has not been explored for archaeological applications. The preliminary results presented in this article have shown that this approach combined with machine learning techniques can provide useful results and possibilities: • A pertinent and innovative VT for using a LiDAR-derived DTM in archaeology • The ability to help archaeologists improve descriptions (geographic location and extent, morphology) of inventoried structures

RF Classifier and Futur Perspectives
The RF technique, which has been used in remote sensing since the early 2000s, was chosen as the classification algorithm for this study, for its simplicity of implementation, robustness, and its capacity to predict probabilities [23,34,35]. The challenge we faced, which was to detect the features of the Neolithic burial mounds, once performed through the multiscale topographic analysis, appeared to be fairly simple. As shown in the results section, the monuments have a particularly dominating position within the neighbouring area at meso and macro scales (i.e., the topographical context of the structure). The class separability is so high that the RF classifier expectedly performed well with a Cohen's kappa coefficient of 0.98 and a precision and recall metrics of 0.98 for the "burial mound" class.
Recently developed Deep Learning (DL) techniques, such as the convolutional neural network (CNN), have shown to be more effective than other classifiers for solving complex image processing tasks [36,37]. However, they require high-performance computing and expensive hyperparameter searches, tuning, and testing [37]. Our objective was to develop a straightforward, yet efficient methodology combining visualisation techniques and supervised feature detection, easily reproducible by archaeologists. Considering the performance of the RF classifier in this specific study, and therefore the low potential of improvement of the classification accuracy, we did not evaluate the CNN's performance. Nevertheless, in a broader context (different archaeological structures and/or areas with different environmental conditions), the use of CNN could be considered and implemented for evaluation and comparison to RF. For example, future work could focus on the use of a deep convolutional autoencoder (DCAE). Such a strategy, successfully applied for synthetic aperture radar (SAR) image classification [38], could possibly be adapted to automatically extract features that represent morphological or topographic anomalies from LiDAR-based elevation data.

Conclusions
LiDAR data are widely used for archaeological prospection; however, the use of MSTP analysis has not been explored for archaeological applications. The preliminary results presented in this article have shown that this approach combined with machine learning techniques can provide useful results and possibilities: • A pertinent and innovative VT for using a LiDAR-derived DTM in archaeology • The ability to help archaeologists improve descriptions (geographic location and extent, morphology) of inventoried structures • A support for archaeological prospection (through visualisation/interpretation of MSTP images) • A step towards semi-automatic detection by incorporating machine learning (RF) algorithms Although the method was developed on two test areas, the tools and processing workflow are sufficiently operational to study the entire LiDAR dataset covering the Carnac area. Additional terrain surveys are required to completely evaluate the method developed, which may require adjusting the process and refining the model using additional ground-checked samples.
The site-as-points current database offers variable accuracy over the entire "UNESCO area". Most site positions actually correspond to land parcel centroids. The use of LiDAR and the proposed method provides a way to improve the documentation of the sites. Beyond the UNESCO project, this study was carried out within the French Regional Archaeological Service (Drac/SRA), whose objective is to continuously enrich the inventory of archaeological sites to ensure the protection of this heritage.
Although this study focused on archaeological remains in Morbihan, the method developed will be explored in the future to extend the scope of application to a wider range of structures and a variety of archaeological contexts. Additionally, state-of-the-art classification techniques such as Deep Learning (DL), i.e., convolutional neural networks (CNNs), will be evaluated against the Random Forest classifier. Indeed, CNN frameworks like Caffe, Computational Network Toolkit (CNTK), and Torch could be employed at the pixel level via sliding window scanning to produce a confidence map similar to the one we generated for this study.