GEOBIA at the Terapixel Scale: Toward Eﬀicient Mapping of Small Woody Features from Heterogeneous VHR Scenes

: Land cover mapping has beneﬁted a lot from the introduction of the Geographic Object-Based Image Analysis (GEOBIA) paradigm, that allowed to move from a pixelwise analysis to a processing of elements with richer semantic content, namely objects or regions. However, this paradigm requires to deﬁne an appropriate scale, that can be challenging in a large-area study where a wide range of landscapes can be observed. We propose here to conduct the multiscale analysis based on hierarchical representations, from which features known as differential attribute proﬁles are derived over each single pixel. Efﬁcient and scalable algorithms for construction and analysis of such representations, together with an optimized usage of the random forest classiﬁer, provide us with a semi-supervised framework in which a user can drive mapping of elements such as Small Woody Features at a very large area. Indeed, the proposed open-source methodology has been successfully used to derive a part of the High Resolution Layers (HRL) product of the Copernicus Land Monitoring service, thus showing how the GEOBIA framework can be used in a big data scenario made of more than 38,000 Very High Resolution (VHR) satellite images representing more than 120 TB of data.


Introduction
While the GEOBIA paradigm has led to significant improvements in the analysis and understanding of high resolution remote sensing images thanks to the processing of objects (i.e., regions) instead of pixels [1], it still requires to identify the objects (or segment the image into regions) before applying sets of rules for classifying the extracted objects.This segmentation step is not straightforward and often relies on user expertise or empirical tuning to be adapted to each new scene to be processed, even if some automated approaches exists [2,3].Thus, it cannot be used for Big GeoData where large-area analyses require methods that are both very efficient and robust (applicable on different contexts) to the wide variety of scenes to be observed.
We address here these multiple issues by relying, on a multiscale image representation framework.This framework embeds the different (nested) objects in a structure called a morphological tree, with no need of parameter tuning.Computation of such a stack of segmentations benefits from some recent scalable implementations that make realistic their very fast extraction from image datasets covering large areas (>1 million km 2 ) [4,5].In order to avoid confusion in the term "tree" in this document.We will use SWF (Small Woody Features) for vegetation, decision trees for Random Forest and trees for hierarchical representation (or tree structure).Once the tree structure has been extracted, further image analysis is conducted at a very low computational cost, relying on efficient implementation of Differential Attribute Profiles (DAP), that are state-of-the-art engineered features for land cover mapping [6].We benefit from the efficiency of the different steps (tree construction, feature extraction, training, prediction) to propose a semi-supervised strategy [7] where we retrain the model for each kind of landscape, thus allowing to tackle the great variety in appearances of objects at a very large-area (i.e., VHR imagery at Pan-European scale).Due to the low computational cost (e.g., a few minutes for a Pleiades or WorldView-2/3 scene), a user can then interactively improve the classification by updating the reference samples used for training the model.The proposed scalable solution fully relies on open source components (Orfeo ToolBox (https://www.orfeo-toolbox.org)[8], Boost (https://www.boost.org)[9], GDAL (https://www.gdal.org)[10], Shark (http://image.diku.dk/shark/) [11], Triskele (https://sourcesup.renater.fr/triskele)OTB remote module) and so can be used in any GEOBIA application.
To illustrate our methodology, we consider here the mapping of Small Woody Features (SWF), that is to be included as part of a new High-Resolution Layer (HRL (https://land.copernicus.eu/paneuropean/high-resolution-layers))covering the whole of Europe from Iceland to Turkey within the Copernicus Pan-European component of the land monitoring service.Small Woody Features (SWF) represent some of the most stable vegetated linear and small landscape features providing numerous ecological and socio-cultural functions related to soil and water conservation, climate protection and adaptation, biological diversity and cultural identity [12][13][14].Although a single linear feature cannot ensure all these functions on its own, SWF are ecologically significant, structural landscape elements that act as important vectors of biodiversity and provide vital habitats and ecosystem services.Hedgerows and tree groups are linked to landscape richness and fragmentation of habitats with a direct potential for restoration while also contributing to hazards protection and green infrastructure, amongst others.The specific ecological importance of SWF underpins the need for reliable, detailed geospatial information on the occurrence and distribution of linear landscape features.SWF are an elementary part of a landscape's green infrastructure and are therefore addressed through a range of policies and directives like EU's 2020 Biodiversity Strategy [15] and specifically its Target 2 with regard to ecosystem maintenance, restoration and the establishment of a green infrastructure, clearly expresses the requirement for systematic monitoring of such features being crucial for ecosystem condition and the delivery of ecosystem services.
Extracting these objects over such a large area (almost 6 million km 2 ) from VHR imagery brings numerous challenges: large amount of data (greater than 120 TB), large number of individual image scenes (greater than 38,000), diversity of the European landscapes, and need to process these data in a timely manner whilst ensuring a satisfactory degree of accuracy.Existing tools to automatically map the SWF partially overcome the above challenges, but does not have the capability to tackle Big GeoData.Similar to our proposed method, the existing methods rely on spatial features to the model diverse nature of the SWF in addition to spectral features, but our method models the spatial features in a different and more efficient manner.For example, in [16] linear intersect sampling based strategies were used to detect linear features, in [17] linearity feature based on image gradient, and cooccurrence based features were extracted, in [18,19] object-based features were extracted through the image segmentation with different parameters, in [20] Gabor and granulometry features were extracted, and then morphological operators were used, and lastly in [21,22] directional morphological features with different structuring elements were extracted.It is well known that these features do not generalize and cannot be efficiently computed while such requirements are needed for large-scale EO.On the other hand, our proposed method based on the attribute profiles extracts better features, and also can meet requirements of Big GeoData.In addition , existing methods require both foreground and background reference samples to train a supervised classifier, but in practical scenarios, we often only have the foreground objects through the auxiliary maps.To handle this situation where training samples of only one class are available, we can consider a one-class classifier.However, it was shown that such an approach was not mostly reliable [23] in mapping applications.Thus, in our proposed methodology, we devised a simple semi-supervised strategy that automatically selects the diverse background objects in relation with foreground objects, and also extends the reference samples of the foreground object through user interaction.The method introduced here has then major advantages over the existing methods for large area processing.
This paper is organized as follows.We review the proposed methodology in Section 2. The thematic application on SWF mapping is addressed in Section 3, where quantitative results as well as discussion are also provided.Finally we conclude the paper in Section 4.

Method
In this section, we present the proposed methodology with a focus on the overall architecture, before describing in more details the two main components that are feature extraction based on attribute profiles and semi-supervised classification based on random forest.

Overall Architecture
To make the GEOBIA paradigm compliant with very large-area analysis, we propose here to perform a pixelwise analysis of object-based features, in a semi-supervised classification framework instead of the standard application of GEOBIA rulesets over pre-extracted objects.The overall methodology is given in Figure 1.In a first step, the input VHR image is enriched with the computation of some predefined indices to derive new image channels.Among the considered measures, we rely on the well-known and low-cost NDVI (Normalized Difference Vegetation Index) as well as the Sobel gradient for texture characterization ("add and select bands" stage).Indeed, while Haralick features are popular to describe texture, and have been recently coupled with the attribute profile framework [24] to improve characterization of satellite textured images, they rely on a gray level co-occurrence matrix that needs to be computed for multiple distances and orientations ("descriptor generator" stage).The texture information is extracted from each spectral band of the original multispectral image as well as from the NDVI image, thus leading to the production of several Sobel images .Furthermore, we compute single-scale textural features with efficient algorithms based on integral image representations [25], such as Haar-like features and local statistics (mean, standard deviation, entropy).
This first step also allows to derive a binary mask ("border map") that will be useful to discard the pixels flagged as no-data in subsequent processing steps.Such values are given to non-significant pixels after standard pre-processing steps such as ortho-rectification and cloud masking.In our case, they have been assigned a value of zero in all bands.
All bands are used to determine no-data areassince, as already stated, no-data pixels are identified as those having null values in all bands.But we did not need all bands to predict classification.For instance, one can use only NDVI for a woody feature analysis.According to each analysis, some bands are selected.
From the set of selected bands, we then build multiscale representations through the model of morphological trees from which we derive multiscale features called attribute profiles ("AP/DAP generator" stage).Such trees can be seen as a stack of nested segmentations and thus as a generalization of the concept of monoscale segmentation layer in GEOBIA tools.For each hierarchical representation, we measure some attributes (e.g., area, weight, . . . ) for all components, i.e., objects appearing at different scales for a given pixel.These features are thus assigned to each pixel.Computation of trees and attribute profiles is described in more details in Section 2.2.
The next step is the use of the random forest classifier in a semi-supervised framework ("learning" and "prediction" stage).The advantage of using a supervised classifier over predefined rulesets is the ability to adapt to a wide range of landscapes without explicit definition of the properties of mapped objects.However, it also requires labeled samples that describe the sought class and the background.Since labeling a full very high-resolution image is tremendous, we rather introduce a semi-supervised strategy where the training samples are generated by extending the initial sets provided by the user.Given the low computation time of the classification process over the labeled samples, the user can then easily improve the quality of the training set by providing new samples (foreground and background).Once the model is accurate enough, the final prediction is performed.More details are given in Section 2.3.
The overall process is very efficient due to a high-level of parallelism in the different steps.The reader interested in the algorithmic details and the optimization of the overall pipeline is referred to a dedicated paper [7].

Feature Extraction
The GEOBIA paradigm is usually based on some features that are extracted from each single object or region in a segmentation map.Such features describe the object properties such as its shape, spectral, textural content, etc.We propose here to rely on attribute profiles that have been very popular image features in remote sensing [6,26,27].The main difference with the standard GEOBIA workflow consists in the fact that the attribute profiles are measured over each single pixel.One can wonder how such a pixelwise analysis can be compliant with the object-based paradigm.Indeed, while being computed over each single pixel, these features are made from the properties of the objects the pixels belong to in the different segmentations.Feature extraction with attribute profiles can thus be seen as a strategy to derive object-based features in each pixel.It provides a generic framework that allows robust features to be extracted in a very efficient way from input images through their modeling into tree-based representations.
There exist different morphological tree models, and we focus here on the inclusion trees, namely min-tree, max-tree, and tree of shapes.These models describe the level sets of an input image, and the nested segmentations are partial (i.e., for a given scale there could be some parts of an image that do not correspond to any segmented region).A min-tree will highlight the local minima that will correspond to the leaves of the tree.Conversely, a max-tree emphasizes the local maxima in its leaves.These dual representations can be replaced by a self-dual model called the tree of shapes that contain extrema in its leaves.In all cases, the root of the tree is made of the whole image.While we could have also considered partition trees (that would include the standard multiresolution segmentation used in the GEOBIA framework), our choice is motivated by the fact that the features extracted from the tree (attribute profiles) have been extensively built from inclusion trees, and their computation from partition trees remains challenging [28].A recent comprehensive survey on the various tree-based representations is given in [29].
To explain our choice, we provide below a comparison of the different approaches available to analyze an image (see Figure 2).

•
The pixel approach only considers pixels lonely.It directly uses the pixel value for classification before grouping pixels of the same class to define regions.

•
The object approach considers first relations among neighbours pixels to determine a classification.
The class assigned to a pixel also depends on the value of its neighborhood.This step reveals the sets of connected pixels considered as objects.Neighboring objects are merged to produce homogeneous regions.

•
The multiscale approach enhanced the previous one by using a tree structure.The process is composed by: -hierarchy construction.As in the object approach, pixels are considered with their neighbors.The same process is applied to connect objects and to form larger objects.After some iterations all objects are connected to each other and refer the whole image.-pruning projection.This stage selected nodes according to some thresholds (see below).
Using different thresholds lead to different cuts of the tree and thus different image partitions.-feature vector computation.At this stage, all pixels are describe by a feature vector built from the values they were assigned after the different cuts.
A supervised classifier (here Random Forest) is then used on each feature vector to classify all pixels.The results are gathered to obtain the final semantic segmentation or pixelwise classification.The apparent complexity of this last approach is actually not really affecting the computation time.Indeed, efficient algorithms can be employed to provide results of significant quality.
Figure 3 gives more details of the multiscale approach and the link between the hierarchical representation and the computation of attribute profiles.The left side shows objects associated to nodes in the inclusion tree.The objects have a surface (size) and a vegetation indicator (NDVI).We compute attributes for each node (size, mean, localization, . . .).The dashed lines represent where the tree must be pruned according to some thresholds (t1, t2, t3) related to one attribute type (e.g., size).This pruning step leads to projections shown on the middle of the figure.At the root of the tree, the projection corresponds to the whole image.Decreasing the threshold gives a projection where segments are split.For each of the sample pixels (p, q, r, s), we associate a list of node values in the inclusion tree.These values depend on the segments to which the nodes belong within the path from the leaf to the root of the tree.One can assume that the size thresholds could correspond to typical size of plant, SWF, and wood respectively.The list [vegetation, other, other]   So more formally, once a tree is built, the computation of attribute profiles [6] is as follows.First some attributes are measured within each node.These attributes describe the shape, the heterogeneity or any other property of the underlying objects.They can be increasing (such as the area or size of the region) from the leaves to the root, or non-increasing (such as the standard deviation of pixel values, or moment of inertia that characterizes the region shape).A set of thresholds is then defined to filter the tree and to retain selected nodes that have attribute values corresponding to the thresholds.This step called filtering aims to prune the initial tree and to retain a very small subset of nodes.Each pixel belongs to a few of them (actually at most the number of thresholds), and can be characterized by their properties.
In the standard attribute profile framework, this characterization is simply done by considering the gray values of the nodes assuming each spectral band of the original multispectral image is considered independently.While there have been recent extensions to tackle multispectral images [30], or to extract richer features [31,32], we focus here on the original framework that comes with a lower computational cost.Since the filtering may lead to similar images between two successive thresholds, it is often relevant to rely on differential attribute profiles instead of standard attribute profiles.The differential representation is built by computing the difference between two successive values in the filtered tree.
In our scenario, and in order to limit the computational cost, we rely on some efficient implementations of the tree construction and attribute computation steps.Such algorithms have been described in [7] and made available as an open-source library called Triskele (https://sourcesup.renater.fr/triskele) that can be used as a remote module in OTB.Furthermore, during a tuning stage, we can use random forest scoring to determine among the different attributes those that actually contribute to the prediction process.In that case, we limit the computation of full attributes to the subset of pixels relevant in the learning phase.And compute required attributes attributs for all pixels.

Semi-Supervised Classification
Conversely to the standard GEOBIA methodology that makes use of predefined rulesets to be applied on the objects extracted from a prior segmentation, we rather rely here on a supervised classifier.This choice is motivated by the fact that it is not easy to define an appropriate ruleset for the sought objects, given the context of a very large area study where the objects appearance might vary a lot from one landscape to the other.
Among the different supervised classifiers available, we have decided to rely on random forest that has shown great success in remote sensing [33].The random forest is an ensemble method [34] that combines multiple decision trees to increase the robustness of the overall classification process.Each decision tree will operate on a subset of the training samples, with a subset of the available features, and can be used to derive a prediction from an input instance.The set of individual predictions are then gathered through a majority voting procedure.
The random forest classifier is known to be easy to tune with only a few parameters to be set, namely the number of trees in the forest and the number of random variables used in each tree (usually set as the square root of the feature vector length).The number of trees usually depends on the number of classes, the complexity of the feature space, and the possible memory/computation requirements.In order to speed-up the process, we opt for a lower number of trees, i.e., 100.Furthermore, random forest comes with parallel, scalable, open-source implementations such as the Shark library (http://image.diku.dk/shark).We use here this library, that has been also recently embedded into the OTB framework (https://www.orfeo-toolbox.org).
Another advantage of the random forest classifier is its ability to measure the importance of the different features.Indeed, it is possible to assess the role of each individual feature in the ensemble method (in other words, how many times it has been used to derive the prediction).We thus allow the user to select the appropriate features when dynamically adapting its training set.With a lower number of more discriminant features, we expect to both achieve higher accuracy and lower computational cost.
As already indicated, we consider here the random forest classifier in a semi-supervised framework.As mentioned earlier for the SWF mapping, we have only access to the labeled samples provided by the user to characterize foreground (class of interest, i.e., SWF), and no reference samples are available for the background (other classes).To handle this, from the labeled foreground samples and the randomly selected samples for the background (that might include also the foreground classes), the automatic strategy is devised to select the diverse samples for the background classes.The selection of background is based on the distance function to the mean of the foreground class, and the samples which are closer to the foreground mean are removed from the original background set.Thus, now we have foreground and initial background reference samples to train the random forest classifier.Here, we only consider a random subset (defined by f g_rate and bg_rate parameters) of training samples to limit the computation time.
In order to alleviate the negative effect brought by errors in the training set (that cannot be considered perfect), we also allow to discard positive samples that led to a low confidence score.Besides, the background set has to be heterogeneous to adequately represent all classes in the scene but the sought one.While the user can provide such samples, it barely corresponds to all background classes.Thus, we allow for automatic refinement of background pixels among those that the random forest classifier assigns to the sought class with a very low confidence.Let us note that this strategy uses only the part of the reference dataset involved in training, and that final accuracy assessment will be conducted on the complementary validation set in Section 3.2.
Thanks to the low computational time of the random forest learning step, it is possible for the user to judge the quality of the prediction over the training set, and to adapt the content of the set if necessary.Once the prediction model is satisfactory, it can be applied over the whole image to get the final map.

Context
In past years, remote sensing has been increasingly acknowledged to provide objective and cost-efficient approaches for mapping of small landscape elements, however, there is still no consistent inventory of SWF throughout Europe.Through initiatives such as Coordination of information on the environment (CORINE) Land Cover, and even more since the start of Copernicus and its Initial Operations with the High Resolution Layers (HRL) and the Urban Atlas, Europe has significantly improved its knowledge base on land cover/use and vegetation patterns based on EO data.While the overall landscape heterogeneity is defined by the spatial arrangement of homogeneous land cover patches, as measured by the Copernicus continental land monitoring component, its interconnections are constituted by linear structures that portray the joint role of nature and mankind in shaping the countryside [35].Both the spatial arrangement of land cover and the presence of linear structures are the two most relevant elements characterizing landscape structures [36].Geospatial information on SWF is however still lacking and only available in the form of limited national investigations mostly with a focus on farmland features [37] or other thematically focused small-scale landscape inventories, e.g., fragmentation studies such as [38].
The only quantitative information on Pan-European level is available through ground observations from the LUCAS (Land Use/Cover Area frame Statistical survey) database [39].Recent studies such as [13] have derived density maps of the spatial distribution of linear landscape elements for Europe based on spatial interpolation of LUCAS data, but the resolution of such information (1 km) is too coarse for detailed assessments and does not provide factual quantitative information on their location and extent.
As part of the Pan-European Copernicus Land Monitoring Service, the High Resolution Layers (HRLs) provide maps of multi-temporal land cover characteristics for 5 thematic areas including SWF in a consistent manner for 39 European countries (EEA39 with more than 6 million km 2 ).The SWF Pan-European map is a completely new product as part of HRLs for the 2015 reference year, which is based on extended demonstrated expertise in the production of HRLs at Pan-European level [40] and with dedicated exploratory work specifically on SWF [41].The mapping of SWF is using Very High Resolution (VHR) data as primary input with a Pan-European coverage as well as in-situ data.The VHR_IMAGE_2015 dataset made available in the ESA Copernicus Data Warehouse (DWH) is the main data source for the detection of Small Woody Features identifiable within the given image resolution (≤1 m panchromatic, 2-4 m multi-spectral).This dataset includes more than 38,000 VHR images which correspond to more than 120 TeraBytes of data.
The main difficulty when dealing with VHR images comes from the internal variability of the information for a single land-use.For instance, woody elements are represented by a high number of heterogeneous pixel values hampering usual pixel-based techniques.Nevertheless, even though object-based image analysis (OBIA) appears to be the most suitable approach to delineate SWF with VHR images, it can potentially represent some serious drawbacks related to the heterogeneous size and shape of SWF objects and the difficulty to determine suitable segmentation parameters [22].In addition, for very small objects close to the resolution of the imagery, the segmentation can lead to separate SWF or non-SWF objects to be merged together due to mixed pixel values.This makes it very challenging to define a suitable segmentation scale in different landscapes particularly if it is to be applied for the EEA39 area.Therefore, a multi-scale approach conducted both at pixel and object level is suggested to ensure the correct identification of small and irregular shaped SWF (pixel based) and larger SWF (OBIA) such as small patches of trees or scrub or larger hedges.

Experiments
A dedicated processing chain has been developed and implemented in order to process large dataset of VHR images (>38,000 scenes) to produce the SWF layer.The computing time has been measured considering a dedicated server infrastructure with high computing capabilities (Bi-CPU Xeon, 24 cores).The workflow is shortly described as follows: (1) VHR image preprocessing (radiometric and geometric corrections, cloud masking, pan-sharpening), (2) reference database preparation (extraction of SWF reference from previous reference datasets and automatic verification), (3) automatic supervised classification, (4) post-processing (vector smoothing and differentiation between linear polygons and patches).As part of the SWF map production, other post processing steps are applied by photo-interprets such as thematic manual enhancement and random manual validation.
It is noted that only for the sake of completeness of the SWF production workflow, we included the post-processing step, however we did not use this step in production of classification maps and accuracy assessment in the rest of paper.
We have presented (Section 2.2) the possibility to dynamically determine efficient attributes for random forest prediction.Such analysis provides a reduced set of of attributes (w.r.t. the full set of attributes initially available in the process).Since we are dealing with VHR images with submetric spatial resolution, we use the following set of thresholds when computing the area profile: 1000, 2500, 5000, 7500 pixels.
This paper focuses on the automatic classification step, which builds from each VHR image a map of woody and non-woody vegetation.The overall methodology to produce such a map has been provided in Section 2.

Data
While the proposed strategy has been applied at the Pan-European scale over 38,000 VHR scenes, we conduct evaluation only on a small subset for which ground truth reference data is available.More precisely, we consider an evaluation dataset covering two study sites, one in Romania (Large Region 9-LR09) and one in Germany (LR61).The evaluation dataset contains 37 VHR images in total which covers respectively about 7000 and 10,200 km 2 .The images were acquired by different satellites, Pleiades (https://www.satimagingcorp.com/satellite-sensors/pleiades-1/)and WorldView (https://www.satimagingcorp.com/satellite-sensors/worldview-2/),but all scenes include 4 spectral bands (Blue, Green, Red and Near-InfraRed) and have 2m spatial resolution.The area covered by the images varies from 192 to 1223 km 2 (ca.48 to 306 Mpixels).
As introduced before, the reference dataset is automatically built from extraction and validation of SWF elements in previous reference datasets: the Copernicus HRL Tree Cover Density Layer (TCD), the Riparian Zone and Natura 2000 dataset from the Copernicus Local components and LUCAS database.The automatic classification is then applied using 70% of the reference sample dataset for the training and the remaining 30% are used for the accuracy assessment in order to derive quantitative evaluation measures.

Comparison with Other Classification Methods
First, we present a comparison of classification method results over one site in Romania (id c55a).For this comparison, we used state-of-the-art classification methods with implementations from the Orfeo toolbox (https://www.orfeo-toolbox.org/CookBook/Applications/app_TrainImagesClassifier.html) for pixel-wise methods and eCognition solution (http://www.ecognition.com/)for the object based approach.Automatic and basic tuning is used for each method.We compare these results with our approach using computation time (with training and prediction steps for pixel-wise approach and segmentation/tree construction for GEOBIA method) and classification accuracies for the SWF class.All these classifications are made with the 4 spectral bands and the NDVI synthetic product as an input.For eCognition solution, a segmentation is first computed before a Random Forest classification is performed.Our approach is summarized in Figure 1.
Results are presented in Table 1.Our approach is more accurate than all the pixel-wise methods and has a better producer accuracy than E-cognition c solution.In terms of computation time, our approach is faster than the other object based approach.

Production Results
Tables 2 and 3 show the classification results for each analyzed VHR image in terms of processing time and classification accuracy.Whereas the processing time for classifying a Pleiades image (feature extraction, training + classification) on the Romanian site is around 5 minutes, this time is double for the Worldview images on the German site which cover around twice the Pleaides area (481 versus 867 km 2 ).Given such reasonable processing times, a Large Region of about 40,000 km 2 can be processed within one day considering an image overlap of about 50%.Such a low processing time allows the methodology to be applied in a production mode at the Pan-European scale.
The classification quality is measured by the Producer Accuracy (P.A., related to the omission errors) and the User Accuracy (U.A., related to the commission errors).The number of reference pixels gives an indication of the confidence of the accuracy figure (low reference number means less confidence).
On Romanian site (Table 2), the P.A. ranges from 77 to 99% whereas U.A. from 83 to 100%.The average P.A. and U.A is respectively 89.4% and 95.7%.On German site (Table 3), the P.A. ranges from 84 to 95% whereas U.A. from 75 to 99%.The average P.A. and U.A is respectively 89.9% and 91.4%.For both sites, the classification accuracy is high (>80%) even if the accuracy figures are slightly higher for the Romanian site.This can be explained by the higher commission errors due to highly vegetated agricultural fields on the German site.Visual results are located in Figure 4 and presented over Romania and Germany in Figures 5 and 6 respectively.These results only reflect the classification outputs.Automatic post processing is then applied to obtain the deliverable SWF map.

Discussion
Our approach is used for the production of the Pan-European HRL Small Woody Features map with 38,000 VHR scenes.In progress production on larger area over Europe (nearly 800,000 km 2 with 6000 scenes) gives acceptable results with a producer accuracy of 82.8 ± 1.51% and a user accuracy of 86.3 ± 0.86%.Our approach is flexible and robust as it gives nearly constant classification accuracies over Europe.Though we have produced acceptable classification accuracies, it can be further improved.To do so, one of the limiting factor is to have access to accurately labeled foreground reference samples.However, in practice we do not have access to them, because we are relying on the previous databases and existing auxiliary maps to derive the labeled reference samples.It is known that this could introduce label noise or mislabeled samples in the training set due to several factors such as misregistration, out-dated maps and databases, etc.Thus, in a future work we would like to consider label noise robust classification methods [42,43] to improve the classification performance, and also to incorporate more valuable information and efficient techniques to further reduce the computation time while still increasing the classification accuracy.

Conclusions
In this paper, we have presented a use case where the GEOBIA methodology has been conducted at a very large scale, i.e., over more than 38,000 scenes and 120 TB.To address the wide range of landscapes encountered at the Pan-European scale, we propose to rely on multiscale image representations known as morphological trees such as min-tree, max-tree or tree of shapes.Once built, such representations allow us to efficiently derive some image features that are fed into a random forest classifier.Thanks to the low computational cost of all the individual steps of the overall process (tree construction, feature extraction, supervised learning and prediction), it is possible for the user to optimise its input (training samples) and maximise classification accuracy by rapidly assessing the results and updating the classification model if necessary.The presented methodology has been validated through quantitative accuracy assessment on two study sites in Romania and Germany, and has been applied on a very large area use case, namely the mapping of Small Woody Features for the High Resolution Layers

Figure 1 .
Figure 1.General flowchart of the proposed approach.

Figure 3 .
Figure 3. Attribute Profile features extracted from inclusion the tree.

Figure 4 .
Figure 4. Pan-European map of the 38,000 scene footprints (red boxes) used for the Copernicus High Resolution Layer (HRL) Swall Woody Features (SWF) production.Two datasets are used for the experiments: (a) the 17 Worldview scenes over the LR61 study site (Germany, 10,200 km 2 ) and (b) the 20 Pleiades scenes over the LR09 study site (Romania, 7000 km 2 ).Scene illustrations are presented in false color composition: (Near infrared, Red, Green) as RGB.

Figure 5 .
Figure 5. Illustration of classification results: (a) false color composition of Pleiades scene #c55a over LR09 study site in Romania and (b) the same false color composition with superimposition of the Small Woody Features layer classification output (green).

Figure 6 .
Figure 6.Illustration of classification results: (a) false color composition of Worldview scene #400B over LR61 study site in Germany and (b) the same false color composition with superimposition of the Small Woody Features layer classification output (green).
relates to plant; the list [vegetation, vegetation, other] indicates SWF; the list [vegetation, vegetation, vegetation] is considered as wood.

Table 1 .
Comparison of state-of-the-art classification methods on one Pleiades scene (#c55a).Computation times are presented in seconds and reflect the whole classification process with training and prediction steps for pixelwise approach, and segmentation/tree construction for GEOBIA methods.P.A. stands for Producer Accuracy and U.A. stands for User Accuracy.

Table 2 .
List of the 20 analyzed Pleiades VHR images covering the Romanian study site (LR09) with their characteristics (identifier, acquisition date, and covered area), and classification results with processing time and accuracies.P.A. stands for Producer Accuracy and U.A. for User Accuracy.Classification accuracies are presented in percentages and pixels counts.

Table 3 .
List of the 17 analysed Worldview-2/3 VHR images covering the German study site (LR61) with their characteristics (identifier, acquisition date, and covered area), and classification results with processing time and accuracies.P.A. stands for Producer Accuracy and U.A. for User Accuracy.Classification accuracies are presented in percentages and pixels counts.