Functional Analysis for Habitat Mapping in a Special Area of Conservation Using Sentinel-2 Time-Series Data

The mapping and monitoring of natural and semi-natural habitats are crucial activities and are regulated by European policies and regulations, such as the 92/43/EEC. In the Mediterranean area, which is characterized by high vegetational and environmental diversity, the mapping and monitoring of habitats are particularly difficult and often exclusively based on in situ observations. In this scenario, it is necessary to automate the generation of updated maps to support the decisions of policy makers. At present, the availability of high spatiotemporal resolution data provides new possibilities for improving the mapping and monitoring of habitats. In this work, we present a methodology that, starting from remotely sensed time-series data, generates habitat maps using supervised classification supported by Functional Data Analysis. We constructed the methodology using Sentinel-2 data in the Mediterranean Special Area of Conservation “Gola di Frasassi” (Code: IT5320003). In particular, the training set uses 308 field plots with 11 target classes (five forests, two shrubs, one grassland, one mosaic, one extensive crop, and one urban land). Starting from vegetation index time-series data, Functional Principal Component Analysis was applied to derive FPCA scores and components. In particular, in the classification stage, the FPCA scores are considered as features. The obtained results out-performed a previous map derived from photo-interpretation by domain experts. We obtained an overall accuracy of 85.58% using vegetation index time-series, topography, and lithology data. The main advantages of the proposed approach are the capability to efficiently compress high dimensional data (dense remote-sensing time series) providing results in a compact way (e.g., FPCA scores and mean seasonal time profiles) that: (i) facilitate the link between remote sensing with habitat mapping and monitoring and their ecological interpretation and (ii) could be complementary to species-based approaches in plant community ecology and phytosociology.


Introduction
Phytosociology (i.e., the Floristic-Sociologic Approach to Vegetation Classification [1]) is the most widespread method of studying and classifying vegetation in Europe [2,3]. Plant communities, called plant associations, are the most detailed discrete units recognized in phytosociology. These plant associations, recurring in space and time, are characterized by a distinct floristic composition that reflects the current and past ecological-environmental conditions (e.g., bioclimatic features, biogeography, topographic conditions, lithology, and land-uses) [4].
Plant associations and the higher levels of phytosociological vegetation classification (classes, orders, and alliances) are useful for the diagnosis of most natural and semi-natural habitats listed in Annex I of the Habitats Directive [5,6]. Plant association mapping allows for understanding of the spatial distribution of the habitats and, if repeated over time, to evaluate and monitor their conservation status. Therefore, these activities are useful for achieving the goals set out in European environmental policies (e.g., Directive 92/43/CEE) [7][8][9][10]. Due to the high heterogeneity, fragmentation, and diversity of vegetation, the mapping and monitoring of Mediterranean plant associations and habitats are particularly challenging, when relying only on in situ observations [10][11][12].
This challenge can be supported by remote-sensing data, which, combined with field plots, enables the accurate mapping of plant communities and habitats [13][14][15][16]. The processing of remotely sensed data provides a better understanding of the conservation status, which is crucial in helping decision makers to plan and evaluate the impacts of management practices [9,[17][18][19][20].
Mapping based on remote data is effective and accurate, if multi-temporal data are used to capture the seasonal variations of the spectral reflectance, in relation to the different phenological stages of the vegetation (i.e., vegetation seasonality) [11,14,16,[21][22][23][24]. In contrast, mapping based on the spectral information of a single scene (i.e., single-date mapping) can be problematic, as distinct types of vegetation and habitats could have high spectral similarities at a given time of the season [25,26], even in multi-spectral data scenarios.
The growing availability of remote-sensing open data (e.g., Landsat, Sentinel, and MODIS) in formats usable by non-specialists (see, e.g., [27]) has facilitated the use of multiple images and dense time-series for various purposes. Dense time-series can capture the variations (both seasonal and inter-annual) of reflectance spectral bands or vegetation indices (e.g., NDVI) and are useful for the characterization of land-cover, ecosystem phenology, and landscape dynamics [28].
The use of dense time-series (>30 images per year) significantly improves the discrimination and mapping of natural vegetation types compared to using multi-temporal imagery (consisting of only a few images per year) [26,29]. Lopes et al. [26] considered the analysis of dense time-series as the new norm for tracking changes in the distribution of natural vegetation from space. Furthermore, considering entire archives of remotely sensed images (e.g., Landsat and Sentinel-2) as a cohesive temporal record, rather than as a series of individual images, provides new ecologically relevant perspectives [30].
The analysis of remotely sensed time-series plays a key role, and Functional Data Analysis (FDA) represents a reliable opportunity to analyze temporal ecological data, such as remotely sensed time-series [31]. FDA methods consider a time-series or a series (stack) of a remotely sensed images as a unique and cohesive temporal record. The basic philosophy of FDA is to consider observed data functions as single entities, rather than merely as a sequence of individual observations. The pixels to be processed are treated as temporal curves and, hence, as functions.
One of the most popular FDA methods is Functional Principal Component Analysis (FPCA) [32]. FPCA is a reduction tool that adapts traditional PCA concepts to functions. FPCA, unlike PCA, preserves the order in the data (e.g., the chronological order in the time-series) [33]. FDA techniques have been gaining in popularity in several scientific fields, ranging from genomics to finance [34]. This tool is still rarely used in remote sensing and ecology, despite some promising applications.
Li et al. [35] adopted the FPCA to classify Hyperspectral Images, considering that FDA treats multivariate data as continuous functions. Hurley et al. [31] applied FPCA to NDVI time-series to test the effects of spring and autumn phenology on the winter survival of mule deer fawns. The analysis of dense remotely sensing time-series with the functional approach has recently opened new avenues for plant association and habitat mapping as well as for the integration of remote sensing and phytosociology.
Pesaresi et al. [36] used the main seasonal NDVI variations, extracted from NDVI Landsat 8 dense time-series by FPCA, to recognize and characterize forest plant associations identified on the ground by the phytosociological approach; while, in Pesaresi et al. [37], the main seasonal NDVI variations were used (as input data and spatial predictors) together with a Random Forest classifier to map four forest plant associations in a coastal Mediterranean area of central Italy, obtaining a global accuracy of approximately 87%.
In this work, we evaluate the performance and applicability of the methodology proposed by Pesaresi et al. [37] in order to map the plant associations and habitats of a Special Area of Conservation (SAC 'IT5320003'-Gola di Frasassi), located in a mountainous sector of central Italy and characterized by high vegetation, topographic, and lithological diversity.
To achieve this goal, we applied: (i) FPCA to Sentinel-2 dense time-series of different spectral-band-based vegetation indices (e.g., NDVI, NDWI) in order to identify the main seasonal spectral reflectance variations (FPCA components); (ii) (classical) PCA to a set of topographic terrain parameters (extracted from a Digital Elevation Model) and to lithological types in order to identify the main topographic and lithological features (PCA components); and, finally, (iii) supervised classification (by Random Forest) of the FPCA and PCA components in order to map the plant associations and habitats.

Materials and Methods
In this section, we provide details regarding the materials and methods of our approach. Figure 1 shows the adopted pipeline, which processes Sentinel-2 time-series data to derive maps of plant associations and habitats.

Study Site
The study area is the Special Area of Conservation (SAC) Gola di Frasassi (also known as the Gorge of Frasassi; code IT5320003). It is located in the mountainous area of the central Apennines (between latitudes of 43 •  The area is characterized by a high topographical and lithological heterogeneity. It encompasses an area of about 728 hectares ( Figure 2). The average annual precipitation is 1115 mm, while the average annual temperature is 12.7 • C. According to the bioclimatic classification of Rivas-Martinez et al. [38], the area belongs to the temperate macrobioclimate with a weak sub-Mediterranean level, which indicates low summer aridity [39].

Target Classes and Reference Data
The following section describes the plant associations, recognized through the floristicsociological approach of Braun-Blanquet and the corresponding habitats of Directive 92/43/EEC. Woodland: (i) Holm-oak woods (habitat 9340 "Quercus ilex and Quercus rotundifolia forests"). Appenninic holm-oak woods differ from coastal or sub-coastal ones as they are richer in deciduous species (e.g., Fraxinus ornus L., Acer opalus subsp. obtusatum (Waldst. and Kit. ex Willd.) Gams, Sorbus domestica L., S. torminalis (L.) Crantz, Ostrya carpinifolia Scop.) in the tree layer and, in the understory, mesophilous species occur, such as Melica uniflora Retz., Hepatica nobilis Mill., and some orchids, among others. These species are typical of the deciduous woods surrounding holm-oak woods and easily enter in these as well.
At the same time, the most thermophilic species are lost or become rare (e.g., Rosa sempervirens L., Rhamnus alaternus L., and Pistacia lentiscus L.), as they cannot bear the rigors of the winter season. Therefore, they are classified in the association named Cephalanthero longifoliae-Quercetum ilicis, which represents the most mesophilous community of holm-oak woods, at least in the central Apennines [40]. They are widespread throughout the area, above and all along the vertical walls of the rocky gorge.
(iii) Black hornbeam forests occurring in the study area are classified within the association Scutellario columnae-Ostryetum carpinifoliae, which has a very large ecological amplitude, as it occurs in different ecological conditions. Indeed, on rocky slopes, it hosts species of the Mediterranean biocora, such as Q. ilex while, in mesic environments, it is richer in nemoral species of the Temperate biocora (e.g., Fagus sylvatica L. and S. torminalis). Black hornbeam forests are not recognized as habitats by the European Directive.
(iv) Riparian woods occur along the Sentino river flowing in the lower part of the rocky gorge. These are linear wood formations of white willow (Salix alba L.; of the association Rubo ulmifolii-Salicetum albae) growing along the river banks and of black poplar (Populus nigra L.) in the alluvial terrace above (Salici-Populetum nigrae). Both communities belong to the habitat 92A0 "Salix alba and Populus alba galleries".
(v) Pinus sp. pl. plantation. In the Apennines, huge plantations of Pinus sp. pl. and other conifers have been found in the past few centuries. In the study area, Pinus nigra ssp. nigra Arnold and P. halepensis Mill. plantations are common throughout the whole area, especially on eroded soil along the steepest slopes [43].
Shrublands: These are widespread throughout the study area, occurring on abandoned grasslands as dynamic stages of wood recolonization. In areas with deeper soils, the main common species is Spartium junceum L. and, secondarily, C. sessilifolium. Together, they form the association Spartio juncei-Cytisetum sessilifolii, which is widespread in the central Apennines. On thinner and poorer soils, the dominant species is Juniperus oxycedrus, constituting wide shrublands in the area. They are framed in the same association as the previous one but with the J. oxycedrus variant [44].
Grasslands: While grasslands are not particularly widespread in the area, the most consistent nucleus is located at the top of Mount Valmontagnana, where cattle breeding is still practiced. However, these are rather sparse and arid meadows, due to the strong stonyness of the substrate and the low depth of the soil. The reference association is Asperulo purpureae-Brometum erecti, which belongs to priority habitat 6210(*) "Semi-natural dry grasslands and scrubland facies on calcareous substrates (Festuco-Brometalia) (*important orchid sites)" [45].
Garrigues and vegetation of rock and scree: These are strictly interpenetrated and, therefore, have been mapped as a mosaic. Garrigues are formed by dwarf chamaephytes, such as Satureja montana L., Fumana thymifolia (L.) Spach ex Webb, and Thymus sp. pl., occurring on eroded soil and rocky outcrops where they form the association Cephalario leucanthae-Saturejetum montanae [45,46]. Habitats 6110 "Rupicolous calcareous or basophilic grasslands of the Alysso-Sedion albi" and 6220 "Pseudo-steppe with grasses and annuals of the Thero-Brachypodietea" are present in the clearings of this vegetation.
On shady and wet rocky walls, particularly along the gorge's wall, chasmophytic vegetation occurs (habitat 8210 "Calcareous rocky slopes with chasmophytic vegetation"), belonging to the association Moehringio papulosae-Potentilletum caulescentis. Other mapped typologies are crop lands, mainly represented by cereal crops, and urban land comprised of small towns, villages, and scattered houses.
The collected reference data, distributed over the study area ( Figure 2), was based on 308 plots and used as training data for supervised plant association mapping (see Table 1). Phyto-sociological surveys were performed in 74 plots (during 2018 and 2019) in order to identify the different plant associations and their mean specific composition. Subsequently, another 94 plots, through an expert-based similarity comparison in the field, were assigned to the identified plant associations. Finally, 140 plots were assigned to the target classes by visual interpretation of Google Earth imagery. The data set and R code used in the study are available from [47]. Table 1. Reference data. Target classes for the supervised map are listed. For plant associations, we report the syntaxa name and the corresponding habitat code (Annex 1 of the European Union Habitats Directive). The * denotes a priority habitat. In the case of 6210 if is an important orchid sites (*).

Topographic and Lithological Factors
Two different groups of ancillary features were included in the classifications: topography and lithology. Topography and lithology are natural factors that influence the composition of plant communities, as well as their spatial distribution [48]. Table 2 summarizes the terrain topographic parameters (quantitative data), extracted from a DEM with 30 m resolution (the DEM was derived from the NASA Shuttle Radar Topography Mission (SRTM)) and the lithological types (qualitative data derived from the lithological map of the Marche Region [49]).
Lithology and terrain parameters were processed by the Principal Component Analysis for mixed data (quantitative and qualitative) (dudihillsmith() of ade4 [50]). The obtained PCA scores were used as input (predictors) for modeling plant associations and habitats using a supervised Random Forest (RF) algorithm.

Remote Sensing Time-Series
L2A Sentinel-2 images were collected using the Sen2r [51] package. We considered scenes (from April 2017 to April 2020) with a cloud cover in the areas of interest lower than 25%. The clouds were masked and the images co-registered. A spatial resolution of 20 m was used; this value is suitable for forest plant communities studies (in our case, forests had almost full-coverage of the study area). The four spectral bands at 10 m (blue B2: 497 nm, green B3: 560 nm, red B4: 664 nm, and infrared B8: 835 nm) were re-sampled to 20 m. We obtained a stack of 93 Sentinel-2 pre-processed, co-registered images with 20 m spatial resolution.
We considered a set of vegetation indexes (see Table 3) to capture seasonal variation in the VIS-NIR range. The selected (six) vegetation indexes were applied to the 93 images stack obtaining a data-cube (6 × 93). Tables A2 and A3 summarize the list of distributions and the list of the Sentinel-2 selected images for this study. The collected scenes cover a multi-year period. We grouped/sorted them by their Day of the Year (DoY) to derive a dense annual time-series (e.g., [59,60]). We identified and removed the outlier values (function clean.ts() of the R package "forecast"; [61,62]) for each pixel, and then we aggregated (average) the DoY values in weekly values . We applied the Generalized Additive Model (GAM) [32] to smooth data obtaining a cubic spline representation of the pixel based time-series. We finally obtained six stack (vegetation indexes) of 52 rasters (weeks) that represent the spectra of all pixels over a year ( Figure A2). The six time-series generated were then used to feed the FPCA analysis to enable classification using the FPCA scores.

Functional Principal Component Analysis (FPCA)
Functional Data Analysis (FDA) considers data as continuous functions. Therefore, each single pixel-based time-series was considered as a single entity (i.e., continuous function), rather than a mere sequence of discrete observations (e.g., a vector with 52 values as the number of weeks in a year).
Considering that the entire time-series of a pixel is a statistical unit, FPCA treats the entire raster stack (data cube) as a single object, consisting essentially of as many functions (trajectories of variation over time) as there are pixels in the study area. As in PCA, FPCA compresses information on an ortho-normal basis. The main difference is the capability of preserving the functional structure (i.e., the chronological order, in our case) of the analyzed data [33]. FPCA, similarly to PCA, provides eigenvalues that describe the variation explained by each component. The FPCA scores quantify the similarities between the functions (time-series based on pixels), while the eigenfunctions represent the main modes of variation of the data.
FPCA scores are a core aspect of this study, as they can be used in subsequent analyses (e.g., correlation analysis, functional data clustering [63], and supervised classification [35]). Using FPCA scores, it is possible to generate a reduced space where the functions (here, pixel-based time-series) can be plotted in an ordered way, thus, supporting domain experts in ecological interpretation [31]. The scores could be also used to fit the plant association seasonal temporal profiles (curves); in particular, in this study, we used the CreatePathPlot() function of the fdapace package [64] for this purpose.
This package was also used to perform six distinct FPCAs, one for each pixel-based vegetation index time-series. The pixel-based time-series were decomposed into their main variation modes over the year (FPCA components), and the related scores (i.e., pixel-based FPCA scores) were obtained. The FPCA scores were then used as inputs (predictors) to create plant associations and habitat models using a supervised algorithm (i.e., Random Forest).

Supervised Mapping Using Random Forest
RF is an ensemble learning classifier [65] that is often used in habitat mapping studies based on remote sensing. Belgiu and Drȃguţ [66] presented a detailed review of RF and its efficiency in remote sensing. In RF, it is necessary to adjust certain parameters and aspects, such as: (i) the number of trees (the ntree parameter) that will be created by randomly selecting the samples from the training samples and (ii) the number of variables used for the division of the tree nodes (the mtry parameter).
We set ntree to 1500 and evaluated mtry from 1 to the square root of the number of input variables, as is typically done [66]. An RF can be biased if the proportions of training and validation data are unbalanced (as was the case in this study; see Table 1). This aspect could present issues and could lead to over-prediction of the majority classes and under-prediction of the minority classes. Over-and under-sampling can be used to produce more balanced data sets [67].
In order to make the frequency of the majority class closer to that of the rarest class, we incorporated down-sampling in the RF, when evaluating the training data subset (subset parameter). We set subset = 11 * nmin, where 11 is the number of target classes and nmin is the minimum between target classes (Table 1). This means that the RF model with ntree = 1500 can obtain a balanced random sample without a loss of information for the majority classes.
We considered two main groups of features: • pixel-based FPCA scores, representing the main seasonal (intra-annual) variations of the time-series; • pixel-based PCA scores, representing the topographic and lithological features.
RF models were constructed using the predictors both individually and jointly in order to evaluate and compare their contributions to the final accuracy of the map. We preliminarily applied Recursive Feature Elimination (RFE) to consider only the most discriminating predictors and reduce the (possibly high) number of dimensions. Mappings based on random forest models and accuracies were performed and assessed using the packages raster [68] and caret [69].

Classification and Mapping Accuracy Assessment
The mapping accuracy (classification) was evaluated using the Overall Accuracy (OA), Producer Accuracy (PA), and User Accuracy (UA) metrics [70], as well as the κ coefficient [71]. We executed 10-fold cross-validation five times in order to calibrate the model and provide a robust estimate of the accuracy, thus limiting any potential bias. We provide the mean OA and κ index (and their respective Standard Deviations) and a cross-validated confusion matrix (representing the error distribution for class among the five repeats).
We also performed a comparison between the reference data and a 1:10,000 scale vegetation map. The previous map was derived from acquisitions performed in 2009 and published online on a webGIS platform in 2014 (SIT-REM webGIS available here: http://sitbiodiversita.ambiente.marche.it/sitrem/, accessedon 10 February 2022) [72]. The types of vegetation and habitats between the different maps were appropriately harmonized (in terms of classes). We tested the matching between the maps and defined the respective levels of agreement through the κ statistics.

Main Seasonal Variations of Time-Series
FPCA was used to extract the orthogonal main modes of variation (FPCA components) from the multi-index Sentinel-2 time-series. Figure A1 shows the number of identified functional components (eight for the MCARI time-series and seven for all of the other time-series) and their fraction of variation explained (i.e., eigenvalues). FPCA components, in practice, represent the amount of deviation from the overall mean of the time-series, thus, reflecting the main contrasting modes of variation throughout the year.
In all of the considered cases, the two first FPCA components explained about 90% of the total variation (NDVI: FPCA1 66 Figure A1). Figure A2 shows the temporal and spatial patterns of the two first FPCA components extracted from the 18,631 weekly pixel-based time-series (we considered all of the pixels belonging to the study area), while Figure A3 shows the mean seasonal profiles of plant associations and habitats. These profiles describe, in a compact and easily readable manner, the distinct phenological behaviors in relation to the different vegetation indices presented in this work (see Table 3).

Plant Community Modeling Using the Main Topographic-Lithological and Phenological Predictors
In this sub-section, we provide results from the RF models performed using both individual and joint predictors, where the pixel-based FPCA scores summarize the main seasonal (i.e., intra-annual) multi-spectral variations, while the pixel-based PCA scores summarize the topographic and lithological predictors. Table A1 summarizes the global, producer, and user accuracies.
The RF model based solely on topographical predictors achieved an overall accuracy of 59.25% (±8.32) with k = 0.53 (±0.09). Only Riparian woods, Juniperus oxycedrus shrubs, and Urban land achieved producer accuracies above 80%. All others showed very low performance (see Table A1). This demonstrates that it is necessary to consider and integrate different predictors.
In particular, as introduced in the previous sections, we evaluated several vegetation indices that have been widely used in related works. Each model (ranging from b to h) in Table A1) highlights the different behaviors for each class, mainly related to a certain vegetation index. The NDVI index for a given class performed better than other vegetation indices (e.g., class 4 NDVI and GNDVI).
However, when combining all time-series while also including the topography, the overall accuracy reached 85.5%-also showing optimal performances in terms of UA and PA. This suggests that each vegetation index contributes in a different (mostly constructive) way to accurately and precisely predict the target class, while non-radiometric features, such as the topographic and lithological predictors, also contributed to the overall performance (see columns i and l of Table A1).

Comparison of Obtained Results with Ancillary Data
As mentioned in the previous section, we also performed a comparison with an existing map (Figure 3b). The latest official map (produced by an expert through photointerpretation) [72], compared with the reference data, showed an OA of 59.42% and k = 0.54. The best PAs were those of the Bromus erectus Huds. grasslands, the riparian woods, the Pinus sp. plantations, and the Juniperus oxycedrus shrubs, while various classes were less than 60% (e.g., Holm-oak, Downy-oak, and black hornbeam woods; see Table A1).
Both maps showed a similar trend over large areas, even if the supervised approach (Figure 3a) was able to output a more accurate map. Changes over time were negligible, considering that the intrinsic dynamics are significantly slower than the span of time between the construction of the SIT-REM maps [72] and that in this work.  Table 1.

1.
Main results. Our results confirmed that remotely sensed data can be used to (automatically) map plant associations and habitats, particularly due to their multi-spectral seasonal profiles (phenological behaviors). The main seasonal multi-spectral variations were effective predictors for the production of accurate maps as previously discussed in [16,37].
In particular, the main seasonal variations extracted from Sentinel-2 time-series data using FPCA, according to the methodology proposed in Pesaresi et al. [37], proved to be an effective tool for mapping several plant associations (of different structure and specific composition) for an entire SAC. We used supervised random forest classification, similarly to Zhu and Liu [29,[73][74][75], and revealed that the main spectral (phenological) seasonal variations contributed to the OA (here, 85.58%) of the pro-duced map, much more strongly than the lithological and topographic features (see Table A1).

2.
Simultaneous use of multiple time-series We confirmed the importance of using the popular NDVI time-series [76]. It was demonstrated (as in other related works, e.g., [8,24,75]) that the integration of multi-spectral (vegetation index) time-series can improve the performance of vegetation and habitat mapping, as seasonal patterns manifest differently in different spectral bands and vegetation indices [28] ( Figure A3). For example, holm-oak wood and Pinus sp. plantations had similar seasonal temporal profiles, in terms of the NDVI, while presenting clearly different MCARI profiles ( Figure A3). The supervised RF models constructed using the time-series individually revealed that, for the study area, seasonal variations of the NDVI time-series were the most explanatory (OA 73.13% alone), while RF model constructed using the timeseries jointly revealed some complementarity between the time-series, where each appeared to be useful in classifying different plant associations. The integration of all six considered spectral vegetation indices supplied the RF with crucial variations characterizing and discriminating the different plant associations. With the combined model, an improvement of 9% in the OA over the best single time-series was obtained. When we also considered the topographic and lithological features, an additional 3.4% improvement was achieved (Table A1).

3.
Mapping accuracy The obtained results, in terms of the OA (85.58%), demonstrated a meaningful gain of performance when compared with the existing map [72]. Habitat (92/43/EEC) and plant association maps (using the Braun-Blanquet approach) rarely reach an overall accuracy greater than 80% [17]. In most cases, the number of classes is up to five [11,14,37,[77][78][79][80], while an increase in the number of classes typically has a negative impact on the overall performance, reaching values close to 75%, as in [9,16,[80][81][82][83][84]. From a methodological point of view, it is necessary to define an acceptable level of accuracy of a map generated using remotely sensed data [85,86], taking into consideration the related use-cases and scenarios. Although it is challenging to define a minimum value threshold for the mapping of plant communities and habitats, according to the above-mentioned references, we could consider that, with a number of target classes greater than five, an OA value of 80% represents a good result, while the range of 75-80% could be sufficient. When the number of classes is ≤5, a good result should exceed 85% OA, and it can be considered satisfactory if it exceeds the 80%.

4.
Data Reduction using FPCA FPCA is an efficient input data reduction tool. First, it is the central idea of the FDA itself-compressing the input data. Indeed, it considers the pixel-based time-series as a function and as the (unique) object of analysis. Consequently, FPCA considers a stack of remote sensed images as a single container of pixel-based functions, regardless of the number of images that compose it. Then, FPCA extracts, from these pixel-based functions (considered as cohesive temporal record of pixel-based time-series), the main and orthogonal modalities of variation (i.e., functional components, numerically represented by the FPC scores that express the different seasonalities), preserving the order (i.e., chronology) of the data and facilitating ecological interpretation of the derived temporal and spatial patterns (see Figure A2; as well as [31,36,37]). In this work, for each vegetation index, we derived the time-series (set of values over 52 weeks) and then applied FPCA. This technique reduced the size to a total of 43 functional components (considering multi-spectral seasonality); see Figures A1 and A2. Furthermore, functional analysis allows for the study of derivatives, thus, providing complementary information to describe the seasonal cycles derived from satellite data (see, e.g., [87,88]). The FPCA components with a low fraction of variance explained could also be evaluated by experts in order to derive local changes that are potential indicators of different plant communities [89].

5.
Benefits for habitat directives, phyto-sociology, and landscape management The overall accuracy obtained (OA 85.58%) was greater than 80% and much higher than that obtained with the traditional method (photo-interpretation), thus, making the produced map (Figure 3) a reliable and usable tool in the main procedures of habitat directives (e.g., Appropriate Assessments and filling in of Standard Data Forms).
In the study area, the high temporal resolution of the Sentinel-2 mission makes mapping (through the functional approach) of the vegetation and habitats up-to-date and repeatable, according to the timing required by the directive (six years), as well as with higher frequencies (e.g., every 2-3 years) [9]. Cyclical mapping (at least every six years, in accordance with the Habitats Directive) of the plant associations-and, therefore, of the habitats-with high accuracy (in terms of OA) could provide a valid tool for monitoring the transformation of habitats over time [10,90]. For example, the conservation status of the grasslands in the studied area is strongly determined by the presence of invasive shrubs or species. The grasslands of habitat 6210, in fact, if not adequately managed, tend to be colonized by shrubs (e.g., Spartium junceum and Juniperus oxycedrus) and by a few perennial grasses, such as those of the genera Brachypodium, with a consequent loss of the floristic diversity of the grasslands [91][92][93][94].
Any variation or modification of the phenological profile of the habitat (as identifiable through the use of the proposed methodology) can be used as a warning signal that, in certain areas, a transformation process is underway, therefore, highlighting the need for an inspection on the ground and, consequently, the need for corrective and timely management actions. The limitation for modeling and mapping plant communities is no longer the access and processing of satellite data, but rather the time it takes to access and generate reference data in the field. Therefore, it is crucial to disseminate the vegetation plots in databases (e.g., VegItaly [95]) [9]. The production of reference data at the plant associations level could benefit from the help of drones. An unmanned aerial vehicle could enable the recognition of the plant species (see, e.g., [96]) while reaching inaccessible places for human beings due to orography and many other factors.
Recognizing few indicator species (by drone) could by key for the quick identification of vegetation types (plant association) in the field in complex environments [97] where the vegetation types and their species composition are available. Despite the high global accuracy of the map, the error matrix (Table 4) showed that black hornbeam wood and downy-oak wood (habitat 91AA*) could be misclassified. This was due to the fact that both woods share some species in the dominant tree layer, although the underlying, dominated layers are well-differentiated, in terms of mesophilic species dominating in the first case, with more thermophilic in the second. Therefore, from an ecological point of view, they are different woods, even if the tree structure and composition is partially similar. This confusion led to over-representation of downy-oak wood (PA 82.5% and UA of 65.9%) and underrepresentation of black hornbeam wood (PA 68.3% and UA 85.4%) in the predicted map ( Figure 3). However, these accuracies were still much better than those obtained with traditional mapping. The methodology presented in this work enables a concrete link between the remote survey perspective and the (phyto-sociological) field-based one. Manual (traditional) approaches could be empowered by supervised ones, which, in any case, still rely on reference data generated by domain experts. Vegetation has unique spectral signatures that evolve along with the plant life cycle over the year [75], which represents an important trait of plant associations ( Figure A3). The quantification of patterns in the seasonal behavior of the spectral reflectance can provide better characterization of plant communities [74]. Furthermore, FPCA scores represent reduced ordination spaces (i.e., useful tools for botanists and ecologists) [31,35,36] that are suitable for the ecological interpretation of the results and that contribute to the study of the relationships between the data observed in the field and those that are remotely sensed. Table 4. Cross-validated confusion matrix (10-fold, repeated five times) between the predicted target classes of the SAC "Gola di Frasassi", code IT5320003 (Central Italy). The Overall accuracy, Producer accuracy, User accuracy (in percentage), and κ Statistic are given. Row and column numbers denote the plant associations and habitats listed in Table 1. Limits and Challenges for Future Work

1.
We found that the combined use of different spectral vegetation indices led to better results compared with using a single one. Of course, the six selected vegetation indices have been widely adopted by researchers; however, we cannot exclude that a different set of vegetation indices combining two or more bands could improve the overall performance, perhaps also fixing the misclassification between black hornbeam and downy-oak woods. The typical formula adopted for the NDVI is in the form of (a − b)/(a + b) where a and b are spectral bands; this formula is only an example, and other formulas/functions may improve the final mapping.

2.
If we need to consider multiple time-series derived from a wider set of vegetation indices, we must adapt an approach considering more advanced FDA techniques. A possible solution is to adopt the Multidimensional FPCA (MFPCA) [98], which enables reduction of the time-series (i.e., a lower number of components extracted from multiple time-series), thus, providing a single phenological ordering space to make the ecological interpretation easier [37]. Although harmonic and phenological features have recently been used to improve forest type mapping, as in [29,74], new vegetation indices could be derived, with novel formulas, in order to increase the spectral separation among target classes.

Conclusions
Functional Data Analysis (FDA) (e.g., FPCA) is relatively recent and not yet widely used in remote sensing and ecology, even if it represents a powerful way to analyze temporal ecological data, such as remote-sensing time series [99]. FDA represents a promising novelty to map habitats that belong to the Natura 2000 network sites. Pesaresi et al. [36,37] applied the FPCA to NDVI Landsat 8 time-series to map and characterize forest plant associations and habitats identified with the phytosociological approach in a small area belonging to the Natura 2000 network.
In this work, we demonstrated the utility of using FPCA applied to Sentinel-2 time series of vegetation indexes to improve the mapping of plant associations and habitats of an entire Special Area of Conservation. Despite its limited size (∼728 ha), the study area is rich in terms of species, plant communities and habitat diversity.
The global accuracy achieved in the study area adopting the proposed methodology was 85.58%. The obtained results outperformed the threshold of the 80% that was rarely exceeded in remote sensing applications for habitat mapping and the existing map of the same area that was obtained by experts using photointerpretation. The main seasonal phenological variations (FPCA components and scores), extracted by FPCA from the remote sensed time series, contributed to the global accuracy of the map much more than the topographic and lithological variables.
Furthermore, this work highlighted that the plant communities, together with their own typical floristic composition, exhibited exclusive phenological dynamics that manifest differently with respect to vegetation indexes. These different temporal patterns are quantified by FPCA scores and graphically represented by their mean seasonal profiles. The proposed methodology can be used to detect future changes in phenological patterns that can serve as a warning sign of habitat change. Indeed, they act as a sensible and pragmatic tool to show, monitor, and preview environmental changes.
FPCA scores and multi-index mean seasonal profiles are important vegetation attributes that could be complementary to species-based approaches in plant community ecology and phytosociolocy and that facilitate the link between remote sensing with habitat mapping and monitoring and their ecological interpretation.
The results confirm that FDA applied to remote-sensing times-series opens new scenarios to map vegetation and habitats of the Natura 2000 network. Vegetation and habitat mapping are still mostly performed in the traditional way with the photo-interpretation method. Most of approaches to map habitat from remotely sensed data rely on single or sparse time-series (few images per year); however, the dynamics of phenology suggest using denser time series to set up a supervised classifier (Random Forest, LDA, SVM, etc.).
The FDA techniques, considering the time series of a pixel as a single statistical object (function) allow the use of many images (dense time-series) in an efficient and ecologically relevant way considering entire archives of remotely sensed images (e.g., Landsat and Sentinel-2) as a cohesive temporal record, rather than as a series of individual images.
Considering these results, we plan to extend the temporal coverage of time-series embedding different combinations of bands (indexes) that could improve the overall classification performance in a multi-dimensional scenario by using the Multivariate Functional Principal Component Analysis (MFPCA). The proposed method could be easily applied on different, larger areas with denser time-series.
We believe that the mapping of plant associations and habitats with the functional approach has a relevant impact on the decisions taken by policy makers to monitor and preserve our environments. The automation of the process also gives the opportunity to detect changes over time that could be relevant for stakeholders to plan and enact policies to preserve the environment.  Data Availability Statement: Data and code are available at [47].

Conflicts of Interest:
The authors declare no conflict of interest.
Appendix A   Table A1. Comparison of performance (OA, K, UA, and PA) between traditional map (a) and models based on components: (b) topography, (c) NDVI, (d) GNDVI, (e) MCARI, (f) NDRE, (g) NDWI, (h) MNDWI, (i) all time-series, and (l) all time-series + topography. Legend numbers correspond to plant associations and habitats listed in Table 1    The plotted MCARI values are scaled by a factor of 1000. The bold red line is the mean vegetation index value. The red polygon is the 10-90th percentile. The black line is the mean vegetation index values of whole study area (which is useful to appreciate the differences between the target class and mean of the study area). Row numbers correspond to the plant associations and habitats listed in Table 1.