1. Introduction
Identifying changes in plant communities is a conservation priority, especially for Mediterranean coastal dunes that harbour some of the most threatened habitats in Europe [
1]. Ecogeomorphic links, fundamental to the self-organisation capacity of coastal dunes [
2], suggest that a shift in plant community structure and composition (e.g., expansion of invasive species) can produce an important change in the structure and function of coastal dune ecosystems [
3], which, in turn, can induce changes to the topography itself [
4,
5]. Such changes and domino effects can potentially cripple previously established adaptation mechanisms and lead to a (permanent or temporary) system tipping point [
6]. These considerations advocate for systematic, large-scale monitoring of coastal dune vegetation, especially in protected areas and in fragmented and stressed dune environments (e.g., due to nutrient fluxes and/or precipitation rates, human impacts, squeezed conditions) [7]. Such efforts should focus not only on habitat conservation status, but also on potential implications for the contemporary and near-future resilience of these sensitive sentinels of climate change [
8]. While traditional field surveying is the most widely used method of vegetation mapping, it is time-consuming, expensive, and limited in spatial coverage [
9]. Remote sensing, on the other hand, provides alternative means to obtain large-scale and standardised vegetation identification data [
10].
Remote sensing applications typically use hyper- or multi-spectral data obtained by Unmanned Aerial Vehicles (UAVs) and employ vegetation indices (e.g., [
11]), supervised (e.g., Maximum Likelihood, support vector machine (SVM), random forest (RF), and Artificial Neural Network (NN)) and unsupervised (e.g., K-means) classification techniques to map dune vegetation cover [
11], habitats [
12], plant taxa [
13], or individual plant species [
14]. Overall, pixel-based classification is more common in highly mixed coastal dunes; however, depending on the required resolution of the analysis, object-based classifications may perform better, reducing ‘salt and pepper’ effects [
15]. Belcore et al. applied an object-based image analysis algorithm on multispectral UAV data from coastal dunes in Tuscany (Italy), detecting and classifying 12 different plant species, as well as sand and debris, with an average accuracy of 75% for training and 62% for unseen data [
16,
17]. Aside from classification, machine learning techniques like RFs and SVMs can be employed in regression tasks to estimate continuous vegetation parameters. RF regression, for example, achieved the highest overall accuracy (0.89) among five supervised classification algorithms for the detection of the fractional cover and aboveground biomass of
Carpobrotus edulis from UAV multispectral images of the sand dunes of the Cávado River spit (northern Portugal) [
18]. Deep learning techniques have also been employed in large-scale monitoring approaches, like the Convolutional NN trained to map coastal dune habitats (four vegetated classes: shrubs, grasses, broadleaf, and needleleaf trees) throughout the Dutch coast with high accuracies (an overall accuracy of 0.92) even using only RGB UAV data at a 25 cm resolution [
19].
Even though improvements in spatial, spectral, and temporal resolution in satellite imagery and advances in remote sensing techniques have increased capabilities for regular and large-scale coastal monitoring and observation [
20], this potential has not been fully capitalised on for coastal dune habitats. This is mainly because the small size and density of plants, together with the complexity and heterogeneity of the species present, hinder the identification of coastal dune vegetation species [
7]. Indeed, vegetation characterisation can be challenging in the presence of short and sparse canopies or of highly mixed populations, as is usually the case in Mediterranean coastal dune systems [
21]. Similarly to works using UAV data, satellite remote sensing applications either focus on mapping the boundaries of the dune environment using vegetation indices (e.g., [
22]) or on classifying individual species (e.g., [
7]). Kozhoridze et al., for example, used coarse Landsat time series data to reconstruct the past dynamics of plant expansion of two focal invasive species (
Heterotheca subaxillaris and
Acacia saligna) in the Mediterranean coastal plain of Israel [
23]. Marzialetti et al. identified and mapped five vegetation classes organised into three hierarchical levels, applying a phenology-based RF classification on Sentinel-2 (S2) images on a Mediterranean coastal dune [
24]. A later work [
25] used similar approaches (hierarchical clustering and a RF model) and S2 imagery to produce an unsupervised land cover map, distinguishing four dune types in a representative site of the Adriatic coast. Limited in their analysis by the pixel size of the S2 product, the same authors advocated for using higher spatial resolutions and/or subpixel classification methodologies to improve results [
24,
25]. Gupta et al. tested varying resolutions of multispectral imagery from UAVs and satellites (Vision-1, PlanetScope (PS), and S2) and RF classification to map the expansion of an invasive species (
Heterotheca subaxillaris) in the coastal dunes of the eastern Mediterranean, with results supporting the critical role of spatial resolution for plant identification in highly mixed coastal dune environments [
26]. Similarly, RF models calibrated to identify the expansion of
Acacia saligna on the Adriatic coast of central Italy were able to delineate invaded area edges and small patches more effectively in PS than in S2 [
27]. Medina Machín et al. applied an SVM classifier to discriminate between six coastal dune species (Gran Canaria, Spain) at the pixel level using WV2 imagery, with the approach showing acceptable capability for dune shrubs [
7]. The results of these works indicate that hard classification approaches (i.e., assigning one ‘dominant’ class to each individual pixel) are very limiting in highly mixed environments (i.e., those containing many plants with small plant parts and low fractional cover), even considering very high resolution satellite products [
24,
28]. Such environments require unmixing approaches that capture the fine-scale spatial distribution of vegetation species within each pixel [
29].
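The information lost by a hard label can be illustrated with a minimal sketch. The class names and cover values below are hypothetical and are not taken from any of the cited works; the example only shows that assigning the dominant class to a mixed pixel discards the cover of every other class present.

```python
def hard_label(fractions):
    """Return the dominant class of a pixel, as a hard classifier would."""
    return max(fractions, key=fractions.get)


# Hypothetical fractional cover of a single satellite pixel (illustrative values).
pixel = {"sand": 0.55, "species_a": 0.25, "species_b": 0.20}

label = hard_label(pixel)
# Share of the pixel that the single hard label cannot represent.
unrepresented = sum(fc for cls, fc in pixel.items() if cls != label)
print(label, round(unrepresented, 2))
```

In this illustrative pixel, both plant species disappear from the map even though together they occupy almost half of the pixel, which is the situation that unmixing approaches are meant to address.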
Spectral unmixing, a group of techniques that attempt to decompose the spectral signal of mixed pixels into the contributions of individual endmembers and associate them with endmember fractional abundances within the pixel [
30], has been used for mapping a wide range of landcover types over a wide range of spatial and spectral resolutions (UAV to satellite and multi- to hyper-spectral data). Especially regarding highly mixed coastal habitats, unmixing RF models, like soft classification [
28] and rescaled regression [
29], have shown promising results in predicting the fractional cover of the main plant species in salt marshes from satellite imagery at a subpixel level. To date and to our knowledge, only the works of Ettritch et al. [
31], Medina Machín et al. [
7], and Pafumi et al. [
32] have employed spectral unmixing techniques for coastal dune observation from satellite imagery. Ettritch et al. applied a linear unmixing model to Landsat 8, WV2, and UAV imagery from Kenfig Burrows (UK) to estimate bare sand (or total vegetation) cover, as a proxy for ecological dune stabilisation [
31]. Pafumi et al. tested hard and soft RF classification approaches to map coastal dune habitats (Tuscany, Italy) on WV3 imagery, observing that while soft approaches captured a more realistic representation of vegetation patterns, hard RF classification produced more accurate results for coastal dune scrubs and white dunes [
32]. Medina Machín et al. tested linear unmixing techniques using spectral signatures from field radiometric measurements on WV2 imagery; however, the approach performed worse than a hard SVM classifier trained with corrected WV2 multispectral bands plus a vegetation index (MSAVI2) and a band of contextual information (variance of the first principal component) [
7]. In these works, spectral unmixing was employed to discriminate classes with cover that typically exceeded the size of the WV2 pixel (e.g., shrubs). The applicability of such methods in more highly mixed coastal dunes remains unexplored.
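For readers unfamiliar with the linear mixing model referred to above, the following is a minimal sketch under stated assumptions: a pixel spectrum is modelled as a weighted sum of endmember spectra, with weights (fractional abundances) that are non-negative and sum to one. The endmember spectra are invented for illustration, and the solver (projected gradient descent) is a generic choice made here, not the method used in any of the cited studies.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {a : a_k >= 0, sum(a) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        t = (cumulative - 1.0) / j
        if u_j - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in v]


def mix(fractions, endmembers):
    """Linear mixing model: band-wise weighted sum of endmember spectra."""
    n_bands = len(endmembers[0])
    return [sum(f * e[b] for f, e in zip(fractions, endmembers))
            for b in range(n_bands)]


def unmix(pixel, endmembers, n_iter=5000):
    """Estimate abundances by projected gradient descent on the squared error."""
    k = len(endmembers)
    # Conservative step size: inverse of the squared Frobenius norm.
    step = 1.0 / sum(v * v for e in endmembers for v in e)
    a = [1.0 / k] * k
    for _ in range(n_iter):
        residual = [m - p for m, p in zip(mix(a, endmembers), pixel)]
        grad = [sum(r * v for r, v in zip(residual, e)) for e in endmembers]
        a = project_to_simplex([a_i - step * g for a_i, g in zip(a, grad)])
    return a


# Hypothetical four-band endmember spectra (illustrative reflectances only).
sand = [0.40, 0.45, 0.50, 0.55]
plant = [0.04, 0.08, 0.05, 0.50]
dead = [0.20, 0.18, 0.30, 0.25]
endmembers = [sand, plant, dead]

true_fractions = [0.6, 0.3, 0.1]
pixel = mix(true_fractions, endmembers)
estimated = unmix(pixel, endmembers)
print([round(f, 3) for f in estimated])
```

The synthetic pixel is noise-free and built from the same endmembers used for inversion, so the recovery is an idealised case; real spectra vary within a class, which is one reason regression-based unmixing is explored as an alternative.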
The present work aims to assess the potential of spectral unmixing techniques (namely RF regressors) to discriminate plant species and predict their fractional cover in the sparsely vegetated coastal dunes of southern Portugal. To do so, we apply these regressors to very high resolution satellite imagery from WV2, combined with multispectral UAV data that serve both as ground truth and as training and testing data. The analysis focuses on evaluating the approach, identifying its strengths and limitations, and proposing directions for further research.
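One way in which fine-resolution UAV data can serve as ground truth for a coarser sensor is sketched below, purely as an illustration: a classified fine-resolution map is aggregated into per-class fractional cover for each coarse pixel. The function name, the class labels, and the assumption that coarse pixels align exactly with blocks of fine cells are all hypothetical simplifications and do not describe the processing chain of this study.

```python
from collections import Counter


def fractional_cover(class_map, block):
    """Aggregate a fine class map into per-class cover fractions.

    Each coarse pixel covers `block` x `block` fine cells; incomplete
    blocks at the map edges are ignored.
    """
    n_rows, n_cols = len(class_map), len(class_map[0])
    coarse = []
    for r0 in range(0, n_rows - block + 1, block):
        row = []
        for c0 in range(0, n_cols - block + 1, block):
            cells = [class_map[r][c]
                     for r in range(r0, r0 + block)
                     for c in range(c0, c0 + block)]
            counts = Counter(cells)
            row.append({cls: n / len(cells) for cls, n in counts.items()})
        coarse.append(row)
    return coarse


# Hypothetical 4 x 4 classified UAV map (labels are illustrative).
fine_map = [
    ["sand", "sand", "plant", "plant"],
    ["sand", "plant", "plant", "plant"],
    ["sand", "sand", "sand", "dead"],
    ["sand", "sand", "dead", "dead"],
]

fc = fractional_cover(fine_map, block=2)
print(fc[0][0])
```

The resulting fractions per coarse pixel are the kind of target variable that a regressor can be trained to predict from the coarse-resolution spectral bands.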
4. Discussion
Spectral unmixing approaches have been successfully applied in remote sensing of habitats with higher FCs (e.g., marshes; [
28]). In fact, the RFR algorithm applied herein has been used to predict plant FC distribution in the North Inlet–Winyah Bay saltmarsh (Georgetown, SC, USA) using UAV and WV2 data [
29]. The model (called a rescaled RFR in the work of Yang et al.) was able to accurately estimate the FC of bare soil and two dominant marsh species (0.57 < CoD < 0.93 and 0.05 < RMSE < 0.27) but did not capture the distribution of a less dominant species effectively [
29]. Our application of the same approach in the highly mixed coastal dunes of the Ria Formosa barriers led to similar results and observations regarding model skill and limitations. The model was able to predict the FC of five major classes in the two barrier islands tested (three plant species, SandB, and DeadV), with reasonable-to-very good accuracy using upscaled UAV data (0.39 < CoD < 0.99 and 0.018 < RMSE < 0.041) and with reasonable-to-good accuracy using WV2 data (0.35 < CoD < 0.62 and 0.040 < RMSE < 0.123). Class prevalence is an important factor, with dune plant species present in fewer than roughly one fifth of the pixels (see
Table 2) being practically undetectable. This may be linked to the FC distribution within the sample, as an imbalanced dataset typically leads to underprediction of minority classes within the model population [
55]. Still, resampling approaches that target this imbalance by artificially reinforcing the presence of ‘rare’ data within the distribution, such as the Synthetic Minority Oversampling Technique for Regression (SMOTER; [
56]), did not improve skill with underperforming classes for TavW with WV2 (see
Table A3). More specifically, an adapted version of SMOTER [
56] was applied, that accounts for the prevalence of zero-FC observations for minority plant classes in our dataset (see
Figure 3) by considering only the non-zero value distribution to both (a) identify the rare values (instances that exceeded the mean plus the standard deviation of non-zero FCs of the class) and (b) perform the oversampling (using k-nearest neighbour interpolation within the rare values and oversampling rare values with a factor of 5). Therefore, the detection failure most likely results from the low representation of some classes combined with plant characteristics. In particular, the signal of plants whose aboveground parts cover a low proportion of the WV2 pixel is most likely overwhelmed by classes that cover more of the pixel, even if their spectral signature is distinct. An example is CalyS, which, despite appearing spectrally distinct from the remaining classes when using native-resolution UAV multispectral data (high SA throughout (
Figure 4a) and an accuracy of 0.87 in the hard RF (
Figure 5a)), showed very poor skill in tests using data from larger pixel sizes (upscaled P4M at 0.5 m and WV2).
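The adapted oversampling described above can be sketched as follows. This is a simplified illustration, not the implementation used in this study: it assumes that a factor of 5 means four synthetic cases are generated per rare case, that the population standard deviation is used, and that synthetic targets are interpolated linearly together with the features. The function name and the sample values are hypothetical.

```python
import math
import random
import statistics


def oversample_rare(samples, factor=5, k=3, seed=0):
    """Append synthetic rare cases to `samples`, a list of (features, fc).

    Rare cases are those whose FC exceeds the mean plus one standard
    deviation of the non-zero FCs; zero-FC observations are ignored
    when setting this threshold.
    """
    rng = random.Random(seed)
    nonzero = [fc for _, fc in samples if fc > 0]
    threshold = statistics.mean(nonzero) + statistics.pstdev(nonzero)
    rare = [(x, fc) for x, fc in samples if fc > threshold]
    if len(rare) < 2:
        return list(samples)  # nothing to interpolate between
    synthetic = []
    for i, (x, fc) in enumerate(rare):
        others = [rare[j] for j in range(len(rare)) if j != i]
        others.sort(key=lambda s: math.dist(x, s[0]))
        neighbours = others[:k]  # nearest rare neighbours only
        for _ in range(factor - 1):
            nx, nfc = rng.choice(neighbours)
            t = rng.random()
            new_x = [a + t * (b - a) for a, b in zip(x, nx)]
            synthetic.append((new_x, fc + t * (nfc - fc)))
    return list(samples) + synthetic


# Hypothetical two-band samples dominated by zero-FC observations.
samples = [([0.30, 0.40], 0.0)] * 6 + [
    ([0.28, 0.42], 0.1), ([0.27, 0.43], 0.1), ([0.29, 0.41], 0.1),
    ([0.26, 0.44], 0.1), ([0.25, 0.45], 0.2), ([0.20, 0.50], 0.5),
    ([0.12, 0.58], 0.8), ([0.10, 0.60], 0.9),
]

augmented = oversample_rare(samples)
print(len(samples), len(augmented))
```

Because interpolation happens only between rare cases, the synthetic FC values stay within the range of the observed rare values; the procedure rebalances the target distribution but cannot add spectral information that the pixels do not contain.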
Compared with the other spectral unmixing regressor models tested, RF performed better than SVM and HGB for predicting FC at the subpixel level in the Ria Formosa coastal dunes. In contrast, Medina Machín et al. found that hard SVM classification outperformed linear unmixing for the identification of six non-herbaceous plant species (five shrubs and one rush) in the Maspalomas dunes (Gran Canaria, Spain) from WV2 imagery [7]. The size of these plants typically exceeds the WV2 pixel, indicating that unmixing techniques were likely not strictly necessary, which would explain the higher accuracy of hard classifiers. The target species in the Maspalomas dunes therefore differed markedly from those in our study, with only some of the ArteC plants in Ria Formosa reaching similar sizes. Similarly, the size of the target classes relative to the spatial resolution of the features used is an important consideration for the observations of Pafumi et al. regarding the skill of hard and soft RF classifiers for coastal dune habitat identification from WV3 (in two sites in Tuscany, Italy) [
32]. The authors concluded that hard approaches were better at classifying dune scrubs and white dunes, while also noting that soft approaches captured a more realistic representation of vegetation patterns [
32]. These considerations and observations highlight the importance of selecting a level of analysis aligned with the objectives of each study site and case. Whereas, for example, hard approaches may be more accurate for the identification of coastal dune habitats, they are hardly justifiable for satellite observation of dune species in highly mixed systems.
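Comparisons of regressor skill such as those discussed in this section rest on the two metrics reported earlier, CoD and RMSE. The sketch below assumes that CoD denotes the coefficient of determination in its usual form (one minus the ratio of residual to total sum of squares); the observed and predicted FC values are invented for illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted FC."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def cod(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed FC and predictions from two candidate models.
observed = [0.0, 0.2, 0.4, 0.6, 0.8]
model_a = [0.05, 0.2, 0.35, 0.6, 0.8]   # close to the observations
model_b = [0.4, 0.4, 0.4, 0.4, 0.4]     # always predicts the mean

for name, pred in (("model_a", model_a), ("model_b", model_b)):
    print(name, round(cod(observed, pred), 3), round(rmse(observed, pred), 3))
```

A model that always predicts the mean FC obtains a CoD of about zero regardless of its RMSE, which is why both metrics are useful when classes differ strongly in their FC range.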
The spatial resolution of the feature dataset can impact the model skill, as evidenced in the reduced performance of RFR using PS imagery (see
Appendix A.4). Ettritch et al. also found an exponential increase in total dune vegetation abundance error with decreasing spatial resolution [
31]. However, the RFR tests comparing UAV multispectral data resampled at 10 cm and 50 cm pixels did not follow this pattern, with the higher resolution slightly underperforming in terms of the most abundant classes. Smyth et al. also noted higher errors occurring when applying supervised classification to very high image resolution multispectral UAV data (e.g., for a 0.01 m versus a 0.25 m pixel size) [
49]. A systematic study of the combined impact of spatial resolution and the predominance of one class over the others on unmixing accuracy is needed to clarify these influences.
The assessment of potential improvement in model skill from including morphological data in the predictor dataset showed that the distance to the shoreline was the more effective predictor, increasing model skill by 7 to 40% when using UAV data and by 10 to 55% when using WV2 data. Lansu et al. also observed an improvement in accuracy, albeit accompanied by a small decrease in precision, in dune habitat mapping with a Convolutional NN along the Dutch coast after adding the distance to the shoreline to coarse (25 cm) UAV multispectral data [
19]. Their results improved further by also including elevation data (digital surface model and canopy height), which was mostly linked to a better separation between shrubs and broadleaf trees [
19]. Similarly, Franklin et al. reported a 90% accuracy in detecting the presence of shrubs in the barrier islands of Virginia (USA) from Landsat-LiDAR imagery composites, using decision trees and RFs [
57]. Other works, like that of Cruz et al. who used RF with UAV data [
12], also advocate for the inclusion of elevation data for dune habitat monitoring. We found that the additional skill enhancement was too low to justify the need for topographic data (obtained from a DEM and resampled at 0.5 m), even for applications only considering the embryonic dune and the foredune ridge, where topography is more closely linked to species distribution (e.g., [
33]). In coastal dune systems, however, where elevation is an important model predictor for dune plant distribution, spectral unmixing could be applied jointly with novel approaches for achieving reasonable accuracies (0.5–1 m) in satellite-derived DEMs (e.g., [
58]) for coastal observation and monitoring through satellite imagery.
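A distance-to-shoreline predictor of the kind discussed above can be derived as in the following sketch, which is an illustration under simplifying assumptions: coordinates are planar (e.g., metres in a projected system), the shoreline is represented by a dense set of points, and the distance is taken to the nearest point rather than to the nearest line segment. All names and coordinates are hypothetical.

```python
import math


def distance_to_shoreline(pixel_centres, shoreline_points):
    """Distance from each pixel centre to the nearest shoreline point."""
    return [min(math.dist(p, s) for s in shoreline_points)
            for p in pixel_centres]


def add_predictor(spectral_features, extra):
    """Append one extra predictor value to each pixel's feature vector."""
    return [bands + [value] for bands, value in zip(spectral_features, extra)]


# Hypothetical shoreline sampled every metre along y = 0.
shoreline = [(float(x), 0.0) for x in range(0, 11)]
# Hypothetical pixel centres and their spectral bands.
centres = [(2.0, 3.0), (5.0, 12.0)]
bands = [[0.10, 0.20], [0.30, 0.40]]

distances = distance_to_shoreline(centres, shoreline)
features = add_predictor(bands, distances)
print(features)
```

The brute-force nearest-point search is adequate for a small illustration; for full scenes a spatial index or a raster distance transform would be the more practical choice.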
The grouping of dune plant species based on spectral similarities was performed as an exercise in improving model accuracy and showed that each group essentially adopted the skill of its best-performing member, without notable additional gains. While grouping did not meaningfully improve accuracy for our system, it should be stressed that some of the species included within a ‘spectrally similar’ group do not belong to the same successional stage type (e.g., AmmoA is a dune builder and should not have been grouped with sand-binder species like CrucM, ErynM, and OtanM; see [59]). Such factors, along with the location of plants within the dune system, need to be taken into account, especially in highly mixed dune systems. Hyperspectral imagery, on the other hand, could provide the missing spectral detail needed to improve the separation of plants or plant groups. Laporte-Fauret, for example, obtained good accuracies using RF classification of airborne hyperspectral imagery (spatial resolution: 1 m; spectral resolution: 4.5 nm; spectral range: from 409.23 to 987.08 nm) to classify nine plant species in a coastal dune system in southwest France, but noted a reduced model skill for small plants and low-density vegetation patches (i.e., mixed pixels) [
14]. Their results were limited by the low spatial resolution, combined with the pixel-by-pixel approach employed, whereas our results were likely mostly limited by the available spectral resolution.
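Spectral similarity between classes is often quantified with the spectral angle, and the sketch below assumes that the SA values mentioned earlier in this section refer to this measure. The spectra are invented for illustration, and the example is not meant to reproduce the grouping procedure of this study.

```python
import math


def spectral_angle(a, b):
    """Angle (radians) between two spectra treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    return math.acos(max(-1.0, min(1.0, dot / norm)))


# Hypothetical mean spectra of three classes (four bands each).
spectra = {
    "class_a": [0.05, 0.08, 0.04, 0.50],
    "class_a_shaded": [0.025, 0.04, 0.02, 0.25],  # same shape, half as bright
    "sand": [0.40, 0.45, 0.50, 0.55],
}

names = list(spectra)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        angle = spectral_angle(spectra[first], spectra[second])
        print(first, second, round(angle, 3))
```

Because the angle ignores overall brightness, two spectra with the same shape are treated as identical, whereas classes with different band ratios are separated; with only a few broad bands, many dune species may end up at small angles from one another, consistent with the limitation in spectral resolution noted above.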
It follows that coupling spectral unmixing algorithms with hyperspectral imagery could be the key to overcoming these limitations and appears to be a promising direction for future research in Earth observation of coastal dunes. Combining such data with automated (e.g., object-based classification [
15]) or semi-automated (e.g., [
60]) segmentation tools to delineate individual classes or class groups would significantly decrease the manual labour involved in compiling the reference dataset. Additionally, FCveg can be estimated using a subset of the ground truth data and machine learning methods, like RFs, which showed high accuracy in predicting the FC of desert vegetation [
46]. Such avenues could increase the automation and robustness of the approach, paving the way to wide-scale monitoring applications. At the same time and considering the good model skill in terms of sensing the FC of bare sand (or total vegetation) and major plant species from both drone and satellite imagery, such products can be capitalised on in coastal dune environmental research. For example, spectral unmixing results can (a) be incorporated into aeolian sand transport models to inform on plant distribution in the field, thus making it possible to improve approximations of flow–plant interactions (e.g., reducing flow velocity with vegetation cover [
61]) and predictions of dune morphological change; (b) be used to monitor sand cover as an indicator of dune mobility (e.g., mapping of blowouts [
62]); or (c) be used to monitor dune vegetation cover as an indicator of dune vulnerability (e.g., using FCveg thresholds [63]).