Accuracy Enhancement for Land Cover Classification Using LiDAR and Multitemporal Sentinel 2 Images in a Forested Watershed †

Mapping land cover with high accuracy has become a reality with the application of current remote sensing techniques. Because vegetation has a specific spectral response, soil and vegetation indices are adequate tools to help discriminate land uses. Additionally, the accuracy of satellite imagery classification can be improved by using multitemporal series combined with LiDAR data. This data fusion takes advantage of the information that LiDAR provides on vegetation cover density and of the capability of multispectral data to detect the type of vegetation. The main goal of this study is to analyze the accuracy enhancement in the land cover classification of two forested watersheds when using data fusion of an annual time series of Sentinel-2 images complemented with low-density LiDAR. The results show that overall accuracy is better when LiDAR data are included in the classification. This improvement can be significant for land cover classification of forested watersheds because of the influence that vegetation cover has on runoff estimation.


Introduction
Land cover in forested watersheds strongly influences runoff estimation, and it has been shown that land cover type and vegetation density are two of the main factors that determine runoff volume after rainfall [1]. Therefore, accurate land cover maps are crucial for forest monitoring, hydrologic studies and flood management.
Nowadays, mapping land cover has become a reality with the application of current remote sensing techniques. A wide variety of satellites and sensors is available, offering a broad range of spatial and spectral resolutions. Moreover, with the appearance of Sentinel-2 imagery from the European Space Agency (ESA), new studies on land cover mapping can be performed [2][3][4].
In recent years, several works have studied how to improve the accuracy of land cover mapping. In this regard, vegetation indices (VI) have proved to be adequate tools because of the specific spectral response of vegetation. Classic VI include the well-known Normalized Difference Vegetation Index (NDVI), originally proposed by Rouse, Jr. et al. [5]; the Soil Adjusted Vegetation Index (SAVI), developed by Huete [6]; the Second Modified Soil Adjusted Vegetation Index (MSAVI2), proposed by Qi et al. [7]; and the Green Normalized Difference Vegetation Index (GNDVI), developed by Gitelson et al. [8]. Additionally, Sentinel-2 imagery allows further indices to be derived beyond the classic ones.
Other works, such as those of Gebhardt et al. [9] and Zhao et al. [10], show how overall accuracy is improved by using multitemporal series instead of a single-date image. Moreover, with the development of LiDAR technology and machine learning techniques, works like [11][12][13][14][15] enhance overall accuracy through data fusion of LiDAR and multispectral (or hyperspectral) images. This data fusion takes advantage of the information provided by LiDAR data on the vertical structure of the vegetation and of the capability of multispectral data to capture the horizontal distribution of vegetation [13]. In this regard, LiDAR data from the Spanish National Plan of Aerial Orthophotography (PNOA) of the Spanish National Geographic Institute, with low density (0.5 points/m²) but with total coverage and free distribution, have been demonstrated to be an effective source of information for forest management [16].
The main goal of this study is to analyze the accuracy enhancement in land cover classification of forested watersheds when using data fusion of an annual time series of Sentinel-2 images complemented with low-density LiDAR and with soil and vegetation indices.

Study Area
The study area is composed of two neighboring watersheds in Badajoz province (Spain), of 146 km² and 194 km², respectively. According to the Spanish Cultivation and Land Use Map [17], the watersheds are mainly covered by the following vegetation types: perennial wooded forest (with few conifers), shrubland and herbaceous vegetation.

Sentinel-2 Data
All cloud-free Sentinel-2 images from April 2017 to April 2018 (nineteen images) have been processed. In addition to the spectral bands, various vegetation indices (VI) and soil indices (SI) have been obtained for each single-date image and have been included as predictors to improve the classification accuracy.
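As an illustration of how such indices are derived, the four classic vegetation indices named in the Introduction can be sketched per pixel as below. This is a minimal sketch and not the processing chain used in the study: the function names are hypothetical, reflectances are assumed to be surface reflectance scaled to the 0 to 1 range, and the pairing with Sentinel-2 bands B3 (green), B4 (red) and B8 (near infrared) is an assumption, since the text does not state which bands were used.

```python
import math


def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - R) / (NIR + R)."""
    return (nir - red) / (nir + red)


def savi(nir, red, soil_factor=0.5):
    """Soil Adjusted Vegetation Index with soil adjustment factor L.

    L = 0.5 is the commonly used default; it assumes reflectances in 0..1.
    """
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def msavi2(nir, red):
    """Second Modified Soil Adjusted Vegetation Index (closed form)."""
    a = 2.0 * nir + 1.0
    return (a - math.sqrt(a * a - 8.0 * (nir - red))) / 2.0


def gndvi(nir, green):
    """Green NDVI: same ratio as NDVI with the green band replacing red."""
    return (nir - green) / (nir + green)


# Hypothetical reflectances for one vegetated pixel (illustrative only).
pixel = {"green": 0.08, "red": 0.05, "nir": 0.40}
indices = {
    "NDVI": ndvi(pixel["nir"], pixel["red"]),
    "SAVI": savi(pixel["nir"], pixel["red"]),
    "MSAVI2": msavi2(pixel["nir"], pixel["red"]),
    "GNDVI": gndvi(pixel["nir"], pixel["green"]),
}
```

The ratio-based indices are undefined when both reflectances are zero, so a real workflow would mask no-data pixels before applying them.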

LiDAR Data
LiDAR data come from the National Plan of Aerial Orthophotography (PNOA) of the Spanish National Geographic Institute [18], with a low point density of 0.5 points/m² but with total coverage of the Spanish national territory and free distribution. Despite this low point density, the data can capture the three-dimensional structure of the forest canopy. The LiDAR data have been analysed with the FUSION software [19] to create the canopy height model (CHM) and the tree canopy cover factor (TCCF). Afterwards, at the Sentinel-2 spatial resolution, new rasters have been obtained representing different metrics of the CHM in each pixel, as follows: maximum value, mean value, median value and minimum value. The threshold between shrubs and trees has been fixed at three meters above the ground. Accordingly, two new rasters have been created: one representing the TCCF of the vegetation with normalized height above three meters, and another representing the TCCF of taller vegetation, with normalized height above five meters. These LiDAR-derived metrics have also been included as predictors.
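The per-pixel aggregation described above can be sketched as follows. This is a simplified illustration and not FUSION's algorithm: the function names are hypothetical, the input is assumed to be the list of normalized CHM heights (in meters) that fall inside one Sentinel-2 pixel, and canopy cover is approximated as the fraction of those heights above a threshold, whereas cover is often computed directly from the LiDAR returns.

```python
import statistics


def chm_pixel_metrics(heights):
    """Summarize the CHM heights (m) that fall inside one target pixel."""
    return {
        "max": max(heights),
        "mean": statistics.mean(heights),
        "median": statistics.median(heights),
        "min": min(heights),
    }


def cover_fraction(heights, threshold_m):
    """Fraction of heights strictly above threshold_m (a stand-in for TCCF)."""
    above = sum(1 for h in heights if h > threshold_m)
    return above / len(heights)


# Hypothetical CHM heights inside one pixel (illustrative values only).
heights = [0.0, 1.0, 2.0, 4.0, 6.0, 8.0]
metrics = chm_pixel_metrics(heights)
cover_3m = cover_fraction(heights, 3.0)  # tree cover, 3 m shrub/tree threshold
cover_5m = cover_fraction(heights, 5.0)  # taller vegetation, 5 m threshold
```

Each of these values would become one raster layer at the Sentinel-2 grid, so that the LiDAR metrics line up pixel by pixel with the spectral predictors.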

Supervised Classification
A pixel-based supervised classification has been carried out for land cover mapping using the Random Forest (RF) classifier [20]. The analysis has been performed with the Sentinel Application Platform (SNAP) software developed by ESA. Land cover has first been classified from the annual time series, including the VI and SI of each single-date image. Next, the LiDAR data have been added to the analysis, resulting in a second land cover map. Finally, classification accuracies have been evaluated by comparing the overall accuracy derived from the error matrices, together with the user's and producer's accuracies.
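The two classification runs differ only in the set of predictors given to the classifier. A minimal sketch of how the per-pixel predictors might be stacked into a single feature vector is shown below; it does not reproduce SNAP's internals, and the function name, the dictionary layout and the predictor names are all hypothetical.

```python
def build_feature_vector(dates, lidar_metrics=None):
    """Stack per-date predictors, then optional LiDAR metrics, for one pixel.

    dates: list with one dict per acquisition date, mapping a predictor
           name (band or index) to its value for this pixel.
    lidar_metrics: optional dict of date-independent LiDAR predictors.
    Returns (names, values) as two parallel lists.
    """
    names, values = [], []
    for i, date_values in enumerate(dates):
        for name in sorted(date_values):
            names.append(f"d{i:02d}_{name}")
            values.append(date_values[name])
    if lidar_metrics is not None:
        for name in sorted(lidar_metrics):
            names.append(f"lidar_{name}")
            values.append(lidar_metrics[name])
    return names, values


# Hypothetical pixel with two dates and two LiDAR metrics (illustrative only).
dates = [{"B4": 0.05, "NDVI": 0.70}, {"B4": 0.06, "NDVI": 0.60}]
lidar = {"chm_max": 7.5, "tccf_3m": 0.4}

names_s2, values_s2 = build_feature_vector(dates)            # time series only
names_all, values_all = build_feature_vector(dates, lidar)   # with LiDAR added
```

Because the LiDAR metrics do not change between acquisition dates, they are appended once per pixel rather than once per date.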

Results
For each land cover map, the error matrix has been calculated and the corresponding overall accuracy values have been compared. The annual time series alone yields an overall accuracy of 73%; when LiDAR data are included in the classification, overall accuracy improves to 79%.
In addition, both user's and producer's accuracies are better when the LiDAR metrics are considered (Table 1).
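For reference, the three accuracy measures can be derived from an error matrix as sketched below. The matrix is hypothetical and does not reproduce the study's results, and the layout (rows are mapped classes, columns are reference classes) is an assumption, since the text does not state the orientation used.

```python
def overall_accuracy(matrix):
    """Correctly classified samples (diagonal) divided by all samples."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def users_accuracy(matrix):
    """Per class: diagonal divided by row total (mapped as that class)."""
    return [matrix[i][i] / sum(matrix[i]) for i in range(len(matrix))]


def producers_accuracy(matrix):
    """Per class: diagonal divided by column total (reference samples)."""
    n = len(matrix)
    return [matrix[j][j] / sum(matrix[i][j] for i in range(n)) for j in range(n)]


# Hypothetical 3-class error matrix: rows = mapped class, columns = reference.
example_matrix = [
    [40, 8, 2],
    [4, 30, 6],
    [1, 2, 27],
]
oa = overall_accuracy(example_matrix)
ua = users_accuracy(example_matrix)
pa = producers_accuracy(example_matrix)
```

User's accuracy reflects commission errors in the map, while producer's accuracy reflects omission errors relative to the reference data, which is why both are reported alongside the overall value.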

Conclusions
In this work, the accuracy enhancement obtained by using LiDAR data with Sentinel-2 in multitemporal land cover classification of forested watersheds has been studied. The results show that overall accuracy is better when LiDAR data are included in the classification (data fusion). Therefore, LiDAR data from the National Plan of Aerial Orthophotography of the Spanish National Geographic Institute are a valuable addition to data fusion analysis. This improvement can be significant for land cover classification of forested watersheds because of the influence that vegetation cover has on runoff estimation and, consequently, on forest monitoring and flood management.