1. Introduction
Savannas cover over 40% of Africa’s land surface and support large populations of both wildlife and livestock [1]. Savanna ecosystems are characterized by the co-existence of grasses, forbs, and woody vegetation [2]. The percentage cover of these vegetation types shifts with environmental gradients and large herbivore diversity, density, and activity [3]. These landscapes are generally poorly suited to agricultural cultivation, and so the livelihoods of people living in savannas are often dependent on livestock, and therefore grazing resources [4].
In some savannas, seasonal grazing patterns of livestock and wildlife are driven by the spatial and temporal variability of grazing resources [5]. The key resource concept suggests that wildlife and livestock productivity and resilience are controlled by access to heterogeneous grazing resources that provide distinct functionality across space and time. Resources may differ in functionality: those used for growth and reproduction tend to be high-quality, low-biomass wet season resources [6], whereas low-quality, high-quantity dry season resources buffer populations through times of resource scarcity [6,7]. Importantly, the ability of these resources to meet the metabolic and nutritional requirements of both wildlife and livestock depends on spatially distinct species composition, which varies in terms of functionality [8].
African savannas can be heavily impacted by the intensity and type of livestock grazing. To avoid negative impacts of livestock grazing and maintain resilient and productive populations of livestock and wild grazers, savanna landscapes must be kept unfenced to prevent habitat loss and fragmentation. In addition, natural resource governance systems must ensure the preservation and generation of this functional heterogeneity [9]. Poor management of livestock production systems is, in many areas, the cause of a decline in vegetation productivity and grazing quality and a loss of vegetation heterogeneity [9,10]. Land degradation can manifest as a long-term loss of productivity, a reduction in palatable species, a reduction in perennial species, increases in forbs and invasive species, and woody encroachment [2,3,9], leading to decreased sustainability of pastoralist livelihoods and a reduction in wildlife populations [11,12]. For example, pastoralist livelihoods in Kenya are threatened by the loss of grass and forb species, which are preferred by cattle and sheep, while woody species preferred by goats are increasing [9]. Broad-scale losses of valuable plant species and encroachment by woody species have reduced savanna condition and livestock productivity in many areas of Kenya [11,12]. Combating degradation and fragmentation of critical grazing resources through grazing management is currently difficult given a general lack of information on the spatial distribution of both beneficial and deleterious species. Understanding landscape-scale distributions of vegetation types, important species for grazing, and indicators of land degradation is valuable for the sustainable management of environmental and natural resources, as well as for conservation and ecological research [13,14]. Mapping of vegetation at large spatial scales has been facilitated by Earth observation (EO) satellite programs such as NASA’s Landsat program, which offers over 40 years of Earth surface data through imaging spectroscopy. Satellite imaging spectroscopy, the collection of images in multiple spectral bands simultaneously, provides a means of collecting data on the spectral reflectance characteristics of the Earth’s surface and atmosphere [15].
The combination of satellite imagery, machine learning (ML), and classification algorithms has enabled the development of techniques to accurately map a range of landscape characteristics. Classifiers use ML to separate the characteristics of specified classes, such as the spectral characteristics of vegetation types. The learned characteristics of classes can then be used to classify satellite images and thereby produce geographically referenced classification maps [15]. There are two main types of classifier: unsupervised and supervised classification algorithms (SCAs). Unsupervised classifiers separate data into a number of natural clusters, with that number defined by the analyst, whereas SCAs require training data and attempt to classify the data according to predefined training classes [16]. SCAs applied to remote sensing imagery need to classify large volumes of multidimensional data (each spectral band is a dimension) using relatively low volumes of training data, which are frequently imbalanced and noisy [17].
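The distinction between the two classifier types can be illustrated with a minimal sketch. The two-band reflectance values, the class names, and the choice of k-means and a nearest-centroid rule are assumptions made purely for illustration; they are not the algorithms or data used in this study.

```python
import random

# Hypothetical (red, NIR) reflectance pairs, invented for illustration only.
grass = [(0.10, 0.45), (0.12, 0.50), (0.11, 0.48), (0.09, 0.47)]
bare_soil = [(0.30, 0.35), (0.32, 0.33), (0.28, 0.36), (0.31, 0.34)]
pixels = grass + bare_soil


def sq_dist(a, b):
    """Squared Euclidean distance between two spectra."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Band-wise mean of a list of spectra."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def kmeans(points, k, iterations=20, seed=0):
    """Unsupervised: the analyst supplies only the number of clusters k."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    assignment = []
    for _ in range(iterations):
        # Assign every pixel to its closest centroid, then update centroids.
        assignment = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = mean_point(members)
    return assignment


def fit_nearest_centroid(training):
    """Supervised: learn one mean spectrum per predefined training class."""
    return {label: mean_point(samples) for label, samples in training.items()}


def predict(model, pixel):
    """Label a pixel with the class whose mean spectrum is closest."""
    return min(model, key=lambda label: sq_dist(pixel, model[label]))


# Unsupervised output: anonymous cluster indices the analyst must interpret.
clusters = kmeans(pixels, k=2)

# Supervised output: class names taken from a small labelled training set.
model = fit_nearest_centroid({"grass": grass[:2], "bare_soil": bare_soil[:2]})
labels = [predict(model, p) for p in pixels]
```

The unsupervised result contains only cluster indices, which still have to be matched to real land cover by the analyst, whereas the supervised result carries the class names supplied with the training data.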
With advancements in the coverage, spatial resolution, and spectral resolution of remote sensing satellites and in SCAs, mapping vegetation at the vegetation community or even species level has become feasible [18,19]. Two of the most commonly used SCAs in remote sensing studies are support vector machine (SVM) and random forest (RF) [18,20]. SVM is an attractive classifier for remote sensing studies because of its capacity for regularization with limited training data, and it therefore suffers less from over-fitting than many SCAs [21,22]. RF is highly versatile, computationally efficient, and frequently provides the most accurate classification results in model comparison studies using remote sensing imagery [19]. However, RF can suffer from over-fitting and cannot be regularized in the way that SCAs not based on decision trees can, although over-fitting can be overcome by pruning trees [19]. Some studies report that RF is more accurate than SVM, while others report the opposite [18,23]. It is likely that sample size, class size equality, image resolution, and the characteristics of the study area all play a role in determining classification accuracy [23,24,25,26,27].
In order to map plant species, the spectral characteristics (known as the spectral signature) of a species must vary more between species than within species [28]. Distinct spectral signatures allow the separation of species by SCAs and therefore the accurate classification of pixels. The chemical composition and structure of green plant species are often very similar, which results in species exhibiting similar spectral reflectance and scattering properties [29]. Consequently, to detect small differences in the spectral signatures of species, instruments with many contiguous bands at small bandwidths (<20 nm) tend to produce better classification accuracy [13]. The most important regions of the electromagnetic spectrum for distinguishing plant species are the near-infrared (NIR) and short-wave infrared (SWIR) regions [30,31,32,33].
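The requirement that spectra vary more between species than within them can be expressed as a simple per-band ratio. The sketch below is one possible formulation, assumed here for illustration (a Fisher-style ratio of between-species to mean within-species variance); the species names and reflectance values are invented.

```python
from statistics import mean, pvariance

# Hypothetical reflectance samples per species and band, invented for illustration.
samples = {
    "species_a": {"red": [0.08, 0.09, 0.10], "nir": [0.40, 0.42, 0.44]},
    "species_b": {"red": [0.09, 0.10, 0.08], "nir": [0.55, 0.57, 0.53]},
}


def separability(samples, band):
    """Ratio of between-species variance to mean within-species variance."""
    per_species = [values[band] for values in samples.values()]
    # Variance of the species means: how far apart the species sit in this band.
    between = pvariance([mean(v) for v in per_species])
    # Average variance inside each species: how noisy each signature is.
    within = mean([pvariance(v) for v in per_species])
    return between / within


ratios = {band: separability(samples, band) for band in ("red", "nir")}
```

A ratio well above 1 suggests that a band separates the species; a ratio near 0 suggests that the band contributes little. With these invented values, only the NIR band separates the two species.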
One way to enhance the spectral differences between species is to utilize multiple images taken across a growing season [22,33]. Studies have shown that, when using a single image to map plant species, an image taken at peak productivity often produces the most accurate results [20,30,33]. Recently there has been a surge of interest in how the use of multiple images acquired across time (time series imagery) affects the accuracy of plant species mapping [20,34]. Time series imagery has been used to increase the accuracy of plant species classification, albeit in only a limited number of studies [20,34]. This is possible because the spectral characteristics of plant species change with their phenological stage (e.g., greening up, flowering, seeding). Time series imagery therefore facilitates class discrimination by SCAs by exploiting the spectral variation associated with phenology. Studies using time series imagery often focus on species of similar physiognomy and thus do not consider how time series imagery may differently affect tree, shrub, and grass classification accuracy. It is reasonable to predict that the inclusion of imagery across seasons may enhance the classification accuracy of tree and shrub species, but not of grass species. Grasses tend to lose pigmentation rapidly after rains diminish, but woody vegetation such as trees and shrubs does not [35]. Therefore, distinguishing the spectral characteristics of some types of vegetation may be enhanced or diminished by the inclusion of dry season imagery, depending on the vegetation type.
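One common way to use time series imagery is to stack the band values of each pixel from several acquisition dates into a single feature vector. The sketch below assumes this stacking approach and uses invented (red, NIR) values for one grass pixel and one shrub pixel; it shows only how phenological differences can enlarge the spectral distance between classes, and it is not a description of the workflow used in this study.

```python
def stack_dates(*date_vectors):
    """Concatenate one pixel's band values from several acquisition dates."""
    stacked = []
    for vector in date_vectors:
        stacked.extend(vector)
    return stacked


def distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Hypothetical (red, NIR) reflectance per date, invented for illustration.
# The two pixels look alike in the wet season; the grass senesces in the dry season.
grass_wet, grass_dry = [0.08, 0.45], [0.20, 0.25]
shrub_wet, shrub_dry = [0.07, 0.46], [0.10, 0.40]

# Separation using the wet season image alone.
single_date = distance(grass_wet, shrub_wet)

# Separation using wet and dry season images stacked into one feature vector.
multi_date = distance(
    stack_dates(grass_wet, grass_dry),
    stack_dates(shrub_wet, shrub_dry),
)
```

In this invented case the stacked vectors are much further apart than the wet season vectors alone. Whether real species gain or lose separability from dry season imagery is the empirical question raised above.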
Hyperspectral satellite imagery has distinct advantages over multi-spectral imagery for mapping plant species. However, given the coverage, repeat acquisition, spatial resolution, and data availability afforded by recent multi-spectral missions such as Sentinel-2, understanding the accuracy with which multi-spectral imagery and SCAs can be used to map important plant species is of great interest. The multi-spectral instrument (MSI) on board the twin Sentinel-2 satellites contains sensors for four narrow bands (<20 nm) at the red edge and NIR region, as well as other bands in the SWIR and visible region [36]. Sentinel-2 is also attractive because of its short revisit time (five days) and spatial resolution of 20 m in the NIR and SWIR bands [36].
In many parts of Africa, such as southern Kenya, semi-arid savannas are transitioning in use from traditional pastoralist grazing systems to agro-pastoralism, which is often accompanied by land subdivision, increased sedentarization, and breakdowns in traditional natural resource management structures [37,38]. As these systems undergo rapid changes in their governance and management (and hence ecological structure), aiding local communities and governments in the monitoring of their resources enhances the capability of pastoral communities and wildlife managers to detect and respond to these changes. This study aims to capitalize on the freely available data provided by the Sentinel-2 mission to map several key grazing species and indicators of land degradation in an African savanna ecosystem.