Article On the Suitability of MODIS Time Series Metrics to Map Vegetation Types in Dry Savanna Ecosystems: A Case Study in the Kalahari of NE Namibia

The characterization and evaluation of the recent status of biodiversity in Southern Africa’s savannas is a major prerequisite for suitable and sustainable land management and conservation purposes. This paper presents an integrated concept for vegetation type mapping in a dry savanna ecosystem based on local scale in-situ botanical survey data with high resolution (Landsat) and coarse resolution (MODIS) satellite time series. In this context, a semi-automated training database generation procedure using object-oriented image segmentation techniques is introduced. A tree-based Random Forest classifier was used for mapping vegetation type associations in the Kalahari of NE Namibia based on inter-annual intensity- and phenology-related time series metrics. The utilization of long-term inter-annual temporal metrics delivered the best classification accuracies (Kappa = 0.93) compared with classifications based on seasonal feature sets. The relationship between annual classification accuracies and bi-annual precipitation sums was analysed using data from the Tropical Rainfall Measuring Mission (TRMM). Increased error rates occurred in years with high rainfall rates compared to dry rainy seasons. The variable importance was analyzed and showed high-rank positions for features of the Enhanced Vegetation Index (EVI) and the blue and middle infrared bands, indicating that soil reflectance was crucial information for an accurate spectral discrimination of Kalahari vegetation types. Time series features related to reflectance intensity obtained higher rank positions than phenology-related metrics.


Introduction
Plant communities are basic natural resource management units and provide baseline information for ecological processes and functioning in semi-arid rangelands to evaluate dynamic tendencies and grazing capacity [1]. However, there is a lack of consistent environmental geodata on a national scale in Namibia [2]. In the past, a number of vegetation survey projects have been carried out in parts of Namibia. Volk [3] and Giess [4] conducted general descriptive vegetation surveys on a national scale. Phyto-sociological and vegetation-environmental studies have been realized in selected regions, e.g., in the Khomas Hochland south of Windhoek [5], Central Namib [6,7], Waterberg [8] and in the Etosha Pan by Le Roux et al. [9] and Du Plessis et al. [10]. Nation-wide estimations of biomass and vegetation cover have been conducted using remote sensing techniques [11-13]. Nevertheless, approximately 60%-70% of Namibia's surface has not been analyzed in terms of vegetation composition and community structure [2], which emphasizes the need for implementing standardized bottom-up approaches in the field of biodiversity assessment and vegetation mapping in Namibia.
New techniques have emerged during the past years to estimate and analyse land-use and land-cover change (LUCC), land value and land functioning, strongly supported by interdisciplinary approaches of different scientific communities on land change science, remote sensing, geoinformatics, and local scale studies [14,15]. Satellite applications have proven to be an effective tool for land-cover mapping and monitoring, providing consistent spectral, spatially explicit and temporally up-to-date indicators of surface processes and the status of biodiversity [16,17].
Phenological characterizations of satellite time series of the Advanced Very High Resolution Radiometer (AVHRR) onboard the National Oceanic and Atmospheric Administration (NOAA) satellites and the Moderate Resolution Imaging Spectroradiometer (MODIS) onboard Terra/Aqua have been widely used for global land-cover mapping purposes [18] and for mapping southern African biomes and bioregions [19]. Statistical metrics computed on MODIS time series have been successfully used as input features to classify land-cover classes characterized by differing phenological patterns, for example by Hansen et al. [20] for mapping global continuous fields of tree cover and by Gessner et al. [21] for mapping fractional vegetation cover in Namibia on a regional scale.
Non-parametric machine learning algorithms have proved to be an effective tool for land-cover mapping using a high number of input variables. Classification and regression trees (CART), as introduced by Breiman et al. [22], have been applied to satellite time series of reflectances and vegetation indices, such as the normalized difference vegetation index (NDVI, [23-25]). The application of ensembles of classification and regression trees resulted in increased mapping accuracies and more stable classification results [26]. Random Forests, a tree-based ensemble classifier [27], proved to be effective for mapping agricultural land-use [28] and for mapping ecotope classes based on hyperspectral imagery [29]. In the global and regional remote sensing community, applications of tree-based classification algorithms have focussed on a per-pixel basis. More recently, land-cover related object attributes, such as shape and neighbourhood, have been integrated into the classification, which can enhance thematic depth and mapping accuracies. Object-oriented classification techniques have been combined with Random Forest classification for mapping agricultural lands by Watts and Lawrence [30].
Classification and regression trees (CART, [22]) are well known and have been applied for global classifications of plant functional types [31,32], for mapping fractional vegetation cover on global [20] and regional scales in Namibia [33], and for classifying global and regional land-cover maps [23,29].
Results from legend harmonization analyses of the main global land-cover maps showed limited class agreements in highly heterogeneous landscapes, characterized by mixed classes of trees, shrubs and herbaceous vegetation. Thus, semi-arid savanna ecosystems, characterized as lands with herbaceous or woody understories and a forest canopy cover between 10%-30%, have the lowest mapping accuracies compared with more homogeneous vegetation types [34,35]. The most recent global land-cover map at 300 m spatial resolution, developed from bi-monthly composites of the Medium Resolution Imaging Spectrometer (MERIS) onboard the ENVISAT platform, is provided by the GLOBCOVER project. Validation results over classes related to savanna ecosystems such as closed to open shrubland and grassland (>15% vegetation cover, <5 m height) show insufficient user's accuracies [36], which emphasizes the need for adapted regional studies to develop standardized training methods for image classification and to estimate the most important satellite-based features for classification.
Since 2000, the interdisciplinary project BIOTA-Africa (Biodiversity Transect Analysis in Africa) has collected and analyzed data on different levels of detail in Namibia, involving different observations such as plant species composition from in-situ data and coarse scale earth observation land-cover data on a regional scale. Bottom-up approaches of ecosystem assessment and understanding of landscape functioning can provide basic information to develop reliable, regionally validated vegetation maps. High resolution satellite time series data provide consistent information on landscape dynamics, disturbances and land change processes. Towards the development of reliable land-cover geo-information, a thorough understanding of what detail can be mapped on each scale and how land-cover information can be integrated has to be provided to make synergistic use of the multiple observation data sources. Recognizing the existing uncertainties, especially in savanna ecosystems, this paper aims to conduct a bottom-up remote sensing-based vegetation mapping to outline the capabilities for developing reliable land-cover products in dry semi-arid ecosystems. Thus, the purpose of this paper is to:
- present an integrated concept for vegetation mapping in a dry savanna ecosystem based on local scale in-situ botanical survey data with high resolution (Landsat) and coarse scale (MODIS) satellite time series data.
- analyse the suitability of intensity-related and phenology-related metrics derived from MODIS time series for single annual and long-term inter-annual classifications from 2001 to 2007.

Study Region
The study region, as shown in Figure 1, comprises the communal areas in the Eastern Kalahari in Namibia with a geographic extent from 17°30'E to 21°E and 19°45'S to 21°45'S. The area is characterized by a sub-continental climate with summer rain sums of 350-450 mm in the long-term annual average, usually with a high variability. The geology of the study area is dominated by aeolian Kalahari sands with sporadic rock outcrops of sandstone, limestone, schist and dolomite of the Karoo Sequence and the Damara Sequence with a mean altitude of 1,200 m [37]. The landscape can be grouped in three main vegetation types after Giess [4], the Central Kalahari, the Northern Kalahari and the Thornbush shrubland in the western part of the study region. Topographically, the transition between the Central Namibian Highlands and the Kalahari Basin is apparent from the SW-NE oriented incised omirimba (shallow water courses with no visible gradients or visible water course, typical of the arid Kalahari sand plateau [38]). The area can be classified into seven agro-ecological zones (AEZ) of common land management practices, e.g., the Southern Omatako and Fringe plains of the Central Plateau, Kalkveld, pans and stabilized dunes of the Kalahari Sand Plateau. All AEZ are characterized by extensive grazing and limited cropping [39].
Figure 1. Overview of the study region in communal areas in the Namibian Eastern Kalahari showing the main savanna vegetation types after Giess [4] overlain with the distribution of botanical field samples [40].

Field Survey and Vegetation Data Processing
Table 1. Synoptic vegetation type legend showing the ten vegetation types synthesized from the phyto-sociological analysis. The structural vegetation type classes after Edwards [41] as well as mean and standard deviation (Sd) of the cover of tree, shrub and herbaceous layer as sampled in the field indicate slight differences in the life form composition. Note the variances of relevé numbers due to limited accessibility and the extended sample size after the merge process with image objects from the segmentation of Landsat imagery.
In this study, data of different scales were used and each data type contains scale-specific land-cover information. In-situ land-cover descriptions resulting from the reconnaissance survey of the landscapes, soils and vegetation of the Eastern Communal Areas provide a detailed description of land types, vegetation composition and physiognomy and habitat settings. However, extensive effort is needed for updating and the spatial alienability is limited. A total of 422 randomly selected vegetation samples, as described in Strohbach et al. [40], were taken during the late growing season from April to May 2004 by applying a stratified sampling with a standardized plot size of 20 m × 50 m following the Braun-Blanquet approach [1]. The survey included GPS readings and the recording of floristic composition and habitat information after Edwards [41] using the SOTER methodology [42]. Species and habitat data were archived in the TurboVeg database [43] of the National Botanical Research Institute (NBRI), then classified and refined using the TWINSPAN [44] and PHYTOTAB [45] packages. The resulting vegetation associations were clustered after characteristic, differentiating and typical species [40], to synopsize 12 main vegetation types according to dominant species occurrence, as shown in the synoptic legend in Table 1.
The main vegetation structure types after Edwards [41] are moderately closed to open shrub- and bushlands followed by a thicket and a woodland class. The sample size varies among the vegetation types due to the general inaccessibility of the study area, especially the sparsely populated central and eastern areas. Two additional classes (graminoid crops and bare areas/hardpans) were added to the classification legend to properly represent the major land-cover characteristics in NE Namibia.

Satellite Data
Five Landsat-7 ETM+ scenes with six reflectance bands (1-5, 7) from 0.45-2.35 µm spectral range (path 176 to 178, row 74 to 75) acquired between March and June 2000 with 30 m resolution were used as input for the object-oriented segmentation explained in Section 1.4. The MODIS collection 5 product (MOD13Q1, 16-day composites in sinusoidal projection) at the original 232 m resolution was used for time series analysis. The pre-processing included subsetting to the areal extent of the vegetation survey plots and quality analysis. Low quality data were identified based on the MODIS quality assessment science data sets [46] and gaps were filled by linear interpolation. The quality analysis was conducted using the Time Series Generator (TiSeG) software [47]. Annual time series were calculated for EVI and the blue (459-479 nm), red (620-670 nm), NIR (841-876 nm), and MIR (1,230-1,250 nm) spectral bands [48]. Bi-annual cumulative precipitation data as averaged over the study region from the Tropical Rainfall Measuring Mission (TRMM, [49]) were used to analyse the relationship of classification error rates to bi-annual rainfall conditions.
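The gap-filling step can be illustrated with a minimal Python sketch. It is not the TiSeG implementation: the function name `fill_gaps_linear`, the boolean quality mask, and the handling of gaps at the series ends (repeating the nearest usable value) are assumptions made for illustration only.

```python
def fill_gaps_linear(values, good):
    """Replace low-quality observations by linear interpolation in time.

    values: one value per 16-day composite
    good:   True where the quality flags mark the value as usable
    (hypothetical helper; edge handling is an assumption)
    """
    idx = [i for i, ok in enumerate(good) if ok]
    if not idx:
        raise ValueError("no usable observations in the series")
    filled = list(values)
    for i in range(len(values)):
        if good[i]:
            continue
        left = max((j for j in idx if j < i), default=None)
        right = min((j for j in idx if j > i), default=None)
        if left is None:        # gap at the start: repeat first usable value
            filled[i] = values[right]
        elif right is None:     # gap at the end: repeat last usable value
            filled[i] = values[left]
        else:                   # interior gap: weight by temporal distance
            w = (i - left) / (right - left)
            filled[i] = values[left] + w * (values[right] - values[left])
    return filled
```

For an illustrative series such as `[0.20, 0.22, 0.90, 0.30, 0.34]` with the third value flagged as unusable, the flagged value is replaced by the midpoint of its two neighbours.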

Training Database Generation
An object-oriented segmentation was performed on the five Landsat scenes to retrieve homogeneous objects based on similar reflectance settings, in order to regionalize the in-situ information of the training points to the coarse MODIS pixel size. Using the multi-resolution segmentation function in the Definiens Developer 7.0 software package [50], three segmentation levels were created on pixel level data to locally adapt the optimization procedure for computing homogeneous image segments.
The first segmentation level was generated using a scale parameter of 10. The higher levels were computed with increased shape and compactness values (level 2) and a scale parameter of 25 (level 3). Increasing the scale parameter merged small image objects into larger homogeneous objects that approximate the 232 m pixel size. Each botanical field plot was intersected with the surrounding Landsat segments. The segments assigned a training class label were used as the training database for the MODIS time series metrics, as visualized in Figure 2. The up-scaling procedure resulted in an increase of the spatial extent of the training database (compare the plot number of the botanical field survey with the sample size applied on the MODIS data in Table 1), a key issue to capture the phenological variability of semi-arid vegetation in the MODIS features within the training process. To avoid the inclusion of natural disturbance by fires in the training database, the MODIS burned area product (MOD45A1, [51]) was used to exclude training areas located on burnt sites.

Calculation of Time Series Metrics
The feature set for the classification was composed of two kinds of time series metrics, namely phenological metrics and intensity-related metrics derived from a temporal segmentation of the annual time series. The metrics were computed on an annual basis from 2001 to 2007. The resultant annual statistical features were used to derive inter-annual metrics by computing the long-term statistics for each annual segment metric, as displayed in the flowchart in Figure 3.
Phenological metrics contain information on the timing of recurring vegetation cycles (emergence and senescence of the canopy) and were generated using the TIMESAT software [52]. An adaptive Savitzky-Golay filter was applied to the upper envelope of the seasonal EVI curve to reduce atmospheric errors and to obtain a meaningful phenological curve. The season cut-off parameter was set to one growing season per year. The start of the growing season (SGS) and the end of the growing season (EGS) were defined as the dates at which the EVI value has increased to, or declined to, 25% of the range between minimum and maximum EVI, respectively. Based on SGS and EGS, the length of the growing season (LGS) and the mid position of the growing season (MGS) were derived. Further phenological metrics were extracted that describe the curve characteristics of the seasonal EVI time series, such as maximum EVI value, base EVI value, left derivative, right derivative and amplitude. The integrated EVI over the growing season, a widely used proxy for net primary production (NPP), was also computed. Intensity-related metrics comprise the single-date 16-day EVI composites and additional statistical metrics computed from the temporally segmented MODIS EVI and reflectance bands. Three temporal segments per year were derived from the annual time series following the main seasonal characteristics of vegetation activity in Namibia. The first segment (Jan.-Apr.) features the main rainy season (mid- to late summer), segment two (May-Jul.) the fall and winter season, and segment three (Aug.-Dec.) the hot, dry spring and early summer. Temporal mean, standard deviation, minimum, maximum, and range (between minimum and maximum) were calculated on each temporal segment from the spectral bands of the MOD13Q1 product (EVI, blue, red, NIR, MIR). The feature generation workflow is visualized in Figure 3.
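The threshold-based definition of the season dates can be sketched as follows. This is a simplified illustration and not the TIMESAT algorithm: it assumes an already smoothed EVI curve with a single season, uses the overall minimum as the base level, interpolates linearly between composites, and takes MGS as the midpoint of SGS and EGS. The function names and the time axis in days are hypothetical.

```python
def crossing(t0, v0, t1, v1, level):
    """Linearly interpolate the time at which the curve reaches `level`."""
    if v1 == v0:
        return t0
    return t0 + (level - v0) * (t1 - t0) / (v1 - v0)


def season_metrics(t, evi, fraction=0.25):
    """Season dates from a smoothed, single-season EVI curve.

    t:   time of each composite (e.g., days since the start of the series)
    evi: smoothed EVI values, same length as t
    """
    peak = max(range(len(evi)), key=evi.__getitem__)
    lo, hi = min(evi), evi[peak]
    level = lo + fraction * (hi - lo)      # 25% of the seasonal range

    i = peak                               # walk left from the peak
    while i > 0 and evi[i - 1] >= level:
        i -= 1
    sgs = t[0] if i == 0 else crossing(t[i - 1], evi[i - 1], t[i], evi[i], level)

    j = peak                               # walk right from the peak
    while j < len(evi) - 1 and evi[j + 1] >= level:
        j += 1
    egs = t[-1] if j == len(evi) - 1 else crossing(t[j], evi[j], t[j + 1], evi[j + 1], level)

    return {"SGS": sgs, "EGS": egs, "LGS": egs - sgs,
            "MGS": (sgs + egs) / 2, "amplitude": hi - lo}
```

Using a continuous day count rather than day of year avoids the wrap-around at the turn of the year, which matters here because the growing season spans December and January.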
Classifications were performed based on the annual and long-term feature sets. The annual classification sets consist of the 75 MODIS-derived annual statistical segment metrics, 10 MODIS-derived phenological date- and NPP-related metrics, and 23 Savitzky-Golay-filtered EVI 16-day composites for the analysed year (DOY 001-353). The total number of features in the annual set was 108. The 75 statistical features from the temporal segmentation for six years of MODIS time series were used to derive 375 inter-annual features.
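The segment statistics and the resulting feature counts can be reproduced with a short Python sketch. The month boundaries follow the three segments given in the text; the assignment of each composite to a calendar month, the use of the population standard deviation, and all names are assumptions. The factor of five for the inter-annual set is inferred from 375 = 75 × 5 and is consistent with applying the same five statistics across years.

```python
from statistics import mean, pstdev

# Temporal segments as (first month, last month), taken from the text
SEGMENTS = {"rainy": (1, 4), "winter": (5, 7), "dry": (8, 12)}


def segment_stats(values, months):
    """Five statistics per temporal segment for one band and one year.

    values: one observation per composite
    months: calendar month assigned to each composite (assumption)
    """
    out = {}
    for name, (first, last) in SEGMENTS.items():
        seg = [v for v, m in zip(values, months) if first <= m <= last]
        out[name] = {
            "mean": mean(seg),
            "sd": pstdev(seg),          # population SD (assumption)
            "min": min(seg),
            "max": max(seg),
            "range": max(seg) - min(seg),
        }
    return out


# Feature bookkeeping as stated in the text
n_bands, n_stats, n_segments = 5, 5, 3                 # EVI, blue, red, NIR, MIR
annual_segment_features = n_bands * n_stats * n_segments   # 75
annual_total = annual_segment_features + 10 + 23           # 108
inter_annual = annual_segment_features * n_stats           # 375 (inferred factor)
```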

Separability Analysis
A separability analysis was performed to estimate the suitability of the MODIS time series derivatives for classifying vegetation type units. Bhattacharyya distance (B-distance) has been successfully used for time series discriminant analysis [53], land-use classification and pattern recognition in urban areas [54]. The B-distance coefficient indicates the probable proportion of variance reduction and class discrimination for a multi-feature space. The B-distance is implemented in MULTISPEC, a free software package designed for hyperspectral image analysis [55]. The advantage of using B-distance compared with other common distance measures like transformed divergence and Jeffries-Matusita distance is its large dynamic range, which does not saturate when applied to a large set of features. The classification was performed using Random Forests. Such non-parametric tree-based classifiers do not require a Gaussian distribution of the data. However, B-distance is a parametric separability measure. Given the differing statistical assumptions, a direct combination of both methods is problematic. Here, B-distance and Random Forests were applied independently. Bhattacharyya distance measures were computed on two different feature sets. The first set includes the date- and NPP-related phenological metrics and the annual seasonal segment metrics of the year 2004 as the reference year for the vegetation survey, with a total number of 108 features. The second comprises 375 features of long-term temporal segment metrics (2001 to 2007).
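To make the saturation argument concrete, the following Python sketch computes the B-distance for a single feature under a Gaussian assumption and converts it to the Jeffries-Matusita distance, JM = 2(1 − exp(−B)), which is bounded by 2. It is a univariate illustration only and not the MULTISPEC implementation, which operates on the multivariate feature space; the function names are hypothetical.

```python
import math
from statistics import mean, pvariance


def bhattacharyya_gaussian(a, b):
    """Univariate Bhattacharyya distance between two Gaussian classes.

    B = (m1 - m2)^2 / (4 (v1 + v2)) + 0.5 * ln((v1 + v2) / (2 sqrt(v1 v2)))
    Both samples need a non-zero variance.
    """
    m1, m2 = mean(a), mean(b)
    v1, v2 = pvariance(a), pvariance(b)
    term_mean = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    term_var = 0.5 * math.log((v1 + v2) / (2.0 * math.sqrt(v1 * v2)))
    return term_mean + term_var


def jeffries_matusita(b_distance):
    """Jeffries-Matusita distance; approaches 2 as the B-distance grows."""
    return 2.0 * (1.0 - math.exp(-b_distance))
```

Two well separated classes can differ strongly in B-distance while their JM values are both close to 2, which is why the unbounded B-distance remains informative for large feature sets.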

Random Forest Classification
Decision trees (DT) are structured in simple binary decisions. They are independent of data distribution and can handle categorical variables. The hierarchical structure further allows a biogeophysical interpretation of the relationship between input features and classes and can be useful if multi-source high dimensional remote sensing data is used. Bagging (bootstrap aggregation) aims to reduce the variance of a classification by training a number of weak classifiers with varying bootstrap samples from the training set and subsequently averaging the predictions. Boosting is based on classifications using differently weighted versions of the training set, which are combined in the resultant prediction [56]. Comparisons of bagging using Random Forest and boosting based on Adaboost achieved best accuracies for Adaboost at the cost of computation time [26,29,57].
Random Forest is a tree-based classifier in which multiple trees are produced and combined by equally weighted majority voting. Each tree is trained on a bootstrap sample of the original training dataset, which leaves out roughly one third of the samples, the so-called out-of-bag (OOB) sample. In addition, the input features considered at each split are selected at random. With the remaining 2/3 of the training data, trees are grown to their maximal depth using the Gini impurity index [22], because the random selection of samples and features counteracts overfitting. Here, the Random Forest package implemented in the R statistics language [58] was used.
The OOB sample is used to estimate the prediction error of each tree. Gislason [57] and Breiman [27] showed that the prediction error based on the OOB sample is slightly higher than that obtained using an independent test set. The more conservative OOB error was used to estimate classification accuracies. The variable importance function implemented in Random Forest is based on the internal OOB error estimates. The prediction error is computed for every tree based on the OOB bootstrap sample. In a second step, the OOB error is recomputed after randomly permuting each predictor variable. The differences in the accuracy measures are averaged over the tree ensemble. The accuracy for each variable is normalized by the standard error [58]. In this study, variable importance is examined to evaluate the used features in an ecological context. A detailed overview of Random Forests is given in Breiman [27].
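Two ingredients of this procedure, the out-of-bag split and the permutation-based importance, can be sketched in plain Python. The sketch is not the R implementation used in this study: `bootstrap_split` and `permutation_importance` are hypothetical helpers, the importance is computed for one arbitrary prediction function rather than averaged over the OOB samples of each tree, and no normalization by the standard error is applied.

```python
import random


def bootstrap_split(n, rng):
    """Draw n indices with replacement; the indices never drawn are out-of-bag."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    chosen = set(in_bag)
    oob = [i for i in range(n) if i not in chosen]
    return in_bag, oob


def permutation_importance(predict, X, y, feature, rng):
    """Decrease in accuracy after shuffling one feature column.

    predict: function mapping one feature row (a list) to a class label
    X, y:    evaluation rows and labels (the OOB sample in Random Forest)
    """
    def accuracy(rows):
        return sum(predict(r) == t for r, t in zip(rows, y)) / len(y)

    baseline = accuracy(X)
    column = [r[feature] for r in X]
    rng.shuffle(column)                    # break the feature-label relation
    shuffled = [r[:feature] + [v] + r[feature + 1:] for r, v in zip(X, column)]
    return baseline - accuracy(shuffled)
```

For large n, the expected out-of-bag share is (1 − 1/n)^n, which tends to 1/e (about 0.37), in line with the "third" of the training data mentioned above. A feature that the classifier ignores yields an importance of zero, since shuffling it cannot change any prediction.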

Suitability of MODIS Time Series Metrics for Semi-Arid Vegetation Mapping
The characteristic phenological patterns of each vegetation type are represented by the Enhanced Vegetation Index (EVI) time series shown in Figure 4, exemplified by the rainy season of the year 2004 with the dry-season transitions. Slight differences in the amplitude of green vegetation become apparent by comparing the maximum EVI values of moderately closed and semi-open shrub- and bushland classes. Semi-open vegetation types reach maximum EVI values between 0.45 and 0.5, while closed vegetation classes reach their maximum at 0.55. Distinct differences are visible in the green-up and senescence between vegetation structural types.
In general, as shown in Figure 4a,b, a markedly steep slope during green-up is apparent within the moderately closed shrub- and bushland classes Acacia erioloba-Terminalia sericea bushlands (Ae-Ts), Terminalia sericea-Combretum collinum shrub- and bushlands (Ts-Cc), and Acacia mellifera-Stipagrostis uniplumis shrublands (Am-Su). In comparison, a slow increase in photosynthetic activity is visible in Pterocarpus angolensis-Burkea africana woodlands (Pa-Ba) and Hyphaene petersiana plains (HP_pl). Acacia luederitzii-Ptichtolobium biflorum floodplains of the Omatako Omuramba (Al-Pb) and Terminalia prunioides thickets (Tp_th) have a distinct offset in greening and early senescence. For the rain period of 2003-2004, the onset of green-up varies between the end of November and January (day of year 337/001). The semi-open shrub- and bushland vegetation types are also characterized by a slight increase of vegetation activity and a lower peak in amplitude. An appreciable impact of the so-called small rainy season (varying in time and intensity from September to December) is visible in the Eragrostis rigidior-Urochloa brachyura grasslands (Er-Ub), expressed by a small peak before the main vegetative activity, which may be typical for vegetation types with a high grass cover.
The results of the separability analysis indicate the temporal and spectral capabilities of the MODIS MOD13Q1 product to discriminate vegetation type patterns. The Bhattacharyya distance is a relative measure of how reliably one class can be statistically separated from the remaining classes. A low distance value indicates a low separability and vice versa.
The average Bhattacharyya distances between Kalahari vegetation type classes in Table 2, calculated on MODIS intra-annual segment metrics for the season 2003-2004 (lower left values), indicate strong spectral and temporal confusion and thus a more uncertain classification between vegetation types with similar phenology, e.g., Acacia mellifera-Stipagrostis uniplumis shrublands, Acacia erioloba-Terminalia sericea bushlands, and Terminalia sericea-Combretum collinum shrub- and bushlands (Ae-Ts vs. Ts-Cc = 0.2, Ae-Ts vs. Am-Su = 0.31). The bare areas/pans class reached the highest mean inter-class B-distance values, caused by a significant delay of the growing period due to late green-up onset, which results from typical flooding at the end of the rainy season, and by a higher surface albedo compared with shrub- and grasslands. The significance of bands within the visible and middle infrared spectral range for the discrimination of vegetation community classes due to different soil settings is discussed later in Section 4. The highest B-distance value (44.2) is reached between Combretum imberbe-Acacia tortilis woodlands (Ci-At) and Hyphaene petersiana plains (HP_pl). The reason can be seen in their contrasting phenologies: Ci-At has a steeper slope with an early onset and a gradual decline at the end of the rainy season, whereas HP_pl is characterized by a delayed onset of green-up and fast senescence.
[41] and graminoid crops and pans shown in the MODIS (MOD13Q1, 232 m) Enhanced Vegetation Index (EVI) smoothed with a Savitzky-Golay filter. See Table 1 for class labels.

Vegetation Type Mapping
The vegetation type mapping is based on a vegetation survey in Namibia's eastern communal areas and describes the transition zone from the Kalahari to the Otavi mountains, where calcareous geology dominates in the NW, and the mountain savanna transition zone at the SW border of the study area. The resulting vegetation type map is shown in Figure 5.
In addition to the Omatako Omuramba, a number of smaller omirimba valleys cross the study area in SW-NE direction. In general, the spatial distribution of vegetation types follows the major topographic units. The Combretum imberbe-Acacia tortilis woodlands are typical for the main river bed of the Omatako Omuramba, dominated by bright soils, as displayed in Figure 5b.
These open woodlands change to Enneapogon desvauxii-Eriocephalus luederitzianus short shrublands on calcareous omirimba and pans (Ed-El), visible in the lower river course where the Acacia luederitzii-Ptichtolobium biflorum floodplains of the Omatako Omuramba (Al-Pb) disappear. Besides the Omatako river, the smaller omirimba valleys are mainly mapped as Ed-El or Eragrostis rigidior-Urochloa brachyura grasslands (Er-Ub), whereas their periphery is mapped as linear patches of Acacia mellifera-Stipagrostis uniplumis shrublands (Am-Su) (Figure 5c). Acacia mellifera-Stipagrostis uniplumis shrublands (Am-Su) and Acacia erioloba-Terminalia sericea bushlands (Ae-Ts) are the dominating vegetation types in the southern part of the study area, classified as Thornbush shrubland and Central Kalahari class after Giess [4], and mark the transition zone to the central Kalahari. Terminalia sericea-Combretum collinum shrub- and bushlands (Ts-Cc) are dominant on deep Kalahari sands. Due to different soil conditions of the historical longitudinal dune system, e.g., in the eastern part of the study region, Acacia mellifera-Stipagrostis uniplumis shrublands were mapped. Terminalia prunioides thickets (TP_th) are confined to a patch in the most NE part of the study area, associated with shallow soils on sub-outcropping limestones of the Otavi group [59]. Hyphaene petersiana plains (HP_pl) typically occur on calcareous soils and floodplains located at the periphery of the Karstveld. Pterocarpus angolensis-Burkea africana woodlands (Pa-Ba) are native in the northern Kalahari woodlands and were detected in the central to northern Kalahari transition. Due to misclassifications, patches of Pterocarpus angolensis-Burkea africana woodlands and Hyphaene petersiana plains were detected in the central Kalahari.

Classification Error Assessment
The classification accuracies of the classifications performed on the six annual sets between 2001 and 2007 and the inter-annual set including the complete time series database are notably different.
As described in Section 2.6, the map accuracy was calculated on the out-of-bag data randomly selected in each classification iteration. Every pixel in the training database is thus used as a reference for the estimation of the classification error.
The kappa statistics for the annual classification sets range between 0.87 and 0.91. The inter-annual classification set reaches a higher kappa coefficient of 0.93. The resulting maps contain a higher commission error rate, with user's accuracies ranging from 83.79% for the season 2006-2007 to 88.41% for the season 2004-2005. The lowest omission error rate was also reached for the season 2004-2005, with a producer's accuracy of 96.11%, whereas the season 2006-2007 scored the lowest producer's accuracy of 94.17%. The inter-annual classification set achieved comparatively higher producer's accuracies (97.73%) than user's accuracies (94.86%).
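For reference, the accuracy measures reported here can be derived from a confusion matrix as in the following Python sketch. It follows the usual convention in which producer's accuracy is the complement of the omission error and user's accuracy the complement of the commission error; the function name and the row/column orientation are assumptions, and no values from this study are used.

```python
def accuracy_measures(cm):
    """Overall accuracy, kappa, producer's and user's accuracies.

    cm[i][j]: number of reference samples of class i mapped as class j
    (rows = reference, columns = map; orientation is an assumption)
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    row_tot = [sum(cm[i]) for i in range(k)]
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    diag = [cm[i][i] for i in range(k)]

    overall = sum(diag) / n
    # Agreement expected by chance from the marginal totals
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2
    kappa = (overall - expected) / (1 - expected)

    producers = [diag[i] / row_tot[i] for i in range(k)]  # 1 - omission error
    users = [diag[i] / col_tot[i] for i in range(k)]      # 1 - commission error
    return overall, kappa, producers, users
```

With a purely illustrative two-class matrix `[[45, 5], [10, 40]]`, the overall accuracy is 0.85 and the chance agreement is 0.5, giving a kappa of 0.7.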
Seasonal and inter-seasonal MODIS time series features were tested for their suitability to map plant communities of different savanna vegetation types in the Kalahari and Kalahari-transition zone. B-distance values indicate a spectral separability that is approximately ten times higher for the inter-seasonal dataset than for the seasonal set of MODIS time series metrics. This result is also reflected in the classification error rates. Whereas the classification error varies between the different classes in the annual classification, a more balanced result among the classes was achieved in the long-term set, resulting in a more accurate map with a Kappa coefficient of 0.93. Precipitation data from the Tropical Rainfall Measuring Mission (TRMM) were used to compare the annual classification error rates with annual precipitation sums. For this purpose, bi-annual cumulative rainfall sums were averaged over the study area. Figure 6 shows the comparison of annual OOB error rates and bi-annual cumulative rainfall. Increased OOB error rates occurred in years with high precipitation sums. Compared with the previous years, the growing seasons of 2005-2006 and 2006-2007, with precipitation sums in a range of 1,000 mm to 1,160 mm, were wet rainy seasons. These years were mapped with increased error rates of nine to ten percent overall OOB error. In contrast, the seasons of 2002-2003, 2003-2004, and 2004-2005 were characterized by low rainfall below 800 mm. Comparatively low error rates below eight percent were achieved for those seasons.

Requirements for Rainfall Amount for Vegetation Type Mapping
To summarize the suitability of multi-temporal MODIS time series features for vegetation type mapping in dry semi-arid savannas, the "best map" could be achieved using inter-annual segment metrics.Significant characteristics from MODIS time series metrics between the mapped vegetation type associations become apparent by including in this case a number of six annual growing periods.The vegetation mapping based on annual satellite time series achieves more reliable results for dryer years.
As shown in Table 1, the major life forms of the vegetation types in the Kalahari are characterized by open to closed shrublands, leading to very similar percent coverage values of the particular life form.This can be challenging in terms of an accurate statistical discrimination when coarse resolution satellite imagery is used.EVI time series are a measure of photosynthetic active vegetation due to the increased reflection of healthy vegetation in the near infrared band.Increased EVI values will be observed due to high rainfall rates.The closer the vegetation cover or the greenness of vegetation, the higher are the observed EVI values.Similar EVI observations for different classes can cause a decreased spectral separability among life form classes.For example, similar EVI values can be observed in a shrub-dominated savanna as in a grass-dominated savanna, if rainfall sums are high enough to produce a full coverage of green vegetation on the MODIS pixel.On a 232 m resolution an observation of a pure pixel for one single homogeneous life form group is improbable in savanna ecosystems due to the heterogeneous vegetation structure.A spectral separation of similar life form compositions, for instance the shrub and herbaceous layer, is therefore difficult, as is reflected by decreased classification accuracies.
The similar intensity level of EVI values in the growing season of different life-form classes may be the reason for decreased classification accuracies in the wet rainy seasons of 2005-2006 and 2006-2007. On the other hand, the dry rainy seasons of 2002-2003, 2003-2004, and 2004-2005 resulted in an increased significance of the reflectance characteristics of the underlying soil properties. The substantial influence of soil properties on local variations in vegetation phenology is discussed by Zhang et al. [60], and the significance of soil-related features for mapping vegetation types in open savannas is discussed in Section 4.2. One reason for the increased accuracy of a long-term feature set may be a characteristic response of each vegetation type class to precipitation conditions, as discussed in Klein et al. [61]. The variability of the phenological cycle of different life-form classes of savanna vegetation was analysed by Archibald et al. [62]. They showed that woody vegetation is less sensitive to inter-annual variability than grasses. Memory effects of tree species, as discussed below, and differing environmental cues on plant phenology, such as soil moisture for grasses and temperature for some tree and shrub species, can explain those differences. Including features in the classification process that contain information on the inter-annual variability of photosynthetic activity during the growing seasons, which is a response to spatially and temporally variable rainfall events, can help to separate different life-form classes in savannas.
However, sub-Saharan ecosystems have a high sensitivity to short-term rainfall variability, which can cause significant effects on land-cover dynamics [63]. Hence, vegetation phenology is strongly related to precipitation amounts. Further research has to be conducted to develop a comprehensive understanding of the phenology in dry savannas and their interactions with precipitation patterns.

Spectral and Temporal Requirements for Dry Savanna Vegetation Mapping
Remote sensing applications dealing with time series images can be time consuming in terms of image processing and memory allocation. Variable importance can be useful for a pre-selection of relevant input variables to develop an ecologically-based understanding of remote sensing parameters. Here, the variable importance is used to highlight the most useful time series metrics for mapping in an open savanna ecosystem.
Figure 7 displays the 15 most important features for the classification of the seasonal dataset (2003-2004) for the intensity-related and phenology-related time series metrics, showing the class-wise variable importance score. An analysis of the variable ranking shows that EVI is the most frequently used variable, followed by the middle infrared band. Lower-scoring bands are the red and near infrared bands. The list of the 15 high-ranked features indicates a major importance of the phenological information in the EVI metrics. The high-ranked middle infrared and blue bands indicate that reflectance information on the pedological context of a vegetation type is of major importance to distinguish vegetation type classes. The spatial distribution of semi-arid vegetation types is related to subjacent soil properties. Soil type and color are represented in the visible spectral range (soil color) and middle infrared spectral reflectance (soil properties). Regarding the similar life form composition of the vegetation type classes, as displayed in Table 1, the phenological patterns were observed to be very similar. Therefore, the information content of EVI may be too low for accurate vegetation type class discrimination, which highlights the importance of the blue and middle infrared bands in the variable ranking.
Figure 7 indicates the dominance of intensity-related features from the temporal segmentation (e.g., BLUE 2004 Seg-3 Max, coding the maximum value of segment three in 2004 of the blue band) followed by the occurrence of the EVI 16-day composites (e.g., EVI 2003 DOY-305, coding the EVI value of the day of the year 305 in 2003).
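To make the feature naming above more concrete, the following sketch shows how an intensity-related segment metric such as "Seg-3 Max" could be derived from one season of composites for a single band. The splitting of the season into contiguous segments of near-equal length, the choice of minimum, maximum and mean as summaries, and the function name are all assumptions made for illustration; the segmentation actually used in the study is not reproduced here.

```python
def segment_metrics(series, n_segments):
    """Summarise one season of composites per temporal segment.

    Assumption (illustrative only): the season is cut into
    `n_segments` contiguous segments of near-equal length.
    """
    if n_segments < 1 or n_segments > len(series):
        raise ValueError("n_segments must be between 1 and len(series)")
    base, extra = divmod(len(series), n_segments)
    metrics = []
    start = 0
    for i in range(n_segments):
        # The first `extra` segments receive one additional composite.
        length = base + (1 if i < extra else 0)
        chunk = series[start:start + length]
        metrics.append({
            "min": min(chunk),
            "max": max(chunk),
            "mean": sum(chunk) / len(chunk),
        })
        start += length
    return metrics
```

With a hypothetical list `blue_2004` of blue-band reflectances, a feature in the spirit of BLUE 2004 Seg-3 Max would then be `segment_metrics(blue_2004, n_segments)[2]["max"]`, where the number of segments is again an assumption.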
The analysis of the variable importance on a per-class basis indicates the bio-physical relevance of each time series metric for an increase of the accuracy in the tree ensemble. An example is shown for the maximum reflectance of the blue band in the third segment of 2004 (BLUE 2004 Seg-3 Max), which has the highest relevance for the class bare areas and pans. Due to periodic flooding in pans, the highest variance reduction in the tree ensemble can be achieved using spectral information of the blue band. Similar patterns of increased variable importance of blue reflectance can be observed for the Eragrostis rigidior-Urochloa brachyura grasslands and graminoid crops. Both classes are characterized by sparse vegetation cover and show increased importance values for the features related to soil properties. Similar findings on the response of herbaceous and woody life form groups were discussed in Archibald et al. [62].
Besides the spectral information, the occurrence of date-related features such as the single EVI 16-day composite layers strengthens the hypothesis that dry savanna vegetation types can be statistically discriminated from their phenological patterns. The availability of moisture is a key driver for phenological events of Kalahari Sand species dominated by sandy soils, as discussed by Childes [64]. Changes in soil moisture will affect the species phenology. The degree of synchronisation of phenological events is a function of environmental cues. The first rainfall is an important environmental cue for stimulating leaf flush. Shackleton [65] argues that the initiation of leaves is, besides rainfall, triggered by temperature. This hypothesis is consistent with the dominance of EVI features from the beginning of the growing season in the importance ranking (DOY 273, 289, 305, 353). The pronounced importance values of EVI features that coincide with the period immediately before the first rains can be explained by pre-rain flushing, an effect observed in Australian mesic tropical savannas [66] and for tree and shrub species in Southern African savannas [64]. Examples are shown for the vegetation types dominated by Terminalia and Acacia species in Figure 7.
The absence of phenological metrics extracted from the TIMESAT package in the top-rank positions indicates a minor relevance of those features for the classification. The reason may be the static concept of defining the threshold of start and end of season. For vegetation types related to pre-rain flush, a threshold of 25% of the seasonal amplitude may be too high to generate meaningful phenological features for the class discrimination. The inclusion of single EVI layers and temporal segment metrics in the feature set seems therefore more useful to classify vegetation types in the Kalahari than using phenological metrics. Further research has to be conducted to analyse the sensitivity of phenological metrics derived from satellite time series regarding environmental cues of semi-arid vegetation.
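The effect of a fixed amplitude threshold on the detected start of season can be illustrated with a small sketch. It is not TIMESAT code; it only assumes the general idea of a threshold method, in which the season starts at the first composite on the rising limb whose EVI reaches the pre-peak minimum plus a given fraction of the seasonal amplitude. The function name and the handling of the base level are illustrative assumptions.

```python
def start_of_season(evi, fraction=0.25):
    """Index of the first composite on the rising limb at or above
    base + fraction * amplitude.

    Assumptions (illustrative only): `evi` covers one growing season
    with a single peak; the base level is the minimum before the peak.
    """
    if not evi:
        raise ValueError("evi must not be empty")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    peak_idx = max(range(len(evi)), key=evi.__getitem__)
    rising = evi[:peak_idx + 1]
    base = min(rising)
    base_idx = rising.index(base)
    threshold = base + fraction * (evi[peak_idx] - base)
    # The peak itself always satisfies the threshold, so the search succeeds.
    for i in range(base_idx, peak_idx + 1):
        if evi[i] >= threshold:
            return i
    return peak_idx
```

Under these assumptions, a small EVI increase before the main green-up, as would be expected from pre-rain flushing, stays below a 25% threshold and is only picked up when a lower fraction is chosen, which is one way to read the limited relevance of the threshold-based metrics discussed above.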

Conclusions
This study highlights the importance of combining local scale botanical assessments with coarse scale remote sensing applications. It was demonstrated that even though different vegetation type associations showed similar phenological characteristics, they gave good classification results. Improved accuracies were achieved by the integration of inter-annual time series metrics of six years, which highlights the importance of studying longer-term inter-annual dynamics in dry savanna ecosystems. The feature ranking indicated a high potential of the spectral bands for the discrimination of vegetation type classes. Beyond EVI, an integration of the full spectral range from the visible to the middle infrared wavelength range is therefore recommended for mapping landscapes with open vegetation cover.
Results of the separability analyses highlighted the capabilities and limitations for mapping savanna vegetation types. Vegetation types characterized by differing soil properties reached high Bhattacharyya distance values and thus enhanced classification accuracies. Lower B-distance measures between closed shrubland and woodland classes, due to similar phenology and soil properties, indicated less accurate mapping results. In summary, all vegetation types were mapped with lower error rates in drier rainy seasons, which means that long-term phenological observations including the inter-seasonal precipitation variability are useful for an in-depth characterization of semi-arid vegetation.
The Random Forest technique used in this study proved to be robust in terms of classification error, overfitting and feature analysis functionality. The use of novel data-mining methods in combination with bottom-up approaches can therefore improve the understanding of remote sensing applications in biodiversity and biogeographic research. Yet, further research has to be conducted in similar ecosystems on the synergetic use of different remotely sensed time series data, such as the combination of MODIS products and inter-sensor products with botanical field data. Regarding the highly heterogeneous vegetation structure of savannas, new optical sensor systems, such as the five RapidEye satellites and the Advanced Wide Field Sensor (AWiFS) onboard the IRS satellites, seem promising for accurate mapping of open savanna vegetation structure.
Regarding the optimization of global land-cover products, processing inter-annual time series metrics beyond one or two years may increase the low class accuracies for semi-arid shrub- and grassland classes. The application of standardized land-cover classification systems is crucial for a bottom-up transfer of botanical information to a coarse remote sensing scale. To expand the vegetation type mapping to the whole of Namibia, the most challenging task will be to collect the necessary botanical field data.
More than 10,000 botanical field samples (relevés) are available for Namibia, and therefore a highly interdisciplinary project structure is needed to synergize these data on a coarse scale. The translation of data from the botanical survey into the FAO-UN Land Cover Classification System (LCCS) to generate flexible map products will increase the usefulness of land-cover information to a broader user community and will be a focus of future research.

Figure 2. Examples for the generation of training data from in-situ to MODIS 232 m pixel size. Botanical field plots were intersected with homogeneous segments retrieved from Landsat imagery [2a]. The training data on the 232 m MODIS pixel is visualized in [2b], displayed on the MODIS image of the 81st day of the year (DOY) 2004. Note the detailed description of the vegetation type legend in Table 1.

Figure 3. Flowchart of the extraction of intensity-related temporal segment metrics and phenological time series metrics derived from the TIMESAT software [52]. Note the resulting feature sets for the classification in the grey boxes.

Figure 4. EVI time series of the year 2003-2004 averaged for the vegetation type classes with (4a) moderately closed shrub- and bushland vegetation cover and (4b) semi-open shrub- and bushland vegetation after Edwards [41] and graminoid crops and pans, shown in the MODIS (MOD13Q1, 232 m) Enhanced Vegetation Index (EVI) smoothed with a Savitzky-Golay filter. See Table 1 for class labels.
The comparison of the long-term statistics of inter-annual MODIS segment features from 2001 to 2007 (upper right italic values, shown in Table 2) indicates that the inter-class B-distance values increased significantly, by a factor of approximately ten. The lowest average B-distance values were again reached by Acacia mellifera-Stipagrostis uniplumis shrublands, Acacia erioloba-Terminalia sericea bushlands, and Terminalia sericea-Combretum collinum shrub- and bushlands.

Figure 5. Vegetation type classification derived from inter-annual MODIS time series metrics (2001-2007) based on Random Forest classification. The vegetation type map is shown in 5a. 5b and 5c show the classification result for examples of the Omatako River region (Box B) and an Omarumba valley cut deep into Kalahari sands (Box C). 5b and 5c are compared to Landsat-TM images (RGB-4-3-2).

Figure 6. Relationship between the out-of-bag (OOB) error rates for seasonal vegetation type classifications and cumulative bi-annual rainfall (mean and standard deviation). Note the increasing OOB error rates with increasing precipitation.
An example is given for the open shrubland classes Hyphaene petersiana plains and Enneapogon desvauxii-Eriocephalus luederitzianus short shrublands on calcareous omirimba and pans, where EVI features of the beginning rainy season score increased importance values. In general, date-related features of the early start of the growing period have increased relevance for discriminating open savanna vegetation types. An example is given for the Acacia-dominated savanna classes, where the EVI 16-day composite of the day of the year 305 (EVI 2003 DOY-305) reaches higher rank positions. As shown in Figure 4, this feature marks the period immediately before the main growing season.

Figure 7. Variable importance visualizing the 15 top-ranked MODIS time series metrics for mapping Kalahari vegetation types based on the seasonal feature set 2003-2004, expressed as decrease of OOB error.

Table 2. Matrix showing average Bhattacharyya distance measures between Kalahari vegetation type classes calculated on MODIS intra-annual segment metrics for the season 2003-2004 (lower left) versus long-term inter-annual MODIS segment features (upper right, italic values) from 2001 to 2007. See Table 1 for detailed class labels. Note the bold values benchmarking examples for the lowest and highest B-distance values.

Table 3. User's, producer's and overall accuracies and Kappa coefficients for six annual and the inter-annual classifications.