Modeling the Underlying Drivers of Natural Vegetation Occurrence in West Africa with Binary Logistic Regression Method

Abstract: The occurrence of natural vegetation at a given time is determined by the interplay of multiple drivers. The effects of several drivers, e.g., geomorphology, topography, climate variability, accessibility, demographic indicators, and changes in human activities, on the occurrence of natural vegetation during the severe drought periods and prior to the year 2000 have been analyzed in West Africa. A binary logistic regression (BLR) model was developed to better understand whether the variability in these drivers over the past years was statistically significant in explaining the occurrence of natural vegetation in the year 2000. Our results showed that multiple drivers explained the occurrence of natural vegetation in West Africa at p < 0.05. The dominant drivers, however, were site-specific. Overall, human influence indicators were the dominant drivers in explaining the occurrence of natural vegetation in the selected hotspots. Human appropriation of net primary productivity (HANPP), which is an indicator of human socio-economic activities, explained the decreased likelihood of natural vegetation occurrence at all the study sites. However, the impacts of the remaining significant drivers on natural vegetation were either positive (increased the probability of occurrence) or negative (decreased the probability of occurrence), depending on the unique environmental and socio-economic conditions of the areas under consideration. By means of a statistical model, the study highlights the significant role human activities play in altering the normal functioning of the ecosystem. The research contributes to a better understanding of the relationships and the interactions between multiple drivers and the response of natural vegetation in West Africa. The results are likely to be useful for planning climate change adaptation and sustainable development programs in West Africa.


1. Underlying drivers of natural vegetation were identified by Binary Logistic Regression.
2. Multiple underlying drivers were significant at p < 0.05 with varying impacts.
3. Human activities indicators were the dominant underlying drivers.
4. The response of natural vegetation to climate was altered by intensification of human activities.

Introduction
Over the years, changes in the anthroposphere have drastically altered the normal functioning of the ecosystem, leading to widespread environmental deterioration, serious famine, and food security risks, with a subsequent impact on sustainable development and human well-being in many countries. According to Turner et al. [1] and Foley et al. [2], intensive anthropic (human) activities and the rapid changes in the Land Use Land Cover (LULC) system at the local, regional, and global scales may intensify changes in the land surface climate and exacerbate the vulnerability of the ecosystem to changes. Ecosystem changes, in some cases, are normal phenomena and a natural way for the system to regulate itself. The problem is that the pace and the magnitude at which the system is changing have been alarming in recent years due to over-exploitation by humans in light of population growth, abrupt changes in climate and intrinsic natural conditions (terrain slope, elevation, and soil type), improved accessibility to natural vegetation areas, and other multiple drivers [1,3]. According to Steffen et al. [4] and Ehlers and Krafft [5], in the past, human-induced LULC changes and, in particular, natural vegetation changes were restricted to a few areas and were insignificant drivers of the ecosystem dynamics. In contrast, the current human impact on the ecosystem is pervasive and can be observed at different levels of spatial aggregation. The 1970s and 1980s drought and environmental disturbances in Africa [6] are a typical example of how the synergetic effect of multiple factors may lead to extreme environmental degradation, deforestation, biodiversity loss, food security risks, global warming, and so forth [7][8][9][10]. Their consequences manifested through the LULC change of the continent, i.e., encroachment of natural vegetation by human-induced LULC types, e.g., cropland and settlements [11][12][13].
LULC change is, therefore, recognized as a major indicator of the impact of climate, anthropic activities, and other drivers on the ecosystem [14]. Lambin et al. [3] highlighted that LULC change is a complex process driven by a combination of proximate and underlying drivers [15]. Human activities such as farming, illegal logging of timber, firewood extraction, and settlement expansion have a direct impact on the LULC system and may be categorized as proximate drivers [15]. According to Lambin et al. [3], proximate drivers are often under the direct control of the local people. On the other hand, the underlying drivers may be beyond the control of the indigenous people. In many cases, they are intrinsic natural characteristics (i.e., topography, climate, soil type, or accessibility to and from natural areas), socio-economic factors (e.g., poverty and agricultural and development policies), and demographic factors (e.g., human population growth), with indirect impacts on the LULC system at the local level [3,14]. These multiple underlying causes have been considered by different theories to explain the underlying drivers of LULC change in the past. According to Chomitz and Gray [16], the drivers of LULC change may be explained by the Boserupian (1965) and Malthusian (1826) theories, which relate LULC change to population growth, the von Thunen (1966) perspective, which relates LULC change mainly to location-specific characteristics, such as cost of access to market, and the Ricardian (1817) theory, which relates LULC change to intrinsic land quality (e.g., soil quality, slope, and elevation). Turner et al. [1] pointed out that these drivers are location-specific and context-specific.
In the context of Africa, especially the Sahel region, the prevailing school of thought is that climate variability (e.g., variation in wetness index) is the major underlying driver of natural vegetation change [17][18][19][20]. On the contrary, Lambin et al. [14] were of the view that climate variability alone is not sufficient to explain natural vegetation change in Africa. Some other studies relate changes in natural vegetation to population growth, improved accessibility, poverty, the land tenure system, and so forth [21][22][23][24]. Boschetti et al. [25], Brandt et al. [26], Rishmawi and Prince [27], and Leroux et al. [28] highlighted that the interplay of multiple factors drives natural vegetation change in Africa. At the sub-continental and continental scales of Africa, previous analyses of driving factors focused on historical LULC change detection [11][12][13]. These studies [11][12][13] revealed LULC transitions at the expense of natural vegetation at some hotspots and led to an improved understanding of the nature, spatial pattern, magnitude, and rates of natural vegetation change in Africa, thereby laying a foundation to identify the proximate drivers of LULC change. However, they did not offer a better understanding of the statistically significant underlying drivers, which determine the presence or absence of natural vegetation at a given location and time.
The numerous underlying drivers of LULC change [21][22][23][24] perceived in Africa may be interpreted as possible linkages between these drivers and the response LULC types. Nevertheless, which of these perceived underlying drivers are statistically significant and most important in explaining the presence or absence of natural vegetation at a given moment is not well understood in Africa [29]. The relationship of natural vegetation with these perceived drivers, the conditions under which each interaction is important, and the way the occurrence of natural vegetation responds to changes in these drivers are some of the questions that require systematic research in Africa. Testing the statistical significance of the perceived drivers would lead to an objective assessment of their influences on natural vegetation occurrence. While the proximate drivers may be determined directly from LULC change analysis, the underlying drivers may require supplementing LULC change analysis with spatially explicit dynamic models [30].
LULC change modeling may help to improve the current understanding of the human and climate interaction with the ecosystem, especially the dynamics of natural vegetation and its response to the interplay of multiple drivers. This information is required for natural resource, environmental, and climate change management in Africa since natural vegetation plays a significant role in the global carbon cycle. Modeling the underlying drivers of natural vegetation occurrence is vital to improve the capacity of vulnerable people to adapt to the accelerating LULC change [3,14], as proper management of the LULC system may increase food and fibre production, efficient resource use, and income toward sustainable development as well as social wellbeing [1]. Nevertheless, spatially explicit dynamic modeling of multiple drivers of natural vegetation dynamics is an under-developed area of research in Africa [14]. In many cases, such analyses are based on subjective group discussion, in-depth interviews, and people's perceptions of the underlying drivers [31,32].
The aim of this research was to test the statistical significance of the underlying drivers of LULC transitions, i.e., natural vegetation loss between 1975 and 2000 previously detected in West Africa by Asenso Barnieh et al. [11]. Our specific objective was to develop a spatially explicit model to identify the underlying determinants of natural vegetation occurrence prior to the year 2000, i.e., after the severe drought of the 1970s and the 1980s, in some selected hotspots in West Africa. The influential underlying drivers out of a set of potential ones were identified and ranked. The relationships between the significant underlying drivers and the occurrence of natural vegetation were also explored. We focused on modeling the effect of the underlying drivers of natural vegetation observed in 2000. According to Rishmawi and Prince [27], analyses of the dynamics in natural vegetation prior to 2000 with remote sensing (RS) data have revealed widespread natural vegetation reductions and impoverishment in Africa. Yet, the underlying drivers are not well understood. Understanding the past land surface processes related to the occurrence, dynamics, and interactions of natural vegetation with the human, climate, topographic, and other components is fundamental for acquiring insights into the current processes and for projection into the future. Here, we hypothesized that variation in multiple drivers over past years, i.e., prior to 2000, explained the presence or absence of natural vegetation in West Africa in the year 2000.
Some previous research integrated data from different sources, such as LULC maps derived from RS data and other socio-economic geospatial datasets, to develop spatially explicit dynamic models to understand the relationship between LULC change and the underlying driving factors [33][34][35]. This study builds upon these sources of information by developing a binary logistic regression (BLR) model to relate the presence of natural vegetation in 2000 to multiple sets of perceived time-invariant and time-dependent predictors in six selected hotspots of natural vegetation change in West Africa [11].
The BLR has the ability to rank the proximate causes and underlying drivers, which determine the occurrence of a given LULC type at a given moment. It has been used in the past to model the nonlinear relationship between a response variable and a set of multiple predictor variables [36,37]. In recent years, the model has been applied in diverse study domains, e.g., public health care and landslide susceptibility analysis, to predict the probability of occurrence of an event and to identify significant drivers [36,38-52]. In many cases, the response variable is binary, with two categorical levels (presence or absence, live or dead, yes or no, 0 or 1), while one or more categorical or continuous predictors can be applied [53]. Inter-annual and intra-annual variability in the LULC system can be captured by developing a BLR model using multiple, time-dependent predictors.
In this present study, the BLR model was applied to link the presence or absence of natural vegetation with a set of potential predictors, which we categorized as (a) natural/climate drivers and (b) anthropic drivers. The natural/climate drivers were slope, elevation, soil type, and wetness index. Slope and elevation influence soil erosion in Africa, and gentle slopes and low elevations are favorable for the replacement of natural vegetation by agricultural activities [54]. Soil provides the nutrients needed for plant growth, so the likelihood of natural vegetation occurrence increases with a favorable soil type; concurrently, soil type determines the suitability of the land area for agricultural expansion [55]. The wetness index was used to indicate variations in precipitation, atmospheric water demand, climate forcing, and moisture availability to plant roots [56]. The anthropic drivers were population density and human appropriation of net primary productivity (HANPP). HANPP is a measure of the impacts of human activities, e.g., waste disposal, urbanization, construction, and fuel-wood collection, on natural vegetation; these impacts have been translated into grams of carbon [57]. Livestock density was also used to represent the impact of human activities in the form of livestock rearing and grazing on natural vegetation [58], while travel time in hours (hr) represented the impact of human accessibility to and from natural areas on natural vegetation [59].

Study Area: Selected Hotspots in West Africa
The study was undertaken in West Africa. The area is characterized by five broad bio-climatic zones, i.e., Saharan, Sahelian, Sudanian, Guinean, and Guineo-Congolian [60]. These five zones were further grouped into two major bio-climatic zones on the basis of the wetness index, defined by Trabucco and Zomer [56] as the ratio of mean annual precipitation (MAP) to mean annual potential evapo-transpiration (MAE). For the purpose of this study, areas with MAP/MAE < 0.4 were categorized as the Sahel (arid to semi-arid) region and areas with MAP/MAE ≥ 0.4 as the Sudanian, Guinean, and Guineo-Congolian (humid) region. The Saharan eco-region in West Africa was not included in this analysis due to the desert conditions [61]. For a description of the different LULC types and long-term dynamics in West Africa, we refer the reader to Asenso Barnieh et al. [11]. Six sites with massive natural vegetation loss between 1975 and 2000 were selected for this analysis (see Figure 1). Three sites, i.e., Site 1: Diourbel-Louga (Senegal), Site 2: Hodh el Gharbi (Mauritania), and Site 3: Zinder-Maradi (Niger), were located in the arid region, and the remaining three sites, i.e., Site 4: Centre and Centre Sud (Burkina Faso), Site 5: Ashanti Region (Ghana), and Site 6: Niger State (Nigeria), were located in the humid region of the study area.

Site 1: Diourbel-Louga (Senegal)
Diourbel-Louga is located in the Peanut Basin of Senegal within the Sahel (arid) region. The extent of the area is about 29,537.5 km². The climate here is characterized by two major seasons, i.e., nine months of dry season (October-June) and three months of rainy season (July-September) [62]. In the west, cropland fields have been abandoned and replaced by grasslands and other vegetation since the 1980s. The land devoted to rain-fed crops remained fairly stable between 1975 and 2000 and agricultural expansion continued in the east. This overshadows the actual magnitude of rapid agricultural land increases in this region [11,61,63]. The major soil types in the area are Cambisols, Gleysols, Arenosols, Regosols, and Vertisols. Arenosols are the dominant soil type.

Site 2: Hodh el Gharbi (Mauritania)
Hodh el Gharbi is situated in the southern part of Mauritania and covers a land area of about 50,273 km². The area falls within the Sahel region with very high inter-annual rainfall variability and is characterized by sandy dunes supporting large rangelands and comparatively fertile ferruginous soils with agricultural potential. The major soil types in the area are Lithosols, Luvisols, Arenosols, and Regosols. Here, steppe and Sahelian short grass savannah are the dominant vegetation types. Previous analyses of LULC transitions in this region revealed replacement of natural vegetation by other LULC types, such as bare land and sand dunes [11,61].

Site 3: Zinder-Maradi (Niger)
Zinder-Maradi is situated in the lowest part of the Niger Plateau in south-central Niger with Sahelian bioclimatic conditions characterized by a large inter-seasonal and intra-seasonal variability of rainfall, with annual rainfall between 200 and 600 mm a⁻¹, accompanied by high temperatures in the dry season. The extent of the area is approximately 185,067 km². The dominant soil types in the area are Arenosols, Gleysols, Lithosols, Luvisols, Fluvisols, Vertisols, Regosols, and others [55]. The mean bio-productivity in this region increases toward the southern boundary of the country, with a relatively high population density of between 80 and 150 inhabitants km⁻² coupled with "wall-to-wall" farmland, where agricultural fields occupy nearly the entire landscape [61]. Approximately 80% of all land is cultivated, with few uncultivated natural vegetation patches, making the region the largest agricultural area in Niger. Here, farmers preserve trees by natural regeneration, a practice which has improved on-farm trees and crop production in the area. Farmers experienced major tree declines in the 1970s and 1980s as a result of drought, expanding cropland, and human pressure.

Site 4: Centre and Centre Sud (Burkina Faso)
Centre and Centre Sud are located in southern Burkina Faso across a wide bioclimatic gradient, with annual rainfall between 650 mm a⁻¹ and 1000 mm a⁻¹ in the more humid Sudanian Region. The extent of the area is about 14,381 km². Forest, gallery forest, savanna, steppe, and rocky land are the major LULC types in the country. Previous LULC change analysis revealed the replacement of natural vegetation by cropland and settlements along the Ouagadougou-Pama corridor in the southeast [11,61]. The major soil types in the area are Luvisols, Lithosols, Cambisols, Vertisols, Regosols, and Planosols.

Site 5: Ghana (Ashanti Region)
The Ashanti Region is located in the south-central part of Ghana, where previous LULC transition analyses revealed massive natural vegetation loss due to encroachment by cropland and settlements. In the last four decades, the population density in Kumasi, which is Ghana's second-largest city, has risen sharply due to substantial rural-urban migration led by the development of factories and business activities both in and around the city, thereby exacerbating negative impacts on pristine forest reserves (e.g., Bobiri forest reserve and Banda Hills). The extent of the area is about 24,904.1 km². The region is characterized by two rainy seasons, i.e., the major rainy season from April to August and the minor rainy season from September to November, with a mean annual rainfall of 1270 mm a⁻¹ [66]. The major soil types in the area are Acrisols, Lithosols, Nitosols, and Luvisols.

Site 6: Nigeria (Niger State)
Niger State is located in the Southern Guinea Savannah zone of Nigeria and covers a land area of about 71,017.8 km². Livestock rearing is prominent in this state. The state has a rainy season from April to October and a dry season from November to March. Mean relative humidity is 59% and temperature varies between 22 °C and 39 °C. The mean annual rainfall is between 1200 mm a⁻¹ and 1300 mm a⁻¹, with the highest temperature in March and the lowest in August. The major soil types in the area are Nitisols, Luvisols, Lithosols, and Fluvisols. Previous LULC transition analyses in this region revealed the expansion of settlements, rain-fed and irrigated cropland, plantations, and open mine fields at the expense of natural vegetation [11,61].

Datasets
The LULC map for 2000 was acquired from the United States Geological Survey (USGS), West African Land Use Dynamic project. The data extend over seventeen countries in West Africa at 2 km spatial resolution. Only a small part of Cameroon was mapped. Additionally, according to CILSS [61], the northern parts of Mauritania, Mali, Niger, and Chad were not mapped due to the desert conditions, stable vegetation, and other LULC types (e.g., sand and rocks) through time. As a result, the unmapped northern parts of the previously mentioned countries, Cape Verde, and Cameroon were excluded from the analyses in this paper. The white regions on the map in Figure 1 are the areas without data, which were therefore not mapped. Comprehensive definitions of the LULC types in the map can be found in a book published by CILSS [61]. The data was accessible at https://eros.usgs.gov/westafrica (accessed on 10 November 2018). The original LULC data with 24 classes had been re-classified into 7 LULC classes in a previous research study [11]. The reclassified map was further categorized into either presence or absence of natural vegetation and used as the dependent variable in the binary logistic regression (BLR) model we established in this study. To explain the spatial patterns and long-term response of natural vegetation to changes induced by the interplay of complex human-environmental interactions and other multiple drivers, perceived drivers of LULC change in the study area were reviewed from the literature, and eight independent geospatial datasets of these perceived drivers were acquired from open source earth observation databases and categorized as: (a) natural/climate drivers, i.e., slope, elevation, soil type, and wetness index and (b) anthropic drivers, i.e., human appropriation of net primary productivity (HANPP), population density, livestock density, and accessibility drivers, i.e., travel time from natural vegetation areas to urban resources.
Potential drivers without geospatial dimensions were excluded from the analyses because one of the major objectives of this study was to develop a spatially explicit dynamic model on independent predictive variables. The categorization was based on a conceptual framework developed from a combination of the Boserupian (1965), Malthusian (1826), Ricardian (1817), and von Thunen's (1966) LULC change theories described by Chomitz and Gray [16] and Mazzucato et al. [22].
The soil map has information about the major soil groups of the world [55]. The spatial resolution of this dataset was about 10 km. Each grid cell contains a unique value that depicts the features of the major group it represents. The dataset is available at: http://www.fao.org:80/geonetwork/srv/en/resources.get?id=14129&fname=Map4_5.zip&access=private (accessed on 10 May 2018).
The global slope dataset was obtained from the United States Geological Survey (USGS). The gridded slope layer was mapped at about 1 km spatial resolution. The data and the metadata, which explain how the dataset was produced, can be accessed from https://www.usgs.gov/natural-hazards/earthquake-hazards/science/vs30-models-and-data (accessed on 10 May 2018). The elevation data was extracted from the Digital Elevation Model (DEM) data acquired by the NASA Shuttle Radar Topography Mission (SRTM). The dataset is available for download at 30 m resolution at https://developers.google.com/earth-engine/datasets/catalog/USGS_SRTMGL1_003 (accessed on 10 May 2018). Figure 2 shows the LULC map and the geospatial datasets of the eight potential predictors we used for the analysis. The wetness index gridded dataset was mapped at 1 km spatial resolution. This index is given as a function of precipitation, temperature, and potential evapo-transpiration. The global mean wetness index has been binned into hyper arid, arid, semi-arid, dry sub-humid, and humid classes (i.e., <0.03, 0.03-0.2, 0.2-0.5, 0.5-0.65, and >0.65, respectively) [56]. Details of the method used to estimate the mean wetness index from 1950 to 2000 can be accessed from the CGIAR-CSI GeoPortal at: https://www.dropbox.com/sh/e5is592zafvovwf/AAAijCvHNiE4mYvYqWDpeJ3Ga/Global%20PET%20and%20Aridity%20Index?dl=0 (accessed on 10 May 2018).
A gridded raster map of accessibility (i.e., mean travel time (hours) from natural areas to the nearest town, market, or other urban resources and human settlements of more than 20,000 inhabitants) was obtained from the FAO-Geo-network database. This dataset was generated by Harvest Choice (2015) and is available for download at http://harvestchoice.org/data/TT_20K (accessed on 10 May 2018).
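The wetness index classes quoted above [56] can be written as a small lookup. The following is a minimal sketch only: the function name `aridity_class` is hypothetical, and the treatment of values that fall exactly on a class boundary is an assumption, since the text does not state it.

```python
def aridity_class(wetness_index):
    """Map a mean wetness index to the class names quoted in the text.

    Assumption: each class includes its lower bound and excludes its upper
    bound; the source does not say how boundary values are handled.
    """
    if wetness_index < 0:
        raise ValueError("wetness index cannot be negative")
    if wetness_index < 0.03:
        return "hyper arid"
    if wetness_index < 0.2:
        return "arid"
    if wetness_index < 0.5:
        return "semi-arid"
    if wetness_index < 0.65:
        return "dry sub-humid"
    return "humid"


# Illustrative values only, not taken from the study sites.
for value in (0.01, 0.1, 0.35, 0.6, 0.9):
    print(value, aridity_class(value))
```

Note that these five global classes differ from the two-zone split (MAP/MAE below or above 0.4) that the study uses to separate arid from humid sites.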
The demographic driving factor map (a time series of gridded population density maps of West Africa at five-year intervals from 1995 to 2000) was obtained from the Center for International Earth Science Information Network (CIESIN) [67]. This dataset is a spatially explicit global population census dataset mapped at 1 km, which depicts the distribution of the human population in each grid cell. The dataset can be accessed at https://sedac.ciesin.columbia.edu/data/set/gpw-v4-population-density-adjusted-to-2015-unwpp-country-totals-rev11 (accessed on 10 May 2018).
The global patterns in human appropriation of net primary productivity (HANPP) have been mapped at 28 km spatial resolution in grams of carbon per grid cell. The datasets were downloaded from: http://sedac.ciesin.columbia.edu/es/hanpp.html (accessed on 25 January 2019) [57]. The gridded livestock density datasets were available at: http://harvestchoice.org/data/ad05_tlu (accessed on 24 February 2019). The datasets included livestock grazing and browsing values mapped at a 5 km spatial resolution as density of livestock per cell measured as tropical livestock units (TLU/sq. km). In addition, Google Earth Imagery and other ancillary data were also obtained for the validation of the results of the analyses. Table 1 is a summary of the datasets we used for the analysis.

Binary Logistic Regression (BLR) Model
Nelder and Wedderburn [36] defined the BLR model as "empirically parameterized static model that compute probabilities, which indicate the likelihood of the occurrence of a specific event at a specific location in time." The BLR approach can model the inherent complex human-environmental interactions to inform where LULC change occurs over a longer period of time, what the underlying drivers of change are, and how these drivers interact with the LULC system. The probability that a given LULC type (Y) is equal to 1, i.e., present, for a given value of X or a change in a time-dependent variable can be determined by a BLR model. Where the BLR model is generated from geospatial datasets, the technique yields coefficients that can be used to generate maps of the probability of occurrence of events, which is, in our case, the presence of natural vegetation [30,68]. Therefore, based on the BLR model and the potential predictors we used in this study, the probability P of observing the presence (1) or absence (0) of natural vegetation in a given pixel in the year 2000 can be expressed as:

$$P(Y = 1 \mid X) = \frac{\exp\left(\beta_0 + \beta_1\,\text{Mean Wetness index} + \beta_2\,\text{Mean HANPP} + \dots + \beta_n x_n + e\right)}{1 + \exp\left(\beta_0 + \beta_1\,\text{Mean Wetness index} + \beta_2\,\text{Mean HANPP} + \dots + \beta_n x_n + e\right)} \quad (1)$$

where $\beta_0$ stands for the intercept, $\beta_1, \beta_2, \dots, \beta_n$ are the slope parameters, $x_1, x_2, \dots, x_n$ are the vectors of the independent predictors per grid cell, and $e$ is the residual [44,52,69]. In this study, the predictors were wetness index, HANPP, travel time, livestock density, population density, soil type, elevation, and slope. The detailed methodology is explained in Section 3.2 through Section 3.5. The output of the BLR model can be interpreted to rank the importance of the drivers. The ranking of the predictors shows which driver has the greatest effect on the dependent variable. The BLR also yields an estimate of the significance of the drivers.
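As an illustration of how Equation (1) turns a set of coefficients into a per-pixel probability, the sketch below evaluates the logistic function for one pixel. It is a minimal sketch under stated assumptions: the function name `presence_probability` and all coefficient and predictor values are hypothetical, chosen only for illustration, and are not results from this study; the residual term is omitted.

```python
import math


def presence_probability(intercept, coefficients, predictors):
    """Evaluate the logistic form of Equation (1) for one pixel.

    The residual term e is left out; coefficients and predictors must be
    given in the same order.
    """
    if len(coefficients) != len(predictors):
        raise ValueError("need one coefficient per predictor")
    eta = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    # exp(eta) / (1 + exp(eta)), written so that large |eta| cannot overflow.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


# Hypothetical coefficients for (mean wetness index, mean HANPP).
beta_0 = 0.5
betas = [0.8, -1.2]

low_hanpp_pixel = [0.4, 1.0]
high_hanpp_pixel = [0.4, 2.0]
print(presence_probability(beta_0, betas, low_hanpp_pixel))
print(presence_probability(beta_0, betas, high_hanpp_pixel))
```

With a negative HANPP coefficient, the second pixel receives the lower probability of natural vegetation presence, which is the direction of effect the abstract reports for HANPP.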
In regression analyses, the statistical significance level is the probability that the observed relationship between the independent and the dependent variable occurs by chance. The statistical significance level is inversely related to the confidence level, i.e., the smaller the significance level, the higher the confidence in the relationship between the variables under consideration [34]. In this study, the p-value was applied as a metric of the significance of the multiple set of predictors, while the absolute value of the standardized BLR model coefficients and the odds ratio were used to determine and rank the importance of the drivers. The odds of an event are given by P/(1 − P), where P is the probability that the event will occur. The odds ratio is defined as the ratio of two odds and accounts for the effect of an independent variable on the dependent variable. The natural log of the odds ratio gives the logit coefficient β in Equation (1); equivalently, the odds ratio is Exp(β). Therefore, the two quantities are a measure of the same effect (i.e., the strength of the relationship between the dependent variable and the independent predictors) [44,51]. Lower odds of an event will give β < 0 and Exp(β) < 1, while higher odds will give β > 0 and Exp(β) > 1. The major reason for choosing the BLR model for our analysis is its ability to accommodate a wide range of data types, i.e., categorical (binary, ordinal) and continuous data, which are typical examples of data types employed in LULC modeling. In addition, if the objective is to predict the probability of an event, the BLR model can be optimized to achieve better prediction accuracy, unlike the other "black box" models [44].
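The link between odds, odds ratio, and the logit coefficient can be checked numerically. The sketch below is illustrative only: the helper names and the coefficient values are hypothetical, and it uses the standard textbook definition of the odds ratio as the ratio of the odds after and before a one-unit increase in a predictor, which may differ in notation from the authors' own formula.

```python
import math


def odds(p):
    """Odds P / (1 - P) of an event with probability P."""
    if not 0.0 <= p < 1.0:
        raise ValueError("probability must be in [0, 1)")
    return p / (1.0 - p)


def odds_ratio(p_after, p_before):
    """Ratio of two odds; p_before must be greater than zero."""
    return odds(p_after) / odds(p_before)


def logistic(eta):
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical one-predictor model: intercept 0.3, slope -0.7.
beta_0, beta_1 = 0.3, -0.7
x = 1.0
p_before = logistic(beta_0 + beta_1 * x)
p_after = logistic(beta_0 + beta_1 * (x + 1.0))

# A one-unit increase in x multiplies the odds by exp(beta_1).
print(odds_ratio(p_after, p_before), math.exp(beta_1))
```

Because the slope is negative here, both printed values are below 1, matching the statement that β < 0 corresponds to Exp(β) < 1.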
To develop the BLR model, five steps were followed (i.e., literature review on potential drivers (see Introduction), geospatial data acquisition and preparation, multi-collinearity analysis, BLR model development, and validation).

Data Preparation
One of the ways to model the relationship in space and time between the presence of a given LULC type and multiple predictors at the continental scale is to determine historical changes in LULC for the whole continent in order to identify hotspots of severe changes. Spatially explicit dynamic models may then be developed to relate the observed LULC type in each hotspot to multiple predictors, e.g., factors that determine the presence of natural vegetation. In this way, the dominant underlying drivers at the continental scale can be captured. In the case of this study, long-term LULC transition mapping and hotspot identification in the study area, i.e., West Africa, had been undertaken by Asenso Barnieh et al. [11]. Encroachment of natural vegetation (forestland and other natural vegetation) by other LULC types, i.e., cropland, wetland, water bodies, settlements, and other LULC, was the major transition detected between 1975 and 2000. Therefore, this study focused on understanding the underlying drivers which determined the presence of natural vegetation at some of the selected hotspots in West Africa. Due to geospatial data gaps on the potential drivers after 2000, we restricted our analysis to the years prior to 2000. However, the approach we used can be applied to model the occurrence of natural vegetation in any given year when data availability is not a limiting factor.
On the basis of the previously mentioned BLR modeling approach [37], the observed LULC map of a given year (2000) with 7 LULC classes (cropland, forestland, other vegetation, wetland, water, settlement, and other LULC types) developed by Asenso Barnieh et al. [11] was further aggregated into just two categories: presence of natural vegetation (forestland and other natural vegetation) and absence of natural vegetation (cropland, wetland, water bodies, settlements, and other LULC types). These major LULC categories were converted into a binary dataset and coded as 1 (presence of natural vegetation) and 0 (absence of natural vegetation). The binary LULC map and all the other geospatial gridded raster maps of the potential drivers were projected and resampled to the same spatial resolution (2 km) as the LULC map in 2000. This analysis was conducted in ArcGIS version 10.4 and the results were exported into the R statistical computing environment for further analysis. All the missing values in the entire datasets were removed [44]. The datasets were divided into training (80%) and testing (20%) subsets for the development of the model and validation, respectively. In order to examine the range of values of the potential drivers, the individual datasets were plotted against the natural vegetation dataset.
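The recoding, missing-value removal, and 80/20 split described above can be sketched as follows. This is a minimal illustration, not the authors' ArcGIS and R workflow: the function names, the pixel records, and the use of a seeded random shuffle for the split are all assumptions, since the text does not state how pixels were assigned to the training and testing subsets.

```python
import random

# Class names follow the seven-class legend described in the text;
# the two natural vegetation classes are coded 1, everything else 0.
NATURAL_VEGETATION = {"forestland", "other vegetation"}


def to_binary(lulc_class):
    return 1 if lulc_class in NATURAL_VEGETATION else 0


def drop_missing(records):
    """Keep only pixels for which every field has a value."""
    return [r for r in records if all(v is not None for v in r.values())]


def split_train_test(records, train_fraction=0.8, seed=0):
    """Shuffle reproducibly (assumed), then cut into training and testing."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical pixels: LULC class plus one predictor (travel time in hours).
classes = ["forestland", "cropland", "other vegetation", "settlement",
           "wetland", "water", "other LULC types", "forestland",
           "cropland", "other vegetation", "cropland"]
travel_times = [6.0, 1.5, 4.0, 0.5, 3.0, 2.0, 5.0, 7.5, 1.0, 3.5, None]
pixels = [{"lulc": c, "travel_time": t} for c, t in zip(classes, travel_times)]

clean = drop_missing(pixels)
for record in clean:
    record["present"] = to_binary(record["lulc"])

train, test = split_train_test(clean)
print(len(train), len(test))
```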

Multi-Collinearity Analysis
Collinearity occurs in multiple regression modeling when predictors are strongly correlated, so that two or more of them bring the same information into the model. Strongly correlated predictors may inflate the standard errors of the regression coefficients in multivariate regression and reduce the power of the significance test [44]. To identify predictor variables that may exhibit multi-collinearity and to remove redundant variables from the model, we analyzed the correlation of the predictor variables after resampling the data to the same spatial resolution (2 km, i.e., the spatial resolution of the observed LULC map). Apart from the correlation analysis, other multi-collinearity diagnostics (variance inflation factor (VIF), kappa, and the condition number of the eigenvalues) were also employed. The correlation matrix was used to compute the kappa and the condition number of the eigenvalues, while the BLR model fitted with all the potential drivers was used to compute the VIF of these drivers. A VIF greater than 10 for any driver indicates severe multi-collinearity in the model. In addition, kappa and condition numbers of the eigenvalues greater than 100 indicate severe multi-collinearity [40,51,70,71].
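To illustrate the VIF diagnostic, the sketch below computes the classical linear-model VIF of each predictor as the corresponding diagonal element of the inverse of the predictor correlation matrix. This is an assumption made for illustration: the study derived its VIF values from the fitted BLR model in R, which may differ numerically. The helper names and the toy predictor columns are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    return [[pearson(a, b) for b in columns] for a in columns]

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]

# Toy predictors (invented): x2 nearly duplicates x1, x3 is only weakly related.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]
x3 = [2.0, 7.0, 1.0, 8.0, 3.0, 5.0]
values = vif([x1, x2, x3])
flagged = [v > 10 for v in values]  # threshold of 10 as stated in the text
print(values, flagged)
```

For two predictors with correlation r, this reduces to VIF = 1/(1 − r²), so a pairwise correlation of 0.8 alone gives a VIF of about 2.8, well below the threshold of 10.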

Model Development
The probability of the presence of natural vegetation, P(Y), for a given set of values of the independent predictors X was modeled by fitting a BLR. The previously mentioned potential drivers, i.e., mean wetness index, mean population density, soil type, elevation, travel time from natural areas to urban resources, livestock density, and HANPP (Section 3.2), were used to develop the BLR model. The BLR model takes the natural logarithm of the odds (the ratio of the probability of occurrence to the probability of non-occurrence of an event) to transform a non-linear model into a linear model [72]. The development of the BLR model was based on Equation (1).
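For orientation, the textbook form of a binary logistic regression that matches the description above is given below. It is a sketch of the standard logit model, not a reproduction of the study's Equation (1), whose notation may differ; the symbols β0 (intercept), βi (coefficient of predictor Xi), and k (number of predictors) are assumed here.

```latex
\ln\!\left(\frac{P(Y=1\mid X)}{1-P(Y=1\mid X)}\right)
  = \beta_0 + \sum_{i=1}^{k} \beta_i X_i ,
\qquad
P(Y=1\mid X)
  = \frac{1}{1+\exp\!\left[-\left(\beta_0 + \sum_{i=1}^{k}\beta_i X_i\right)\right]}
```

The left-hand expression is linear in the predictors, and the right-hand expression converts the linear predictor back to a probability between 0 and 1.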
The output of this model ranks the contribution of each variable in explaining the presence of natural vegetation. Prior to the binary logistic modeling with multiple predictors, a univariate logistic regression model was fitted between the binary response LULC data and each predictor individually. The rationale was to exclude insignificant drivers before developing the multivariate BLR model. The multivariate model was then run stepwise in both forward and backward directions to eliminate less significant variables, including redundant variables that may exhibit multi-collinearity.
The most important drivers of the occurrence of natural vegetation were determined based on the Wald statistic and the odds ratio from the BLR model. The variable with the highest standardized coefficient was taken as the most important in explaining the occurrence of natural vegetation in 2000. To determine the dominant category of drivers, i.e., natural or anthropic, in each study site, the BLR model was refitted twice: first using natural predictors only, i.e., slope, elevation, soil type, and wetness index, and second using human influence predictors only, i.e., a combination of anthropic indicators (HANPP, livestock density) and an accessibility indicator (travel time). The rationale was to ascertain which of the two combinations would provide the better BLR model fit.
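As an illustration of how a Wald statistic and an odds ratio follow from a fitted coefficient, the sketch below applies the usual definitions (z = coefficient / standard error, odds ratio = exp(coefficient), two-sided p-value from the standard normal distribution). The function name and the numerical inputs are hypothetical and are not estimates from the study.

```python
import math

def wald_summary(coef, std_err):
    """Wald z, Wald chi-square, two-sided p-value, and odds ratio of one coefficient."""
    z = coef / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return {
        "odds_ratio": math.exp(coef),  # multiplicative change in odds per unit increase
        "wald_z": z,
        "wald_chi2": z * z,
        "p_value": p_value,
    }

# Hypothetical coefficient and standard error for a standardized predictor.
summary = wald_summary(coef=-0.8, std_err=0.2)
print(summary)
```

An odds ratio below 1, as in this invented example, corresponds to a negative coefficient, i.e., a predictor whose increase lowers the odds of natural vegetation being present.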

Model Validation
Data-splitting was adopted to validate the BLR model. Prior to the BLR modeling, the observations in each hotspot were randomly divided into two sets, i.e., training samples of 80% (model development) and testing samples of 20% (model evaluation). The Akaike information criterion (AIC), Bayesian information criterion (BIC), and log-likelihood ratio were used as metrics to compare two or more candidate models. Stepwise forward selection of the most significant predictors and backward elimination of the least significant predictors were applied in this evaluation. If the two selection procedures give the same results, then multi-collinearity in the model can be neglected [34,51,73–78].
The stepwise selection technique is an iterative process. In forward selection, the model is initially fitted with the most significant predictor, and additional predictors are added in order of significance. The iteration stops when no improvement can be achieved by including additional predictors. By contrast, backward selection is initialized by fitting the model with all the possible predictors and eliminates the least significant one at each step until the decrease in model accuracy at each step is less than a pre-set threshold. At each stage of both forward and backward selection, the AIC, p-values, and the residual deviance, with the number of degrees of freedom being the number of independent predictors remaining at each stage, were used to set a termination rule. Lower AIC, p-values, log-likelihood ratios, and residual deviance were used to determine the best-fitted model. A p-value < 0.05 was adopted as the termination rule for including significant variables [34,51,73–78].
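The forward selection loop described above can be sketched as follows. This is a simplified illustration under stated assumptions: it uses the AIC alone as the stopping rule (the study also considered p-values and residual deviance), fits the logistic model by Newton-Raphson in plain Python rather than with R, and runs on invented toy data. All function and variable names are hypothetical.

```python
import math

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clip so exp() and log() stay finite
    return 1.0 / (1.0 + math.exp(-z))

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit_logistic(rows, y, iterations=25):
    """Newton-Raphson fit; each row starts with 1.0 for the intercept.
    Returns (coefficients, log-likelihood)."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for x, yi in zip(rows, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, x)))
            for i in range(k):
                grad[i] += (yi - p) * x[i]
                for j in range(k):
                    hess[i][j] += p * (1.0 - p) * x[i] * x[j]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    loglik = 0.0
    for x, yi in zip(rows, y):
        p = sigmoid(sum(b * v for b, v in zip(beta, x)))
        loglik += math.log(p) if yi == 1 else math.log(1.0 - p)
    return beta, loglik

def design(data, names, n):
    """Build rows of [1.0, predictor values...] for the chosen predictor names."""
    return [[1.0] + [data[name][i] for name in names] for i in range(n)]

def aic(loglik, n_params):
    return 2 * n_params - 2 * loglik

def forward_select(data, y):
    """Add, one at a time, the predictor that lowers the AIC most; stop when none does."""
    n = len(y)
    selected = []
    best = aic(fit_logistic(design(data, [], n), y)[1], 1)
    while True:
        trials = [(aic(fit_logistic(design(data, selected + [name], n), y)[1],
                       len(selected) + 2), name)
                  for name in data if name not in selected]
        if not trials or min(trials)[0] >= best:
            return selected, best
        best, chosen = min(trials)
        selected.append(chosen)

# Toy data (invented): presence is less likely at high "hanpp"; "elevation" is unrelated.
data = {
    "hanpp":     [3, 3, 8, 8, 6, 6, 1, 2, 4, 5, 9, 10, 11, 12],
    "elevation": [2, 2, 6, 6, 1, 1, 3, 5, 4, 2, 3, 5, 1, 4],
}
y = [1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0]

null_aic = aic(fit_logistic(design(data, [], len(y)), y)[1], 1)
selected, best_aic = forward_select(data, y)
hanpp_beta, _ = fit_logistic(design(data, ["hanpp"], len(y)), y)
print(selected, round(best_aic, 2), round(null_aic, 2))
```

Backward elimination follows the same pattern in reverse: start from the full set of predictors and repeatedly drop the one whose removal lowers the AIC most.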
The ability of the BLR model to predict the presence of natural vegetation was evaluated with the testing samples. Using Equation (1) and the BLR model coefficients estimated with the training samples, we predicted the presence of natural vegetation in each hotspot of the study area. The area under the receiver operating characteristic (ROC) curve (AUC) was used to evaluate the predictive accuracy of the BLR model [47,70,71,76,79–81]. The ROC curve plots the true positive rate of the predicted values against the false positive rate at various threshold settings [68]. The AUC ranges from 0 to 1. An AUC less than or equal to 0.5 indicates that the model prediction is no better than random, while an AUC > 0.5 indicates that the model discriminates presence from absence better than chance. An AUC ≈ 1 indicates near-perfect discrimination [47,70,71,76,79–81]. The map of predicted probabilities for the presence of natural vegetation in each hotspot site was compared with the observed map as an additional means of validation.
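The AUC can be computed directly from predicted probabilities and observed labels without drawing the ROC curve, because it equals the probability that a randomly chosen presence cell receives a higher predicted probability than a randomly chosen absence cell (ties counted as one half). The sketch below uses this rank-based definition; the function name and the toy values are assumptions for illustration.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of presence and absence scores."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy test-set labels and predicted probabilities (invented).
observed = [1, 0, 1, 0]
predicted = [0.9, 0.8, 0.2, 0.1]
print(auc(observed, predicted))
```

In this toy case three of the four presence/absence pairs are ranked correctly, so the AUC is 0.75.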

Multi-Collinearity
The descriptive statistics (Table 2) illustrate the patterns in the variables we used for the development of the BLR model. The mean HANPP (2.14 × 10^11, 1.728 × 10^11, 8.94 × 10^10, and 26.25 × 10^10 grams of carbon) was relatively high in the Ashanti Region of Ghana, Centre-Centre Sud of Burkina Faso, Diourbel-Louga in Senegal, and Niger State in Nigeria, coupled with a denser population, i.e., mean population densities of 153.47, 119.08, 57.73, and 47.35 inhabitants km^−2, respectively. Except in Diourbel-Louga in Senegal, the mean wetness indexes in the previously mentioned areas, i.e., 0.84, 0.42, 0.20, and 0.61, were comparatively high, thus placing these regions in the humid agro-ecological zone as defined in Section 2.2. Overall, livestock density did not appear to correlate with population density across the sub-continent, since the mean population density in some study sites, e.g., the Ashanti Region of Ghana, was comparatively high but coupled with low mean livestock density (Table 2). The mean travel time to grid cells endowed with urban resources was lower in the densely populated sites, i.e., 3.52, 2.75, 2.60, and 3.74 h in the Ashanti Region of Ghana, Centre-Centre Sud in Burkina Faso, Diourbel in Senegal, and Niger State in Nigeria, respectively, than in the sparsely populated sites, i.e., Zinder-Maradi in Niger and Hodh el Gharbi in Mauritania, where it was 6.70 and 7.86 h, respectively. Here, SD, SE, TLU, and HANPP represent standard deviation, standard error, tropical livestock unit, and human appropriation of net primary productivity, respectively. Figure 3 shows the pairwise correlation between the continuous predictor variables at the various study sites. Table 3 is the BLR model output, which shows the significance levels of the predictor variables, and Table 4 shows the performance parameters of the model. In many of the sites analyzed, some drivers exhibited multi-collinearity (Figure 3 and Table 4).
The correlation matrices (Figure 3) and the VIF of the full model (Table 4) indicated multi-collinearity in the model at the Zinder-Maradi, Niger (HANPP and wetness index, correlation = 0.8) and Diourbel-Louga, Senegal (HANPP and population density, correlation = 0.8) sites. The strength of the correlation did not exceed the threshold for severe collinearity (correlation value > 0.8) we set in this study. In both cases, the variables were positively correlated. The VIF values suggested multi-collinearity in the full model, particularly in the study sites in the arid and semi-arid regions. The condition number of the eigenvalues and the kappa of the predictors did not exceed the thresholds, i.e., kappa and condition number of the eigenvalues > 100, that we set for this study (Table 4). In general, multi-collinearity was an issue in the sites where individual predictors could explain the presence of natural vegetation regardless of the other predictors. If two collinear predictors bring the same information into the model, the internal algorithms of the BLR model remove the less important predictor.
In each study site, the variables are arranged in order of significance. Insignificant variables were automatically removed by the algorithm used to develop the model. In some cases, insignificant predictors were retained in the model if their removal would decrease the model accuracy. The predictors were ranked as follows: p < 0.001 *** (Extremely Significant), p < 0.01 ** (Very Significant), p < 0.05 * (Significant), p < 0.1 • (Less Significant), p ≥ 0.1 (Insignificant). Here, AUC/ROC represents the area under the receiver operating characteristic (ROC) curve.
Here, AIC, BIC, VIF, and HANPP represent Akaike information criterion (AIC), Bayesian information criterion (BIC), variance inflation factor (VIF), and human appropriation of net primary productivity, respectively.

Significant Underlying Drivers of Natural Vegetation Identified from the Binary Logistic Regression (BLR) Model
The BLR model revealed that the occurrence of natural vegetation in the West African landscape is determined by multiple combinations of drivers (Tables 3 and 5). The combinations varied from one location to another, apparently responding to the local environmental and socio-economic conditions. The ranking of the drivers also differed from one location to another. Human influence predictors were the most significant in the Ashanti Region (HANPP), Diourbel-Louga (HANPP), Niger State (livestock density), and Hodh el Gharbi (livestock density). HANPP, livestock density, and population density were significant at all the sites. The significance levels of population density were lower in the Ashanti Region of Ghana and Niger State of Nigeria, both of which are densely populated (Tables 3 and 5). The wetness index was the most significant driver in Zinder-Maradi in Niger, located in the arid and semi-arid eco-region, and in Centre-Centre Sud in Burkina Faso, in the semi-humid eco-region. In the Ashanti Region of Ghana and Niger State of Nigeria, in the southern humid region, the wetness index was not a significant driver at p < 0.05. In Hodh el Gharbi in Mauritania (arid region), the model fitted with the natural predictors only, i.e., the wetness index, elevation, slope, and soil type, predicted the presence of natural vegetation better than the model fitted with the socio-economic predictors only, i.e., HANPP, population density, livestock density, and travel time. In all the remaining study sites, the model fitted with only the socio-economic predictors (see Table 3 for the AUC of the sub-models) achieved better performance than the model fitted with only the natural predictors.

Model Validation
In all the study sites, the stepwise forward and backward BLR yielded the same model outcomes. These two models performed better than both the model fitted with only the intercept and the "full model" with all the predictors included. The best model was identified by the lowest AIC, log-likelihood ratio, BIC, and residual deviance (Tables 3 and 4). Here, we present the results of the forward-fitted model since it identifies and ranks the most important predictors from the best to the least, which was the main objective of this study (Tables 3 and 4). As mentioned in Section 3, the ROC curve (AUC, see Figure 4 and Table 4) was used as a performance metric to assess the predictive accuracy of the BLR model. AUC values range from 0 to 1. Model fitting is best when AUC ≥ 0. Figure 5 shows the predicted probability maps and the observed LULC maps in the study sites. The BLR model fitted with only socio-economic, human activity, and accessibility indicators, i.e., HANPP, livestock density, population density, and travel time to urban resources, performed better than the model fitted with only the wetness index, slope, elevation, and soil type in all the study sites except Hodh el Gharbi, Mauritania. At the latter site, the model fitted with only natural predictors gave a higher AUC than the model fitted with the socio-economic indicators only (Table 4).

The Relationship between the Significant Underlying Drivers and Natural Vegetation
The sign and size of each predictor's coefficient indicate the direction and magnitude of change in the probability of the presence of natural vegetation in response to a change in that predictor (Table 3, Figures 6 and 7). Positive coefficients indicate that increasing the predictor will increase the likelihood of the occurrence (presence) of natural vegetation (see Tables 3 and 5, Figure 6(b3-b4,c1-c3) and Figure 7(a3-a4,b1)). Conversely, negative coefficients indicate that increasing the predictor will decrease the likelihood of the occurrence of natural vegetation (see Tables 3 and 5, Figure 6(a1-a4,c4) and Figure 7(a1-a2,b2)). We observed a consistent negative impact of HANPP on the presence of natural vegetation across all the study sites (see Tables 3 and 5, and Figure 6(a1-a4)). In many cases, the wetness index was insignificant in the humid regions, i.e., the Ashanti Region of Ghana and Niger State in Nigeria, characterized by intensive anthropic activities, i.e., high HANPP and livestock density (see Tables 3 and 5). In areas where the wetness index was a significant predictor, its effect on natural vegetation differed depending on the specific conditions of the site (see Tables 3 and 5 and Figure 7(a1-a4)). In Diourbel-Louga, Senegal, and Zinder-Maradi, Niger, the effect of the wetness index on the presence of natural vegetation was negative (see Table 3 for all the cases, and Figure 6(a1,a2) and Figure 7(a1,a2) for graphical representations of the cases in Diourbel-Louga, Senegal, and Zinder-Maradi, Niger, respectively). The reverse was true elsewhere. For example, within the study site of Centre-Centre Sud in Burkina Faso, the likelihood of natural vegetation occurrence was higher in locations with a higher wetness index (more humid conditions), while human activities were intensive in areas with a lower wetness index (see Tables 3 and 5 and Figure 7).
Livestock density had a negative impact on the presence of natural vegetation in some sites (see Tables 3 and 5, Figure 6(b1,b2) for some of the cases). By contrast, the impact was positive in two arid regions, i.e., Hodh el Gharbi, Mauritania and Zinder-Maradi, Niger, and one humid region, i.e., the Ashanti Region of Ghana (see Tables 3 and 5, Figure 6(b3,b4) for some of the cases). In the Ashanti Region, livestock density was lower in the highly urbanized areas where HANPP and population density were extremely high. In other words, HANPP and population density were negatively correlated with livestock density, but the correlation was too weak to suggest multi-collinearity, and the significance level of livestock density was low (see Table 3).
Accessibility, measured by travel time to urban areas with a population greater than 2 × 10^4 people, had a positive impact on natural vegetation in the Ashanti Region of Ghana, Niger State in Nigeria, and Diourbel-Louga in Senegal (see Tables 3 and 5, Figure 6(c1-c3)). Travel time to urban areas was an insignificant driver in Centre-Centre Sud in Burkina Faso, which is a semi-humid region. On the other hand, the effect of this predictor on the presence of natural vegetation was negative in the arid regions (see Tables 3 and 5).

Multi-Collinearity
In some sites, e.g., Diourbel-Louga in Senegal, some variables exhibited collinearity (Figure 3 and Table 4). We set aside 20% of the original data for model validation and prediction. Thus, the dataset used for model predictions had the same degree of collinearity as the dataset used to train the model. According to Harrell Jr. [44], collinearity does not affect predictions when they are based on the same dataset used to estimate the model parameters, or on new data that have the same degree of collinearity as the original data, provided extreme extrapolation is avoided. In stepwise variable selection, however, collinearity can cause predictors to compete and make the selection of "important" variables arbitrary. Collinearity also makes it difficult to estimate and interpret a particular regression coefficient because the predictor carries limited additional information. Nevertheless, we strove to eliminate redundant variables by initially identifying the potential drivers from the literature. In addition, candidate predictors with a large number of missing values were removed.

Significant Underlying Drivers Identified by the Binary Logistic Regression (BLR) Model
The BLR model provided a clear picture of which variables were significant, together with the corresponding p-values, in explaining the final predicted outcome. It revealed that the impact of the multiple sets of drivers on natural vegetation depends on the specific environmental conditions of the location under consideration, as the relationship between natural vegetation and the significant predictors varied from one study site to another, even for the same predictor variables (see Tables 3 and 5).
The stepwise variable selection technique we employed enabled us to develop a concise model with fewer predictors when possible, thereby reducing collinearity among the multiple predictors. This prevented insignificant regression coefficients from being included in the model [44,82,83]. The major limitations of this technique have been underlined by Harrell Jr. [44], who noted that the technique may yield p-values that are too small. Harrell Jr. [44] further stressed that the problem of multi-collinearity may be compounded by stepwise variable selection, which can select variables that are significant only by chance, since the inclusion of a potential driver in the model depends on its estimated regression coefficient rather than its true value. As a result, a potential driver is more likely to be included if its regression coefficient is over-estimated than if it is under-estimated [82].
According to Derksen and Keselman [83], the number of potential predictors may add noise to the model. In addition, the correlation between the predictors may affect the inclusion of genuine predictors. With respect to the dimensionality of the model, Derksen and Keselman [83] proposed that the focus of effective driver selection should be the development of an algorithm to determine beforehand the total number of potential drivers to be included, rather than the number of drivers in the final model. Harrell Jr. [44] recommended that multi-collinearity and driver interaction be tested prior to driver selection. Methods such as a full model fit or data reduction are preferable to stepwise selection algorithms. However, in this study, variable selection was crucial since the major objective was to identify the important drivers based on the ranking of the predictor variables by the stepwise selection algorithm.

Model Validation
Model performance is usually assessed by evaluating whether the model can accurately predict future values of the response variable. Different model validation methods have been highlighted by Harrell Jr. [44]. We applied data-splitting to set aside independent data for model validation. However, this approach has drawbacks since data-splitting greatly reduces the sample size for both model development and model validation. Roecker [84] noted that this method "appears to be a costly approach, both in terms of predictive accuracy of the fitted model and the precision of estimated accuracy". Breiman [74] found that bootstrap validation on the original sample was as efficient as having a separate test sample twice as large.
In our case, the predictive accuracy of the BLR model was better in the arid regions than in the humid regions (Table 3 and Figure 4). Harrell Jr. [44] linked poor performance of a regression model with overfitting, changes in measurement methods, changes in the definition of categorical variables, and major changes in variable inclusion criteria. The poorer performance of our model in the humid ecosystem may be due to the exclusion of potentially important drivers, such as the land tenure system, customs, and norms [22], from the development of the BLR model. According to Mazzucato et al. [22], the coping strategies adopted under the local informal land tenure systems in Africa have shaped the functioning of the ecosystem over the years. This suggests that additional predictors must be included in the model to improve the model fit. Among other things, data gaps in the predictor variables are one of the major limitations of this study.

The Relationship between the Significant Underlying Drivers and Occurrence of Natural Vegetation
Here, the effect of the dominant drivers, i.e., HANPP, wetness index, livestock density, and accessibility are discussed in detail from Section 5.4.1 through Section 5.4.4 with a brief discussion on the effect of soil types, slope, and elevation on the presence of natural vegetation in Section 5.4.5.

Human Activities (HANPP) and Demography (Population Density)
Although the relationships between the occurrence of natural vegetation and some significant drivers were location-specific, the BLR model estimated a negative relationship between HANPP and the presence of natural vegetation at all the study sites at p < 0.001. This indicates that any significant increase in HANPP will decrease the likelihood of the presence of natural vegetation (see Tables 3 and 5, and Figure 6(a1-a4) for the graphical representation of the cases in Diourbel-Louga, Senegal, Zinder-Maradi, Niger, Ashanti Region, Ghana, and Niger State, Nigeria, respectively). This result is in agreement with Steffen et al. [4] and Ehlers and Krafft [5,85], who stated that the current human impact on the ecosystem, and on natural vegetation in particular, is pervasive at different scales. HANPP was a dominant driver in the Ashanti Region, Ghana and Diourbel-Louga in Senegal, where Asenso Barnieh et al. [11] revealed massive settlement and cropland expansion. On the other hand, vegetation recovery due to cropland abandonment was detected in some isolated areas of Diourbel-Louga by the same authors.
In-depth analysis revealed that human activities had an impact on the relationships between the presence of natural vegetation and other significant drivers, particularly the wetness index, soil type, slope, and elevation. For instance, in the Ashanti Region of Ghana, Niger State in Nigeria, and Diourbel-Louga in Senegal, where human activities were a dominant driver, slope had a positive impact on natural vegetation because human-induced LULC types, i.e., cropland, human settlements, and other LULC types, which were categorized as "absence of natural vegetation" in this study, systematically occurred in areas with a relatively gentler slope, while natural vegetation was present in areas with a steeper slope (Figure 2). This signifies that crop farming and the expansion of built-up areas are avoided in areas with steeper slopes, thereby favoring the conservation of natural vegetation. Another example of how human influence may alter the response of natural vegetation to climate is the negative relationship between the occurrence of natural vegetation and the wetness index in Zinder-Maradi, Niger and Diourbel-Louga, Senegal (see Figure 7(a1,a2)). In this case, detailed analysis revealed a lower probability of natural vegetation occurrence in locations with a higher wetness index, contrary to the natural pattern of more likely occurrence of natural vegetation with higher humidity (see Figure 2). This is consistent with the previous studies by Turner et al. [1] and Foley et al. [2], who suggested that intensive human activities at the local scale and rapid changes in the LULC system at different scales may intensify changes in land surface climate and exacerbate the vulnerability of ecosystem services, biodiversity, water balance, and soil conditions. Furthermore, we found that population density was directly proportional to HANPP in all the study sites except Diourbel-Louga in Senegal (see Tables 3 and 5).
In this area, there was a positive relationship between population density and the presence of natural vegetation, i.e., an increase in population density increased the likelihood of the presence of natural vegetation. Though the population density observed in this area in 2000 was comparatively high, i.e., third highest in the selected hotspots in West Africa (see Table 2), the abandonment of cropland, the subsequent replacement by natural vegetation, and migration of the farmers to bigger cities in the same study site explained the observed positive impact of population density on natural vegetation in this region [11,61,63]. Detailed validation with the geospatial raster maps used to develop the model showed denser population coupled with intensive human activities in the Diourbel Region. Here, the likelihood of natural vegetation occurrence was lower. However, toward the Louga Region, which is the greater proportion of the site, sparser population correlated with higher likelihood of natural vegetation (see Figure 2).
The negative relationship between population density and the presence of natural vegetation observed in all the other sites highlights the negative impact of human activities on natural vegetation. Brandt et al. [21] underlined the "negative-population-vegetation" response in West Africa. Under this scenario, a further increase in population will exacerbate the impact of human activities on natural vegetation, with serious implications for climate impact mitigation. The loss of biomass and carbon through deforestation and degradation of natural resources increases the emission of greenhouse gases into the atmosphere. In densely populated regions, e.g., the Ashanti Region in Ghana, Niger State in Nigeria, and Diourbel-Louga in Senegal, the significance of population density in explaining the presence of natural vegetation was lower (see Tables 3 and 5), since the population in these regions was concentrated in a few major cities (see Table 2 for the descriptive statistics and Figure 2 for the spatial distribution). This finding is consistent with the LULC transition mapping by Asenso Barnieh et al. [11], who found that human settlement expansions in West Africa are localized in only a few areas.

Livestock Density
In the sites in Diourbel-Louga, Senegal, Centre-Centre Sud in Burkina Faso, and Niger State in Nigeria, with very high livestock density (see Table 2), the BLR model predicted a negative impact of livestock density, especially cattle, on the presence of natural vegetation (see Tables 3 and 5 and Figure 6(b1,b2) for graphical representations of the cases in Diourbel-Louga, Senegal and Niger State, Nigeria). Grazing and trampling by livestock are known to have negative impacts on total above-ground biomass, with the pressure increasing as livestock density increases [58,86–88]. This is consistent with the findings from this study. Furthermore, in the previously mentioned areas, nomadism and transhumance are widely practiced. Mobility of herds through nomadism (i.e., herds migrate over long distances from season to season in search of pasture) and transhumance are well-known coping strategies used by livestock keepers in West Africa to ensure the survival of livestock when the availability of natural vegetation is low during the dry season. This system of livestock management has a greater impact on natural vegetation when the movement of the animals is unguided [58,87,89,90]. Here, the proportion of livestock occurrence in the highly urbanized area was higher than in the less urbanized area, where the fractional abundance of natural vegetation was higher (see Figure 2). Therefore, apart from the negative impact of livestock density on the presence of natural vegetation, other human activities tied to urbanization may play a role in the displacement of natural vegetation in these sites.
According to Schlecht et al. [88], in highly populated and urbanized areas, livestock farmers dispose of livestock dung in open spaces and streets. The exposure of dung to extreme sunlight and rainfall has negative implications for the environment since the process may lead to leaching of nutrients and emission of greenhouse gases, such as methane, to the atmosphere, with a subsequent impact on climate. In Burkina Faso, roadside grasses, fallow vegetation, and stubble material from public spaces within and particularly around the city contribute a significant share of the dry matter intake of ruminants in Ouagadougou, which is one of the urbanized areas in the study site in Centre-Centre Sud of Burkina Faso [88].
On the contrary, in the study sites in Hodh el Gharbi, Mauritania, Ashanti Region, Ghana, and Zinder-Maradi, Niger, where the BLR model predicted positive relationships between the presence of natural vegetation and livestock density (see Tables 3 and 5, Figures 2 and 6(b3,b4) for the cases in Hodh el Gharbi, Mauritania and Ashanti Region, Ghana), the livestock densities were relatively lower (Table 2) and the types of livestock were restricted to small ruminants, such as goats and sheep, with restricted livestock movement [90]. Though Mauritania is well known for its dependence on nomadic and transhumant livestock farming, the site in Hodh el Gharbi had a relatively lower livestock density (see Table 2 and Figure 2), with restricted livestock movement and less impact on natural vegetation compared with other parts of the country [90].
Livestock rearing in the Ashanti Region of Ghana is not nomadic. Livestock farmers mostly depend on a wide variety of feed resources, such as grass, peels of cassava and plantain, and agro-industrial by-products like millet mash residue, rice bran, and home waste, to feed their cattle [91]. This reduces the grazing pressure by livestock. Moreover, livestock density is more likely to have a positive impact on the presence of natural vegetation in areas where livestock are used as draft power for crop cultivation, e.g., Hodh el Gharbi, and where crop residues are used to feed the livestock. Such a symbiotic association between livestock rearing and crop farming, as in some rural and peri-urban areas in the Ashanti Region, Ghana and Zinder-Maradi, Niger, is more likely to lead to a positive impact of livestock density on the presence of natural vegetation (Tables 3 and 5 and Figure 2), as predicted by the BLR model, since crop residues augment herbaceous vegetation for grazing, thereby easing the pressure on natural vegetation cover [89].

Accessibility (Travel Time)
The negative or positive (see Tables 3 and 5 and Figure 6(c1-c4)) influence of accessibility on the presence of natural vegetation may be related to differences in the type and spatial distribution of natural vegetation as well as human settlements.
In the humid and highly urbanized regions, i.e., Ashanti Region in Ghana and Niger State in Nigeria, where wetness was not a limiting factor, the BLR model predicted a positive relationship between travel time to urban resources and the presence of natural vegetation (see Tables 3 and 5 and Figure 6(c2,c3)). In these sites, the rate of natural vegetation conversion to human settlement, cropland, and other industrial developments between 1975 and 2000 was unprecedented [11]. The only types of natural vegetation spared from land conversion in these areas were protected forest and natural vegetation in the rural areas. The closer cities are to land allocated for natural vegetation conservation and protection, the higher the rate of conversion; hence, as travel time increases, the likelihood of the presence of natural vegetation increases (see Figure 2). Therefore, protected natural areas in such regions may be sited very far away from urban areas. Diourbel-Louga in Senegal is an arid region with spatial patterns similar to those of the humid sites mentioned above (see Figure 6(c1)). Here, too, the model predicted a positive impact of travel time on natural vegetation.
The reverse was true for the study sites in Zinder-Maradi in Niger and Hodh el Gharbi in Mauritania, where wetness was a limiting factor and human settlements were in close proximity to natural vegetation and better climatic conditions. In this case, a negative relationship between travel time and natural vegetation was estimated by the BLR model (see Tables 2, 3 and 5 and Figure 6(c4) for the case in Hodh el Gharbi, Mauritania). However, HANPP in the relatively more humid and urbanized areas was high. In the Zinder-Maradi Region, for example, farmers have adopted management practices. The areas very far from the cities were degraded due to the repeated episodes of severe drought in the 1970s and 1980s [64,65]. Under these conditions, increasing travel time to urban resources decreases the likelihood of the presence of natural vegetation (Figure 2). This is in agreement with the analysis of the distribution and trend of vegetation in some sections of the Zinder-Maradi site by Rishmawi and Prince [27], who highlighted that areas far from human settlements were prone to a decline in biomass productivity. Leroux et al. [28] found that increasing travel time had a negative impact on biomass productivity in some parts of the study area. Therefore, protected natural areas in such regions must be sited in close proximity to human settlements under strict management and protection. The Sahel Great Green Belt Initiative is a step in the right direction [92,93].

Climate (Wetness Index)
The predictions from the BLR model suggested that the wetness index was neither the only underlying driver nor the dominant driver of the presence of natural vegetation in West Africa prior to 2000. The wetness index was not significant at p < 0.05 in some study sites, particularly in the humid regions, e.g., the Ashanti Region, Ghana and Niger State, Nigeria (see Tables 3 and 5). This contradicts previous studies by Anyamba and Tucker [19], Hickler et al. [17], Seaquist et al. [18], and Huber et al. [20], who noted that climate variability is the major underlying driver of vegetation dynamics in Africa, particularly in the Sahel region. In Diourbel-Louga in Senegal, which falls within the Sahel (arid) region, the forward step-wise BLR model revealed that human activity, as measured by HANPP, was the dominant driver of the presence of natural vegetation (see Tables 3 and 5). The relationship determined by the BLR model is supported by the previous LULC transition analyses of Asenso Barnieh et al. [11], who revealed a massive expansion of settlement and cropland, with isolated cropland decline due to abandonment, in Senegal.
Even in the study sites where the wetness index was the leading underlying driver, the sub-models demonstrated that the model fitted with only human socio-economic predictors explained a greater proportion of the variability in the presence of natural vegetation than the model fitted with only natural predictors (see Table 3). In addition, the impact of the wetness index on natural vegetation differed across the sub-continent and was controlled by human activities, as described earlier (see Table 3 and Figure 7(a1-a4)). With limited human activities, a higher wetness index will increase the presence of natural vegetation, as in the case of Hodh el Gharbi in Mauritania (Figures 2 and 7(a3,a4)).
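The comparison between a sub-model with only human predictors and a sub-model with only natural predictors can be illustrated with a minimal sketch on synthetic data. This is not the authors' code: the variable names, the simulated coefficients, the Newton-Raphson fitting routine, and the use of McFadden's pseudo-R² as the measure of explained variability are all assumptions made for illustration, since this section does not state which statistic underlies Table 3.

```python
import numpy as np


def fit_logit_loglik(X, y, n_iter=50, tol=1e-8):
    """Fit a binary logistic regression by Newton-Raphson.

    X is an (n, k) predictor matrix without an intercept column;
    y is a 0/1 float array. Returns the maximized log-likelihood.
    """
    Xd = np.column_stack([np.ones(len(y)), X])  # prepend intercept
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        grad = Xd.T @ (y - p)                          # score vector
        hess = Xd.T @ (Xd * (p * (1.0 - p))[:, None])  # observed information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    p = np.clip(1.0 / (1.0 + np.exp(-Xd @ beta)), 1e-12, 1.0 - 1e-12)
    return float(np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))


def mcfadden_r2(X, y):
    """McFadden's pseudo-R^2: 1 - LL(model) / LL(intercept-only model)."""
    p_bar = y.mean()
    ll_null = len(y) * (p_bar * np.log(p_bar) + (1.0 - p_bar) * np.log(1.0 - p_bar))
    return 1.0 - fit_logit_loglik(X, y) / ll_null


# Synthetic, standardized drivers (hypothetical; not the paper's data).
rng = np.random.default_rng(0)
n = 2000
hanpp, population = rng.normal(size=n), rng.normal(size=n)
wetness, elevation = rng.normal(size=n), rng.normal(size=n)

# Assumed "true" effects: human drivers dominate, natural drivers are weaker.
logit = -1.5 * hanpp - 0.5 * population + 0.5 * wetness - 0.3 * elevation
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

human = np.column_stack([hanpp, population])
natural = np.column_stack([wetness, elevation])

r2_human = mcfadden_r2(human, y)
r2_natural = mcfadden_r2(natural, y)
r2_full = mcfadden_r2(np.column_stack([human, natural]), y)

print(f"human-only   pseudo-R2: {r2_human:.3f}")
print(f"natural-only pseudo-R2: {r2_natural:.3f}")
print(f"full model   pseudo-R2: {r2_full:.3f}")
```

Because the simulated human effects are stronger, the human-only sub-model attains the higher pseudo-R² here. With real data the ranking depends on the site, as the contrasting case of Hodh el Gharbi shows.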
Although the impact of the wetness index on natural vegetation was positive in both Hodh el Gharbi in Mauritania and Centre-Centre Sud in Burkina Faso (see Tables 3 and 5, Figure 7(a3,a4)), we observed that the impact in Hodh el Gharbi was random, while the impact in Centre-Centre Sud was systematic. This is because we found a direct relationship between the human indicators (population density, HANPP, and livestock density) and the wetness index in Hodh el Gharbi, Mauritania (Figure 3b). Nonetheless, toward the northern part of the study site, there was a vast area of degraded land. Visual analysis of the HANPP and livestock density geospatial data (Figure 2) revealed little human impact and a clearer impact of climate, such as repeated episodes of drought, in this area. Comparatively, human socio-economic activities measured by population density, livestock density, and HANPP were relatively lower at this study site (Table 2). The sub-model fitted with only natural predictors explained a greater proportion of the variability in the occurrence of natural vegetation than the sub-model fitted with only human socio-economic predictors in this study site. This is consistent with the LULC transition analysis by Asenso Barnieh et al. [11], who revealed massive replacement of natural vegetation by sand dunes and bare land linked with climate in this region. Hoscilo et al. [94] found a positive relationship between rainfall and the normalized difference vegetation index (NDVI), which is a proxy for biomass productivity, in Mauritania.
On the contrary, an inverse relationship was found between the human indicators (population density, HANPP, and livestock density) and the wetness index in Centre-Centre Sud in Burkina Faso (Figure 3d). Here, socio-economic indices were relatively high (see Figure 2) in the far arid north. Generally, human socio-economic indicators in this study site were higher than in the other study sites, apart from the Ashanti Region of Ghana (see Table 2). The wetness index was higher in the areas toward the humid south, far from human settlements, where human activities were limited and natural vegetation was very dense. These areas were spared from land conversion between 1975 and 2000 [11] (see Figure 2 for the spatial distribution of natural vegetation). The natural vegetation distribution here followed the natural pattern, suggesting that, in spite of human socio-economic growth, systematic spatial planning may sustain the positive response of natural vegetation to precipitation and mitigate its sensitivity to disturbances (see Figure 7(a4)).
In Diourbel-Louga of Senegal and Zinder-Maradi in Niger, where the impact of the wetness index on natural vegetation was negative, the effect of the wetness index was controlled by the intensity of human activities: a positive relationship was found between HANPP, livestock density, and population density on the one hand and the wetness index on the other (see Figures 2 and 3a,c), so that the presence of natural vegetation decreased as the wetness index increased (see Figure 7(a1,a2) for a graphical representation of the cases in Diourbel-Louga and Zinder-Maradi). In these two areas, previous studies have documented massive expansions of cropland in the more humid southern parts, where the wetness index is higher, while natural vegetation has been replanted in the water-deficit areas under the "farmer-managed natural resources' regeneration initiative" [64,65].

Soil Type, Elevation, and Slope
The BLR model predicted (see Tables 3 and 5 and Figure 2) a positive effect of soil type on the presence of natural vegetation in Centre-Centre Sud of Burkina Faso, Niger State of Nigeria, and Diourbel-Louga of Senegal, where the dominant soil types, i.e., Nitisols, Luvisols, Fluvisols, Arenosols, and Cambisols, are fertile. It predicted a negative effect in the Ashanti Region of Ghana, Hodh el Gharbi of Mauritania, and Zinder-Maradi of Niger, where the dominant soil types, i.e., Acrisols, Lithosols, and Regosols, are sub-optimal for natural vegetation growth [55]. With regard to slope and elevation, gentle slopes and lower elevations are generally more suitable for vegetation growth [54]. This is consistent with the negative effect of elevation estimated by the BLR model, i.e., increasing elevation decreased the odds of the occurrence of natural vegetation, at all the study sites (see Table 3 and Figure 7(b2) for a graphical representation of the case in Hodh el Gharbi, Mauritania) except Diourbel-Louga in Senegal (see Figure 7(b1)), where a higher elevation increased the presence of natural vegetation.
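To make the phrase "decreased the odds" concrete, the following minimal sketch converts a logistic regression coefficient into an odds ratio and into predicted probabilities. The intercept, the elevation coefficient, and its unit (per 100 m) are hypothetical values chosen for illustration; they are not taken from Table 3.

```python
import math


def odds_ratio(beta, delta=1.0):
    """Multiplicative change in the odds for a `delta`-unit increase in a predictor."""
    return math.exp(beta * delta)


def probability(intercept, beta, x):
    """Predicted probability of natural vegetation at predictor value x."""
    return 1.0 / (1.0 + math.exp(-(intercept + beta * x)))


# Hypothetical values (assumptions, NOT the paper's estimates).
intercept = 0.5
beta_elevation = -0.4  # change in log-odds per 100 m of elevation

or_elevation = odds_ratio(beta_elevation)
p_low = probability(intercept, beta_elevation, 1.0)   # site at 100 m
p_high = probability(intercept, beta_elevation, 3.0)  # site at 300 m

print(f"odds ratio per 100 m: {or_elevation:.3f}")
print(f"P(vegetation) at 100 m: {p_low:.3f}")
print(f"P(vegetation) at 300 m: {p_high:.3f}")
```

A negative coefficient gives an odds ratio below 1, so the predicted probability falls as the predictor rises; a positive coefficient, as estimated for elevation in Diourbel-Louga, gives an odds ratio above 1.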
Slope controls soil water content by influencing infiltration, drainage, and runoff [95,96,97,98], and it is a determinant of soil erosion and degradation, with an impact on vegetation growth [99]. In the Ashanti Region of Ghana, Diourbel-Louga of Senegal, and Niger State of Nigeria, human activities were influential drivers (see Tables 3 and 5) and slope had a positive impact on natural vegetation, i.e., steeper slopes increased the odds of natural vegetation occurrence, because human-induced LULC types, i.e., cropland and human settlements, systematically occur in areas with a gentler slope (see Tables 3 and 5 and Figure 2). The results of this study provide enough evidence to support the conclusion that "a single factor explanation of a LULC change in Africa", i.e., of the occurrence of natural vegetation, is not enough [14]. Our findings are in agreement with those of Boschetti et al. [25] and Leroux et al. [28], who stated that LULC change in West Africa is determined by the interplay of climate, biophysical, and anthropogenic drivers.

Conclusions
This research demonstrated the ability of the BLR model to identify the statistically significant drivers that determine the presence of natural vegetation in a complex ecosystem where a single-driver explanation was not sufficient. The BLR model revealed not only the set of statistically significant underlying drivers of natural vegetation's occurrence, but also the interplay of these drivers and their relative impacts. The study revealed that, even prior to 2000, human activities as measured by HANPP were pervasive and a significant driver of natural vegetation in West Africa. The likelihood of natural vegetation's occurrence decreased with increasing human activities in all the study sites. Moreover, the relationships between the presence of natural vegetation and all the other drivers were location-specific and mostly depended on the intensity of human activities. The study highlights the significant role human activities play in altering the land surface climate and natural indicators as well as the ecosystem. Thus, without a negative human influence, the relationship between natural vegetation and natural drivers, i.e., wetness index, slope, elevation, and soil type, leads to positive effects on the ecosystem, but intensive human activities may trigger negative changes in the ecosystem. Therefore, policies and strategies to control the intensity of human activities and to restore degraded landscapes are urgently needed in West Africa. The results from this study may provide useful information for understanding the interactions between humans and the environment for natural resources management, climate change adaptation, and sustainable development. The past responses of natural vegetation to multiple underlying drivers captured by this research may serve as a foundation for predicting the current occurrence of natural vegetation and for projections into the future.
The BLR model approach is an advancement over the traditional narratives and people's perceptions of the underlying drivers of natural vegetation dynamics in West Africa.