Proximate and Underlying Deforestation Causes in a Tropical Basin Through Specialized Consultation and Spatial Logistic Regression Modeling

The present study focuses on identifying and describing the possible proximate and underlying causes of deforestation and its factors using the combination of two techniques: (1) specialized consultation and (2) spatial logistic regression modeling. These techniques were implemented to characterize the deforestation process qualitatively and quantitatively, and then to graphically represent the deforestation process from a temporal and spatial point of view. The study area is the North Pacific Basin, Mexico, from 2002 to 2014. The map difference technique was used to obtain deforestation using the land-use and vegetation maps. A survey was carried out to identify the possible proximate and underlying causes of deforestation, with the aid of 44 specialized government officials, researchers, and people who live in the surrounding deforested areas. The results indicated total deforestation of 3,938.77 km² in the study area. The most important proximate deforestation causes were agricultural expansion (53.42%), infrastructure extension (20.21%), and wood extraction (16.17%), and the most important underlying causes were demographic factors (34.85%), economic factors (29.26%), and policy and institutional factors (22.59%). Based on the spatial logistic regression model, the factors with the highest statistical significance were forestry productivity, the slope, the altitude, the distance from population centers with fewer than 2500 inhabitants, the distance from farming areas, and the distance from natural protected areas.


Introduction
According to the Forest Resources Assessment (FRA), the worldwide annual rate of deforestation decreased from 16 million ha/year in 1990-2000 to 10 million ha/year in 2015-2020 [1]. Deforestation and forest degradation rates are alarming, since they are responsible for approximately 20% of the carbon dioxide (CO2) that is globally emitted to the atmosphere [2,3]. For this reason, international organizations created the program for the Reduction of Emissions from Deforestation and Forest Degradation (REDD+), which addresses climate change through the mitigation of deforestation and degradation of forest areas, with the additional goal of preserving carbon stocks and their sustainability [4].
The participation of developing countries is requested. Therefore, a strategy or an action plan at a national level is needed to meet REDD+ goals. These strategies must ensure the full and active participation of relevant stakeholders, such as aboriginal groups and local communities, and must tackle the most important causes of deforestation and forest degradation [5].
Causative pattern studies have been performed to describe the causes of deforestation and forest degradation, explaining how the loss of forest cover has occurred [6,7]. These studies have contributed to the reduction of uncertainty about the spatial and temporal occurrence of future deforestation and forest degradation processes. In addition, the ecological impact on the land caused by deforestation has also been reported [8,9], which is mainly related to climate variations, water quality degradation, soil erosion, and biodiversity loss [10][11][12].
Several methodologies have been used to understand the causes of deforestation. The causative pattern studies of deforestation have focused on analyzing the proximate and underlying causes and their interlinkages [13,14]. Some case studies at a local or regional level have examined the causes of deforestation empirically, through an extensive bibliographical review [15][16][17][18][19]. Other studies have carried out surveys to describe proximate and underlying causes of deforestation [20][21][22][23][24][25]. These causes are related to the expansion of agricultural lands, forest exploitation, mining, expansion of human settlements, population growth, and socio-economic activities. In addition, recent studies have taken into account local perceptions and reported other causes of deforestation, such as the use of firewood for tobacco curing, brick production, and domestic use, as well as the development of subsistence agriculture, which represents an economic income in marginalized regions [26][27][28][29]. The use of mathematical modeling, such as regression models [30][31][32], artificial neural networks [33,34], agent-based models [35], cellular automata [36][37][38][39], multilevel statistical models [40,41], and remote sensing techniques [42,43], has also been reported to describe the causes of deforestation.
In Mexico, linear regression models have been widely used to model deforestation and land-use changes [44][45][46][47][48][49][50]. These studies have recognized that the most important cause is the conversion of forest land to agricultural use. However, recent studies using generalized additive models [51] indicate that land tenure is a critical factor in driving the decision to deforest. By using geographically weighted regression [52], the slope of the land was identified as the most important deforestation factor for all types of forest in the State of Mexico, since this factor facilitates or hinders the expansion of agriculture and urban areas. The weights of evidence technique [53] has been used to identify an increase in induced grasslands, which in turn represents the main cause of forest loss. Specialized consultation has suggested that agricultural expansion and infrastructure extension are the main causes of forest loss [54].
The present study focuses on identifying and describing the possible causes of deforestation in the North Pacific Basin in Mexico from 2002 to 2014. The main contribution of the present study consists of the integration of techniques and methodologies that better describe the proximate and underlying causes of deforestation. First, a survey was carried out with the aid of specialized government officials, researchers, and people who live in the surrounding deforested areas to identify the possible causes of deforestation and their related factors. These surveys were the basis for determining the variables used to develop the spatial logistic regression (SLR) model. Finally, the behavior of deforestation in the North Pacific Basin was modeled from a spatial and temporal point of view.

Methodology
The proposed methodology was developed under the following scheme. First, deforestation was estimated in the study area. Then, officials and local people were consulted to identify the possible proximate and underlying causes of deforestation. An analysis of land-use change was carried out to compare the perceptions obtained from the surveys with the land-use changes observed in the study area. Next, the deforestation factors were determined based on the bibliographic review and on the proximate and underlying causes of deforestation identified through the survey. This set of factors was used to develop the spatial logistic regression model, which was then validated using the total operating characteristic index. Finally, an analysis of the linkages between the factors and the proximate and underlying causes of deforestation was carried out (Figure 1).

Deforestation Estimation
The land-use and vegetation (LUV) maps provided by the National Institute of Statistics and Geography (INEGI) from 2002 to 2014 were used to estimate deforestation, considering only two land categories: forest (primary and secondary coniferous, oak, mountain mesophilic, deciduous, evergreen, sub-deciduous forest, and cultivated forest) and non-forest (all other categories). The forest cover loss analysis was carried out by using the map difference technique. The deforestation was obtained in hectares, and the annual average deforestation rate was estimated according to the equation proposed by [55] (Equation (1)).
In Equation (1), $S_1$ and $S_2$ are the forest covers on different dates, and $n$ is the number of years evaluated.
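The exact form of Equation (1) is not shown in this text, so the following is only a minimal sketch under an assumed form: the compound annual rate of change, $(S_2/S_1)^{1/n} - 1$, which is one commonly used way to annualize a change in forest cover between two dates. The function name and the sample areas are hypothetical and are not values from this study.

```python
def annual_change_rate(s1: float, s2: float, n: float) -> float:
    """Compound annual rate of forest-cover change.

    ASSUMPTION: the compound form (s2 / s1) ** (1 / n) - 1 is used for
    illustration; it is not confirmed to be the study's Equation (1).
    s1, s2: forest cover on the first and second dates (same units).
    n: number of years between the two dates.
    """
    if s1 <= 0 or s2 < 0 or n <= 0:
        raise ValueError("s1 and n must be positive, and s2 non-negative")
    return (s2 / s1) ** (1.0 / n) - 1.0


# Hypothetical areas (km^2); a negative rate means net forest loss.
rate = annual_change_rate(1000.0, 950.0, 12)
print(f"annual change: {rate * 100:.3f} % per year")
```

Applying the rate for n years to the initial cover should recover the final cover, which is a quick sanity check on any annualized rate of this kind.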

Survey Design, Implementation, and Analysis
The purpose of the survey was to identify the possible proximate and underlying causes of deforestation from 2002 to 2014. In this study, the survey design, implementation, and analysis were based on the conceptual framework proposed by Geist and Lambin [13]. As proximate causes, five main factors were identified: agricultural expansion (permanent cultivation, shifting cultivation, cattle ranching, and new agricultural areas), wood extraction (commercial wood extraction, fuelwood extraction, and farm improvement), infrastructure extension (transport infrastructure, settlement expansion, and public services), mining operations (metal mining and non-metal mining), and social trigger events (population displacements, social disorder, and drug trafficking). The underlying causes were classified into five factors: demographic (migration and population growth), economic (market growth and commercialization, economic structure, and urbanization and industrialization), technological (land factors, agro-technological changes, and labor factors), policy and institutional (formal policies, policy climate, and property rights), and cultural (attitude, values and beliefs, individual and household behavior).
The survey was applied to 44 specialized officials in different government departments (economic, ecological, environmental, and forest protection government departments). These officials are affiliated with different agencies, such as the Secretariat of Environment and Natural Resources (SEMARNAT), the National Forestry Commission (CONAFOR), the National Water Commission (CONAGUA), the National Commission of Protected Areas (CONANP), the Federal Attorney for Environmental Protection (PROFEPA), and the Ministry of Communication and Transportation (SCT). Researchers affiliated with the Autonomous University of Sinaloa and 10 people who live in the surrounding areas with higher deforestation were also consulted to reinforce and enrich the opinion of the officials. The opinions obtained through the surveys were summarized using the frequency of occurrence of each cause. The frequencies were expressed as percentages to analyze each cause individually and to determine its importance with regard to deforestation processes.
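The frequency summary described above can be sketched as follows. This is a minimal illustration, assuming each survey answer names one cause; the list of responses is hypothetical and does not reproduce the study's survey data.

```python
from collections import Counter

# Hypothetical survey answers: each entry is one proximate cause named
# by a respondent (NOT the study's data).
responses = [
    "agricultural expansion", "agricultural expansion",
    "infrastructure extension", "wood extraction",
    "agricultural expansion", "mining operations",
    "infrastructure extension", "agricultural expansion",
]

counts = Counter(responses)
total = sum(counts.values())

# Frequency of occurrence of each cause, expressed as a percentage.
percentages = {cause: 100.0 * k / total for cause, k in counts.most_common()}

for cause, pct in percentages.items():
    print(f"{cause}: {pct:.2f}%")
```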

Land-Use Change Analysis
An analysis of land-use changes was carried out using the methodology developed by Pontius et al. [56]. The LUV maps provided by INEGI at a scale of 1:250,000 were homogenized into 10 categories (aquaculture, agriculture, human settlements, forest, water bodies, mangrove, scrubs, other lands, pasturelands, and hydrophilic vegetation), and topologically and geometrically corrected. A cross-tabulation matrix was computed to obtain the gains and losses for each land-use category. This methodology also identifies the exchanges between land-use categories between 2002 and 2014. In particular, the most important transitions were identified and used to verify the results (causal elements) obtained from the specialized consultation on the causes of deforestation. From this matrix, the gains (Equation (2)) and losses (Equation (3)) were determined:
$$G_j = P_{+j} - P_{jj} \tag{2}$$
$$L_i = P_{i+} - P_{ii} \tag{3}$$
where $G_j$ is the gain, representing the proportion of the landscape that experienced an increase in category $j$ between 2002 and 2014; $P_{+j}$ denotes the sum of the proportion of the landscape in category $j$ in 2014; and $P_{jj}$ denotes the proportion of the landscape that shows persistence of category $j$. In addition, $L_i$ represents the loss that each category had between 2002 and 2014, $P_{i+}$ denotes the sum of the proportion of the landscape in category $i$ in 2002, and $P_{ii}$ denotes the proportion of the landscape that shows persistence of category $i$.
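As a hedged illustration of the cross-tabulation approach, the sketch below computes persistence, gains, and losses per category from a small matrix of landscape proportions, with rows as the category at the first date and columns as the category at the second date. The 3-category matrix and the variable names are hypothetical; the study used 10 categories.

```python
import numpy as np

# Hypothetical cross-tabulation matrix of landscape proportions.
# Rows: category i at the first date; columns: category j at the second date.
P = np.array([
    [0.50, 0.04, 0.01],
    [0.02, 0.25, 0.01],
    [0.01, 0.01, 0.15],
])

persistence = np.diag(P)       # proportion that stayed in the same category
total_date1 = P.sum(axis=1)    # row totals: proportion per category at date 1
total_date2 = P.sum(axis=0)    # column totals: proportion per category at date 2

gains = total_date2 - persistence    # gain of category j
losses = total_date1 - persistence   # loss of category i
net_change = gains - losses

for k in range(P.shape[0]):
    print(f"category {k}: gain={gains[k]:.2f}, loss={losses[k]:.2f}, "
          f"net={net_change[k]:+.2f}")
```

Because every transition is a loss for one category and a gain for another, total gains and total losses over the landscape are equal, which is a useful check on the matrix.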

Determining the Factors of Deforestation
An extensive literature review, the survey results, and the land-use change analysis were the basis for determining the factors of deforestation in the North Pacific Basin. The proximate and underlying causes of deforestation were related to a group of geographic variables that could represent these causes on a map [52,57]. Because many variables can be a cause of deforestation in the study basin, they were divided into three groups: (a) socioeconomic (population growth, population density, and marginalization index), (b) biophysical (altitude, slope, forest productivity, mean annual precipitation, soil moisture, and temperature), and (c) proximity-related (distance from natural protected areas, distance from agricultural areas, distance from pasture areas, distance from roads, distance from hydrography, distance from localities with fewer than 2500 inhabitants, distance from mines, and distance from human settlements) [9,33,52,53].
Different algorithms were used to derive the factors from the variables: inverse distance weighting was used to model the socioeconomic factors [58], and the Euclidean distance algorithm was used to obtain all the proximity factors [53]. The slope of the terrain was derived from the digital elevation model [59].
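To illustrate what a Euclidean distance factor is, the following minimal sketch computes, for every raster cell, the straight-line distance to the nearest feature cell (for example, a road or an agricultural pixel). It is a brute-force illustration suitable only for small grids, not the GIS routine used in the study; the function name and the tiny mask are hypothetical, and the 100 m pixel size follows the value reported for the homogenized data.

```python
import numpy as np


def euclidean_distance_raster(feature_mask: np.ndarray,
                              pixel_size: float) -> np.ndarray:
    """Distance from each cell centre to the nearest feature cell centre.

    feature_mask: 2-D boolean array, True where the feature is present.
    pixel_size: cell size in map units (e.g. metres).
    Brute force: memory grows with (number of cells) x (number of features).
    """
    features = np.argwhere(feature_mask)               # (M, 2) row/col indices
    if features.size == 0:
        raise ValueError("feature_mask contains no feature cells")
    rows, cols = np.indices(feature_mask.shape)
    cells = np.column_stack([rows.ravel(), cols.ravel()])  # (N, 2)
    diff = cells[:, None, :] - features[None, :, :]        # (N, M, 2)
    nearest = np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)
    return nearest.reshape(feature_mask.shape) * pixel_size


# Hypothetical 4 x 5 grid with a single feature cell in the top-left corner.
mask = np.zeros((4, 5), dtype=bool)
mask[0, 0] = True
distance_m = euclidean_distance_raster(mask, pixel_size=100.0)
print(distance_m.round(1))
```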
Once the factors were obtained, a Pearson correlation analysis was carried out to examine the correlation between them. Two factors were considered highly correlated when the Pearson coefficient (r) was greater than 0.75. This exploratory data analysis was used to identify the factors that must be excluded to avoid overestimation by the model [60].
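The correlation screening can be sketched as below. This is a minimal illustration on synthetic data: the factor names are borrowed from the text, but the values are randomly generated and say nothing about the study's actual correlations. Comparing the absolute value of r with the 0.75 threshold is an assumption, made so that strong negative correlations are flagged as well.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels = 500

# Synthetic factor values (NOT the study's data).
altitude = rng.normal(size=n_pixels)
temperature = -0.95 * altitude + 0.1 * rng.normal(size=n_pixels)
slope = rng.normal(size=n_pixels)

names = ["altitude", "temperature", "slope"]
X = np.column_stack([altitude, temperature, slope])

# Pearson correlation matrix between factors (columns of X).
r = np.corrcoef(X, rowvar=False)

threshold = 0.75  # threshold stated in the text
high_pairs = [
    (names[i], names[j], float(r[i, j]))
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(r[i, j]) > threshold  # ASSUMPTION: absolute value of r is compared
]

for a, b, value in high_pairs:
    print(f"{a} vs {b}: r = {value:.3f} (candidate for exclusion)")
```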

Spatial Logistic Regression (SLR) Model
The SLR model was adjusted using the IDRISI TerrSet software [61], based on the factors that were previously defined [62]. Deforestation during the period of study was established as the binary dependent variable. The dependent variable represents the presence or absence of an event, where 1 = deforested areas and 0 = non-deforested areas in each observation or pixel (Equation (4)):
$$P = \frac{\exp\left(\beta_0 + \sum_{i=1}^{n} \beta_i X_i\right)}{1 + \exp\left(\beta_0 + \sum_{i=1}^{n} \beta_i X_i\right)} \tag{4}$$
where $\beta_i$ represents the parameter for each variable estimated by the model ($\beta_0$ being the intercept), and $X_i$ is the factor included in the model ($i = 1, 2, \cdots, n$); finally, $P$ is the probability that a non-deforested pixel will become deforested.
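As an illustration of how a fitted logistic model turns factor values into a per-pixel probability, the following minimal sketch assumes the standard logistic form with an intercept. The coefficient values, the two factors, and the function name are hypothetical; they are not the fitted values from this study, and the sketch does not reproduce the TerrSet workflow.

```python
import numpy as np


def deforestation_probability(factors: np.ndarray,
                              beta0: float,
                              betas: np.ndarray) -> np.ndarray:
    """Logistic probability for each pixel.

    factors: array of shape (n_pixels, n_factors).
    beta0: intercept; betas: one coefficient per factor.
    """
    z = beta0 + factors @ betas          # linear predictor per pixel
    return 1.0 / (1.0 + np.exp(-z))      # equivalent to exp(z) / (1 + exp(z))


# Hypothetical coefficients: distance to agriculture (m) and slope (degrees).
beta0 = 0.0
betas = np.array([-0.002, -0.05])

pixels = np.array([
    [0.0, 0.0],       # adjacent to agriculture, flat terrain
    [1000.0, 10.0],   # 1 km from agriculture, 10 degree slope
])

p = deforestation_probability(pixels, beta0, betas)
print(p.round(4))
```

With negative coefficients, the probability falls as distance and slope increase, which matches the interpretation of a negative impact for proximity factors.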
According to [63], a comparison between the coefficients obtained in the model must be carried out. In this sense, standardized coefficients were determined using Equation (5):
$$\beta_i^{*} = \frac{\beta_i \, s_i}{\pi / \sqrt{3}} \tag{5}$$
where $\beta_i^{*}$ is the standardized coefficient of each variable in the model, $s_i$ is the standard deviation of each variable, and $\pi$ is approximately equal to 3.141592654.
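A hedged sketch of one common way to standardize logistic regression coefficients follows: the raw coefficient is multiplied by the standard deviation of its variable and divided by π/√3, the standard deviation of the standard logistic distribution. This form is assumed here because it matches the quantities named in the text (a standard deviation per variable and π); the function name, the use of the sample standard deviation, and the numbers are assumptions for illustration only.

```python
import math

import numpy as np


def standardized_coefficient(beta: float, x: np.ndarray) -> float:
    """Standardize a logistic regression coefficient.

    ASSUMPTION: beta * s_x / (pi / sqrt(3)), with s_x the sample standard
    deviation (ddof=1) of the explanatory variable.
    """
    s_x = float(np.std(x, ddof=1))
    return beta * s_x / (math.pi / math.sqrt(3.0))


# Hypothetical raw coefficient and factor values.
x = np.array([0.0, 2.0, 4.0])   # sample standard deviation is 2.0
beta_std = standardized_coefficient(0.5, x)
print(f"standardized coefficient: {beta_std:.4f}")
```

Standardized coefficients are unit-free, so factors measured in metres, degrees, or index values can be compared by the magnitude of their effect.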

Evaluation of the SLR Model
The total operating characteristic (TOC) was used to assess the goodness of fit of the spatial model [64]. The TOC diagram shows the capacity of the model to determine the susceptibility of land to deforestation at different spatial locations. The model sensitivity is represented by the true positive percentage, while the false positive percentage corresponds to one minus the model specificity. Both sensitivity and specificity were calculated to characterize the factors of deforestation with greater certainty [65,66].
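The following minimal sketch shows how the counts behind a TOC curve can be obtained from predicted probabilities and observed outcomes, and how an area under the curve can be derived from them. It is an illustration on hypothetical data, not the TerrSet procedure; computing the area in normalized form (true positive rate against false positive rate) is an assumption about how the AUC is obtained, and tied probabilities are not given special treatment.

```python
import numpy as np


def toc_counts(prob: np.ndarray, observed: np.ndarray):
    """Cumulative counts as the threshold moves from high to low probability.

    Returns (flagged, hits): number of pixels flagged as change
    (hits + false alarms) and number of correctly flagged pixels (hits).
    """
    order = np.argsort(-prob, kind="stable")
    obs_sorted = observed[order].astype(int)
    hits = np.concatenate([[0], np.cumsum(obs_sorted)])
    flagged = np.arange(len(prob) + 1)
    return flagged, hits


def auc_from_counts(flagged: np.ndarray, hits: np.ndarray) -> float:
    """Area under the normalized curve, by the trapezoidal rule."""
    false_alarms = flagged - hits
    if hits[-1] == 0 or false_alarms[-1] == 0:
        raise ValueError("need at least one changed and one unchanged pixel")
    tpr = hits / hits[-1]                    # sensitivity
    fpr = false_alarms / false_alarms[-1]    # 1 - specificity
    return float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2.0))


# Hypothetical predictions and observations (1 = deforested).
prob = np.array([0.9, 0.8, 0.7, 0.4, 0.3, 0.1])
observed = np.array([1, 1, 0, 1, 0, 0])

flagged, hits = toc_counts(prob, observed)
auc = auc_from_counts(flagged, hits)
print(f"AUC = {auc:.4f}")
```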
The results obtained in the SLR model of deforestation were contrasted with the qualitative and quantitative results acquired from the specialized consultation. The causative elements of deforestation complemented the mathematical modeling. The interlinkages between the proximate and underlying causes and the synergies between the techniques used in the present study were analyzed and discussed to better understand the process of forest cover loss.

Study Area
The North Pacific Basin comprises the entire territory of Sinaloa state and part of the territories of Chihuahua (11.75%), Durango (42.54%), Zacatecas (6.04%), and Nayarit (32.53%). The study basin has a surface area of 152,013 km², which is equivalent to 8.0% of the Mexican territory (Figure 2). The population of the basin is around 4,466,000 inhabitants [67]. The land-use changes due to the socioeconomic activities carried out in the basin have caused adverse effects on ecosystems, such as degradation, a decrease of aquifer levels, the alteration of the water cycle, and biodiversity loss [68].

Data
Vector, raster, and alphanumeric information was collected from different government institutions (Table 1). Thematic, topological, and geometric analyses and a series of spatial operations were applied to homogenize the information. The homogenized data have a pixel size of 100 m.

Deforestation Estimation
From 2002 to 2014, a forest cover loss of 3938.77 km² was detected in the North Pacific Basin. This forest area represented 3.95% of the total forest in 2002. In contrast, the forest cover gain was about 2507.39 km², which was characterized principally by natural recovery (Table 2). The forest recovery area represented 1.64% of the North Pacific Basin area. The balance between forest cover loss and gain was negative; therefore, the net loss of forest coverage was 1431.38 km² (Figure 3).
The annual average deforestation rate in the North Pacific Basin was 0.11%. This deforestation rate was higher than the one presented by the Global Forest Watch (GFW) program, which suggested 0.07% for the period of 2002-2014 in the study area [69]. The difference between both deforestation rates arises because the GFW program does not consider deforestation in primary and secondary deciduous, evergreen, and sub-deciduous forests. The annual average deforestation rate from 2000 to 2010, as indicated by the Food and Agriculture Organization (FAO), was 0.40% for Mexico as a whole [70], which is higher than the deforestation rate obtained in this work.

Survey Analysis
The frequency analysis of the occurrence of proximate causes identified that the main forest loss processes were agricultural expansion (53.42%), followed by infrastructure extension (20.21%) and wood extraction (16.17%) (Table 3). These results coincide with those mentioned by Geist and Lambin [13] and Monjardín-Armenta et al. [54], who also recognized that agricultural expansion and infrastructure extension are the main proximate causes of deforestation of tropical forests. The most important proximate sub-causes of deforestation were permanent cultivation (19.00%), shifting cultivation (15.18%), cattle ranching (13.15%), commercial wood extraction (11.23%), and transport infrastructure (8.36%). These five proximate sub-causes explain a total of 73.8% of the occurrence of proximate causes of deforestation in the North Pacific Basin. The frequency analysis of the occurrence of underlying causes identified that the demographic (34.85%), economic (29.26%), and policy and institutional factors (22.59%) were the main underlying forces of deforestation (Table 3). These results coincide with those mentioned by Pacheco et al. [24]. In particular, population growth (27.52%), market growth and commercialization (13.91%), policy climate (11.89%), and economic structure (10.23%) were identified as the most important underlying sub-causes of deforestation in the North Pacific Basin (Table 3).
The opinion of the people living within the affected areas corroborated the opinion of the specialized officials. In addition, the people living in the study area provided specific information about possible causes of deforestation, such as the cultivation of cannabis and poppy and the establishment of synthetic drug laboratories. These causes are very specific to the study area, and they can be considered deforestation causes in areas located close to the mountainous parts of the basin.

Land-Use Change Analysis
The dynamics of land use were estimated through the cross-tabulation matrix (Table 4). The main diagonal of the matrix indicates the land surfaces that remained stable between both dates, which are also called persistence. The land-use change analysis shows that the categories with the highest gains were agriculture (3926.95 km²), forest (2507.65 km²), and pasturelands (1759.71 km²), but these same categories also showed the greatest losses, with cover changes of 2272.28 km², 3938.7 km², and 2082.93 km², respectively. The most significant forest loss was towards agriculture, with 2492.56 km², which represented 63.28% of the total loss. Conversely, the greatest forest gain came mainly from agricultural and pastureland cover. Human settlements are one of the categories that showed the greatest increase in coverage, with an increase of 33.36% relative to its initial surface area. This category also presented the least amount of loss (19.6 km²). Other categories that showed gains were aquaculture, with 39.20%; human settlements, with 30.75%; mangroves, with 6.97%; and agriculture, with 5.6%, while the categories that showed significant losses were hydrophilic vegetation, with 23.04%; water bodies, with 7.96%; scrub, with 5.55%; pastureland, with 2.87%; and forests, with 1.44%.
The results of the cross-tabulation matrix indicated that a large area of forest was converted to agriculture. These results corroborated that agricultural expansion is one of the main causes of the deforestation processes in the North Pacific Basin. Likewise, since the human settlements category is directly linked to the extension of infrastructure, the land-use change analysis confirms that the expansion of human settlements is also one of the main causes of the deforestation process in the study area.

Adjustment and Evaluation of Spatial Logistic Regression (SLR) Model
The SLR model was adjusted to the biophysical, socioeconomic, and proximity variables that were used for the spatial and temporal analysis of the deforestation process. According to the exploratory data analysis (correlation matrix), temperature was removed from the model to avoid overestimation by the regression model and to minimize multicollinearity among the factors of deforestation [60].
The degree of adjustment of the model was calculated using the TOC statistical method (Figure 4). The area under the curve (AUC) of the TOC was 0.948. Based on this value, the spatial model discriminates reliably between true and false positives. According to [48], an AUC value of 0.7 is acceptable, a value greater than 0.8 is excellent, and a value greater than 0.9 is exceptional.
The model results indicate that the areas most likely to be deforested are those located around the already deforested ones. These areas are characterized by low forest production, gentle slopes, and low altitudes, and are suitable for perennial or shifting cultivation. Likewise, the model results showed a high probability of losing forest cover around human settlements because of the demand for land for new urban developments and their associated service infrastructure.
Figure 5a shows that a large part of the forest cover located in the northwest area of the basin has a probability of being deforested. This area comprises the municipalities of El Fuerte, Sinaloa de Leyva, Guachochi, Chinipas, Batopilas, and Urique. Similarly, Figure 5b reveals another area susceptible to deforestation located in the north-central part of the basin. This area comprises the municipalities of Culiacan, Badiraguato, Mocorito, Tamazula, Topia, and Canelas. Furthermore, the spatial model suggests that the municipalities of El Rosario, Escuinapa, Rosamorada, Ruiz, Suchil, and Mezquital are the areas most prone to losing forest cover in the southeast of the basin.
According to the results of the SLR model, the most significant factors were forest production, slope, altitude, distance from localities with fewer than 2500 inhabitants, distance from agriculture, and distance from natural protected areas. These factors showed a different impact on the spatial model (Table 5).
In particular, most of the proximity factors showed a negative impact. A negative impact indicates that the smaller the distance between the forest cover and roads, agriculture, pastures, mines, hydrography, and human settlements, the higher the probability of deforestation. An exception was the distance from natural protected areas, which showed a positive impact: the smaller the distance, the lower the probability of forest cover loss.
The standardized coefficients ($\beta^{*}$) obtained for the socioeconomic factors show that population growth, marginalization index, and population density had a nonsignificant impact on forest cover loss in the basin. The socioeconomic factors showed a weak statistical relationship, since they do not present a single independent effect on deforestation processes, as reported in other studies [53,71].
Regarding the biophysical factors, most of them showed a negative significant impact. Therefore, the most suitable areas for deforestation were identified at lower-altitude terrains. According to spatial model results, gentle slope areas or low forest production areas were also the most suitable areas for deforestation. Based on the model results, biophysical and proximity factors play essential roles in the forest cover and land-use changes, as has also been suggested by Lambin et al. [72].
The results of the specialized consultation about the causative elements of deforestation are consistent with other studies conducted in different parts of the world [13,20-22,54,73]. The consultation technique identified the main causes of deforestation in the North Pacific Basin. In the present study, agricultural expansion, urban development, and rural and industrial infrastructure were driven mainly by demographic factors, such as population growth, and by the economic development of the region.
The opinion of people who live in the deforested areas provided a reference about the specific causes of deforestation in some places. They revealed essential information about harmful practices of authorities, such as corruption, as well as unsustainable logging practices, inadequate property-rights regimes for land possession and use, the lack of culture and education, and limited economic development. The opinions also showed that some areas that were previously used for forestry are now used for agriculture. In mountain areas, the deforested areas are used for the cultivation of marijuana and poppy and to establish laboratories for the production of synthetic drugs, which have caused migrations and a series of social trigger events.
Table 6 shows the factors that were related to some proximate or underlying causes or sub-causes of deforestation. The socioeconomic factors were directly related to demographic pressure, particularly to population growth, since the larger the population, the greater the demand for wood and for land for agricultural, urban, and industrial purposes. Similarly, the marginalization index reflects different concerns, such as the economic situation, education level, and housing situation of the population. Therefore, a higher value of the marginalization index is related to larger areas deforested for domestic use.
The biophysical factors were associated with topographical characteristics, such as altitude and slope, and climatic features, such as soil moisture and precipitation. Thus, the biophysical factors restricted the land-use change, in particular the underlying deforestation causes related to demographic pressures, such as the expansion of agricultural borders, road infrastructure, and urban and rural settlement expansion.
The proximity factors related to forest cover loss were the distance from localities with fewer than 2500 inhabitants, the distance from the agricultural expansion border, and the distance from natural protected areas. The SLR model results confirm that the probability of forest cover loss increases when the distance between the forest and human-modified areas decreases. One disadvantage of the SLR model was the difficulty of spatially representing the interlinkage between the proximate or underlying causes or sub-causes of deforestation and some factors of deforestation. For this reason, the SLR model was complemented with the qualitative and quantitative information provided by the surveys. In the present study, the social trigger events, such as social disorder and drug trafficking, as well as policy climate and other cultural and technological factors, were not spatially represented, but they were considered in the forest cover loss process.

Conclusions
The present study combines two different techniques for identifying and describing the proximate and underlying deforestation causes in the North Pacific Basin: SLR spatial modeling and specialized consultation. The combined use of both techniques provided some advantages over existing causative deforestation studies, such as characterizing and representing the temporal and spatial behavior of the deforestation process. The deforestation model identified the forest areas that were most susceptible to deforestation. Such information could be used to identify and flag priority forest management areas.
Most deforestation studies report only numerical or spatial behavior and fail to characterize the qualitative causes of the deforestation process. In this sense, the present study can be used to describe the spatial behavior of deforestation processes, and also as an alternative methodology that can be extrapolated to assess the factors of deforestation in other tropical regions. Understanding land-use changes is critical for reducing uncertainty about the spatial and temporal occurrence of future deforestation. This study contributes to adequate forest planning, which could mitigate greenhouse gas emissions. In addition, the present study could be the basis for the generation of forest restoration and management programs and the conservation of natural resources.
Finally, the present study contributes to the development of mitigation strategies for deforestation and forest degradation, because the root proximate and underlying causes of deforestation were identified at a local level. By using the proposed methodology, decision-makers could define governmental priorities, which in turn could allow tracking and reducing the specific activities that produce changes in forest cover over time.