Estimation of Current and Future Suitable Areas for Tapirus pinchaque in Ecuador

: At present, climate change is a direct threat to biodiversity and its effects are evidenced by an increasingly accelerated loss of biodiversity. This study identiﬁed the main threats presently facing the Tapirus pinchaque species in Ecuador, generated predictive models regarding its distribution, and analyzed the protected areas as a conservation tool. The methodology was based on a literature review and the application of binary predictive models to achieve these objectives. The main results indicate that the T. pinchaque is seriously threatened, mainly by changes in land use. In addition, three models were selected that show current and future suitable areas for the conservation of the species. Its current distribution amounts to 67,805 km 2 , 33% (22,872 km 2 ) of which is located in 31 of the 61 protected areas. Finally, it is important to take timely actions focused on biodiversity conservation, considering the importance of balance in ecosystems to the humans dependent thereof, and the results regarding the changes in the current and future distribution areas of the mountain tapir are a great contribution to be used as a management tool for its conservation.


Introduction
Biodiversity is a multidimensional, interdependent, and complex network, which as a whole and in a functional way provides ecosystem goods to humans [1,2]. All these types of life generate a balance between the different ecosystems, allowing a correct functionality, contributing to the generation of services that are used by human beings [3,4]. At present, the loss of biodiversity is one of the great challenges facing the planet. Although the reasons are varied, human activity is one of the main factors influencing the extinction of species [5]. Biodiversity is not only important because of its natural value as part of life support systems, but also because it has a very large economic potential which has not yet increased cattle ranching, agriculture, logging, and the exploitation of natural resources, such as gold and copper [29,30,[38][39][40]. These threats have introduced problems, including a lack of connectivity between habitats, which in recent years has made it impossible to generate successful reproductive processes. Another less serious problem is hunting for food, medicinal, or ritual purposes [28,31].
One of the main problems worldwide for implementing projects or programs aimed at biodiversity conservation is the lack of economic resources [35][36][37]. On the other hand, the scarcity of studies means that there is not enough data and information to help develop conservation strategies [38]. In this context, the objectives of the present study were: (1) to apply ecological niche models to estimate the climatically suitable areas for the species at present and under the possible effects of climate change in the future; (2) to quantify the suitable areas predicted by the models in the National System of Protected Areas (SNAP, for its initials in Spanish) of Ecuador; (3) to estimate the changes (stability, loss and gain of suitable areas) for T. pinchaque in the study area.

Study Area
The study area considered in this research for modeling purposes is centered on the Andes Mountains of Ecuador, with exact distribution zones of T. pinchaque, i.e., the Andean ecosystems, such as western montane forest, paramo, inter-Andean scrubland, and the eastern montane forest in Ecuador ( Figure 1). This zone includes 19 provinces: Napo, Morona Santiago, Azuay, Bolívar, Cañar, Carchi, Chimborazo, Cotopaxi, El Oro, Esmeraldas, Loja, Orellana, Pastaza, Pichincha, Santo Domingo de los Tsachilas, Sucumbíos, Tungurahua, and Zamora Chinchipe [30,39]. cattle ranching, agriculture, logging, and the exploitation of natural resources, such as gold and copper [29,30,[38][39][40]. These threats have introduced problems, including a lack of connectivity between habitats, which in recent years has made it impossible to generate successful reproductive processes. Another less serious problem is hunting for food, medicinal, or ritual purposes [28,31]. One of the main problems worldwide for implementing projects or programs aimed at biodiversity conservation is the lack of economic resources [35][36][37]. On the other hand, the scarcity of studies means that there is not enough data and information to help develop conservation strategies [38]. In this context, the objectives of the present study were: (1) to apply ecological niche models to estimate the climatically suitable areas for the species at present and under the possible effects of climate change in the future; (2) to quantify the suitable areas predicted by the models in the National System of Protected Areas (SNAP, for its initials in Spanish) of Ecuador; (3) to estimate the changes (stability, loss and gain of suitable areas) for T. pinchaque in the study area.

Methods
For a better illustration and understanding, the methodological process is described, considering the objectives established in this study. Initially, a step-by-step description is given of the process followed for the generation of distribution models for the species under study. Then, the process adopted to quantify the predicted suitable areas for T. pinchaque in the SNAP of Ecuador is described. Finally, we describe the process used to estimate changes in suitable areas by comparing the current and future models for T. pinchaque in the study area.

Predictive Model Occurrence Data
A database with validated occurrence records for T. pinchaque was constructed from the scientific literature, established within its native distribution limits, specifically for three Andean countries: Ecuador, Colombia, and Peru [29,40]. These records were previously obtained from biodiversity databases, such as GBIF (GBIF; www.gbif.org, accessed on 15 September 2021), MaNIS (www.manisnet.org), and the Tapir Specialist Group (IUCN/SSC TSG-Ecuador). In addition, occurrence records of T. pinchaque obtained through field work and monitoring projects in Colombia, Peru and Ecuador were included; a total of 1424 presence records were initially included in the database, including records from 1950 to the present (Supplementary Materials). Each occurrence record was projected on the geography through its coordinates (longitude, latitude) in decimal degrees based on the WGS 84 datum; this process was carried out using ArcMap 10.5. Each locality was verified and validated, eliminating records that presented geographical inconsistencies. Then, a cleaning protocol was applied, based on the methodology of Simoes [41], considering the recommended standards for the creation of ecological models [42,43]. This consisted of the following steps: (1) removing duplicate records, (2) verifying records with geographic inconsistencies, and (3) reducing areas with a high density of records, linked to oversampling close to accessible areas (settlements, roads, rivers, etc.) [44,45], in order to mitigate spatial autocorrelation and overfitting in the models [46,47]. This process was performed using the spThin package in R [48], managing a minimum distance of 1 km between each record. In this context, the cleaned database included 366 presence records ( Figure 1). Finally, the aforementioned records were randomly split into two datasets, evaluation (20%) and calibration (80%), using the split_data function of the ellipsemm package in R [41].

Environmental Data
Nineteen bioclimatic variables were used at a spatial resolution of~1 km 2 from Worldclim version 1.4 [49], available at (http://www.worldclim.org/, accessed on 15 September 2021). Four variables were excluded (Bio 8, 9, 18, and 19), because they introduce spatial errors associated with the combination of temperature and precipitation information [50]. To forecast suitable areas of T. pinchaque distribution in the future, we used layers derived from projections of general circulation models (GCMs), for the average period between 2041 and 2060 (2050). Three GCMs (CCSM4, HadGEM2-ES, and MIROC5) were selected considering the variability and uncertainty present in the mathematical models used to generate future climate data [51,52]. In addition, two representative concentration trajectories (RCP 4.5 and RCP 8.5) were used for each GCM. These trajectories represent conservative (RCP 4.5) and pessimistic (RCP 8.5) greenhouse gas (GHG) emissions [53].
The dimensionality of the environmental dataset was reduced to mitigate the effects of the existing spatial autocorrelation between them, applying a Pearson coefficient (r > 0.8) between pairs of variables; this process was performed using the GUI version of the ntbox package [54]. Highly correlated variables were removed based on the previously mentioned threshold. Finally, five variables were selected: Bio 1, 2, 3, 12, and 15.

Calibration Area
For ecological niche models, the delimitation of a calibration area (M) [55] is a crucial and cautious process, mainly because its size influences the predictions generated by the algorithms [56], which implies possible overfitting and overpredictions in areas suitable for the species [57]. In view of this, a search was initially conducted in the current literature on the factors that influence the mobility of T. pinchaque. The results corroborated that altitude is one of the factors that restrict its range of mobility [58], with reports for T. pinchaque at altitudes ranging from 1200 to 4700 masl [34,59,60].
Because of the above, a digital terrain model (DEM) was downloaded at a resolution of 250 m, derived from the Shuttle Radar Topographic Mission (SRTM). The DEM was reclassified to identify the areas present between 1200 to 4700 masl. A hypothesis was then made about the accessibility area (M), using a 40 km buffer. This facilitated the capture of the altitudinal range suitable for T. pinchaque ( Figure 1). Finally, a projection area (G) was defined, taking into account Ecuador, and considering the capacity of the ecological niche models to be projected in space (different geographic zones from those used for calibration) and periods (years) [41,55].

Model Calibration
For the construction of the models, the maximum entropy algorithm implemented in MaxEnt [61], through the kuenm package [62], was used. Aware that this type of algorithm, used for creating ecological models, is susceptible to configurations and parameterizations [63], which can cause problems related to over-fits and over-predictions in the final results [57], a rigorous calibration process was applied. This calibration consisted in the creation of candidate models, using a single set of variables (Set 1), 13 different regularization multiplier (RM) values (0.1-1 with intervals of 0.1, 1-4 with intervals of 1), as well as 29 possible combinations of feature classes (FC) (l = linear, q = quadratic, p = product, t = threshold, and h = hinge). The imposed parameters and configurations allowed us to evaluate the level of complexity of the models [64].

Evaluation and Creation of Final Models
Three metrics were used to evaluate the candidate models, allowing for the selection of the model with the best performance and the most simplicity in terms of parameterization and configuration. Initially, the significance ratio was calculated using partial ROC [65], a variant that allows for the resolution of issues associated with the traditional AUC, which is not recommended in ecological niche models. The resulting partial receiver operating characteristic (ROC) values range between 1 and 2, with values greater than 1 indicating good model performance. As a result, the predictive capacity was determined using the omission rate (OR) with a 5 percent tolerance for error [66]. The OR function enables the estimation of the fraction of presences predicted to be absent or false negatives. The Akaike information criterion for small samples (AICc) was then used to determine complexity [64]. This metric enables the assessment of the effect of the number of parameterizations and configurations on the generation of overfitted or highly complex models. As a result, only models with delta AICc < 2 values were chosen. After selecting the best model, final models for T. pinchaque were created for the present and future using 100% of the presence data, 10,000 background points, a maximum of 500 interactions, and ten replicates with logistic output format.
Consequently, models were projected from the calibration area (M) to the projection area (G), both for the current context and for future models, with their respective CPRs. During the model projection process, MaxEnt was configured to prevent extrapolation, avoiding possible overpredictions [57] in the final models.
Finally, seven resulting models were obtained: (1) current model; (2)  The median was applied to the replicates of each model for the present period and future scenarios. Subsequently, the median of the medians was calculated for each RCP, thus generating consensus models, which summarize the variability existing in all the GCMs used [67]. As a result of this process, three definitive models were obtained: (1) current model; (2) consensus model RCP 4.5; and (3) consensus model RCP 8.5. These models were converted to binary, applying a threshold of 5% allowable omission error, considering the calibration data [66].

Representativeness of Predicted Suitable Areas in Ecuador's SNAP
The predicted suitable areas in the SNAP of Ecuador were quantified by applying an overlap between the models generated for T. pinchaque and a vector file of the SNAP of Ecuador. This file was obtained from the spatial data infrastructure of the Ministry of Environment, Water and Ecological Transition of Ecuador (http://ide.ambiente.gob.ec/ mapainteractivo/, accessed on 15 September 2021). The overlapping process was carried out using ArcMap 10.5. It is important to highlight that the SNAP in the context of Ecuador is considered one of the most effective methods of in situ conservation, given that it allows the protection and preservation of sensitive ecosystems and the biodiversity present in them. In addition, one of the tangible results is the success in counteracting the effects associated with deforestation and soil changes [25]. Hence, it is important to carry out this process of quantification of the areas predicted by the models as a conservation strategy for T. pinchaque.

Habitat Changes in the Distribution
The current binary model and future consensus models for RCP 4.5 and 8.5 were compared to estimate both the proportion and relative number of stable, lost, and gained pixels for T. pinchaque in its predicted range. This allowed the generation of models that report areas with potential future gains, losses, and stability. This process was performed through the BIOMOD_RangeSize function incorporated in the biomod2 package in R [68].

Model Validation
Using the combination of 29 feature classes, 13 regularization multipliers, and a single set of variables, 377 candidate models for T. pinchaque were created and evaluated. Finally, three candidate models were statistically significant and met the AICc criteria ( Table 1). The best candidate model used in the creation of the final models had an AUC ratio: 1.246, showing that it is a model with excellent predictive power, with an OR = 0.045, AICc = 8443.803, and ∆AICc = 0, in addition to the following configurations and parameterizations: RM = 0.7 and FC = qp (Table 1). Sensitivity values representing the fraction of correctly predicted presences or true positives achieved by the models were obtained. In this context, the rate of presences correctly predicted by the models was 99.7% (with one presence not predicted) taking as reference the calibration area (M). Finally, with respect to the models generated in the projection area (G), the hit rate was 95% (10 non-predicted presences). It should be noted that in order to evaluate the sensitivity, the models had to be previously converted into binary, while the threshold used was 5% of the omission error. The sensitivity values obtained demonstrate the predictive power that the models had in predicting the presence data of T. pinchaque.  Abbreviations: NP = national park; ER = ecological reserve; PPA = private protected area; BR = biological reserve; WPR = wildlife production reserve; DAPA = decentralized autonomous protected area; NRA = national recreation area; WR = wildlife refuge; MCA = municipal conservation area; CPA = community protected area; GR = geobotanic reserve.

Model 2: Future
The study area was divided into three zones: north, center, and south ( Figure 1). In addition, these models suggest potential losses, gains, and stability of suitable areas for T. pinchaque habitats (Figure 2). Among the remarkable results, it can be mentioned that by the year 2050, the RCPs suggest that a large part of the area will be maintained, with values of 98.8% (66,

Discussion
In this study, a modeling process was performed based on the recommended standards [42,43], which meant that possible errors associated with sampling bias and spatial aggregation could be mitigated [44,46]. The process was able to accommodate the dimensionality of the set of environmental variables, the use of different settings and parameterizations [62], and the complexity of the models [64]. In this context, the models obtained after the selection process were highly significant (p < 0.001), had low omission rates (OR = 0.045), and were less complex (AICc = 8443.803, ∆AICc = 0). This is a clear difference compared to previous studies [29,69], where an exhaustive process was not applied in the selection of configurations and parameterizations in the calibration process, which is known to be very important for the final results of the models [63].
The results reported by the model under current conditions show that there is 67,805 km 2 of suitable area for T. pinchaque in the Andean region of Ecuador, with 33.7% within the current SNAP, which is an acceptable range for the conservation of the species. Globally, 7.45% of the planet's land surface is interconnected through PAs [70], which have contributed to biodiversity monitoring and conservation. For this reason, the expansion or creation of new protected areas in Ecuador is considered key as a beneficial in situ conservation strategy [27], because it would improve the connection between natural spaces in which T. pinchaque habitats, thus mitigating the impact of climate change observed in these areas [71]. The model under current conditions establishes that ~66% is outside the SNAP, which indicates insufficient connectivity in its territory for the preservation of this iconic species as a means of sustaining biological richness [69]. Its ecological On the other hand, with respect to losses, the models report changes of 1.2% (835 km 2 ) and 1.4% (925 km 2 ) in the RCP 4.5 and 8.5 models, respectively. This suggests that most of the suitable areas that will show reductions are between 1060 and 2464 masl. These losses are evident in eight SNAP PAs: Cayambe Coca National Park, Cofan Bermejo Ecological Reserve, El Quimi Biological Reserve, Llanganates National Park, Sangay National Park, Siete Iglesias Municipal Conservation Area, and Sumaco Napo-Galeras National Park. Notably, El Zarza Wildlife Refuge is the most affected PA with a total reduction of suitable areas for T. pinchaque in both RCP models.
In general, the models project gains in suitable areas along the western and central zone of the Andean region of Ecuador, specifically in pixels of values with a minimum of 1087 and a maximum of 5863 masl. These gains were reported in nine SNAP PAs, highlighting a more significant gain in both RCP models in the following PAs: Chimborazo Wildlife Production Reserve, Los Ilinizas Ecological Reserve, and Cotacachi Cayapas National Park.

Discussion
In this study, a modeling process was performed based on the recommended standards [42,43], which meant that possible errors associated with sampling bias and spatial aggregation could be mitigated [44,46]. The process was able to accommodate the dimensionality of the set of environmental variables, the use of different settings and parameterizations [62], and the complexity of the models [64]. In this context, the models obtained after the selection process were highly significant (p < 0.001), had low omission rates (OR = 0.045), and were less complex (AICc = 8443.803, ∆AICc = 0). This is a clear difference compared to previous studies [29,69], where an exhaustive process was not applied in the selection of configurations and parameterizations in the calibration process, which is known to be very important for the final results of the models [63].
The results reported by the model under current conditions show that there is 67,805 km 2 of suitable area for T. pinchaque in the Andean region of Ecuador, with 33.7% within the current SNAP, which is an acceptable range for the conservation of the species. Globally, 7.45% of the planet's land surface is interconnected through PAs [70], which have contributed to biodiversity monitoring and conservation. For this reason, the expansion or creation of new protected areas in Ecuador is considered key as a beneficial in situ conservation strategy [27], because it would improve the connection between natural spaces in which T. pinchaque habitats, thus mitigating the impact of climate change observed in these areas [71]. The model under current conditions establishes that~66% is outside the SNAP, which indicates insufficient connectivity in its territory for the preservation of this iconic species as a means of sustaining biological richness [69]. Its ecological role in sensitive environments, such as the páramos, is vital because it is a seed disperser and an excellent bioindicator of the conservation status of the ecosystems [34,72,73]. An example that demonstrates its importance is the Llanganates-Sangay ecological corridor, which has improved connectivity between the Llanganates and Sangay national parks, which are protected areas that have suitable areas that serve as habitats for T. pinchaque [72,74].
With respect to future predictions, the models suggest losses of~2% in areas suitable for T. pinchaque distribution; these values are different from those reported by Ortega-Andrade [31], where they predict losses approximately equal to 17%. These differences can be addressed by the different methodologies used in both studies. Ortega-Andrade uses an omission error of 10%, which implies a significant reduction of adequate areas compared to the 5% applied in this study. Furthermore, it should be noted that a key difference was the use of different configurations and parameterizations in the MaxEnt algorithm through the kuenm package [62], which helped us avoid models with high complexity [64]. Finally, future models in this study have provided estimates for the total loss of suitable areas for T. pinchaque in the El Zarza Wildlife Refuge which could be associated with the influence of climate change in Andean regions. Therefore, it is vital to start taking initiatives to improve the adaptation of the species in the future in this protected area.
On the other hand, in terms of area gain, the models suggest that the western and central zones of the Andean region in Ecuador would be favored by these changes. However, these gains are reported in areas with pixels that have altitude values higher than what is currently known as a habitat for the species [34,59,60]. This can be explained in two ways. The biological explanation suggests that there is a relationship between altitudinal range and biodiversity. For example, an increase in temperature would induce a displacement towards areas that were previously colder, where environmental circumstances may be more conducive to the adaptation of populations in the future [75]. In this context, the displacement of fauna towards higher altitudinal zones is a relatively rapid process [76], which is not the case with the transition of flora towards higher altitudinal ranges [77]. This implies that T. pinchaque, being an herbivorous mammal [60], will in the future be limited with respect to its diet by not being able to rely on vegetation corresponding to its feeding habits, which could eventually lead to poor adaptation to higher altitudinal floors. The second explanation is that these areas are possible over-prediction zones generated by the models [57], which could be associated with the influence of size on the calibration area (M) used [56]. It is important to note that there is currently no standard methodology to minimize the drawbacks of this, so it is recommended to interpret with caution the areas reported for the species in areas with altitudinal ranges above 4700 masl. However, these areas could not be ruled out entirely, as studies suggest that climate change will influence Andean areas with higher altitudes in Ecuador [78,79], leading them to become increasingly abundant [78,[80][81][82][83].
Multiple recommendations have been made in recent decades to improve the quality of niche models [34,35], including adequate cleaning of presence registers [33], delimitation of the calibration area (M) [48,49], incorporating a wide selection of parameters and configurations to minimize model complexity [56], and the use of multiple statistical criteria during the evaluation process [54]. These recommendations were taken into account when developing the final models, resulting in the generation of robust models at the predictive level. This study provides a deeper understanding of T. pinchaque's distribution in Ecuador's tropical Andes, and provides information on the species' conservation status at a geographic scale, allowing for the identification of areas vulnerable and susceptible to future climate change effects. The study's primary limitation was using only bioclimatic variables in the model generation process; this is justified because biotic interactions are imperceptible at coarse scales, such as those used in this study. Nonetheless, it is recommended that researchers continue developing models for T. pinchaque at finer scales using diverse environmental datasets (e.g., [73][74][75]) and global circulation models. Additionally, it would be necessary to analyze T. pinchaque's biotic interactions with other mammalian species, their feeding patterns, the impact of land use change, and various anthropogenic activities.

Conclusions
Ecological niche models based on a rigorous selection and evaluation protocol were successfully used. The MaxEnt algorithm was used to estimate the potentially suitable areas for the distribution of T. pinchaque in current and future conditions. Robust models with excellent performance were obtained, considering the best configurations and parameterizations from the evaluation process. The geographic information presented in this study can be used in the short term to establish mitigation and adaptation strategies for climate change, to aid the design and establishment of connectivity zones between protected areas, and to evaluate biotic interactions with other tropical Andean species and environmental education programs based on the importance of the species in the conservation of Andean ecosystems in Ecuador.
Under current conditions, the models report suitable areas for the distribution of T. pinchaque in 19 provinces in Ecuador and 31 areas of Ecuador, representing a total of 67,805 km 2 . Under future conditions, these areas would remain stable, being equivalent to~98% in both RCPs. In addition, future models report partial losses of suitable areas in seven protected areas and total losses in the El Zarza Wildlife Refuge, suggesting that environmental changes induced by future climate change will influence the species habitat. However, there are also estimated gains of suitable areas in nine protected areas, which allows us to analyze the importance associated with the increase in the proportion of protected areas in the SNAP of Ecuador for the conservation of T. pinchaque in the future.
The main threat identified for the tapir is the destruction of its habitat due to land use changes. With the projections made, we can show that, to a great extent, the in situ conservation strategy, i.e., the protected areas, play an essential role in the conservation processes of the species. However, it is essential to urge wildlife authorities and managers to join efforts to create strategies that contribute to connecting the main tapir habitats, such as, for example, biological corridors. This will facilitate the linking of dispersed populations, increasing the number of individuals and improving their genetic variability. Finally, it is important to continue with more studies that contribute to biodiversity conservation, considering the importance of balance in ecosystems to the humans dependent thereof.