Current and Future Potential Distribution of the Xerophytic Shrub Candelilla ( Euphorbia antisyphilitica ) under Two Climate Change Scenarios

: Candelilla ( Euphorbia antisyphilitica Zucc.) is a shrub species distributed throughout the Chihuahuan Desert in northern Mexico and southern of the United States of America. Candelilla has an economic importance due to natural wax it produces. The economic importance and the intense harvest of the wax from candelilla seems to gradually reduce the natural populations of this species. The essence of this research was to project the potential distribution of candelilla populations under di ﬀ erent climate change scenarios in its natural distribution area in North America. We created a spatial database with points of candelilla presence, according to the Global Biodiversity Information Facility (GBIF). A spatial analysis to predict the potential distribution of the species using Maxent software was performed. Thirteen of 19 variables from the WorldClim database were used for two scenarios of representative concentration pathways (RCPs) (4.5 as a conservative and 8.5 as extreme). We used climate projections from three global climate models (GCMs) (Max Planck institute, the Geophysical Fluid Dynamics Laboratory and the Met O ﬃ ce Hadley), each simulating the two scenarios. The ﬁnal predicted distribution areas were classiﬁed in ﬁve on-site possible candelilla habitat suitability categories: none ( < 19%), low (20–38%), medium (39–57%), high (58–76%) and very high ( > 77%). According to the area under the curve (0.970), the models and scenarios used showed an adequate ﬁt to project the current and future distribution of candelilla. The variable that contributed the most in the three GCMs and the two RCPs was the mean temperature of the coldest quarter with an inﬂuence of 45.7% (Jackknife test). The candelilla’s distribution area for North America was predicted as approximately 19.1 million hectares under the current conditions for the high habitat suitability; however, the projection for the next ﬁfty years is not promising because the GCMs projected a reduction of more than 6.9 million hectares using either the conservative or extreme scenarios. The results are useful for conservation of the species in the area with vulnerable wild populations, as well as for the selection of new sites suitable for the species growth and cultivation while facing climate change.

Based on GCMs and RCPs, it is possible to estimate SDMs for candelilla in North America by using points of presence of the species. In this sense, the Global Biodiversity Information Facility (GBIF) is one of the databases with a large set of species occurrences on the planet, nevertheless; many of these data consist of presence records provided by museum or herbarium collections [27]. In these cases, various authors consider that the methods of collecting the field information are rarely known, so that absences cannot be inferred with certainty [23,[28][29][30], there are some methods that may help to reduce bias for noisy data, such as high-resolution satellite images, using information of others members of the community or carrying out field verification [21]. Despite this issue, the GBIF also includes a large number of sites that come from well-documented inventory and monitoring schemes, such as in the case of the National inventory of candelilla in Mexico. This inventory was carried out for northern Mexico during 2016 [31] and the GBIF includes 348 sites of presence of candelilla from it. In addition, the GBIF also includes 248 research observation sites of candelilla from 2016 to 2019 reported from previous research [27]. These data increase the certainty of the presence of candelilla in North America and provides a reliable basis for the development of Eco-geographic models of distribution of the species in this region.
The distribution models are known according to their objectives as bioclimatic models, ecological niche models or habitat models [32][33][34], which consider mathematical algorithms created with variables which allows evaluating the potential geographic distribution of the species [35]. These algorithms are created based on data of habitat presence or absence and its relationship with current conditions and future probabilities of the behavior of some variables, such as the climate variables [35,36].
There are many methods that may be used for species distribution modeling, such as Markov random fields, mixture models or logistic regression [37,38]. One of the most used and reliable methods is the maximum entropy approach (Maxent) [21,39,40]. The models, generated by Maxent, have a natural probabilistic interpretation, giving a smooth transition from most to least suitable conditions, facilitating its interpretation [21,40].
The present research has the main objective of modeling the current and future potential distribution of candelilla (Euphorbia antisyphilitica Zucc.) in North America, based on two RCPs and employing three GCMs [41,42]. These SDMs would be useful to assist in the conservation of vulnerable wild populations or in the cultivation the species in areas where a greater adaptability of the species to climate change may be achieved [32,34].

Materials and Methods
The study area comprehends from central Mexico up to the southeast of the United States of America, in the region known as the Chihuahuan Desert ( Figure 1).
A total of 406 records of E. antisyphilitica presence were downloaded from the GBIF, 348 were located in Mexico and 58 corresponds to the United States of America ( Figure 1) [27]. Furthermore, 27 verification field sites, located in the Mexican states of Durango, Chihuahua and Coahuila, were used to validate occurrence points from the GBIF.
Nearby points were eliminated using a 1.5-km buffer to prevent the distribution bias and overlapping points inside of each cell.
To determine the influence of the environmental variables on the distribution of E. antisyphilitica, we included in the model 19 bioclimatic variables from the period 1950-2000. These variables had a 30-s (ca. 1 km) spatial resolution and were downloaded from the WorldClim dataset [43]. A total of 406 records of E. antisyphilitica presence were downloaded from the GBIF, 348 were located in Mexico and 58 corresponds to the United States of America ( Figure 1) [27]. Furthermore, 27 verification field sites, located in the Mexican states of Durango, Chihuahua and Coahuila, were used to validate occurrence points from the GBIF.
Nearby points were eliminated using a 1.5-km buffer to prevent the distribution bias and overlapping points inside of each cell.
To determine the influence of the environmental variables on the distribution of E. antisyphilitica, we included in the model 19 bioclimatic variables from the period 1950-2000. These variables had a 30-s (ca. 1 km) spatial resolution and were downloaded from the WorldClim dataset [44].
To reduce multi-collinearity among the 19 bioclimatic variables, highly correlated variables (r ≥ 0.85 Pearson correlation) were eliminated. The reduction of predictor variables resulted in the inclusion of 13 variables (Table 1).  To reduce multi-collinearity among the 19 bioclimatic variables, highly correlated variables (r ≥ 0.85 Pearson correlation) were eliminated. The reduction of predictor variables resulted in the inclusion of 13 variables (Table 1). The highlighted variables were selected in the multi-collinearity test, and used in modeling.
The three GCMs used were the MPI-ESM-LR (Max Planck Institute) [44], the GFDL-CM3 (Geophysical Fluid Dynamics Laboratory) [45] and the HADGEM2-ES (Met Office Hadley) [46]. The future climate data of two RCPs for a medium-term projection  were also included [40,47]. These were the RCP of 4.5 W/m 2 of low emissions (CO 2 ) with a stable tendency, and the RCP of 8.5 W/m 2 of high emissions (CO 2 ) with increasing tendency [48]. The two RCPs and the three GCMs were selected based on the experience in the use of the models by the "Computer Department for the Atmospheric and Environmental Sciences" during the generation of the Weather Atlas for Mexico [49]. A maximum entropy model was used (Maxent software-Version 3.4.1) [50] because it has been shown to perform the SDMs well, relative to other modeling methods [21,39,47]. Maxent also uses presence-only data to predict the distribution of a species based on the theory of maximum entropy [50]; furthermore, Maxent has a natural probabilistic interpretation, giving a smooth transition from most to least suitable conditions, which can be easily interpreted [38,51].
To calibrate the model, a Bootstrap resampling algorithm of Maxent was used [50]. We selected 75% of the data for model training and 25% for model testing [52], keeping other values as default.
Jackknife analysis was performed to determine the variables that reduce the model reliability when omitted [53]. We also used the area under the receiving operator curve (ROC-AUC) to evaluate the model performance. The ROC-AUC ranges from 0 to 1, where values near 1 indicate a good performance of the model and a value of 0.50 indicates the model does not perform better than random [47,54]. Meanwhile, a value of 1.0 indicates perfect discrimination [47,54,55]. The model with the highest UAC value was considered the best for assessing the potential distribution of Candelilla [52].
The jackknife approach excludes one variable at a time when running the model. It provides information on the importance of each variable to the model in terms of how effective each variable is at explaining the species distribution and how much unique information each variable provides [56]; meanwhile, the AUC is a ranked approach for assessing model fit that determines the habitat suitability location will be ranked higher than a random background location [50,56]. These random background locations serve as pseudo-absences for all analyses in Maxent [56].

Projections of Current Candelilla's Presence
The average value of AUC (0.970) (±0.001) indicated a high discrimination for candelilla's current distribution model in North America (Figure 2), it performed better than a random model [21,26]. The variables that showed the highest contribution to the SDMs were the mean temperature of coldest quarter (Bio 11) (45.7%), the precipitation seasonality (coefficient of variation) (Bio 15) (13.3%), the precipitation of coldest quarter (Bio 19) (11.6%) and the mean temperature of driest quarter (8.7%) and precipitation of driest month (4.7%). The cumulative contribution of these six variables was of 84%. The 13 variables considered for the model integrated a cumulative value of 99% (Table 1). These results showed that winter temperatures and precipitations have also a high influence in the candelilla's distribution in the study region. The Jackknife analysis showed that the mean temperature of coldest quarter has the highest relative importance as independent variable over the candelilla distribution model. In addition, it indicated that the environmental variables related to temperature (isothermality (BIO2/BIO7) (×100) and temperature seasonality (standard deviation ×100) are important for the candelilla's distribution model (Figure 3). The estimated area for the medium and low habitat suitability classes predicted a large area of potential distribution of candelilla in North America. This area included the states of Tamaulipas, Jalisco, western Zacatecas and western Guanajuato in Mexico, as well as southeastern Texas in the USA. Currently there is no evidence of presence of candelilla in this area [9,27,32,59]. Therefore, the current projection of candelilla's distribution seems to match better with the high habitat suitability class (19,166,820 ha), which includes the states of Coahuila, northeast Durango, Easter Zacatecas and San Luis Potosi in Mexico and a small area in the south of Texas ( 2 and Figure 4) [9,32]. The Jackknife analysis showed that the mean temperature of coldest quarter has the highest relative importance as independent variable over the candelilla distribution model. In addition, it indicated that the environmental variables related to temperature (isothermality (BIO2/BIO7) (×100) and temperature seasonality (standard deviation ×100) are important for the candelilla's distribution model (Figure 3).  The Jackknife analysis showed that the mean temperature of coldest quarter has the highest relative importance as independent variable over the candelilla distribution model. In addition, it indicated that the environmental variables related to temperature (isothermality (BIO2/BIO7) (×100) and temperature seasonality (standard deviation ×100) are important for the candelilla's distribution model (Figure 3). The estimated area for the medium and low habitat suitability classes predicted a large area of potential distribution of candelilla in North America. This area included the states of Tamaulipas, Jalisco, western Zacatecas and western Guanajuato in Mexico, as well as southeastern Texas in the USA. Currently there is no evidence of presence of candelilla in this area [9,27,32,59]. Therefore, the current projection of candelilla's distribution seems to match better with the high habitat suitability class (19,166,820 ha), which includes the states of Coahuila, northeast Durango, Easter Zacatecas and San Luis Potosi in Mexico and a small area in the south of Texas ( 2 and Figure 4) [9,32]. The estimated area for the medium and low habitat suitability classes predicted a large area of potential distribution of candelilla in North America. This area included the states of Tamaulipas, Jalisco, western Zacatecas and western Guanajuato in Mexico, as well as southeastern Texas in the USA. Currently there is no evidence of presence of candelilla in this area [9,27,31,58]. Therefore, the current projection of candelilla's distribution seems to match better with the high habitat suitability class (19,166,820 ha), which includes the states of Coahuila, northeast Durango, Easter Zacatecas and San Luis Potosi in Mexico and a small area in the south of Texas (Figures 2 and 4) [9,31].

Future Potential Sites for Distribution of Candelilla
Based on the high habitat suitability, the six models showed that in the future (year 2069) the environmental changes generated by the increase in the concentration of gases such as carbon dioxide may cause a reduction in the habitat suitability of candelilla in North America, compared to the habitat suitability under current conditions. This could happen even under the more conservative scenario RCP4.5 (Table 2). These results suggest that candelilla's southern populations will reduce their distribution area in the future as a consequence of climate change. Currently, basic aspects of the species are lack of knowledge, such as reproductive physiology, adaptation to cultivation and genetics. Studies that provide this information should be highly useful to provide a better management to candelilla population in the near future. The six models were consistent in their projections. The GFDL-CM3 model predicted increases in distribution area for the none class and reductions for the low, medium and high habitat suitability classes for the two scenarios ( Table 2). The HADGEM2 model predicted reductions in the candelilla's distribution area for the none and high classes and increments in the distribution for the low and medium classes for the two scenarios, as well. Meanwhile, the MPI-ESM-LR model projected reductions in the none and high classes for the RCP4.5 scenarios and reductions in the low and high classes for the RCP8.5 scenario. The very high habitat suitability class seems to sub-estimate the candelilla distribution area, based on technical and scientific reports [8,9,31,58].
According to the projection obtained with the GCMs and the RCP4.5, the increase in the area caused by climate change in the future (Year 2069) for the none, low and medium classes would occupy lands beyond the north and west portion of the current distribution, mainly in the states of Chihuahua and Sonora in Mexico and Texas in the USA. This behavior could be related to the warming that will probably occur in lower latitudes as a consequence of climate change. The distribution of candelilla seems to be higher in colder areas ( Figure 5). For the high habitat suitability class, the GFDL-CM3, HADGEM2 and MPI-ESM-LR models predicted a reduction of 39.2, 20.8 and 10.7 in the candelilla's distribution area, respectively. The states where candelilla populations could be negatively affected are Coahuila and San Luis Potosí in Mexico. Currently, the largest harvest of the species is carried out in these states [8,9].
According to these results, the effect of climate changes for the next fifty years could generate a significate reduction of lands were candelilla is currently harvested. In this sense, the climate change could also affect the percentage of wax produced by candelilla's plant as a consequence of changes in winter temperatures and precipitation.
For the scenario RCP8.5, the HADGEM2 model projected increments up to 33.3% in the candelilla's distribution area for the low class and 13.7% for the medium class. Meanwhile, the MPI-ESM-LR model predicted an increment of 23.5% in the area for the middle class and 2.3% for the none class. Regarding the GFDL-CM3, it projected a reduction of 23.7%, 19.7% and 45% for the low, medium and high classes, respectively. Furthermore, the three models predicted reductions between 36% to 45% in the high class. The increments in the candelilla's distribution area could occur in the north and west area, according to the current candelilla's distribution as well as the RCP4.5. Meanwhile, the states with reductions of candelilla's distribution, as a result of the climate change under the RCP8.5, are Coahuila, San Luis Potosí and Zacatecas, Mexico ( Figure 6).     Under this scenario, projected changes will continue beyond fifty years, which could dramatically reduce the distribution area of the species over the next 100 or more years. Genetic studies and adaptability of the candelilla to extreme scenarios could generate the necessary information to establish conservation strategies for the species in the face of possible environmental changes in the long term.
According to the classes of low and medium habitat suitability, the two scenarios projected an increment in the candelilla's distribution areas, including areas with no current conditions for its growth [9]. These areas are located in southern Mexico and include the states of Veracruz, Oaxaca, Puebla and Estado de México, where the temperature and precipitation are higher than the climate conditions currently needed for the natural candelilla´s distribution [2,3,35,48]. Conversely, the southern presence for candelilla was reported in the Tehuacán-Cuicatlan Valley, which is located in the semiarid region between Puebla and Oaxaca, Mexico [59].

Discussion
The six models of E antisyphilitica potential distribution in North America showed a very well accurate ROC-UAC value [22,47]. For the current conditions E. antisyphilitica occupies a great distribution area (191,616 km 2 ) over the arid lands of northern Mexico and a small portion of southern Texas. The three GCMs projected a reduction in the distribution area for the high habitat suitability as a consequence of climate change for the next fifty years (2069). Even the more conservative scenario of carbon dioxide emissions RCP4.5 predicted a reduction area ranging from 10% to 39%; meanwhile, the extreme scenario RPC8.5 projected a reduction area from 36% to 45%, compared to the current distribution. In the Mexican states of Coahuila, Chihuahua, Zacatecas and San Luis Potosi, the harvest of candelilla for the commercial extraction of wax is constantly increasing due to the increasing demand of organic wax [2,8,9]. Even though the current use of candelilla cannot be regarded as extremely threatening for the species, given its vast distribution area, local to regional wild growing stocks are reduced or have disappeared. Thus, the species exploitation without controlled management leads to the decline of populations [2,60].
Our results would allow stablishing better candelilla's management, including protection or cultivation based on the vulnerability of wild populations or on the location of suitable areas for candelilla's growth [8,9,61]. In this regard, we identified that thirteen environmental variables influence (99%) the current and future distribution of candelilla in North America. Four of those variables accumulated almost 80% of the influence on the GCMs (temperature of coldest quarter (Bio 11) (45.7%), the precipitation seasonality (coefficient of variation) (Bio 15) (13.3%), the precipitation of coldest quarter (Bio 19) (11.6%) and the%), the mean temperature of driest quarter (Bio9) (8.7%)). These environmental variables showed that the candelilla's distribution is mostly affected by low temperatures and dry seasons. According to the literature, this species grows in areas of semi-desert climate and is highly adapted to drought conditions with erratic rainfall regimes with an annual precipitation of 150-500 mm and extreme temperatures of 44 • C and −2 • C [2,8,61].
The three models were consistent in their projections for the two scenarios (RPC4.5 and RPC8.5). The GFDL-CM3 model predicted more conservative changes compared to the HADGEM2 and the MPI-ESM-LR models. Both of the scenarios predicted possible increments in the candelilla's distribution area for the lower and middle habitat suitability classes, mainly towards the north area of the current distribution [62][63][64]. The models also predicted a considerable reduction in the area for the high habitat suitability class for the next fifty years. The projections of increments or reduction of the candelilla's distribution areas were higher for the RPC8.5 scenarios compared to the RPC4.5 scenarios. Thus, candelilla may be sensitive to warming changes generated as a consequence of an increment in the emissions of greenhouse gases in the future [65]. In order to conserve candelilla's populations, this behavior suggests that plantations of candelilla could be established in the northern area of its current distribution, mainly in north Coahuila and northwest Chihuahua in México [7,10,18,19]. In contrast, the southern populations appear to be more vulnerable as a consequence of global warming with the presence of warmer winters during the next fifty years.

Conclusions
The six potential distribution models generated for candelilla (E antisyphilitica) in North America showed a very well accurate ROC-UAC value. The three GCMs used in this studio were consistent in their projections for the two scenarios evaluated (RPC4.5 and RPC8.5); however, the GFDL-CM3 model predicted more conservative changes compared to the HADGEM2 and MPI-ESM-LR models.
The estimated candelilla's distribution area for North America is approximately 19.1 million of hectares under the current conditions; however, the projection for the next fifty years (Year 2069) is not promising because there could be a reduction of 4.5 million hectares considering the conservative scenario (RCP 4.5) or an average of 8.1 million hectares in the case of the extreme scenario (RCP 8.5).
The climate variables with the greatest influence in the candelilla's distribution models were the temperature of coldest quarter, precipitation seasonality (coefficient of variation), the precipitation of coldest quarter, mean temperature of driest quarter and precipitation of driest month. therefore, the species distribution could be strongly conditioned by the low temperatures and precipitation occurring during the winter season.
The projections of increments or reductions in the distribution area of candelilla were higher for the extreme scenario (RPC8.5) compared to the conservative one (RPC4.5). This behavior shows that candelilla is sensitive to warming; thus, southern populations could be more vulnerable compared to northern ones. In this sense, plantations of this species could be established in the northern area of candelilla's current distribution, mainly in the north of Coahuila and northwest of Chihuahua states in México.