Maxent Modeling for Identifying the Nature Reserve of Cistanche deserticola Ma under Effects of the Host ( Haloxylon Bunge) Forest and Climate Changes in Xinjiang, China

: Cistanche deserticola Ma is a traditional Chinese medicinal plant exclusively parasitizing on the roots of Haloxylon ammodendron (C. A. Mey.) Bunge and H. Persicum Bunge ex Boiss and the primary cultivated crop of the desert economy. Its wild resources became scarce due to over-exploitation and poaching for economic beneﬁts. To protect the biological diversity of the desert Haloxylon–Cistanche community forest, the optimal combination of desert ecology and economy industry, and their future survival, this paper examines the conservation areas of wild C. deserticola from the perspective of hosts’ effects and climate changes. To identify conservation areas, the potential distributions generated by MaxEnt in two strategies (AH: abiotic and hosts factors; HO: hosts factors only) compare the model’s performance, the niche range overlap, and the changing trend in climate changes. The results show the following: (1) The HO strategy is more suitable for prediction and identifying the core conservation areas in hosts and climate changes (indirectly affected by host distributions) for C. deserticola . (2) The low-suitable habitat and the medium-suitable habitat are both sensitive to the climate changes; the reduction reaches 48.2% (SSP585, 2081–2100) and 26.6%(SSP370, 2081–2100), respectively. The highly suitable habitat is always in growth, with growth reaching 27.3% (SSP585, 2081–2100). (3) Core conservation areas and agriculture and education areas are 317,315.118 km 2 and 319,489.874 km 2 , respectively. This study developed a predictive model for Maxent under climate change scenarios by limiting host and abiotic factors and inverted the natural habitat of C. deserticola to provide scientiﬁc zoning for biodiversity conservation in desert Haloxylon– Cistanche community forests systems, providing an effective reference for decision makers.


Introduction
The desert ecology and economy industry (DEEI), in the context of China's Belt and Road Initiative and targeted poverty reduction policy, is receiving more and more attention [1][2][3]. The Xinjiang Uyghur Autonomous Region, located in north-western China, is becoming the largest potential gainer because of its unique locational and climatic advantages, including China's largest and second-largest deserts, the Taklimakan desert and the Gurbantunggut desert, respectively [4][5][6]. However, the trickiest challenge of DEEI is also from itself and the stress of ecology and economy [7,8]. Precipitation occurring much

Occurrence Data
To compile comprehensive data on the occurrences of all C. deserticola and their host, Haloxylon spp., we used a combination of field survey data (collected under current conditions) and data from herbaria historically collected with precise coordinates, such as those from the Chinese Virtual Herbarium (www.cvh.ac.cn, last accessed on 20 November 2021) and the Global Biodiversity Information Facility (www.gbif.org, last accessed on 20 November 2021), totaling 209 occurrences of the three species.

Occurrence Data
To compile comprehensive data on the occurrences of all C. deserticola and their host, Haloxylon spp., we used a combination of field survey data (collected under current conditions) and data from herbaria historically collected with precise coordinates, such as those from the Chinese Virtual Herbarium (www.cvh.ac.cn, last accessed on 20 November 2021) and the Global Biodiversity Information Facility (www.gbif.org, last accessed on 20 November 2021), totaling 209 occurrences of the three species.
Why were historical data combined with data from current expeditions? The primary reason for this is that current wild resources do not adequately cover the range of C. deserticola and their hosts over the last 50-100 years. When discussing the conservation of C. deserticola wild resources, we must include areas where the natural environment is suitable, but the wild resources have been drastically reduced due to human activity [42]. In other words, it is inaccurate to consider only the loci that exist in the current situation of severed wild resource availability. Therefore, it is critical and necessary to collect data on extinct species to determine which areas are suitable for their growth without regard for the effects of human activity.

Environmental Data
We used the WorldClim 2.1 database (www.worldclim.org, last accessed on 20 November 2021) with a spatial resolution of 2.5 arcminutes for both historical and future climate data (including 19 bioclimatic variables in each scenario). Additionally, we used the mean of the BCC-CSM2-MR, CanESM5, CNRM-CM6-1, and MIROC6 future climate simulations to avoid some climate uncertainty from a single climate model. Elevation data were retrieved from the same website, the WorldClim version 2.1 database, for both current and future climatic environments. Additionally, considering the topographical and soil variables in the role of shaping plant species distributions, we downloaded the related data, including aspect, slope, soil water content (at 100 cm and 200 cm depth), soil sand content (at 100 cm and 200 cm depth), soil organic carbon content (at 100 cm and 200 cm depth), and soil texture class (at 100 cm and 200 cm depth) from NASA and OpenLandMap and processed by Google Earth Engine. Additionally, these topographical and soil variables were considered unchanging factors in future scenarios. We primarily used the functions in the "raster" and "rasterVis" packages to process these environmental data files in R [43,44].

Distribution Modeling
Before the variables selection, we used the ENMTools to clean and screen the occurrence data of each species, which aims to match data accuracy between the resolution of occurrence and that of environmental factors [45,46].
The selection of variables was driven by species data, the majority of which came from the VarSel function in the package "SDMtune" in R [47], using 10,000 randomly selected points throughout the study area, as well as environmental data on the locations of these points. Additionally, we used Spearman's coefficient analysis to eliminate environmental variables with a Spearman's coefficient greater than 0.7 and graded the remaining environmental variables according to model importance, once again eliminating the less important environmental variables (the correlation matrix in Appendix A). The remaining environmental variables (Table 1) were ranked according to their model importance. The less-important environmental variables were also eliminated, allowing the variables we chose to closely fit the model and the actual data.
For the selection of the species distribution model algorithms, we used the Maxent algorithm [48]. It employs machine-learning techniques to simulate species' overall distribution based on their presence, environmental data, and some background points, which is particularly useful when there are few data on species distribution. Still, it has high predictive reliability, making this method widely used to study species distribution and the delineation of protected areas [49][50][51][52]. However, the Maxent algorithm cannot be used exactly according to the default parameter settings, which have been shown to over-fit the data in many studies [28]. Thus, to solve such problems, we mainly use some functions in the package "kuenm" in R [53]. By supplying information about species distribution and environmental data, this program can examine parameter selection based on (1) statistical significance, (2) predictive power, and (3) model complexity to increase the model's dependability and performance. This function is accomplished by developing a large number of candidate models and picking the one that is most appropriate [26,53].

Category Bioclimatic Variables Information
Cistanche deserticola Ma

Bio07
Temperature annual range Bio19 Precipitation of coldest quarter aspect Aspect slope Slope Water_b100 Soil water content at 100 cm depth Organic_b200 Soil organic carbon at 200 cm depth Texture_b100 Soil texture class at 100 cm depth Texture_b200 Soil texture class at 200 cm depth Haloxylon ammodendron (C. A. Mey.) Bunge

Bio06
Min Temperature of coldest month Bio15 Precipitation seasonality aspect Aspect slope Slope elevation Elevation Water_b100 Soil water content at 100 cm depth Organic_b100 Soil organic carbon at 100 cm depth Texture_b100 Soil texture class at 100 cm depth Texture_b200 Soil texture class at 200 cm depth H. persicum Bunge ex Boiss

Bio07
Temperature annual range Bio13 Precipitation of wettest month Bio17 Precipitation of driest quarter aspect Aspect slope Slope Sand_b200 Soil sand content at 200 cm depth Organic_b200 Soil organic carbon at 200 cm depth Texture_b100 Soil texture class at 100 cm depth Texture_b200 Soil texture class at 200 cm depth After developing the method of distribution prediction and the ideal choice of parameters, it is extremely difficult to consider the host factors in the distribution model of C. deserticola. To solve this problem, we proceeded with the following steps (based on the current climate and using the same way for the future climate): (1) generating the host models and distribution maps by their abiotic variables, respectively; (2) generating the parasitic models by the abiotic variables and the host distribution maps (as biotic variables).
Based on the experience of previous studies, we tried to utilize two strategies to compare (1) the influence of the abiotic factors and the host factors on C. deserticola (AH), (2) the effect of hosts only on C. deserticola (HO). The distinction between the two strategies is that AH considers the host factor as several variables affecting C. deserticola distribution prediction. In contrast, HO views the host factor as the only variable affecting C. deserticola distribution prediction. In comparison, the HO method appears to be more consistent with the natural growth and developmental patterns of C. deserticola, which, as an utterly parasitic plant, obtains all its nutrients from its host. Thus, regardless of the discrepancies between the two methodologies, the primary point of convergence is that they both considered the host component, contrary to earlier distribution studies of C. deserticola, which ignored the host factor [27].
To better build and assess species distribution models, we utilized the data split function in package "kuenm" to split the occurrence data of each species into training data (75%) and test data (25%) to better evaluate the final model. We employed the AUC (area under the curve) statistic to evaluate the model, which is widely used and recognized by most researchers [54]. Moreover, the effect of different models is graded according to the value of AUC; model performance was classified as excellent (1-0.9), good (0.9-0.8), fair (0.8-0.7), poor (0.7-0.6), failing (0.6-0.5), and the AUC values closer to 1 indicated the better the performance of the model [55]. All current models' AUC values are greater than 0.9 for distribution prediction, which is within the confidence interval for climate forecast (0.75-0.95) and accurate yield predictions [56]. We also employed the "Measuring Range Overlap" function of the "ENMTools", which allows for the simultaneous evaluation of all sorts of distributions to better portray the AH and HO methods and reflect the distribution of hosts. This function is based on a widely used age-range correlation study [57]. The overlap has also been used in numerous studies on ecological niches to assess the geographical characteristics, competition, and interactions between the distributions of different species, where the threshold value used in the application of the overlap function is critical for the impact of the assessment results; thus, when judging the threshold value, we used the response curves generated alongside the distribution model (primarily, the threshold value is determined by assessing the abnormality of the response curves, mainly of the two host plants) that accompany the distribution model and are thus employed as a proxy for evaluation [57][58][59][60][61][62].

Assessing the Conservation Areas
We have primarily used two principles in the delineation and identification of conservation areas: (1) the classification of the fitness class of the distribution of C. deserticola wild resources under current environmental conditions; (2) the change in habitat class of areas under future climate change conditions.
Habitat levels are classified using the Jenks natural breaks classification method, which looks for natural patterns in the data rather than artificial and rigid grouping [63]. Jenks natural breaks classification classifies habitats into four categories: (1) inappropriate habitat (IH); (2) low-suitable habitat (LSH); (3) medium-suitable habitat (MSH); (4) highly suitable habitat (HSH). To ensure that the future classification of habitat classes is consistent with the current distribution and allows for comparison, we applied the current grouping strategy of each model to the future distribution model so that the groupings were consistent and the grouping error was low, to minimize the climate change uncertainty.
Using the current habitat as a criterion, habitat changes are classified into three broad categories based on the difference between the habitat in various climatic conditions and the current habitat: (1) areas that have remained constant in habitat class over time; (2) areas that have decreased in habitat class over time; and (3) areas that have increased in habitat class over time.
Based on current habitat classes, a statistical approach was used to determine trends in the rate of change in area for various habitat classes under various climatic scenarios. By utilizing trends rather than simply increasing or decreasing areas to classify conservation areas, natural patterns of habitat change can be better understood to benefit human activities.

Abiotic and Host Factors
Under the effects of hosts and current climate factors (Table 1; Figure 2), the total area of HSH is 89,098.579 km 2 . It mainly occurs in northern Xinjiang, the central and southern Gurbentunggut desert, the central region of Irtysh River, and the northern foothills of the Tian Shan Mountains. The MSH is a total of 133,722.574 km 2 , and is concentrated in the periphery of HSH, such as the upstream of Irtysh River, northern Gurbentunggut desert, and northern foothills of Karakoram Range. Additionally, the LSH is 208,660.229 km 2 , and it is mainly in the southern foothills of the Tian Shan Mountains. centrated in the periphery of HSH, such as the upstream of Irtysh River, northern Gurbentunggut desert, and northern foothills of Karakoram Range. Additionally, the LSH is 208,660.229 km 2 , and it is mainly in the southern foothills of the Tian Shan Mountains.

Host Factors Only
In the effect of host factors only (Table 1; Figure 3), the HSH concentrated in the Gurbentunggut desert is a total of 319,489.874 km 2 , and from the periphery of the Gurbentunggut desert to the southern foothills of the Altai Mountains and the upstream of Tarim River with the main oasis in Southern Xinjiang, the MSH is about 230,856.128 km 2 ; the LSH is 86,458.990 km 2 and is predominant in the south-western Taklimakan desert and the periphery of the Tian Shan Mountains.

Future Distribution
In all the climatic scenarios (Figure 4), the distributions of C. deserticola (with two strategies) do not make a huge difference. The HSH is slowly expanding, and the expansion of LSH comes with the same trend in AH. Additionally, the habitat in HO illustrates that the MSH is expanding south.

Future Distribution
In all the climatic scenarios (Figure 4), the distributions of C. deserticola (with two strategies) do not make a huge difference. The HSH is slowly expanding, and the expansion of LSH comes with the same trend in AH. Additionally, the habitat in HO illustrates that the MSH is expanding south. In SSP126, the LSH shows a tendency to increase firstly and then decrease suddenly, with a slow increase afterward. The MGR reaches 24.2% in 2021-2040, and the maximum reduction ratio (MRR) is 5.9%. The MSH shows a decreasing trend, with the MRR reaching 7.7% in 2041-2060. In SSP245, LSH decreases over time, and the MRR is 33.1% in 2081-2100; the MSH shows the same trend with an MRR at 17.6% in 2081-2100; in SSP370 and SSP585, the trends of LSH and MSH are the same as in SSP245. The MRR of LSH in SSP370 is 43.6% in 2081-2100, and in SSP585 2081-2100, the MRR is 48.2%. The MRR of MSH in SSP370 reaches 26.6% in 2081-2100, and in SSP585, it reaches 23.3% in 2081-2100.
Moreover, we have conducted a statistical analysis of how the area of each habitat changes in response to future climate change impacts ( Figure 5). The HSH area tends to increase under either strategy. In the HO strategy, the area of HSH has consistently increased in response to all climatic scenarios: the maximum growth ratio (MGR) in SSP126 is 13.9% (2081-2100); in SSP245, the maximum growth ratio is 19.4% (2081-2100); the maximum growth ratio reaches 23.1% (2081-2100) in SSP370; and 27.3% (2081-2100) is the maximum growth ratio in SSP585. The extension of HSH mainly comes from the level-up of MSH and LSH. Thus, the changes of MSH and LSH should be considered.

The Result of Evaluation and Conservation Identifying
The results of the AUC assessment show that all models assessed as excellent (Table 2). In order to better evaluate the differences between the models from multiple perspectives in an integrated manner, we therefore analyzed the response curves ( Figure 6) as well as the Niche range overlay (Table 3).  We compared the distribution of the parasitic plants with the distribution of the host plants to demonstrate the influence of host plant factors on the predicted results ( Figure 7). We found the HO strategy takes into account the effects of host distribution more comprehensively.  We compared the distribution of the parasitic plants with the distribution of the host plants to demonstrate the influence of host plant factors on the predicted results ( Figure  7). We found the HO strategy takes into account the effects of host distribution more comprehensively. We have taken into account the actual distribution of the host plant on the basis of the AUC values and the HO strategy adequately reflects the impact of host distribution. We therefore used the HO strategy as a basis for delineating the C. deserticola nature re- We have taken into account the actual distribution of the host plant on the basis of the AUC values and the HO strategy adequately reflects the impact of host distribution. We therefore used the HO strategy as a basis for delineating the C. deserticola nature reserve and based the delineation of the reserve and functional areas on changes in habitat area under future climate (Figure 8). The AH/HO means the strategy of abiotic and host factors or host factors only. serve and based the delineation of the reserve and functional areas on changes in habitat area under future climate (Figure 8). The AH/HO means the strategy of abiotic and host factors or host factors only.

The Better Model
Numerous researchers have attempted to predict the distribution of parasitic plants by incorporating host factors [63,64]. However, these studies have primarily focused on semi-parasitic plants or animals associated with specific food sources, and they have primarily considered the AH strategy [33,36,64]. Few distribution predictions for holoparasitic plants explicitly consider host factors as the sole qualifier, such as the HO strategy.

The Better Model
Numerous researchers have attempted to predict the distribution of parasitic plants by incorporating host factors [63,64]. However, these studies have primarily focused on semiparasitic plants or animals associated with specific food sources, and they have primarily considered the AH strategy [33,36,64]. Few distribution predictions for holoparasitic plants explicitly consider host factors as the sole qualifier, such as the HO strategy.
In terms of model evaluation, the models of C. deserticola under both strategies all achieved an AUC greater than 0.9 (Table 2), which is considered excellent [55]. Additionally, the AUC for the AH strategy was slightly higher than for the HO strategy. Contrary to the AH strategy, C. deserticola is found in the southern part of Xinjiang in the actual survey. This investigation is more consistent with the HO strategy.
In the Maxent, the presence of data inputs and the selection of variables can have a significant effect on the final model's prediction results [65]. Additionally, some studies on the importance of MaxEnt algorithm variables have discovered that when all of the given environmental factors (biotic and abiotic) are included in the model, the importance of each environmental factor is underestimated [66,67]. In other words, the MaxEnt algorithm can illustrate the effects of all variables we provided [68]. Especially when we removed the less-important variables in pre-processing. Thus (Table 1), in the AH strategy, the effects of host factors would not have the same importance as that in the HO strategy. However, for the holoparasitic plant, all of its nutrient sources are obtained from the hosts [69,70]. As a result, the hosts' distributions deserve to be elevated in importance as a proxy for biological conditions. As a result (Figure 6), we examined the host distribution's response curves under both strategy models to identify differences between them.
By comparing the response curves for the major host plant, H. ammondendron, we can see that when the response curve of 0.65 is used as the threshold, the two curves do not indicate the same range of thresholds, and the trends between the two strategies are different. To further compare the two strategic models' responses to host distributions, the effect of hosts distributions on the niche range overlap was quantified. The specific point of difference in the response curve of H. ammondendron, 0.65, was used as the analysis threshold value. As shown in the table (Table 3), the HO strategy is better than the AH strategy for hosts' responses. This means that the HO strategy can better respond to the effect of the two host distributions on the distribution of parasitic plants at a threshold of 0.65.
The final predicted distribution map shows that AH is more concentrated in northern Xinjiang, while HO is distributed throughout the province (Figure 7). According to relevant studies on variable importance, it is hypothesized that the importance of host factors in AH is comparable to that of other abiotic factors and does not take precedence, i.e., the role of hosts is diminished in the AH strategy [66,71]. In HO, the effect of the hosts can be more clearly demonstrated because no other abiotic factors are present and only the host factors are used as variables.
Thus, when the actual response of the hosts' factors is compared in the holoparasitic distribution study to both AH and HO strategies, the HO strategy is more responsive to the role of host distributions and more relevant to the actual distribution of C. deserticola. Although we still have a long way to go in terms of model assessment accuracy. As a result, in future research, we will employ additional statistical tools, particularly those frequently used in species distribution models, to verify the accuracy of the strategies we have developed [72][73][74]. For instance, increasing the number of model evaluation metrics such as Kappa, Boyce, AUC-PR, and others, and increasing the number of prediction algorithms such as random forests, etc. [75][76][77][78]. These are all areas in which our future research will need to improve over time.

The Conservation Areas of C. deserticola
Because Highly suitable habitat in most previous designations of protected areas for plant resources have decreased due to climate change, it is possible to designate protected areas based on currently suitable areas [79]. However, our study discovered that the variation in the area was not identical for the various habitat classes, even though they were all located in the same climatic scenario ( Figure 5). We have found that HSH has maintained a growth trend in all of the HO strategies (the maximum growth ratio is 27.3% in SSP585 2081-2100). This means that the HSH has consistently increased its habitat area compared to its current area during climate change. And such a pattern also presents itself in comparison to the AH strategy. This trend seems to be revealed in this study and be explained by future temperature and climate change characteristics in the Xinjiang region. For desert plants, the abundance of water resources during the peak growth period is directly related to growth conditions [80]. All SSP scenarios precipitation has increased to varying degrees compared to the present [81]. The HSH area, responds best to C. deserticola and is most suitable for its survival, so an increase in precipitation will undoubtedly increase the area of the HSH.
However, when we shifted our focus to LSH and MSH, we found that the effect of precipitation did not seem to explain well the reduction in low and medium suitable habitat (the maximum reduction ratio of LSH is 48.2% in SSP585 2081-2100 and that of MSH is 26.6% in SSP370 2081-2100). We then similarly began to focus on the effects of temperature. In our survey of the cultivated industry, we found that moisture was important for the growth of C. deserticola. And when the temperature rises, the plant's demand for water increases [82]. This effect is much greater for MSH and LSH than for HSH.
Of course, as there are few studies on the physiology of C. deserticola, we can only make a few possible guesses. In the MSH and LSH regions, where soil, topographic and climatic conditions are not optimal, changes in temperature, precipitation and other climatic changes do not cause its growth to develop in a favourable direction. This has led to a trend towards a reduction in its current suitable area. The MSH and LSH, which are more sensitive to the environment, should then be the main conservation areas for the future changes in the climate of each habitat.
Thus, when identifying the conservation areas, the climate-sensitive zones should be given primary consideration (LSH and MSH). Given that the sensitivity of LSH and MSH is broadly similar in scenarios, the conservation areas in different scenarios can maintain essentially similar results. Therefore, the combination of LSH and MSH is identified as the core conservation in all scenarios because of its sensitivity.

Conclusions
Our study focuses on the influence of host factors on the distribution of parasitic plants under different strategies (AH and HO). The following conclusions can be drawn from our study: 1.
By incorporating model predictions and actual parasitic conditions, the spatial distribution of C. deserticola with the HO strategy is more realistic and accurate than AH. The HO strategy outperformed the AH strategy in terms of the effect on host distribution. It can better reflect the actual parasitic situation.

2.
Under various climate scenarios, the HSH is constantly increasing, reaching a maximum growth ratio of 27.3%; the LSH is more sensitive and is primarily decreasing, reaching a maximum reduction ratio of 48.2%; and the MSH has the same sensitivity as LSH, reaching a maximum reduction ratio of 26.6%. As a result, while designating the protected areas, the influence of the combination of LSH and MSH was prioritized in all climate scenarios. Additionally, given the economic development and biodiversity protection (Figure 8), the HSH is defined as agriculture and education industrial areas, aiming to promote the development of cultivated C. deserticola industry and curb the loss of biodiversity in cultivated HCCF by receiving inspiration from the natural host-parasite relationship (HCCF). The core conservation area in SSP126, SSP245, SSP370, and SSP585 is 317,315.118 km 2 . The HSH, which is always growing, is used as agricultural and educational industrial areas. The industrial zones cover a total area of 319,489.874 km 2 .