Maxent Data Mining Technique and Its Comparison with a Bivariate Statistical Model for Predicting the Potential Distribution of Astragalus Fasciculifolius Boiss. in Fars, Iran

Abstract: Identifying the geographical distribution of a plant species is crucial for understanding the environmental variables affecting its habitat. In the present study, the potential spatial distribution of Astragalus fasciculifolius Boiss., a key species, was mapped using maximum entropy (Maxent) as a data mining technique and the frequency ratio (FR) as a bivariate statistical model in marl soils of the southern Zagros, Iran. The A. fasciculifolius locations were identified and recorded during intensive field campaigns. The locality points were then randomly split into a 70% training dataset and a 30% validation dataset. Two climatic, four topographic, and eight edaphic variables were used to model the A. fasciculifolius distribution and its habitat potential. Maps of environmental variables were generated using a Geographic Information System (GIS). Next, the habitat suitability index (HSI) maps were produced and classified by means of the Maxent and FR approaches. Finally, the area under the receiver operating characteristic curve (AUC-ROC) was used to compare the performance of the maps produced by the Maxent and FR models. The interpretation of environmental variables revealed that the climatic and topographic parameters had less impact than the edaphic variables on the habitat distribution of A. fasciculifolius. The results showed that bulk density, nitrogen, acidity (pH), sand, and electrical conductivity (EC) of soil are the most significant variables affecting the distribution of A. fasciculifolius. Validation showed that the AUC values of the Maxent and FR models are 0.83 and 0.76, respectively. The habitat suitability map of the better model (Maxent) showed that areas in the high and very high suitability classes cover approximately 22% of the study area. Generally, the habitat suitability map produced using the Maxent model could provide important information for conservation planning and for reclamation projects in the degraded habitat of the intended plant species.
The distribution of plants reflects the availability of water, soil, and nutrient resources and affects the distribution of fauna; understanding plant distribution is therefore relevant to improving land management and making it sustainable.


Introduction
Spatial and temporal distribution of species is affected by the quality and quantity of habitats [1,2]. Species distribution models (SDMs) are generally used to predict the habitat potential and spatial distribution of a species according to occurrence data and different environmental variables [3][4][5]. These models have been widely used for many different purposes in ecological and conservation studies to evaluate the relationship between species occurrence and environmental variables [6][7][8]. Reliable data on species' spatial distribution could provide important information for conservation planning, reclamation projects of degraded habitats, and prediction of anthropogenic and climatic impacts on the habitat potential of a plant species. A variety of statistical and probabilistic models are currently used to determine the spatial distribution of plant species [5][9][10][11][12]. Common SDMs in the recent literature include statistical models such as generalized linear models (GLMs) [9,13] and generalized additive models (GAMs) [14][15][16], as well as probabilistic models such as Maxent [5][17][18][19] and frequency ratio (FR). Among the different species distribution models, Maxent has proven suitable for predicting the habitat potential of plant species based on presence-only occurrence data [5,6,18,20,21]. The FR algorithm, in contrast, has mainly been used for predicting natural hazards such as landslides [22][23][24][25], and its application in species distribution modeling has not been well documented.
The Maharlou Watershed, located in the southern Zagros Mountains, is extensively exposed to deposits from hill slopes with marl soils originating from the Gachsaran, Razak, and Sachoun geological formations [26]. The western hill slopes of Maharlou are traditionally used for livestock grazing by nomads and rural communities, as well as for agricultural activities, leading to plant diversity loss and accelerated soil erosion. This area suffers from severe water erosion due to the widespread distribution of marl soils and has a high potential for severe land degradation. Marl soils are very sensitive to water erosion and mass movement caused by human-induced land use changes and agricultural activities. Furthermore, sediments produced from marl soils could increase water pollution and flooding events, decrease reservoir capacity, and lead to soil and plant diversity loss [27]. On the other hand, the positive impact of vegetation on water erosion control has been widely studied in the literature [28,29]. Hence, in arid and semiarid regions, the identification and restoration of native plant communities that exist or appear in the erosion process prior to desertification are very important for landscape restoration planning [30][31][32]. Astragalus fasciculifolius (Fabaceae) is a 0.5-1.5 m high perennial thorny shrub mainly found on hill slopes of the southern Zagros in the Fars, Bushehr, Khuzestan, and Hormozgan Provinces. A. fasciculifolius has high ecological value in terms of soil conservation, livestock feeding, carbon sequestration potential, and medicinal properties [33][34][35][36].
In the present study, the FR bivariate statistical model and Maxent data mining technique were used to predict and map the distribution of A. fasciculifolius in marl soils of Maharlou Watershed, Fars Province, Iran.
The aims of the current research are: (1) to quantify the relationship between A. fasciculifolius and the selected environmental variables; (2) to develop the habitat suitability index (HSI) and identify additional and potential localities of A. fasciculifolius for targeting marl soil conservation activities and vegetation restoration planning; and (3) to compare the FR and Maxent models for predicting the potential distribution of A. fasciculifolius in the study area. Figure 1 shows the methodology flowchart of the current study, including the preparation of the conditioning environmental variables, the application of the models, and their validation.

Study Area
The study area, the western part of the Maharlou Watershed, is located in Fars Province, southern Iran (29°31′-29°54′N and 52°12′-52°41′E), and covers 265 km² inside the Zagros Mountains (Figure 2). It receives 510 mm of annual rainfall and has an average annual temperature of 17 °C [37]. Elevation ranges from 1500 to 2700 m above sea level. Vegetation of the study area is dominated by A. fasciculifolius Boiss., along with Ebenus stellata Boiss., Convolvulus leiocalycinus Boiss., Gundelia tournefortii L., Phlomis elliptica Benth., and Stipa barbata Desf. Considering the widespread marl soils in the western parts of the Maharlou Watershed, the boundaries of the marl formations, as key indicators for soil conservation, were extracted from a geology map and used as a base map for data sampling and further analyses (Figure 2).

Species Description
Astragalus is one of the largest and most widespread genera of the Fabaceae family, including 3280 species distributed all over the world. In total, 844 species of this genus grow wild in Iran, of which 620 species (73.4%) are endemic [38][39][40][41]. A. fasciculifolius is a perennial woody plant growing wild in southern parts of the Zagros Mountains (Figure 3). It has thorny branches and deep taproots. A. fasciculifolius is a xerophyte that is well adapted to the marl soils of the southern Zagros and plays an important role in their conservation; it also has some industrial and medicinal uses for the local people of the southern Zagros, Iran [33,42,43].

Species Occurrence Data
A total of 85 occurrence records of A. fasciculifolius were collected during field surveys in 2015 and 2016, from April to September, in hill slope parts of western Maharlou Watershed. The longitude and latitude of species localities were recorded using a handheld Global Positioning System (GPS) receiver (Garmin Map 62s) and the geographical distribution database of A. fasciculifolius was compiled ( Figure 2).

Climatic Data
Climatic data were obtained from 18 stations in the study area (http://www.irimo.ir). Mean annual precipitation and temperature maps were prepared using inverse distance weighting (IDW) and kriging interpolation approaches (Figure 4a,b). These variables are important in environmental modeling (Table 1) and are frequently used in species distribution models [11,44].
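The IDW interpolation mentioned above can be illustrated with a minimal sketch. The station coordinates and rainfall values below are hypothetical, and the power parameter of 2 is an assumption, since the study does not report the setting used in the GIS software.

```python
import math

def idw_estimate(stations, target, power=2.0):
    """Inverse distance weighting at one target point.

    stations: list of (x, y, value) tuples; target: (x, y).
    Each station is weighted by 1 / distance**power.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        dist = math.hypot(target[0] - x, target[1] - y)
        if dist == 0.0:
            return value  # target coincides with a station
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical stations: (x in km, y in km, mean annual rainfall in mm)
stations = [(0.0, 0.0, 300.0), (10.0, 0.0, 400.0), (0.0, 10.0, 500.0)]
at_station = idw_estimate(stations, (0.0, 0.0))
at_centre = idw_estimate(stations, (5.0, 5.0))  # equidistant from all three
```

Because the second target is equidistant from all three stations, its estimate is simply the mean of the station values; closer stations dominate elsewhere.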

Topographic Data
A Digital Elevation Model (DEM) with a pixel size of 20 m × 20 m (http://www.ncc.org.ir) was used to generate altitude, slope angle, aspect, and plan curvature layers using ArcGIS 10.1 (ESRI, Redlands, CA, USA). The altitude thematic layer was categorized into 4 classes (Figure 4c) and the slope angle map was classified into 5 classes (Figure 4d). The slope aspect and plan curvature maps (Table 1) were calculated from the DEM and categorized into 5 and 3 classes, respectively (Figure 4e,f).
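As a rough illustration of how slope angle and aspect follow from a DEM, the sketch below applies simple central differences to one interior cell. This is an assumption made for clarity; GIS packages typically use a weighted 3 × 3 neighbourhood, and the elevation grid here is hypothetical.

```python
import math

def slope_aspect(dem, row, col, cell_size=20.0):
    """Slope (degrees) and aspect (degrees clockwise from north) at an
    interior cell, using central differences. Row 0 is the northern edge."""
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2.0 * cell_size)  # eastward
    dz_dy = (dem[row - 1][col] - dem[row + 1][col]) / (2.0 * cell_size)  # northward
    slope = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    if dz_dx == 0.0 and dz_dy == 0.0:
        return slope, None  # flat cell: aspect undefined
    # Aspect is the compass direction of steepest descent
    aspect = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0
    return slope, aspect

# Hypothetical 3 x 3 DEM (m a.s.l.) rising 20 m per 20 m cell toward the east
dem = [[1500.0, 1520.0, 1540.0],
       [1500.0, 1520.0, 1540.0],
       [1500.0, 1520.0, 1540.0]]
slope_deg, aspect_deg = slope_aspect(dem, 1, 1)
```

A surface that rises eastward by one cell height per cell width has a 45° slope and faces west, which is what the sketch returns for the centre cell.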

Soil Data Mapping
Based on the spatial distribution of A. fasciculifolius, soil samples were taken at 0-30 cm depth in the plant habitat. In total, 85 soil samples were taken. A number of the most important physical and chemical soil parameters, including sand, silt, clay, organic matter, electrical conductivity (EC), nitrogen, organic carbon (OC), bulk density (BD), and acidity (pH), were then determined in the laboratory. The air-dried soil samples were sieved through a 2 mm mesh and prepared for analysis. Soil texture and pH were determined using the hydrometer method [45,46] and an electric pH meter [47], respectively. Soil EC was determined from a 1:1 soil-water suspension using an EC meter [48]. Soil organic carbon (OC) and nitrogen were measured using the Walkley-Black [49] and Kjeldahl [47] methods, respectively. Soil bulk density was determined using a core sampler of 8 cm diameter [49]. The soil data resulting from the laboratory analysis were mapped (Table 1) and interpolated through kriging techniques (Figure 4g-n). First, the spatial variability of each soil variable was determined and semi-variograms were calculated. Then, the most appropriate model was fitted, and cross-validation was used to confirm the appropriateness of the selected model.
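The semi-variogram step that precedes kriging can be sketched as follows. This is a minimal empirical semi-variogram under assumed settings: the sample coordinates, pH values, lag width, and number of lags are all hypothetical, and fitting a theoretical model to the result is left out.

```python
import math
from itertools import combinations

def empirical_semivariogram(samples, lag_width, n_lags):
    """samples: list of (x, y, value).
    Returns a list of (mean pair distance, semivariance, pair count) per lag bin."""
    squared_diffs = [0.0] * n_lags
    distances = [0.0] * n_lags
    counts = [0] * n_lags
    for (x1, y1, v1), (x2, y2, v2) in combinations(samples, 2):
        dist = math.hypot(x2 - x1, y2 - y1)
        lag = int(dist // lag_width)
        if lag < n_lags:
            squared_diffs[lag] += (v1 - v2) ** 2
            distances[lag] += dist
            counts[lag] += 1
    # Semivariance = half the mean squared difference within each lag bin
    return [(distances[k] / counts[k], squared_diffs[k] / (2.0 * counts[k]), counts[k])
            for k in range(n_lags) if counts[k] > 0]

# Hypothetical soil pH samples along a transect (coordinates in metres)
samples = [(0.0, 0.0, 7.9), (100.0, 0.0, 8.0), (200.0, 0.0, 8.2), (300.0, 0.0, 8.5)]
variogram = empirical_semivariogram(samples, lag_width=150.0, n_lags=2)
```

In this toy transect the semivariance grows with lag distance, which is the pattern a fitted variogram model would then summarize before kriging.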

Frequency Ratio (FR) Model
The frequency ratio (FR) model assumes that future species occurrence will take place under conditions similar to those of present occurrences. Moreover, the FR model is based on the assumption that a larger ratio means a stronger relationship between spatial species occurrence and the given geo-environmental variables [23,24,50]. The frequency ratio is a bivariate statistical model in which a ratio is computed for each class of each variable [51,52]. This approach is based on the relationship between species spatial distribution and all geo-environmental factors contributing to species occurrence in the habitats [24,53]. It is a simple and reliable model used in many fields of research [23][53][54][55][56]. This model can be expressed by the following equation [24,25]:
$$FR_{ij} = \frac{N_{pix}(S_{ij}) \,/\, \sum_{i=1}^{n} N_{pix}(S_{ij})}{N_{pix}(x_{ij}) \,/\, \sum_{i=1}^{n} N_{pix}(x_{ij})} \qquad (1)$$
where FR_ij is the frequency ratio of class i of variable x_j, N_pix(S_ij) is the number of pixels with species occurrence within class i of variable x_j, N_pix(x_ij) is the number of pixels in class i of variable x_j, and n is the number of classes of variable x_j in the study area [24,25].
An FR value of 1 indicates that the species occurs in a class in proportion to the area of that class, and an FR value less than 1 indicates a lower probability of species occurrence. Therefore, a higher ratio indicates a stronger relationship between species occurrence and the given variable's class, while a lower value means a lower probability of species occurrence in the study area [23,25].
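A minimal sketch of the frequency ratio calculation for one variable is given below. It assumes the common form of the ratio (share of occurrence pixels in a class divided by the share of all pixels in that class); the class labels and pixel counts are hypothetical and are not taken from the study.

```python
def frequency_ratios(class_pixels, occurrence_pixels):
    """class_pixels: class label -> number of pixels in that class.
    occurrence_pixels: class label -> number of occurrence pixels in that class.
    Returns class label -> frequency ratio."""
    total_pixels = sum(class_pixels.values())
    total_occurrences = sum(occurrence_pixels.values())
    ratios = {}
    for label, n_pixels in class_pixels.items():
        occurrence_share = occurrence_pixels.get(label, 0) / total_occurrences
        area_share = n_pixels / total_pixels
        ratios[label] = occurrence_share / area_share
    return ratios

# Hypothetical classes of a single soil variable
class_pixels = {"low": 1000, "medium": 3000, "high": 6000}
occurrence_pixels = {"low": 30, "medium": 18, "high": 12}
fr = frequency_ratios(class_pixels, occurrence_pixels)
```

Here the "low" class holds half of the occurrences on a tenth of the area, so its ratio is well above 1, while the "medium" class holds occurrences exactly in proportion to its area and has a ratio of 1.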

Maximum Entropy (Maxent) Model
The Maxent model is a machine learning/data mining program that evaluates the distribution probability of a species in relation to environmental factors [12,17,57]. It is a general-purpose approach for estimating the probability distribution of a species and has been shown to work well in practical studies [7]. Maxent uses presence-only data and makes spatial predictions from known occurrence records. Specifically, the Maxent model estimates the spatial probability distribution of a species that is closest to uniform while still being subject to the constraints imposed by the environmental variables. In this way, it predicts the habitat suitability for a species [5,17,20].
Maxent supports both categorical and continuous predictor data, and its output ranges from 0 (lowest suitability) to 1 (highest suitability). Suppose a random variable j has n different potential results x1, x2, . . . , xn with occurrence probabilities p1, p2, . . . , pn, respectively. The equations of Maxent are as follows [2,23,58]:
$$P_{ij} = \frac{b}{a} \qquad (2)$$
$$(P_{ij}) = \frac{P_{ij}}{\sum_{i=1}^{S_j} P_{ij}} \qquad (3)$$
$$H_j = -\sum_{i=1}^{S_j} (P_{ij}) \log_2 (P_{ij}), \quad j = 1, \ldots, n \qquad (4)$$
$$H_{j\max} = \log_2 S_j; \quad S_j \text{ is the number of classes} \qquad (5)$$
$$I_j = \frac{H_{j\max} - H_j}{H_{j\max}} \qquad (6)$$
$$W_j = I_j P_{ij} \qquad (7)$$
where a is the area of the category and b represents the area of species occurrence percentages within the given category, (P_ij) is the probability density, H_j and H_jmax are entropy values, I_j represents the information coefficient, and W_j represents the resultant weight value for the parameter as a whole (Equations (2)-(7)). The practical application of the Maxent model in habitat suitability for plant species has been demonstrated in several investigations [2][10][11][12][18][59][60]. The Maxent program used in this study was obtained from http://biodiversityinformatics.amnh.org/open_source/maxent/. It can be downloaded free of charge for scientific research activities. A detailed description of this model and a practical manual are presented by Phillips et al. (2006) [17], Elith et al. (2011) [20], and Phillips et al. (2017) [61]. In this research, 70% and 30% of the data were assigned for model training and validation, respectively [62].
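The entropy quantities named in the symbol definitions (probability density, H_j, H_jmax, and the information coefficient I_j) can be sketched for a single variable as follows. The exact form used here, including the per-class weight taken as I_j times P_ij, is an assumption based on the common index-of-entropy formulation, and the area and occurrence percentages are hypothetical. This is not the algorithm implemented inside the Maxent software.

```python
import math

def entropy_information(area_pct, occurrence_pct):
    """area_pct: share of the study area in each class (a).
    occurrence_pct: share of species occurrences in each class (b).
    Returns (H_j, H_jmax, I_j, per-class weights)."""
    p = [b / a for a, b in zip(area_pct, occurrence_pct)]  # P_ij = b / a
    total = sum(p)
    density = [value / total for value in p]               # (P_ij), sums to 1
    # Classes with zero density contribute nothing to the entropy
    h = -sum(d * math.log2(d) for d in density if d > 0.0)
    h_max = math.log2(len(p))                               # entropy of a uniform spread
    info = (h_max - h) / h_max                              # I_j in [0, 1]
    weights = [info * value for value in p]                 # assumed W = I_j * P_ij
    return h, h_max, info, weights

# Hypothetical two-class variable
uniform = entropy_information([50.0, 50.0], [50.0, 50.0])
concentrated = entropy_information([50.0, 50.0], [100.0, 0.0])
```

When occurrences follow class area exactly, the information coefficient is 0 and the variable carries no weight; when all occurrences fall in one class, the coefficient reaches 1.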

Models Validation
The FR and Maxent results were validated using the threshold-independent area under the curve (AUC) of the receiver operating characteristic (ROC) [63]. This is one of the most widely used statistics for model evaluation. AUC values range from 0 to 1: a value of 0.5 or less indicates performance no better than random, a value of 1.0 indicates perfect model performance, and values >0.9 imply very high performance. In general, a higher AUC value indicates higher model performance [60,63,64].
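The AUC can be read as the probability that a randomly chosen presence site receives a higher suitability score than a randomly chosen background site. The sketch below computes it directly from that definition, with ties counted as half; the scores are hypothetical and the pairwise loop is chosen for clarity rather than speed.

```python
def auc_from_scores(presence_scores, background_scores):
    """Rank-based AUC: fraction of (presence, background) pairs in which the
    presence site scores higher, counting ties as half a win."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))

# Hypothetical suitability scores at validation presence and background sites
presence = [0.9, 0.8, 0.4]
background = [0.7, 0.3, 0.2, 0.4]
auc = auc_from_scores(presence, background)
```

A model that ranks every presence site above every background site gives 1.0, and a model that scores all sites identically gives 0.5.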

FR Model
The results of the FR model for all the data layers are shown in Figure 5. To visualize the results, the species occurrence values were classified into categorical occurrence classes in accordance with the natural breaks method [65].
In the FR model, each factor's rating was assigned as the relationship between A. fasciculifolius occurrence and the given environmental variables. As shown in Figure 5, the annual rain less than 351.  [69,70]. The soil texture of the studied habitat, mainly composed of sandy-clay soils, permits deep root penetration into soil layers and water uptake from lower soil layers in dry seasons [71]. In the case of bulk density, the class <1.26 g cm⁻³ had a considerably greater FR value (3.48), and species occurrence was higher than in other classes. With regard to organic carbon (OC), the 1.3-1.6% class had the greatest FR value (1.25), followed by the 1.6-1.9% class with a value of 1.15. Considerable research has shown that woody perennial shrubs produce more aboveground litter and, accordingly, improve soil organic carbon and fertility [72][73][74]. For nitrogen (N), the results showed that the range between 0.23 and 0.39%, with an FR value of 3.37, seems to have a higher impact on A. fasciculifolius occurrence. The higher level of nitrogen in the habitat soil of A. fasciculifolius indicates the positive impact of legume plants such as the Astragalus genus on the nitrogen fixation process, which has been well studied [75]. The frequency ratio between A. fasciculifolius occurrence and EC indicated that the highest FR value (Figure 5) is related to the class of <0.73 dS m⁻¹ (4.77), suggesting a negative impact of salinity on the habitat of A. fasciculifolius. Several studies have shown the occurrence of Astragalus on non-saline soils in the mountainous parts of Iran [76][77][78]. According to the frequency ratio model results, A. fasciculifolius occurrence density is highest at acidity (pH) ranging from 7.90 to 8.99, with an FR value of 2.02. Due to the widespread calcareous geological formations in the study area, the studied species prefers neutral to alkaline soils [79].
Finally, the habitat suitability index (HSI) was calculated by summation of each variable's ratio value, as shown in Equation (8):
$$HSI = \sum_{j=1}^{n} FR_j \qquad (8)$$
where HSI is the habitat suitability index and FR_j is the frequency ratio value of each variable. The HSI represents the relative habitat potential for species occurrence, so greater FR values mean higher suitability of the habitat for species occurrence. The resulting HSI map is shown in Figure 6. The HSI value of the final FR map ranges from 7.81 to 32.78. The index values were classified into four classes (low, moderate, high, and very high) using the natural break (NB) method (Figure 6). The areas in the low and moderate classes cover 55.71 and 25.85% of the study area, respectively, while the areas in the high and very high HSI classes cover 14.21 and 4.23%, respectively (Figure 6). Overall, the high and very high A. fasciculifolius occurrence areas make up around 18.44% of the study area.
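The summation in the HSI step can be sketched per pixel as follows. The variable names, FR lookup values, and class breaks are hypothetical; in particular, the breaks are fixed by hand here and are not computed with the natural breaks method used in the study.

```python
def habitat_suitability_index(pixel_classes, fr_tables):
    """pixel_classes: variable name -> class label observed at the pixel.
    fr_tables: variable name -> {class label: frequency ratio}.
    Returns the sum of the FR values that apply to the pixel."""
    return sum(fr_tables[variable][label] for variable, label in pixel_classes.items())

def classify_hsi(hsi, breaks, labels=("low", "moderate", "high", "very high")):
    """Assign a suitability class from three ascending upper break values."""
    for upper, label in zip(breaks, labels):
        if hsi <= upper:
            return label
    return labels[-1]

# Hypothetical FR lookup tables for two variables
fr_tables = {
    "bulk_density": {"low": 3.0, "high": 0.5},
    "ec": {"low": 2.0, "high": 0.25},
}
breaks = (1.0, 2.5, 4.0)  # hypothetical, not natural breaks

good_pixel = habitat_suitability_index({"bulk_density": "low", "ec": "low"}, fr_tables)
poor_pixel = habitat_suitability_index({"bulk_density": "high", "ec": "high"}, fr_tables)
good_class = classify_hsi(good_pixel, breaks)
poor_class = classify_hsi(poor_pixel, breaks)
```

Applied to every pixel of the stacked variable maps, this yields the HSI surface; the share of pixels falling in each class then gives area percentages of the kind reported above.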

Maxent Model
The Maxent model for A. fasciculifolius performed well, with an AUC value of 0.826 (82.6%). Table 2 and Figure 7 show the results for the given geo-environmental variables. Under the Maxent model assumption, a pixel whose environmental conditions match those of the training occurrence data is assigned a higher value, whereas pixels with different environmental conditions are assigned lower values [10].
Based on the results generated by the Maxent model (Table 2), BD, nitrogen, and pH, with probability values of 0.82, 0.57, and 0.57, respectively, were the most useful variables with the highest impact on habitat suitability, followed by sand percent (0.49), EC (0.49), clay percent (0.40), and silt percent (0.39), while the other variables had less impact on the habitat suitability index of the study area.
Consequently, suitable habitats for A. fasciculifolius were predicted in parts of the study area with bulk density of <1.26 g cm⁻³, soil nitrogen content of >0.23%, pH of >7.90, and clay content of <45.39% (Table 2, Figure 7). The habitat suitability for A. fasciculifolius increased with decreasing electrical conductivity; however, it decreased slowly with increasing soil clay content (Figure 7). These results indicate that A. fasciculifolius prefers western/northern slopes with sandy-clay, non-saline soils. They are in line with Ghanbarian and Tayebi Khorrami's (2005) [80] understanding of A. fasciculifolius as a shrubby legume plant with a high adaptation level to the hill slopes of the Zagros Mountains in southern Iran. The suitable annual mean temperature higher than 15.85 °C and annual rain less than 332.5 mm indicated that A. fasciculifolius favors warmer and drier locations. Previous studies [33,36,38] reported that A. fasciculifolius is normally found in the arid and semiarid regions of the Fars, Bushehr, Khuzestan, and Hormozgan Provinces in southern Iran.
Overall, the contribution of soil factors including BD, nitrogen, pH, and clay, but not organic carbon, to A. fasciculifolius occurrence was relatively strong, whereas the impact of the climate variables (annual rain and annual temperature) was moderate. Meanwhile, the impact of topographic factors such as elevation, slope degree, slope aspect, and plan curvature on the occurrence of A. fasciculifolius was very weak.
Consequently, an HSI map was obtained and classified into four classes of low, moderate, high, and very high based on the given training data set, as described for the FR model (Figure 8).

Validation of the HSI Maps and Comparison between FR and Maxent Models
In the current study, the validation of habitat suitability maps was conducted using ROC curve [23]. Based on literature review, the ROC curve represents the quality of deterministic and probabilistic detection and could be a useful approach to rely on the prediction models of a plant species [11]. The ROC curves (PRC-prediction rate curve) of FR and Maxent models were illustrated in Figure 9. The area under the ROC curve (AUC) was used to evaluate and compare both FR and Maxent model performance.

Validation of the HSI Maps and Comparison between FR and Maxent Models
In the current study, the validation of habitat suitability maps was conducted using ROC curve [23]. Based on literature review, the ROC curve represents the quality of deterministic and probabilistic detection and could be a useful approach to rely on the prediction models of a plant species [11]. The ROC curves (PRC-prediction rate curve) of FR and Maxent models were illustrated in Figure 9.
The area under the ROC curve (AUC) was used to evaluate and compare the performance of the FR and Maxent models. The habitat suitability results indicated AUC values of 0.826 and 0.758 for the Maxent and FR models, respectively (Table 3). Both models provided satisfactory output for habitat suitability prediction, although the Maxent model performed better than the FR approach. In previous research, Hosseini et al. (2013) [10], Khanum et al. (2013) [11], and Sahragard and Chahouki (2015) [19] showed that the Maxent model could be a satisfactory tool with high accuracy potential for predicting the suitable habitats of plant species. On the other hand, some researchers have shown reasonable application of the FR model for potential distribution modelling of natural hazards such as fire and landslide, as well as for groundwater potential mapping [23,24,59,81]. In general, the Maxent and FR models are both good estimators of A. fasciculifolius habitat in the study area, with Maxent giving slightly better results for habitat suitability mapping. In addition, data clustering has been suggested as a new line of research in unsupervised learning that could be applied in different fields of ecology, including plant distribution and habitat suitability studies. This is a suggestion for future research efforts in species distribution modeling [82][83][84].
In general, different data mining techniques are applied in geosciences and environmental engineering to problems such as landslides, forest fires, land subsidence, floods, and gully erosion, including random forest (RF) [85][86][87][88][89][90], support vector machine (SVM) [90][91][92], multivariate adaptive regression splines (MARS) [93,94], and boosted regression trees (BRT) [95]. In a groundwater modeling case, Rahmati et al. (2016) [69] compared two data mining techniques, random forest and Maxent. Their results showed that the accuracies of the two models were 86.5% and 91%, indicating that Maxent performed better than random forest in the modelling process. It is therefore worth noting that Maxent, a traditional data mining approach compared with other more recent techniques, can still produce reasonable results. Thus, in the current study, we used the Maxent model for species distribution modelling and compared its results with those of a bivariate statistical model, the FR, which is known in the literature as a basic model for relating independent and dependent variables. The HSI maps could be a useful tool for conservation agencies and land managers for the conservation and reclamation of degraded habitats of A. fasciculifolius.
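To make the AUC comparison concrete, the following minimal sketch computes the area under the ROC curve from a set of validation points. It assumes that each validation point carries a binary label (1 for a recorded presence, 0 for an absence or background point) and an HSI score from a model. The function name `roc_auc`, the two score lists, and all values are hypothetical illustrations and are not data or code from this study. The AUC is computed here through its rank interpretation: the probability that a randomly chosen presence point receives a higher score than a randomly chosen absence point, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: iterable of 0/1 values (1 = presence, 0 = absence/background).
    scores: iterable of model scores, e.g. habitat suitability index values.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUC")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # presence ranked above absence
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical validation set: 4 presence points followed by 4 absence points.
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Hypothetical HSI scores from two models at the same validation points.
hsi_model_a = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
hsi_model_b = [0.9, 0.5, 0.6, 0.2, 0.7, 0.3, 0.4, 0.1]

auc_a = roc_auc(labels, hsi_model_a)
auc_b = roc_auc(labels, hsi_model_b)
print(f"AUC model A: {auc_a:.4f}")  # 0.8750
print(f"AUC model B: {auc_b:.4f}")  # 0.6875
```

In this toy case, model A would be preferred because a larger share of presence points is ranked above absence points. The pairwise loop is quadratic in the number of points, which is acceptable for a small validation set; a sorted-rank implementation would be the usual choice for larger ones.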

Soil as a Key Factor
Our research has shown that soil is the key factor determining plant distribution. This is a key finding, as it demonstrates that soil conservation should be applied wherever plant cover needs to be preserved. Soil is key to achieving sustainable development, as the United Nations shows in its Sustainable Development Goals [96]. Moreover, in arid and semiarid countries, proper management is needed to fight land degradation, and achieving Land Degradation Neutrality remains a challenge [97]. The soil and plant interaction must be understood in order to develop restoration programs. Research on the interaction of soils and plants shows how the fate of ecosystems depends on soil erosion [98]. Increasingly, soil erosion is found to be a key link between plants and soil: plants control soil losses, while soil quality and soil erosion determine plant distribution [99]. This is mainly due to the role that soil erosion exerts on seed redistribution and, in turn, on soil erosion [100]. The connectivity of flows and sediments, and then seeds, is a relevant concept for future research on plant distribution, as the flows determine the fate of the soil particles and then of the seeds [101].

Conclusions
In the current study, habitat suitability models were developed using the FR and Maxent approaches to evaluate and predict the current and potential areas suitable for A. fasciculifolius in marl soils of the Zagros Mountains, southern Iran. Among the 14 variables selected for model development, A. fasciculifolius habitat was mainly influenced by soil parameters, including bulk density, nitrogen content, electrical conductivity, acidity, and clay percent. The climatic and topographic variables showed a relatively weak impact on the habitat suitability of A. fasciculifolius. Both the Maxent and FR models produced reasonable and relatively accurate outputs for predicting the habitat suitability of A. fasciculifolius in our study area. The habitat suitability maps produced using the FR and Maxent approaches were classified into four classes: very high, high, moderate, and low. The verification of the results showed that both models are reasonable for habitat suitability prediction, although the Maxent model performed better than the FR approach. Therefore, the areas predicted and mapped as favorable for A. fasciculifolius could be considered in current and future conservation planning. More detailed soil maps and additional climatic layers could also be used in habitat suitability mapping of the studied species to achieve a more precise suitability map. Our findings can be applied in different ways, such as the protection of the sensitive habitats of A. fasciculifolius and the reclamation of degraded marl soil vegetation in the study area. More attention should be paid in the future to investigating environmental variables that could influence A. fasciculifolius under climate change and human-induced activities.

Conflicts of Interest:
The authors declare no conflict of interest.