Mapping the Global Potential Geographical Distribution of Black Locust (Robinia pseudoacacia L.) Using Herbarium Data and a Maximum Entropy Model

Black locust (Robinia pseudoacacia L.) is a tree species of high economic and ecological value, but is also considered to be highly invasive. Understanding the global potential distribution and ecological characteristics of this species is a prerequisite for its practical exploitation as a resource. Here, maximum entropy modeling (MaxEnt) was used to simulate the potential distribution of this species around the world, and the dominant climatic factors affecting its distribution were selected by using a jackknife test and the regularized gain change during each iteration of the training algorithm. The results show that the MaxEnt model performs better than random, with an average test AUC value of 0.9165 (±0.0088). The coldness index, annual mean temperature and warmth index were the most important climatic factors affecting the species distribution, explaining 65.79% of the variability in the geographical distribution. Species response curves showed unimodal relationships with the annual mean temperature and warmth index, whereas there was a linear relationship with the coldness index. The dominant climatic conditions in the core of the black locust distribution are a coldness index of −9.8 °C–0 °C, an annual mean temperature of 5.8 °C–14.5 °C, a warmth index of 66 °C–168 °C and an annual precipitation of 508–1867 mm. The potential distribution of black locust is located mainly in the United States, the United Kingdom, Germany, France, the Netherlands, Belgium, Italy, Switzerland, Australia, New Zealand, China, Japan, South Africa, Chile and Argentina. The predictive map of black locust, climatic thresholds and species response curves can provide globally applicable guidelines and valuable information for policymakers and planners involved in the introduction, planting and invasion control of this species around the world.


Introduction
Black locust (Robinia pseudoacacia L.) belongs to the family Leguminosae and is native to eastern North America [1,2]. Its eastern range is centered on the Appalachian Mountains and extends from central Pennsylvania and southern Ohio to northeastern Alabama, northern Georgia and northwest South Carolina. The western section of its native range includes parts of Missouri, Arkansas and Oklahoma, and populations also exist in Indiana and Kentucky (see Figure 1; [1,3]). Black locust was first introduced to Europe in the early 17th century as an ornamental tree. Since then, it has been widely introduced to temperate Asia, Australia and New Zealand, northern and southern Africa and temperate South America for wood production, as a nurse tree [4], and for large-animal forage [5]. Since black locust has a nitrogen-fixing capability and a well-developed root system, it can remediate and improve soil nutrient conditions. This makes it an important ecological pioneer species for windbreaks, erosion control and reclamation of disturbed sites [6,7]. In addition, black locust has a high reproductive potential, both through its root suckers (resprouting ability) and by seed propagation. However, it acts as a harmful invasive species in some parts of the world, for example in Great Britain, Germany, France, Japan and New Zealand [8][9][10]. Currently, no techniques are available that provide effective control of black locust invasions [11]. Therefore, in order to make practical and controlled use of this multipurpose species, it is essential to determine its global potential distribution area, significant environmental factors and species response curves, as prerequisites for top-level design in the introduction, cultivation, afforestation and invasion control of this species.
Species distribution modeling (SDM) plays a leading role in biogeography and regional ecology in estimating the niche and distribution area of a species when distribution data are limited [12][13][14][15]. Usually, SDM is only used to analyze the realized niche, although a very early SDM paper [16] discussed Hutchinson's [17] concept of the realized and fundamental niche in relation to tree species. Booth et al. [16] showed for the first time how information from introductions outside of the native range could be used to provide some indication of the fundamental niche. Many early introductions of black locust are likely to have been trials of monoculture timber plantations, and the species is invasive and has been able to compete with other species outside of its native range. Thus, here, we measure the occupied area and the potentially occupied area [14], which were referred to by Peterson et al. [18] as the occupied distributional area and the invadable distributional area, respectively.
Figure 1. Global spatial abundance of black locust specimens around the world (the grid cell is 1.5° × 1.5°) and its native range in eastern North America [1]; the total number of specimens is 32,674 (32,434 from the Global Biodiversity Information Facility (GBIF) and 240 from the Chinese Virtual Herbarium (CVH)); the GBIF database query date is 9 May 2014, and the CVH database query date is 10 May 2014.
With the development of computer hardware and Internet speed, the current availability of species and climate information sharing systems has greatly enhanced the field of SDM [19][20][21][22]. These developments have encouraged the use of a variety of algorithms for predicting the potential distribution of species. The performance of various SDM algorithms has been evaluated by numerous comparative studies, suggesting that the appropriate choice is dependent on multiple factors (e.g., sample size, species rarity, size of the species' geographic range, spatial scale and user preference) [13,23,24]. Because black locust is a wide-ranging species, known to be present at various locations around the world (the distribution data collection process is described in Section 2.1), maximum entropy modeling was used here to simulate its potential distribution. Maximum entropy modeling uses only presence data as the basis of its predictions, unlike presence/absence models, and is therefore more valuable in regions where collecting absence points has been problematic or completely neglected [25,26]. It has been shown to be an effective tool for predicting potential species distributions in a wide variety of research studies [27][28][29].
Climate is considered to be the most important environmental factor influencing the distribution of species and vegetation at a regional and global scale [30][31][32][33]. Therefore, our research has mainly concentrated on investigating the climatically suitable habitat, significant climatic factors, climatic thresholds (niche) and climatic response curves of this species. In this study, species occurrence data with a spatial resolution of 0.5° × 0.5° (approximately 55 km at the Equator; the reason for using this resolution is given in Section 2.1), together with a system collating 13 climatic indexes (detailed information is given in Section 2.2), were input into a MaxEnt model to simulate the potential distribution area of black locust around the world. The main objectives of the present study were to simulate the ecological niche of black locust, analyze its ecological and geographical distribution and investigate primary climatic factors that determine the potential distribution of this tree around the world. The results could provide theoretical support for top-level design by policy-makers and planners for the introduction, cultivation, planting and invasion control of this species around the world.

Species Occurrence Data
Global black locust occurrence data were collected from herbarium records in the Global Biodiversity Information Facility database (GBIF) [34] and the Chinese Virtual Herbarium database (CVH) [35]. A total of 32,556 herbarium records were collected from the GBIF database and 319 from the CVH database. Records without coordinates were deleted, and records from small Pacific and Atlantic islands were also deleted, because assigning coarse resolution coordinates (0.5° × 0.5°) to these records may lead to their corresponding climate data being inaccurate. A total of 32,674 specimens (32,434 from GBIF and 240 from CVH) were identified by the coordinates recorded in the database or by coordinates derived from a place name included in the database. The main reason for collating records with a coarse geographic resolution (0.5° × 0.5°) was that there may have been a sampling bias or error at a fine resolution in the GBIF and CVH occurrence records, which would produce models of lower rather than higher quality [21]. Another consideration was computation speed; many previous studies have likewise been based on a spatial resolution between 50 km × 50 km and 200 km × 200 km [20,21,36]. A total of 1174 grid cells (0.5° × 0.5°) were identified globally as containing black locust (details of the calculation process are given in Section 2.4). To clearly show the spatial abundance distribution of the 32,674 specimens around the world, the point density function in ArcGIS 9.3 (ESRI, Redlands, CA, USA) was used to draw a 1.5° × 1.5° resolution distribution map of black locust (Figure 1) with its native range in eastern North America based on Little [1].
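The reduction of many specimen coordinates to a smaller set of occupied 0.5° × 0.5° cells can be sketched in a few lines of Python. This is only an illustration of the idea, not the ArcGIS procedure used in the study: the function names, the row/column origin at 90° N, 180° W and the use of a full 360-row global grid are all assumptions made for the example.

```python
import math
from typing import Iterable, List, Tuple

CELL_SIZE = 0.5  # grid resolution in degrees, as used in the study


def cell_index(lat: float, lon: float, cell: float = CELL_SIZE) -> Tuple[int, int]:
    """Return the (row, col) of the grid cell containing a coordinate.

    Assumption: row 0 starts at 90 N and column 0 starts at 180 W.
    """
    n_rows = int(round(180.0 / cell))
    n_cols = int(round(360.0 / cell))
    row = int(math.floor((90.0 - lat) / cell))
    col = int(math.floor((lon + 180.0) / cell))
    # Clamp so that lat = -90 and lon = 180 fall into the last row/column.
    return min(max(row, 0), n_rows - 1), min(max(col, 0), n_cols - 1)


def cell_centre(row: int, col: int, cell: float = CELL_SIZE) -> Tuple[float, float]:
    """Return the (lat, lon) of the centre of a grid cell."""
    return 90.0 - (row + 0.5) * cell, -180.0 + (col + 0.5) * cell


def occupied_cells(records: Iterable[Tuple[float, float]]) -> List[Tuple[int, int]]:
    """Collapse (lat, lon) specimen records to the sorted set of occupied cells."""
    return sorted({cell_index(lat, lon) for lat, lon in records})


# Hypothetical records: the first two fall into the same cell.
example_records = [(40.1, -80.2), (40.3, -80.4), (48.9, 2.3)]
example_cells = occupied_cells(example_records)
```

Each occupied cell then contributes a single presence point (its centre) to the model, regardless of how many specimens it contains.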

Climatic Variables
According to previous studies, many variables are used to characterize hydrological-thermal climatic niches around the world. For example, 19 BIOCLIM variables were used to define global climatic niches in the WorldClim database [19,37]; three variables (annual biotemperature, potential evapotranspiration and annual precipitation) were used to define global climatic niches in the life zone model [30]; and three variables (warmth index, coldness index and humidity index) were used to represent global climatic niches in Kira's index system [38]. These three groups of climatic variables are widely used in research on the relationship between species/vegetation and climate, on a regional or global scale. The 19 BIOCLIM variables are widely used in SDM studies, as the data can be easily downloaded from the WorldClim database with no further calculation required [39][40][41]. The five integrated climatic variables (not including annual precipitation) are seldom used in SDM studies, but they can still generally provide considerable power for explaining species distributions [42][43][44][45][46].
In this study, we used the BIOCLIM variables and the five integrated climatic variables based on Holdridge's life zone model and Kira's index system. An excess of climatic variables can cause overfitting, so we selected 8 of the 19 BIOCLIM variables. A total of 13 climatic factors were used to define the climatic niches of the world (Table 1), which is sufficient for research on any species at a global scale, including black locust. Baseline climatic layers were downloaded from the WorldClim database with a 10 arc-min spatial resolution, which were generated using thin-plate smoothing splines of latitude, longitude, altitude and monthly temperature and precipitation records from 50-year climate station averages from 1950 to 2000 [19]. Then, these layers were converted to a 0.5° × 0.5° spatial resolution across the globe, and some climatic variables were calculated using the corresponding formulae (Table 1).
Potential evapotranspiration (PET; unit: mm): PER = 58.93 × ABT/AP, where ABT is annual biotemperature and AP is annual precipitation [30].
Humidity index (HI; unit: mm/°C): HI = AP/WI, where AP is annual precipitation and WI is the warmth index [46].
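As a minimal sketch of how the integrated indices can be derived for one grid cell from twelve monthly mean temperatures and the annual precipitation, the Python below applies the two formulae that survive in Table 1 (PER = 58.93 × ABT/AP and HI = AP/WI). The definitions of the warmth index, coldness index and annual biotemperature are not given in this section; the 5 °C threshold for Kira's indices and the 0-30 °C clamp for biotemperature used here are commonly cited conventions and should be read as assumptions, as should all function names.

```python
import math
from typing import Sequence


def warmth_index(monthly_t: Sequence[float]) -> float:
    """Sum of (T - 5) over months warmer than 5 degC (assumed Kira definition)."""
    return sum(t - 5.0 for t in monthly_t if t > 5.0)


def coldness_index(monthly_t: Sequence[float]) -> float:
    """Negative sum of (5 - T) over months colder than 5 degC (assumed Kira definition)."""
    return -sum(5.0 - t for t in monthly_t if t < 5.0)


def annual_biotemperature(monthly_t: Sequence[float]) -> float:
    """Mean monthly temperature after clamping to 0-30 degC (assumed convention)."""
    return sum(min(max(t, 0.0), 30.0) for t in monthly_t) / len(monthly_t)


def per(abt: float, annual_precip: float) -> float:
    """PER = 58.93 * ABT / AP, as listed in Table 1."""
    return 58.93 * abt / annual_precip


def humidity_index(annual_precip: float, wi: float) -> float:
    """HI = AP / WI, as listed in Table 1; undefined (NaN) when WI is zero."""
    return annual_precip / wi if wi > 0 else math.nan


# Hypothetical temperate site: monthly means in degC, precipitation in mm.
monthly = [-2, 0, 4, 10, 15, 20, 24, 23, 18, 11, 5, 0]
precip = 800.0
wi = warmth_index(monthly)
ci = coldness_index(monthly)
abt = annual_biotemperature(monthly)
```

For this hypothetical site the warmth index is 86 and the coldness index is −18, so the site would sit inside the warmth-index range but outside the coldness-index range reported for the core area.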

Model Selection and Evaluation
In this study, we used the software MaxEnt (Version 3.0), a machine learning algorithm designed by Phillips et al. [25]. It is written in Java, so it can be used on all modern computing platforms, and is freely available on the Internet at [47]. The main advantage of applying MaxEnt to the modeling of geographical species distributions in comparison with other methods is that it only needs presence data, besides the environmental layers. Furthermore, it is possible to use both categorical and continuous layers. Phillips et al. [25,48] found that MaxEnt outperformed the genetic algorithm for rule set prediction (GARP) [49] on observational data for North American breeding birds and two Neotropical mammals (Bradypus variegatus and Microryzomys minutus). Elith et al. [13] found that MaxEnt was one of the best of 16 different methods for modeling the distributions of 226 species in 6 different regions. Similarly, Wisz et al. [24] found that MaxEnt was one of the best predictors among 12 different models tested.
MaxEnt estimates the species distribution probability by applying "the maximum entropy principle" under constraints built from five feature types (linear, quadratic, product, threshold and hinge) derived from the environmental variables. This principle can be considered as a constrained optimization problem (where the aim is to maximize a function). The estimated MaxEnt probability distribution (Gibbs distribution) of location (x) is exponential in a weighted sum of environmental features (f) divided by a scaling constant (Z_λ, Equation 1) to ensure that the probability values range from 0 to 1 and sum to 1 [50]:
$$P_\lambda(x) = \frac{\exp\left(\sum_{i=1}^{n} \lambda_i f_i(x)\right)}{Z_\lambda} \qquad (1)$$
where n is the number of environmental features, λ is the vector of the feature weights, with real values, and Z_λ is the normalizing constant that guarantees that the probability distribution sums to one over the area of interest. MaxEnt provides output data in three formats: raw format (raw values must sum to 1), cumulative format (the value for each cell is equal to the probability of finding the species of interest at that cell plus all other cells with equal or lower probability; values range from 0 to 1) and logistic format (the probability of occurrence is estimated by including environmental variables; values range from 0 to 1) [51]. Raw values are often very small for each data point, thereby making interpretation difficult. The cumulative format is more easily interpreted when projected in a geographic information system, but these projections are not necessarily proportional to the probability of occurrence. The logistic format is currently recommended, because it allows for an easier and potentially more accurate interpretation compared to the other approaches.
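The form of the Gibbs distribution described above can be illustrated numerically. The sketch below takes a small table of feature values (one row per grid cell) and a weight vector λ, and returns the raw probabilities, which sum to one over the cells. It is not MaxEnt's implementation and does not fit the weights; the function name and the toy feature values are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence


def gibbs_distribution(
    features: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Raw probabilities exp(sum_i lambda_i * f_i(x)) / Z_lambda for each cell x.

    features: one row of n feature values per grid cell.
    weights:  the n feature weights (lambda).
    """
    scores = [sum(w * f for w, f in zip(weights, cell)) for cell in features]
    # Subtracting the maximum score avoids overflow in exp(); the shift
    # cancels between numerator and normalizing constant.
    shift = max(scores)
    unnormalised = [math.exp(s - shift) for s in scores]
    z = sum(unnormalised)
    return [u / z for u in unnormalised]


# Three hypothetical cells described by two features.
toy_features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
toy_weights = [math.log(2.0), math.log(3.0)]
toy_probs = gibbs_distribution(toy_features, toy_weights)
```

With these weights the three cells have unnormalized values in the ratio 1:2:3, so the raw probabilities are 1/6, 1/3 and 1/2. The example also shows why raw values become very small on a global grid: the same total probability of one is shared among all cells.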
MaxEnt modeling can determine the importance of environmental variables using a jackknife test and the regularized gain change during each iteration of the training algorithm. Caution must be used when employing this method, because strong collinearity among highly correlated variables can influence the results. MaxEnt allows the construction of response curves to illustrate the effect of selected variables on the probability of occurrence. These response curves consist of the specific environmental variable as the x-axis and, on the y-axis, the predicted probability of suitable conditions as defined by the logistic output. Upward trends for variables indicate a positive relationship; downward movements represent a negative relationship; and the magnitude of these movements indicates the strength of the relationship [51].
An important part of determining the ability of niche models to predict the distribution of a species is having a measure of fit. The performance of the MaxEnt model is usually evaluated by the threshold-independent receiver operating characteristic (ROC) approach, in which the area under the ROC curve (AUC) is calculated as a measure of prediction success. The ROC curve is a graphical method that represents the relationship between the false-positive fraction (one minus the specificity) and the sensitivity for a range of thresholds. The AUC has a range of 0-1, with a value greater than 0.5 indicating better-than-random performance [52]. A rough classification guide is the traditional academic point system [53]: poor (0.5-0.6), fair (0.6-0.7), good (0.7-0.8), very good (0.8-0.9) and excellent (0.9-1.0).
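To make the AUC concrete, the following sketch computes it in its rank form: the probability that a randomly chosen presence point receives a higher predicted score than a randomly chosen comparison point, with ties counted as one half. The comparison set is assumed here to be background points, and the handling of class boundaries in the rating function is likewise an assumption, since the ranges quoted above share their end points. All names are hypothetical.

```python
from typing import Sequence


def auc(presence_scores: Sequence[float], background_scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


def auc_rating(value: float) -> str:
    """Map an AUC value to the academic point scale quoted in the text.

    Assumption: each boundary value belongs to the higher class.
    """
    if value >= 0.9:
        return "excellent"
    if value >= 0.8:
        return "very good"
    if value >= 0.7:
        return "good"
    if value >= 0.6:
        return "fair"
    if value >= 0.5:
        return "poor"
    return "no better than random"


# Hypothetical predicted suitabilities at test presences and background points.
toy_auc = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2, 0.1])
```

In this toy case 11 of the 12 presence-background pairs are ranked correctly, giving an AUC of about 0.917. The pairwise loop is quadratic in the number of points; a rank-based formulation would be preferable for large test sets.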

Experimental Design and Statistical Analysis
First, we plotted all 32,674 black locust specimens' occurrence records on the world map with a 0.5° × 0.5° spatial resolution. This means that the world map was divided into grid cells (land area with 584,521 cells) with 300 rows and 720 columns. We assumed that a grid cell was suitable for black locust survival, as long as one or more specimens were present in it. Then, a binary grid map (presence/absence map) with a 0.5° × 0.5° spatial resolution was converted into points by using the raster-to-point function in ArcGIS 9.3 (a total of 1174 occurrence points). The latitude and longitude coordinates of each occurrence point (see the Supplementary Excel file) were stored in an Excel database for MaxEnt model building.
Second, we loaded the latitude and longitude coordinates of the black locust occurrence points into MaxEnt, together with all 13 climatic layers. A ten-fold cross-validation method was used to assess the accuracy of the MaxEnt model predictions. The importance of climatic factors was evaluated by using a jackknife test and the regularized gain change during each iteration of the training algorithm. Otherwise, the default settings were used to run the MaxEnt model. The logistic format of the MaxEnt output was used for mapping habitat suitability. To avoid potential problems due to the effect of strongly correlated factors on the explanation of species response curves, these curves were created by using the MaxEnt model with only the corresponding variable. These curves reflect the dependence of the predicted suitability on the selected variable, as well as any dependency resulting from correlation between the selected variable and other variables.
Finally, the MaxEnt model produced ten species-distribution probability maps based on ten-fold cross-validation. The ten probability maps were then averaged to obtain a habitat suitability map for black locust. Four arbitrary habitat categories were used: the core area (0.6-1.0), the moderately suitable area (0.4-0.6), the marginal area (0.2-0.4) and the unsuitable area (0-0.2), based on the predicted habitat suitability map. The climatic thresholds for each habitat class were analyzed using ArcGIS 9.3.
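The final step, averaging the cross-validation maps cell by cell and assigning each cell to one of the four habitat categories, can be sketched as follows. Maps are represented here as flat lists of logistic values, which is a simplification of the raster layers used in the study; the function names are hypothetical, and assigning each boundary value (0.2, 0.4, 0.6) to the higher class is an assumption, because the published ranges share their end points.

```python
from typing import List, Sequence


def mean_suitability(fold_maps: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-fold suitability maps cell by cell."""
    n_folds = len(fold_maps)
    return [sum(values) / n_folds for values in zip(*fold_maps)]


def habitat_class(suitability: float) -> str:
    """Assign one of the four habitat categories used in the study.

    Assumption: boundary values fall into the higher category.
    """
    if suitability >= 0.6:
        return "core"
    if suitability >= 0.4:
        return "moderately suitable"
    if suitability >= 0.2:
        return "marginal"
    return "unsuitable"


# Two hypothetical folds over four grid cells (the study used ten folds).
toy_folds = [
    [0.75, 0.05, 0.50, 0.25],
    [0.85, 0.15, 0.40, 0.35],
]
toy_mean = mean_suitability(toy_folds)
toy_classes = [habitat_class(v) for v in toy_mean]
```

The same averaging applies unchanged to ten folds; only the number of rows in the input grows.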

Current and Potential Distribution of Black Locust
Based on the locations of black locust specimens in the GBIF and CVH databases, the map of the present distribution is shown in Figure 1. Black locust occurs mainly in 35 countries: the United States, Canada, Mexico, Chile, Bolivia, Argentina (only one record), China, Japan, Afghanistan, Pakistan, Indonesia, Australia, South Africa, Nigeria, Ireland, United Kingdom, Portugal, Spain, France, Germany, Belgium, the Netherlands, Switzerland, Italy, Austria, Czech Republic, Poland, Hungary, Romania, Greece, Georgia, Armenia, Denmark, Norway and Sweden. Large numbers of specimens were collected in North America, Europe and Asia.
The ten probability maps obtained from the ten-fold cross-validation were averaged to obtain a habitat suitability map for black locust (Figure 2). According to the probability values, four habitat categories were defined: the core area (0.6-1.0), the moderately suitable area (0.4-0.6), the marginal area (0.2-0.4) and the unsuitable area (0-0.2). The core suitable areas for black locust are distributed mainly in the eastern United States, Europe, Australia and New Zealand. The moderately suitable areas mainly included China, Japan, South Africa, Chile and Argentina. In Europe, the core areas are the United Kingdom, Germany, France, the Netherlands, Belgium, Italy and Switzerland. Climatic threshold information for each habitat class is shown in Table 2. It shows the climatic thresholds for the core areas of black locust: a coldness index of −9.8 °C-0 °C, an annual mean temperature of 5.8 °C-14.5 °C, a warmth index of 66 °C-168 °C and an annual precipitation of 508-1867 mm.
Table 2. Climatic threshold of black locust suitable habitat map predicted by the MaxEnt model.

Model Performance and Importance of Climatic Factors
Ten-fold cross-validation was used to evaluate the accuracy of the MaxEnt model. The accuracy of the resulting model predictions is shown in Figure 3; the MaxEnt model predictions were highly accurate (AUC > 0.9), with a mean AUC of 0.9165 (0.9033-0.9309). The coefficient of variation was only 0.8% among the ten predictions, indicating that accuracy was stable across the ten cross-validation folds. The relative importance of the climatic factors is shown in Figure 4. It is apparent that the coldness index, the annual mean temperature and the warmth index are the most important climatic factors determining the distribution of black locust; these three factors explain 65.79% of the variance (15.84%-27.92% for each factor), followed by the mean temperature of the coldest month, the annual precipitation and the annual biotemperature, which explain another 23.77% of the variance (7.13%-8.96% for each factor). The remaining seven climatic factors were less important in determining the geographical distribution of black locust (collectively, they explained 10.44% of the variance, 0.13%-4.15% for each factor). The response curves of black locust under all climatic factors are shown in Figure S1. The response curves for the three most important climatic factors (coldness index, annual mean temperature and warmth index) are shown in Figure 5. It is clear that unimodal relationships exist between the habitat suitability value and the annual mean temperature or warmth index, whereas the coldness index shows a linear relationship. The response peak in black locust habitat suitability for the coldness index was at 0 °C; for the annual mean temperature, it was at 9.8 °C; and for the warmth index, it was at 100 °C.

Species Record Database and Species Modeling Tools
The current availability of species-sharing information systems around the world makes it possible to easily study the geographical distribution of plants worldwide [20][21][22]. GBIF is a free and open access biodiversity database that integrates existing worldwide biodiversity data to form a user-oriented global biodiversity service network. CVH is also a free and open access database, which integrates the herbarium data of national natural museums from 14 institutes in China. GBIF and CVH are complementary to each other with little duplication of occurrence records, which makes the occurrence records of CVH a great contribution to GBIF. The old interface of GBIF (which is still available, but not being further developed with new data) [54] includes the capability to quickly generate an SDM niche model (a variant of BIOCLIM rather than MaxEnt; for instructions, see Section 3.1 of Booth [22]). The GBIF niche model output, generated with the 19 BIOCLIM variables, appears to indicate an overly broad climatic suitability (see Supplementary Figure S2) when compared with Figure 2. This may be due to our inclusion of the CVH occurrence records from China, which has improved the SDM niche model's predictive performance. The new GBIF interface [34] no longer includes the SDM niche model option, so users are forced to perform their own SDM analyses outside of GBIF. GBIF now has so many data points that the providers of the service are concentrating on just storing the data and not providing many integrated analytical tools. Here, we provide a workflow (in Section 2.4) for predicting the potential distribution of a species based on a MaxEnt model, which may be useful for other species. We believe that if every country in the world contributed the data from their national herbaria to the GBIF database, we would be able to make more accurate predictions about our Earth in the near future by using SDM.
At an ecosystem scale, many climate-vegetation models have been used to study the relationship between vegetation and climate, such as Holdridge's life zone model [30], Kira's index model [38] and the dynamic vegetation model [55,56]. At a species scale, previous studies have normally used the peak width at half height (PWH) to study the climatic thresholds of species [45,57]. PWH generally assumes that the response of the species to climatic factors is normally distributed. For example, Ni and Song [44] used this method to study the relationship between geographical distribution and climate for Cyclobalanopsis glauca in China. SDM provides another way to study species-climate relationships at a species scale, which does not assume that the response of the species to environmental factors is normally distributed. SDM is widely used, because it can simulate a species' potential distribution and simultaneously identify significant environmental factors [13,14,25]. For example, Irfan-Ullah et al. [58] predicted the geographical distribution of Aglaia bourdillonii in India, and Li et al. [41] simulated the potential distribution of Quercus wutaishanica in China. This study uses the maximum entropy model, a very popular SDM, to predict the worldwide geographical distribution of black locust. The performance of MaxEnt reached a very high level (a mean AUC value of 0.9165) with a coefficient of variation of only 0.8%, indicating that the MaxEnt model was suitable for simulating the potential distribution of this species around the world. The response curves of black locust to dominant climatic factors show that there is a unimodal response to the annual mean temperature and warmth index and a linear response to the coldness index. The MaxEnt model does not assume that the response of the species distribution to climatic factors is a predefined normal distribution. Therefore, both types of response curve could be detected by the MaxEnt model (Figure 5 and Figure S1).

Significant Climatic Factors, Geographical Boundary and Potential Distribution Area
In this study, we found that the coldness index, annual mean temperature and warmth index are the most important climatic factors for determining the potential distribution of black locust (with these three factors determining 65.79% of the variance). According to Chuine's explanation [59] for why phenology drives species distribution, we infer that these three climatic factors may play different roles in the growth process of black locust. The coldness index (reflecting the extent of harsh climatic conditions during the non-growing season, with a relative contribution of 27.92%) can be interpreted as the cold-climate stress that drives the maturation of fruit later in the growing season, which means that black locust cannot continue to expand into higher latitudes. The warmth index (reflecting heat conditions during the species' growing season, with a relative contribution of 15.84%) can be interpreted as the high thermal climatic conditions that suppress black locust flowering and leafing early in the growing season, such that the tree cannot continue to expand into lower latitudes. The annual mean temperature, with a relative contribution of 22.03%, reflects the annual species heat demand during the growing season. Fang and Lechowicz [43] also found that heat-related climatic factors were the limiting factors of the geographical distribution of Fagus spp. around the world, especially the heat conditions during the growing season. Their conclusions were similar to ours.
The core suitable areas in the global potential distribution of black locust are mainly distributed among the eastern United States, Europe, Australia and New Zealand, whereas the moderately suitable areas are mostly in China, Japan, South Africa, Chile and Argentina (Figure 2). These countries are very suitable for afforestation with, or introduction of, black locust, but we should be aware of the high invasive potential of this species in these countries. When we compared the current data available in GBIF and CVH with the potential distribution of black locust (see Supplementary Figure S3), we could not find any occurrence records in the following 24 countries: Peru, Uruguay, Brazil, Zimbabwe, Kenya, Ethiopia, Morocco, Algeria, New Zealand, Croatia, Bosnia and Herzegovina, Yugoslavia, Albania, Macedonia, Bulgaria, Turkey, Iran, Azerbaijan, Ukraine, Belarus, Lithuania, Latvia, Estonia and Russia, most of which are developing countries. Though we do not know if black locust exists in these countries (due to the lack of web-based open access sharing of data about this species through GBIF from the national herbaria of these countries), we can conclude that this species could be introduced to these countries and may pose a potential threat of invasion there.

Climatic Threshold and Its Implication
Based on the map of the climatically suitable habitat for this species (Figure 2), we infer that the climatic thresholds of the core area of black locust are a coldness index of −9.8 °C-0 °C, an annual mean temperature of 5.8 °C-14.5 °C, a warmth index of 66 °C-168 °C and annual precipitation of 508-1867 mm. These climatic thresholds may define the most suitable climatic niche, as the occurrence points input into the MaxEnt model come from not only native areas, but also invaded and cultivated areas around the world. Petitpierre et al. [60] reported a large-scale test of climatic niche conservatism for 50 invasive terrestrial plant species from Eurasia, North America and Australia. Their findings reveal that substantial niche shifts are rare in terrestrial plant invaders, providing support for an appropriate use of SDM for the prediction of both biological invasions and responses to climate change. Therefore, we can use the climatic thresholds reported in Table 2 and the response curves in Figure S1 as references to relate to local weather station data to indicate when future invasion control should be initiated or when this species can be used for afforestation, especially on a small geographical scale.
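One way to relate local weather station data to the reported thresholds is a simple range check against the core-area values quoted above. The sketch below is an illustration only: the dictionary keys, the function name and the example station values are hypothetical, and a real screening would also draw on the other habitat classes in Table 2 and on the response curves.

```python
from typing import Dict, List, Tuple

# Core-area climatic thresholds as reported in the text (lower, upper).
CORE_THRESHOLDS: Dict[str, Tuple[float, float]] = {
    "coldness_index": (-9.8, 0.0),            # degC
    "annual_mean_temperature": (5.8, 14.5),   # degC
    "warmth_index": (66.0, 168.0),            # degC
    "annual_precipitation": (508.0, 1867.0),  # mm
}


def outside_core(station: Dict[str, float]) -> List[str]:
    """Return the names of the variables that fall outside the core ranges."""
    return [
        name
        for name, (lower, upper) in CORE_THRESHOLDS.items()
        if not lower <= station[name] <= upper
    ]


# Two hypothetical stations.
station_a = {
    "coldness_index": -5.0,
    "annual_mean_temperature": 10.2,
    "warmth_index": 90.0,
    "annual_precipitation": 700.0,
}
station_b = dict(station_a, annual_precipitation=300.0)
```

An empty result, as for `station_a`, means the station lies within all four core ranges and so may warrant attention for invasion monitoring or be considered for afforestation; `station_b` is flagged as too dry.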
Some studies have reported that the local soil moisture level is an important factor in determining the distribution and growth of black locust at a small scale [61]. Therefore, the use of just climatic factors, on a large scale and with a coarse resolution, to simulate suitable habitat may overestimate the potential distribution of this species. When more types of variables are used, such as soil factors, the niche of the species will be more accurate, and the modeled distributions can be much more credible. Guisan and Thuiller [12] stated that "a gradual distribution observed over a large extent and at coarse resolution is likely to be controlled by climatic regulators, whereas patchy distribution is more likely to result from a patchy distribution of resources, driven by micro-topographic variation or habitat fragmentation". In their hierarchical modeling framework (see Figure 3 in Guisan and Thuiller [12]), climatic factors were used to determine species ranges on a large scale with coarse resolution, while soil nutrients determined species ranges on a local scale with fine resolution. We recommend further research on the hierarchical prediction of black locust distribution with the integration of more types of environmental factors, at the national or regional scale with fine resolution in the climatically suitable countries identified here. Our research has mainly concentrated on the habitat that currently has the potential to be climatically suitable and the climatic niche of this species on a global scale with coarse resolution. The predictive map of black locust distribution, the climatic thresholds and the species response curves can provide global perspective guidelines and valuable information for policymakers and planners of introductions, planting and invasion control of this species around the world.

Conclusions
Black locust is native to eastern North America and has spread widely all over the world through transplantation and cultivation over the past 300 years. On the one hand, black locust is a tree species of high ecological and economic value (as a pioneer species, a nurse species, an ornamental species, a feed species, etc.). On the other hand, it is also a serious invasive species in some parts of the world (such as Britain, Germany, France and Japan). This study comprehensively integrates the global herbarium data of black locust (from the Global Biodiversity Information Facility and the Chinese Virtual Herbarium) with global climate index systems (the BIOCLIM system, Holdridge's life zone system and Kira's index system), using maximum entropy modeling to investigate the climatically suitable habitat, significant climatic factors, climatic thresholds (niche) and climatic response curves of this species at a global scale with coarse resolution. The realized and potential distribution areas are compared to identify the areas suitable for invasion control or afforestation of this species. The climatic thresholds and response curves could be used as references to relate to local weather station data to indicate when future invasion control should be initiated or when this species can be used for afforestation, especially on a small geographical scale. In addition, the climate system (in Section 2.2) and the simulation workflow (in Section 2.4) in this study could be useful when the global potential distribution of other species is predicted.

Figure 2. Global potential distribution area of black locust predicted by the MaxEnt model around the world (the grid cell is 0.5° × 0.5°); the value of climatic suitability is the average of a ten-fold cross-validation. The standard deviation of each grid square is 0-0.11.

Figure 3. Areas under the receiver operating characteristic curve (AUC) values of ten-fold cross-validation models (1-10 represent the model code, ascending in order by AUC value; the mean AUC value is 0.9165).

Figure S3. The difference between the current distribution knowledge of GBIF and CVH and the potential distribution of black locust. No occurrence records were found in the following 24 countries: Peru, Uruguay, Brazil, Zimbabwe, Kenya, Ethiopia, Morocco, Algeria, New Zealand, Croatia, Bosnia and Herzegovina, Yugoslavia, Albania, Macedonia, Bulgaria, Turkey, Iran, Azerbaijan, Ukraine, Belarus, Lithuania, Latvia, Estonia and Russia, most of which are developing countries.

Table 1. Description of the 13 climatic factors, with the corresponding calculation formulae and references.