Modeling the Potential Global Distribution of Phenacoccus madeirensis Green under Various Climate Change Scenarios

: The Madeira mealybug, Phenacoccus madeirensis Green, is a serious invasive pest that does signiﬁcant damage to more than 120 genera of host plants from 51 families in more than 81 countries. However, the potential distribution range of this pest is unclear, which could hamper control and eradication e ﬀ orts. In the current study, MaxEnt models were developed to forecast the current and future distribution of the Madeira mealybug around the world. Moreover, the future potential distribution of this invasive species was projected for the 2050s and 2070s under three di ﬀ erent climate change scenarios (HADGEM2-AO, GFDL-CM3, and MIROC5) and two representative concentration pathways (RCP-2.6 and RCP-8.5). The ﬁnal model indicates that the Madeira mealybug has a highly suitable range for the continents of Asia, Europe, and Africa, as well as South America and North America, where this species has already been recorded. Potential expansions or reductions in distribution were also simulated under di ﬀ erent future climatic conditions. Our study also suggested that the mean temperature of the driest quarter (Bio9) was the most important factor and explained 46.9% of the distribution model. The distribution model from the current and future predictions can enhance the strategic planning of agricultural and forestry organization by identifying regions that will need to develop integrated pest management programs to manage Madeira mealybug, especially for some highly suitable areas, such as South Asia and Europe. Moreover, the results of this research will help governments to optimize investment in the control and management of the Madeira mealybug by identifying regions that are or will become suitable for infestations.


Introduction
Insects are the most diverse and pervasive set of species on earth [1]. Invasive insect pests can significantly affect agricultural productivity, forest resources, and human health [2]. The largest food-producing countries, such as China and the United States, face the highest potential losses from invasive insects [3]. The cost of controlling and managing invasive insects is estimated to be a minimum of US$70 billion-far greater than the cost of healthcare, which is $6.9 billion per year globally [3]. More than 450 species of alien insects, most of which are herbivores, have fed on various plants since European settlement in the United States, and 14% of these insects have caused large monetary losses [4].
Scale insects are major agricultural and forestry pests that are extremely invasive, and damage plants through sap loss [5]. For example, scale insects account for only 1% of the total insect fauna of indicating the locations of province or state were removed, but data referring to the county, city, and town were retained. An initial dataset containing 124 georeferenced occurrences was compiled, of which 15 were in the Madeira mealybug's native range and 109 records were from its invasive range.
Geographical biases in the distribution records were associated with sampling near cities or other areas that are easily accessible to collectors [35,36]. This collection bias often leads to environmental bias and can adversely affect model calibration by causing the overrepresentation of certain environmental features that are characteristic of more accessible and extensively surveyed areas [37]. Sampling bias can be addressed by reducing the number of distribution sites in oversampled regions using spatial filtering [38]. To reduce the sample bias of the occurrence points, a coarse resolution (grid cell of 5 km × 5 km; consistent with the resolution of the environment variable) was created and a single point was randomly extracted from each grid cell, which included two or more sampling points [39,40]. After filtering, 117 distribution records remained, including the native range (15 points) and the invaded regions (102 points). The list of distribution sites and the map are shown in Figure 1 and Table S2. The workflow was conducted in ArcGIS 10.1 (ESRI, Redlands, CA, USA) (http://www.esri.com/).

Environmental Variables
Climatic variables are the main factors that determine species niches and have been used frequently for large-scale insect niche modeling [41,42]. At a broad scale, precipitation and temperature are two principal climate factors that constrain the species distribution range [43]. Thus, a set of 19 current global climate data points with an interpolated spatial resolution of 2.5 arc min (~5 km resolution at the equator) were used in this study, as obtained from the WorldClim Global Climate Database, version 1.4 (https://www.worldclim.org/) [44]. Minima, maxima, and average values of monthly, quarterly, and annual ambient temperature and precipitation values, recorded from 1950 to 2000, were included to represent the current climate conditions. Multicollinearity among climate variables may affect the analysis of species-environment relationships [45]. Hence, to minimize multicollinearity, a Pearson's correlation coefficient for each pairwise comparison of all 19 climatic variables was used to identify and remove highly correlated variables (R 2 ≥ |0.8|) (Table S3). Multicollinearity was assessed using ENMTools version 1.0 [46]. Five bioclimate environmental variables were retained in the final analysis: mean diurnal range (mean of monthly (max temp − min temp)) (Bio2), isothermality (Bio2/Bio7) (×100) (Bio3), mean temperature of driest quarter (Bio9), precipitation seasonality (coefficient of variation) (Bio15) and precipitation of coldest quarter (Bio19).
Given the uncertainty of future climate scenarios, impact assessments need to incorporate data from a range of climate models that are effective at simulating historical climate changes over the

Environmental Variables
Climatic variables are the main factors that determine species niches and have been used frequently for large-scale insect niche modeling [41,42]. At a broad scale, precipitation and temperature are two principal climate factors that constrain the species distribution range [43]. Thus, a set of 19 current global climate data points with an interpolated spatial resolution of 2.5 arc min (~5 km resolution at the equator) were used in this study, as obtained from the WorldClim Global Climate Database, version 1.4 (https://www.worldclim.org/) [44]. Minima, maxima, and average values of monthly, quarterly, and annual ambient temperature and precipitation values, recorded from 1950 to 2000, were included to represent the current climate conditions. Multicollinearity among climate variables may affect the analysis of species-environment relationships [45]. Hence, to minimize multicollinearity, a Pearson's correlation coefficient for each pairwise comparison of all 19 climatic variables was used to identify and remove highly correlated variables (R 2 ≥ |0.8|) (Table S3). Multicollinearity was assessed using ENMTools version 1.0 [46]. Five bioclimate environmental variables were retained in the final analysis: mean diurnal range (mean of monthly (max temp − min temp)) (Bio2), isothermality (Bio2/Bio7) (×100) (Bio3), mean temperature of driest quarter (Bio9), precipitation seasonality (coefficient of variation) (Bio15) and precipitation of coldest quarter (Bio19).
Given the uncertainty of future climate scenarios, impact assessments need to incorporate data from a range of climate models that are effective at simulating historical climate changes over the area of interest [47]. To account for uncertainty in future scenarios, three random global circulation models (GCMs) were downloaded from the Worldclim database with a spatial resolution of 2.5 arc

Modeling Approach
Numerous programs are used to assess the potential distribution of invasive insect species, such as CLIMEX [49], BIOCLIM [50], and GARP [51]. However, MaxEnt performs particularly well for modeling small and presence-only data on the current and future distribution of invasive species [52]. MaxEnt is a general-purpose machine learning method with a precise mathematical formulation; it has a number of aspects that make it well suited for species distribution modeling, such as using a regularization multiplier to control model complexity, and thus avoids overfitting [53]. Thus, numerous studies have used MaxEnt to predict the potential distribution of invasive pests under climate change, such as the common ambrosia beetle [54] and large pine weevil [55]. In addition, MaxEnt generates a response curve for each environmental predictor, which is essential for interpreting and comparing the performances of models [56]. Therefore, the potential distribution of the Madeira mealybug under current and projected climate scenarios was investigated using MaxEnt version 3.4.1 [57]. The MaxEnt model applied a machine learning method called maximum entropy modeling, which follows the principle of maximum entropy. The entropy formula is defined as below: where π is the unknown probability distribution;π is the approximation of π; X is a finite set; x is an individual element in set X; and ln is the natural logarithm. The entropy is nonnegative and is at most the natural log of the number of elements in X. Usually, feature types and a regularization multiplier are employed to optimize models and control for overparameterization in MaxEnt. These feature types represent different transformations of the covariates, including linear, product, hinge, threshold, and quadratic features, which allow the software to be optimized for the species of interest and prevent oversimplified or overcomplicated models. Hence, in order to produce the best possible model for pests, the R package "ENMeval" was used to avoid overfitting and optimize model parameters, including regularization multipliers and the combination of feature classes in the current study. "ENMeval" executes a set of automated models to compute several estimators, yielding a comparable evaluation of all possible models [58]. The regularization multipliers ranged from 0.5 to 4 in increments of 0.5, and 8 FCs were tested for the Madeira mealybug: study, the "checkerboard2" approach was used to calculate the Akaike information criterion coefficient (AICc), and the lowest delta AICc scores were selected to run the final MaxEnt models. The ENMeval package was implemented in R 3.1.3. The result are shown in Figure S1 and Table S4. The regularization multiplier is 1.5 and feature combinations of LQHPT were selected for our analysis. The logistic output of MaxEnt was used to run the model. The 10th percentile training presence logistic threshold was used to define the suitable and unsuitable habitats for the Madeira mealybug. This threshold is widely used in species distribution modeling, especially when data are collected by different observers and methods over a long period of time [59]. The potential distribution map was reclassified into four levels based on the following thresholds: <threshold indicates an unsuitable habitat; threshold-0.4 indicates low habitat suitability; 0.4-0.6 indicates moderate habitat suitability and 0.6-1 indicates high habitat suitability. Finally, 10-fold cross-validation was executed to run MaxEnt to prevent random errors from the predicted samples. In addition, a final suitability map was created by averaging the maps from the three future climate scenarios to reduce the uncertainty among GCMs.

Model Evaluation
The receiver operating characteristic (ROC) area under the curve (AUC) is often used to assess the performance of MaxEnt; however, there are some disadvantages to this approach for our study. This approach weighs the omission and commission errors equally, and AUC also does not provide information on the spatial distribution of model errors [60]. Thus, a partial receiver operating characteristic (pROC) metric approach was employed to assess the robustness of the model by Niche Toolbox (http://shiny.conabio.gob.mx:3838/nichetoolb2/) with 1000 replicates and E = 0.05 [61].

Model Performance
The test of the model performance yielded optimal results for partial ROC (mean AUC: 0.9177406) and the distribution of AUC ratios calculated as AUCpartial/AUCrandom was significantly greater than random expectations showing high performance (p < 0.0001) ( Figure S2). A "10th percentile training presence logistic threshold" value of 0.2126 was obtained for P. madeirensis. Therefore, the threshold at below 0.2126 indicates an unsuitable habitat for this species.

Importance of Environmental Variables
Of the five climatic variables that were tested, the relative contributions of each environmental variable to the final model showed that the mean temperature of the driest quarter (Bio9) was the largest contributor (46.9%) that constrained the potential distribution map of P. madeirensis. Mean diurnal range (Bio2) had the second largest contribution and accounted for 21.6% of the model (Table 1). These two factors explained 68.5% of the potential distribution range. Other factors, such as isothermality (Bio2/Bio7) (Bio3), precipitation of coldest quarter (Bio19) and precipitation seasonality (coefficient of variation) (Bio15) contributed to the potential distribution map by 13.1%, 10.8%, and 7.5%, respectively. The result of the model implies that the thermal condition was more important than other environmental variables in the potential distribution model of P. madeirensis.

Current Potential Distribution Range
The current distribution pattern of P. madeirensis was created based on the current climate and occurrence records (Figure 2). The current distribution map suggested that Southern Asia, Europe, Central Africa, most parts of South America, and Central America have suitable environmental conditions for P. madeirensis. North America and Australia showed scattered distributions of P. madeirensis. In Asia, Southern China, most parts of Vietnam, Cambodia, Laos, Southern Thailand, Sri Lanka, South and East India, northern Philippines, and some islands of Indonesia have environments with high suitability for P. madeirensis. Iceland, countries bordering the Mediterranean Sea, the United Kingdom, and Ireland also have suitable environments with high risk for P. madeirensis in Europe. Northern Egypt, Libya, Tunisia, Algeria, Northeastern Morocco and the Western Sahara, Central Africa, and Madagascar likely have highly suitable environments as well. Australia, New Zealand, and Papua New Guinea also showed high suitability for P. madeirensis. In addition, Southeastern America, Central America, and South America had high invasion risks under the current climate scenario. In its native range, most of Brazil, Guyana, Ecuador, and Bolivia were highly suitable for P. madeirensis. Based on the current climate variables, the total range or area of potential suitable habitat for P. madeirensis was predicted to be approximately 12.95 × 106 km 2 , of which about 1.71 × 106 km 2 (nearly 13.20% of the total suitable range) showed high habitat suitability, and about 4.11 × 106 km 2 (approximately 31.73% of the total potentially suitable area) showed moderate habitat suitability ( Table 2).

Current Potential Distribution Range
The current distribution pattern of P. madeirensis was created based on the current climate and occurrence records (Figure 2). The current distribution map suggested that Southern Asia, Europe, Central Africa, most parts of South America, and Central America have suitable environmental conditions for P. madeirensis. North America and Australia showed scattered distributions of P. madeirensis. In Asia, Southern China, most parts of Vietnam, Cambodia, Laos, Southern Thailand, Sri Lanka, South and East India, northern Philippines, and some islands of Indonesia have environments with high suitability for P. madeirensis. Iceland, countries bordering the Mediterranean Sea, the United Kingdom, and Ireland also have suitable environments with high risk for P. madeirensis in Europe. Northern Egypt, Libya, Tunisia, Algeria, Northeastern Morocco and the Western Sahara, Central Africa, and Madagascar likely have highly suitable environments as well. Australia, New Zealand, and Papua New Guinea also showed high suitability for P. madeirensis. In addition, Southeastern America, Central America, and South America had high invasion risks under the current climate scenario. In its native range, most of Brazil, Guyana, Ecuador, and Bolivia were highly suitable for P. madeirensis. Based on the current climate variables, the total range or area of potential suitable habitat for P. madeirensis was predicted to be approximately 12.95 × 106 km 2 , of which about 1.71 × 106 km 2 (nearly 13.20% of the total suitable range) showed high habitat suitability, and about 4.11 × 106 km 2 (approximately 31.73% of the total potentially suitable area) showed moderate habitat suitability ( Table 2). Gray, unsuitable habitat area; blue, low habitat suitability area; yellow, moderate habitat suitability area; red, highly habitat suitability area. Response curves show the quantitative relationship between environmental variables and the logistic probability of presence (known as habitat suitability) and can deepen our understanding of the ecological niche of the species. The responses of five environment variables to P. madeirensis are illustrated in Figure 3. Based on response curves, two main climatic factors that were associated with Gray, unsuitable habitat area; blue, low habitat suitability area; yellow, moderate habitat suitability area; red, highly habitat suitability area.   Figure 3. Based on response curves, two main climatic factors that were associated with high suitability were temperatures of 2-10 • C for Bio2 and 15-27 • C for Bio9. Other climate conditions also showed a relationship to high habitat suitability, such as 2.5-6 • C for Bio3, 80-100 mm for Bio15, and 40-500 mm for Bio 19.

Future Invasion Range
Model transfers to future conditions of P. madeirensis found a similar pattern to the present day ( Figure 4). The potential distribution range of P. madeirensis was larger than the current environmental variables under climate change ( Table 2).

Future Invasion Range
Model transfers to future conditions of P. madeirensis found a similar pattern to the present day ( Figure 4). The potential distribution range of P. madeirensis was larger than the current environmental variables under climate change ( Table 2).
The predicted area gains in the area of suitable habitat were predicted to be about 11.86 × 106 km 2 , which was 8.41% lower than the current conditions when compared to condition RCP 2.6-2050 ( Figure 4A and Table 2). The highly suitable area was reduced to 1.21 × 106 km 2 under these climate scenarios, of which there was a 29.23% decrease over the current highly suitable areas. The highly suitable areas were markedly reduced in Southern China, Central Africa and northeastern India. Under RCP 8.5-2050, both the suitable habit and highly suitable habit decreased relative to present-day conditions. The areas were reduced by 12.04% and 29.82% for suitable habitats and highly suitable habitats, respectively ( Figure 4B and Table 2).
Under the future climate scenario of RCP 2.6-2070, the areal extent of gains amounted to 21.26 × 106 km 2 (64.16% of the currently suitable habitat area) ( Figure 4C and Table 2). Expansions in the area signification increased with the warming of the climate, especially in Southern China and Europe. The area of highly suitable habitat expanded to 2.97 × 106 km 2 (a 73.68% increase of the currently highly suitable habitat area), and this pattern is accentuated in Southern China and Eastern Europe.
The MaxEnt predictions of gains in suitable habitat areas under the future climate scenario/year combination RCP 8.5-2070 also showed a decrease near 11.21 × 106 km 2 (a 13.43% reduction in the currently suitable habitat areas) ( Figure 4D and Table 2). The highly suitable area might reduce by 30.40% under this scenario, constricting the area to 1.19 × 106 km 2 . The restricted range was predicted to increase the highly suitable areas in Southern China and Eastern Europe, but the model predicted a constriction of suitable habit areas in some regions, such as countries around the Mediterranean Sea, South America, Africa.
In summary, the potential distribution of P. madeirensis will shrink with climate change, reducing both the total suitable habit and the highly suitable habitat in the three future climate scenarios that The predicted area gains in the area of suitable habitat were predicted to be about 11.86 × 106 km 2 , which was 8.41% lower than the current conditions when compared to condition RCP 2.6-2050 ( Figure 4A and Table 2). The highly suitable area was reduced to 1.21 × 106 km 2 under these climate scenarios, of which there was a 29.23% decrease over the current highly suitable areas. The highly suitable areas were markedly reduced in Southern China, Central Africa and northeastern India. Under RCP 8.5-2050, both the suitable habit and highly suitable habit decreased relative to presentday conditions. The areas were reduced by 12.04% and 29.82% for suitable habitats and highly suitable habitats, respectively ( Figure 4B and Table 2).
Under the future climate scenario of RCP 2.6-2070, the areal extent of gains amounted to 21.26 × 106 km 2 (64.16% of the currently suitable habitat area) ( Figure 4C and Table 2). Expansions in the area signification increased with the warming of the climate, especially in Southern China and Europe. The area of highly suitable habitat expanded to 2.97 × 106 km 2 (a 73.68% increase of the currently highly suitable habitat area), and this pattern is accentuated in Southern China and Eastern Europe.
The MaxEnt predictions of gains in suitable habitat areas under the future climate scenario/year combination RCP 8.5-2070 also showed a decrease near 11.21 × 106 km 2 (a 13.43% reduction in the currently suitable habitat areas) ( Figure 4D and Table 2). The highly suitable area might reduce by 30.40% under this scenario, constricting the area to 1.19 × 106 km 2 . The restricted range was predicted

Discussion
A global dataset of occurrences of the Madeira mealybug and MaxEnt were employed to forecast the potential distribution of this species under current and future climate conditions. The models performed very well for predicting the suitable habitat range of P. madeirensis when evaluated using partial ROC. The model indicates several climatic variables that constrain the current distribution of P. madeirensis. Most of the regions invaded by P. madeirensis have a warm climate. Regions with extreme cold and heat are unsuitable for P. madeirensis under climate change. In spite of this, the most important factor that explained the current distribution was temperature. Mean temperature of the driest quarter (Bio9) and mean diurnal range (Bio3) explained 78.5% of the current model. P. madeirensis can produce more than 250 eggs in a week at a constant temperature of 25 • C, and female mealybugs can complete development at 15-25 • C under laboratory conditions [34]. Based on response curves, our model indicated that there would be an increase in the population of P. madeirensis when exposed to a mean temperature of 15-27 • C in the driest quarter, predicting this as its most suitable temperature range. Thus, these temperature values could represent the most appropriate climatic conditions for the population growth of P. madeirensis. However, several highly suitable habitats in the current model would become unsuitable or shrink under climate change, such as its native range (South America) and its range in Europe. A possible reason for this decline in native habitat is global warming. In addition, the highly suitable range would shift to higher latitudes, such as in Europe, most likely because of rising temperatures making these areas more suitable for P. madeirensis.
The Madeira mealybug is native to South America but has already invaded other continents, such as Asia, North America, Europe, and Africa. The model predicted that climate change will significantly influence the worldwide distribution of P. madeirensis, but its effects will vary among different regions. Our model shows that P. madeirensis can invade most parts of the global landmass, such as Southern China, southern and eastern India, countries around the Mediterranean Sea, and Central Africa. The model also predicted that there would be a reduction of highly suitable and total habitat in the global range under climate versions RCP 2.6-2050, RCP 8.5-2050, and RCP 8.5-2070 relative to the present-day conditions. This result is not surprising. Many studies have shown similar results for various invasive species, such as the Giant African Snail [62], Schinus mole [63], Hyptis suaveolens [64], and Nitellopsis obtuse [65]. The range contraction that was predicted for P. madeirensis provides important information for its management and eradication. The shrinking climatic suitability of the current invaded range may make invasive species less competitive, potentially leading to their retreat [66]. Retreat by invasive pests could provide opportunities for the restoration of currently invaded landscapes, but further research is urgently needed to identify these opportunities and offer guidance for the management of pest species.
In addition, the model predicted a dramatic expansion in the future climatic model RCP 2.6-2070, which showed increased areas of suitability with plausible range shifts. These four prediction models suggest considerable uncertainty based on the current results. The uncertainty results from diverse sources, including distribution data quality, environmental datasets, modeling algorithms, model type, and model calibration parameters [67]. To minimize the uncertainty in the current study, resampling occurrences were limited to one per pixel, multicollinearity for climatic variables was minimized, and overparameterization was controlled for in MaxEnt. These methods were employed to minimize the uncertainty of the model [68].
In response to climatic change, most favorable habitat ranges shift based on the current climate model, including in its native and invasive ranges. The change of suitable habitat offers information for scientists or farmers looking to implement effective management strategies to control P. madeirensis. Many previous studies have focused on molecular identification [11,69], physiological characteristics [13,34], and the chemical and biological control of P. madeirensis [13,70]. However, very few management strategies incorporate the spread of pests and the shift of distribution range due to climate change. Our model provides preliminary evidence for developing monitoring strategies to detect future infestations in currently uninfected regions by government agencies. Early warning monitoring is a critical step to develop management strategies for pests [71,72]. Species distribution models provide a cost-effective tool that can identify the most suitable areas for potential invasions. For example, our analysis suggests that P. madeirensis would find a highly suitable environment in Madagascar, Malaysia, Australia, and Iceland. In addition, some areas also show moderate suitable environments, into which P. madeirensis can expand under climatic change, such as Kazakhstan, Kyrgyzstan, and Azerbaijan. Highly and moderately suitable habitat areas predicted from our model can be used to formulate strict quarantine measures to prevent invasion. The results of the current research can help policy makers, extension organizations, and farmers to make adapted agricultural management decisions today when choosing planting areas that are less susceptible to the pest. In addition, prediction maps can give farmers information that can be used to help secure food production in the future when the Madeira mealybug is likely to expand its distribution range under climate change.
In summary, the results of the current study provide important information for identifying infected areas and making management decisions to control P. madeirensis globally.

Conclusions
The current study projected the potential distributions of the Madeira mealybug under current and future climate change scenarios. First potential distribution maps for this pest were created based on current and future climate scenarios, and the results suggested that the future potentially suitable habitat area is far greater than the current distribution range. The current research provides a reference frame for quarantine departments and policymakers to manage and develop policies to control this pest. In addition, strict quarantine measures should be employed in areas where the Madeira mealybug has not yet been reported based on the results of current research. Other variables, such as the species' dispersal capacity, should be determined in future studies.