Next Article in Journal
Intergenerational and Intersexual Differentiation in Respiratory Metabolic Rates of Schlechtendalia chinensis: A Comparison Across Sexuales, Parental Sexuparae, and Progeny Fundatrices
Previous Article in Journal
Rapid Detection of Philaenus italosignus Drosopoulos & Remane, 2000 (Hemiptera: Aphrophoridae) with Real-Time PCR Probe LNA Technology
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Ensemble Distribution Modeling of the Globally Invasive Asian Cycad Scale, Aulacaspis yasumatsui Takagi, 1977 (Hemiptera: Diaspididae)

by
Samuel Valdés-Díaz
1,2,*,
Reyna Tuñón
1,
Dilma Castillo
1,
Alieth Sanchez
1,
Brenda Virola-Vasquez
1,
Patricia Esther Corro
1,
Francisco Serrano-Peraza
1,
Bruno Zachrisson
1,3,
Jose Loaiza
1,4,
Rodrigo Chang
1 and
Luis Fernando Chaves
1,5,6
1
Programa Centroamericano de Maestría en Entomología (PCMENT), Universidad de Panamá, Ciudad de Panamá Estafeta Universitaria, Apartado 3366, Panamá 4, Panama
2
Biodiversity Consultant Group (BCG), Hato Pintado, Ciudad de Panama PA 07126, Panama
3
Instituto de Innovación Agropecuaria de Panamá (IDIAP), Chepo, Ciudad de Panama PA 07052, Panama
4
Instituto de Investigaciones Científicas y Servicios de Alta Tecnología, Panama, PO Box 0843-01103, Panama
5
Department of Environmental and Occupational Health, School of Public Health, Bloomington, IN 47408, USA
6
Department of Geography, Indiana University, Bloomington, IN 47405, USA
*
Author to whom correspondence should be addressed.
Insects 2025, 16(10), 1016; https://doi.org/10.3390/insects16101016
Submission received: 16 August 2025 / Revised: 7 September 2025 / Accepted: 19 September 2025 / Published: 30 September 2025
(This article belongs to the Section Insect Pest and Vector Management)

Simple Summary

Aulacaspis yasumatsui is an invasive scale insect of great economic importance to the horticulture industry, affecting plant species in the Cycas genus. Prior attempts to model the ongoing spatial expansion of A. yasumatsui were based on the MaxEnt algorithm and WorldClim bioclimatic covariates to predict shifts in its distribution owing to future climate scenarios. Here, we develop the first ensemble species distribution model including recent occurrence records and environmental predictors not analyzed in previous efforts. We combine the outcomes of six different algorithms in order to improve model accuracy and minimize model error. The findings support the likely invasion of A. yasumatsui in tropical and subtropical areas of Africa and Central America. Dispersal is also highly possible through northern areas of Europe and China.

Abstract

Species distribution models (SDMs) have become an important tool to inform conservation and pest surveillance programs about the potential biological invasion of insect pests. Nonetheless, to be operational, SDMs need to incorporate multiple environmental covariates and a representative number of occurrence points depicting the species’ ecological niche. The algorithm of choice, model of choice, and comparison can also have a great effect on the final prediction output. We created a dataset based on previously published records, plus 36 new occurrences and 37 environmental predictors, to generate the first global ensemble distribution model for Aulacaspis yasumatsui. We employed a strategy that aggregates SDMs with the best performance (i.e., greater accuracy) from six different algorithms, resulting in an averaged and weighted model, i.e., the ensemble model. We then selected models from algorithms whose true skill statistic (TSS) was above 0.5 in order to map the potential global distribution of A. yasumatsui. Our results suggest that covariate selection and the individual model algorithms used in the ensemble may be more important for achieving an accurate SDM than the number of occurrence points.

1. Introduction

Invasive scales, aphids, whiteflies, and leafhoppers have become significant global threats to agriculture and horticulture, driven largely by the movement of exotic plants and agricultural products through increased international trade [1]. Biological traits of these invasive insects, including high reproductive capacity, broad host ranges, and climatic adaptability, enable their rapid establishment and spread into novel environments, necessitating enhanced monitoring and predictive modeling to mitigate their ecological and economic impacts [2].
Aulacaspis yasumatsui Takagi, 1977 (Hemiptera: Diaspididae), commonly known as the Asian Cycad Scale, is an armored scale insect with a particular preference for Cycas revoluta Thurnberg, 1782, a member of the family Cycadaceae that has been hypothesized as its main host plant [3]. The first observation of A. yasumatsui feeding on Cycas revoluta was at the type locality of Bangkok, Thailand, in 1972 by Takagi [4]. According to the original description, A. yasumatsui is native to the tropical and subtropical areas of Southeast Asia. However, the recent accumulation of new records from around the world has confirmed that A. yasumatsui is also able to feed on plants of the family Zamiaceae, affecting mainly the genera Dioon, Encerphalartos, Macrozamia, Microcycas, and Stangeria [5,6,7,8].
Aulacaspis yasumatsui has been classified as an exotic invasive pest according to the Crop Protection Compendium of CABI, 2021. Aulacaspis yasumatsui has been reported in several countries where Cycas revoluta was previously introduced through the cycad horticulture trade [8]. A recent study on the genus Aulacaspis, including yasumatsui, highlights the economic importance of this insect group as pests of sugarcane, ornamental grasses, and herbs on a global scale. Due to the increased international trade of Cycas revoluta, the expanded distribution of A. yasumatsui now includes records from across the Americas [9,10,11,12,13,14,15,16], Asia [17,18,19], Europe [19,20,21], and Africa [22,23,24]. The new distribution records warn non-native countries about the negative impact that A. yasumatsui could have on the Neotropical members of the family Zamiaceae, of which many species have been declared as critically endangered or vulnerable by the IUCN Red List of Threatened Species, and call for special considerations in the elaboration of contingency plans against the invasion of this insect pests. Despite the increasing number of countries being invaded by A. yasumatsui in the last 10 years and the hidden economic hazard posed by this pest to the cycad horticulture industry around the world, little effort has been made towards gaining a better understanding of the potential spread of this species outside its native range [8,18]. In particular, we know little about which ecological variables, measured with remote sensing, other than climatic factors might impact the likely geographic expansion and habitat suitability of this species globally [25].
Species distribution models (SDMs) are widely used in ecology to predict the distribution of suitable habitat for a particular species of interest [26,27]. Traditionally, species distribution models (SDMs) have relied upon species occurrence points and values of selected environmental covariates as data inputs, which has two implications for their ability to estimate habitat suitability parameters [28]. Species occurrence datasets may fail to adequately capture the fundamental niche for taxa with a high dispersal capability, which can lead to populations in sink habitats [29], whereas dispersal limitation may exclude species from large areas of suitable habitat [16,30]. These issues limit the ability to fully estimate fundamental niches from observed species occurrence types of data. A way to deal with these limitations is to evaluate and compare the outcomes of several model algorithms and select the best prediction based on a set of performance metrics [31,32,33]. Alternatively, the selection of just one single ensemble model combining the strengths and caveats of all models can be used to make accurate estimates of species distribution, provided that the model is stable and robust.
Previous efforts to predict the continental distribution of A. yasumatsui through SDMs relied on the MaxEnt algorithm, a machine learning method that predicts a species’ existence in a coordinate-based space built with the species occurrence (or presence-only) points and the environmental background where the species has not been reported [34,35]. This approach makes it almost impossible to assess specificity since no information on species absence is used [28,36,37]. Ensemble models of species niche distributions are commonly used to mitigate the impact of model performance variability across multiple species, localities, and datasets because no single method consistently outperforms those available, and ensemble methodologies can increase the accuracy of SDMs and reduce their error [38,39].
The performance of species distribution models (SDMs) is influenced not only by the quantity of occurrence records but also critically by their spatial arrangement and the design of the model evaluation [30]. Traditional evaluations have demonstrated that model accuracy can be artificially inflated due to the spatial clustering of data points and the methods used for validation, with limited gains achieved by simply increasing the sample size when data are dense yet spatially biased [26,34]. Presence-only modeling approaches, in particular, require careful handling of absence or pseudoabsence data to prevent overly confident predictions, while recent studies highlight that failing to explicitly address sampling bias and evaluation strategies can lead to unrealistic SDMs [30,34].
Equally vital to SDM accuracy is the selection of predictor variables and the modeling approach. Ecological theory and empirical benchmarks underscore the importance of choosing predictors that are both ecologically meaningful and minimally redundant, as well as managing collinearity among variables, to enhance the model’s ability to generalize across different spatial and temporal contexts [22,23,37]. The choice of algorithm also plays a significant role, affecting the flexibility of response shapes and the risk of overfitting. Ensemble modeling frameworks that integrate diverse algorithms, such as Generalized Linear Models, Boosted Regression Trees, Random Forests, and Artificial Neural Networks, can improve prediction stability when combined with rigorous model selection criteria and robust evaluation metrics like the Area Under the Curve (AUC) and true skill statistic (TSS) [29,31,37]. Building upon these foundations, this study explicitly investigates how the number of occurrence records, reduction in covariates, and use of an ensemble of several algorithms influence the predictive performance and geographic realism of an SDM for the invasive scale insect Aulacaspis yasumatsui.
Here, we created a dataset based on 158 A. yasumatsui occurrence records, including 122 previously reported and analyzed occurrences, plus 36 new occurrences. As covariates, we included 21 WorldClim variables plus 18 additional environmental predictors, including remotely sensed data not considered in previous A. yasumatsui SDM efforts. We started with 37 covariates and 10 sets of pseudoabsence points, i.e., simulated locations where the species is presumed absent across the planet, where the number of pseudoabsences was the same as that of A. yasumatsui occurrences. We went through a process of model selection in order to reduce the number of covariates, thus avoiding the overfitting of the SDM. For the ensemble model, we employed a strategy that aggregates niche models with the best performance (i.e., greater accuracy) from six independent algorithms, resulting in an averaged and weighted model based on each model’s predictive performance, which has been shown to improve ensemble predictions [39,40]. We repeated this procedure for a model based on the same set of occurrences previously analyzed [32], which we compared with our larger dataset and whose analysis showed that model selection might be more important than the abundance of occurrence points to model the global distribution of an insect.
Here, our goal was to generate the first robust and comprehensive global ensemble distribution model for A. yasumatsui that can be used as a resource for monitoring, preventing, and controlling the threats imposed by this insect pest to native host plants and the horticultural industry around the world, specifically Cyca spp. and Zamiaceae.

2. Materials and Methods

2.1. Occurrence Records

We compiled Aulacaspis yasumatsui records from GBIF (GBIF Occurrence Download. Available online: https://doi.org/10.15468/39omei accessed via GBIF.org on 5 March 2022), CABI, and the peer-reviewed literature (1977–2022); taxonomy was harmonized to A. yasumatsui Takagi; ambiguous “cycad scale” mentions were excluded unless explicitly confirmed. Coordinates were standardized to WGS84 and filtered to remove zeros, country/province or institution centroids, marine points, and entries discordant with verbatim locality. We kept only records with ≤5 km coordinate uncertainty or demonstrably precise localities; exact duplicates were removed using the “spThin” function from the R package spThin (version 0.2.0) [41], and multiple points per 2.5′ grid cell were collapsed to a single occurrence. To limit clustering at the environmental resolution (~4.6 km at the equator), we applied 5 km spatial thinning.

2.2. Data Sources and Processing of Environmental Data

We built a multilayer raster incorporating 37 environmental covariates to fit species distribution models for A. yasumatsui under current environmental conditions. These predictors were chosen due to their general physiological significance to insects and because recent research has revealed that such covariates could be important in shaping insect spatial distributions [42].
Environmental predictors were obtained from WorldClim v2.1 [43] at a 2.5′ resolution (19 bioclimatic variables derived from 1970 to 2000 monthly temperature and precipitation) and from Google Earth Engine (GEE [44,45]). From GEE, we accessed TerraClimate (monthly climate and climatic water balance at 1/24° ≈ 4 km; 1958–31 December 2021 [46]), the NOAA Climate Data Record (CDR) of AVHRR Normalized Difference Vegetation Index (NDVI) based on daily images from 24 June 1981 to 31 December 2021 at a resolution of 0.05° ≈ 5.66 km [47], MODIS/Terra Land Surface Temperature/Emissivity (daily, 1 km resolution, 1st January 2002 to 31 December 2021 [48]), GTOPO30 (completed in 1996, global elevation at 30-arc-second ≈ 1 km [49]), and ALOS PALSAR Forest/Non-Forest (25 m, annual estimates from 2009 to 2017 [50]). To represent forest cover, we used PALSAR’s categorical layer (forest, water, non-forest) which follows the FAO criterion (>0.5 ha with >10% canopy cover). The time series of rasters collected with the Google Earth Engine were further processed with the reducer command to estimate the median for each pixel. All rasters were re-projected to WGS84 (EPSG:4326), snapped to a common 2.5′ grid (~4.6–5 km at the equator), and resampled using bilinear interpolation for continuous variables (WorldClim, TerraClimate, NDVI, elevation) and nearest neighbors for categorical data (PALSAR) [51,52]. Higher-resolution layers were aggregated to 2.5′ by area-weighted mean (continuous) or majority class (categorical).

2.3. Ensemble Algorithms, Pseudoabsences, and Covariate Selection

To build the ensemble distribution model [53], we followed a methodology similar to the one described by Rhodes et al. [54]. We employed six algorithms for modeling the global distribution of A. yasumatsui. These included Logistic Generalized Linear Models (L-GLMs), Multiple Adaptive Regression Splines (MARSs), and three tree-based methods which include Classification and Regression Trees (CATs), Generalized Boosted Regression Models (GBMs), and Random Forests (RFs), and we also employed Artificial Neural Networks (ANNs). For detailed explanations of each one of these algorithms, we refer the interested reader to Rhodes et al. [54]. All analyses were completed using the biomod2 package for R [55]. This package was selected for its ability to incorporate several SDM modeling techniques and reproducibility settings. Results from all algorithms were cross-validated using fivefold cross-validation, i.e., where one fifth of the observations are left out of model fitting and used to evaluate the model each time the model is fitted using an incomplete dataset.
To fit the model, we generated 112 pseudoabsence points. While it is possible to fit SDMs with presence-only data, presence–absence data has shown superior performance [56]. However, true absence points are rare and particularly difficult to confirm for mobile species [57]. Without true absence points, we rely on an artificial set of absence points, termed “pseudoabsence points” [57,58]. There are many different strategies for generating these points, so we refer to the suggestions in [57], which details the best sampling method based on the algorithms used. We started with 37 covariates and went through a process of model selection in order to have a reduced number of covariates that avoided overfitting. For the first stage of covariate selection, we generated 10 sets of 112 random pseudoabsence points, a number equal to the occurrences we had after thinning the dataset and a balanced number (i.e., number of occurrences = number of pseudoabsences) that ensures the best predictive performance for machine learning algorithms [59,60,61]. We then ran each one of the six algorithms three times, and each of the three times, we included 100 permutations for each covariate at a time. These permutations allowed us to estimate covariate importance, a measurement of the drop in explained variance or prediction accuracy by a given covariate, which does not involve the use of Pearson’s correlation [57,62].
Based on this preliminary analysis, we chose to include all covariates whose importance was above 5%. We repeated this process one extra time, until we selected seven covariates (Table 1). For the seven selected covariates, we generated ten additional pseudoabsence datasets of 112 locations, this time employing the Surface Range Envelope (SRE) method in the biomod2 package. The SRE method was used to identify a range of suitable environmental conditions [63]. Pseudoabsence points were then sampled randomly outside of that area, as they were considered environmentally dissimilar to the location of presence points. The resulting datasets of absences and SRE-generated pseudoabsences were run one time for each one of the six algorithms, including 100 permutations for each covariate, to assess their importance.
To visualize the results from the median of the 10 datasets used to generate the ensemble, we employed strip plots. These plots show curves for the probability of occurrence, as a function of each covariate, for each algorithm used to build the ensemble SDMs. The strip plots were generated by producing a prediction from a model using a new dataset in which only one covariate was allowed to vary in a sequence between the minimum and maximum extremes, while the other covariates were fixed at their median values [64].
To evaluate the performance of ensemble SDMs and covariate selection, we employed the Area Under the Curve (AUC) and the true skill statistic (TSS) evaluation statistics. The AUC estimates an algorithm’s ability to correctly differentiate between presence and absence locations, with a value of 0.5 suggesting that model performance is no better than random chance. Meanwhile, the TSS works similarly and is equal to the sum of model sensitivity and specificity minus one [55]. Similar, or higher, AUC and TSS values suggest that models have similar, or improved, performance. We used models from algorithms whose TSS was above 0.5 to generate global ensemble SDM maps of A. yasumatsui based on the 112 occurrence records we compiled. As all individual models had a TSS > 0.5, we weighted all the resulting 60 projected predictions for A. yasumatsui habitat suitability, using the TSS value from each individual model as a proportional weight in the ensemble. We employed the TSS as this metric is independent of the prevalence of occurrences while being interpretable like the AUC [65]. Finally, for comparison, we also fitted the six algorithms ten times, employing the seven selected covariates, to a restricted dataset that only considered the 86 occurrence records from [35] and 86 pseudoabsences also generated using the SRE method.

3. Results

We obtained 122 occurrence points for Aulacaspis yasumatsui, including 36 new records not included in the 86 records from a prior global analysis [35]. Following data cleaning and thinning, 112 records from 20 countries remained, the majority of which were distributed in the United States (46 occurrences), Mexico (15 occurrences), and Taiwan (10 occurrences), see Figure 1. The correlations between the various covariates at the pixels where A. yasumatsui was collected are presented in Figure 2.
The Pearson’s correlogram (Figure 2) shows clusters of high covariate correlations (r > 0.6), particularly for the temperature-related covariates, i.e., the Annual Mean Temperature (AMT), Mean Temperature of the Wettest Quarter (MTWQ), Mean Temperature of the Coldest Quarter (MTCQ), Minimum Temperature of the Coldest Month (MTCM), and Minimum Temperature (MINT). These patterns of association called for a process of covariate selection based on covariate importance, whose results are presented in Figure 3.
Of the thirty-seven environmental covariates initially investigated, only seven were selected for inclusion in the final ensemble species distribution model. During model selection, covariate importance was over 5% for only ten environmental covariates (Figure 3A). After model selection (Figure 3B), the Actual Evapotranspiration (ACEV) and Mean Temperature of the Coldest Quarter (MTCQ) were the two most important covariates with roughly 30% and 18% of importance, respectively. Of the five remaining, the next most important environmental covariates were the Climate Water Deficit (CLW), Vapor Pressure (VAPRE), Reference Evapotranspiration (REV), Temperature Seasonality (TS), and Temperature Annual Range (TAR) (Figure 3B).
Species distribution models were created using six different algorithms and 112 occurrences to estimate habitat suitability, including 36 new records of A. yasumatsui. The ROC and TSS scores for the models reported in Table 2 indicate better performance for the final model with seven covariates (Figure 3B). Random Forest (RF) and Generalized Boosted Regression Models (GBMs) showed the best performance with AUC scores of 0.99 ± 0.13 and 0.98 ± 0.01 plus TSS scores of 0.91 ± 0.08 and 0.89 ± 0.08, respectively. Artificial Neural Networks (ANNs; AUC = 0.93 ± 0.05, TSS = 0.82 ± 0.12) and Classification and Regression Trees (CTAs; AUC = 0.91 ± 0.04, TSS = 0.81 ± 0.08) also showed good performance but were, comparatively, the worst-performing algorithms. Using the same six algorithms and seven covariates but only the 86 occurrence records from Wei et al. [35] generally produced lower AUC values but higher TSS scores in all models (Table 3).
All six modeling algorithms consistently predict an increase in the global probability of occurrence for Aulacaspis yasumatsui with rising Vapor Pressure (VAPRE) levels, particularly at values exceeding 250 KPa (Figure 4A). For the Logistic Binomial Model (LBM), Multivariate Adaptive Regression Splines (MARSs), and Random Forest (RF), Actual Evapotranspiration (ACEV) (Figure 4B) exhibits a strong negative relationship with occurrence probability, especially when ACEV surpasses 900 mm/month. This negative correlation is less pronounced when using Random Forest and Boosted Regression Trees (BRTs), indicating reduced model sensitivity to this covariate.
Temperature Seasonality (TS), which quantifies the extent of temperature fluctuation over a year (Figure 4C), generally demonstrates limited influence on predicted suitability across most algorithms. However, the LBM strongly favors low-TS environments, suggesting that stable annual temperature regimes may be more suitable.
The Temperature Annual Range (TAR) (Figure 4D), the differential between the warmest and coldest months, shows minimal effect on occurrence probability across the majority of models. The Logistic Binomial Model (LBM) reveals a distinct unimodal response, with habitat suitability increasing at TAR values above 400 (equivalent to 4 °C), peaking around 1000 (10 °C), and declining at values beyond 1500 (15 °C). This pattern suggests that A. yasumatsui may favor environments with moderate annual temperature fluctuations, while extreme variability is less suitable.
Artificial Neural Networks (ANNs) follow a similar pattern, with occurrence probability sharply rising between 600 and 1000 (6–10 °C) and then decreasing at higher TAR values. Meanwhile, MARSs and Boosted Regression Trees (BRTs) predict relatively stable suitability across the TAR spectrum, indicating a limited role of this covariate in those models. Random Forest shows high occurrence probabilities throughout most of the TAR range but exhibits a gradual decline past 1500 (15 °C), suggesting some sensitivity to higher temperature variability.
The Mean Temperature of the Coldest Quarter (MTCQ, in °C × 100) emerged as an influential predictor of Aulacaspis yasumatsui occurrence in select models (Figure 4E). The Logistic Binomial Model (LBM) exhibited a sharp threshold response, with suitability declining steeply above 350 (3.5 °C), suggesting a lower thermal tolerance limit during colder seasons. In contrast, models such as ANNs, MARSs, BRTs, and Classification Trees maintained high suitability across the full MTCQ range, indicating limited sensitivity to cold-season temperatures. Random Forest showed a gradual decline in suitability above 300 (3 °C).
Climate Water Deficit (CLW), a measure of atmospheric water demand unmet by available moisture, showed varying degrees of influence on the predicted occurrence of Aulacaspis yasumatsui across models (Figure 4G). The LBM and MARSs predicted a strong, monotonic increase in suitability with rising CLW, plateauing above 700 mm, indicating potential adaptation to drier conditions. ANNs showed a similar trend, with a sharp rise in occurrence probability at intermediate CLW values. Classification Tree models exhibited a threshold-based response, while BRTs and Random Forest maintained high suitability across the entire CLW gradient. These patterns suggest that A. yasumatsui may be tolerant of water-limited environments, particularly as inferred from models sensitive to continuous climatic gradients.
Two final ensemble models were developed: one utilizing previously published occurrence data [25] and the other incorporating an updated occurrence dataset. These ensemble predictions are presented in Figure 5A,B, with spatial differences between models highlighted in Figure 5C. Compared to earlier modeling efforts, our updated ensemble predicts higher occurrence probabilities (depicted in warm tones) in northern Europe, the Mediterranean basin, northern China, Japan, and parts of eastern Australia. In the Americas, increased probabilities are projected along the Pacific coast from Peru to Chile, as well as across fragmented regions of the Caatinga, Chaco, and Savanna biomes and within the Cascade Range of western North America.
Conversely, our updated ensemble model predicts reduced occurrence probabilities (cool tones) relative to previous models in regions such as the Amazon basin, the Appalachian Mountains, the dry tropical zones of Africa, and northern Australia, including Arnhem Land and the York Peninsula. Regions with little to no difference between models are represented in green.
Despite both ensemble models following the same modeling workflow, differences in prediction accuracy are attributed primarily to the number of occurrence records. The updated ensemble, built using a broader covariate set (n = 122), yielded improved predictive performance. These findings suggest that even with fewer occurrence records, robust species distribution models can be achieved through careful management of multicollinearity, integration of diverse algorithms, and ensemble modeling approaches.

4. Discussion

The global spread of Aulacaspis yasumatsui is largely driven by the international trade of Cycadaceae and Zamiaceae, and it has raised major concerns for both native and cultivated cycads. As such, the global expansion of A. yasumatsui is yet another expression of a relational geography where circuits of capital [65,66] shape a space for the invasion of species into new habitats, which can be harmful to native flora and fauna, in a pattern that is similar to what has been observed with pandemics and other emerging diseases [67,68,69]. Biologically, this spread occurs because the primary hosts are cycads (Cycadaceae, Zamiaceae), present as both native stands and widely planted ornamentals; urban gardens, nurseries, and wholesale hubs concentrate hosts and act as stepping stones that increase propagule pressure, shorten dispersal distances, and enable rapid establishment along trade and landscaping corridors.
Previous species distribution models (SDMs), primarily using the MaxEnt algorithm, failed to predict high suitability in regions such as Mesoamerica and parts of Africa [34,35], where the species is now established [9,22]. Our updated ensemble model, informed by a broader and more recent occurrence dataset and a refined selection of environmental predictors, successfully captures these newly invaded areas, particularly in Central America and the Caribbean, where suitable habitats in Mesoamerica and the Caribbean were predicted even with the reduced dataset of 86 records from a previous study [35].
A key factor driving the enhanced accuracy of our model is the integration of temperature-related covariates across multiple dimensions, reflecting A. yasumatsui’s complex climatic niche. Among these, the Mean Temperature of the Coldest Quarter (MTCQ) proved particularly critical, accounting for approximately 20% of covariate importance. The Logistic Binomial Model (LBM) revealed a sharp decrease in suitability above 3.5–4 °C, suggesting that populations are constrained by thermal thresholds during the coldest months. This sensitivity may explain the absence of established populations in colder temperate regions, despite apparent environmental suitability under less detailed models.
In contrast, the Temperature Annual Range (TAR) and Temperature Seasonality (TS)—both representing variability rather than absolute values—had subtler effects across models. However, both the LBM and Artificial Neural Networks (ANNs) showed increased occurrence probabilities at moderate TAR values (approximately 6–10 °C), which align with subtropical climates typical of Mesoamerica. These gradients suggest that moderate thermal oscillation, rather than stable tropical warmth or extreme fluctuations, may provide ideal conditions for establishment. Earlier models likely overlooked this nuance due to limited covariate resolution and algorithmic sensitivity.
The role of Actual Evapotranspiration (ACEV), which explained nearly 30% of the model’s predictive power, further supports this inference. High ACEV values, typically found in ever-wet equatorial forests such as the Amazon and Congo basins, were associated with reduced suitability. This helps explain why these regions remain largely free of A. yasumatsui, despite their apparent climatic favorability. Conversely, Climate Water Deficit (CLW) and Vapor Pressure (VAPRE), each contributing approximately 15%, showed positive associations with occurrence probability in several models, particularly the LBM and ANNs, suggesting that A. yasumatsui can persist under moderate water stress and high atmospheric aridity.
Taken together, the negative association with ACEV and the positive, saturating responses to CLW and VAPRE, captured most clearly by model algorithms sensitive to continuous gradients (ANNs, LBM), indicate that suitability for A. yasumatsui increases from per-humid toward sub-humid conditions and then plateaus near the dry margin.
The newly predicted suitability across Central America, including Panama, Mexico, and Costa Rica, reflects the convergence of these optimal conditions: moderate annual temperature ranges and intermediate water deficits. The previous failure to predict these regions can be attributed to both the narrower scope of environmental covariates and the use of a single algorithm (MaxEnt), which lacked sensitivity to complex interactions and nonlinear responses [38,39,42]. Within the sample sizes available here (tens to low hundreds), predictor choice and the multi-algorithm ensemble had a larger effect on performance than adding more occurrences. Even with the 86-record subset [35], models that explicitly represented cold-season temperature, moisture balance, and Actual Evapotranspiration and reduced predictors to a parsimonious seven recovered the Mesoamerican signal; by contrast, adding points in already well-sampled areas yielded diminishing returns after 5 km thinning.
For communication and planning, we translate continuous probabilities into three classes: high (>0.70), moderate (0.30–0.70), and low (≤0.30). We recommend (i) enhanced quarantine and nursery inspections in high-suitability areas, prioritizing ports of entry and ornamental-plant clusters; (ii) sentinel monitoring and community reporting in moderate areas to enable early detection; and (iii) baseline surveillance in low-suitability zones, with emphasis on pathways rather than blanket coverage. These classes can be mapped directly to guide phytosanitary surveillance and rapid response.
In contrast, our ensemble modeling framework, incorporating six different algorithms and weighing their outputs by performance (TSS > 0.5), offers a more robust and ecologically grounded representation of the Asian Cycad Scale niche. The improved model performance metrics (ROC = 0.95 ± 0.04; TSS = 0.86 ± 0.08) support the reliability of these predictions and highlight the value of multi-algorithm approaches for invasive species forecasting.
Future work aimed at better understanding the expanded geographic distribution of Aulacaspis yasumatsui might be improved by customizing the climatic data to the sampling period of species occurrence. To date, the vast majority of SDMs have been generated almost exclusively with the WordClim dataset [43], with little to no effort being made to include additional datasets with timeframes matching the timing of the occurrences. At the time of this paper being under review, the WorldClim dataset had been cited, according to Google Scholar, in over 16,000 studies despite large mismatches with the time of the modelled species occurrence data. This is conducted assuming climate stability, i.e., that estimates consider landscape elements that affect climatic dynamics [70], that estimates are based on long time series, and that proper resampling and robustness testing, e.g., varying geographical resolution and metrics for the evaluation model outputs, has been performed [71,72]. Thus, based on the current literature, we think our approach enriches common practice by using datasets whose uncertainties are properly understood and that also temporally overlap with the samples from the species whose distribution is being modelled. Ideally, data from remote sensing and estimates from climatic models that incorporate weather station data that coincide with the sampling of occurrences might improve the accuracy of the models. This could result in more precise estimates of habitat suitability based on present-day environmental data, and we are aware of at least one instance where such an approach has been pursued [57].
Furthermore, although our occurrence dataset was expanded and spatially thinned to reduce clustering, sampling bias and uneven geographic coverage still remain as potential concerns in our study, particularly in regions with limited survey efforts. This could have influenced model accuracy and ecological inference, as species distribution models (SDMs) are sensitive to biased presence data and rely heavily on the quality and representativeness of occurrence records [27,28,36]. Finally, the environmental covariates we employed captured key climatic factors; however, the models did not explicitly incorporate biotic interactions such as host plant availability, predation, competition, or mutualisms, all of which are known to strongly influence invasive species’ distributions [23]. The omission of these biological factors may limit the ecological realism and predictive power of the ensemble SDM, especially for a specialist herbivore like Aulacaspis yasumatsui that depends on cycads whose own distributions and vulnerabilities vary geographically.

5. Conclusions

Altogether, our findings emphasize the importance of temperature thresholds and variability, alongside moisture availability, in defining the potential range of A. yasumatsui. They underscore the need for dynamic SDMs that account not just for mean climatic conditions but for seasonal extremes and physiological tolerances, as organisms are sensitive to both mean environments and their variability [72,73]. These insights are crucial for risk assessment and the prioritization of conservation and biosecurity efforts, especially in biodiversity hotspots where endemic cycads are already under conservation threat.

Supplementary Materials

Figures and raw data used in this article are available online at https://www.mdpi.com/article/10.3390/insects16101016/s1, Figure S1: Strip plot from previous studies; Table S1: Summary of environmental covariates; Table S2: Occurrence points.

Author Contributions

Conceptualization, R.C., L.F.C., J.L., S.V.-D. and F.S.-P.; introduction, methodology, and results, L.F.C. and S.V.-D.; investigation, P.E.C., L.F.C., S.V.-D., R.T., D.C., A.S., B.V.-V., B.Z., L.F.C., F.S.-P. and R.C.; resources, L.F.C., J.L. and S.V.-D.; data curation, S.V.-D. and P.E.C.; writing—original draft preparation, S.V.-D., L.F.C. and P.E.C.; writing—review and editing, S.V.-D., J.L., F.S.-P. and L.F.C.; visualization, S.V.-D.; supervision, J.L., R.C. and L.F.C.; project administration, S.V.-D. and L.F.C.; funding acquisition, S.V.-D., B.Z., L.F.C. and J.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received financial support from the Secretaría Nacional de Ciencia, Tecnología e Innovación (SENACYT) grant APY-NI-2015-27; Ph.D. grant No. 002-2015, the grant for International Mobilization SENACYT-01-MOV 2018, 266-2018 to J.L., and the grant EIE18-008 logistically supported this research and the Central American Master’s Program in Entomology of the Universidad de Panamá for J.L. Computational facilities for ensemble model estimation and publication charges were provided by Biodiversity Consultant Group (S.V.-D.).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The detailed location data are included in Supplementary Data.

Acknowledgments

R.T., D.C., A.S., B.V.-V., and F.S.-P., as members of the 21st promotion at Programa Centroamericano de Maestría en Entomología, Universidad de Panamá, would like to extend their sincere appreciation to the SENACYT for the academic grants received during their studies in Panamá. F.S-P. was funded by a fellowship from the German Academic Exchange Service (DAAD). PECC would like to thank to the Secretaría Nacional de Ciencia, Tecnología e Innovación (SENACYT) for the financial support of the program “Inserción de Becarios 2021” PIB-2021-004 SENACYT and the Universidad de Panamá. B.Z. and J.L. would like to thank the Sistema Nacional de Investigación (SNI) program, SNI-19-2020 and SNI-062-2023, respectively, for the support received in the development of this research. L.F.C. was funded by Indiana University and NSF CNH2-1924200.

Conflicts of Interest

The authors declare that there are no conflicts of interest associated with this publication. The manuscript has been read and approved by all the authors, and there are no external people who satisfied the criteria for authorship. All the authors confirm that they have given due consideration to the protection of intellectual property associated with this work and that there are no impediments to publishing the results of this research and with respect to intellectual property. All the authors confirm that they have followed the regulations of the research institutions concerning intellectual property.

References

  1. Ruchin, A.B.; Esin, M.N.; Nikolaeva, A.M.; Aleksanov, V.V.; Tomkovich, K.P. Invading the North: Dispersal of Nearctic Treehopper Stictocephala bisonia (Hemiptera: Cicadomorpha: Membracidae) in European Russia. Period. Biol. 2024, 125, 167–177. [Google Scholar] [CrossRef]
  2. Venette, R.C.; Ragsdale, D.W. Assessing the Invasion by Soybean aphid (Homoptera: Aphididae): Where Will It End? Ann. Entomol. Soc. Am. 2004, 97, 219–226. [Google Scholar] [CrossRef]
  3. Weissling, T.J.; Howard, F.W.; Hamon, A.B. Cycad Aulacaspis Scale, Aulacaspis yasumatsui Takagi (Insecta: Homoptera: Sternorrhyncha: Diaspididae); University of Florida Cooperative Extension Service, Institute of Food and Agricultural Sciences, EDIS: Tampa, FL, USA, 1999. [Google Scholar] [CrossRef]
  4. Takagi, S. A new species of Aulacaspis associated with a cycad in Thailand (Homoptera: Cocoidea). Insecta Matsumurana New Ser. 1977, 11, 63–72. [Google Scholar]
  5. Halbert, S.E. Entomology section. Triology 1998, 37, 6–7. [Google Scholar]
  6. Howard, F.W.; Hamon, A.; Mclaughlin, M.; Weissling, T.; Yang, S.L. Aulacaspis yasumatsui (Hemiptera: Sternorrhyncha: Diaspididae), a Scale Insect Pest of Cycads Recently Introduced into Florida. Fla. Entomol. 1999, 82, 14–27. [Google Scholar] [CrossRef]
  7. McDougall, K.L.; Gullan, P.J.; Craven, P.; Wright, G.T.; Cook, L.G. Cycad Killer, Qu’est-Ce Que c’est? Dieback of Macrozamia Communis on the South Coast of New South Wales. Aust. J. Bot. 2021, 69, 102–109. [Google Scholar] [CrossRef]
  8. Marler, T.E.; Lindström, A.J.; Watson, G.W. Aulacaspis yasumatsui Delivers a Blow to International Cycad Horticulture. Horticulturae 2021, 7, 147. [Google Scholar] [CrossRef]
  9. Normark, B.B.; Normark, R.D.; Vovides, A.; Solís-Montero, L.; González-Gómez, R.; Pulido-Silva, M.T.; Escobar-Castellanos, M.A.; Dominguez, M.; Perez-Farrera, M.A.; Janda, M.; et al. Cycad Aulacaspis Scale (Aulacaspis yasumatsui Takagi, 1977) in Mexico and Guatemala: A Threat to Native Cycads. Bioinvasions Rec. 2017, 6, 187–193. [Google Scholar] [CrossRef]
  10. Castillo-gómez, M.; Chang-pérez, R.; Jiménez-puello, J.; Santos-mugas, A. Knowledge of Cycad Aulacaspis Scale (Hemiptera: Diaspididae) in Panama. BioInvasions Rec. 2021, 10, 1015–1021. [Google Scholar] [CrossRef]
  11. González-Gómez, R.; Riverón-Giró, F.B.; García-González, A.; Martínez-Rosas, R.; Solís-Montero, L. First Report of Aulacaspis yasumatsui (Hemiptera: Diaspididae) in Mexico. Fla. Entomol. 2016, 99, 583–584. [Google Scholar] [CrossRef]
  12. Evans, G.A.; Dooley, J.W. Potential Invasive Species of Scale Insects for the USA and Caribbean Basin. In Potential Invasive Pests Agricultural Crops; CABI: Oxfordshire, UK, 2013; pp. 320–341. [Google Scholar]
  13. Van Lenteren, J.C.; Colmenarez, Y.C. Biological Control in Barbados. In Biological Control in Latin America and the Caribbean: Its Rich History and Bright Future; CABI: Oxfordshire, UK, 2020; pp. 43–57. [Google Scholar] [CrossRef]
  14. CABI; EPPO. Aulacaspis yasumatsui. [Distribution Map]; Centre for Agricultural Bioscience International: Oxfordshire, UK, 2021. [Google Scholar] [CrossRef]
  15. Blanco-Metzler, H.; Zúñiga Orozco, A. Manejo de Aulacaspis yasumatsui (Hemiptera: Diaspididae) Mediante El Uso de Jabones Comerciales En Costa Rica. Rev. Sedes Reg. 2012, 25, 143–168. [Google Scholar] [CrossRef]
  16. Perez-Gelabert, D. The Invasive Armored Scale Aulacaspis yasumatsui Takagi (Hemiptera: Coccoidea: Diaspididae) in the Dominican Republic. Proc. Èntomol. Soc. Wash. 2019, 121, 525–528. [Google Scholar] [CrossRef]
  17. Bailey, R.; Chang, N.T.; Lai, P.Y.; Hsu, T.C. Life Table of Cycad Scale, Aulacaspis yasumatsui (Hemiptera: Diaspididae), Reared on Cycas in Taiwan. J. Asia Pac. Entomol. 2010, 13, 183–187. [Google Scholar] [CrossRef]
  18. Muniapan, R.; Shepard, M.; Carner, G.; Aun-Chuan Ooi, P. Arthropod Pests of Horticultural Crops in Tropical Asia; CABI: Oxfordshire, UK, 2012. [Google Scholar]
  19. Trencheva, K.; Trenchev, G.; Tomov, R.; Wu, S.-A. Non-Indigenous Scale Insects on Ornamental Plants in Bulgaria and China: A Survey. Entomol. Hell. 2010, 19, 114–123. [Google Scholar] [CrossRef]
  20. Ülgentürk, S. Cycas Palmiyelerinin Kaçak Yolcusu; Aulacaspis yasumatsui Takagi (Hemiptera: Diaspididae) An Unwanted Passenger on Cycas Palm; Aulacaspis yasumatsui Takagi Materyal ve Yöntem Sonuçlar ve Tartışma. Türk. Entomoloji Bül. 2015, 5, 195–200. [Google Scholar] [CrossRef]
  21. Kozár, F.; Konczné Benedicty, Z.; Fetykó, K.; Kiss, B.; Szita, É. An Annotated Update of the Scale Insect Checklist of Hungary (Hemiptera, Coccoidea). Zookeys 2013, 309, 49–66. [Google Scholar] [CrossRef]
  22. Germain, J.F.; Hodges, G.S. First Report of Aulacaspis yasumatsui (Hemiptera: Diaspididae) in Africa (Ivory Coast), and Update on Distribution. Fla. Entomol. 2007, 90, 755–756. [Google Scholar] [CrossRef]
  23. Dimkpa, S.O.N.; Baraka, R.E.; Tobin-West, M.D.; Okwukwu, E. Cycad Aulacaspis Scale (Cas) Aulacaspis yasumatsui Takagi As A Major Pest of Sago Palm Cycas spp. in Nigeria. Int. J. Entomol. Nematol. Res. 2021, 5, 1–10. [Google Scholar]
  24. Nesamari, R.; Millar, I.M.; Coutinho, T.A.; Roux, J. South African Cycads at Risk: Aulacaspis yasumatsui (Hemiptera: Coccoidea: Diaspididae) in South Africa. Afr. Entomol. 2015, 23, 196–206. [Google Scholar] [CrossRef]
  25. Araújo, M.B.; Luoto, M. The Importance of Biotic Interactions for Modelling Species Distributions under Climate Change. Glob. Ecol. Biogeogr. 2007, 16, 743–753. [Google Scholar] [CrossRef]
  26. Guisan, A.; Thuiller, W. Predicting Species Distribution: Offering More than Simple Habitat Models. Ecol. Lett. 2005, 8, 993–1009. [Google Scholar] [CrossRef]
  27. Segurado, P.; Araújo, M.B. An Evaluation of Methods for Modelling Species Distributions. J. Biogeogr. 2004, 31, 1555–1568. [Google Scholar] [CrossRef]
  28. Liu, C.; White, M.; Newell, G. Measuring and Comparing the Accuracy of Species Distribution Models with Presence-Absence Data. Ecography 2011, 34, 232–243. [Google Scholar] [CrossRef]
  29. Lee-Yaw, J.A.; McCune, J.L.; Pironon, S.; Sheth, S.N. Species Distribution Models Rarely Predict the Biology of Real Populations. Ecography 2022, 2022, e05877. [Google Scholar] [CrossRef]
  30. Pagel, J.; Treurnicht, M.; Bond, W.J.; Kraaij, T.; Nottebrock, H.; Schutte-Vlok, A.L.; Tonnabel, J.; Esler, K.J.; Schurr, F.M. Mismatches between Demographic Niches and Geographic Distributions Are Strongest in Poorly Dispersed and Highly Persistent Plant Species. Proc. Natl. Acad. Sci. USA 2020, 117, 3663–3669. [Google Scholar] [CrossRef] [PubMed]
  31. Aguirre-Gutiérrez, J.; Carvalheiro, L.G.; Polce, C.; van Loon, E.E.; Raes, N.; Reemer, M.; Biesmeijer, J.C. Fit-for-Purpose: Species Distribution Model Performance Depends on Evaluation Criteria—Dutch Hoverflies as a Case Study. PLoS ONE 2013, 8, e63708. [Google Scholar] [CrossRef]
  32. Anderson, R.P.; Lew, D.; Peterson, A.T. Evaluating Predictive Models of Species’ Distributions: Criteria for Selecting Optimal Models. Ecol. Modell. 2003, 162, 211–232. [Google Scholar] [CrossRef]
  33. Konowalik, K.; Nosol, A. Evaluation Metrics and Validation of Presence-Only Species Distribution Models Based on Distributional Maps with Varying Coverage. Sci. Rep. 2021, 11, 1482. [Google Scholar] [CrossRef]
  34. Kanle Satishchandra, N.; Geerts, S. Modeling the Distribution of the Invasive Alien Cycad Aulacaspis Scale in Africa under Current and Future Climate Scenarios. J. Econ. Entomol. 2020, 113, 2276–2284. [Google Scholar] [CrossRef] [PubMed]
  35. Wei, J.; Zhao, Q.; Zhao, W.; Zhang, H. Predicting the Potential Distributions of the Invasive Cycad Scale Aulacaspis yasumatsui (Hemiptera: Diaspididae) under Different Climate Change Scenarios and the Implications for Management. PeerJ 2018, 6, e4832. [Google Scholar] [CrossRef] [PubMed]
  36. Fielding, A.H.; Bell, J.F. A Review of Methods for the Assessment of Prediction Errors in Conservation Presence/Absence Models. Environ. Conserv. 1997, 24, 38–49. [Google Scholar] [CrossRef]
  37. Yackulic, C.B.; Chandler, R.; Zipkin, E.F.; Royle, J.A.; Nichols, J.D.; Campbell Grant, E.H.; Veran, S. Presence-Only Modelling Using MAXENT: When Can We Trust the Inferences? Methods Ecol. Evol. 2013, 4, 236–243. [Google Scholar] [CrossRef]
  38. Breiner, F.T.; Guisan, A.; Bergamini, A.; Nobis, M.P. Overcoming Limitations of Modelling Rare Species by Using Ensembles of Small Models. Methods Ecol. Evol. 2015, 6, 1210–1218. [Google Scholar] [CrossRef]
  39. Li, X.; Wang, Y. Applying Various Algorithms for Species Distribution Modelling. Integr. Zool. 2013, 8, 124–135. [Google Scholar] [CrossRef]
  40. Ampomah, E.K.; Qin, Z.; Nyame, G. Evaluation of Tree-Based Ensemble Machine Learning Models in Predicting Stock Price Direction of Movement. Information 2020, 11, 332. [Google Scholar] [CrossRef]
  41. Aiello-Lammens, M.E.; Boria, R.A.; Radosavljevic, A.; Vilela, B.; Anderson, R.P. SpThin: An R Package for Spatial Thinning of Species Occurrence Records for Use in Ecological Niche Models. Ecography 2015, 38, 541–545. [Google Scholar] [CrossRef]
  42. González-Tokman, D.; Córdoba-Aguilar, A.; Dáttilo, W.; Lira-Noriega, A.; Sánchez-Guillén, R.A.; Villalobos, F. Insect Responses to Heat: Physiological Mechanisms, Evolution and Ecological Implications in a Warming World. Biol. Rev. 2020, 95, 802–821. [Google Scholar] [CrossRef] [PubMed]
  43. Fick, S.E.; Hijmans, R.J. WorldClim 2: New 1-Km Spatial Resolution Climate Surfaces for Global Land Areas. Int. J. Climatol. 2017, 37, 4302–4315. [Google Scholar] [CrossRef]
  44. Patterson, T.C. Google Earth as a (Not Just) Geography Education Tool. J. Geogr. 2007, 106, 145–152. [Google Scholar] [CrossRef]
  45. Google Earth Engine Data Catalog. Available online: https://developers.google.com/earth-engine/datasets (accessed on 2 February 2022).
  46. Abatzoglou, J.T. Development of Gridded Surface Meteorological Data for Ecological Applications and Modelling. Int. J. Climatol. 2013, 33, 121–131. [Google Scholar] [CrossRef]
  47. Vermote, E.; Justice, C.; Csiszar, I.; Eidenshink, J.; Myneni, R.; Baret, F.; Masuoka, E.; Wolfe, R.; Claverie, M. NOAA Climate Data Record (CDR) of Normalized Difference Vegetation Index (NDVI), Version 4; NOAA National Centers for Environmental Information: Asheville, NC, USA, 2014. [Google Scholar]
  48. Wan, Z.; Hook, S.; Hulley, G. MOD11A1 MODIS/Terra Land Surface Temperature/Emissivity Daily L3 Global 1km SIN Grid V006. Distributed by NASA EOSDIS Land Processes DAAC; 2015. Available online: https://www.earthdata.nasa.gov/data/catalog/lpcloud-mod11a1-006 (accessed on 2 February 2022).
  49. Miliaresis, G.C.; Argialas, D.P. Segmentation of Physiographic Features from the Global Digital Elevation Model/GTOPO30. Comput. Geosci. 1999, 25, 715–728. [Google Scholar] [CrossRef]
  50. Shimada, M.; Itoh, T.; Motooka, T.; Watanabe, M.; Shiraishi, T.; Thapa, R.; Lucas, R. New Global Forest/Non-Forest Maps from ALOS PALSAR Data (2007–2010). Remote Sens. Environ. 2014, 155, 13–31. [Google Scholar] [CrossRef]
  51. Parker, J.A.; Kenyon, R.V.; Troxel, D.E. Comparison of Interpolating Methods for Image Resampling. IEEE Trans. Med. Imaging 1983, 2, 31–39. [Google Scholar] [CrossRef] [PubMed]
  52. Keys, R. Cubic Convolution Interpolation for Digital Image Processing. IEEE Trans. Acoust. 1981, 29, 1153–1160. [Google Scholar] [CrossRef]
  53. Guisan, A.; Thuiller, W.; Zimmermann, N.E. Habitat Suitability and Distribution Models: With Applications in R; Cambridge University Press: Cambridge, UK, 2017; ISBN 0521765137. [Google Scholar]
  54. Rhodes, C.G.; Loaiza, J.R.; Romero, L.M.; Alvarado, J.M.G.; Delgado, G.; Salas, O.R.; Rojas, M.R.; Aguilar-Avendaño, C.; Maynes, E.; Cordero, J.A.V.; et al. Anopheles albimanus (Diptera: Culicidae) Ensemble Distribution Modeling: Applications for Malaria Elimination. Insects 2022, 13, 221. [Google Scholar] [CrossRef]
  55. Thuiller, W.; Georges, D.; Gueguen, M.; Engler, R.; Breiner, F.; Lafourcade, B.; Patin, R.; Blancheteau, H. biomod2: Ensemble Platform for Species Distribution Modeling. R package version 4.2-1. 2022. Available online: https://CRAN.R-project.org/package=biomod2 (accessed on 2 February 2022).
  56. Elith, J.H.; Graham, C.P.H.; Anderson, R.P.; Dudík, M.; Ferrier, S.; Guisan, A.; Hijmans, R.J.; Huettmann, F.; Leathwick, J.R.; Lehmann, A.; et al. Novel Methods Improve Prediction of Species’ Distributions from Occurrence Data. Ecography 2006, 29, 129–151. [Google Scholar] [CrossRef]
  57. Barbet-Massin, M.; Jiguet, F.; Albert, C.H.; Thuiller, W. Selecting Pseudo-Absences for Species Distribution Models: How, Where and How Many? Methods Ecol. Evol. 2012, 3, 327–338. [Google Scholar] [CrossRef]
  58. Monaghan, A.J.; Eisen, R.J.; Eisen, L.; McAllister, J.; Savage, H.M.; Mutebi, J.P.; Johansson, M.A. Consensus and Uncertainty in the Geographic Range of Aedes Aegypti and Aedes Albopictus in the Contiguous United States: Multi-Model Assessment and Synthesis. PLoS Comput. Biol. 2019, 15, e1007369. [Google Scholar] [CrossRef]
  59. Boehmke, B.; Greenwell, B.M. Hands-On Machine Learning with R; CRC Press: Boca Raton, FL, USA, 2019. [Google Scholar]
  60. Olden, J.D.; Lawler, J.J.; Poff, N.L. Machine Learning Methods Without Tears: A Primer for Ecologists. Q. Rev. Biol. 2008, 83, 171–193. [Google Scholar] [CrossRef]
  61. Kuhn, M.; Johnson, K. Applied Predictive Modeling; Springer: New York, NY, USA, 2013; Volume 26. [Google Scholar]
  62. Rhodes, C.G.; Chaves, L.F.; Bergmann, L.R.; Hamer, G.L. Ensemble Species Distribution Modeling of Culex tarsalis (Diptera: Culicidae) in the Continental United States. J. Med. Entomol. 2023, 60, 664–679. [Google Scholar] [CrossRef]
  63. Elith, J.; Ferrier, S.; Huettmann, F.; Leathwick, J. The Evaluation Strip: A New and Robust Method for Plotting Predicted Responses from Species Distribution Models. Ecol. Modell. 2005, 186, 280–289. [Google Scholar] [CrossRef]
  64. Allouche, O.; Tsoar, A.; Kadmon, R. Assessing the Accuracy of Species Distribution Models: Prevalence, Kappa and the True Skill Statistic (TSS). J. Appl. Ecol. 2006, 43, 1223–1232. [Google Scholar] [CrossRef]
  65. Harvey, D. Spaces of Global Capitalism; Verso: New York, NY, USA, 2006; ISBN 1844670651. [Google Scholar]
  66. Wallace, R.G.; Bergmann, L.; Kock, R.; Gilbert, M.; Hogerwerf, L.; Wallace, R.; Holmberg, M. The Dawn of Structural One Health: A New Science Tracking Disease Emergence along Circuits of Capital. Soc. Sci. Med. 2015, 129, 68–77. [Google Scholar] [CrossRef]
  67. Wallace, R.G. Dead Epidemiologists: On the Origins of COVID-19; Monthly Review Press: New York, NY, USA, 2020. [Google Scholar]
  68. Wallace, R.G.; Liebman, A.; Chaves, L.F.; Wallace, R. COVID-19 and Circuits of Capital. Mon. Rev. 2020, 72, 1–15. [Google Scholar] [CrossRef]
  69. Daly, C. Guidelines for Assessing the Suitability of Spatial Climate Data Sets. Int. J. Climatol. 2006, 26, 707–721. [Google Scholar] [CrossRef]
  70. Zumwald, M.; Knüsel, B.; Baumberger, C.; Hirsch Hadorn, G.; Bresch, D.N.; Knutti, R. Understanding and Assessing Uncertainty of Observational Climate Datasets for Model Evaluation Using Ensembles. WIREs Clim. Change 2020, 11, e654. [Google Scholar] [CrossRef]
  71. Easterling, D.R.; Peterson, T.C.; Karl, T.R. On the Development and Use of Homogenized Climate Datasets. J. Clim. 1996, 9, 1429–1434. [Google Scholar] [CrossRef]
  72. Lewontin, R.; Levins, R. Schmalhausen’s Law. Capital. Nat. Social. 2000, 11, 103–108. [Google Scholar] [CrossRef]
  73. Chaves, L.F. Climate Change and the Biology of Insect Vectors of Human Pathogens; John Wiley & Sons: Hoboken, NJ, USA, 2016; ISBN 9781119070894. [Google Scholar] [CrossRef]
Figure 1. Map showing the geographic locations of the invasive Asian Cycad Scale Aulacaspis yasumatsui, red dots indicate occurrence points after data cleaning and thinning. The final data coverage includes 20 countries with 112 points.
Figure 1. Map showing the geographic locations of the invasive Asian Cycad Scale Aulacaspis yasumatsui, red dots indicate occurrence points after data cleaning and thinning. The final data coverage includes 20 countries with 112 points.
Insects 16 01016 g001
Figure 2. Pairwise Pearson’s correlation between environmental covariates at the global occurrence points for the invasive Asian Cycad Scale Aulacaspis yasumatsui. Correlations have been clustered to ease the visualization of groups of highly correlated covariates. High correlations represent r values −0.6 > r > 0.6.
Figure 2. Pairwise Pearson’s correlation between environmental covariates at the global occurrence points for the invasive Asian Cycad Scale Aulacaspis yasumatsui. Correlations have been clustered to ease the visualization of groups of highly correlated covariates. High correlations represent r values −0.6 > r > 0.6.
Insects 16 01016 g002
Figure 3. Covariate importance in the ensemble species distribution model of the invasive Asian Cycad Scale Aulacaspis yasumatsui: (A) All covariates used for initial model selection; the red dashed line indicates the 5% importance threshold. (B) Covariates selected for the final ensemble model. Red dashed line indicates 5% variable importance threshold.
Figure 3. Covariate importance in the ensemble species distribution model of the invasive Asian Cycad Scale Aulacaspis yasumatsui: (A) All covariates used for initial model selection; the red dashed line indicates the 5% importance threshold. (B) Covariates selected for the final ensemble model. Red dashed line indicates 5% variable importance threshold.
Insects 16 01016 g003
Figure 4. Strip plots for the predicted probability of occurrence for the invasive Asian Cycad Scale Aulacaspis yasumatsui using 112 occurrence records (Supplementary Material Figure S1 shows the results based on the 86 records from a previous study [35]). Media of replicated runs for each model are represented by lines. Model algorithms are color-coded. Environmental covariates included (A) Vapor Pressure (VAPRE), (B) Actual Evapotranspiration (ACEV), (C) Temperature Seasonality (TS), (D) Temperature Annual Range (TAR), (E) Mean Temperature of the Coldest Quarter (MTCQ), (F) Reference Evapotranspiration (REV), and (G) Climate Water Deficit (CLW).
Figure 4. Strip plots for the predicted probability of occurrence for the invasive Asian Cycad Scale Aulacaspis yasumatsui using 112 occurrence records (Supplementary Material Figure S1 shows the results based on the 86 records from a previous study [35]). Media of replicated runs for each model are represented by lines. Model algorithms are color-coded. Environmental covariates included (A) Vapor Pressure (VAPRE), (B) Actual Evapotranspiration (ACEV), (C) Temperature Seasonality (TS), (D) Temperature Annual Range (TAR), (E) Mean Temperature of the Coldest Quarter (MTCQ), (F) Reference Evapotranspiration (REV), and (G) Climate Water Deficit (CLW).
Insects 16 01016 g004
Figure 5. Global ensemble distribution model for the invasive Asian Cycad Scale Aulacaspis yasumatsui. Color indicates habitat suitability measured as probability from 0 to 1, as presented in the legend. (A) SDM generated with previously published data [35]; (B) SDM generated with 122 records of this study; (C) differences in predicted distribution models. Blue indicates areas with medium and high occurrence probability predicted in (A) but low probabilities predicted in (B). Red indicates areas with medium and high probabilities of occurrence in (B) but with low probability in (A).
Figure 5. Global ensemble distribution model for the invasive Asian Cycad Scale Aulacaspis yasumatsui. Color indicates habitat suitability measured as probability from 0 to 1, as presented in the legend. (A) SDM generated with previously published data [35]; (B) SDM generated with 122 records of this study; (C) differences in predicted distribution models. Blue indicates areas with medium and high occurrence probability predicted in (A) but low probabilities predicted in (B). Red indicates areas with medium and high probabilities of occurrence in (B) but with low probability in (A).
Insects 16 01016 g005
Table 1. Summary of the environmental covariates considered for the distribution modeling of the species Aulacaspis yasumatsui Takagi, 1977. Covariates are organized according to their importance (decreasing, presented in Figure 3B where abbreviations are matched with those presented under the “Abbreviation” column here).
Table 1. Summary of the environmental covariates considered for the distribution modeling of the species Aulacaspis yasumatsui Takagi, 1977. Covariates are organized according to their importance (decreasing, presented in Figure 3B where abbreviations are matched with those presented under the “Abbreviation” column here).
AbbreviationCovariateEco-Environmental Significance
ACEVActual Evapotranspiration (Derived)Integrated energy–water flux; negative association; ever-wet per-humid climates correspond to lower suitability compared with sub-humid regimes
MTCQMean Temp of Coldest QuarterWinter constraint: suitability drops when MTCQ < ~3.5–4 °C, indicating cold-season limits on survival and establishment
CLWClimate Water Deficit (Derived)Unmet water demand; positive to a plateau; indicates tolerance of moderate moisture limitation, with saturation near the dry margin
VAPREVapor PressureAtmospheric moisture (kPa); positive, saturating response that tracks the same moisture gradient as the deficit, peaking in sub-humid conditions and leveling toward the driest end
REVReference EvapotranspirationAtmospheric evaporative demand (energy/temperature signal); helps distinguish sub-humid from per-humid settings
TSTemperature Seasonality (sd × 100)Captures thermal variability across the year; extreme seasonality reduces suitability
TARTemperature Annual RangeAnnual amplitude of temperature; moderate ranges (~6–10 °C) align with higher suitability, whereas very low or very high ranges reduce establishment
Table 2. Comparison between the two models presented in Figure 3. The table shows receiver operating characteristic curve (ROC) and true skills statistic (TSS) values, with standard deviation (±SD).
Table 2. Comparison between the two models presented in Figure 3. The table shows receiver operating characteristic curve (ROC) and true skills statistic (TSS) values, with standard deviation (±SD).
ModelROCTSS
All (37 covariates)0.90 ± 0.080.76 ± 0.13
Final (7 covariates)0.95 ±0.040.86 ± 0.08
Table 3. Comparison of algorithm performance for model B using 112 observations vs. 86 observations from Wei et al. [35]. In the table: receiver operating characteristic curve (ROC), true skills statistic (TSS) values, and standard deviation (±SD).
Table 3. Comparison of algorithm performance for model B using 112 observations vs. 86 observations from Wei et al. [35]. In the table: receiver operating characteristic curve (ROC), true skills statistic (TSS) values, and standard deviation (±SD).
AlgorithmUpdated Model EvaluationWei et al. [35] Data Model Evaluation
ROCTSSROCTSS
GLM0.97 ± 0.010.87 ± 0.010.83 ± 0.020.96 ± 0.05
RF0.99 ± 0.130.91 ± 0.080.88 ± 0.030.97 ± 0.10
MARS0.97 ± 0.030.83 ± 0.100.84 ± 0.010.98 ± 0.03
GBM0.98 ±0.010.89 ±0.080.87 ± 0.020.97 ± 0.07
ANN0.93 ± 0.050.82 ±0.120.80 ± 0.060.94 ± 0.15
CTA0.91 ±0.040.81 ± 0.080.83 ± 0.050.92 ± 0.10
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Valdés-Díaz, S.; Tuñón, R.; Castillo, D.; Sanchez, A.; Virola-Vasquez, B.; Corro, P.E.; Serrano-Peraza, F.; Zachrisson, B.; Loaiza, J.; Chang, R.; et al. Ensemble Distribution Modeling of the Globally Invasive Asian Cycad Scale, Aulacaspis yasumatsui Takagi, 1977 (Hemiptera: Diaspididae). Insects 2025, 16, 1016. https://doi.org/10.3390/insects16101016

AMA Style

Valdés-Díaz S, Tuñón R, Castillo D, Sanchez A, Virola-Vasquez B, Corro PE, Serrano-Peraza F, Zachrisson B, Loaiza J, Chang R, et al. Ensemble Distribution Modeling of the Globally Invasive Asian Cycad Scale, Aulacaspis yasumatsui Takagi, 1977 (Hemiptera: Diaspididae). Insects. 2025; 16(10):1016. https://doi.org/10.3390/insects16101016

Chicago/Turabian Style

Valdés-Díaz, Samuel, Reyna Tuñón, Dilma Castillo, Alieth Sanchez, Brenda Virola-Vasquez, Patricia Esther Corro, Francisco Serrano-Peraza, Bruno Zachrisson, Jose Loaiza, Rodrigo Chang, and et al. 2025. "Ensemble Distribution Modeling of the Globally Invasive Asian Cycad Scale, Aulacaspis yasumatsui Takagi, 1977 (Hemiptera: Diaspididae)" Insects 16, no. 10: 1016. https://doi.org/10.3390/insects16101016

APA Style

Valdés-Díaz, S., Tuñón, R., Castillo, D., Sanchez, A., Virola-Vasquez, B., Corro, P. E., Serrano-Peraza, F., Zachrisson, B., Loaiza, J., Chang, R., & Chaves, L. F. (2025). Ensemble Distribution Modeling of the Globally Invasive Asian Cycad Scale, Aulacaspis yasumatsui Takagi, 1977 (Hemiptera: Diaspididae). Insects, 16(10), 1016. https://doi.org/10.3390/insects16101016

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop