Potential Distribution of Colonizing Nine-Banded Armadillos at Their Northern Range Edge

The nine-banded armadillo (Dasypus novemcinctus) has become a recent addition to the local fauna of Illinois as a response to habitat alteration and climate change. This range expansion has resulted in the presence of armadillos in areas not predicted by earlier models. Although these models have been revised, armadillos continue to move north and have reached areas of heavy agricultural use. We identified conditions that favor the presence of armadillos and potential corridors for dispersal. Identifying the distribution of the armadillo in Illinois is a vital step in anticipating their arrival in areas containing potentially sensitive wildlife populations and habitats. Armadillo locations (n = 37) collected during 2016–2020 were used to develop a map of the potential distribution of armadillos in southern Illinois. Environmental data layers included in the model were land cover type, distance to water, distance to forest edge, human modification, and climactic variables. Land cover type was the most important contributing variable to the model. Our results are consistent with the tenet that armadillo activity and dispersal corridors are centered around riparian areas, and that forested cover may provide corridors an agricultural mosaic.


Introduction
The nine-banded armadillo (Dasypus novemcinctus) has colonized much of the southern United States in less than 200 years [1]. The current distribution of armadillos now includes 15 states, and is expected to expand farther northeast [2,3]. This range expansion has led to management concern about which factors facilitated their colonization, and what will ultimately limit establishment. Although temperature and precipitation are thought to limit northern expansion to the 40th parallel north [3], armadillos have successfully adapted to a broader range of environmental conditions [4] and surpassed thresholds from previous models thought to be too cold to support the establishment of a new population, with records as far north as Nebraska and Indiana [3,5]. In Illinois, sightings have increased dramatically since the early 2000s, and breeding populations have become established in the state [3,6,7]. It is now believed that permanent sustaining populations of armadillos are limited to areas that receive yearly precipitation greater than 50 cm and have mean January temperatures above −8 • C [3]. Yet, to this date, there is no characterization of the realized habitats and dispersal corridors used by armadillos based on either empirical data or modelling using known attributes of both climate and terrain.
Armadillos have dispersed through vast areas of heavy agricultural use in Arkansas, Georgia, Oklahoma, and Texas [3,8]. Illinois is crossed by the 40th parallel, which is presumed to act as the northernmost limit for the dispersal. The southern half of Illinois is surrounded by the Mississippi and the Ohio Rivers and sustains a robust population of armadillos. This region of the state is covered with fragmented forest that connects the major river valleys with the corn belt. This heavy fragmentation of forested cover with agricultural land presents a unique opportunity to identify corridors that armadillos are using in their northward dispersion.
Armadillos may positively influence ecosystems by creating new habitats via burrows, increasing soil nutrients through the bioturbation of soil, and regulating insect pests [9]. However, wildlife managers are often concerned about armadillo colonization due to the introduction of pathogens (e.g., Trypanosoma cruzi, [10,11]; and Mycobacterium leprae, [12], adverse effects on native wildlife, and potential damage to crops and property [1,[13][14][15]. For example, northern bobwhite (Colinus virginianus) are a species of concern, and have suffered considerable declines since the 1970s, presumably due to habitat fragmentation and genetic isolation of subpopulations [16][17][18]. Predation is a common cause of bobwhite nest failure, and armadillos have been identified as an important mammalian predator to bobwhites [19]. Staller et al. (2005) found mammals were responsible for 59% of bobwhite nest predations, with armadillos being second only to raccoons (Procyon lotor) in predation rates, and typically all eggs in the nest were consumed.
Wildlife managers are also interested in habitat use by colonizing armadillos, given the potential for overlap between armadillos and species of concern, such as northern bobwhites. Armadillos are thought to be closely affiliated with riparian habitat and hardwood hammocks [1,20,21], appear to avoid upland pine habitat, or show little habitat selection whatsoever [22]. Inbar and Mayer [23] found armadillo roadkills near dense woodlands in winter months, but in summer, roadkill locations were dependent more on traffic levels rather than land cover type. Armadillos also require deciduous forests associated with leaf litter and prey availability, especially in winter months [24,25].
Biologists, health officials, and wildlife managers can use knowledge of habitat preferences and other environmental factors to forecast areas of potential armadillo colonization using species distribution models (SDM) [26][27][28]. Species distribution modeling has been a useful research tool for a wide array of ecological issues, including management of invasive species [29], assessment of habitat connectivity for species of concern [28,30], prediction of parasite presence [11], and assessment of species richness [31]. Studies of the potential distribution of armadillos in Illinois can provide a unique insight into baseline habitat preferences of armadillos as they expand their range north, as habitat preference may change based on population characteristics such as density. Several species have generalized their habitat preferences as population densities increased [32][33][34]. The same phenomenon may be true for armadillos at their northern range edge, with populations altering habitat selection as preferred areas become overcrowded and unavailable. Modeling the potential distribution of armadillos is also important in anticipating disease spread and effects on other native wildlife.
We modeled the influence of several environmental variables on potential distribution of armadillos in southern Illinois. SDM have been previously used to predict armadillo colonization in the United States through the use of climatic variables and presence data [2]. However, environmental factors influence species distributions in a hierarchical manner, with certain factors holding greater importance depending on the spatial scale [35]. While climatic factors are relevant to species distribution at broad scales, land cover variables are likely to influence distribution at a finer scale [36][37][38]. We expected annual precipitation to be an important factor influencing potential distribution, as precipitation must be sufficient enough to provide varied prey items and porous soil [25]. Annual precipitation had high importance in previous models [2,3,25], but we did not expect it to be as strong of a contributor compared to large-scale models due to little variation of climate at a smaller scale [36,37]. We expected distance to forest cover and distance to water to be the most important factors influencing the potential distribution of armadillos. We anticipated distance to water to be of high importance because armadillos need access to fresh water and soft soils for foraging [1]. We also expected forest cover to be a strong influence because past findings have shown patches of forested cover, especially those close to watercourses, are important habitat for armadillos in agricultural matrices [39]. We predicted armadillo presence to be positively associated with forest, wetland, and open water cover types [1,20,39]. We predicted armadillos would occur in low to moderately developed areas and agricultural cover types due to the armadillo's seemingly high tolerance to human disturbance [40], but would be negatively associated with highly developed or modified areas and agricultural fields with little adjacent forest cover [39]. We anticipated armadillo presence would increase with increased temperature of the wettest quarter and precipitation of the wettest quarter, as both temperature and precipitation limit colonization [2,3], but did not expect these climate variables to contribute to the model as much as annual precipitation, as annual precipitation has already been found to be of high importance in previous SDMs [2,3,25].

Study Area
We modeled potential armadillo distribution for the 51 Illinois counties below 40 • N (63,994 km 2 ). The climate of southern Illinois is temperate, with cold winters, wet springs, and hot, humid summers [41]. Elevation ranges from 88-324 m [42]. Mean annual temperature ranges from 14.7 • C in Pope County to as low as 11.1 • C in Moultrie County, and mean annual precipitation varies from 128 cm in Union County to 95 cm in Sangamon County [43]. Land cover is comprised of row-crop agriculture and pasture (65%), deciduous forest (21%), development (8%), wetland (3%), and open water (2%) [44]. Mean human density of the area is 25.4 persons per km 2 , with the highest density being 158.5 persons per km 2 in St. Clair County, and the lowest 4.8 persons per km 2 in Pope County [45].

Armadillo Presence
Armadillo presence locations (n = 361) were collected during 2016-2020. Reports of armadillos were solicited by creating a social media page and contacting local naturalist groups and county animal control offices. Records were confirmed as nine-banded armadillos either by photographs of reports, camera trap data, or driving to reports of roadkill. Localities were rarefied by 10 km to avoid spatial bias, resulting in 37 locations used for modeling. Out of 37 presences, 20 were roadkills and 17 were observations of a live animal. Two locations were collected in spring, 18 in summer, 13 in fall, and 4 in winter. Roadkills were salvaged and georeferenced; a fraction of them were deposited in the Illinois Natural History Survey Division of Mammalogy. Reports of live armadillos were georeferenced via Google Earth when coordinates were not provided.

Environmental Variables
We compiled 8 environmental data layers representing climate (n = 4 data layers), land cover (n = 1), global human modification (n = 1), and habitat-distance variables (n = 2) known to be important to armadillos [1][2][3]22]. Environmental data layers were raster-based and resampled to a 30 x 30 m cell size and geospatial analyses were performed using ArcGIS 10.7.1.
Climatic variables were downloaded from WorldClim (http://www.worldclim.org/, Table 1, accessed 8 April 2020). Land cover data were obtained from the National Land Cover Dataset [46] and were categorized into open water, development, forest (deciduous, evergreen, and mixed), herbaceous, pasture, crop, wetland (woody and emergent), and barren cover types [46]. Development was categorized into four levels of intensity based on percentage of impervious surface (open space [<20% impervious surface], low intensity development [20-49%], medium intensity development [50-79%], and high intensity development [80-100%]). Barren land was any area in which vegetation accounted for <15% of total cover, including glacial debris, sand dunes, strip mines, gravel pits, and other accumulations of earthen material [28,46]. The global human modification layer depicted the proportion of landscape modified by humans [47]. Thirteen anthropogenic stressors to biological diversity were grouped into five major classes, including human settlement, agriculture, transportation, mining and energy production, and infrastructure. A raster from the USGS GAP Analysis Project was used that included both interior and Diversity 2021, 13, 266 4 of 12 exterior buffers from forest edge [48]. Buffer distances both into and away from forest edge included 0, 30, 60, 120, 250, 500, 1000, 2000, 4000, and >4000 m. Water features were downloaded from the Natural Resources Conservation Service Geospatial Gateway [49]. Euclidean distance was used to form distance contours from water features. Variables were tested for correlation; highly correlated variables (Pearson's R > 0.7) were removed [28,29] and included fifteen climatic variables from WorldClim. Because 54% of our armadillo presence dataset consisted of roadkills, we were concerned that the model would be biased towards roads and developed land cover types. To test for a difference in habitat surrounding roadkills vs. live presences, we developed a circular buffer equal to the mean home range size of an armadillo (8.7 ha; [22]) around each presence location and calculated percentages of land cover types (i.e., agriculture, development, forest, herbaceous, open water, wetland) within the buffer. We then used a chi-square test of independence (α = 0.05) to determine if land cover percentages differed near roadkill vs. live locations.

Potential Distribution
We modeled potential distribution of armadillos using MaxEnt [36], a program that utilizes a maximum entropy approach to predict the likelihood that a given species is present in a landscape based on presence locations and habitat data [50]. MaxEnt only requires a few presence locations (≥10) and is not as sensitive to spatial error as other presence-only modeling programs, making it ideal for modeling low-density populations [26,50,51].
Models were run with SDMToolbox 2.0, a GIS toolkit designed to simplify the process of spatial distribution modeling [52]. Measures were taken to avoid overfitting to presence points and generating further spatial bias. For example, presence locations were spatially segregated into groups using the spatial jacknife tool in SDMToolbox, rather than divided randomly, prior to k-fold cross-validation. This reduces the amount of environmental bias and allows for ease of detection of overfitting within groups by separating biases geographically [53].
Regularization multipliers and feature class settings were manually altered to tune the model and further reduce overfitting. Regularization limits model complexity by penalizing models that have many features or have features with strong weights [36,54]. Feature classes tested included linear, quadratic, hinge, product, and threshold and a series of regularization multipliers (1-5) when tuning the model. We applied a jackknife procedure to determine variable importance; the variable that produced the highest training gain when used in isolation and/or decreased the gain the most when omitted was determined to be the most useful to the model [26,51].
Response curves were utilized to determine how probability of presence changed in response to environmental variables [55]. Marginal response curves represented a model iteration in which a single environmental variable was varied while all other variables were kept at their mean value. When variables were correlated and marginal response curves were difficult to interpret, response curves representing a model iteration in which only one environmental variable was used were interpreted instead.
Model fit and performance [56] were assessed by the value of the Omission Error Rate (OER), followed by the area under the receiver operator curve (AUC). OER quantifies the proportion of presences that the model predicts as an absence. The ROC plots sensitivity vs. commission error. The best model would have the lowest OER and a high AUC on a scale from 0-1.

Results
Roadkill were found in areas with higher percentages of developed land cover, while live armadillos were observed in areas with higher percentages of forested cover (χ 2 = 13.47, df = 5, p = 0.019, Figure 1). The model had an OER of 0.24, meaning it had a low false negative rate, and an AUC of 0.87, indicating a high sensitivity and specificity, or a high percentage of actual presences predicted and a low false positive rate [56,57]. The model used hinge as a feature class (i.e., parts of response curves were linear, while others were non-linear; [58]). The model used a regularization multiplier of 2.0 to best fit and smooth species response curves to environmental variables.
Response curves were utilized to determine how probability of presence changed in response to environmental variables [55]. Marginal response curves represented a model iteration in which a single environmental variable was varied while all other variables were kept at their mean value. When variables were correlated and marginal response curves were difficult to interpret, response curves representing a model iteration in which only one environmental variable was used were interpreted instead.
Model fit and performance [56] were assessed by the value of the Omission Error Rate (OER), followed by the area under the receiver operator curve (AUC). OER quantifies the proportion of presences that the model predicts as an absence. The ROC plots sensitivity vs. commission error. The best model would have the lowest OER and a high AUC on a scale from 0-1.

Results
Roadkill were found in areas with higher percentages of developed land cover, while live armadillos were observed in areas with higher percentages of forested cover (χ 2 = 13.47, df = 5, p = 0.019, Figure 1). The model had an OER of 0.24, meaning it had a low false negative rate, and an AUC of 0.87, indicating a high sensitivity and specificity, or a high percentage of actual presences predicted and a low false positive rate [56,57]. The model used hinge as a feature class (i.e., parts of response curves were linear, while others were non-linear; [58]). The model used a regularization multiplier of 2.0 to best fit and smooth species response curves to environmental variables. Highest probability of the presence of armadillos was in residential areas and along roadways, with the most contiguous area of potential armadillo presence in southernmost counties characterized by large, forested patches ( Figure 2). In the northern portion of the study area, suitable habitat was concentrated around riparian areas (Figure 2). Land cover was the most important contributing variable to the model (83% contribution) and the most important variable in the jackknife of variable importance (Figure 3). Probability of armadillo presence was highest in open space development to medium development (Figure 4). Lowest probability of presence was in agricultural cover, with <20% probability for Highest probability of the presence of armadillos was in residential areas and along roadways, with the most contiguous area of potential armadillo presence in southernmost counties characterized by large, forested patches ( Figure 2). In the northern portion of the study area, suitable habitat was concentrated around riparian areas (Figure 2). Land cover was the most important contributing variable to the model (83% contribution) and the most important variable in the jackknife of variable importance (Figure 3). Probability of armadillo presence was highest in open space development to medium development ( Figure 4). Lowest probability of presence was in agricultural cover, with <20% probability for pasture, and <10% for cultivated crops (Figure 4). All other land cover types (open water, open space high development, barren land, forest, herbaceous, and wetland) had the same probability of presence, with about 43% probability of armadillo presence within each of these cover types. Annual precipitation (6.5%), distance to forest edge (5.9%), distance to water (4.1%), and mean temperature of the wettest quarter (0.4%) had little effect on armadillos' presence, and all other variables had 0% contribution. Distance to forest and distance to water showed stronger relationships to probability of presence in response curves using only the corresponding variable than in marginal response curves. Probability of armadillo presence increased as distance to forest decreased, and within the forest interior, presence was highest 250-500 m from forest edge ( Figure 5). Probability of presence also increased as distance to water decreased ( Figure 6). Predicted presence was also positively correlated to annual precipitation (Figure 7).        Annual precipitation (6.5%), distance to forest edge (5.9%), distance to water ( and mean temperature of the wettest quarter (0.4%) had little effect on armadillos' ence, and all other variables had 0% contribution. Distance to forest and distance to showed stronger relationships to probability of presence in response curves using the corresponding variable than in marginal response curves. Probability of arm presence increased as distance to forest decreased, and within the forest interior, pre was highest 250-500 m from forest edge ( Figure 5). Probability of presence also incr as distance to water decreased ( Figure 6). Predicted presence was also positively lated to annual precipitation (Figure 7).   Annual precipitation (6.5%), distance to forest edge (5.9%), distance to water (4.1%), and mean temperature of the wettest quarter (0.4%) had little effect on armadillos' presence, and all other variables had 0% contribution. Distance to forest and distance to water showed stronger relationships to probability of presence in response curves using only the corresponding variable than in marginal response curves. Probability of armadillo presence increased as distance to forest decreased, and within the forest interior, presence was highest 250-500 m from forest edge ( Figure 5). Probability of presence also increased as distance to water decreased ( Figure 6). Predicted presence was also positively correlated to annual precipitation (Figure 7).

Discussion
Our model of potential presence of colonizing armadillos in southern Illinois performed well, given the performance measures calculated [56]. Land cover was the most meaningful predictor of armadillo distribution, demonstrating that while climate may dictate broad distributional limits to armadillo populations [2,3], land cover type may be more important to determine the probability of armadillo presence at a fine scale. Variables that affect a species distribution are often influenced by the geographic extent of the model [36]. For example, climatic variables are important for global and meso-scales, while land cover variables influence species distributions for smaller scales [37]. Climate variables were generally uninformative in our analysis but were important to include given the proximity of the northern extent of likely armadillo expansion [2] to our study area.
Contrary to our prediction, the highest probability of armadillo presence was within open to medium intensity developed cover types, largely represented by roads and highways. This was likely because 54% of our presence dataset consisted of roadkill. Although

Discussion
Our model of potential presence of colonizing armadillos in southern Illinois performed well, given the performance measures calculated [56]. Land cover was the most meaningful predictor of armadillo distribution, demonstrating that while climate may dictate broad distributional limits to armadillo populations [2,3], land cover type may be more important to determine the probability of armadillo presence at a fine scale. Variables that affect a species distribution are often influenced by the geographic extent of the model [36]. For example, climatic variables are important for global and meso-scales, while land cover variables influence species distributions for smaller scales [37]. Climate variables were generally uninformative in our analysis but were important to include given the proximity of the northern extent of likely armadillo expansion [2] to our study area.
Contrary to our prediction, the highest probability of armadillo presence was within open to medium intensity developed cover types, largely represented by roads and highways. This was likely because 54% of our presence dataset consisted of roadkill. Although

Discussion
Our model of potential presence of colonizing armadillos in southern Illinois performed well, given the performance measures calculated [56]. Land cover was the most meaningful predictor of armadillo distribution, demonstrating that while climate may dictate broad distributional limits to armadillo populations [2,3], land cover type may be more important to determine the probability of armadillo presence at a fine scale. Variables that affect a species distribution are often influenced by the geographic extent of the model [36]. For example, climatic variables are important for global and meso-scales, while land cover variables influence species distributions for smaller scales [37]. Climate variables were generally uninformative in our analysis but were important to include given the proximity of the northern extent of likely armadillo expansion [2] to our study area.
Contrary to our prediction, the highest probability of armadillo presence was within open to medium intensity developed cover types, largely represented by roads and highways. This was likely because 54% of our presence dataset consisted of roadkill. Although this fact likely caused some bias in our model (as evidenced by roadkill locations occurring in areas of different habitat than for live armadillo locations), we believe it can still guide wildlife managers in predicting where armadillos may colonize in Illinois, as armadillos uti-lize areas with human disturbance and are even considered a pest, foraging on lawns and burrowing under man-made structures [1]. Additionally, roads may accelerate armadillo dispersal due to both accidental and purposeful human translocations. The expansion of transportation routes corresponded with the beginning of the armadillo's range expansion in Texas, and human introductions lead to established populations in Florida [1,25]. Human introductions of armadillos have occurred numerous times throughout the southern United States, including the release or escape of armadillos into new regions, along with accidental transport on vehicles carrying livestock [13,25].
Consistent with our predictions, probability of presence was high in forested areas, such as the Shawnee National Forest and Kaskaskia, Big Muddy, and Little Wabash river systems. This is consistent with the idea that armadillo activity and dispersal corridors are centered around riparian areas [1]. McDonough et al. [20] reported that live armadillos and burrows were frequently sighted in hardwood hammocks and wetland areas, but burrows were never identified in open fields, such as pasture and cropland. Large, contiguous patches of forest cover in southernmost counties will provide quality habitat for armadillos that will likely serve as source populations. These areas are also characterized by low densities of roads, and therefore present a lower risk for pioneering individuals, as car collisions are a major source of mortality [1].
Contrary to our expectations, the land cover classes with the lowest probability of presence were pasture and cropland cover types. We did not anticipate agricultural areas to be the least important cover type, as armadillos use these areas to forage [1], and some even consider armadillos a nuisance to crops [13]. However, agricultural matrices may be able to support armadillo populations given the presence of forested riparian habitat [39]. This low probability of presence could be attributed to lower detection rates in pasture and crop fields, though McDonough et al. [20] also found that armadillos were seen less frequently than expected in fields.
In contrast to our prediction, distance to forest cover, distance to water, and annual precipitation had little contribution to the model. However, there was a clear positive correlation of annual precipitation and predicted presence, which concurs with other studies [2,3,24]. Armadillos are rarely found in arid regions, and rainfall impacts prey availability and foraging activity [1]. Without fresh water access, such as in periods of prolonged drought, armadillo mortality rates increase [59].
Our study can provide valuable information to wildlife managers in anticipating where armadillos may colonize in Illinois, and may serve as a guide for similar studies in other states. Confirmed localities near the northern edge of the study area in combination with predicted suitable habitat suggest that armadillos have the potential to colonize areas in central Illinois. Pioneering armadillos have been most successful in riparian habitat [24,60], and such may be true for Illinois [7] and for other parts of the nation as armadillos continue to expand their range. These riparian corridors may be used by armadillos to move through areas of heavy agricultural use characteristic of Illinois. Riparian systems that continue northward into central Illinois may serve as a marginal sink in which armadillos can survive temporarily but climatic conditions, such as harsh winters or drought, may not support long-term persistence [3,61]. Managers should monitor forested areas within agriculture-dominated regions that may serve as a refuge for armadillos, resulting in increased human-wildlife conflict such as crop damage or predation of poultry eggs [13,39].
Additionally, our study can serve as a guide for where armadillo range may overlap with sensitive species. For example, species active in leaf litter at night when armadillos are foraging, such as salamanders, may be particularly vulnerable to predation [62], although armadillos rarely consume amphibians and reptiles [13]. However, special attention should be paid to species that are threatened in the state, such as the Jefferson Salamander (Ambystoma jeffersonianum, [63]). Furthermore, armadillo habitat overlaps suitable bobwhite habitat in portions of Illinois [17]. However, there is evidence that hardwood removal and prescribed burning used to manage bobwhite habitat also affects armadillo populations negatively [21]. Managers should consider using these tactics to simultaneously reduce armadillo populations and improve bobwhite habitat in regions where these species overlap.