Predicting Spatial Distribution of Key Honeybee Pests in Kenya Using Remotely Sensed and Bioclimatic Variables : Key Honeybee Pests Distribution Models

Bee keeping is indispensable to global food production. It is an alternate income source, especially in rural underdeveloped African settlements, and an important forest conservation incentive. However, dwindling honeybee colonies around the world are attributed to pests and diseases whose spatial distribution and influences are not well established. In this study, we used remotely sensed data to improve the reliability of pest ecological niche (EN) models to attain reliable pest distribution maps. Occurrence data on four pests (Aethina tumida, Galleria mellonella, Oplostomus haroldi and Varroa destructor) were collected from apiaries within four main agro-ecological regions responsible for over 80% of Kenya’s bee keeping. Africlim bioclimatic and derived normalized difference vegetation index (NDVI) variables were used to model their ecological niches using Maximum Entropy (MaxEnt). Combined precipitation variables had a high positive logit influence on all remotely sensed and biotic models’ performance. Remotely sensed vegetation variables had a substantial effect on the model, contributing up to 40.8% for G. mellonella and regions with high rainfall seasonality were predicted to be high-risk areas. Projections (to 2055) indicated that, with the current climate change trend, these regions will experience increased honeybee pest risk. We conclude that honeybee pests could be modelled using bioclimatic data and remotely sensed variables in MaxEnt. Although the bioclimatic data were most relevant in all model results, incorporating vegetation seasonality variables to improve mapping the ‘actual’ habitat of key honeybee pests and to identify risk and containment zones needs to be further investigated.


Introduction
Bee keeping is an important economic activity globally.Its pollinator services are particularly indispensable to global food production and nutritional security [1,2].Additionally, bee keeping aids natural resource conservation, especially in communities living around forests.It diversifies households' incomes in African savannahs often dominated by erratic and unreliable rainfall unable to sufficiently support rain-fed agriculture [3].The supplemental income comes from the sale of hive products such as honey, wax, propolis, royal jelly and to a lesser extent bee venom.However, honeybee health, pollination services and associated livelihood benefits are threatened by climate change, habitat alteration (fragmentation and loss), agriculture intensification, over dependence on agrochemicals, and increasingly by pathogens, pest and diseases [4,5].
Pests are particularly considered the most economically important due to their spatial coverage and ability to inflict both direct (through physical injury) and indirect (as pathogen and disease vectors) damage.Despite their role in colony destruction, occurrence and distribution, honeybee pests remain understudied within the Africa tropics, with investigations restricted to Kenya, Malawi and South Africa [6].Furthermore, data on in-country variations from one agro-ecological zone to another are scarce.In Africa, small hive beetle (Aethina tumida), large hive beetles (Oplostomus haroldi and Oplostomus fuligineus), wax moths (Galleria mellonella and Achroia grisella) and Varroa mite (Varroa destructor), are important pests capable of causing substantial economic damage [4,[6][7][8].Among these, the Varroa mite remains the single most devastating pest to honeybees due to its ability to inflict direct damage via its hematophagous trophic habit and indirectly through the active transmission of honeybee viruses [4,6,9].
Little is known of the spatial and temporal distribution of these pests on the continent.The occurrence and distribution of these pests is influenced by various biotic and abiotic variables.Like other pests, honeybee pests can survive in certain optimal bioclimatic conditions.For instance, the optimal temperature, humidity, precipitation, altitude and biomass/net primary productivity ranges for different honeybee pests can vary significantly [10].Studies have shown that the reproductive ability of honeybee pests can be limited by the prevailing dry conditions and enhanced by hot and humid conditions [8,11].Vegetation phenology can also influence honeybee pest population dynamics and their species richness.Vegetation phenology determines the availability of food resources, which their hosts (honeybees) use to make hive products such as honey and beebread on which the pests thrive on.In addition, plant resins collected by honeybees are useful for hive construction and defense against foreign organisms (social encapsulation) and self-medication [12,13].Therefore, vegetation phenology plays a central role in colony health both directly and indirectly by affecting the life stages of pests such as the hive beetles, which occur outside the hive environment [6,14].
Ecological niche (EN) modelling approaches can provide a pathway that statistically link the spatial variations in the bioclimatic variables to the distribution of a particular species (including honeybee pests).EN models rely on the correlations between habitats where different organisms (e.g., honeybee pests) thrive and environmental variables such as bioclimatic conditions, terrain and vegetation [15,16].However, the accuracy of the locations and temporal distribution of the honeybee pests and their habitats are important factors for deriving accurate models [17,18].Appropriate EN models that could accurately predict the spatial distribution of honeybee pest species rely on representative and accurate pest presence data together with carefully selected ecological and climatic predictor variables [19].
A fine and realistic geographic distribution of honeybee pest species is attained using remotely sensed variables such as vegetation 'greenness' dynamics [20].This is because remotely sensed variables are continuous observations with high spatial and temporal resolution compared to bioclimatic variables that are modelled or interpolated [14].Well processed remotely sensed data also help in reducing over-fitting (over predicting) of the EN models as such improving the accuracy and value of species and pest distribution estimates [20].
Although a number of studies have elucidated the effects of various biological, climatic, vegetation and edaphic factors on some honeybee pests, such as small hive beetles [21] and large hive beetles [7,8], the contribution of vegetation phenology and bioclimatic interactions on their distribution and abundance in Kenya remains minimally exploited.Therefore, mapping the spatial distribution of the honeybee pests is valuable for risk assessment and zonation of honeybee pest risk areas to establish containment zones and pathways of source transmission.State of the art remote sensing techniques can be used to derive phenological variables (such as start and end of the season) from remotely sensed data that offer fine scaled and specific information for prediction of the spread of honeybee pests.A model that links remotely sensed and bioclimatic variables provides an operational system to monitor long term changes related to the spread of pests [20].It also offers a tool that can model futuristic scenarios of the honeybee pest distribution since projected climate and phenological responses can be modelled.
In this study, we aimed to identify key ecological/remotely sensed and bioclimatic variables associated with the proliferation of honeybee pests in Kenya.We also sought to integrate remotely sensed variables for more coherent distribution maps using derived vegetation phenology parameters.We alluded to future scenarios of the distribution of these pests using modelled predictive data.

Study Sites
Occurrence data used in this study were collected from four main regions (agro-ecological zones) in Kenya; Coast, Mount Kenya, Mwingi and Kakamega (Figure 1), which are categorized by a representative climate gradient.An area covering 251,322.66square kilometers, spanning from the coastal region through the central and eastern part of Kenya to Lake Victoria, was used to model the ecological niches of these bee pests (Figure 1).This study site consists of 32 administrative counties.Temperature and humidity are highest in the coastal region while the Mwingi site is located in a semi-arid savannah region which is hot and dry [22].Mount Kenya and Kakamega are high altitudes regions that exhibit cooler and wetter climatic conditions [23].
ISPRS Int.J. Geo-Inf.2017, 6, 66 3 of 19 from remotely sensed data that offer fine scaled and specific information for prediction of the spread of honeybee pests.A model that links remotely sensed and bioclimatic variables provides an operational system to monitor long term changes related to the spread of pests [20].It also offers a tool that can model futuristic scenarios of the honeybee pest distribution since projected climate and phenological responses can be modelled.
In this study, we aimed to identify key ecological/remotely sensed and bioclimatic variables associated with the proliferation of honeybee pests in Kenya.We also sought to integrate remotely sensed variables for more coherent distribution maps using derived vegetation phenology parameters.We alluded to future scenarios of the distribution of these pests using modelled predictive data.

Study Sites
Occurrence data used in this study were collected from four main regions (agro-ecological zones) in Kenya; Coast, Mount Kenya, Mwingi and Kakamega (Figure 1), which are categorized by a representative climate gradient.An area covering 251,322.66square kilometers, spanning from the coastal region through the central and eastern part of Kenya to Lake Victoria, was used to model the ecological niches of these bee pests (Figure 1).This study site consists of 32 administrative counties.Temperature and humidity are highest in the coastal region while the Mwingi site is located in a semi-arid savannah region which is hot and dry [22].Mount Kenya and Kakamega are high altitudes regions that exhibit cooler and wetter climatic conditions [23].

Occurrence Data
Honeybee pest species occurrence data were collected during the wet season (March to April 2014) and dry season (June 2015).An apiary with 5-10 colonies in one location was treated as one sample point.Four honeybee pest species (A.tumida, G. mellonella, O. haroldi and V. destructor) were selected for this study.These four pests are the most frequently encountered and considered the most economically important with widespread prevalence as invasive pests globally [7,8].A census of pests in five colonies in each apiary was done and their presence recorded using standard methods as described by Dietemann et al. [24] for A. tumida, G. mellonella and V. destructor, and by Torto et al. [8] for O. haroldi.We sampled 37 apiaries which had A. tumida present, 38 had G. mellonella, 24 had O. haroldi, and 45 had V. destructor in all the regions included in this study.These were within the sample sizes acceptable in the MaxEnt modelling environment [10,25].
We assumed that individuals of these species were uniformly distributed across space.Presence data were collected from different sites independent from the environmental variables according to unknown probability of distribution.Therefore, apiaries were sampled randomly according to the requirements of estimation density, which obliges that individuals be sampled randomly across landscape in proportion to population density [26].
To test the influence of seasonality on the honeybee pest abundance, Mann-Whitney U-test was performed to test the significance (p ≤ 0.05) differences between wet and dry seasons following the observation of heterogeneous variance.

Preparation of Data for Analysis
Prediction analyses were based on various geographical data sets, which were raster and vector, clipped to the boundary of the study site.These data sets were divided into two main groups; occurrence and predictor variables.Predictor variables were categorized into remotely sensed and bioclimatic variables (Table 1).Remotely sensed variables on vegetation productivity dynamics from Moderate Resolution Imaging Spectroradiometer (MODIS) time series data, at 250-meter spatial resolution, were processed and used as ecological predictor variables in the EN modelling (Table 1).Corrected MODIS 16-day NDVI composites were derived for the years 2001 to 2014 (14-year observation period).The corrected time series data sets were further processed in TIMESAT software [27,28] to derive vegetation seasonality variables that formed part of the input environmental variables in the Maximum Entropy modelling algorithm (Table 1).
TIMESAT analyses phenological signals found in time series satellite data by fitting local functions to the time series data points, then combines them into a global model [29,30].Thereafter, a smooth model function is used to extract phenological variables for each growing season, which in turn reduces the influence of residual signal noise in the NDVI time series data [31,32] and data dimensionality [33,34].Function-fitting parameters used in TIMESAT for this study were: a Savitzky-Golay filter procedure, 3-and 4-point window over 2 fitting steps, adaptation strength of 3.0, no spike or amplitude cutoffs, season cutoff of 0.0, and begin and end of season threshold of 20%.
In total, 12 remotely sensed variables were extracted for each growing season; time for the start of the season (seas_start) which was set at 20% from the left edge minimum, time for the end of the season (seas_end) set at 20% from the right minimum, length of the season or time from the start to the end of the growing season (seas_length), base level, which was calculated by averaging the left and right minimum values (base_level), time for the mid of the season (seas_mid), largest data value for the fitted function during the season (max), seasonal amplitude (amplitude), rate of increase at the beginning of the season (left_der), rate of decrease at the end of the season (right_der), large seasonal integral (large_int), small seasonal integral (small_int) and number of seasons in a calendar year (num_seas) (Table 1).Only variables for the first season were used in this study since data from the second season were not consistent throughout all the years.

Topographical Variables
Various topographical variables; elevation, slope, hillshade and aspect in radiation degrees (Table 1) were included in the EN model to predict occurrence of the four honeybee pests species.The topographical variables were derived from a filled 90 m digital elevation model (DEM) data from the Shuttle Radar Topography Mission (SRTM) instruments [35].The DEM data were resampled to fit the 250 m pixel size and the spatial extent of the MODIS 16-day NDVI datasets.

Bioclimatic Data
For simulation of the honeybee pest species distribution, current climatic conditions at one kilometer grid resolution from the AfriClim data set were used [36][37][38].This data set contains grids of temperature, rainfall and derived bioclimatic summary variables.For prediction of distribution under future climatic condition simulations, downscaled data of the Representative Concentration Pathways Scenarios, Fifth Assessment Report (RCPs-AR5) [39] future year 2055 (mean over 2041-2070) were used.
Bioclimatic data (Table 1) were divided into temperature and moisture variables.The temperature variables included mean annual temperature (bio1), mean diurnal temperature (bio2), isothermality (bio3), temperature seasonality (bio4), maximum temperature for the warmest month (bio5), minimum temperature of the coolest month (bio6), annual temperature range (bio7), mean temperature of the warmest quarter (bio10), mean temperature of coolest quarter (bio11), and potential evapotranspiration (pet).Moisture variables were; mean annual rainfall (bio12), rainfall of the wettest month (bio13), rainfall of the driest month (bio14), rainfall seasonality (bio15), rainfall of the wettest quarter (bio16), rainfall of the driest quarter (bio17), annual moisture index (mi), moisture index of the moist quarter (mimq), moisture index of the arid quarter (miaq), number of dry months (dm), and length of the longest dry season (llds).All the 21 climatic variables were resampled to 250 m cell size and to the spatial extent of the MODIS 16-day NDVI datasets.

Ecological Niche Modelling
Ecological niches of the four honeybee pests were modelled using Maximum Entropy (MaxEnt) in the NMaxEnt tool package version 3.3.3k[40].Probability of estimation for honeybee pest species occurrence was fitted to a set of pixels of features to maximize entropy of estimation density under given constraints of unknown variable phenomena.The values of these pixels were then used to estimate the distribution probability of these pest species.Automated random sampling of pseudo-absence/background samples from a set of pixels within the boundaries of Kenya [41], where these species have not been detected [42] was used to maximize predictivity.Pseudoabsence of the probability of presence which would otherwise be confined to presence only [43].Each species was assumed to have the same probability of being anywhere on the landscape, hence every pixel or environment on the landscape had the same probability of being tagged as "background" in geographic and environmental space.

Variable Selection
To minimize multicollinearity among predictor variables [26], we performed a Pearson correlation test between all the predictors presented in Table 1 to get orthogonal features suitable for the EN model.A correlation coefficient of |r| > 0.7 [44] was set as a collinearity indicator for variables that would severely affect our EN model.Variables that met this criterion were eliminated based on the variable importance analysis, which was performed to identify those with high explanatory power.Test runs were carried out using all bioclimatic and remotely sensed variables in Table 1 to analyze their contribution (Figure 2) to each pest species before building their niche models using the jackknife test [45].A summary of the methodology is illustrated in a flowchart (Figure 3).The interactions of the various variables, occurrence data and selection of the suitable model are graphically represented.

Model Settings
We used default MaxEnt settings with a few alterations aimed at attaining precise and accurate predictions [19,26].Twenty percent of occurrence records were withheld from each model to be used as independent test data.However, the default settings penalized our model by overfitting.Therefore, we opted to use the regularization multiplier of three to spread out the predictions and set the replicate runs at ten to obtain a robust model.Due to increased replication, random seed was used to select different random test/run partition for each run.We used the '10 percentile training presence threshold' which predicts the 10% most extreme presence observation as absent [20] to eliminate outliers from the model.
Modelling was performed under current bioclimatic, vegetation and topographical conditions.Specifically, models were developed using only bioclimatic predictors (model 1), only remotely sensed variables (model 2) and both bioclimatic and remotely sensed variables (model 3) combined.These models were projected into the future (year 2055).We used future climate simulations from Africlim to project future scenario and due to absence of vegetation projection we assumed vegetation and topography to be unchanging over the projection period.A summary of the methodology is illustrated in a flowchart (Figure 3).The interactions of the various variables, occurrence data and selection of the suitable model are graphically represented.

Model Settings
We used default MaxEnt settings with a few alterations aimed at attaining precise and accurate predictions [19,26].Twenty percent of occurrence records were withheld from each model to be used as independent test data.However, the default settings penalized our model by overfitting.Therefore, we opted to use the regularization multiplier of three to spread out the predictions and set the replicate runs at ten to obtain a robust model.Due to increased replication, random seed was used to select different random test/run partition for each run.We used the '10 percentile training presence threshold' which predicts the 10% most extreme presence observation as absent [20] to eliminate outliers from the model.
Modelling was performed under current bioclimatic, vegetation and topographical conditions.Specifically, models were developed using only bioclimatic predictors (model 1), only remotely sensed variables (model 2) and both bioclimatic and remotely sensed variables (model 3) combined.These models were projected into the future (year 2055).We used future climate simulations from Africlim to project future scenario and due to absence of vegetation projection we assumed vegetation and topography to be unchanging over the projection period.

Model Evaluation
We used threshold independent area under curve (AUC) of receiver operating characteristic (ROC) analysis model [19] to assess model performance.The area under a ROC curve indicates the probability that presence (sensitivity) versus absence (specificity) or background points [46] were ordered correctly by a classifier.AUC values of zero (0) indicate impossible occurrence area while one (1) indicate optimal occurrence area or perfect ordering [47].We used the Swets [48] discriminatory power to rank the models as; (i) excellent = 0.91-1.00,(ii) good = 0.81-0.90,(iii) fair = 0.71-0.80,(iv) poor = 0.61-0.70,and (v) fail = 0.51-0.60.To assess the utility of remotely sensed variables in improving the model performance, the null model [49] was used.Thirty-six randomly generated points from the observed presence cells were used to extract AUC values from each of the three models (model 3, model 1 and model 2).Mann-Whitney U-test was used to evaluate the

Model Evaluation
We used threshold independent area under curve (AUC) of receiver operating characteristic (ROC) analysis model [19] to assess model performance.The area under a ROC curve indicates the probability that presence (sensitivity) versus absence (specificity) or background points [46] were ordered correctly by a classifier.AUC values of zero (0) indicate impossible occurrence area while one (1) indicate optimal occurrence area or perfect ordering [47].We used the Swets [48] discriminatory power to rank the models as; (i) excellent = 0.91-1.00,(ii) good = 0.81-0.90,(iii) fair = 0.71-0.80,(iv) poor = 0.61-0.70,and (v) fail = 0.51-0.60.To assess the utility of remotely sensed variables in improving the model performance, the null model [49] was used.Thirty-six randomly generated points from the observed presence cells were used to extract AUC values from each of the three models (model 3, model 1 and model 2).Mann-Whitney U-test was used to evaluate the significance (p ≤ 0.05) differences between the three models.We further calculated the omission error of the various models developed, since our models used presence only data and calculating the commission error requires absence data [50].

Honeybee Pest Abundance
Table 2 shows the abundance data of the four honeybee pests combined across the four regions (Mount Kenya, Mwingi, Kakamega and coastal regions) collected during the wet and dry seasons.Generally, there were more pests during the wet season compared to the dry season.There was a significant difference in pest abundance between seasons.Moreover, A. tumida and V. destructor appeared to be more sensitive to precipitation changes as their abundance increased by almost six folds.

EN Models
Table 3 shows the threshold-independent AUC obtained from the ROC analysis model, which indicated 'good' performance for all model 3 and all model 1 except G. mellonella's, and fair model 2 for all honeybee pests.Model 1 achieved higher AUC values for all pests than model 2.However, model 3 had higher AUC scores for A. tumida (0.87), O. haroldi (0.87) and V. destructor (0.88) than model 1 or model 2. The AUC values for G. mellonella model 3 (0.80) were the same as model 1 (0.80).However, model 3 prediction maps showed lower overfitting with low standard deviation in all honeybee pest species models (Table 3).The null model analysis showed that there was significant difference between model 2 and models 1 and 3 for all species except for O. haroldi.However, there was no significant difference between model 1 and model 3. Therefore, model 2 was eliminated from further model comparison.
The pest models had low omission errors for model 3 than the other models for all pests (listed as a percentage for model 3, model 1 and model 2 respectively, A. tumida 16.6, 19.4 and 25.0, G. mellonella 27.7, 33.3 and 38.9, O. haroldi 30.5, 33.3 and 38.9 and V. distractor 19.4,25.0 and 30.6).Model 3 performed better for all bee pests than the other models.

Predictor Variable Contribution
In all model 3 for the four honeybee pests, bioclimatic variables combined had the strongest relative influence than remotely sensed variables combined.Bioclimatic variables had a contribution of 88.1% for O. haroldi, 83.0% for A. tumida, 68.3% for V. destructor, and 58.7% for G. mellonella (Table 4).Further, in the bioclimatic cluster, precipitation variables had more influence (ranging from 58.0% to 81.0%) than temperature variables (ranging from 0.4% to 7.3%).Remotely sensed variables in model 3 on the other hand (biotic and topographical) had a total contribution of 17.0% for A. tumida, 41.3% for G. mollenela, 11.9% for O. haroldi and 31.7% for V. destructor.The combined relative contribution of remotely sensed biotic variables in model 3 ranged from 11.4% to 40.8% (Table 4).In this variable cluster, maximum NDVI in the season had a substantial relative contribution (between 7.7% and 29.4%) to the EN models.Topographical variables had a relatively lower contribution, less than 4.2% and no modelled effect on V. destructor.Based on the jackknife tests, the relative variable importance (both bioclimatic and remotely sensed) to model 3 (Figure 4), showed varied contribution to the various EN models.The first five ranked variables (bio15, mimq, max_ndvi, amplitude and slope) cumulatively contributed to 94.3% in the A. tumida model.The first five variables (mimq, max_ndvi, bio12, bio15 and base_level) had 96.3% cumulative contribution to G. mellonella model and 92.8% to V. destructor model, while bio15, bio3 and max_ndvi had a cumulative contribution of 95.3% to O. haroldi model.
Bio15 had the highest gain on A. tumida (43.9%),O. haroldi (79.7%) and V. destructor (28.2%) in model 3. Mimq contributed most of the information used in predicting habitat suitability of G. mellonella (32.4%).All these variables had a positive logit on the respective model 3. Topographical variables did not have any influence on V. destructor with little influence on G. mellonella and O. haroldi models.On the other hand, A. tumida and G. mellonella model 1 indicated that mimq had the highest positive effect on their distribution, while bio12 influenced V. destructor most.The O. haroldi model 1 was mostly influenced by bio15.Compared to the other remotely sensed variables, max_ndvi exhibited the largest influence in all the four EN models pertaining to the four honeybee pests.

Visualization of Distribution
The honeybee pest models predicted high habitat suitability in the Mount Kenya and the coastal regions for all four honeybee pests (Figure 5).Mwingi region was predicted to have low (G.mellonella) to high (O.haroldi) habitat suitability.However, Kakamega region had varied likelihood for suitability of all honeybee pests.This area was predicted to have a low O. haroldi risk, moderate V. destructor to moderately high A. tumida and high G. mellonella habitat suitability (Figure 6).Mount Kenya, Kakamega and the coastal regions are characterized by high rainfall seasonality.The coastal region however had higher temperature and humidity than Mount Kenya and Kakamega regions.On the other hand, Mwingi had relatively high temperature and lower isothermality than Mount Kenya and Kakamega.Although the coastal and Mount Kenya regions had similar rainfall seasonality, they had different temperature and isothermality gradients.Conversely, the coastal region and Mwingi had higher mean annual temperature and lower isothermality than the other regions, but exhibited different honeybee pest risk potentials.

Visualization of Distribution
The honeybee pest models predicted high habitat suitability in the Mount Kenya and the coastal regions for all four honeybee pests (Figure 5).Mwingi region was predicted to have low (G.mellonella) to high (O.haroldi) habitat suitability.However, Kakamega region had varied likelihood for suitability of all honeybee pests.This area was predicted to have a low O. haroldi risk, moderate V. destructor to moderately high A. tumida and high G. mellonella habitat suitability (Figure 6).Mount Kenya, Kakamega and the coastal regions are characterized by high rainfall seasonality.The coastal region however had higher temperature and humidity than Mount Kenya and Kakamega regions.On the other hand, Mwingi had relatively high temperature and lower isothermality than Mount Kenya and Kakamega.Although the coastal and Mount Kenya regions had similar rainfall seasonality, they had different temperature and isothermality gradients.Conversely, the coastal region and Mwingi had higher mean annual temperature and lower isothermality than the other regions, but exhibited different honeybee pest risk potentials.
Futuristic scenario models for the four honeybee pests generally indicated that high habitat suitability areas may spread to areas currently classified to have medium to low pest suitability (Figure 6).Modelled bioclimatic data essentially predict an increase in amount of rainfall in the wettest quarter and increased rainfall variability.Mount Kenya and the coastal regions were predicted to be high risk for all the pests while the risk in Kakamega varied from one pest to the other.Futuristic scenario models for the four honeybee pests generally indicated that high habitat suitability areas may spread to areas currently classified to have medium to low pest suitability (Figure 6).Modelled bioclimatic data essentially predict an increase in amount of rainfall in the wettest quarter and increased rainfall variability.

Contribution of Remotely Sensed Data
Remotely sensed variables increased the accuracy and precision of the models by reducing overfitting and refining their predictivity power (Table 3).A subset from the V. destructor maps showed that model 1 had over predicted distribution compared to model 3, which had remotely sensed variables with smaller pixel size (high resolution) and more environmental heterogeneity incorporated (Figure 7).
Model 3 had lower mean of habitat suitability than the model 1 for all the honeybee pests investigated and lower standard deviation for A. tumida and V. destructor (Figure 8).Model 3 histogram shows that predictivity was consolidated around the mean while bio_model histogram shows skewness to the right.Consolidation around the mean was occasioned by remotely sensed variables, whose measurement at grain size ensured variability was detectable at higher precision limited by spatial resolution.Moreover, model 3 had higher ranked AUC values and lower standard deviations than model 1 for all models except G. mellonalla, which, however, had lower standard deviation than its model 1 (Table 3).

Contribution of Remotely Sensed Data
Remotely sensed variables increased the accuracy and precision of the models by reducing overfitting and refining their predictivity power (Table 3).A subset from the V. destructor maps showed that model 1 had over predicted distribution compared to model 3, which had remotely sensed variables with smaller pixel size (high resolution) and more environmental heterogeneity incorporated (Figure 7).Model 3 had lower mean of habitat suitability than the model 1 for all the honeybee pests investigated and lower standard deviation for A. tumida and V. destructor (Figure 8).Model 3 histogram shows that predictivity was consolidated around the mean while bio_model histogram shows skewness to the right.Consolidation around the mean was occasioned by remotely sensed variables, whose measurement at grain size ensured variability was detectable at higher precision limited by spatial resolution.Moreover, model 3 had higher ranked AUC values and lower standard deviations than model 1 for all models except G. mellonalla, which, however, had lower standard deviation than its model 1 (Table 3).

Discussion
Understanding factors that influence the multiplication and spread of honeybee pests across various spatial extents and the level of risk they pose is important in bee health.Therefore, improved predictability of pest diversity and accurate honeybee pests' distribution maps will provide tools and information necessary for honeybee pest control [51].This study sought to identify conditions that encourage increase of honeybee pests in Kenya and developed models that could be used to predict their spread.Projected models served to show the level of the risk posed by the honeybee pest to the bee keepers in future.
Our surveillance data confirmed the presence of variation in abundance and distribution of the four honeybee pest species across the study sites.There were more records observed during the wet than the dry season.These findings were in line with a previous study by Torto et al. [8] which showed higher hive beetles abundance during rainy than the dry seasons in Kenya.However, observed seasonal variation in Varroa mite populations is contrary to previous reports of fairly constant mite infestations across Apis mellifera scutellata (the savannah honey bee) [52].Moreover, the coastal lowlands are inhabited by A. m. litorea and forest highlands by monticolla-like bees, which hybridize with savannah bees along their migratory paths [53].Therefore, there is need to further understand the interaction of these honeybee pests across the different spatial divides.

Predictor Variable Contribution
Although most honeybee pests have wider spatial ranges and can be found in various regions with different climatic conditions [54], there are specific climatic conditions that play a key role in determining their distribution.For instance, hive beetles and wax moth can thrive on alternative food sources while Varroa is an obligate ectoparasite.Therefore, the former two are more likely affected by climatic variations than the latter.We therefore identified four key bioclimatic and remote sensing variables to be the most important in determining occurrence of the four honeybee pests.Rainfall seasonality, mean annual rainfall, moisture index in moist quarter and largest NDVI value in the season had the most logit influence on the model 3 for the four honeybee pests (Table 4).
These variables are predominantly linked to precipitation apart from maximum NDVI in the season.Moreover, field occurrence data (Table 2) showed that there were more pests recorded in the wet season than the dry season.Similar results were presented by Torto et al. [8] who documented that precipitation's positive effect on Varroa was indirectly linked to production of bee forage that influence colony growth and brood increment.This in turn provide ample reproductive sites for the mite [24].Distribution maps alluded to the influence of precipitation on the occurrence and

Discussion
Understanding factors that influence the multiplication and spread of honeybee pests across various spatial extents and the level of risk they pose is important in bee health.Therefore, improved predictability of pest diversity and accurate honeybee pests' distribution maps will provide tools and information necessary for honeybee pest control [51].This study sought to identify conditions that encourage increase of honeybee pests in Kenya and developed models that could be used to predict their spread.Projected models served to show the level of the risk posed by the honeybee pest to the bee keepers in future.
Our surveillance data confirmed the presence of variation in abundance and distribution of the four honeybee pest species across the study sites.There were more records observed during the wet than the dry season.These findings were in line with a previous study by Torto et al. [8] which showed higher hive beetles abundance during rainy than the dry seasons in Kenya.However, observed seasonal variation in Varroa mite populations is contrary to previous reports of fairly constant mite infestations across Apis mellifera scutellata (the savannah honey bee) [52].Moreover, the coastal lowlands are inhabited by A. m. litorea and forest highlands by monticolla-like bees, which hybridize with savannah bees along their migratory paths [53].Therefore, there is need to further understand the interaction of these honeybee pests across the different spatial divides.

Predictor Variable Contribution
Although most honeybee pests have wider spatial ranges and can be found in various regions with different climatic conditions [54], there are specific climatic conditions that play a key role in determining their distribution.For instance, hive beetles and wax moth can thrive on alternative food sources while Varroa is an obligate ectoparasite.Therefore, the former two are more likely affected by climatic variations than the latter.We therefore identified four key bioclimatic and remote sensing variables to be the most important in determining occurrence of the four honeybee pests.Rainfall seasonality, mean annual rainfall, moisture index in moist quarter and largest NDVI value in the season had the most logit influence on the model 3 for the four honeybee pests (Table 4).
These variables are predominantly linked to precipitation apart from maximum NDVI in the season.Moreover, field occurrence data (Table 2) showed that there were more pests recorded in the wet season than the dry season.Similar results were presented by Torto et al. [8] who documented that precipitation's positive effect on Varroa was indirectly linked to production of bee forage that influence colony growth and brood increment.This in turn provide ample reproductive sites for the mite [24].Distribution maps alluded to the influence of precipitation on the occurrence and distribution of the honeybee pest.Occurrence and distribution was high in regions such as Mount Kenya, Kakamega and the coastal strip.Therefore, these four precipitation variables can be used to successfully predict the seasonal variations of these honeybee pests in various regions within country [8].As 'precipitation' is an overly important variable and given the future increases in predicted precipitation amounts and variability [36,37], there may be higher occurrences of the four pests we investigated.
Our EN models showed that topography had little to no influence on the occurrence of the modelled honeybee pests in Kenya.The models predicted both the regions with high altitude (such as Mount Kenya) and low altitude (such as the coast), to have high habitat suitability.This was consistent with Mani's [55] conclusion that most insects (honeybee pests included) had diverse altitudinal gradients.Indeed, the honeybee pest we investigated had the same habitat suitability across all altitudinal gradients.

Contribution of Remotely Sensed Data
The inclusion of remotely sensed variables into our EN models improved their ability to predict the distribution of honeybee pest species.Even though bioclimatic variables offered the ability to predict potential distribution of the honeybee pests [56], remotely sensed variables enhanced the predictability of model 3 for the four pests by reducing overfitting as shown in Figure 7.The prediction ability of these EN models improved because remotely sensed variables captured fragmented environmental conditions on the ground.The fine spatial resolution of remotely sensed variables reduced the errors that emanate from the generalization in bioclimatic variables over given spatial extents.Moreover, the model products (prediction maps) were more realistic in predicting habitat suitability in model 3 and they had reduced overestimation [20,57].
Remotely sensed variables also capture 'adverse' environmental conditions that limit distribution of these honeybee pests across homogenous bioclimatic conditions.Model 1 would predict similar occurrence within regions with similar bioclimatic conditions as shown in Figure 7.However, with remotely sensed variables such as maximum NDVI in the season, high honeybee pest potentials were predicted by model 3 in areas with high NDVI values as opposed to those that recorded low values.Torto et al. [8] demonstrated that honeybee pests reproduction is influenced by vegetation phenology such as NDVI.Moreover Pau et al. [14] also indicated that success of honeybee pests is determined by availability of pollen and nectar, on which their hosts depend on.Therefore, remotely sensed variables that capture these changes (i.e., NDVI variables) offer model 3 information that is not available in the model 1.Remotely sensed variables also provide the ability to refine the prediction of honeybee pest distribution as opposed to bioclimatic variables which reflect potential honeybee pest distribution [20].Further, remotely sensed environmental variables capture heterogeneity that is reflective of ground condition on or near real time basis, while bioclimatic data are interpolated at large areas hence depicting homogeneity at large spatial extents [58,59].

Advantages of Using Integrative EN Models and Applicability
EN models rely only on occurrence to estimate predictions.Availability of absence data improves the model precisions but such data are not readily available and hence one cannot conclude with certainty that lack of occurrence in an area confirms absence [60].Therefore, EN models use pseudoabsence to fill this gap.However, important information such as abundance of a pest in a certain region, which could improve prediction, is not utilized.Various remotely sensed data with different spatial-temporal and spectra resolutions are available, thus widening the EN modelling boundaries [20].Advanced analytical approaches of remotely sensed data also offer the ability to derive and use specific phenological information from raw data to improve detectability.
Although remotely sensed data have been previously used to map the distribution of various plant species [20,61], our models offer tools that could be used to accurately map the distribution of honeybee pests and conceivably other insect species.This is because remotely sensed data incorporated in our models provide timely (recent) information that allow vegetation changes as a result of anthropogenic land use, especially in agricultural areas or unprotected forests under consideration.Consequently, we offer tools that could be used in similar or differently designed methodologies as those documented herein, to improve pest management.
Remotely sensed data however have shortcomings that could negatively impact the prediction models.Since pixel size is an important indicator of the quality of the data, accuracy achieved by prediction models that have used remotely sensed variables are limited to the pixel size.Therefore, EN models that use data with coarse resolution may not give accurate predictions [57].On the other hand, remotely sensed data with fine spatial and spectral resolution, such as World View-2 and 3, AISA Eagle, are costly and unavailable.Moreover, challenges such as cloud cover greatly degrade the quality of optical remotely sensed data because environmental measurements may not be accurately captured or unavailable when the cloud cover is substantial.This however may be overcome by high temporal resolution which offers options to circumvent data with low quality or using microwave remote sensing technology that is not affected by cloud cover.
EN models require variables with the same pixel size (i.e., spatial resolution) in analysis.Seldom will spatial data from different sources have similar spatial resolution.Resampling the datasets to achieve same pixel size could introduce uncertainties [62,63] to the EN models.Such uncertainties and those associated with presence data collection may be misleading [64].Therefore, remotely sensed data used to build prediction models should be carefully selected, considering spatial, spectral and temporal resolutions.Moreover, vegetation data projected to future scenarios that offer EN models the ability to produce more refined prediction maps are not readily available.This leads to model assumption, which may include similar vegetation conditions both in the current and future scenarios that may reduce reliability of projected models in predictions.Additionally, data on flowering conditions that influence bee and pest populations are also not readily available.Flower maps in Kenya are only available in small spatial extents that do not span the whole area under this research [65,66].Similarly, even though apiaries were selected from representative agro ecological sites in Kenya, the full climatic niches of the studied bee pests might not have indeed been captured, particularly the background points were drawn from the whole extent of Kenya.Therefore, the individual variable contribution to the spread of the bee pests outside their training range should be further investigated.

Conclusions
EN models developed were 'good', hence deemed appropriate in predicting bee pest habitat suitability.However, it was established that model 1 which used only bioclimatic data had lower AUC values with higher standard deviations than model 3 that used both bioclimatic and remotely sensed data.Although remotely sensed data improved the models precision, bioclimatic variables influenced the models more than remotely sensed variables.It was also established from modelled bioclimatic data that bioclimatic change will have an impact on the distribution of these honeybee pests.Their spatial distribution would increase to areas currently classified to have low occurrence.Projected pest predictions could however be improved if appropriate pest management measures are put in place.Prediction maps could be used to identify high risk areas for quarantine to limit the spread of the pests.Therefore, these models would be important tools to decision makers in mitigating effects of destructive pests.Moreover, containment zones need further investigation to establish conditions that prohibit the proliferation of these pests.Targeted interventions should focus on zones where future pest outbreaks are expected.

Figure 1 .
Figure 1.Study area and the distribution of apiaries (white points) from where occurrence data were collected in four regions across Kenya categorized by a representative climate gradient.The backdrop shows the climatic gradients across the region.The apiaries are distributed across different agroecological zones.

Figure 1 .
Figure 1.Study area and the distribution of apiaries (white points) from where occurrence data were collected in four regions across Kenya categorized by a representative climate gradient.The backdrop shows the climatic gradients across the region.The apiaries are distributed across different agroecological zones.

Figure 2 .
Figure 2. Collinearity matrix for ecological niche models predictor variables.The collinearity threshold was set at |r| > 0.7 according to Dormann et al. [44].Darker shades of blue and red color indicate high variable collinearity while lighter shades indicate low collinearity.Bioclimatic variables, especially bio12 (mean annual rainfall) and dm (number of dry months) exhibited high collinearity compared to biotic and topographical variables.

Figure 2 .
Figure 2. Collinearity matrix for ecological niche models predictor variables.The collinearity threshold was set at |r| > 0.7 according to Dormann et al. [44].Darker shades of blue and red color indicate high variable collinearity while lighter shades indicate low collinearity.Bioclimatic variables, especially bio12 (mean annual rainfall) and dm (number of dry months) exhibited high collinearity compared to biotic and topographical variables.

Figure 3 .
Figure 3. Graphical summary of the methodology illustrating the interaction between different variables used to build the three bee pest models and selection of the suitable model.

Figure 3 .
Figure 3. Graphical summary of the methodology illustrating the interaction between different variables used to build the three bee pest models and selection of the suitable model.

Figure 4 .
Figure 4. Jackknife variable contribution for (a) A. tumida, (b) G. mellonella, (c) O. haroldi and (d) V. destructor for remote sensing and bioclimatic model (model 3).The dark blue shades show the regularized training gain for the specific variable, light blue without the variable while red show the regularized training gain with all the variables.

Figure 4 .
Figure 4. Jackknife variable contribution for (a) A. tumida, (b) G. mellonella, (c) O. haroldi and (d) V. destructor for remote sensing and bioclimatic model (model 3).The dark blue shades show the regularized training gain for the specific variable, light blue without the variable while red show the regularized training gain with all the variables.

Figure 5 .
Figure 5. Current habitat suitability for (a) A. tumida, (b) G. mellonella, (c) O. haroldi and (d) V. destructor in Kenya.Blue indicates low habitat suitability while red color represents high habitat suitability.Mount Kenya and the coastal regions were predicted to be high risk for all the pests while the risk in Kakamega varied from one pest to the other.

Figure 5 .
Figure 5. Current habitat suitability for (a) A. tumida, (b) G. mellonella, (c) O. haroldi and (d) V. destructor in Kenya.Blue indicates low habitat suitability while red color represents high habitat suitability.Mount Kenya and the coastal regions were predicted to be high risk for all the pests while the risk in Kakamega varied from one pest to the other.ISPRS Int.J. Geo-Inf.2017, 6, 66 13 of 19

Figure 6 .
Figure 6.Projected (2055) habitat suitability for (a) A. tumida, (b) G. mellonela, (c) O. haroldi and (d) V. destructor in Kenya.Blue indicates low habitat suitability while red color represents high habitat suitability.The high-risk areas were predicted to increase spatially across the country for all honeybee pests.

Figure 6 .
Figure 6.Projected (2055) habitat suitability for (a) A. tumida, (b) G. mellonela, (c) O. haroldi and (d) V. destructor in Kenya.Blue indicates low habitat suitability while red color represents high habitat suitability.The high-risk areas were predicted to increase spatially across the country for all honeybee pests.

Figure 6 .
Figure 6.Projected (2055) habitat suitability for (a) A. tumida, (b) G. mellonela, (c) O. haroldi and (d) V. destructor in Kenya.Blue indicates low habitat suitability while red color represents high habitat suitability.The high-risk areas were predicted to increase spatially across the country for all honeybee pests.

Figure 7 .
Figure 7. Improved predictivity and reduced overfitting of ecological niche models by using remotely sensed variables.The map on the left (a) is a subset from the V. destructor bioclimatic model (model 1) that show overestimation of this honeybee pest on the mid and left part of the map compared to the remote sensing and bioclimatic model (model 3) (b) that show high prediction reduced to the lower right corner of the map.

Figure 7 .
Figure 7. Improved predictivity and reduced overfitting of ecological niche models by using remotely sensed variables.The map on the left (a) is a subset from the V. destructor bioclimatic model (model 1) that show overestimation of this honeybee pest on the mid and left part of the map compared to the remote sensing and bioclimatic model (model 3) (b) that show high prediction reduced to the lower right corner of the map.

Figure 8 .
Figure 8. Predictivity frequency of the (a) biological models (model 1) and (b) remote sensing and biological models (model 3) for V. destructor.The histogram of remote sensing and bioclimatic model (model 3) had a lower mean and standard deviation (SD) than bioclimatic model (model 1).The predictivity for the remote sensing and bioclimatic model (model 3) was concentrated around the mean while the bioclimatic models (model 1) prediction was skewed to the right.

Figure 8 .
Figure 8. Predictivity frequency of the (a) biological models (model 1) and (b) remote sensing and biological models (model 3) for V. destructor.The histogram of remote sensing and bioclimatic model (model 3) had a lower mean and standard deviation (SD) than bioclimatic model (model 1).The predictivity for the remote sensing and bioclimatic model (model 3) was concentrated around the mean while the bioclimatic models (model 1) prediction was skewed to the right.

Table 1 .
Variables used in the ecological niche model were clustered to remotely sensed biotic variables derived from space borne normalized difference vegetation index (NDVI) data.Topographical variables were derived from digital elevation model from Shuttle Radar Topography Mission (SRTM), and bioclimatic variables from Africlim.

Table 2 .
Mean pest abundance recorded during the surveillance period in the wet and dry season in Kenya.Means counts in each column with similar letter are not significantly different (p ≤ 0.05) from each other according to Mann-Whitney test.

Table 4 .
Contribution, as percentage, of various bioclimatic and remotely sensed variables to the four ecological niche models using jackknife.The total cluster contribution is indicated at the bottom of clusters for each pest species.