Geographic Distribution of Desert Locusts in Africa, Asia and Europe Using Multiple Sources of Remote-Sensing Data

In history, every occurrence of a desert locust plague has brought a devastating blow to local agriculture. Analyses of the potential geographic distribution and migration paths of desert locusts can be used to better monitor and provide early warnings about desert locust outbreaks. By using environmental data from multiple remote-sensing data sources, we simulate the potential habitats of desert locusts in Africa, Asia and Europe in this study using a logistic regression model that was developed based on desert locust monitoring records. The logistic regression model showed high accuracy, with an average training area under the curve (AUC) value of 0.84 and a kappa coefficient of 0.75. Our analysis indicated that the temperature and leaf area index (LAI) play important roles in shaping the spatial distribution of desert locusts. A model analysis based on data for six environmental variables over the past 15 years predicted that the potential habitats of desert locusts present a periodic movement pattern between 40° N and 30° S latitude. The area of the potential desert locust habitat reached a maximum in July, with a suitable area exceeding 2.77 × 10⁷ km² and located entirely between 0° N and 40° N in Asia-Europe and Africa. In December, the potential distribution of desert locusts reached its minimum area at 0.68 × 10⁷ km² and was located between 30° N and 30° S in Asia and Africa. According to the model estimates, desert locust-prone areas are distributed in northern Ethiopia, South Sudan, northwestern Kenya, the southern Arabian Peninsula, the border area between India and Pakistan, and the southern Indian Peninsula. In addition, desert locusts were predicted to migrate from east to west between these areas and in Africa between 10° N and 17° N. Countries in these areas should closely monitor desert locust populations and respond rapidly.


Introduction
Desert locusts have posed a major threat to agricultural activities since ancient times [1]. Approximately 300 species out of the more than 10,000 known desert locusts in more than 100 countries can cause serious damage to agriculture, forestry and animal husbandry activities. Some countries in Africa and Asia are more vulnerable than others [1]. Desert locusts are distributed in approximately 60 countries. In this study, research on the distribution area of desert locusts and its change mechanisms was conducted. This study has three objectives: (1) to assess the main factors affecting the distribution of desert locusts; (2) to analyse the potential distribution areas of desert locusts in different months; and (3) to identify hot spots and migration paths suitable for the survival and reproduction of desert locusts in order to increase awareness, prevention and countermeasure preparation in these areas.

Study Area
The study area for this research combines the spatial distributions of previous studies [1,2] with the current monitoring results. The study area ranges from Senegal, the westernmost point of Africa, to Gansu Province, China, in the east and from the northernmost point of Kazakhstan to the Cape of Good Hope in Africa in the south. It covers all of Africa and most of southwestern to central Eurasia, and has a total area of more than 5.15 × 10⁷ km². The study area contains different climatic zones, including tropical, arid, humid, and cold zones as defined in the updated Köppen-Geiger climatic classification [20]. Arid and semiarid climate regions that are usually distributed at altitudes of 0-2000 m account for more than half of the study area, including the Sahara, Kalahari, Arubali, Karakum, Taklimakan, and Tal deserts and many other tropical and temperate deserts (Figure 1). These regions are favourable breeding grounds for desert locusts.
Remote Sens. 2020, 12, x FOR PEER REVIEW

Data Collection and Pre-Processing
Based on the biological characteristics of desert locusts, this study selected six environmental variables, namely the Normalized Difference Vegetation Index (NDVI), Leaf Area Index (LAI), Soil Moisture (SM), Rainfall (RF), Land Surface Temperature (LST) and Elevation, to establish a model of the potential geographic distribution of the desert locust (Table 1). The NDVI and LAI data were extracted from the Terra-MODIS version 6 leaf area index product MOD15A2H (500 m resolution) and the vegetation index product MOD11A2 (1 km resolution) (https://modis.gsfc.nasa.gov/; [21,22]). The SM data were obtained from the global land surface data assimilation system (GLDAS 2.1) (3 h and 0.25° × 0.25° resolutions) (http://disc.gsfc.nasa.gov/; [23]). The RF data were extracted from the next-generation global satellite precipitation product from the Global Precipitation Measurement mission (0.1° × 0.1° resolution) (http://disc.gsfc.nasa.gov/; [24]). The LST and surface elevation data were obtained from the 30-arc-second WorldClim global digital elevation model (http://www.worldclim.org/; [25]). The desert locust outbreak monitoring data were obtained from the Big Earth Data Science Engineering Project (http://data.casearth.cn/). The data were further processed before being used in the analysis following these steps: (1) the average values of the NDVI, LAI, LST and SM were calculated on a monthly basis; (2) the monthly sum of the RF was calculated; (3) all geographic data were referenced to a geographic coordinate system based on the World Geodetic System 1984 (WGS84) datum; and (4) the data for all environmental variables were resampled to 0.25° × 0.25° resolution to match the resolution of the SM data.
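The pre-processing steps can be illustrated with a minimal sketch in plain Python. The function names, the toy values and the block-averaging scheme are assumptions made for illustration; the study does not state which resampling method or software was used, and block averaging only applies when the fine grid divides evenly into the coarse one.

```python
from statistics import fmean

def monthly_mean(values):
    """Average the valid (non-None) observations that fall within one month."""
    valid = [v for v in values if v is not None]
    return fmean(valid) if valid else None

def monthly_total(values):
    """Sum the valid observations within one month (used here for rainfall)."""
    return sum(v for v in values if v is not None)

def block_average(grid, factor):
    """Coarsen a 2-D grid by averaging non-overlapping factor x factor blocks.

    Hypothetical stand-in for resampling to 0.25 degrees; the grid dimensions
    are assumed to be exact multiples of `factor`.
    """
    rows, cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [grid[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            coarse_row.append(fmean(block))
        coarse.append(coarse_row)
    return coarse

# Toy inputs (not study data): four composites with one gap, and rainfall in mm.
lst_composites = [30.0, 32.0, None, 34.0]
rain_values = [0.0, 5.0, 2.5, 0.0]
fine_grid = [[1.0, 3.0, 5.0, 7.0],
             [1.0, 3.0, 5.0, 7.0]]

lst_month = monthly_mean(lst_composites)
rain_month = monthly_total(rain_values)
coarse_grid = block_average(fine_grid, 2)
```

In practice a GIS or raster library would handle reprojection and non-integer resampling ratios, such as 0.1° to 0.25° for the rainfall data.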

Model Establishment
Two thousand points were randomly extracted from the locust outbreak occurrence area and from the non-outbreak area in February 2020 as locust outbreak points and non-outbreak points, respectively. As shown in the workflow for the analysis of the potential distribution of desert locusts (Figure 2), these points were then overlaid on the standardized evaluation index layer. ArcGIS's spatial analysis function was used to map the 2000 random points and determine the values of the six evaluation index layers at the random points. Then, 80% of the point data (1600 groups) were used to train the model, and 20% of the point data (400 groups) were used to test the accuracy of the model. To reduce errors caused by the improper selection of predictors, a complete dataset containing all environmental variables was used for the initial operation [26]. According to the principles of binary logistic regression, the random points in the locust outbreak occurrence area were assigned a value of 1, and the random points in the non-outbreak occurrence area were assigned a value of 0. The environmental data were used to analyse the potential habitats of desert locusts based on the monthly averages from the past 15 years. Then, SPSS software was used to perform a binary logistic regression analysis on the 76,245 data cells.
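The labelling and the 80/20 split described above can be sketched as follows. This is an illustration only: the even division into 1000 outbreak and 1000 non-outbreak points, the point identifiers, the function name and the random seed are all assumptions, since the text gives only the total of 2000 points.

```python
import random

def label_and_split(outbreak_points, non_outbreak_points,
                    train_fraction=0.8, seed=0):
    """Label outbreak points 1 and non-outbreak points 0, then split them."""
    samples = ([(p, 1) for p in outbreak_points]
               + [(p, 0) for p in non_outbreak_points])
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    rng.shuffle(samples)
    n_train = round(len(samples) * train_fraction)
    return samples[:n_train], samples[n_train:]

# Hypothetical identifiers; real points would carry the six index values.
outbreak = [f"outbreak_{i}" for i in range(1000)]
non_outbreak = [f"background_{i}" for i in range(1000)]
train, test = label_and_split(outbreak, non_outbreak)
```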
In this study, multicollinearity was evaluated using the variance inflation factor (VIF) [27]. The six environmental variables had VIF values lower than 4, indicating that there was no substantial multicollinearity. Two different evaluation methods were used to assess the performance of the model: the receiver operating characteristic (ROC) curve method [28] and Cohen's Kappa [29]. We used the area under the ROC curve (AUC) to estimate the accuracy of the model prediction [30,31]. A relative weight estimation method based on the maximum correlation orthogonal change was used to analyse the relative importance of the six environmental variables [32].
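As a hedged illustration of the multicollinearity check, the VIF of each predictor can be read from the diagonal of the inverse of the predictors' correlation matrix. The sketch below uses only the Python standard library; the helper names and the three toy predictors are assumptions and do not reproduce the software or data used in the study.

```python
import math
from statistics import fmean

def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long columns."""
    means = [fmean(col) for col in columns]
    centered = [[x - m for x in col] for col, m in zip(columns, means)]
    norms = [math.sqrt(sum(x * x for x in col)) for col in centered]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j]))
             / (norms[i] * norms[j]) for j in range(k)] for i in range(k)]

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def variance_inflation_factors(columns):
    """VIF of predictor i is entry (i, i) of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[i][i] for i in range(len(columns))]

# Toy predictors (not study data): x2 tracks x1 closely, x3 is nearly unrelated.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
x3 = [2.0, -1.0, 0.5, 0.5, -1.0, 2.0]
vifs = variance_inflation_factors([x1, x2, x3])
```

With these toy values, the two nearly collinear predictors exceed the cutoff of 4 used in the text, while the third stays close to 1.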
If P is the probability of an event occurring, with a value range of [0, 1], the probability of the event not occurring is 1 − P. When the value of P is close to 0 or 1, changes in P are difficult to capture, so the value of P needs to be transformed. Generally, the natural logarithm of P/(1 − P), ln(P/(1 − P)), is used in this situation; this logit conversion of P is written as logit P. The value range of logit P is (−∞, +∞). Using logit P as the dependent variable, a linear regression equation can be established:

$$\operatorname{logit} P = \ln\left(\frac{P}{1 - P}\right) = \alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n \qquad (1)$$

P can be solved as follows:

$$P = \frac{\exp(\alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n)}{1 + \exp(\alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n)} \qquad (2)$$

In Equation (2), $X_1, X_2, \cdots, X_n$ are evaluation indexes that affect the probability of the results for the dependent variable; $\alpha$ is a constant that represents the logarithm of the ratio of the probability of occurrence of the event to the probability of non-occurrence when no evaluation index is involved; $\beta_1, \beta_2, \cdots, \beta_n$ are the logistic regression coefficients of the evaluation indexes, each of which represents the change in the logarithm of the ratio of the occurrence probability to the non-occurrence probability of the event when a single evaluation index changes; and P indicates the probability of the event occurring under the combined action of the various evaluation indicators.
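The relationship between the logit and the probability in Equation (2) can be checked numerically with a short sketch. The coefficient values and index values below are hypothetical and chosen only for illustration; they are not the fitted coefficients of the study.

```python
import math

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def predict_probability(alpha, betas, xs):
    """Equation (2): map the linear predictor back to a probability."""
    z = alpha + sum(b * x for b, x in zip(betas, xs))
    # 1 / (1 + exp(-z)) is algebraically equal to exp(z) / (1 + exp(z)).
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical constant and coefficients for two standardized indexes.
alpha, betas = -0.5, [1.2, 0.8]
p = predict_probability(alpha, betas, [0.3, -0.1])
```

Applying `logit` to `p` recovers the linear predictor, here -0.5 + 1.2 × 0.3 + 0.8 × (-0.1) = -0.22, which is the round trip that the logit conversion relies on.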

Assessment of Potential Desert Locust Habitat
The logistic regression model was used to predict the probability of desert locust survival (0-100%) in each grid at a spatial resolution of 0.25° in the study area. To produce a map of the desert locust distribution, the continuous probability values were converted to binary predictions based on a threshold value. This probability threshold was determined by matching the model predictions to the extracted distribution of desert locusts according to the maximum training sensitivity plus specificity criterion [33]. This criterion uses training data to optimize the trade-off between specificity and sensitivity; it has been recognized as one of the most effective threshold selection methods [34,35]. Grids with predicted probabilities higher than the threshold value were assigned a value of 1, representing high to moderate habitat suitability, and were labelled suitable habitats. Grids with predicted probabilities lower than the threshold value were assigned a value of 0, representing low habitat suitability or an unsuitable habitat, and were labelled unsuitable habitats.
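A minimal sketch of the maximum sensitivity plus specificity criterion is given below. It assumes that candidate thresholds are the distinct predicted probabilities and that a grid whose probability equals the threshold counts as suitable; neither detail is specified in the text, and the toy predictions are invented.

```python
def max_sss_threshold(probabilities, labels):
    """Return the cutoff that maximizes sensitivity + specificity.

    A point is classified as suitable when its probability is >= the cutoff
    (assumption: ties at the cutoff count as suitable).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cutoff, best_score = None, -1.0
    for cutoff in sorted(set(probabilities)):
        tp = sum(1 for p, y in zip(probabilities, labels)
                 if p >= cutoff and y == 1)
        tn = sum(1 for p, y in zip(probabilities, labels)
                 if p < cutoff and y == 0)
        score = tp / positives + tn / negatives
        if score > best_score:          # keep the lowest cutoff on ties
            best_cutoff, best_score = cutoff, score
    return best_cutoff

def to_binary(probabilities, cutoff):
    """Convert continuous probabilities into a 1/0 suitability map."""
    return [1 if p >= cutoff else 0 for p in probabilities]

# Toy training predictions and observed labels (not study data).
probs = [0.10, 0.35, 0.40, 0.55, 0.70, 0.90]
labels = [0, 0, 1, 0, 1, 1]
cutoff = max_sss_threshold(probs, labels)
suitable = to_binary(probs, cutoff)
```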
The above model was used to analyse the monthly average data over 12 months, and the predicted probability map for each month was converted into a binary distribution map as described above. Finally, the changes in the potential distribution area of desert locusts over the 12 months were analysed by comparing the habitat suitability/unsuitability maps for all the months. The changes in the potential locust distribution area were classified as (1) increased habitat, (2) decreased habitat, and (3) unchanged habitat [36]. This paper evaluates the changes in the spatial pattern of the desert locust potential distribution area from two perspectives: habitat area change and habitat range change. By calculating the location of the potential distribution area of desert locusts in the different months, the potential habitat change trends and directions of desert locust dispersal can be obtained.
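The three change classes can be derived by comparing the binary maps of two months cell by cell, as in the following sketch. The function name and the toy grids are assumptions made for illustration.

```python
def classify_change(previous, current):
    """Compare two binary suitability grids of identical shape, cell by cell."""
    labels = {(0, 1): "increased", (1, 0): "decreased"}
    return [[labels.get((a, b), "unchanged")
             for a, b in zip(row_prev, row_cur)]
            for row_prev, row_cur in zip(previous, current)]

# Toy 2 x 3 grids for two adjacent months (1 = suitable, 0 = unsuitable).
january = [[0, 1, 1],
           [0, 0, 1]]
february = [[1, 1, 0],
            [0, 0, 1]]
change = classify_change(january, february)
increased_cells = sum(row.count("increased") for row in change)
decreased_cells = sum(row.count("decreased") for row in change)
```

The difference between the increased and decreased cell counts, once converted to area, gives the sign of the monthly net change in habitat.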

Model Validation and Variable Contribution
The ROC method and the kappa coefficient were used to test the accuracy of the model. The AUC value for the model was 0.84, and kappa was 0.75; both statistics indicated that the model performed well. The logistic model did not achieve very high accuracy, which is likely due to the broad distribution of the species across several climatic regions. Modelling accuracy also depends on factors such as the spatial resolution, the size of the study area, the methods and the quality of the input datasets [37]; overall, our study shows an acceptable model performance based on these two statistics. The relative weight estimation method was used to recalculate the logistic regression coefficients of the various environmental variables to determine their contributions to the model results. The results showed that LST (27.02%) and LAI (25.63%) were the main contributors to the potential desert locust distribution. Their cumulative contribution was 52.65%.
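For readers who wish to reproduce the two accuracy statistics, the sketch below computes the AUC as the share of outbreak and non-outbreak pairs that the model ranks correctly, and Cohen's kappa from observed and chance agreement. The toy scores and the 0.5 cutoff are assumptions; the values reported above (0.84 and 0.75) come from the study's own data and are not reproduced here.

```python
def auc(scores, labels):
    """Probability that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def cohen_kappa(predicted, observed):
    """Agreement between two binary classifications, corrected for chance."""
    n = len(observed)
    agree = sum(1 for p, o in zip(predicted, observed) if p == o) / n
    pred_yes = sum(predicted) / n
    obs_yes = sum(observed) / n
    expected = pred_yes * obs_yes + (1 - pred_yes) * (1 - obs_yes)
    return (agree - expected) / (1 - expected)

# Toy test-set scores and labels (not study data).
scores = [0.10, 0.35, 0.40, 0.55, 0.70, 0.90]
observed = [0, 0, 1, 0, 1, 1]
predicted = [1 if s >= 0.5 else 0 for s in scores]   # assumed 0.5 cutoff

auc_value = auc(scores, observed)
kappa_value = cohen_kappa(predicted, observed)
```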

Potential Distribution of Desert Locusts
The trained desert locust distribution model and the monthly average data for six environmental variables were used to simulate the potential distribution range of desert locusts in the study area over 12 months (Figure 3). The results show that the potential distribution area of desert locusts gradually increased from January to July. The distribution area reached its maximum in July, at approximately 2.77 × 10⁷ km², and was concentrated in the northern countries of Africa, the Arabian Peninsula, the southern tip of Turkey, the southern tip of Uzbekistan and Kazakhstan that reaches Central Asia, Iran, Pakistan, and the Indian Peninsula and extended to the southern part of Tibet, China. In addition, there were scattered distribution areas in northern Xinjiang and southern Spain. Starting in July, the distribution area gradually decreased, and it reached a minimum of 0.68 × 10⁷ km² in December. This area was mainly distributed in the eastern part of Africa, from 10° N to 30° S latitude. North of the equator, the potential distribution area was only in South Sudan, northern Ethiopia, southern Somalia, the western Red Sea coast of the Arabian Peninsula and southern Yemen, Oman, the southernmost tip of Iran, the central and southern parts of the Indian Peninsula, and the border area between India and Pakistan.
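Because cells of a 0.25° latitude-longitude grid shrink towards the poles, a total area in km² has to weight each suitable cell by its true surface area. The sketch below shows one way to do this under a spherical-Earth assumption; the paper does not describe how its areas were calculated, so the formula, the radius constant and the toy map are illustrative assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation

def cell_area_km2(lat_south, cell_deg=0.25):
    """Area of a cell_deg x cell_deg cell whose southern edge is at lat_south."""
    lat1 = math.radians(lat_south)
    lat2 = math.radians(lat_south + cell_deg)
    dlon = math.radians(cell_deg)
    return EARTH_RADIUS_KM ** 2 * dlon * (math.sin(lat2) - math.sin(lat1))

def suitable_area_km2(binary_rows, south_edges, cell_deg=0.25):
    """Sum the areas of cells flagged 1; each row shares one southern edge."""
    return sum(sum(row) * cell_area_km2(lat, cell_deg)
               for row, lat in zip(binary_rows, south_edges))

# Toy map: one row of cells at the equator and one at 40 degrees N.
rows = [[1, 1, 0],
        [0, 1, 0]]
south_edges = [0.0, 40.0]
total_area = suitable_area_km2(rows, south_edges)
```

Under this approximation, a 0.25° cell covers roughly 770 km² at the equator and noticeably less at 40° N.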

Changes in the Potential Distribution Area
Based on the results of the potential desert locust distribution model with the highest training sensitivity and specificity, binary potential distribution maps for the desert locust (Figure 3) were developed, and the areal changes in the potential distribution range in different months were calculated (Table 2). The results showed that the potential distribution area of desert locusts began to gradually increase in January and reached a maximum in July. Starting in August, the difference between the increased and decreased areas became negative, indicating that the potential distribution area of desert locusts was gradually decreasing; the distribution area reached its lowest value in December. In addition, the spatial differences between the potential desert locust distribution maps in adjacent months were determined to obtain the geographic and spatial distribution of the increased and decreased habitat areas in each month (Figure 4). Beginning in February, the potential distribution area of desert locusts gradually moved from south to north, and this phenomenon continued until July. Additionally, all of the increased habitat area was located in the southern part of the potential distribution area of desert locusts starting in August, at which point the whole area began to gradually move southward. The distribution area reached its southernmost point in January and began to move northward in February. The potential distribution area of desert locusts showed obvious periodic north-south movement within the year.
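One simple way to quantify the north-south movement is to track the area-weighted mean latitude of the suitable cells from month to month. The sketch below is an assumption about how such a summary could be computed, using the cosine of latitude as a proxy for cell area; it is not the procedure used to produce Figure 4, and the toy maps are invented.

```python
import math

def mean_latitude(binary_rows, center_lats):
    """Area-weighted mean latitude of suitable cells (weight ~ cos(latitude))."""
    weighted = total = 0.0
    for row, lat in zip(binary_rows, center_lats):
        weight = sum(row) * math.cos(math.radians(lat))
        weighted += weight * lat
        total += weight
    return weighted / total if total else None

# Toy maps on three latitude bands (row centres at 20 N, 0 and 20 S).
center_lats = [20.0, 0.0, -20.0]
january_map = [[0, 0], [1, 1], [1, 1]]
july_map = [[1, 1], [1, 1], [0, 0]]

january_lat = mean_latitude(january_map, center_lats)
july_lat = mean_latitude(july_map, center_lats)
```

A positive difference between consecutive months indicates a northward shift of the potential habitat, and a negative difference indicates a southward shift.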
We extracted the geographic location of the areas where the desert locust population increased in each month and the corresponding temperature and SM data for these areas (Figure 5). In the areas where the locust outbreak area increased, the temperature was fairly consistent within each month, but the SM varied substantially. We believe that SM and temperature impacted the locust distribution to different degrees because of this difference in their variability.

The Important Influence of Temperature Change on Desert Locust Distribution
For this model, we first ranked the relative importance of each of the significant variables and then examined the partial effect of the two most important variables on the spatial distribution of desert locusts. Studies have shown that temperature is the major factor controlling the migration and distribution of desert locusts (Table 3). Among the six environmental variables, LST contributed the most to the desert locust distribution (27.02%), followed by the monthly average LAI (25.63%), and SM had the lowest impact (2.7%) (Table 3). This result may be explained by the fact that desert locusts mostly occur in warmer areas with lusher vegetation. Similar to the results of Gómez et al. [3], our results show that the average LST and NDVI are the main factors affecting locust development. Waldner [13] showed that the dynamic greenness map in summer had a strong correlation with the desert locust breeding area (F score = 0.64-0.87). This result can be explained in two ways: (1) during the migration stage, temperature is the main factor that affects whether or not locusts migrate; at the same time, they need to eat to survive, so they usually choose to land in areas with high vegetation coverage; and (2) according to the results of previous studies [17], SM is the main factor affecting locust incubation, but our analysis showed that SM had little influence on locust migration and movement.

Prediction and Analysis of the Migration Path of Desert Locust
By analysing the potential distribution range of desert locusts in each month, we obtained the overall potential distribution range of the desert locust in Asia, Europe and Africa (Figure A1). The results show that the potential distribution area of the desert locust covers tens of millions of square kilometres of the Asian and African continents, including the entire Indian Peninsula, the Arabian Peninsula, and all countries in northern Africa, and extends along the eastern side of Africa to South Africa and into Spain in Europe. It is also distributed in southern Turkey, with the northernmost point reaching southern Kazakhstan. However, there is no suitable area in the western part of Tibet, China; only the southern part of Tibet is suitable for desert locusts.
In addition, based on the potential distribution areas of desert locusts in each month (Figure 3), we extracted the areas suitable for desert locust survival throughout the year and for two-thirds of the year and classified them as high-risk and moderate-risk areas, respectively (Figure A2). The high-risk areas were located in South Sudan, the northeastern part of the Congo, central Uganda, the northwestern part of Kenya, the northern and central regions of Ethiopia, the western side of the Arabian Peninsula, southern Pakistan and the eastern coastal areas of the Indian Peninsula. The moderate-risk areas included countries such as Mauritania, Mali, Niger, Nigeria, Chad and Sudan between 10° N and 17° N latitude in Africa and were also distributed near the Persian Gulf in eastern Saudi Arabia, Oman and Yemen in Asia. Another moderate-risk area was located in northern India and central Pakistan. Since the range of the moderate-risk areas changes mainly from March to October, these areas are likely the main activity areas for desert locusts in summer; the high-risk areas are the areas where desert locusts are common in winter; and the locusts will pass through the moderate-risk areas in winter and summer during their migration and proliferation.
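The risk classification can be expressed as a count of suitable months per grid cell, as sketched below. The interpretation of "two-thirds of the year" as at least 8 but fewer than 12 suitable months, the class names and the toy maps are assumptions made for illustration.

```python
def risk_class(months_suitable):
    """Classify a cell by the number of months (0-12) in which it is suitable."""
    if months_suitable == 12:
        return "high"
    if months_suitable >= 8:       # assumed reading of "two-thirds of the year"
        return "moderate"
    return "other"

def risk_map(monthly_maps):
    """monthly_maps: twelve binary grids of identical shape."""
    rows, cols = len(monthly_maps[0]), len(monthly_maps[0][0])
    return [[risk_class(sum(m[r][c] for m in monthly_maps))
             for c in range(cols)] for r in range(rows)]

# Toy 1 x 3 grids: the cells are suitable for 12, 8 and 3 months, respectively.
monthly_maps = [[[1, 1 if month < 8 else 0, 1 if month < 3 else 0]]
                for month in range(12)]
risk = risk_map(monthly_maps)
```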

Uncertainties and Limitations
Several studies predict that global warming will alter areas of suitable bioclimatic conditions and shift climate suitability toward high-latitude areas for a range of invertebrate species [38,39]. In recent decades, the temperature in Central Asia has increased significantly (0.4 °C/decade); at the same time, the region has experienced several early springs [40]. Taking into account the uncertainty of future environmental data, confidence in the prediction results for Central Asia and Southern Africa is low, and the potential distribution area of the desert locust may change slightly. This model did not account for desert locust life history traits and population biology or their interaction in a changing climate. However, a comparison of the model predictions with the reported distribution of desert locusts suggests that, if the environmental factors do not change drastically in the coming months, the model can accurately predict the potential distribution areas of desert locusts.
Although two vegetation factors were used to analyse the land surface, we did not examine the influence of land use on the desert locust. The principal anthropic interactions with locusts are agricultural land use and locust control operations [18]. Some regions in central Africa are forested and therefore not suitable for desert locusts; however, forest areas may, to some extent, turn into grasslands or cultivated fields in the future due to climate change and land degradation. Many other factors also affect locust distribution, such as wind speed and direction, grass productivity and soil physical properties. Improvements in the prediction of the spatial distribution of desert locust outbreaks are theoretically possible by integrating these related variables. However, how to do so is still an open question because of the range of complex and non-linear interactions between these variables.

Conclusions
The primary goal of this research was to analyse the potential distribution range of desert locusts in different months. In addition, high-risk areas were identified in the study area, and the possible migration patterns of desert locusts were analysed. To achieve the primary goal, a logistic regression method combined with remote-sensing images from multiple sources was used to establish a prediction model for the potential distribution range of desert locusts. The accuracy of the model, as evaluated by the area under the receiver operating characteristic curve (AUC = 0.84) and the kappa coefficient (kappa = 0.75), was good. The results of the study show that the areas at risk from desert locusts are located in countries such as Mauritania, Mali, Niger, Nigeria, Chad, and Sudan between 10° N and 17° N latitude in Africa, in the northeastern part of the Congo, central Uganda, northwestern Kenya, and northern Ethiopia; in Central Asia, the risk areas include the eastern and western sides of the Arabian Peninsula, Yemen, Oman, central Pakistan, and the northern and eastern coastal areas of India. The temperature (LST) and leaf area index (LAI) have important impacts on changes in the potential distribution area of desert locusts. Due to the north-south movement of the sun's position, the potential distribution range of the desert locust shows a periodic movement pattern. However, because the potential distribution area reaches high-latitude areas for only a short time, the desert locusts do not show obvious north-south migration; instead, they migrate east-west in Africa between 10° N and 17° N. These results provide the possible distribution range and development path of the desert locust and, from an operational point of view, may be useful for desert locust surveillance and control operations.