Transmission Risk Prediction and Evaluation of Mountain-Type Zoonotic Visceral Leishmaniasis in China Based on Climatic and Environmental Variables

: With global warming and socioeconomic developments, there is a tendency toward the emergence and spread of mountain-type zoonotic visceral leishmaniasis (MT-ZVL) in China. Timely identiﬁcation of the transmission risk and spread of MT-ZVL is, therefore, of great signiﬁcance for effectively interrupting the spread of MT-ZVL and eliminating the disease. In this study, 26 environmental variables—namely, climatic, geographical, and 2 socioeconomic indicators were collected from regions where MT-ZVL patients were detected during the period from 2019 to 2021, to create 10 ecological niche models. The performance of these ecological niche models was evaluated using the area under the receiver-operating characteristic curve (AUC) and true skill statistic (TSS), and ensemble models were created to predict the transmission risk of MT-ZVL in China. All ten ecological niche models were effective at predicting the transmission risk of MT-ZVL in China, and there were signiﬁcant differences in the mean AUC (H = 33.311, p < 0.05) and TSS values among these ten models (H = 26.344, p < 0.05). The random forest, maximum entropy, generalized boosted, and multivariate adaptive regression splines showed high performance at predicting the transmission risk of MT-ZVL (AUC > 0.95, TSS > 0.85). Ensemble models predicted a transmission risk of MT-ZVL in the provinces of Shanxi, Shaanxi, Henan, Gansu, Sichuan, and Hebei, which was centered in Shanxi Province and presented high spatial clustering characteristics. Multiple ensemble ecological niche models created based on climatic and environmental variables are effective at predicting the transmission risk of MT-ZVL in China. This risk is centered in Shanxi Province and tends towards gradual radiation dispersion to surrounding regions. Our results provide insights into MT-ZVL surveillance in regions at high risk of MT-ZVL.


Introduction
Visceral leishmaniasis (VL), also known as kala-azar, is a zoonotic infectious disease caused by protozoan parasites of the genus Leishmania spp. and transmitted by the bite of and continental climates. The precipitation and relative humidity gradually increase from the north to the south of the study area, and the temperature gradually increases. The sand fly Ph. chinensis, a primary vector for transmission of MT-ZVL in China, is predominantly identified in plains, hilly, and loess plateau regions at 30 • N to 43 • N and 102 • E to 121 • E, with elevations of 10 to 2750 m [29]. Local cases of MT-ZVL reported during the past five years were all concentrated in this region. Therefore, it is possible to examine the correlation between the distribution of MT-ZVL patients and potential P. sinensis distribution in this study area.

Collection of Case Data
All epidemiological data pertaining to MT-ZVL cases were collected from the National Notifiable Diseases Reporting System of the Chinese Center for Disease Control and Prevention. The townships or streets where local MT-ZVL cases were reported during the 2019-2021 period were defined as sampling sites, and the longitude and latitude coordinates were measured for each site. Then, each sampling site was checked to ensure the accuracy of the geographical coordinates. A total of 187 valid sampling sites were finally selected. In addition, the databases of reported cases from 2015 to 2018 were collected and combined to describe the trends of the disease.

Collection of Climatic and Environmental Variables
The environmental variable data pertaining to the distributions of VL and its vector sand flies were collected and finalized based on a literature review [30,31]. Ultimately, we included 19 climatic variables, 7 geographical variables, and 2 socioeconomic variables ( Table 1). The climatic variables were extracted from the WorldClim website. Available online: https://www.worldclim.org/data/worldclim21.html (accessed on 25 April 2022). Geographical variables included elevation, type of landform, type of land use, annual normalized difference vegetation index, and type of vegetation coverage. The socioeconomic variables included gross domestic product and population density, provided by the Resource and Environment Data Center, Chinese Academy of Sciences (Beijing, China). Available online: https://www.resdc.cn (accessed on 25 April 2022). All environmental variable data were clipped to the study area using ArcGIS 10.2 (ESRI, Redlands, CA, USA) and a bilinear resampling algorithm with 1 × 1 km spatial resolution.

Ecological Niche Modeling
Multiple algorithms are available to create ecological niche models, and modeling with the same data may generate different results and predictive maps [32,33]. There is no consensus on an optimal single algorithm for ecological niche models [34,35]. Therefore, scientists are encouraged to overcome the uncertainty using different approaches. In our study, ten ecological niche models were generated based on four machine learning algorithms: the surface range envelope (SRE) model, based on an environmental envelope (threshold-based) algorithm; the generalized linear model (GLM), generalized additive model (GAM), and multivariate adaptive regression splines (MARS), based on statistical regression algorithms; the generalized boosted model (GBM), classification tree analysis (CTA), and flexible discriminant analysis (FDA), based on classification algorithms; and the maximum entropy (MaxEnt) model, artificial neural network (ANN), and random forest (RF) model, based on machine learning algorithms. All models were created using the BIOMOD2 package in R. As presence-absence data are required to create models, 500 absence points were randomly sampled at a ratio of 1:2 based on 186 presence points to create two sets of presence-absence data. Each set of data was classified into 75% training datasets and 25% validation datasets, which were input into the ten ecological niche models. Each run was repeated ten times for a single model with the same variables.

Assessment of the Performance of Ecological Niche Models
Currently, there are threshold-free and -associated measures to assess the performance of ecological niche models [19]. In this study, the area under the receiver-operating characteristic curve (AUC), a threshold-free index, and the true skill statistic (TSS), a thresholdrelated index, were employed to assess the performance of the ecological niche models. The AUC value, ranging from 0 to 1, indicates the predictive accuracy of the ecological niche models, and an AUC value approaching 1 indicates higher accuracy. The TSS value is not affected by the occurrence of the distribution, but it can accurately identify the accuracy of a model. TSS values range between −1 and 1, and a TSS value close to 1 indicates high accuracy. A TSS value of >0.7 is generally accepted as a satisfactory prediction. Both indices can be used to evaluate the predictive accuracy of ecological niche models. The AUC value can be used to assess the predictive accuracy of a model regardless of the data distribution [36], whereas the TSS measures the consistency between the prediction results by models and sample data [37]. Therefore, the combination of these two indices more comprehensively assesses the performance of ecological niche models.

Prediction of MT-ZVL Transmission Risk in China
Due to the uncertainty of predictions using a single ecological niche model, ensemble models were created based on a single model to improve the accuracy and reliability of predicting the transmission risk of MT-ZVL. Briefly, a single ecological niche model with a poor training efficiency was removed based on an AUC value of >0.90 and a TSS value of >0.85 to screen for ensemble models. Then, the prediction accuracy of a single ecological niche model was normalized such that the raster data ranged from 0 to 1. Following the definition of the weight according to TSS values, the predictive accuracy of ensemble models was estimated using a weighted mean of probabilities (WM). Four grades were classified based on the WM values: no-risk areas (WM, 0.5 and lower), low-risk areas (WM, 0.51 to 0.7), medium-risk areas (0.71 to 0.9), and high-risk areas (0.91 to 1). Finally, the model prediction results were loaded into ArcGIS software to map the transmission risk of MT-ZVL in China.

Statistical Analysis
In what follows, the AUC and TSS values are described as the mean ± standard deviation (SD). Comparisons of the AUC and TSS values were carried out using the Kruskal-Wallis H-test. A p-value of <0.05 was considered statistically significant.

Comparison of the MT-ZVL Transmission Risk Predicted by Using Ecological Niche Models
Based on the distributions of MT-ZVL cases and Ph. chinensis vectors, ten ecological niche models were created to predict the transmission risk of MT-ZVL. Our data showed that all ten ecological niche models were effective at predicting areas at risk of MT-ZVL transmission; however, the coverage of the predicted areas varied among the models. The ANN and MARS predicted the largest but most dispersed areas at risk of MT-ZVL transmission, and CTA predicted the second-largest areas. The other seven models predicted relatively concentrated areas at risk of MT-ZVL transmission, and the GAM and MaxEnt models predicted the smallest areas. The CTA and SRE models predicted large areas at risk of MT-ZVL transmission, and most areas were predicted to be at high or medium risk. The high-, medium-, and low-risk areas predicted by the other seven models were centered in Shanxi Province and showed a tendency to disperse to the surrounding regions, with southern Liaoning Province to the north and the southern edge of the hilly regions in northern Sichuan Province to the south (Figure 3).

Comparison of the MT-ZVL Transmission Risk Predicted by Using Ecological Niche Models
Based on the distributions of MT-ZVL cases and Ph. chinensis vectors, ten ecological niche models were created to predict the transmission risk of MT-ZVL. Our data showed that all ten ecological niche models were effective at predicting areas at risk of MT-ZVL transmission; however, the coverage of the predicted areas varied among the models. The ANN and MARS predicted the largest but most dispersed areas at risk of MT-ZVL transmission, and CTA predicted the second-largest areas. The other seven models predicted relatively concentrated areas at risk of MT-ZVL transmission, and the GAM and MaxEnt models predicted the smallest areas. The CTA and SRE models predicted large areas at risk of MT-ZVL transmission, and most areas were predicted to be at high or medium risk. The high-, medium-, and low-risk areas predicted by the other seven models were centered in Shanxi Province and showed a tendency to disperse to the surrounding regions, with southern Liaoning Province to the north and the southern edge of the hilly regions in northern Sichuan Province to the south (Figure 3).

Performance of Ecological Niche Models
Ten ecological niche models were employed to predict the transmission risk areas of MT-ZVL in China, and the performance of the ecological niche models was evaluated. The RF model showed the highest mean AUC and TSS, and the SRE model exhibited the lowest mean AUC and TSS (Table 2). There were significant differences in the mean AUC (H = 33.311, p < 0.05) and TSS values among the ten ecological niche models (H = 26.344, p < 0.05). The results show that the RF, MaxEnt, GBM, and MARS models had the highest relative performance at predicting the risk of MT-ZVL transmission (mean AUC > 0.95, mean TSS > 0.85). The SRE model had the lowest performance (mean AUC < 0.8, mean TSS < 0.8).

Contributions of Environmental Variables to Ecological Niche Models
The contributions of the environmental variables to the ten ecological niche models were normalized (Table 3). The climatic variables that contributed most to the ecological niche models included the mean temperature of the coldest quarter, precipitation during the wettest quarter, the minimum temperature of the coldest month, and annual precipitation. The highest contributing geographical variables were elevation and normalized difference vegetation index, and the socioeconomic variables that contributed most were gross domestic product and population density. In addition, the importance of each environmental variable varied in ecological niche models that performed best at predicting the transmission risk of MT-ZVL. The five most important contributing variables in the RF model were normalized difference vegetation index (17%), isothermality (13.5%), mean temperature of the coldest quarter (11.3%), mean temperature of the wettest quarter (7.4%), and the minimum temperature of the coldest month (6.8%), with cumulative contributions of 56%. The five most important contributing variables in the MaxEnt model were the minimum temperature of the coldest month (24.1%), mean diurnal range (11.6%), annual precipitation (7.2%), precipitation during the driest month (6.5%), and precipitation during the wettest quarter (6.3%), with cumulative contributions of 55.7%. The five most important contributing variables in the GBM model were the mean temperature of the coldest quarter (63.6%), isothermality (7.7%), the minimum temperature of the coldest month (4.9%), the normalized difference vegetation index (4.7%), and precipitation during the wettest quarter (3.6%), with cumulative contributions of 84.5% (Figure 4).

Prediction of MT-ZVL Transmission Risk in China
Ensemble models predicted the risk of MT-ZVL transmission in a majority of the provinces of Shanxi and Shaanxi, southern Henan Province, the junction of Ningxia, southern Gansu, and northern Sichuan, the border areas of Hebei and Shanxi, and the

Prediction of MT-ZVL Transmission Risk in China
Ensemble models predicted the risk of MT-ZVL transmission in a majority of the provinces of Shanxi and Shaanxi, southern Henan Province, the junction of Ningxia, southern Gansu, and northern Sichuan, the border areas of Hebei and Shanxi, and the hilly regions in northwestern Beijing. The areas at high risk of MT-ZVL transmission were concentrated in central and southern Shanxi, eastern Shaanxi, and northern Henan. Mediumrisk areas surrounded the high-risk areas and were predominantly detected in Shanxi, Shaanxi, and Henan. Low-risk areas were mainly found on the border between southern Gansu and Sichuan, northeastern Henan, southern Hebei, and the hilly regions in northwestern Beijing. Overall, the areas at risk of MT-ZVL transmission were centered in Shanxi Province and distributed to the surrounding areas, which presented a high cluster ( Figure 5). hilly regions in northwestern Beijing. The areas at high risk of MT-ZVL transmission were concentrated in central and southern Shanxi, eastern Shaanxi, and northern Henan. Medium-risk areas surrounded the high-risk areas and were predominantly detected in Shanxi, Shaanxi, and Henan. Low-risk areas were mainly found on the border between southern Gansu and Sichuan, northeastern Henan, southern Hebei, and the hilly regions in northwestern Beijing. Overall, the areas at risk of MT-ZVL transmission were centered in Shanxi Province and distributed to the surrounding areas, which presented a high cluster ( Figure 5).

Discussion
Historically, MT-ZVL was mainly prevalent in the hilly regions of Gansu, Qinghai, Ningxia, northern Sichuan, northern Shaanxi, Henan, Hebei, Beijing, and Liaoning-north of the Yangtze River in China [41]. The predominant source of infection is canines. Humans acquire infections from infected dogs, and vectors transmitting MT-ZVL are wild

Discussion
Historically, MT-ZVL was mainly prevalent in the hilly regions of Gansu, Qinghai, Ningxia, northern Sichuan, northern Shaanxi, Henan, Hebei, Beijing, and Liaoning-north of the Yangtze River in China [41]. The predominant source of infection is canines. Humans acquire infections from infected dogs, and vectors transmitting MT-ZVL are wild species of sand flies. During the early stage of the founding of the People's Republic of China, there were a large number of MT-ZVL cases, and most were detected in children and young individuals from 5 to 30 years old (74.4%), as well as in men (63%) [42]. The disease severely damaged socioeconomic development in China. In the 1950s and 1960s, a package of VL control measures was implemented including Ph. chinensis control, canine control, and case management, and VL was almost eradicated in eastern, central, and northern China [43,44]. Therefore, MT-ZVL was controlled effectively, but only sporadic cases were found in Gansu and Sichuan, in southern China [45,46]. With recent socioeconomic developments, the living conditions and the environment have greatly improved, and the number of dogs has increased rapidly. Due to suitable climatic conditions caused by global warming, there is a rapid expansion in the distribution and density of P. chinensis populations, resulting in reemerging VL or emerging natural foci of VL in Shaanxi, Shanxi, Henan, and Hebei [28]. Notably, the number of reported MT-ZVL cases has increased rapidly in Shanxi Province during the past five years [47]. In 2016, local cases were identified in Henan Province, which had not reported any cases since 1983 [48]. Since then, local cases of VL have been detected in Hebei and Beijing, posing an imminent threat to local human health [40]. Therefore, it is urgent to identify the areas at risk of MT-ZVL transmission to interrupt its transmission.
Since vectors and pathogens are greatly affected by climatic, ecological, and environmental factors [49] and show temporal and spatial clustering accordingly [50], ecological niche models show great promise at predicting the transmission risk of vector-borne diseases [51]. Ecological niche models have widely been employed to predict the transmission risk of schistosomiasis and malaria; however, there have been few reports pertaining to the application of ecological niche models to studies of VL. In this study, we collected the distributions of MT-ZVL cases in China and climatic, geographical, and socioeconomic variables that may affect VL transmission risk to create ten ecological niche models. Ensemble models were then screened to predict the areas at risk of MT-ZVL transmission in China.
All ten ecological niche models were found to be effective at predicting the areas at risk of MT-ZVL transmission, but with varying results among the models. The GAM, MaxEnt, and GBM models predicted relatively concentrated areas at risk of MT-ZVL transmission, while the ANN, MARS, and CTA models predicted widespread areas. The CTA and SRE models predicted large areas at risk of MT-ZVL transmission, and most areas were predicted to be at high or medium risk. The high-, medium-, and low-risk areas predicted by the ecological niche models were centered in Shanxi Province and tended toward dispersion to the surrounding regions, with southern Liaoning Province to the north and the southern edge of the hilly regions in northern Sichuan Province to the south. We also evaluated the performance of the ten ecological niche models: The RF, MaxEnt, GBM, and MARS models performed best (mean AUC > 0.95 and mean TSS > 0.85). As a single ecological niche model suffers from problems of spatial uncertainty, such as low accuracy and poor predictive value [35], ensemble models were screened and showed higher performance and predictive accuracy than any single ecological niche model, reducing the spatial uncertainty of predicting MT-ZVL transmission risk [52]. To investigate the model's performance, the AUC and TSS values were employed. However, AUC and TSS provide the accuracy of different aspects of models. A new system named CCHZ (C: Chen, C: Chen, H: Hu, and Z: Zhou)-DISO (distance between indices of simulation and observation) has the advantage of measuring the overall and comprehensive performance of different models [53][54][55]. It will be used to quantify the overall performance among different models in our future study.
Our findings showed that the areas at risk of MT-ZVL transmission predicted by our ensemble models covered the current actual distribution of MT-ZVL and extended to the surrounding regions. High-and medium-risk areas were predicted to be concentrated in central and southern Shanxi, eastern Shaanxi, southern Gansu, and northern Henan. Lowrisk areas were mainly predicted in northern Sichuan, southern Hebei, and the hilly regions in northwestern Beijing. Overall, the primary high-risk areas for MT-ZVL transmission were centered in Shanxi Province and distributed to the surrounding regions, appearing as a high spatial cluster. The secondary high-risk areas were mainly distributed in southern Gansu and concentrated in several counties of Longnan City and the Gannan Tibetan Autonomous Region. The results mostly agreed with the current distribution of MT-ZVL. We predicted larger areas at risk of MT-ZVL transmission and larger high-risk areas in Shanxi and Henan provinces than the previous predictions of the nine ecological niche models. This was consistent with the current trends for MT-ZVL incidence in these two provinces [56]. In addition, our ensemble models predicted medium-and low-risk areas for MT-ZVL transmission in Ningxia, Henan, Hebei, and Beijing. To date, few cases of MT-ZVL have been reported in Henan and Hebei, and no local cases have been detected in Ningxia. Notably, local cases of MT-ZVL were identified in the suburbs of Beijing Municipality, where VL had been almost eradicated for decades. These reports demonstrate the likelihood of VL rebounding and reemerging in these provinces, implying the urgent need for the surveillance of Ph. chinensis vectors.
Next, we investigated the potential causes responsible for MT-ZVL rebounding and spreading. We identified climatic variables as the most important contributors to the ecological niche models, and four climatic variables (the mean temperature of the coldest quarter, precipitation during the wettest quarter, the minimum temperature of the coldest month, and annual precipitation) and two geographical variables (elevation and the normalized difference vegetation index) contributed the most to the models, emphasizing the rise of the minimum temperature and increased precipitation caused by global warming. In addition, we predicted the high-risk areas for MT-ZVL transmission to be in the Yanshan-Taihangshan mountain deciduous broad-leaved forest ecological zone and the Fenwei Basin Agro-ecological zone. With a temperate continental monsoon climate, these two ecological zones are hot and rainy in summer, with a hilly and frondent environment and a large number of wild animals. These are favorable climatic conditions for the breeding, reproduction, and habitation of wild sand flies. Furthermore, our findings showed the contribution of socioeconomic variables to ecological niche models. These variables are associated with the living environment of local residents. Local loess cave dwellings and architectural works made of cement, brick, and tile provide suitable habitats for Ph. chinensis breeding [57]. More importantly, the diagnosis, screening, and management of VL were neglected following the effective control of this disease in China, and this is partly responsible for the rebound of local VL epidemics [58]. In the context of a gradually increasing population density of wild species of P. chinensis, the introduction of animal sources of infections is very likely to cause VL transmission and reemergence. This may result in the spread of local VL to surrounding regions. With global warming, the duration and area of Ph. chinensis activities may further expand, which will further increase the VL transmission risk in these regions [59].

Conclusions
In summary, we generated ensemble ecological niche models based on environmental variables, and these ensemble models were shown to be effective at predicting the transmission risk of MT-ZVL in China. The findings by the ecological niche models agreed with the trends for MT-ZVL incidence in China and may provide technical support for MT-ZVL control and surveillance. Nevertheless, there are some limitations, such as the MT-ZVL case data were collected from the National Notifiable Diseases Reporting System, missing diagnoses or reports cannot be completely excluded, and the variable of the distribution of Ph. chinensis was not included, which may lead to risk underestimated. Further studies to acquire accurate distributions of MT-ZVL cases and vector species through field surveys are needed, and more quantitative variables should be included in the training datasets