Article

Spatiotemporal Changes of Pine Caterpillar Infestation Risk and the Driving Effect of Habitat Factors in Northeast China

College of Geo-Exploration Science and Technology, Jilin University, Changchun 130026, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2025, 17(10), 1738; https://doi.org/10.3390/rs17101738
Submission received: 25 March 2025 / Revised: 10 May 2025 / Accepted: 13 May 2025 / Published: 16 May 2025

Abstract

Pine caterpillar (Dendrolimus) infestations threaten pine forests, causing severe ecological and economic impacts. Identifying the driving factors behind these infestations is essential for effective forest management. This study uses the APCIRD framework combined with an improved random forest model, the MLP-random forest (MRF), which couples a multi-layer perceptron (MLP) with a random forest, to analyze spatiotemporal changes in infestation risk and the driving effects of habitat factors in Northeast China. For the period from 2019 to 2024, we applied SHapley Additive exPlanations (SHAP), frequency analysis, fitting functions, and GeoDetector to quantify the impact of key drivers, such as snow cover and soil, on infestation risk. The findings are as follows: (1) The APCIRD framework with the MRF model accurately assesses infestation risk. Between 2019 and 2024, areas with high infestation risk declined, shifting from higher to lower levels, with Eastern Heilongjiang and Southwest Liaoning remaining key areas of concern. (2) Snow cover and soil factors are critical to infestation risk, with eight key habitat factors significantly affecting the risk. Their relationships with infestation risk follow complex, non-monotonic quartic and cubic patterns. (3) Factors triggering high infestation risk are mostly at low to moderate levels. High-risk areas tend to have low to moderate elevation (<800 m), moderate to high solar radiation and temperature, gentle slopes (<30°), low to moderate evaporation, shallow snow depth (<0.02), moderate snow temperature (266.73–275), low to moderate soil moisture (0.2–0.3), moderate to high soil temperature (276.73–286.92), low to moderate rainfall, moderate wind speed, low leaf area index, high vegetation type, low vegetation cover, low population density, and low surface runoff. Interactions between factors provide a stronger explanation of infestation risk than individual factors. The APCIRD framework, combined with MRF, offers valuable insights for understanding the drivers of pine caterpillar infestations.

1. Introduction

Forests are essential parts of terrestrial ecosystems, rich in both biomass and biodiversity [1,2]. In northeastern China, a major timber-producing region, pine caterpillar infestations threaten forest resources [3,4,5], ecosystem stability, and economic sustainability [6,7,8]. Accurately assessing the risk of pine caterpillar infestation and understanding how habitat factors drive this risk are crucial.
Understanding the habitat factors influencing pine caterpillar infestation mainly relies on sample plot surveys and meteorological station data, which have limitations [9,10,11]. Meteorological stations are unevenly distributed, with few in remote or mountainous areas, creating gaps in capturing local climate variations in complex terrains [12,13,14]. Furthermore, meteorological data focus on single factors like temperature and precipitation, often overlooking key non-meteorological factors such as soil, topography, and vegetation [15]. Some stations also lack long-term continuous data, which limits their usability [16]. To address these issues, recent research combines remote sensing and ground survey data with multi-scale modeling approaches [17]. Advances in remote sensing technology now provide large-scale, long-term habitat data with spatiotemporal continuity and diverse dimensions, enabling the monitoring of critical factors such as host area, growth status, field environment, and agricultural landscape patterns. Multi-source remote sensing, including optical, microwave, and thermal infrared data, improves habitat suitability assessment [18,19,20]. However, some studies still rely solely on outdated datasets, such as ‘WorldClim’ (1970–2000) [21,22], or short-term local data, limiting the ability to identify long-term trends and analyze the driving effects of habitat factors on infestation risk [23,24]. Additionally, incomplete or unevenly distributed data on climate, vegetation, and infestations reduce the general applicability of the findings. This highlights the need for a new approach to improve pest infestation risk assessments.
When assessing the level of infestation risk, multiple factors (e.g., climate, topography, soil, and land use) are often weighted for a comprehensive analysis [25,26,27]. However, these methods have limitations, including inaccurate weight assignments, overlooked factor interactions, lack of scientific basis for importance rankings, and insufficient uncertainty analysis, which reduce applicability across regions and scenarios. Pest habitat studies often use models like species distribution [28,29,30], statistical [31,32,33], and niche models [34,35,36]. However, these methods often lack consistency, limiting their integration into a unified, dynamic framework. MaxEnt is widely used to analyze species distributions, but its results typically focus on areas near the sample data [37,38,39], with limited comparability due to differences in model assumptions and parameters. Many models are region- or condition-specific, lacking scalability for larger areas [40]. Machine learning models often predict infestation risk zones but rarely analyze geographical or spatiotemporal distributions in depth [41,42,43], focusing mostly on short-term infestations and neglecting long-term trends [44]. Pine caterpillar infestations are influenced by both intrinsic (e.g., eggs) and extrinsic (e.g., climate, topography, soil, forest composition, human activities) factors, creating a complex habitat system that single models struggle to analyze [11,45,46]. Current studies often overlook multi-factor interactions [10], and real-world habitat factors display non-uniform distributions, further limiting analysis accuracy [47,48,49].
Existing studies mainly examine single-factor impacts on pine caterpillar infestation risk. Sparse meteorological data and insufficient data fusion have hindered the development of a systematic risk assessment framework, and the absence of region-specific factor selection complicates accurate risk assessment and the analysis of long-term habitat influences. To address these issues, this study incorporates snow cover and soil factors for Northeast China and proposes the APCIRD framework combined with an MLP-random forest (MRF) model to comprehensively understand and quantify the driving role of habitat factors in pine caterpillar infestation risk in Northeast China. The Multi-layer Perceptron (MLP) can better capture complex relationships in non-normal data, and random forests have been widely used in building models [50,51,52]. This study assesses the temporal and spatial variations in pine caterpillar infestation risk from 2019 to 2024 by combining MRF with SHAP, fitting functions, frequency analysis, and GeoDetector. It identifies key areas and habitat factors requiring attention, models the functional relationship between key habitat factors and infestation risk, quantifies the optimal threshold ranges of the factors that contribute to high infestation risk, and demonstrates that interactions between factors provide stronger explanatory power for infestation risk than individual factors.
The main contributions of this study are as follows: (1) Taking actual conditions into account, including snow accumulation and soil, this study integrates the APCIRD framework with MRF to produce county-scale infestation risk maps from 2019 to 2024 and to highlight key areas of concern. (2) It assesses the impact of snowpack and soil on infestation risk, quantifies the relationship between key habitat factors and risk, and analyzes how risk varies with these factors. (3) It analyzes the characteristics of habitat factors contributing to high risk and quantifies their optimal threshold ranges. (4) It demonstrates that interactions between factors provide stronger explanatory power for infestation risk than individual factors.

2. Study Area and Data

2.1. Study Area

We focused on northeastern China, covering Heilongjiang, Jilin, Liaoning, and eastern Inner Mongolia (Figure 1). The geographical extent spans from 38°40′N to 53°30′N and from 115°05′E to 135°00′E, covering approximately 1.24 million km². The terrain is mainly composed of plains, hills, and mountains. Winters are marked by low temperatures, high wind speeds, and prolonged snow cover, affecting forest water cycles and energy flow. Key pine species include Pinus sylvestris L., Pinus koraiensis Siebold & Zucc., Pinus tabuliformis Carrière, and Pinus pumila (Pall.) Regel (http://nsii.org.cn/).

2.2. Data Source

2.2.1. Distribution Points of Pine Caterpillar Infestation and Non-Infestation

The distribution point data of pine caterpillar infestation mainly come from the Global Biodiversity Information Network Database (https://www.gbif.org/), the China Animal Theme Database (http://www.zoology.csdb.cn/), the Biodiversity and Ecological Security (https://bio-one.org.cn/), the Species 2000 China Node (http://col.especies.cn/), the National Forestry and Grassland Science Data Center (https://www.forestdata.cn/), pine caterpillar infestation warnings issued by the Forestry Departments of Jilin, Liaoning, Heilongjiang, and Inner Mongolia Autonomous Region, warnings released on the official websites of county and city forestry and grassland bureaus, and the China Forestry Statistical Yearbook. Among them, the Global Biodiversity Information Network Database and the China Animal Theme Database provide GBIF data, which record the coordinates and years of pine caterpillar occurrence points, most of them observed and recorded by humans. The Biodiversity and Ecological Security and the Species 2000 China Node record information such as specimens, collection years, and common distribution areas of pine caterpillars in text form. The early warnings released by provincial and municipal forestry departments are mostly at the county and township scale, while news releases mainly report at the county, township, and forest farm scale. The data recorded by county-level forestry bureaus and forest farms mostly give township or forest farm scale locations and coordinate information. The China Forestry Statistical Yearbook records the area affected by forest pests. Published studies and news releases were also consulted; the infestation information in the literature mostly gives a relatively precise spatial extent, often down to a specific forest farm or even a specific forest farm team.
We also referred to historical records of occurrence-point coordinates kept by forestry departments in some regions (such as the Forestry Bureaus of Chaoyang County, Lingyuan City, and Changtu County in Liaoning Province, and the Forestry Bureau of Changbai Mountain Nature Reserve), research results from colleges and universities (such as the research of Northeast Forestry University on the pine caterpillar disaster area in Daqingshan County, Heilongjiang Province, and the results of 37 field sample plots surveyed by Northeast Normal University in Changbai Mountain), and related pest monitoring reports (such as the pine caterpillar disaster area reports released by provincial and municipal forestry bureaus, information released by news media, etc.).
In 2018 and 2019, a serious pine caterpillar outbreak occurred in Changbai Mountain. Two field surveys of this outbreak were conducted, in 2019 and 2020, recording 34 coordinate points in 2019 and 65 coordinate points in 2020 (Figure 2). After comprehensive analysis, 528 distribution points of pine caterpillar infestation from 1981 to 2018 were determined. Points that recurred among the 528 were sorted year by year, giving a total of 1903 records. These records were then screened, and those with missing information were removed to obtain the usable pine caterpillar infestation distribution point data.
To avoid spatial overlap between samples, a minimum buffer distance of 10 km from the known infestation points was set, with non-infestation points kept within a range of 10–50 km. Non-infestation points were then randomly selected with stratification by time and space to ensure that the samples were representative and evenly distributed, so that they matched the infestation points as closely as possible in spatial distribution and sample quantity. In this study, such points do not represent “absolutely no insect pests” but serve as an approximate substitute for “unrecorded infestation” areas. Because historical records have limitations, non-infestation points may include areas that were not monitored or were missed in the records. Even in the absence of continuous monitoring data on insect population density, spatial-range and time-stratified screening strategies can still support the spatial modeling of infestation risk to a certain extent.
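The buffer rule described above can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: it assumes the 10–50 km range refers to the distance from a candidate point to its nearest known infestation point, it uses great-circle (haversine) distance, and it omits the time and space stratification. The function names, the bounding box, and the example coordinate are all hypothetical.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_distance_km(point, infestation_points):
    """Distance from `point` to the closest known infestation point."""
    return min(haversine_km(point[0], point[1], p[0], p[1])
               for p in infestation_points)


def sample_non_infestation(infestation_points, n, bbox,
                           min_km=10.0, max_km=50.0, seed=0,
                           max_tries=100_000):
    """Rejection-sample up to n candidates whose nearest infestation
    point lies between min_km and max_km away (assumed reading of the
    buffer rule). bbox = (lat_min, lat_max, lon_min, lon_max).
    Returns fewer than n points if max_tries is exhausted."""
    rng = random.Random(seed)
    lat_min, lat_max, lon_min, lon_max = bbox
    chosen = []
    for _ in range(max_tries):
        if len(chosen) == n:
            break
        cand = (rng.uniform(lat_min, lat_max),
                rng.uniform(lon_min, lon_max))
        if min_km <= nearest_distance_km(cand, infestation_points) <= max_km:
            chosen.append(cand)
    return chosen


if __name__ == "__main__":
    known = [(42.0, 128.0)]  # hypothetical infestation point (lat, lon)
    for lat, lon in sample_non_infestation(
            known, n=5, bbox=(41.0, 43.0, 127.0, 129.0)):
        d = nearest_distance_km((lat, lon), known)
        print(round(lat, 3), round(lon, 3), round(d, 1))
```

Rejection sampling is used here only because it is easy to read. A stratified design would draw candidates separately per year and per sub-region before applying the same distance filter.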

2.2.2. Habitat Factor Data

The habitat factor data (Table 1) primarily come from the ERA5_Land reanalysis dataset (https://cds.climate.copernicus.eu/datasets/, accessed on 2 August 2024), with a spatial resolution of 0.1° × 0.1°, covering climate, vegetation, soil, and snow factors; soil moisture and temperature are both measured at a depth of 0–7 cm underground. In Table 1, “5-year average” refers to the average over the five years before the year of pine caterpillar infestation. Topographic factors are derived from NASA DEM (30 m) data (https://earthdata.nasa.gov/), with slope calculated from the DEM data. The human factor is represented by the 5-year average population density (1 km) (https://landscan.ornl.gov).
Because the studied pine forest areas are mainly distributed in mountainous areas, the annual average habitat factor dataset from 1979 to 2018 was reconstructed at a spatial resolution of 0.01° × 0.01° by bilinear interpolation using the CDO (1.9.9rc1) software. We resampled DEM, slope, and population density data to a uniform 0.01° resolution. Infestation data were matched with habitat factors using time and location information, with a 5 km sampling window centered on the infestation coordinates. Given the typical 3–8-year infestation cycle, the five-year average of each habitat factor before each infestation year was used as an independent variable for model training, and the same method was applied to non-infestation points. After data cleaning, the dataset comprised 3900 samples.
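As an illustration of the five-year averaging step, the sketch below computes the mean of one habitat factor over the five years preceding an infestation year. It assumes annual values are held in a plain dictionary keyed by year; the function name and the sample temperature values are hypothetical and are not taken from the study's data.

```python
def prior_mean(annual_values, event_year, window=5):
    """Mean of a factor over the `window` years immediately before
    `event_year` (the event year itself is excluded)."""
    years = range(event_year - window, event_year)
    missing = [y for y in years if y not in annual_values]
    if missing:
        raise ValueError(f"missing annual values for years: {missing}")
    return sum(annual_values[y] for y in years) / window


if __name__ == "__main__":
    # Hypothetical annual mean 2 m temperature (K) at one sample point.
    t2m = {2014: 276.0, 2015: 277.0, 2016: 278.0, 2017: 277.5, 2018: 276.5}
    print(prior_mean(t2m, event_year=2019))  # mean of 2014-2018
```

The same function would be applied to every factor at both infestation and non-infestation points, so that each sample carries predictors drawn from the same pre-event window.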

2.3. Shapiro–Wilk Test

The normality of habitat factor data is crucial for selecting the appropriate model construction method. The Shapiro–Wilk test reports three indicators: the statistic, the p-value, and the resulting normality decision. The statistic measures how closely the data follow a normal distribution, with values ranging from 0 to 1; values closer to 1 indicate a better fit to a normal distribution [53]. The p-value is used to assess whether the null hypothesis (the data follow a normal distribution) can be rejected. When the p-value > 0.05, the null hypothesis cannot be rejected and the data may follow a normal distribution; when the p-value ≤ 0.05, the data deviate significantly from normality.
The results of the Shapiro–Wilk test (Figure 3) show that the p-values for all factors are less than 0.05, indicating that none of the factors passes the normality test at the 0.05 significance level. Some factors (e.g., u10 and v10) have statistics close to 1, all greater than 0.98, and the statistics for t2m_04 and stl1_04 are 0.9854 and 0.9800, respectively, indicating that these factors are close to a normal distribution. Although their p-values are less than 0.05, their deviation from normality is relatively small, but potential bias should still be noted. In contrast, the statistics for sro_04, sd_04, and tvh are far below 1 (0.4529, 0.3390, and 0.5076, respectively), suggesting that these factors deviate markedly from normality, possibly due to skewness or kurtosis. Other factors, such as d2m, e, and stl1, have statistics around 0.90, indicating a clear deviation from normality.

2.4. Risk Assessment and Habitat Factor Analysis Methods

2.4.1. Conceptual Framework

This study proposes the idea of combining the APCIRD framework with machine learning (Figure 4), which integrates pine caterpillar infestation distribution data, habitat factors, SHAP, frequency analysis, fitting functions, and GeoDetector to assess the risk and spatiotemporal changes of the infestation from 2019 to 2024, while also comprehensively analyzing the driving effects of habitat factors on infestation risk.

2.4.2. The Risk Assessment Model

As reported in Section 2.3, the Shapiro–Wilk test on the 3900 data points showed that all factors failed the normality test, which may better reflect real conditions. The MLP feature extractor, a deep learning model for capturing nonlinear features, maps high-dimensional data to a lower-dimensional space through fully connected layers. We therefore used the MLP to extract factor features and combined them with a random forest to improve the accuracy of infestation risk assessment, forming the MRF model.
To compare performance, we applied three machine learning models—random forest (RF) [54], Extreme Gradient Boosting (XGBoost) [55], and Light Gradient Boosting Machine (LightGBM) [56]—along with SHapley Additive exPlanations (SHAP), frequency analysis, and GeoDetector to analyze factor contributions and interactions in infestation risk. RF, an ensemble method, enhances accuracy by aggregating predictions from multiple decision trees, handles high-dimensional data well, and provides feature importance insights. XGBoost, a high-performance gradient boosting variant, iteratively trains decision trees for better efficiency but requires complex parameter tuning. LightGBM, developed by Microsoft, is a fast, memory-efficient gradient boosting framework ideal for large-scale and complex data. The dataset had a roughly 1:1 positive-to-negative sample ratio, split into training and test sets at a 4:1 ratio. Model performance was assessed using precision, recall, F1-score, and validation accuracy.

2.4.3. SHAP and Fitting Function

SHAP enhances the interpretability of machine learning models. By combining SHAP values with normalized variables, it helps to fit the functional relationship between key habitat factors and the risk of pine caterpillar infestation, illustrating how the risk changes as these factors vary [57]. This approach improves our understanding of the driving relationship between pine caterpillar infestation risk and habitat factors.

2.4.4. Frequency Analysis

Frequency analysis, commonly used in ecology and geography to examine the distribution of biological or environmental phenomena, is introduced here to analyze the characteristics of habitat factors that contribute to high infestation risk [58]. The method clarifies the optimal threshold range of each habitat factor associated with high infestation risk levels and is calculated using the following formula:
$$F = \frac{S_i}{S} \tag{1}$$
In Equation (1), $S_i$ represents the number of pixels in factor i with Higher and Highest risk levels, $S$ represents the total number of pixels with Higher and Highest risk levels, and $F \in [0, 1]$.
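A minimal sketch of Equation (1) follows. It assumes that the index i refers to a value interval (bin) of a single habitat factor, so that F gives the share of high-risk pixels falling in each interval. That reading, the function name, the bin edges, and the sample elevations are illustrative assumptions rather than details stated in the text.

```python
from bisect import bisect_right


def frequency_by_bin(factor_values, risk_levels, bin_edges,
                     high_levels=("Higher", "Highest")):
    """Return F = S_i / S for each bin of one habitat factor.

    S_i: high-risk pixels whose factor value falls in bin i.
    S:   all high-risk pixels.
    Bins are [edge_k, edge_k+1), except the last, which is closed.
    """
    counts = [0] * (len(bin_edges) - 1)
    total = 0
    for value, level in zip(factor_values, risk_levels):
        if level not in high_levels:
            continue  # only Higher/Highest pixels enter the ratio
        if value < bin_edges[0] or value > bin_edges[-1]:
            raise ValueError(f"value {value} lies outside the bin range")
        idx = min(bisect_right(bin_edges, value) - 1, len(counts) - 1)
        counts[idx] += 1
        total += 1
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


if __name__ == "__main__":
    # Hypothetical elevations (m) and risk labels for six pixels.
    dem = [150, 420, 640, 780, 950, 1300]
    risk = ["Higher", "Highest", "Higher", "Medium", "Highest", "Lower"]
    print(frequency_by_bin(dem, risk, bin_edges=[0, 400, 800, 1200, 1600]))
    # → [0.25, 0.5, 0.25, 0.0]
```

Under this reading, the interval with the largest F is the range in which high-risk pixels concentrate, which is how an optimal threshold range could be identified for each factor.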

2.4.5. GeoDetector

GeoDetector is a tool used to analyze spatial distribution and its influencing factors, commonly applied in fields like geography and environmental science [59]. It reveals driving effects by detecting spatial heterogeneity and interactions between factors, without assuming homogeneity of variance or normality. This study employed GeoDetector to analyze the relationship between pine caterpillar infestation risk levels and habitat factors. The analysis included (1) evaluating the explanatory power of individual factors on infestation risk and (2) exploring the explanatory power of habitat factor interactions on infestation risk. Each factor was discretized into five levels using the natural breakpoint method, and GeoDetector was used to calculate each factor’s explanatory power (q-value) and significance (p-value).
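For readers unfamiliar with the q-value, the sketch below implements the standard GeoDetector factor-detector statistic, q = 1 − Σ_h N_h σ_h² / (N σ²), where h indexes the strata of a discretized factor, N_h and σ_h² are the sample count and variance within stratum h, and N and σ² are those of the whole sample. It takes already-discretized strata as input, so the natural-breaks step is not shown, and it omits the significance test. The function name and toy data are hypothetical.

```python
from collections import defaultdict


def _population_variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def q_statistic(strata, response):
    """GeoDetector q: share of the response variance explained by the
    stratification. `strata` holds one hashable label per sample."""
    if len(strata) != len(response):
        raise ValueError("strata and response must have equal length")
    groups = defaultdict(list)
    for label, y in zip(strata, response):
        groups[label].append(y)
    total = len(response) * _population_variance(response)
    if total == 0:
        raise ValueError("response has zero variance")
    within = sum(len(ys) * _population_variance(ys)
                 for ys in groups.values())
    return 1 - within / total


if __name__ == "__main__":
    # Toy example: risk depends on the combination of two factors.
    a = [1, 1, 2, 2]             # strata of hypothetical factor A
    b = [1, 2, 1, 2]             # strata of hypothetical factor B
    risk = [0.0, 1.0, 1.0, 0.0]
    print(q_statistic(a, risk))                # → 0.0
    print(q_statistic(b, risk))                # → 0.0
    print(q_statistic(list(zip(a, b)), risk))  # → 1.0
```

The last call overlays the strata of the two factors, which is how the interaction detector forms its combined stratification. In this toy case neither factor explains the risk alone, while their combination explains it completely.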

3. Results and Analysis

3.1. Comparison of Accuracy of Different Models

This study estimated pine caterpillar infestation risk probabilities using machine learning models, defining probabilities above 0.5 as “Occur” and below 0.5 as “No Occur”. The models showed clear performance differences (Table 2). MRF performed the best, achieving high precision (0.97), recall (0.99), and F1-score (0.98) for the “No Occur” class, indicating a low false positive rate. For the “Occur” class, it had a precision of 0.97, a recall of 0.90, and an F1-score of 0.95, with a low false negative rate. The validation accuracy (val_accuracy) was 0.9748. RF also performed well, with a precision of 0.95, recall of 0.99, and F1-score of 0.97 for the “No Occur” class. For the “Occur” class, it had a precision of 0.96, recall of 0.86, and an F1-score of 0.91, with a val_accuracy of 0.9505. XGBoost had slightly lower performance than RF, with an F1-score of 0.96 for the “No Occur” class (precision: 0.93, recall: 0.98), while for “Occur”, its precision was 0.95, recall was lower at 0.83, and the F1-score was 0.89, resulting in a val_accuracy of 0.9385. LightGBM slightly outperformed XGBoost in the “No Occur” class, with a precision of 0.94, recall of 0.98, and F1-score of 0.96. For the “Occur” class, it had a precision of 0.95, recall of 0.85, and an F1-score of 0.90, positioning it between RF and XGBoost, with a val_accuracy of 0.9417. Overall, MRF had the best performance, followed by RF.
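The per-class metrics quoted above follow from the thresholded probabilities in the usual way. The sketch below shows the computation on made-up data. It assumes that a probability of exactly 0.5 is assigned to “No Occur”, which the text does not specify, and the function names and sample values are hypothetical.

```python
def classify(probabilities, threshold=0.5):
    """Label 1 ("Occur") when probability > threshold, else 0 ("No Occur")."""
    return [1 if p > threshold else 0 for p in probabilities]


def class_metrics(y_true, y_pred, positive):
    """Precision, recall, and F1-score, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def accuracy(y_true, y_pred):
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


if __name__ == "__main__":
    # Hypothetical labels (1 = Occur) and model probabilities.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    probs = [0.9, 0.8, 0.4, 0.2, 0.6, 0.1, 0.3, 0.7]
    y_pred = classify(probs)
    print(class_metrics(y_true, y_pred, positive=1))  # → (0.75, 0.75, 0.75)
    print(accuracy(y_true, y_pred))                   # → 0.75
```

Calling `class_metrics` with `positive=0` gives the corresponding figures for the “No Occur” class, which is why each model in Table 2 has two sets of precision, recall, and F1 values.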

3.2. Spatiotemporal Changes of Infestation Risk Levels of Pine Caterpillar

Model validation results show that the MRF method is the most effective for assessing pine caterpillar infestation risk. We applied this model to evaluate infestation risk in the study area for 2019 and classified the risk levels (Table 3 and Figure 5). Comparison with the 2019 warning data from the Pest Control Center of the State Forestry and Grassland Administration of China showed that the high-risk infestation areas aligned with the official warnings (https://mp.weixin.qq.com/s/54Nk2w7THrYkuEHlMopW3w (accessed on 2 August 2024)).
From 2019 to 2024, the pine caterpillar infestation risk level showed obvious changes (Figure 5). In general, the proportion of low-risk areas (Lowest and Lower) was high, while the proportion of high-risk areas (Higher and Highest) decreased year by year, reflecting the gradual reduction of the risk of disaster. In 2019, the low-risk areas were 39.09% (Lowest) and 37.48% (Lower), the medium-risk areas were 12.83%, and the high-risk areas (Higher was 7.46% and Highest was 3.14%) accounted for a small proportion. By 2020, the low-risk areas increased further, with Lowest at 40.49% and Lower rising sharply to 51.34%, while the high-risk areas decreased significantly, with Highest falling to 1.37%. In 2021, the low-risk areas dominated, with Lowest and Lower accounting for 47.93% and 48.90% respectively, and the high-risk areas almost disappeared (Higher and Highest were 0.01% and 0%). In 2022, the proportion of Lowest risk areas further increased to 59.37%, and the proportion of Highest risk areas was almost zero. In 2023 and 2024, the proportion of low-risk areas fluctuated, especially in 2024, when the Lower area rose sharply to 74.74%. Despite this, the proportion of Highest risk areas remained at a very low level, especially in 2024, when the Highest risk area completely disappeared. Overall, the risk of pine caterpillar infestation has decreased significantly in the past few years, with the number of low-risk areas increasing year by year and the number of high-risk areas gradually decreasing.
As shown in Figure 6, the number of pixels at each pine caterpillar infestation risk level fluctuated markedly between 2019 and 2024, reflecting the changing trend of risk across years. The low-risk classes (Lowest and Lower) occupied the largest number of pixels in most years. The Lowest class covered 8655 pixels in 2022 and 3517 pixels in 2024, while the Lower class reached 10,896 pixels in 2024, indicating that the low-risk area tended to expand spatially. The Medium class decreased year by year from 1870 pixels in 2019 to 168 pixels in 2024, showing a gradual weakening of the risk, with particularly sharp drops in 2021 and 2022. The number of pixels in the high-risk classes (Higher and Highest) showed a clear downward trend. In 2019, the Higher and Highest classes covered 1087 and 458 pixels, respectively. By 2024, these classes had almost disappeared, with only 126 Higher pixels remaining. Overall, the changes between 2019 and 2024 show that the proportion of low-risk areas increased while the proportion of high-risk areas gradually decreased, reflecting an overall downward trend in pine caterpillar infestation risk.
In practical applications, after verifying the accuracy of the model, we conducted a detailed assessment of the risk of pine caterpillars from 2019 to 2024, and based on this, we drew a spatial distribution map of the pine caterpillar risk level (Figure 7). This analysis shows the changing trend and spatial distribution characteristics of the risk of pine caterpillars in different years. Specifically, the high-risk areas for pine caterpillars in 2019 were mainly distributed in the southwest and northeast of Liaoning Province, the east and southeast of Jilin Province, and the south, east and northeast of Heilongjiang Province. The risk of pine caterpillars in these areas is relatively concentrated, especially in Lingyuan, Jianping, Yuanbaoshan, Ningcheng, and other places in the southwest of Liaoning Province, while the northeast mainly covers Tonghua, Xinbin, Qingyuan, and other areas. In Jilin Province, Hunchun, Dunhua, Longjing, Antu, Helong, and Changbai Mountain in the east, as well as Liuhe, Huinan, Panshi, Jingyu, and other places in the southeast, are all areas with a high risk of pine caterpillars. There are also certain high-risk areas in the south, east, and northeast of Heilongjiang Province. Acheng and Binxian in the south, Yangming, Aimin, Muling, Linkou, Jitong, and other areas in the east, as well as Huanan, Huachuan, Tangyuan, Youyi, Baoshan, and other places in the northeast, all have a high risk of pine caterpillars. By 2020, the high-risk areas for pine caterpillars changed, mainly concentrated in the southwest of Liaoning Province, mainly in Lingyuan, Chaoyang, Beipiao, Jianping, Jianchang, and other places. High-risk areas have also appeared in the Oroqen area of the Inner Mongolia Autonomous Region. Compared with 2019, the risk areas in Jilin Province decreased. 
The risk distribution in 2023 shows that the higher-risk areas for pine caterpillars are still concentrated in the southwest of Liaoning Province, including Lingyuan, Chaoyang, Beipiao, Jianping, Jianchang, and other places, but the risk areas have moved northward and extended to Fuxin, Changtu and other places. The high-risk areas in Jilin Province are mainly distributed in the northern slope of Tianchi Lake in Changbai Mountain, indicating that the ecological environment in this area still has an important impact on the growth and spread of pine caterpillars. In 2021, 2022, and 2024, the disaster risk of pine caterpillars was generally at a medium risk level or below, the risk area gradually decreased, and the distribution was relatively more dispersed. This change shows that the infestation risk of pine caterpillars has been effectively controlled, especially in 2024, when the overall risk level was low. By analyzing the spatial distribution of pine caterpillar risk levels in different years, it can be clearly observed that the infestation risk of pine caterpillars has shown a certain spatial contraction trend over time, especially in the years when prevention and control measures were gradually strengthened, the disaster risk of pine caterpillars decreased significantly.

3.3. Frequency Analysis of Infestation Risk Levels of Pine Caterpillar

To accurately quantify the optimal threshold range of each habitat factor that leads to higher and highest risk levels of pine caterpillar infestation, frequency analysis was applied to the model assessment results. Since higher and highest infestation risk levels were absent in 2021, 2022, and 2024, only the risk assessment data for 2019, 2020, and 2023 were considered (Figure 8, Figure 9 and Figure 10). The frequency analysis revealed general characteristics of areas with higher pine caterpillar infestation risks, including low to medium altitude, medium to high net surface solar radiation, moderate to high temperatures, gentle slopes (<30°), low to medium evaporation, low snow depth, medium snow temperature, low to moderate soil moisture, moderate to high soil temperature, low to moderate rainfall, low to moderate wind speed, low to moderate leaf area index, high vegetation type, low to moderate vegetation cover, low population density, and low surface runoff. Altitude affects pine caterpillars through temperature, humidity, and vegetation. At higher altitudes, cooler temperatures and shorter growing seasons limit their life cycle and reproduction. Lower altitudes, with warmer climates and longer seasons, support caterpillar growth, aided by abundant pine vegetation [60]. Slope also influences distribution; steeper slopes typically have lower soil moisture, which limits pine growth and food sources, while wet, low-slope areas favor caterpillar growth and reproduction [61]. Soil moisture is critical for plant water supply, with wet conditions promoting pine growth and providing more food. Soil temperature affects root growth and plant resistance. Extreme temperatures, however, can negatively impact the caterpillar life cycle [62]. The interaction between soil temperature and moisture influences activity and habitat selection, with moist, warm springs providing ideal conditions for growth and reproduction.
The quantified optimal threshold ranges of each habitat factor for the Higher and Highest levels of pine caterpillar infestation risk are as follows: cvh: optimal threshold is <0.7, with risk increasing in the range of 0.2–0.6; d2m: optimal range is 268–274, especially 271–274, where infestation risk is higher; DEM: optimal range is <800, with higher occurrence probability in this range; e: optimal range is −0.0017 to −0.0009, with strong reactions in the range of −0.0017 to −0.0013; lai_hv: optimal range is 1.01–3.50, with risk increasing from 1.88 to 3.20; slope: optimal range is <26°, with significant impact below 17.32°; ssr: optimal range is >11.10 M (Million), with significant impact from 11.09 M to 15.19 M; stl1: optimal range is 276.73–286.92, with a notable response above 280, consistent with stl1_04; swvl1: optimal range is 0.2–0.3, with notable dependence, consistent with swvl1_04; t2m: strong response above 278, consistent with t2m_04; tp: optimal range is <0.0023, with high sensitivity, consistent with tp_04; tsn: optimal range is 266.73–275; tvh: optimal range is 13.9–19; u10: optimal range is 0.6–1.7; and v10: optimal range is −0.76 to 0.30. Quantifying the optimal threshold ranges between habitat factors and high pine caterpillar infestation risk contributes to a better understanding of the driving effects of these factors.

3.4. Identification of Key Habitat Factors

To accurately identify the key habitat factors influencing the risk of pine caterpillar infestation, a combination of feature importance and single-factor explanatory power analysis was used. Figure 11 presents the importance ranking of each habitat factor, with higher values indicating a greater impact on the model. The top 10 most important factors are stl1, swvl1, sd, ssr, t2m, tsn, d2m, stl1_04, swvl1_04, and u10. Figure 12 and Table 4 display the explanatory power of each factor on infestation risk, with higher q-values indicating stronger explanatory power.
The factors marked in red in Table 4 represent the top 12 factors based on their explanatory power rankings each year, having a strong influence on infestation risk. By analyzing both the ranking of feature importance and the frequency of the top twelve factors each year, eight factors—stl1, swvl1, ssr, sd, t2m, tsn, d2m, and lai_hv—emerged as key habitat factors for infestation risk. The snow factors (sd, tsn) and soil factors (stl1, swvl1) are particularly important. This underscores the significance of considering the area’s long snow cover duration and the effect of snow on soil factors, highlighting the need to include both snow and soil factors in the analysis. This approach provides a fresh perspective compared to previous studies, which primarily focused on topography, temperature, and precipitation.
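For readers unfamiliar with the q-value, the sketch below shows the standard GeoDetector factor detector, q = 1 − Σ(N_h·σ_h²)/(N·σ²), where h indexes the strata (classes) of a factor. It is a generic illustration rather than the authors' implementation. The class labels and risk values are invented for the example.

```python
from collections import defaultdict

def q_statistic(strata, risk):
    """GeoDetector factor detector: q = 1 - sum(N_h * var_h) / (N * var)."""
    n = len(risk)
    mean = sum(risk) / n
    total_ss = sum((y - mean) ** 2 for y in risk)  # equals N * population variance
    groups = defaultdict(list)
    for s, y in zip(strata, risk):
        groups[s].append(y)
    within_ss = 0.0
    for ys in groups.values():
        m = sum(ys) / len(ys)
        within_ss += sum((y - m) ** 2 for y in ys)  # equals N_h * variance in stratum h
    return 1.0 - within_ss / total_ss

# Hypothetical risk scores for six pixels.
risk = [1, 2, 1, 4, 5, 4]
# A factor whose classes separate low and high risk well (illustrative).
soil_temp_class = ["low", "low", "low", "high", "high", "high"]
# A factor whose classes are unrelated to risk (illustrative).
unrelated_class = ["a", "b", "a", "b", "a", "b"]

q_strong = q_statistic(soil_temp_class, risk)
q_weak = q_statistic(unrelated_class, risk)
print(f"q (informative factor): {q_strong:.3f}")
print(f"q (uninformative factor): {q_weak:.3f}")
```

A q-value near 1 means the factor's classes account for most of the variance in risk, and a value near 0 means they account for almost none.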

3.5. SHAP and Fitting Function Analysis

Figure 13 and Figure 14 illustrate the driving effect of habitat factors on the risk of pine caterpillar infestation, based on the mean absolute SHAP value. A longer bar on the horizontal axis indicates a greater driving effect of the factor. Key habitat factors such as ssr, sd, swvl1, stl1, t2m, d2m, tsn, and lai_hv show higher SHAP values, consistent with the factors identified through feature importance and explanatory power analysis. Some medium and low-importance factors exhibit higher SHAP values than key drivers. The SHAP values of ssr, tsn, d2m, t2m, stl1, and lai_hv are positively correlated with the factor values: low factor values are associated with SHAP values below 0, and high factor values with SHAP values above 0. In contrast, swvl1 shows a negative correlation. Factors like cvh, slope, and t2m_04 have narrower SHAP value distributions, indicating minimal impact on infestation risk.
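The two summaries used in this paragraph, the mean absolute SHAP value per factor and the direction of the relationship between factor values and SHAP values, can be sketched as follows. The sketch assumes that a matrix of SHAP values has already been computed by some explainer. The numbers below are invented, and the helper names `mean_abs_shap` and `pearson` are hypothetical.

```python
from math import sqrt

def mean_abs_shap(shap_rows, names):
    """Rank features by mean absolute SHAP value, largest first."""
    n = len(shap_rows)
    scores = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(names)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

names = ["ssr", "swvl1", "slope"]
# Hypothetical normalized factor values (rows are samples, columns are factors).
features = [[0.1, 0.9, 0.3], [0.4, 0.6, 0.5], [0.7, 0.3, 0.2], [0.9, 0.1, 0.6]]
# Hypothetical SHAP values with the same shape as the feature matrix.
shap_values = [[-0.30, -0.20, 0.01], [-0.10, -0.05, -0.01],
               [0.15, 0.10, 0.00], [0.35, 0.25, 0.02]]

ranking = mean_abs_shap(shap_values, names)
direction = {
    name: pearson([row[j] for row in features], [row[j] for row in shap_values])
    for j, name in enumerate(names)
}
for name, score in ranking:
    print(f"{name}: mean|SHAP|={score:.3f}, correlation={direction[name]:+.2f}")
```

A positive correlation corresponds to the pattern described for ssr, where larger factor values push the predicted risk up, and a negative correlation corresponds to the pattern described for swvl1.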
Figure 13 and Figure 14 illustrate the overall influence of habitat factors on pine caterpillar infestation risk but do not provide a detailed analysis of their specific relationships. To address this, key habitat factors were normalized, and SHAP value scatter plots were generated (Figure 15). The driving relationships of ssr, lai_hv, d2m, stl1, swvl1, sd, and tsn were modeled using quartic polynomials. For ssr, the trend increased gently at first, then sharply around 0.4. The lai_hv showed steady growth, transitioning from rapid to gradual around 0.4. The d2m exhibited a complex pattern of decrease, increase, and then another decrease, with trend shifts around 0.1 and 0.6. A similar pattern was observed for sd, with changes around 0.15. The tsn increased steadily, shifting from gradual to rapid growth around 0.3. The stl1 showed a steady rise, accelerating around 0.4. The swvl1 decreased, with a trend shift at approximately 0.7. The relationship for t2m was modeled using a cubic polynomial, showing an initial rapid increase, a slower rise, and another sharp increase, with trend changes at about 0.2 and 0.7.
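The normalization and polynomial fitting step can be sketched as below. This is a minimal illustration, assuming min-max normalization and ordinary least squares, neither of which is confirmed as the authors' exact procedure. The synthetic SHAP curve and the helper names `min_max`, `polyfit`, `polyval`, and `sse` are hypothetical.

```python
def min_max(values):
    """Scale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def polyfit(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations; returns c0..c_degree."""
    m = degree + 1
    # Normal equations A c = b with A[i][j] = sum x^(i+j) and b[i] = sum y * x^i.
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Forward elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            f = a[r][col] / a[col][col]
            for c in range(col, m):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        s = sum(a[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - s) / a[i][i]
    return coeffs

def polyval(coeffs, x):
    """Evaluate c0 + c1*x + c2*x^2 + ... at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

def sse(coeffs, xs, ys):
    """Sum of squared residuals of the fitted polynomial."""
    return sum((polyval(coeffs, x) - y) ** 2 for x, y in zip(xs, ys))

# Hypothetical raw factor values (e.g., a temperature in K) and a synthetic
# SHAP response generated from a known quartic, used only to check the fit.
raw = [270 + i for i in range(21)]
x = min_max(raw)
true_coeffs = [0.1, -0.5, 2.0, -3.0, 1.5]
y = [polyval(true_coeffs, xi) for xi in x]

quartic = polyfit(x, y, 4)
cubic = polyfit(x, y, 3)
print("quartic SSE:", sse(quartic, x, y))
print("cubic SSE:  ", sse(cubic, x, y))
```

Comparing the residuals of the cubic and quartic fits is one simple way to decide which polynomial order describes a factor. The study may have used a different selection criterion.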

3.6. Interaction Detector Results

Figure 16 demonstrates that the interaction between habitat factors has greater explanatory power regarding the risk of pine caterpillar infestation than individual factors alone. Light blue indicates a bi-factor enhancement, while orange represents a nonlinear enhancement. The p-values for the single-factor explanatory power of tvh, sro_04, slope, and popu_densi are close to 1, suggesting their limited reliability in explaining infestation risk, both individually and through interactions. Therefore, only the interactions among the remaining factors are discussed. Nonlinear enhancement is primarily observed between sd and tp, as well as between tp and DEM. These results underscore the complexity of the driving effects on pine caterpillar infestation risk.
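The distinction between bi-factor and nonlinear enhancement follows the usual GeoDetector interaction rules, which compare the q-value of the overlaid strata, q(X1∩X2), with q(X1), q(X2), and their sum. The sketch below applies those rules to a deliberately artificial example. The class vectors `sd_class` and `tp_class` and the risk values are hypothetical and do not reproduce the study's data.

```python
from collections import defaultdict

def q_statistic(strata, risk):
    """GeoDetector q = 1 - (within-strata sum of squares) / (total sum of squares)."""
    mean = sum(risk) / len(risk)
    total_ss = sum((y - mean) ** 2 for y in risk)
    groups = defaultdict(list)
    for s, y in zip(strata, risk):
        groups[s].append(y)
    within_ss = 0.0
    for ys in groups.values():
        m = sum(ys) / len(ys)
        within_ss += sum((y - m) ** 2 for y in ys)
    return 1.0 - within_ss / total_ss

def interaction_type(q1, q2, q12, tol=1e-9):
    """Classify an interaction from single-factor and joint q-values."""
    if q12 > q1 + q2 + tol:
        return "nonlinear enhancement"
    if abs(q12 - (q1 + q2)) <= tol:
        return "independent"
    if q12 > max(q1, q2):
        return "bi-factor enhancement"
    if q12 >= min(q1, q2):
        return "single-factor nonlinear weakening"
    return "nonlinear weakening"

# Artificial example: risk is high only when exactly one factor is in class 1,
# so neither factor explains risk alone but their overlay explains it fully.
sd_class = [0, 0, 1, 1, 0, 0, 1, 1]
tp_class = [0, 1, 0, 1, 0, 1, 0, 1]
risk = [0, 1, 1, 0, 0, 1, 1, 0]

q_sd = q_statistic(sd_class, risk)
q_tp = q_statistic(tp_class, risk)
# Overlaying two factors means stratifying by the pair of class labels.
q_joint = q_statistic(list(zip(sd_class, tp_class)), risk)

label = interaction_type(q_sd, q_tp, q_joint)
print(q_sd, q_tp, q_joint, label)
```

When the joint q-value exceeds the larger single-factor q-value but not the sum of both, the same function returns "bi-factor enhancement", which corresponds to the light blue cells described for Figure 16.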

4. Discussion

Previous studies have primarily focused on the impact of individual factors on the risk of pine caterpillar infestations, particularly climate variables (such as temperature, precipitation, and drought) and stand structure [63,64]. Moreover, much of the research has concentrated on short-term infestation factors, with a lack of long-term studies on habitat variables and insufficient integration of diverse data sources [65,66]. Previous studies suggest that pure forests, particularly coniferous forests, are more vulnerable to pine caterpillar infestations than mixed forests [67,68]. In addition, site conditions such as topography and soil characteristics also influence the risk of infestation [69,70,71]. These studies highlight the importance of considering habitat factors comprehensively in assessing the risk of pine caterpillar infestation. However, there are few studies that further explore the driving effects or relationships.
Instead of relying on traditional methods such as species distribution and niche modeling, this study introduced the APCIRD framework and MRF to assess the risk of pine caterpillar infestation. Key habitat factors identified include ssr, sd, swvl1, stl1, t2m, d2m, tsn, and lai_hv, with particular emphasis on snow (sd, tsn) and soil factors (stl1, swvl1), underscoring their significance in this study’s design. The driving relationships of these factors were modeled using quartic and cubic polynomials, revealing complex, nonlinear interactions. For example, ssr (importance value 0.0942) exhibits a rapid increase in risk beyond a certain threshold, while lai_hv (importance value 0.0334) has a strong impact until a certain limit, after which its effect diminishes. sd (importance value 0.0814) shows a U-shaped relationship with risk, with moderate levels increasing it. The swvl1 (importance value 0.0587) has a negative impact at lower levels but exacerbates risk beyond a certain threshold. This study also identifies optimal threshold ranges for habitat factors contributing to high infestation risks and demonstrates that factor interactions offer stronger explanatory power for infestation risk than individual factors. These findings align with previous research on the spatiotemporal driving effects of habitat factors on pine caterpillar infestation risk [16,72].
Unlike previous studies, this research indicates that in this study area, climate, soil, snow, and vegetation are the primary factors influencing the risk of pine caterpillar infestation, while topography and human factors play a lesser role. Frequency analysis results show the area is characterized by low to medium altitude and gentle slopes (slope < 30°). High temperatures at lower altitudes favor pine caterpillar development, while low temperatures at higher altitudes hinder it. Moderate to high surface net solar radiation enhances the risk of infestation. This study emphasizes that climate, snow, and soil factors have significant driving effects on infestation risk.
Dynamic feedback is essential to understanding insect disturbances, particularly in the context of climate change [73,74]. Global climate change is increasing the likelihood of range expansions for certain insects, thereby amplifying the risk of infestations. Climate change may also weaken the limiting effects of some habitat factors on insect distribution in specific regions [75,76]. For example, an interaction between soil moisture (swvl1) and soil temperature (stl1) was observed: moist soils warm more slowly than dry soils, suggesting that under moist conditions, swvl1 and stl1 may be negatively correlated. During the day, increased soil moisture suppresses rapid temperature rises, while at night, it slows down cooling. Consequently, habitats with high solar radiation and low soil moisture are associated with a higher risk of pine caterpillar infestations.
This study argues that the risk of pine caterpillar infestation results from the complex interaction of multiple habitat factors, closely tied to the specific conditions of the study area. It highlights the importance of considering snow and soil factors, which have been rarely addressed in previous research on infestation risk. This provides new insights for future studies in this area. However, factors like slope aspect, forest structure, and landscape pattern were not included, potentially affecting the accuracy of the risk assessment and driving effect analysis. The data resolution may also limit the ability to capture important details, suggesting that higher resolution data should be used in future research. The APCIRD framework, combined with MRF, serves as an effective tool for assessing future large-scale insect infestation risks and their driving factors.

5. Conclusions

Considering the actual conditions of the study area, this study incorporated factors such as snow accumulation and soil, and combined the APCIRD risk assessment framework with MRF to evaluate the risk and spatiotemporal variations of pine caterpillar infestation from 2019 to 2024. It identified key areas and habitat factors for infestation risk, modeled the functional relationship between these factors and infestation risk, analyzed the characteristics and optimal threshold ranges for high-risk factors, and found that factor interactions provide stronger explanatory power for infestation risk than individual factors. (1) From 2019 to 2024, the areas at high and highest pine caterpillar infestation risk levels gradually decreased, with risk levels shifting from high to low and the spatial distribution changing from concentrated to scattered. Eastern Heilongjiang and Southwest Liaoning remain key areas of focus. (2) Snow cover and soil factors play a key role in pine caterpillar infestation. ssr, sd, swvl1, stl1, t2m, d2m, tsn, and lai_hv are key habitat factors significantly impacting infestation risk. (3) Key habitat factors exhibit quartic and cubic polynomial relationships with infestation risk, with t2m following a cubic polynomial function and swvl1 showing a negative correlation, indicating nonlinear driving effects. This suggests that forestry management and protection should consider the specific relationships between habitat factors and pine caterpillar infestation risk when developing policies. (4) The characteristics and threshold ranges of factors triggering high infestation risks are mainly at low to medium levels.
Areas with high pine caterpillar infestation risk generally exhibit the following characteristics: low to moderate altitude (<800 m), moderate to high surface net solar radiation, moderate to high temperature, gentle slopes (<30°), low to moderate evaporation, low snow depth (<0.02), moderate snow temperature (266.73–275), low to moderate soil moisture (0.2–0.3), moderate to high soil temperature (276.73–286.92), low to moderate rainfall, low to moderate wind speed, low to moderate leaf area index, high vegetation type, low to moderate vegetation cover, low population density, and low surface runoff. The integration of the APCIRD framework and MRF effectively assesses infestation risk and analyzes the driving role of habitat factors.

Author Contributions

J.Z.: conceptualization, methodology, investigation, visualization, formal analysis, original draft. M.W. (Mingchang Wang): funding acquisition, review and editing, project administration. D.C.: data curation, investigation. L.W.: data curation. X.J.: review and editing. Q.D.: review and editing. F.W.: review and editing. M.W. (Minshui Wang): resources. All authors have read and agreed to the published version of the manuscript.

Funding

This research work was financially supported by the National Natural Science Foundation of China (No. 42171407, 42077242), the Key Program of the National Natural Science Foundation of China (No. 42330607), and the Scientific Research Project of Jilin Provincial Education Department (JJKH20231181KJ).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. Grêt-Regamey, A.; Weibel, B. Global Assessment of Mountain Ecosystem Services Using Earth Observation Data. Ecosyst. Serv. 2020, 46, 101213.
  2. Xie, Y.; Cheng, C.; Zhang, T.; Wu, X.; Wang, P. Donor-Side Valuation of Forest Ecosystem Services in China during 1990–2020. Energy Ecol. Environ. 2023, 8, 503–521.
  3. Cortini, F.; Comeau, P.G. Pests, Climate and Competition Effects on Survival and Growth of Trembling Aspen in Western Canada. New For. 2020, 51, 175–190.
  4. Han, D.; Wang, S.; Zhang, J.; Cui, R.; Wang, Q. Evaluating Dendrolimus superans (Lepidoptera: Lasiocampidae) Occurrence and Density Modeling with Habitat Conditions. Forests 2024, 15, 388.
  5. Schroeder, M.; Cocoş, D. Performance of the Tree-Killing Bark Beetles Ips typographus and Pityogenes chalcographus in Non-Indigenous Lodgepole Pine and Their Historical Host Norway Spruce. Agric. For. Entomol. 2018, 20, 347–357.
  6. Chen, H.; Hu, Y.; Chang, Y.; Bu, R.; Li, Y.; Liu, M. Simulating Impact of Larch Caterpillar (Dendrolimus superans) on Fire Regime and Forest Landscape in Da Hinggan Mountains, Northeast China. Chin. Geogr. Sci. 2011, 21, 575–586.
  7. Cheng, X.; Qian, G.; Song, X.; Zhang, S.; Zhou, X.; Zou, Y.; Zhang, G.; Fang, G.; Song, Y.; Bi, S. The Catastrophe Prediction Models of Dendrolimus punctatus Based on Disaster Index. Int. J. Pest Manag. 2021, 70, 616–625.
  8. Wu, S.J.; Zhu, T.H.; Qiao, T.M.; Li, S.J.; Shan, H. Prediction of the Potential Distribution of Dendrolimus Houi Lajonquiere in Sichuan of China Based on the Species Distribution Model. Appl. Ecol. Environ. Res. 2021, 19, 2227–2240.
  9. Bao, Y.; Han, A.; Zhang, J.; Liu, X.; Tong, Z.; Bao, Y. Contribution of the Synergistic Interaction between Topography and Climate Variables to Pine Caterpillar (Dendrolimus spp.) Outbreaks in Shandong Province, China. Agric. For. Meteorol. 2022, 322, 109023.
  10. Bao, Y.; Na, L.; Han, A.; Guna, A.; Wang, F.; Liu, X.; Zhang, J.; Wang, C.; Tong, S.; Bao, Y. Drought Drives the Pine Caterpillars (Dendrolimus spp.) Outbreaks and Their Prediction under Different RCPs Scenarios: A Case Study of Shandong Province, China. For. Ecol. Manag. 2020, 475, 118446.
  11. You, W.; You, H.; Wu, L.; Ji, Z.; He, D. Landscape-Level Spatiotemporal Patterns of Dendrolimus punctatus Walker and Its Driving Forces: Evidence from a Pinus massoniana Forest. Trees-Struct. Funct. 2020, 34, 553–562.
  12. Liu, N.; Zhao, X.; Zhang, X.; Zhao, J.; Wang, H.; Wu, D. Remotely Sensed Evidence of the Divergent Climate Impacts of Wind Farms on Croplands and Grasslands. Sci. Total Environ. 2023, 905, 167203.
  13. Xiang, Y.; Tang, Y.; Wang, Z.; Peng, C.; Huang, C.; Dian, Y.; Teng, M.; Zhou, Z. Seasonal Variations of the Relationship between Spectral Indexes and Land Surface Temperature Based on Local Climate Zones: A Study in Three Yangtze River Megacities. Remote Sens. 2023, 15, 870.
  14. Xiang, Y.; Yuan, C.; Cen, Q.; Huang, C.; Wu, C.; Teng, M.; Zhou, Z. Heat Risk Assessment and Response to Green Infrastructure Based on Local Climate Zones. Build. Environ. 2024, 248, 111040.
  15. Han, R.D.; Parajulee, M.; Zhong, H.; Feng, G. Effects of Environmental Humidity on the Survival and Development of Pine Caterpillars, Dendrolimus tabulaeformis (Lepidoptera: Lasiocampidae). Insect Sci. 2008, 15, 147–152.
  16. Fang, L.; Yu, Y.; Fang, G.; Zhang, X.; Yu, Z.; Zhang, X.; Crocker, E.; Yang, J. Effects of Meteorological Factors on the Defoliation Dynamics of the Larch Caterpillar (Dendrolimus superans Butler) in the Great Xing’an Boreal Forests. J. For. Res. 2021, 32, 2683–2697.
  17. Hua, H.; Wu, C.; Jassal, R.S.; Huang, J.; Liu, R.; Wang, Y. Pine Caterpillar Occurrence Modeling Using Satellite Spring Phenology and Meteorological Variables. Environ. Res. Lett. 2022, 17, 104046.
  18. Gao, H.; Wang, C.; Wang, G.; Zhu, J.; Tang, Y.; Shen, P.; Zhu, Z. A Crop Classification Method Integrating GF-3 PolSAR and Sentinel-2A Optical Data in the Dongting Lake Basin. Sensors 2018, 18, 3139.
  19. Mercier, A.; Betbeder, J.; Denize, J.; Roger, J.L.; Spicher, F.; Lacoux, J.; Roger, D.; Baudry, J.; Hubert-Moy, L. Estimating Crop Parameters Using Sentinel-1 and 2 Datasets and Geospatial Field Data. Data Brief 2021, 38, 107408.
  20. Zheng, Q.; Huang, W.; Cui, X.; Shi, Y.; Liu, L. New Spectral Index for Detecting Wheat Yellow Rust Using Sentinel-2 Multispectral Imagery. Sensors 2018, 18, 868.
  21. Booth, T.H. Checking Bioclimatic Variables That Combine Temperature and Precipitation Data before Their Use in Species Distribution Models. Austral Ecol. 2022, 47, 1506–1514.
  22. Soria-Auza, R.W.; Kessler, M.; Bach, K.; Barajas-Barbosa, P.M.; Lehnert, M.; Herzog, S.K.; Böhner, J. Impact of the Quality of Climate Models for Modelling Species Occurrences in Countries with Poor Climatic Documentation: A Case Study from Bolivia. Ecol. Model. 2010, 221, 1221–1229.
  23. Archaux, F.; Bergès, L. Optimising Vegetation Monitoring. A Case Study in A French Lowland Forest. Environ. Monit. Assess. 2008, 141, 19–25.
  24. Zhang, X.; Zhang, Z.; Wang, W.; Fang, W.T.; Chiang, Y.T.; Liu, X.; Ju, H. Vegetation Successions of Coastal Wetlands in Southern Laizhou Bay, Bohai Sea, Northern China, Influenced by the Changes in Relative Surface Elevation and Soil Salinity. J. Environ. Manag. 2021, 293, 112964.
  25. Christiansen, B. The Shortcomings of Nonlinear Principal Component Analysis in Identifying Circulation Regimes. J. Clim. 2005, 18, 4814–4823.
  26. Liu, Z.; Wang, M.; Liu, X.; Wang, F.; Li, X.; Wang, J.; Hou, G.; Zhao, S. Ecological Security Assessment and Warning of Cultivated Land Quality in the Black Soil Region of Northeast China. Land 2023, 12, 1005.
  27. Yang, X.; Hao, Z.; Liu, K.; Tao, Z.; Shi, G. An Improved Unascertained Measure-Set Pair Analysis Model Based on Fuzzy AHP and Entropy for Landslide Susceptibility Zonation Mapping. Sustainability 2023, 15, 6205.
  28. Campos, J.C.; Garcia, N.; Alírio, J.; Arenas-Castro, S.; Teodoro, A.C.; Sillero, N. Ecological Niche Models Using MaxEnt in Google Earth Engine: Evaluation, Guidelines and Recommendations. Ecol. Inform. 2023, 76, 102147.
  29. Gagula, A.; Campana, M.B.D.; Narit, M.G.; Guerrero, P.D.; Parac, E.P. Using Maxent in Quantifying the Impacts of Climate Change in Land Suitability of Abaca (Musa Testilis) in Caraga Region, Philippines. In Proceedings of the 8th Geoinformation Science Symposium 2023: Geoinformation Science for Sustainable Planet, Yogyakarta, Indonesia, 28–30 August 2023.
  30. Yalcin, M.; Sari, F.; Yildiz, A. Exploration of Potential Geothermal Fields Using MAXENT and AHP: A Case Study of the Büyük Menderes Graben. Geothermics 2023, 114, 102792.
  31. Bera, D.; Das Chatterjee, N.; Bera, S. Comparative Performance of Linear Regression, Polynomial Regression and Generalized Additive Model for Canopy Cover Estimation in the Dry Deciduous Forest of West Bengal. Remote Sens. Appl. Soc. Environ. 2021, 22, 100502.
  32. Park, S.Y.; Yoon, D.K.; Park, S.H.; Jeon, J.I.; Lee, J.M.; Yang, W.H.; Cho, Y.S.; Kwon, J.; Lee, C.M. Proposal of a Methodology for Prediction of Indoor PM2.5 Concentration Using Sensor-Based Residential Environments Monitoring Data and Time-Divided Multiple Linear Regression Model. Toxics 2023, 11, 526.
  33. Yılmaz, M. A Comparative Assessment of the Statistical Methods Based on Urban Population Density Estimation. Geocarto Int. 2023, 38, 2152494.
  34. Early, R.; Rwomushana, I.; Chipabika, G.; Day, R. Comparing, Evaluating and Combining Statistical Species Distribution Models and CLIMEX to Forecast the Distributions of Emerging Crop Pests. Pest Manag. Sci. 2022, 78, 671–683.
  35. Fitzgibbon, A.; Pisut, D.; Fleisher, D. Evaluation of Maximum Entropy (Maxent) Machine Learning Model to Assess Relationships between Climate and Corn Suitability. Land 2022, 11, 1382.
  36. Zhao, Z.; Xiao, N.; Shen, M.; Li, J. Comparison between Optimized MaxEnt and Random Forest Modeling in Predicting Potential Distribution: A Case Study with Quasipaa boulengeri in China. Sci. Total Environ. 2022, 842, 156867.
  37. Dai, X.; Wu, W.; Ji, L.; Tian, S.; Yang, B.; Guan, B.; Wu, D. MaxEnt Model-Based Prediction of Potential Distributions of Parnassia Wightiana (Celastraceae) in China. Biodivers. Data J. 2022, 10, e81073.
  38. Huercha; Song, R.; Ma, Y.; Hu, Z.; Li, Y.; Li, M.; Wu, L.; Li, C.; Dao, E.; Fan, X.; et al. MaxEnt Modeling of Dermacentor marginatus (Acari: Ixodidae) Distribution in Xinjiang, China. J. Med. Entomol. 2020, 57, 1659–1667.
  39. Zhao, J.; Ma, L.; Song, C.; Xue, Z.; Zheng, R.; Yan, X.; Hao, C. Modelling Potential Distribution of Tuta Absoluta in China under Climate Change Using CLIMEX and MaxEnt. J. Appl. Entomol. 2023, 147, 895–907.
  40. Song, J.W.; Jung, J.M.; Nam, Y.; Jung, J.K.; Jung, S.; Lee, W.H. Spatial Ensemble Modeling for Predicting the Potential Distribution of Lymantria dispar asiatica (Lepidoptera: Erebidae: Lymantriinae) in South Korea. Environ. Monit. Assess. 2022, 194, 889.
  41. Bellin, N.; Tesi, G.; Marchesani, N.; Rossi, V. Species Distribution Modeling and Machine Learning in Assessing the Potential Distribution of Freshwater Zooplankton in Northern Italy. Ecol. Inform. 2022, 69, 101682.
  42. Chen, S.; Ding, Y. Machine Learning and Its Applications in Studying the Geographical Distribution of Ants. Diversity 2022, 14, 706.
  43. El Alaoui, O.; Idri, A. Predicting the Potential Distribution of Wheatear Birds Using Stacked Generalization-Based Ensembles. Ecol. Inform. 2023, 75, 102084.
  44. Zhao, Y.; Liu, H.; Qu, W.; Luan, P.; Sun, J. Research on Geological Safety Evaluation Index Systems and Methods for Assessing Underground Space in Coastal Bedrock Cities Based on a Back-Propagation Neural Network Comprehensive Evaluation–Analytic Hierarchy Process (BPCE-AHP). Sustainability 2023, 15, 8055.
  45. Azcárate, F.M.; Seoane, J.; Silvestre, M. Factors Affecting Pine Processionary Moth (Thaumetopoea pityocampa) Incidence in Mediterranean Pine Stands: A Multiscale Approach. For. Ecol. Manag. 2023, 529, 120728.
  46. Chen, L.; Huang, J.G.; Dawson, A.; Zhai, L.; Stadt, K.J.; Comeau, P.G.; Whitehouse, C. Contributions of Insects and Droughts to Growth Decline of Trembling Aspen Mixed Boreal Forest of Western Canada. Glob. Change Biol. 2018, 24, 655–667.
  47. Arabameri, A.; Nalivan, O.A.; Saha, S.; Roy, J.; Pradhan, B.; Tiefenbacher, J.P.; Ngo, P.T.T. Novel Ensemble Approaches of Machine Learning Techniques in Modeling the Gully Erosion Susceptibility. Remote Sens. 2020, 12, 1890.
  48. Lu, S.; Ye, S.-J. Using an Image Segmentation and Support Vector Machine Method for Identifying Two Locust Species and Instars. J. Integr. Agric. 2020, 19, 1301–1313.
  49. Zhang, C.; Park, D.S.; Yoon, S.; Zhang, S. Editorial: Machine Learning and Artificial Intelligence for Smart Agriculture. Front. Plant Sci. 2023, 13, 1121468.
  50. Rafiq, D.; Bazaz, M.A. A Collection of Large-Scale Benchmark Models for Nonlinear Model Order Reduction. Arch. Comput. Methods Eng. 2023, 30, 69–83.
  51. Wang, L.; Xu, M. Regression-Based Identification and Order Reduction Method for Nonlinear Dynamic Structural Models. Proc. Inst. Mech. Eng. Part G J. Aerosp. Eng. 2023, 237, 3508–3523.
  52. Zhu, R.; Fei, Q.; Jiang, D.; Marchesiello, S.; Anastasio, D. Bayesian Model Selection in Nonlinear Subspace Identification. AIAA J. 2022, 60, 92–101.
  53. Monter-Pozos, A.; González-Estrada, E. On Testing the Skew Normal Distribution by Using Shapiro–Wilk Test. J. Comput. Appl. Math. 2024, 440, 115649.
  54. El Habib Daho, M.; Amine Chikh, M. Combining Bootstrapping Samples, Random Subspaces and Random Forests to Build Classifiers. J. Med. Imaging Health Inform. 2015, 5, 539–544.
  55. Zhu, H.; Liu, H.; Zhou, Q.; Cui, A. A XGBoost-Based Downscaling-Calibration Scheme for Extreme Precipitation Events. IEEE Trans. Geosci. Remote Sens. 2023, 61, 4103512.
  56. Lyu, J.; Zheng, P.; Qi, Y.; Huang, G. LightGBM-LncLoc: A LightGBM-Based Computational Predictor for Recognizing Long Non-Coding RNA Subcellular Localization. Mathematics 2023, 11, 602.
  57. Zhang, X.; Wu, T.; Du, Q.; Quyang, N.; Nie, W.; Liu, Y.; Gou, P.; Li, G. Spatiotemporal changes of ecosystem health and the impact of its driving factors on the Loess Plateau in China. J. Ecol. Indic. 2025, 170, 1677–1680.
  58. Pham, B.T.; Luu, C.; Van Phong, T.; Nguyen, H.D.; Van Le, H.; Tran, T.Q.; Ta, H.T.; Prakash, I. Flood Risk Assessment Using Hybrid Artificial Intelligence Models Integrated with Multi-Criteria Decision Analysis in Quang Nam Province, Vietnam. J. Hydrol. 2021, 592, 125815.
  59. Liu, C.; Li, W.; Wang, W.; Zhou, H.; Liang, T.; Hou, F.; Xu, J.; Xue, P. Quantitative Spatial Analysis of Vegetation Dynamics and Potential Driving Factors in a Typical Alpine Region on the Northeastern Tibetan Plateau Using the Google Earth Engine. Catena 2021, 206, 105500.
  60. Hodkinson, I.D. Terrestrial Insects along Elevation Gradients: Species and Community Responses to Altitude. Biol. Rev. Camb. Philos. Soc. 2005, 80, 489–513.
  61. Marini, L.; Fontana, P.; Klimek, S.; Battisti, A.; Gaston, K.J. Impact of Farm Size and Topography on Plant and Insect Diversity of Managed Grasslands in the Alps. Biol. Conserv. 2009, 142, 394–403.
  62. Hamann, A.; Wang, T. Potential Effects of Climate Change on Ecosystem and Tree Species Distribution in British Columbia. Ecology 2006, 87, 2773–2786.
  63. Gazol, A.; Hernández-Alonso, R.; Camarero, J.J. Patterns and Drivers of Pine Processionary Moth Defoliation in Mediterranean Mountain Forests. Front. Ecol. Evol. 2019, 7, 458.
  64. Marini, L.; Ayres, M.P.; Battisti, A.; Faccoli, M. Climate Affects Severity and Altitudinal Distribution of Outbreaks in an Eruptive Bark Beetle. Clim. Chang. 2012, 115, 327–341.
  65. Haynes, K.J.; Liebhold, A.M.; Lefcheck, J.S.; Morin, R.S.; Wang, G. Climate Affects the Outbreaks of a Forest Defoliator Indirectly through Its Tree Hosts. Oecologia 2022, 198, 407–418.
  66. Lai, H.; Hales, S.; Woodward, A.; Walker, C.; Marks, E.; Pillai, A.; Chen, R.X.; Morton, S.M. Effects of Heavy Rainfall on Waterborne Disease Hospitalizations among Young Children in Wet and Dry Areas of New Zealand. Environ. Int. 2020, 145, 106136.
  67. Aoki, C.F.; Cook, M.; Dunn, J.; Finley, D.; Fleming, L.; Yoo, R.; Ayres, M.P. Old Pests in New Places: Effects of Stand Structure and Forest Type on Susceptibility to a Bark Beetle on the Edge of Its Native Range. For. Ecol. Manag. 2018, 419–420, 206–219.
  68. Bognounou, F.; De Grandprè, L.; Pureswaran, D.S.; Kneeshaw, D. Temporal Variation in Plant Neighborhood Effects on the Defoliation of Primary and Secondary Hosts by an Insect Pest. Ecosphere 2017, 8, e01759.
  69. Dodds, K.J.; Aoki, C.F.; Arango-Velez, A.; Cancelliere, J.; D’Amato, A.W.; DiGirolomo, M.F.; Rabaglia, R.J. Expansion of Southern Pine Beetle into Northeastern Forests: Management and Impact of a Primary Bark Beetle in a New Region. J. For. 2018, 116, 178–191.
  70. Sánchez-Cuesta, R.; Ruiz-Gómez, F.J.; Duque-Lazo, J.; González-Moreno, P.; Navarro-Cerrillo, R.M. The Environmental Drivers Influencing Spatio-Temporal Dynamics of Oak Defoliation and Mortality in Dehesas of Southern Spain. For. Ecol. Manag. 2021, 485, 118946.
  71. Walter, J.A.; Platt, R.V. Multi-Temporal Analysis Reveals That Predictors of Mountain Pine Beetle Infestation Change during Outbreak Cycles. For. Ecol. Manag. 2013, 302, 308–318.
  72. Figueredo, L.; Villa-Murillo, A.; Colmenarez, Y.; Vásquez, C. A Hybrid Artificial Intelligence Model for Aeneolamia varia (Hemiptera: Cercopidae) Populations in Sugarcane Crops. J. Insect Sci. 2021, 21, 11.
  73. DeRose, R.J.; Bentz, B.J.; Long, J.N.; Shaw, J.D. Effect of Increasing Temperatures on the Distribution of Spruce Beetle in Engelmann Spruce Forests of the Interior West, USA. For. Ecol. Manag. 2013, 308, 198–206.
  74. Lalande, B.M.; Hughes, K.; Jacobi, W.R.; Tinkham, W.T.; Reich, R.; Stewart, J.E. Subalpine Fir Mortality in Colorado Is Associated with Stand Density, Warming Climates and Interactions among Fungal Diseases and the Western Balsam Bark Beetle. For. Ecol. Manag. 2020, 466, 118133.
  75. Bajwa, A.A.; Farooq, M.; Al-Sadi, A.M.; Nawaz, A.; Jabran, K.; Siddique, K.H.M. Impact of Climate Change on Biology and Management of Wheat Pests. Crop Prot. 2020, 137, 105304.
  76. Ma, C.S.; Ma, G.; Pincebourde, S. Survive a Warming Climate: Insect Responses to Extreme High Temperatures. Annu. Rev. Entomol. 2021, 66, 163–184.
Figure 1. Study area location, pine forest distribution, and historical pine caterpillar infestation areas.
Figure 2. Distribution of field survey sites in 2019 and 2020.
Figure 3. Analysis of Shapiro–Wilk normality test.
Figure 4. Framework for pine caterpillar infestation risk assessment and habitat factor analysis.
Figure 5. Changes in pine caterpillar infestation risk levels from 2019 to 2024.
Figure 6. Pixel counts for each pine caterpillar infestation risk level from 2019 to 2024.
Figure 7. Spatial distribution of pine caterpillar infestation risk from 2019 to 2024.
Figure 8. Frequency analysis results of 2019 (the colored and white bars represent areas with higher and highest risk, respectively).
Figure 9. Frequency analysis results of 2020 (the colored and white bars represent areas with higher and highest risk, respectively).
Figure 10. Frequency analysis results of 2023 (the colored and white bars represent areas with higher and highest risk, respectively).
Figure 11. Importance ranking of habitat factors.
Figure 12. Explanatory power of single factors on infestation risk from 2019 to 2024.
Figure 13. SHAP interpretability analysis.
Figure 14. Beeswarm plot of SHAP values.
Figure 15. Driving relationships between key habitat factors and infestation risk.
Figure 16. Interaction detector analysis results.
Table 1. Habitat factor data.
| Type | Factor description | Abbreviation (unit) |
|---|---|---|
| Climate | 5-year average surface net solar radiation | ssr (J/m²) |
| Climate | 5-year average 2 m temperature | t2m (K) |
| Climate | 5-year average 2 m dewpoint temperature | d2m (K) |
| Climate | 5-year average 10 m u-component of wind | u10 (m/s) |
| Climate | 5-year average 10 m v-component of wind | v10 (m/s) |
| Climate | 5-year average evaporation | e (m of weq) |
| Climate | 5-year average total precipitation | tp (m) |
| Climate | average total precipitation in April | tp_04 (m) |
| Climate | average 2 m temperature in April | t2m_04 (K) |
| Climate | average surface runoff in April | sro_04 (m) |
| Vegetation | 5-year average high vegetation coverage | cvh (0–1) |
| Vegetation | 5-year average leaf area index of high vegetation | lai_hv (m²/m²) |
| Vegetation | 5-year average high vegetation types | tvh |
| Soil | 5-year average soil moisture at 0–7 cm depth | swvl1 (m³/m³) |
| Soil | 5-year average temperature at 0–7 cm depth | stl1 (K) |
| Soil | monthly mean soil moisture in April of the year of occurrence | swvl1_04 (m³/m³) |
| Soil | monthly mean temperature in April of the year of occurrence | stl1_04 (K) |
| Snow | 5-year average temperature of snow layer | tsn (K) |
| Snow | 5-year average snow depth | sd (m of weq) |
| Snow | average snow depth in April | sd_04 (m of weq) |
| Topography | digital elevation model | DEM (m) |
| Topography | slope | slope |
| Human | 5-year average population density | popu_densi |
Note: “m of weq” stands for “meters of water equivalent”.
Table 2. Accuracy of pine caterpillar infestation risk assessment models using different methods.
| Method | Class | Precision | Recall | F1-Score | Val_Accuracy |
|---|---|---|---|---|---|
| MRF | No Occur | 0.97 | 0.99 | 0.98 | 0.9748 |
| MRF | Occur | 0.97 | 0.90 | 0.95 | |
| RF | No Occur | 0.95 | 0.99 | 0.97 | 0.9505 |
| RF | Occur | 0.96 | 0.86 | 0.91 | |
| XGBoost | No Occur | 0.93 | 0.98 | 0.96 | 0.9385 |
| XGBoost | Occur | 0.95 | 0.83 | 0.89 | |
| LGBM | No Occur | 0.94 | 0.98 | 0.96 | 0.9417 |
| LGBM | Occur | 0.95 | 0.85 | 0.90 | |
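The F1-Score column in Table 2 can be related to the Precision and Recall columns through the standard harmonic-mean definition. The sketch below is illustrative only: the function name `f1_score` is hypothetical, and the authors presumably computed their scores from unrounded values, so recomputing from the rounded table entries need not reproduce every reported figure.

```python
def f1_score(precision: float, recall: float) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    if precision + recall == 0:
        # Convention assumed here: F1 is 0 when both inputs are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded RF "Occur" entries from Table 2 (precision 0.96, recall 0.86).
rf_occur_f1 = f1_score(0.96, 0.86)
print(f"RF, Occur: F1 = {rf_occur_f1:.2f}")
```

Because F1 is a harmonic mean, it is pulled toward the lower of the two inputs, which is why the "Occur" class scores sit closer to its recall than to its precision.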
Table 3. Pine caterpillar infestation risk levels and their corresponding ranges.
| | Lowest | Lower | Medium | Higher | Highest |
|---|---|---|---|---|---|
| Probability value | 0–0.25 | 0.25–0.50 | 0.50–0.70 | 0.70–0.85 | 0.85–1.00 |
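The thresholds in Table 3 amount to a simple binning of the predicted probability. A minimal sketch follows; the function name `risk_level` is hypothetical, and because the table does not say which level owns a shared boundary (for example 0.25), the code assumes each boundary belongs to the higher level.

```python
from bisect import bisect_right

# Interior cut points and level names taken from Table 3.
CUTS = [0.25, 0.50, 0.70, 0.85]
LEVELS = ["Lowest", "Lower", "Medium", "Higher", "Highest"]


def risk_level(probability: float) -> str:
    """Map a predicted infestation probability in [0, 1] to a risk level.

    Assumption: a value exactly on a cut point falls in the higher level.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    # bisect_right counts how many cut points are <= probability,
    # which is exactly the index of the matching level.
    return LEVELS[bisect_right(CUTS, probability)]


for p in (0.10, 0.25, 0.60, 0.80, 0.90):
    print(p, risk_level(p))
```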
Table 4. Changes in the explanatory power of single factors on infestation risk from 2019 to 2024.
| Factor | 2019 | 2020 | 2021 | 2022 | 2023 | 2024 |
|---|---|---|---|---|---|---|
| d2m | 0.9865 | 0.9211 | 0.9437 | 0.9193 | 0.9437 | 0.9553 |
| e | 0.9642 | 0.8111 | 0.8355 | 0.8089 | 0.8355 | 0.9513 |
| cvh | 0.9739 | 0.9463 | 0.9598 | 0.9322 | 0.9598 | 0.9383 |
| lai_hv | 0.9846 | 0.9529 | 0.9614 | 0.9393 | 0.9614 | 0.9562 |
| sd | 0.7452 | 0.1751 | 0.3443 | 0.2967 | 0.3443 | 0.9615 |
| ssr | 0.9815 | 0.9279 | 0.9223 | 0.8983 | 0.9223 | 0.9595 |
| stl1 | 0.9860 | 0.9473 | 0.9509 | 0.9277 | 0.9509 | 0.9524 |
| swvl1 | 0.9868 | 0.9616 | 0.9565 | 0.9392 | 0.9565 | 0.9602 |
| tvh | 0.2477 | 0.1472 | 0.2414 | 0.2893 | 0.2414 | 0.2049 |
| t2m | 0.9870 | 0.9454 | 0.9526 | 0.9263 | 0.9526 | 0.9560 |
| tp | 0.9249 | 0.7601 | 0.7484 | 0.7441 | 0.7484 | 0.9524 |
| tsn | 0.9855 | 0.9312 | 0.9395 | 0.9138 | 0.9395 | 0.9477 |
| u10 | 0.9818 | 0.8495 | 0.8671 | 0.8475 | 0.8671 | 0.9616 |
| v10 | 0.9863 | 0.8983 | 0.9196 | 0.9119 | 0.9196 | 0.9616 |
| sro_04 | 0.0901 | 0.0779 | 0.1760 | 0.2003 | 0.1760 | 0.3464 |
| sd_04 | 0.1627 | 0.1819 | 0.1979 | 0.2719 | 0.1979 | 0.8530 |
| tp_04 | 0.7016 | 0.7572 | 0.8022 | 0.7637 | 0.8022 | 0.2371 |
| t2m_04 | 0.9288 | 0.9275 | 0.9437 | 0.9272 | 0.9437 | 0.9567 |
| swvl1_04 | 0.9671 | 0.9577 | 0.9692 | 0.9474 | 0.9692 | 0.9611 |
| stl1_04 | 0.9407 | 0.9411 | 0.9491 | 0.9284 | 0.9491 | 0.9572 |
| popu_densi | 0.1389 | 0.2135 | 0.2626 | 0.2088 | 0.2626 | 0.1719 |
| DEM | 0.2193 | 0.2018 | 0.2398 | 0.2555 | 0.2398 | 0.3126 |
| slope | 0.1395 | 0.1291 | 0.1472 | 0.1853 | 0.1472 | 0.1746 |
Note: The p-values of tvh, sro_04, popu_densi, and slope are close to 1, indicating unreliable explanatory power, while the other factors have p-values below 0.05, showing reliable explanatory power.
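The explanatory power values in Table 4 are presumably q statistics from the GeoDetector factor detector, which compares the variance of the response within the strata of a factor against its total variance: q = 1 - SSW/SST. The sketch below shows that calculation under stated assumptions: the function name `geodetector_q` and the toy data are hypothetical, and how the authors discretized continuous factors into strata is not given in this section.

```python
from collections import defaultdict


def _sum_of_squares(xs):
    """Sum of squared deviations from the mean (N times the population variance)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)


def geodetector_q(values, strata):
    """Factor-detector q statistic: 1 - (within-strata SS) / (total SS).

    values: response at each sample (e.g. infestation risk).
    strata: stratum label of the factor at the same sample.
    """
    if len(values) != len(strata):
        raise ValueError("values and strata must have the same length")
    total_ss = _sum_of_squares(values)
    if total_ss == 0:
        raise ValueError("response has zero variance; q is undefined")
    groups = defaultdict(list)
    for y, h in zip(values, strata):
        groups[h].append(y)
    within_ss = sum(_sum_of_squares(g) for g in groups.values())
    return 1 - within_ss / total_ss


# Toy data (hypothetical): risk is constant within each stratum.
risk = [0.1, 0.1, 0.9, 0.9]
print(geodetector_q(risk, ["low", "low", "high", "high"]))
```

A q near 1 means the strata of the factor account for almost all of the variation in risk, while a q near 0 means the stratification explains almost none of it.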
Share and Cite

MDPI and ACS Style

Zhao, J.; Wang, M.; Cai, D.; Wu, L.; Ji, X.; Ding, Q.; Wang, F.; Wang, M. Spatiotemporal Changes of Pine Caterpillar Infestation Risk and the Driving Effect of Habitat Factors in Northeast China. Remote Sens. 2025, 17, 1738. https://doi.org/10.3390/rs17101738

Chicago/Turabian Style

Zhao, Jingzheng, Mingchang Wang, Dong Cai, Linlin Wu, Xue Ji, Qing Ding, Fengyan Wang, and Minshui Wang. 2025. "Spatiotemporal Changes of Pine Caterpillar Infestation Risk and the Driving Effect of Habitat Factors in Northeast China" Remote Sensing 17, no. 10: 1738. https://doi.org/10.3390/rs17101738
