Wetlands in the mid- and high-latitudes account for approximately 64% of the world’s natural wetlands [1] and provide ecological functions and services (e.g., biodiversity, carbon sink and storage, water conservation, hydrological adjustment, climatic regulation, and wildlife habitats) [2]. These wetlands are particularly vulnerable to environmental changes [4]. Recent studies showed that extensive wetlands disappeared in the mid- and high-latitudes during the last century, and this trend is predicted to continue throughout the 21st century [5]. Climate change and human activities are arguably the most important factors driving wetland changes [6]. Climate change affects wetland distribution directly by altering hydrological processes [7] and indirectly by changing soil temperature, biogeochemical cycles, and vegetation dynamics [8]. Human activities such as urbanization [10], agricultural reclamation [12], reservoir construction [13], aquaculture [14], and overgrazing [15] can lead to immediate changes in wetland area. Both climate change and human activities are likely to have negative consequences for wetland distribution.
Previous studies have analyzed changes in wetland distribution under climate change scenarios derived from the Special Report on Emissions Scenarios (SRES) [16]. Antoine, et al. [18] reported that climate change would reduce wetland area in Northwest France by 5.3–13.6% between the periods 1961–2000 and 2081–2100, using climate change scenarios derived from 14 general circulation models (GCMs) for the A1B greenhouse gas (GHG) emission scenario. Nicholls [19] concluded that 5–20% of coastal wetlands would be lost by the 2080s in the A1FI world downscaled from the HadCM3 model. However, earlier climate warming scenarios (e.g., SRES [18]) did not include socioeconomic drivers [21], whereas the Representative Concentration Pathways (RCPs) account for the effects of various combinations of economic, technological, and demographic development [17]. Additionally, the IPCC 5th Assessment Report pointed out that climate change simulated under the RCP scenarios would alter precipitation and melt ice and snow, and that the geographic distribution of land and water would change in response [24]. Thus, RCP scenarios are needed to predict the future effects of climate change on wetland distribution.
Many models have been developed to investigate future wetland distribution changes in the mid- and high-latitudes under climate change and human activities, such as the zero-inflation model [27], maximum entropy model [28], logistic regression model [30], wetland landscape model [31], and cellular automata-Markov model [32]. Predictions using the maximum entropy model suggested that wetlands would decrease greatly over the 21st century in a mid- and high-latitude region of Northeast China [28]. Predictions using the logistic regression model suggested that the mean patch area, shape, and aggregation of the marsh landscape would decrease with climate warming over the 21st century in a mid- and high-latitude region of Northeast China [34]. Predictions using the cellular automata-Markov model suggested that wetlands affected by human activities would undergo substantial loss in a mid- and high-latitude region of Turkey in 2023 [33]. However, most studies of wetland distribution changes emphasized either climate change or human activities alone [35] and rarely considered both. Therefore, future wetland changes in the mid- and high-latitudes, and the combined effects of climate change and human activities on these changes, remain uncertain.
In our study, we used the Random Forest model [36] to analyze the relative importance of driving variables and to predict the temporal and spatial changes in wetland distribution under climate change and human activities over the 21st century in a mid- and high-latitude region of Northeast China. We used the historical wetland distribution and driving factors including climate, hydrology, topography, and human activities to build the model and predict the wetland distribution changes at a spatial resolution of 200 m. Our research questions were the following: (1) what is the relative importance of climate change and human activities in driving historical wetland distribution changes? (2) how will wetland distribution change under the combined effects of climate change and human activities over the 21st century?
2. Materials and Methods
2.1. Study Area
Our study area was located in the Heilongjiang River Basin of Northeast China, a mid- and high-latitude region extending from 123°33′ to 127°31′ E and 46°10′ to 48°25′ N with a total area of 3,408,965 ha (Figure 1). The region has a temperate, continental, semi-humid, semi-arid monsoon climate, with long, cold winters and warm summers. Annual mean temperature is 2.74 °C and annual mean precipitation is 469.46 mm. The Heilongjiang River Basin contains the most abundant wetlands in Northeast China, but these wetlands have suffered serious human disturbance from rapid population growth and from extensive agricultural reclamation driven by the region’s suitable farming conditions.
2.2. Historical Wetland Data
We used cloud-free Landsat Thematic Mapper (TM) and Enhanced Thematic Mapper Plus (ETM+) images with a spatial resolution of 30 m for the 1990s, 2000s, and 2010s to identify the wetland distribution in our study area. Landsat TM/ETM+ satellite images were downloaded from the USGS Center for Earth Resources Observation and Science (http://glovis.usgs.gov). We applied radiometric calibration and the FLAASH atmospheric correction model in ENVI 5.2 to remove radiometric and atmospheric effects from all images. We applied object-oriented classification methods to classify the wetland distribution [37] and improved the accuracy of wetland classification by incorporating environmental factors related to wetland distribution, such as the DEM [39]. We validated the wetland classification results for the 1990s and 2000s by comparison with other studies [40], which indicated that the classification accuracy for both periods was above 90%. We validated the wetland classification results of the 2010s using field data, with an accuracy of 85%. We resampled the classification results of the 1990s, 2000s, and 2010s to a resolution of 90 m.
2.3. Climate Data and Climate Change Scenarios
We included a current climate scenario and three climate change scenarios. We chose five GCMs (IPSL-CM5A-MR, MIROC5, MIROC-ESM-CHEM, MRI-CGCM3, and NorESM1-M) from CMIP5 by comparing the root-mean-square errors (RMSEs) between the simulated historical and observed climate data [23]. A smaller RMSE indicated that the corresponding model performed better [22]. Thus, we considered that the five GCMs performed relatively well [21]. We used the historical climate data rather than the projected current climate data from the ensemble of the five GCMs in our modelling scenarios [20]. We selected the RCP 2.6, RCP 4.5, and RCP 8.5 emission scenarios from the GCMs, representing the lowest, intermediate, and highest increases in temperature in this region, respectively [42]. The historical monthly climate data (1960s–2009) were derived from the China Meteorological Administration and the Meteorological Data Center (http://data.cma.cn/site/index.html). The climate scenario data (2010s–2099) were obtained from CMIP5 (http://cmip-pcmdi.llnl.gov/cmip5/). We resampled all climate data to a 0.5° resolution grid because of the different resolutions among GCMs [45]. We assembled the annual mean temperature and annual mean precipitation from the five GCMs under the RCP 2.6, RCP 4.5, and RCP 8.5 emission scenarios for the 2040s (2010s–2039), 2070s (2040s–2069), and 2100s (2070s–2099). Annual mean temperature and annual mean precipitation changed slightly under the RCP 2.6 emission scenario, increased under the RCP 4.5 emission scenario, and increased dramatically under the RCP 8.5 emission scenario (Table 1).
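The GCM screening and multi-model averaging described above can be sketched as follows. This is only an illustration: the function names, the model labels, and all numbers are hypothetical, and treating the assembled series as a simple unweighted mean across models is an assumption rather than a detail stated in the text.

```python
import math


def rmse(simulated, observed):
    """Root-mean-square error between two equal-length series."""
    if len(simulated) != len(observed):
        raise ValueError("series must have the same length")
    squared = [(s - o) ** 2 for s, o in zip(simulated, observed)]
    return math.sqrt(sum(squared) / len(squared))


def rank_models(hindcasts, observed):
    """Return model names ordered from smallest to largest RMSE."""
    return sorted(hindcasts, key=lambda name: rmse(hindcasts[name], observed))


def ensemble_mean(projections):
    """Unweighted element-wise mean across models (an assumed scheme)."""
    series = list(projections.values())
    return [sum(values) / len(values) for values in zip(*series)]


# Hypothetical annual mean temperatures (deg C), not data from the study.
observed = [2.5, 2.8, 2.6, 3.0]
hindcasts = {
    "model_a": [2.4, 2.9, 2.7, 3.1],
    "model_b": [3.5, 3.9, 3.4, 4.2],
}

ranking = rank_models(hindcasts, observed)   # best-performing model first
mean_series = ensemble_mean(hindcasts)       # one value per time step
```

In this toy case `model_a` ranks first because its simulated series stays closer to the observations.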
2.4. Topographical Variables, Hydrological Variables, Climatic Variables, and Human Activity Variables
We selected topography, hydrology, climate, and human activities as driving factors, which included 15 driving variables. Because of the inconsistent resolutions of driving variables, we resampled the maps of the wetland and driving variables to 200-m cells.
We included the warmness index, coldness index, annual mean precipitation, humidity index, annual mean temperature, potential evapotranspiration ratio, and annual biological temperature as climatic variables [20]. We calculated these variables for four periods, which included the current period (1960s–2009) and future periods (2010s–2039, 2040s–2069, and 2070s–2099). We calculated the warmness index and coldness index based on Kira’s method [46], the humidity index based on Kira’s WI [48], and the annual biological temperature and potential evapotranspiration ratio based on the revised Holdridge’s method [49].
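For readers unfamiliar with these indices, the sketch below shows their commonly cited textbook forms. It is an assumption that the study used exactly these formulations: Kira's indices are taken with a 5 °C threshold, the humidity index is taken as annual precipitation divided by the warmness index, and the Holdridge quantities use the standard (not the revised) definitions, with monthly means outside 0–30 °C counted as zero and the constant 58.93. All function names and input values are hypothetical.

```python
def warmness_index(monthly_temps_c, threshold=5.0):
    """Kira's warmness index: sum of (t - 5) over months warmer than 5 deg C."""
    return sum(t - threshold for t in monthly_temps_c if t > threshold)


def coldness_index(monthly_temps_c, threshold=5.0):
    """Kira's coldness index: minus the sum of (5 - t) over months colder than 5 deg C."""
    return -sum(threshold - t for t in monthly_temps_c if t < threshold)


def humidity_index(annual_precip_mm, wi):
    """Assumed form: annual precipitation (mm) divided by the warmness index."""
    if wi <= 0:
        raise ValueError("warmness index must be positive")
    return annual_precip_mm / wi


def annual_biotemperature(monthly_temps_c):
    """Standard Holdridge biotemperature: monthly means outside 0-30 deg C count as 0."""
    adjusted = [t if 0.0 < t <= 30.0 else 0.0 for t in monthly_temps_c]
    return sum(adjusted) / len(adjusted)


def potential_evapotranspiration_ratio(abt, annual_precip_mm):
    """Standard Holdridge ratio: (58.93 * biotemperature) / annual precipitation."""
    return 58.93 * abt / annual_precip_mm


# Hypothetical monthly mean temperatures (deg C) and precipitation (mm).
temps = [-20.0, -15.0, -5.0, 5.0, 13.0, 19.0, 22.0, 20.0, 13.0, 4.0, -7.0, -17.0]
precip = 470.0

wi = warmness_index(temps)
ci = coldness_index(temps)
hi = humidity_index(precip, wi)
abt = annual_biotemperature(temps)
per = potential_evapotranspiration_ratio(abt, precip)
```

A month at exactly 5 °C contributes to neither Kira index in this sketch.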
We included aspect, slope, and elevation as topography variables [29]. We obtained slope, aspect, and elevation from the Digital Elevation Model (DEM) that was derived from the latest earth electronic topographic data of 2009 jointly launched by the National Aeronautics and Space Administration and the Ministry of Economy, Trade, and Industry of Japan. We used distance to water body as a hydrological variable because other hydrologic data, such as groundwater data, were not accessible [52].
We included agricultural population proportion [53], paddy field area proportion, dry farmland area proportion, and distance to roads [52] as human activity variables [54]. The census data, including population and agricultural cultivation (1990–2009), were derived from the China Data Sharing Infrastructure of Earth System Science (http://www.geodata.cn/). Distance to roads was calculated from the 2009 road map. Road maps were evaluated only for 2009 because historical road maps were not available.
2.5. Model Performance, Validation, and Prediction
We analyzed the relative importance of 15 driving variables in historical wetland distribution changes using the Random Forest model by calculating the values of Mean Decrease Accuracy (MDA). The Random Forest model is suitable for explaining nonlinear and collinear relationships among driving variables and is capable of handling a flexible number of input variables [55]. Random Forest is an ensemble learning technique that implements Breiman’s random forest algorithm based on a combination of many decision trees [56]. For each tree in the Random Forest model, a random set of variables and a random sample from the training dataset are selected [58]. MDA in Random Forest can be used to rank variable importance. MDA quantifies variable importance by measuring the change in Random Forest prediction accuracy when the values of a variable are randomly permuted, compared with the original observations [59]. A larger MDA value denotes a more important variable [61]. The Random Forest model we used was from a package in R (http://www.R-project.org).
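The idea behind MDA can be illustrated with a short permutation experiment. The sketch below is not the R implementation used in the study: it replaces the Random Forest with a hypothetical fixed classifier (`toy_predict`) and permutes each variable over one labelled dataset, whereas a Random Forest typically evaluates the accuracy decrease tree by tree on out-of-bag samples. The data, names, and the 500 m threshold are invented for illustration.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted class matches the label."""
    hits = sum(1 for row, label in zip(rows, labels) if predict(row) == label)
    return hits / len(rows)


def mean_decrease_accuracy(predict, rows, labels, n_features, n_repeats=20, seed=0):
    """Average drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)  # break the link between feature j and the labels
            permuted = [row[:j] + [value] + row[j + 1:]
                        for row, value in zip(rows, column)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_predict(row):
    """Hypothetical stand-in classifier: wetland (1) if feature 0 is below 500."""
    return 1 if row[0] < 500.0 else 0


# Feature 0: hypothetical distance to water (m); feature 1: an unrelated variable.
rows = [[100.0, 3.0], [200.0, 9.0], [300.0, 1.0], [400.0, 7.0], [450.0, 5.0],
        [600.0, 2.0], [700.0, 8.0], [800.0, 4.0], [900.0, 6.0], [1000.0, 0.0]]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

mda = mean_decrease_accuracy(toy_predict, rows, labels, n_features=2)
```

Shuffling the variable that the classifier relies on lowers accuracy, while shuffling the unrelated variable leaves it unchanged, so the first variable receives the larger importance.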
We used the Random Forest model to predict future wetland distributions based on the relationship between historical wetland distribution and the driving variables. We first built the model using the historical wetland datasets for the 1990s and 2000s. Specifically, we divided the study area into approximately 850,000 cells. We used all 15 driving variables in the Random Forest model to explain wetland distribution in each grid cell and ranked the driving variables by importance. To build a statistically testable Random Forest model, we sampled 50% of the cells where wetland occurred and 20% of the non-wetland cells from the 1990s to 2000s as the training datasets. Secondly, we predicted wetland distributions for the 2010s and validated these predictions against the observed wetland data by comparing the predicted and observed wetland area during the 2010s. We also compared the predicted and observed wetland spatial distributions using receiver operating characteristics (ROC), which plot sensitivity on the y axis against 1-specificity on the x axis for all possible thresholds (Phillips et al., 2006), and characterized the model performance using the area under the curve (AUC) (Phillips et al., 2006). AUC values range from zero (very poor model accuracy) to one (perfect fit between observations and predictions) (Swets, 1988) and are described as follows: poor (0.5–0.70), good (0.70–0.90), and excellent (0.90–1) (He et al., 2013). We calculated the ROC curve and AUC directly from the Random Forest model predictions for the 2010s.
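As an illustration of the validation step, the sketch below computes the AUC through its rank interpretation: the probability that a randomly chosen wetland cell receives a higher predicted score than a randomly chosen non-wetland cell, with ties counted as one half. This equals the area under the ROC curve. The scores and labels are hypothetical, and assigning boundary values (0.70, 0.90) to the higher class is an assumption, since the ranges quoted above share their end points.

```python
def roc_auc(scores, labels):
    """AUC as the share of (wetland, non-wetland) pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


def rate_auc(auc):
    """Map an AUC value to the classes quoted in the text (boundaries assumed)."""
    if auc >= 0.90:
        return "excellent"
    if auc >= 0.70:
        return "good"
    if auc >= 0.50:
        return "poor"
    return "below 0.5 (not rated in the text)"


# Hypothetical predicted wetland probabilities and observed classes (1 = wetland).
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]

auc = roc_auc(scores, labels)
rating = rate_auc(auc)
```

The pairwise loop is quadratic in the number of cells, so for a grid of hundreds of thousands of cells a rank-based or library implementation would be preferable.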
Finally, we predicted the future wetland distributions for the 2040s, 2070s, and 2100s under the RCP 2.6, RCP 4.5, and RCP 8.5 emission scenarios. We summarized the differences in predicted wetland area among emission scenarios for the 2040s, 2070s, and 2100s. To capture the spatial changes in wetland distribution for the 2040s, 2070s, and 2100s, we calculated and mapped loss, gain, and persistence rates, corresponding to cells where wetland changed from present to absent, changed from absent to present, or remained present, respectively, compared with the wetland distribution under the current climate scenario.
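The cell-by-cell comparison behind the loss, gain, and persistence maps can be sketched as follows. The grids are hypothetical flattened presence/absence rasters, and expressing each rate relative to the number of wetland cells under the current climate scenario is an assumption, because the text does not state the denominator.

```python
def change_summary(current, future):
    """Count wetland loss, gain, and persistence between two 0/1 grids."""
    if len(current) != len(future):
        raise ValueError("grids must have the same number of cells")
    loss = gain = persistence = 0
    for now, later in zip(current, future):
        if now == 1 and later == 0:
            loss += 1          # present -> absent
        elif now == 0 and later == 1:
            gain += 1          # absent -> present
        elif now == 1 and later == 1:
            persistence += 1   # present in both
    baseline = sum(1 for cell in current if cell == 1)
    if baseline == 0:
        raise ValueError("no wetland cells in the current grid")
    return {
        "loss": loss,
        "gain": gain,
        "persistence": persistence,
        # Rates relative to current wetland cells (assumed denominator).
        "loss_rate": loss / baseline,
        "gain_rate": gain / baseline,
        "persistence_rate": persistence / baseline,
    }


# Hypothetical flattened grids: 1 = wetland, 0 = non-wetland.
current = [1, 1, 1, 1, 0, 0, 0, 0]
future = [1, 0, 0, 1, 1, 0, 0, 0]

summary = change_summary(current, future)
```

With this denominator the loss and persistence rates always sum to one, while the gain rate is reported separately.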
In our study, we quantified the importance of 15 driving variables to explore the relative importance of climatic factors and human activity factors in driving historical wetland distribution changes. We predicted wetland distributions under the RCP 2.6, RCP 4.5, and RCP 8.5 emission scenarios in a mid- and high-latitude region, which helps anticipate wetland distribution changes driven by the combined effects of climate change and human activities over the 21st century. Our results showed that the variables with high importance scores in driving historical wetland distribution changes included agricultural population proportion, warmness index, distance to water body, coldness index, and annual mean precipitation. We found that, on average, climatic factors had larger effects than human activity factors on wetland distribution changes over recent decades, and that human activity has accelerated wetland changes. Average predicted wetland distribution decreased dramatically across the three periods investigated. Predicted wetland distribution changes were mainly in the southern portion of the study area, where most wetlands are located. The losses in predicted wetland distribution occurred around agricultural lands and expanded continually northward across the whole region over time, whereas gains in predicted wetland distribution were associated with grasslands and water in the southernmost portion of the region. As human activities intensify and climate change progresses in the mid- and high-latitudes, our findings provide information for wetland resource management and restoration.