R-MFNet: Analysis of Urban Carbon Stock Change against the Background of Land-Use Change Based on a Residual Multi-Module Fusion Network

Abstract: Regional land-use change is the leading cause of ecosystem carbon stock change; investigating the response of carbon stock to land-use/cover change (LUCC) is essential to achieving the strategic goal of “double carbon” in a region. This paper proposes a residual network algorithm, the Residual Multi-module Fusion Network (R-MFNet), to address the blurred feature boundaries, low classification accuracy, and high noise often encountered in traditional classification methods. The network uses an R-ASPP module to expand the receptive field of the feature map and extract sufficient multi-scale target features; it uses an attention mechanism to assign weights to the multi-scale information along the channel and spatial dimensions; and it preserves the remote sensing image features extracted by the convolutional layers through residual connections. Using this network, three Landsat-TM/OLI images of Zhengzhou City (the capital of Henan Province), acquired in 2001, 2009, and 2020, were classified. Compared with SVM, 2D-CNN, and a deep residual network (ResNet), the overall accuracy on the test dataset increased by 10.07%, 3.96%, and 1.33%, respectively. The resulting classification is closer to the real land surface, and its accuracy is higher than that of finished product data obtained using traditional classification methods, providing high-precision land-use classification data for the subsequent carbon storage estimation. Based on the land-use classification data and carbon density data corrected with meteorological data (temperature and precipitation), the InVEST model is used to analyze land-use change and its impact on carbon storage in the region. The results show that, from 2001 to 2020, the carbon stock in the study area followed a downward trend, with a total decrease of 1.48 × 10⁷ t. 
Over the course of this 19-year period, the farmland area in Zhengzhou decreased by 1101.72 km², and the built land area increased sharply by 936.16 km². The area of land transfer accounted for 29.26% of the total area of Zhengzhou City from 2001 to 2009, and 31.20% from 2009 to 2020. The conversion of farmland to built land is the primary type of land transfer and the most important reason for the decreasing carbon stock. The research results can provide support, in the form of scientific data, for land-use management decisions and carbon storage function protections in Zhengzhou and other cities around the world undergoing rapid urbanization.


Introduction
Global climate change is an increasingly severe issue, posing a significant challenge to human survival and sustainable development [1]. Ecosystems can help mitigate the greenhouse effect and regulate the global climate by continuously absorbing carbon dioxide through the photosynthesis of surface plants [2,3]. The core issue of climate regulation and low-carbon development is land use. Changes in land-use/cover play a vital role in maintaining ecosystem carbon stock services by changing ecosystem structures and functions, both of which directly affect vegetation carbon stocks through changes in the soil environment and the return of plant residues to the soil [4,5], becoming essential elements of current climate change research [6][7][8]. Since 2009, China has been the world's largest carbon emitter, and currently contributes about 27% of global carbon emissions; it therefore exerts a critical influence in maintaining the global carbon emission balance and carbon sequestration balance [9]. In March 2021, the country introduced the carbon peak and carbon neutrality program as part of the Fourteenth Five-Year Plan. Therefore, under the goal of "double carbon", it is essential to conduct regional carbon stock studies to ensure a carbon balance and to formulate carbon reduction policies at the national and regional levels.
Zhengzhou City is located in the southern part of the Yellow River Basin, where the main vegetation types are farmland, woodland, and grassland. The Zhengzhou Yellow River Wetland National Nature Reserve in Henan Province provides a habitat for tens of thousands of species of plants and animals. The surface cover in the region is dominated by crop cultivation, with farmland accounting for about 70% of the total area. The ecosystem is relatively homogeneous, and human activities have a significant impact on the regional ecological environment. For the all-round development of Zhengzhou, the Fourteenth Five-Year Plan for the Ecological Environmental Protection of Zhengzhou City (http://www.zhengzhou.gov.cn/, accessed on 12 April 2023) [10] was formulated; this plan proposes a clear and active response to climate change, taking carbon reduction as the key strategic direction and providing an action guide for ecological environmental protection work.
The methods currently used for calculating carbon stock include field surveys and model simulations [11,12]. Compared with the traditional field survey method, the model simulation method can assess carbon stock changes at different scales and visualize the assessment results spatially [13][14][15]; this method is widely used at the regional scale. Among the many assessment models, the InVEST model is widely used for its low data requirements, fast operation speeds, and high accuracy [16][17][18]. In recent years, the InVEST model has been used to investigate the effects of land-use change on carbon stocks from different perspectives (e.g., urbanization, policy protection) and at different scales (such as administrative divisions and watersheds). Hou used InVEST together with the PLUS model to explore and predict the characteristics of ecosystem carbon stocks and their relationships with land-use patterns from 2000 to 2020 and from 2020 to 2040. The results show that carbon stocks exhibit a decreasing trend followed by an increasing trend. Change in land-use type is the main factor leading to changes in carbon stocks, and the rapid expansion of building land leads to decreases in carbon stocks [19]. Yang combined InVEST with the PLUS model to predict the characteristics of land use and carbon stocks in Xi'an under different future scenarios and studied the effects of land-use changes on carbon stocks; the results showed that the expansion of construction land and the transfer of high-carbon-density land reduced the carbon stocks in Xi'an by 2.49 × 10⁶ t from 2000 to 2015. Under the natural growth scenario, carbon stocks continue to decline, while, under the ecological protection scenario, carbon stocks increase [20]. The above studies show that the impact of land-use change on carbon stocks mainly depends on the ecosystem type and the mode of transfer between different land-use types.
The InVEST model is widely used to study carbon storage. Most previous studies focus on the input carbon density data: they examine the driving factors of carbon density and use these factors to correct the carbon density so that it suits the study area. However, such works overlook another highly important input, namely high-precision land-use/cover data. Most previous studies rely on finished product data, which are generally produced at large scales and whose accuracy is not ideal in specific research areas. This motivates the use of deep learning methods to classify remote sensing images of small areas; higher-precision land-use/cover data better reflect the real surface structure, which should in turn improve the accuracy of carbon storage estimates.
The most basic task of land-use change research is to classify the land features from remote sensing images. At present, the most commonly used traditional machine learning classification methods include SVM, KNN, random forest, multi-source logistic regression (MLR), etc. However, these traditional methods only extract the shallow feature information of remote sensing images and fail to learn the abstract features, meaning that they cannot achieve high classification accuracy [21]. Because of these problems, deep learning has begun to be used in remote sensing image classification because of its ability to extract deeper features of remote sensing images. Deep learning techniques can automatically learn the target change features and overcome the problems, such as poor stability and poor feature extraction abilities, that are encountered in traditional classification methods. Currently, the use of deep learning algorithms to classify remote sensing images is a significant trend [22][23][24][25][26][27]. Some common network models are the stacked auto-encoder (SAE), convolutional neural networks (CNN), and CNN-based remote sensing image classification methods. At first, a one-dimensional convolutional kernel was used, such as the one-dimensional convolutional neural network proposed by Hu [28]. Later, a group of scholars constructed two-dimensional convolutional neural networks (2D-CNN) and three-dimensional convolutional neural networks (3D-CNN), introducing deeper network structures such as the "residual link". Zheng used an improved 3D-CNN to classify hyperspectral image features [29]. Qiao used the residual network (ResNet) and added a double attention mechanism to classify remote sensing scenes, and this method has higher accuracy than the convolutional neural network [30]. Xu introduced atrous convolutions to enhance the receptive field of feature extraction based on Resnet, in order to capture richer multi-scale detail features [31]. 
Li improved the U-Net by adding the SFAM module and the ASPP module, using this method to classify typical crops [32].
However, the classification effect achieved using these residual networks is still not ideal, partly because the residual network causes gradient explosion as the depth increases; therefore, if the depth of the network is arbitrarily increased, it may be counterproductive. Another reason is that the complexity of remote sensing images prevents deep learning networks from learning more features. Based on these two considerations, we designed and used a pre-activated residual network module, which not only increases the depth of the network but also does so without requiring additional computation. In response to the complexity of remote sensing images, the CBAM attention mechanism is introduced; this can accelerate model fitting and further improve the segmentation of image target edges. In addition, a multi-scale feature extraction and fusion module is added to the model; it can extract the features of different receptive fields and introduce residual connections into the module, so that it can learn more low-level and high-level features. The network designed in this article uses feature fusion modules as the basis for constructing a residual network. Multiple modules work together to maximize the learning of remote sensing image features, thereby improving the edge information loss of large and small linear targets that is caused by the loss of target feature information in previous convolutional neural networks. This is particularly evident in the building classification. The residual multi-module fusion network (R-MFNet) proposed in this article can accurately classify each kind of ground object, laying a data foundation for the subsequent accurate statistical analysis of the area and of the spatial distribution of various land-use types; moreover, this network can further improve the accuracy of carbon storage estimations. 
This paper not only proposes a CNN network that can fully extract the characteristics of remote sensing images, but also provides a new research basis for further improving the accuracy of carbon storage estimations.

Research Area
Zhengzhou City (112°42′-114°14′ E, 34°16′-34°58′ N) is located in north-central Henan Province, in the south of the North China Plain; the city has a total land area of 7446 km². The city borders Kaifeng to the east, Luoyang to the west, Xinxiang and Jiaozuo to the north, and Pingdingshan and Xuchang to the south. The terrain and landforms of Zhengzhou City are rich and varied, with the topography being low in the east and high in the west. Zhengzhou has four distinct seasons and distinct rainy and dry seasons. The Yellow River and the Huai River are the 2 major water systems in the city, and there are 124 rivers of various sizes. There are many types of soils and rich reserves of natural resources. The soil in the urban area of Zhengzhou is mainly within the scope of agricultural areas, which are divided into 4 soil types, 9 subtypes, 17 soil genera, and 51 soil species, including cinnamon soil, tidal soil, aeolian sand soil, and paddy soil. It is mainly composed of brown soil and tidal soil. Thirty-six kinds of minerals, including coal, oil and stone, and refractory clay, have been identified, and this area is one of the largest oil and stone bases in China. Zhengzhou is the capital of Henan Province and the core city of the Central Plains City Cluster. Note that the Central Plains urban agglomeration is the core area for development and strengthening in the urbanization strategy of Henan Province. The state will actively cultivate and develop the urban agglomeration to become an important region, promoting the uniform development of land space and rapid economic development [33]. Zhengzhou is an essential central city in the central region with a high political and economic status. The location map of Zhengzhou is shown in Figure 1.

Land-Use Data
The remote sensing image data selected in this paper were Landsat data from the Geospatial Data Cloud site (http://www.gscloud.cn/search, accessed on 12 April 2023). Due to the influence of the quality of the Landsat data, images from the same month could not be collected. To ensure the reliability of the experiment, the data were collected in late spring or early summer, which are periods with fewer clouds, and the years 2001, 2009, and 2020 were selected. The image data size was 4545 × 2706 pixels, where the area outside the administrative boundary was treated as background. To select training samples of various land types, the required land-use types were determined from the land-use characteristics of Zhengzhou and the needs of the subsequent carbon storage estimation, combined with the reference framework established by the National Remote Sensing Monitoring Coverage Classification System. The land-use types were classified into six categories: farmland, woodland, water body, grassland, built land, and other land [34]; their specific interpretation criteria are shown in Table 1. We selected samples from the image and labeled them manually. The advantage of this is that we could obtain any desired samples. We used a portion of the labeled samples for training and a portion for validation, with a ratio of 30% and 70%, respectively; we provide the sample selection details for 2020 in Table 2. To improve the classification accuracy and make the different land-use categories more easily distinguished, we introduced three indices: NDVI, NDWI, and NDBI. The NDVI was introduced to improve the classification accuracy of woodland, grassland, and other land types. The NDWI was introduced to distinguish water bodies from other types. The NDBI was introduced to improve the model's ability to distinguish between buildings and other objects. 
The three indices are calculated using Equations (1)-(3):
$$\mathrm{NDVI} = \frac{N - RED}{N + RED} \tag{1}$$
$$\mathrm{NDWI} = \frac{GREEN - N}{GREEN + N} \tag{2}$$
$$\mathrm{NDBI} = \frac{S - N}{S + N} \tag{3}$$
where N is the near-infrared band, RED is the red band, GREEN is the green band, and S is the short-wave infrared band. After the index calculations and band fusion, NDVI, NDWI, and NDBI were added to the original Landsat bands. The band-fused data served as the input image data for deep learning classification.
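The index calculation and band fusion step can be illustrated with a minimal per-pixel sketch. It assumes the standard normalized-difference forms of NDVI, NDWI, and NDBI; the dictionary keys, function names, and reflectance values are hypothetical and chosen only for illustration.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def add_index_bands(pixel):
    """Append NDVI, NDWI, and NDBI to a pixel given as a dict of reflectances.

    The keys 'green', 'red', 'nir', and 'swir' are hypothetical names for the
    GREEN, RED, N, and S bands described in the text.
    """
    ndvi = normalized_difference(pixel["nir"], pixel["red"])
    ndwi = normalized_difference(pixel["green"], pixel["nir"])
    ndbi = normalized_difference(pixel["swir"], pixel["nir"])
    # "Band fusion": the three indices are kept alongside the original bands.
    return {**pixel, "ndvi": ndvi, "ndwi": ndwi, "ndbi": ndbi}


# Made-up reflectances for a vegetated pixel (illustrative only).
vegetation = add_index_bands(
    {"green": 0.08, "red": 0.05, "nir": 0.45, "swir": 0.20}
)
```

For a vegetated pixel such as this one, NDVI is strongly positive while NDWI and NDBI are negative, which is the kind of contrast the extra bands are meant to give the classifier.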

Carbon Density Data
The measured carbon density data can effectively improve the accuracy of carbon stock estimation. Still, there is a lack of measured data based on land-use types in Zhengzhou, so the data determination in this paper was completed in two stages: (1) collecting the data from the National Ecological Science Data Center (http://www.cnern.org.cn/, accessed on 12 April 2023) and referring to the literature [35][36][37]; since there is little research on carbon storage in Zhengzhou, the carbon density data of nearby areas were collected.
(2) Since the data for Zhengzhou were obtained from the results of national studies and some local studies, they were not actual measurement results; moreover, carbon density values vary with climate, soil properties, and land use, so they needed to be corrected. The carbon density is affected by many factors, such as the type of vegetation, soil moisture, and climate conditions. However, the influence of elevation, terrain, and other factors is relatively small, and the aforementioned factors are usually affected by precipitation and temperature. Therefore, when revising the carbon density values of Zhengzhou, the influence of temperature and precipitation needed to be considered comprehensively. The literature review shows that both the vegetation carbon density and the soil organic carbon density have significant positive correlations with annual precipitation, but the correlation with the annual average temperature is not strong and the formula is highly generalized. Therefore, the formula used by Alam was used to correct for the precipitation factor (the relationship between annual mean precipitation and both biomass carbon density and soil organic carbon density) [38]. Meanwhile, the formula used by Giardina and Chen was used to correct for the temperature factor (the relationship between annual average temperature and biomass carbon density; the influence of annual average temperature on soil carbon density was not considered) [39,40]:
$$C_{SP} = 3.3968 \times AP + 3996.1$$
$$C_{BP} = 6.798 \times e^{0.0054 \times AP}$$
$$C_{BT} = 28 \times AT + 398$$
where $C_{SP}$ is the soil carbon density obtained from the annual precipitation, $C_{BP}$ and $C_{BT}$ are the biomass carbon densities based on the annual precipitation and the annual mean temperature, respectively, AP is the average annual precipitation (mm), and AT is the average annual temperature (°C). 
The mean annual temperature and precipitation of Zhengzhou and the whole country were substituted into the above equations (9.38 °C/15.64 °C and 677.3 mm/624.79 mm for the national scale and Zhengzhou between 2001 and 2020, respectively); these data were collected from statistical yearbooks. The ratio of the Zhengzhou value to the national value is the correction factor. The collected carbon density data at the national scale were multiplied by the correction factor to calculate the carbon density of Zhengzhou:
$$K_{BP} = \frac{C'_{BP}}{C''_{BP}}, \qquad K_{BT} = \frac{C'_{BT}}{C''_{BT}}$$
$$K_{B} = K_{BP} \times K_{BT}$$
$$K_{S} = \frac{C'_{SP}}{C''_{SP}}$$
where $K_{BP}$ and $K_{BT}$ are the correction factors of the precipitation factor and the temperature factor for biomass carbon density, respectively; $C'_{BP}$ and $C''_{BP}$ are the biomass carbon densities obtained from annual precipitation in Zhengzhou and at the national scale, respectively; $C'_{BT}$ and $C''_{BT}$ are the biomass carbon densities based on the annual mean temperature in Zhengzhou and at the national scale, respectively; and $C'_{SP}$ and $C''_{SP}$ are the soil carbon densities based on the annual precipitation in Zhengzhou and at the national scale, respectively. $K_{B}$ and $K_{S}$ are the biomass carbon density correction coefficient and the soil carbon density correction coefficient, respectively. The carbon density values of different land-use types revised by annual precipitation and annual mean temperature in Zhengzhou are shown in Table 3. $C_{above}$ represents the carbon density of aboveground organisms, including all living vegetation above the surface (bark, trunks, branches, leaves, etc.). $C_{below}$ represents the carbon density of underground organisms, mainly referring to the carbon storage of plant roots. $C_{soil}$ represents the carbon density of soil, generally referring to the carbon storage of mineral soil and organic soil. $C_{dead}$ represents the carbon density of dead organic matter, generally referring to the carbon storage of litter and dead plants.
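The correction procedure can be sketched as follows. This is a rough illustration, not the authors' implementation: the regression coefficients are those commonly reported for the cited precipitation and temperature formulas and are an assumption here, the function names are hypothetical, and the climate values are read from the text in the order "national scale / Zhengzhou".

```python
import math


# Regression forms commonly reported for the cited corrections.
# The coefficients are an assumption, not confirmed by this text.
def soil_density_from_precip(ap_mm):
    return 3.3968 * ap_mm + 3996.1


def biomass_density_from_precip(ap_mm):
    return 6.798 * math.exp(0.0054 * ap_mm)


def biomass_density_from_temp(at_c):
    return 28.0 * at_c + 398.0


def correction_factors(ap_local, at_local, ap_national, at_national):
    """Return the local/national ratios used to rescale national densities."""
    k_bp = (biomass_density_from_precip(ap_local)
            / biomass_density_from_precip(ap_national))
    k_bt = (biomass_density_from_temp(at_local)
            / biomass_density_from_temp(at_national))
    k_s = (soil_density_from_precip(ap_local)
           / soil_density_from_precip(ap_national))
    # Biomass factor combines the precipitation and temperature corrections.
    return {"K_BP": k_bp, "K_BT": k_bt, "K_B": k_bp * k_bt, "K_S": k_s}


factors = correction_factors(
    ap_local=624.79, at_local=15.64,      # Zhengzhou, as read from the text
    ap_national=677.3, at_national=9.38,  # national scale, as read from the text
)
```

A national-scale biomass density would then be multiplied by `factors["K_B"]`, and a national-scale soil density by `factors["K_S"]`, to obtain values of the kind reported in Table 3.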

Methods
In this study, the coding and decoding paths are bridged by the residual atrous spatial pyramid pooling structure (R-ASPP), which uses multiple parallel atrous convolutions to extract the multi-scale receptive field features of the target region and improve the classification accuracy of the model for features with significant scale changes. To solve the problem of remote sensing image data being rich in spatial information but subject to redundant geographic features interfering with the deep neural network model, a strategy of introducing a convolutional block attention module (CBAM) into the network is proposed. Additionally, an improved residual block is introduced to fuse deep features with shallow features, enhancing the propagation ability of features, and further improving the operation speed of the model while avoiding the problem of gradient disappearance in training.

Network Structure
The overall architecture of the improved classification model proposed in this paper is shown in Figure 2. It uses BN processing and the ReLU activation function after each convolutional layer, effectively preventing overfitting or underfitting during the training process. The coding path of the network consists of four downsampling modules and one modified ASPP module, each of which contains a modified residual block and an attention module. The decoding path of the network consists of four up-sampling modules, each of which also includes an improved residual block and an attention module to recover the resolution of local features to the input image size. The encoding path and the decoding path are bridged using an improved ASPP module, which extracts multi-scale features of ground objects from the high-level features learned from the encoding path and then transmits these features to the decoding path. The output of the encoding path incorporates the remote sensing image features learned by the attention and pre-activated residual modules. Each module takes advantage of its respective structures to reduce the loss of information during forward propagation and to extract the more fully abstracted features needed for classification. The final layer of the decoding path uses a sigmoid function to project the multichannel feature mapping onto the target region to generate the final classification result.

Pre-Activated Residual Network Module
The expressiveness of the network is enhanced with the increase in the network depth. For the problem of gradient disappearance caused by deepening the network depth with the residual unit, He added the rectified linear unit (ReLU) activation layer to the residual unit to obtain a network with improved convergence performance and pixel classification performance [41].
As the position of the activation function in Figure 3 shows, batch normalization (BN) and the ReLU activation are placed before the convolutional layers. Compared with the traditional post-activation design, the pre-activation residual unit simplifies the model and makes it converge faster. Before each convolutional operation, the remote sensing features are normalized by BN and passed through the nonlinear ReLU activation function, and the result is fed to the convolutional layer. BN also has a regularizing effect. Because the data entering the ReLU activation layer have been normalized by BN, backward propagation produces more pronounced gradients, which enhances the propagation of features, reduces the interference that higher learning rate settings cause to ReLU-activated neurons, and allows the network to converge faster while avoiding the vanishing gradient problem. Therefore, with the pre-activation module, the feature information of various ground objects can be freely propagated forward and backward through the whole network.
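The ordering of operations in a pre-activation unit can be shown with a small one-dimensional toy. This is only a sketch under simplifying assumptions: the real module operates on 2D feature maps with learned BN parameters, whereas here BN is a plain standardization, the kernels are hand-set odd-length lists, and all function names are hypothetical.

```python
import math


def batch_norm(values, eps=1e-5):
    """Standardize a list of activations to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def relu(values):
    return [max(0.0, v) for v in values]


def conv1d_same(values, kernel):
    """1D convolution with zero padding so the output keeps the input length."""
    half = len(kernel) // 2
    out = []
    for i in range(len(values)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(values):
                acc += w * values[j]
        out.append(acc)
    return out


def pre_activation_unit(x, kernel1, kernel2):
    """y = x + Conv(ReLU(BN(Conv(ReLU(BN(x)))))): BN and ReLU precede each conv."""
    h = conv1d_same(relu(batch_norm(x)), kernel1)
    h = conv1d_same(relu(batch_norm(h)), kernel2)
    # Identity shortcut: the input is added back without any activation.
    return [xi + hi for xi, hi in zip(x, h)]
```

Because nothing is applied after the addition, the shortcut is a pure identity path: with all-zero kernels the unit returns its input unchanged, which is the property that lets features and gradients pass freely through deep stacks.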

Attention Mechanism Module
The feature representation of the network becomes richer as the network deepens, but much spatial information is lost due to cascaded convolution and downsampling operations. For this reason, this network adds attention modules in the coding and decoding paths, which propagate spatial information in the coding layer to the decoding layer and reduce the loss of information during forward propagation. The attention module used by the network is composed of two attention modules, which correct the input data in both the space and the channel. When the location information of the input feature map is of higher importance in both channel and spatial correction, it will achieve higher activation, thus encouraging the network to learn more compelling features.
CBAM is a convolutional attention mechanism module, which is a comprehensive combination of a channel attention module (CAM) and a spatial attention module (SAM) [42]. Figure 4 shows its structure.
We have made some improvements to the CBAM attention mechanism by incorporating residual connections to enable it to learn and integrate more information. The mechanism consists of two main operational steps: CAM first performs maximum pooling and average pooling operations on the input features, then tiles and inputs the feature map into a multilayer perceptron (MLP) with a two-layer network, sums the output results, and activates them through the sigmoid activation function to obtain the channel attention features. They are multiplied by the input features and residual connected with the input features. The resulting features are used as the input of the SAM module. After the SAM module takes effect, it multiplies the residual connected features of CAM and performs residual connections with the input features to obtain the final output features.
The attention mechanism is applied to the entire residual network to promote the effective flow of image information in the network, enabling the network to capture key information and improving its ability to recognize ground objects.
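The two-step flow of the residual attention module can be sketched on a tiny feature map stored as a list of channels. Several points are assumptions made for illustration: the "input features" added back after the spatial step are taken to be the original input, a 1×1 mixing of the pooled maps stands in for the larger convolution normally used in spatial attention, and all names and weights are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def mlp(vec, w1, w2):
    """Two-layer perceptron with a ReLU hidden layer, shared by both descriptors."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def channel_attention(x, w1, w2):
    """One weight per channel from max- and average-pooled descriptors."""
    max_desc = [max(ch) for ch in x]
    avg_desc = [sum(ch) / len(ch) for ch in x]
    summed = [a + b for a, b in zip(mlp(max_desc, w1, w2), mlp(avg_desc, w1, w2))]
    return [sigmoid(s) for s in summed]


def spatial_attention(x, w_max, w_avg):
    """One weight per position; a 1x1 mix replaces the usual larger convolution."""
    weights = []
    for p in range(len(x[0])):
        column = [ch[p] for ch in x]
        mixed = w_max * max(column) + w_avg * sum(column) / len(column)
        weights.append(sigmoid(mixed))
    return weights


def residual_cbam(x, w1, w2, w_max=1.0, w_avg=1.0):
    mc = channel_attention(x, w1, w2)
    # Channel step: reweight each channel, then add the input back.
    f1 = [[v * mc[c] + v for v in ch] for c, ch in enumerate(x)]
    ms = spatial_attention(f1, w_max, w_avg)
    # Spatial step: reweight each position of f1, then add the original input.
    return [[f1[c][p] * ms[p] + x[c][p] for p in range(len(ch))]
            for c, ch in enumerate(x)]
```

With the residual terms in place, even a position or channel that receives a near-zero attention weight still passes its original value forward, so the module can emphasize features without discarding information.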

Residual ASPP Module
The ASPP structure samples the input in parallel using four atrous convolutions with different sampling rates. Atrous convolution can expand the receptive field of the feature map without reducing the image resolution, meaning that the target can be precisely located. However, the limitation of ASPP also stems from the atrous convolution. Atrous convolution with a high sampling rate is good for recognizing large objects but loses the useful information of small objects; meanwhile, atrous convolution with a low sampling rate can obtain the position information of small objects but loses more of the contour edge information of large objects. The parallel combination of atrous convolutions with different sampling rates can compensate for the missing information to some extent, but the information that is still missed is not effectively utilized. Given the limitations of ASPP and the characteristics of images in Zhengzhou, we propose a residual ASPP module to supplement the information lost in the ASPP feature extraction process and achieve better classification results. The improved residual ASPP is derived from the combination of the residual block and ASPP. The residual block makes up for the deficiencies of ASPP by adding a shortcut connection to each of the four ASPP branches with different sampling rates, which helps to supplement the information missed by the atrous convolution. The improved structure of ASPP is shown in Figure 5. The specific calculation formula is as follows:
$$Y = \mathrm{Concat}\left(I_{pooling}(X),\ H_{r_1,n}(X),\ H_{r_2,n}(X),\ H_{r_3,n}(X),\ H_{r_4,n}(X)\right)$$
where Concat is the concatenation operation, $r_1, \ldots, r_4$ are the four sampling rates, and $H_{r,n}(X)$ represents the residual atrous convolution branch with an n-size convolution kernel and a sampling rate r, with $H_{r,n}(X) = F(x) + x$. Here, F(x) is the output of the lower branch, x is the output of the upper branch, and $I_{pooling}$ is the image feature of the image pooling branch in Figure 5, which is the average pooling feature of the input feature map.
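The effect of the shortcut on each atrous branch can be seen in a one-dimensional toy. This is only a sketch under assumptions: the sampling rates (1, 2, 4, 8) and the kernel are hypothetical, concatenation is done along a single flat list rather than the channel axis, and the function names are invented for illustration.

```python
def atrous_conv1d(x, kernel, rate):
    """1D atrous convolution with zero padding; taps are `rate` samples apart."""
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate
            if 0 <= j < len(x):
                acc += w * x[j]
        out.append(acc)
    return out


def residual_branch(x, kernel, rate):
    """H_{r,n}(X) = F(x) + x: the atrous output plus a shortcut from the input."""
    return [f + xi for f, xi in zip(atrous_conv1d(x, kernel, rate), x)]


def residual_aspp(x, kernel, rates=(1, 2, 4, 8)):
    """Concatenate the pooling branch with one residual branch per rate."""
    pooled = [sum(x) / len(x)] * len(x)  # global average, broadcast to input size
    branches = [pooled] + [residual_branch(x, kernel, r) for r in rates]
    return [v for branch in branches for v in branch]


# A single small "object" at position 4 of an otherwise empty signal.
signal = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
features = residual_aspp(signal, kernel=[1.0, 1.0, 1.0])
```

At a large rate the kernel taps straddle the small object, so the plain atrous output can miss its exact position; the shortcut adds the input back and keeps that position visible in every branch.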

Carbon Stock Estimation
The carbon module in the InVEST model is used to analyze changes in carbon storage in the ecosystem of Zhengzhou. This module calculates the carbon storage of Zhengzhou by combining the land-use/land cover data with the corresponding carbon density table for each land type. The model divides the carbon storage of the regional ecosystem into four basic carbon pools (aboveground biomass, belowground biomass, soil, and dead organic matter); some studies ignore the dead organic carbon and only calculate the other three carbon pools. In addition, the model also includes a fifth carbon pool. Considering previous research experience and data availability, the fifth carbon pool is not included in this study [43,44]. The calculation of the basic carbon pools meets the expectations of this study and can directly serve the research purposes. We add up the four carbon pools to represent the total carbon reserves of the study area, as follows:
$$C_{i} = C_{i,above} + C_{i,below} + C_{i,soil} + C_{i,dead}$$
$$C_{total} = \sum_{i=1}^{n} C_{i} \times S_{i}$$
where i is the i-th land class, and $C_{i}$ is the carbon density of the i-th land class. $C_{i,above}$ is the aboveground biomass carbon density of the i-th land class, $C_{i,below}$ is the belowground biomass carbon density of the i-th land class, $C_{i,soil}$ is the soil organic matter carbon density of the i-th land class, $C_{i,dead}$ is the carbon density of the dead organic matter of the i-th land class, $C_{total}$ is the total regional carbon stock, $S_{i}$ is the total area of the i-th land class, and n represents the total number of land classes, i.e., 6 in this study.
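The pool summation and area weighting described above amount to a few lines of arithmetic. The sketch below uses placeholder densities and areas for two classes only; they are not the values in Table 3, and the function names are hypothetical.

```python
POOLS = ("above", "below", "soil", "dead")


def class_density(pools):
    """C_i: sum of the four carbon pool densities of one land class."""
    return sum(pools[name] for name in POOLS)


def total_carbon_stock(densities, areas):
    """C_total: sum over land classes of C_i times S_i."""
    return sum(class_density(densities[c]) * areas[c] for c in densities)


# Placeholder densities (t/ha) and areas (ha); not values from Table 3.
densities = {
    "farmland": {"above": 5.0, "below": 1.0, "soil": 100.0, "dead": 1.0},
    "built land": {"above": 1.0, "below": 0.0, "soil": 60.0, "dead": 0.0},
}
areas = {"farmland": 1000.0, "built land": 500.0}

stock = total_carbon_stock(densities, areas)  # tonnes of carbon
```

With these placeholder inputs the farmland class contributes 107 t/ha over 1000 ha and the built land class 61 t/ha over 500 ha; in the study the same sum runs over all six land classes.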

Carbon Stock Changes Due to Land-Use Change
Because carbon density differs between land-use types, when one type is converted into another, the resulting change in carbon storage can be obtained by multiplying the converted area by the difference in carbon density between the two types. This is calculated from the land-use transfer matrix as follows:
$$T_{ij} = S_{ij} \times \Delta c_{ij}$$
where $T_{ij}$ is the total amount of carbon stock change caused by the transfer from the i-th land class to the j-th land class, $S_{ij}$ is the area transferred from land class i to land class j, and $\Delta c_{ij}$ is the difference between the carbon density of land class j and the carbon density of land class i.
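A minimal sketch of this calculation over a transfer matrix is given below. The densities and transferred areas are placeholders rather than results from this study, and the dictionary-based layout and function name are assumptions made for illustration.

```python
def carbon_change_matrix(transfer_area, density):
    """T_ij = S_ij * (C_j - C_i) for each (from, to) pair of land classes."""
    return {
        (i, j): area * (density[j] - density[i])
        for (i, j), area in transfer_area.items()
    }


# Placeholder total carbon densities (t/ha) and transferred areas (ha).
density = {"farmland": 107.0, "built land": 61.0}
transfer_area = {
    ("farmland", "built land"): 200.0,
    ("built land", "farmland"): 10.0,
    ("farmland", "farmland"): 800.0,  # unchanged area contributes nothing
}

changes = carbon_change_matrix(transfer_area, density)
net_change = sum(changes.values())
```

A negative entry marks a transfer toward a lower-density class, so with these placeholder numbers the farmland-to-built-land cell dominates and the net change is a loss.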

Results
R-MFNet was used to classify three images of Zhengzhou from different years (2001, 2009, and 2020). To show the effect of the model improvements, several control experiments were set up, including SVM, a deep learning model (2D-CNN) with a good classification effect, and a deep residual network model (ResNet). The ResNet baseline does not include an attention mechanism or a feature fusion module. The overall accuracy and the Kappa coefficient, two indicators commonly used in remote sensing to represent classification quality, are used to express the classification effect. The accuracy evaluation of the different models is shown in Table 4; it reports the image classification results for 2020, the year with the highest Kappa coefficient. The classification performance shows that the proposed R-MFNet model has the best classification accuracy, although some misclassifications remain. To obtain more accurate classification data for the land-use types, we corrected our training samples by means of a field survey and by consulting the accurate classification results of previous studies. The classification accuracy of the model was continuously improved by deleting erroneous sample points and adding training sample points for land-use types with fewer classification results. The higher the classification accuracy is, the more beneficial it is to the subsequent estimation of carbon reserves. The final accuracy evaluation after multiple corrections is shown in Table 5. This result provides a high-precision land-use-type map for the subsequent carbon reserve estimations. The R-MFNet network proposed in this paper has good feature extraction abilities because of its network structure, and its classification results are more consistent with the actual land type. 
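For reference, the two indicators can be computed from a confusion matrix as in the sketch below. The 2 × 2 matrix is a made-up example rather than data from Table 4 or Table 5, and the function name is hypothetical; the Kappa coefficient follows the usual definition (observed agreement minus chance agreement, divided by one minus chance agreement).

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Kappa) for a square confusion matrix.

    Rows are reference classes and columns are predicted classes.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(n)) / total
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Made-up two-class example with 100 validation samples.
oa, kappa = overall_accuracy_and_kappa([[50, 10], [5, 35]])
```

Overall accuracy only counts the diagonal, whereas Kappa discounts the agreement that would occur by chance, which is why it is always lower than the overall accuracy for an imperfect classifier.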
In order to verify the advantages of the model in reducing the loss of land type information, we selected an area from the Landsat image of Zhengzhou in 2020, including built land, water bodies, and farmland. To ensure the universality of the results, we selected regions randomly and compared their classification results with SVM, 2D-CNN, and Resnet, as shown in Figure 6. The sample diagram for marking is shown in Figure 6b. It can clearly be seen from Figure 6 that the classification effect of 2D-CNN is the worst, possibly because the loss of target information is relatively severe due to its small number of network layers, simple structure, and weak feature learning ability, which manifests in two forms: the first is that the information of linear objects such as roads cannot be detected, the second is that the connection boundary between land-use types is too ideal, and its ability to control noise is too poor. Additionally, the edge contour detection of large buildings is not meticulous, and there are some inaccurate classifications of water bodies. The classification effect of the Resnet model is better than that of 2D-CNN, possibly because its residual structure supplements the target information to be detected, and the connection boundary between various types of land use has been improved. However, it still shows some missing detections of target objects, and there are still a small number of inaccurate classifications of water bodies; nevertheless, its ability to detect non-building linear objects such as roads and dirt roads near cultivated land has been improved. SVM is a commonly used remote sensing image classification method; most of the existing land-use-type product data used for carbon stock estimation using the InVEST model are based on this classification method. SVM produces more accurate object detection results, and the edge detection information of objects is obvious. 
Compared with the three other types of network, due to its residual structure and feature fusion module, the R-MFNet model proposed in this paper learns more effective information. Both large buildings and linear objects show good classification results, with the lowest level of misclassification, and the model has some advantages in noise suppression and object edge contour detection. Its classification results are the most consistent with the real land-use type. The R-MFNet network exhibits more sufficient detection of information, as shown in the two elliptical marks in Figure 6; this is the result of its network structure. It still exhibits strong feature learning ability even when the sample size is reduced.
When comparing the classification results, we designed a residual network based on our proposed network that does not include the pre-activation residual module, residual attention module, or residual ASPP module, and which is similar to U-Net. Compared to the network proposed in this article, it is a weak classifier, but, compared to U-Net, it is a strong classifier. We also selected an area from 2020 that includes six types of land use and compared the results with those of some classic models, including U-Net, 1D-CNN, and HybridSN. The results are shown in Table 6.

Table 6. Comparison of classification accuracy in a typical region for classical methods.

From Table 6, it can be seen that, compared to other classic classification networks, the network proposed in this article has the highest accuracy in both the final results (OA values and Kappa coefficients) and for each land type.

Ablation Experiment for the Residual Multi-Module Fusion Network
The main purpose of pre-activation residual units is to increase the efficiency of the network, and our focus is on the introduction of residual attention mechanism and a residual ASPP structure. To verify their impact on the network, we added an ablation experiment section to the paper, using a network without the residual attention and residual ASPP components as the baseline method. We separately explored the impact of these two components on the network performance. For the ablation experiments, we replaced the corresponding components with convolutional layers with the same number of parameters to ensure that the total number of parameters in the network remained unchanged. Table 7 shows the classification accuracy of the remote sensing images obtained by combining two different modules: (1) using both the residual attention module and the residual ASPP module; (2) using only the residual ASPP module; (3) using only the residual attention module; (4) not using either module. By comparing and analyzing the OA values of Table 7, it was found that using only the residual ASPP module increased the accuracy by 1.64%, using only the residual attention module increased the accuracy by 0.39%, and using both modules simultaneously increased the accuracy by 2.5%. The different combinations of various modules have the same effect on the improvement in the Kappa coefficient as on the OA value; that is, the contribution of the residual ASPP module to the network is higher than that of the residual attention module, and the highest accuracy is achieved by using both modules simultaneously. The improved accuracy values (OA) obtained using the above combination methods are visualized to obtain the contribution levels of different modules, as shown in Figure 7.   
Figure 8 shows that, from 2001 to 2020, the structural change in land use in Zhengzhou was dominated by built land and farmland, followed by woodland and water bodies, and the distribution of each land-use type also has strong spatial heterogeneity. Woodland is concentrated in the western part of Zhengzhou and the mountainous and hilly areas in the southwest and is distributed in blocks in the western part, while presenting a linear distribution in the southwestern part. The farmland area is relatively large, with most of it distributed in the plain area around buildings, in the northern and eastern parts of Zhengzhou, and in some mountainous areas. The water body areas mainly comprise the Yellow River and the tributaries that feed into the Yellow River, and a small number of reservoirs and mountainous rivers. Built land is mainly distributed in the central area of Zhengzhou, with a more aggregated spatial distribution range and showing a marked point distribution. The area made up of grassland and other land is small, and the other land is mostly beaches and bare land exposed during the dry period, while grassland is scattered among the other land types. With the rapid promotion of urbanization in the 21st century, the spatial distribution structure of land-cover types in Zhengzhou City in 2009 (Figure 8b) and 2020 (Figure 8c) changed drastically, with built land expanding rapidly and showing a planar distribution across the whole space, while the spatial distribution of farmland shrank significantly. The spatial distribution pattern of woodland and grassland in the southwest mountainous area is fragmented, and an overall pattern in which built land, farmland, and woodland are interlaced has formed.

Table 8.
We can see that, in 2001, the spatial structure of Zhengzhou City was mainly farmland, woodland, and built land, with areas of 5681.03 km², 390.55 km², and 1272.78 km², respectively, while water bodies, grassland, and other land were 107.78 km², 132.36 km², and 0.54 km², respectively.

As a result of the above analysis, with the rapid urbanization of Zhengzhou, significant changes took place in the built land, woodland, grassland, and farmland in Zhengzhou during the period of 2001-2020. Specifically, the area of built land increased, and the grassland and farmland area decreased. Water bodies and other land areas showed a slight growth trend. Woodland is mainly located in the highest elevation areas of Zhengzhou, water bodies are mainly located in the lowest elevation areas of Zhengzhou, and most of the built land and farmland are located in the plain area of Zhengzhou.

Land-Use Change Transfer Matrix
In order to quantitatively analyze these changes, a transformation matrix of Zhengzhou City is drawn. The results are shown in Table 9. The values of no transfer or a small transfer area are marked as 0. Although the other land types changed significantly compared with these areas, the proportion is very small. Therefore, it is not analyzed. The total change in land-use types in Zhengzhou from 2001 to 2020 is 1396.65 km², as shown in Table 9; this accounts for 18.41% of the total area of Zhengzhou City. As for increases, the changed categories, from largest to smallest, are built land, woodland, farmland, water body, and grassland. The newly increased built land mainly consists of altered farmland (946.48 km²), while the newly increased farmland primarily consists of altered water bodies (40.0 km²) and grassland (40.33 km²). The newly increased woodland was converted from farmland (179.30 km²), while the newly increased grassland consists of converted farmland (40.21 km²). There are two possible reasons for this phenomenon; one is that the rapid development of Zhengzhou City has led to the rapid increase in built land, and the other is that the release and implementation of the national policy of returning farmland to woodland and grassland have led to the conversion of some farmland into woodland and grassland.
In summary, from 2001 to 2020, different land types were converted in various ways, mainly built land, farmland, and woodland. Between 2001 and 2020, new built land was largely converted from farmland, accounting for 69.80% of all new land types. With the development of Zhengzhou, the population and economy have increased rapidly, creating increased demand for built land. Therefore, the population and economic growth that occurred during this period is the main reason for the large-scale expansion of urban built land, which directly drives the conversion process of farmland to built land.
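A transfer matrix of this kind can be obtained by cross-tabulating two classified maps pixel by pixel. The sketch below is illustrative only: the six-pixel maps are made up, the function names are ours, and the pixel area assumes the 30 m Landsat resolution (0.0009 km² per pixel).

```python
from collections import Counter

PIXEL_AREA_KM2 = 0.0009  # assumes 30 m x 30 m Landsat pixels


def transfer_matrix(labels_start, labels_end, classes, pixel_area_km2):
    """Rows: class in the start year; columns: class in the end year; values in km^2."""
    counts = Counter(zip(labels_start, labels_end))
    return [[counts[(a, b)] * pixel_area_km2 for b in classes] for a in classes]


def changed_area(matrix):
    """Sum of off-diagonal cells, i.e. the area whose class changed."""
    return sum(v for i, row in enumerate(matrix) for j, v in enumerate(row) if i != j)


# Made-up flattened class maps for the same six pixels in two years.
classes = ["farmland", "woodland", "built"]
start = ["farmland", "farmland", "farmland", "woodland", "built", "farmland"]
end = ["built", "farmland", "woodland", "woodland", "built", "built"]

matrix = transfer_matrix(start, end, classes, PIXEL_AREA_KM2)
share = changed_area(matrix) / (len(start) * PIXEL_AREA_KM2)  # fraction of area changed
```

The same ratio applied to the reported figures is consistent with the text: the six class areas for 2001 sum to 7585.04 km², and 1396.65 km² of change over that total gives 18.41%.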

Carbon Stock Changes
In terms of quantity (Figure 9), the carbon stocks of Zhengzhou in 2001, 2009, and 2020 were 1.36 × 10⁸ t, 1.44 × 10⁸ t, and 1.21 × 10⁸ t, respectively. Among the land-use types in Zhengzhou, farmland is the type with the highest carbon storage, as shown in Figure 10, accounting for 86.11% of the total regional carbon storage, followed by woodland (9.08%), built land (2.68%), grassland (2.12%), water bodies (0.02%), and other land (less than 0.01%). The above proportions are obtained from the percentage of carbon storage in 2001. The proportional structure of the carbon reserves did not change significantly across the time points.
In Figure 11, the spatial distribution of carbon stocks in Zhengzhou City from 2001 to 2020 shows roughly the same pattern; specifically, the carbon stocks increase gradually from the east toward the west and southwest, showing a pattern of "high in the west and low in the east". The low-carbon-stock area is mainly distributed in the north, the high-carbon-stock area is concentrated in the west and southwest, and the medium-carbon-stock area is distributed in the middle of Zhengzhou. Because the same land-use type has the same carbon density value when using the model to estimate carbon reserves, the distribution of carbon stock values is consistent with the classification results; the high-carbon-storage area corresponds to woodland, and the low-carbon-storage area corresponds to water bodies and other land. In contrast, the corresponding land type in the medium-carbon-stock area is built land. Regarding spatial changes, the distribution pattern of carbon reserves in Zhengzhou has changed slightly. The expansion of built land is a highly important reason for this change and has led to a sharp increase in the medium-carbon-stock area in the urban center.
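Because the model assigns a single carbon density to each land-use type, the regional stock reduces to a sum of area times density over the classes. A minimal sketch follows, assuming the four carbon pools of the InVEST carbon model (aboveground, belowground, soil, and dead organic matter); the densities and areas are hypothetical placeholders, not the corrected values used in this study.

```python
POOLS = ("above", "below", "soil", "dead")

# Hypothetical per-pool carbon densities in t/ha (placeholders only).
DENSITY_TABLE = {
    "woodland": {"above": 40.0, "below": 10.0, "soil": 100.0, "dead": 2.0},
    "farmland": {"above": 5.0, "below": 1.0, "soil": 90.0, "dead": 0.0},
}


def total_density(pool_densities):
    """Total carbon density of one land-use type: the sum of its four pools."""
    return sum(pool_densities[p] for p in POOLS)


def carbon_stock(area_ha_by_class, density_table):
    """Regional carbon stock in tonnes: sum over classes of area * total density."""
    return sum(
        area_ha * total_density(density_table[cls])
        for cls, area_ha in area_ha_by_class.items()
    )


# Hypothetical class areas in hectares.
areas = {"woodland": 10.0, "farmland": 100.0}
stock = carbon_stock(areas, DENSITY_TABLE)  # 10 * 152 + 100 * 96 = 11120 t
```

Since every pixel of a class carries the same density, a map of carbon stock produced this way mirrors the classification map, which is why the spatial patterns coincide.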

Carbon Stock Changes Due to Land-Use Change
The six land-use types are converted into one another, and differences in the area and type of conversion lead to changes in carbon reserves, as shown in Table 10. The conversion area affects the total amount of carbon reserves, while the conversion type determines the change in carbon density. The transfer away from farmland leads to a reduction in the carbon stock of 1.56 × 10⁷ t; this is mainly because most of the farmland is converted into built land with a lower carbon density. Woodland has strong carbon sequestration abilities, and its carbon density is the maximum among the land types, so the transfer from woodland to any other land-use type is not conducive to carbon sequestration. The decrease in carbon stocks due to the transfer away from woodland is 2.93 × 10⁵ t, while the increased woodland is mainly transferred from farmland, which increases the carbon stock by 1.97 × 10⁶ t, accounting for 58.97% of the total increase in the carbon stock. The grassland area followed a pattern of "increase-rapid decrease", showing an overall trend of grassland degradation, but the transfer away from grassland increased the carbon storage by 5.66 × 10⁴ t. This is because the grassland is converted to woodland and farmland, with higher carbon densities, and the transfer area is large, resulting in an increase in carbon storage that is more significant than the decrease. Water bodies are a land type with relatively large changes in area, mainly being transferred from farmland and built land. The transfer away from water bodies increases the carbon stock by 8.89 × 10⁵ t, accounting for 26.61% of the total increase. Built land shows a trend of "decrease-rapid increase", and is mainly converted from farmland and water bodies. When farmland is converted to built land, it is not conducive to the increase in carbon reserves, with a total decrease of 1.68 × 10⁷ t, accounting for 92.23% of the total decrease.
Water bodies being converted to built land is conducive to increased carbon storage, with a total increase of 5.58 × 10⁴ t. The transfer away from built land increased the carbon storage by 1.52 × 10⁵ t, accounting for 4.59% of the total increase in carbon stock. This is because most of the built land is converted to water bodies and farmland; although the area converted to water bodies was larger, the carbon stock increased more when converted to farmland.
With the urbanization of Zhengzhou City, its ecological environment has been damaged, and the overall reduction in carbon stock in Zhengzhou City caused by land-use change between 2001 and 2020 is 1.48 × 10⁷ t. The carbon storage increase caused by the positive evolution of the ecosystem (farmland, grassland, and other land types being converted to woodland, and farmland and other land types being converted to grassland) is 2.22 × 10⁶ t, accounting for 66.64% of the total increase in carbon storage. The carbon stock reduction caused by the reverse evolution of the ecosystem (the transfer of woodland, farmland, and grassland to built land) is 1.71 × 10⁷ t, accounting for 94.11% of the total carbon stock reduction. In other words, the reduction in the carbon stock in Zhengzhou from 2001 to 2020 is mainly due to the conversion of land types with higher carbon density into built land, driven by reverse ecosystem evolution and the acceleration of urbanization.
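The figures reported above can be cross-checked with simple arithmetic. The sketch below uses only the values stated in this section and backs out the implied total increase and total decrease from each reported share; the helper name `implied_total` is ours, and the result is a rounding-level consistency check rather than a reproduction of Table 10.

```python
def implied_total(part, share_percent):
    """Total implied by a partial amount and its reported percentage share."""
    return part / (share_percent / 100.0)


# Positive evolution: 2.22e6 t, reported as 66.64% of the total increase.
total_increase = implied_total(2.22e6, 66.64)   # about 3.33e6 t

# Reverse evolution: 1.71e7 t, reported as 94.11% of the total decrease.
total_decrease = implied_total(1.71e7, 94.11)   # about 1.82e7 t

# Net change in carbon stock, 2001-2020.
net_change = total_increase - total_decrease    # about -1.48e7 t
```

The implied net change agrees with the reported overall reduction of 1.48 × 10⁷ t to three significant figures.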

Discussion
This paper aims to address the problems of the low resolution of Landsat images, the thin edges of objects, and the complex and variable contours of large targets, which lead to the omission of small objects, the misclassification of large targets, and low classification accuracy. As such, this paper proposes the R-MFNet: (1) A BN layer is added to the residual structure, which reduces internal covariate shift and prevents the problems of vanishing gradients and performance degradation that arise when the network is deepened. The shortcut connection is used to effectively fuse the low-level features and high-level features, so as to minimize the loss of effective information from the whole network. (2) To address the problems of the missing and inaccurate extraction of remote sensing image features by existing algorithms, the residual ASPP module is used as the network center to effectively extract multi-scale receptive field features from the feature map and to improve the ability of the model to classify objects of different sizes. (3) Two attention models are combined to effectively reduce the loss of feature information in the transmission process. The experiments show that this model has higher classification accuracy than the SVM model and the traditional residual network and provides a high-precision land-use classification map for subsequent carbon storage analysis. This is consistent with the research results of Han, who found that, compared to traditional residual networks, fusing multi-scale features can improve accuracy by 2-3% [45]. Elsewhere, Guo improved the attention mechanism by incorporating residual connections, which can enable the network to focus on important features in images and improve classification performance. Adding attention mechanisms to CNN networks can improve accuracy by 2-3%, and adding residual connections further improves accuracy [46].
In the proposed network, we also added an improved attention module using residual connections to the CNN network. Through ablation experiments, it was found that adding this module increased the Kappa coefficient (*100) of the experimental results by 2.18, which is similar to Guo's experimental results. Huan proposed a multi-scale feature fusion module that is different from the ASPP module, but its core premise is the extraction of features using residual connections and dilation convolutions with different expansion rates [47]. The network structure proposed in this article is centered around an improved ASPP module with residual connections. It is a pyramid structure that can extract multi-scale layered features and fully utilize original information. Through ablation experiments, we demonstrated its core position in classification tasks, and its contribution to the network is stronger than that of the residual attention module; due to its inclusion, the classification of large buildings and linear roads in images is more accurate.
Gao utilizes pre-activation residual networks and traditional residual networks to classify hyperspectral images, and concludes that the overall accuracy of the classification results using pre-activation residual networks is 0.2 to 0.5 percentage points higher than that of results obtained using traditional residual networks. Moreover, it is easier to generate better performance in training, as reaffirmed by our research results [48]. We designed a network containing only a single residual module to demonstrate the superiority of pre-activation residual networks, as shown in Table 11; the results in the table are the average of 10 experiments. From Table 11, it can be seen that, under the same training time and epoch, the loss of pre-activation is smaller. It has therefore been verified that its fitting speed is fast and its application value is high.

By analyzing the socio-economic development and land-use change factors in Zhengzhou from 2001 to 2020, we found that the main factors affecting the changes in carbon stock in Zhengzhou are the national ecological protection policies and urbanization in Zhengzhou. The state attaches great importance to ecological protection, has introduced many ecological protection policies, and has carried out many ecological protection projects. These ecological protection policies and measures have promoted land conservation and optimized regional land-use patterns. In particular, with the release and implementation of the policy of restoring farmland to woodland and grassland [49], significant changes have taken place in the types of land use in Zhengzhou from 2001 to 2020. The main impact of such projects is the shift from farmland to woodland and grassland, which subsequently led to marked changes in carbon storage. Deng showed that reforestation projects contributed many carbon sinks, which can offset some carbon emissions [50].
Shao concluded that, under the intensive green ecological conservation scenario in Beijing, more built land within the urban area is converted to other ecological land with higher carbon density values, and the area of the watershed also increases [51]. The conversion of farmland and grassland to built land is reduced, predicting a carbon stock of 16.39 × 10⁶ t in 2035, which is 7.5 × 10⁵ t higher than the lowest natural evolution scenario. Based on the research of Deng and Shao, we can surmise that, under the ecological protection policies, carbon storage will increase, which is consistent with our research results. Through experiments, we found that the policy of restoring farmland to forests and grassland in Zhengzhou has led to the positive evolution of land-use types, which accounts for about 66% of the total increase in carbon storage. Therefore, ecological projects, land management policies, and land-use changes caused by the ecological management of nature reserves can facilitate the maintenance of and increase in carbon stocks.
In addition, with the economic development and urbanization of Zhengzhou City, Zhengzhou's carbon reserves will continue to decrease, forcing the government to introduce some arable land protection policies to protect the most basic arable land from damage. Zhu showed that the rapid economic development of Zhejiang Province has resulted in a substantial encroachment of farmland and grassland, which led to a decrease in the carbon stock of 17.5 × 10⁹ t from 1990 to 2010 [52]. Lin concluded that, in the natural development scenario of Guangdong Province, the carbon stock of the coastal cities around the Pearl River Estuary will continue to decline, mainly because of the decrease in the area of farmland and the rapid increase in built land [53]. The research of Zhu and Lin shows that, with the rapid progress of urbanization, farmland has decreased across the board and built land has increased significantly, which is consistent with our research results. This is the main reason for the reduction in carbon storage, which is also consistent with our research results, because the reverse evolution of land use in Zhengzhou has reduced carbon storage, accounting for about 94% of the total reduction. Thus, we obtained the following result from our carbon storage research: the effect of land-use changes on ecosystem carbon stocks is twofold, depending on the transfer direction. Considering the gain in carbon storage under positive ecosystem evolution, it is clear that Zhengzhou should continue to implement and consolidate the relevant policies.
Different land types have different carbon density values, so it is particularly important to study whether the carbon density values used are accurate. Carbon density is influenced by multiple factors such as the hydrological conditions, climate, and soil type [54]. When existing studies use a model to assess carbon stocks, they are mostly based on the carbon density that comes with that model. They are modified by combining field surveys and related studies, which greatly improves the universality and scientific validity of the model [55]. Because the carbon density values of Zhengzhou cannot be determined, the carbon density of regions with similar geographical conditions to Zhengzhou was corrected. However, the carbon stock module assumes that the ecosystem carbon density data remain constant over time, but the actual results show that there are variations, so the use of constant carbon density data may lead to errors in the estimations. Therefore, to further improve the estimation accuracy, the monitoring and investigation of the carbon densities of different land types need to be strengthened in future studies to obtain long-term and dynamic observation data, in order to reduce the uncertainty caused by carbon density.

Conclusions
Based on the Landsat TM/OLI images of Zhengzhou City, R-MFNet was proposed to classify the land use of Zhengzhou City to obtain high-precision land-use data. In addition, some documents were collected to obtain the national and local carbon density data; by correcting these values, the carbon density values of Zhengzhou were obtained. The InVEST model was used to evaluate some of the impacts of land-use change on carbon stocks between 2001 and 2020, and the following conclusions can be drawn.
From 2001 to 2020, the land use in Zhengzhou changed significantly, with transfers between more or less every land type, resulting in increases in woodland, built land, water bodies, and other land, and decreases in farmland and grassland. Regarding the speed of the change in land-use types (the dynamic index), that for built land is the fastest, while that for the other land-use types is relatively slow.
According to the results of the InVEST model, the carbon stocks in Zhengzhou in 2001, 2009, and 2020 are 1.36 × 10⁸ t, 1.44 × 10⁸ t, and 1.21 × 10⁸ t, respectively, showing the trend of "increase-rapid decrease", with an overall decrease of 1.48 × 10⁷ t. The carbon stocks in the west and southwest of Zhengzhou are higher, while those in the central part are lower. The conversion of farmland and woodland to other types reduced carbon storage (other land uses are not considered here). A possible reason for this is that these land types are mainly converted to built land, which has a low carbon density. Therefore, the reduction in farmland and the expansion of built land are important reasons for the reduction in regional carbon reserves.
In the future, if the urbanization of Zhengzhou is not prevented, its carbon reserves will continue to decrease. In order to actively respond to the national "double-carbon goal", Zhengzhou should strictly implement the farmland protection policy, prevent the disorderly expansion of the city, coordinate ecological construction, and reasonably determine the scale of new built land. These measures will enable it to perform well in terms of ecological protection while maintaining high-quality development.