Article

An Approach for Multi-Source Land Use and Land Cover Data Fusion Considering Spatial Correlations

1 School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
2 Key Laboratory of Virtual Geographic Environment, Ministry of Education, Nanjing Normal University, Nanjing 210023, China
3 Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023, China
4 Smart Health Big Data Analysis and Location Services Engineering Research Center of Jiangsu Province, Nanjing 210003, China
* Author to whom correspondence should be addressed.
Remote Sens. 2025, 17(7), 1131; https://doi.org/10.3390/rs17071131
Submission received: 20 January 2025 / Revised: 20 March 2025 / Accepted: 20 March 2025 / Published: 22 March 2025

Abstract

As one of the foundational datasets in geographical information science, land use and land cover (LULC) data plays a crucial role in the study of human–environment interaction mechanisms, urban sustainable development, and other related issues. Although existing research has explored land use type recognition from remote sensing imagery, interpretation algorithms, and other perspectives, significant spatial discrepancies exist between these data products. Therefore, we introduced a multi-source LULC data integration approach that incorporates spatial dependencies, employing a fully connected neural network alongside geographical environmental variables to enhance the accuracy of land use data. The Yangtze River Delta was chosen as the case study area for method evaluation and validation. Our results show that the proposed method significantly improves land use classification accuracy. A comparative analysis from both global and category-specific perspectives revealed that the data product obtained exhibited notably higher overall accuracy, Kappa coefficient, and intersection over union compared to the China land cover dataset, the global 30 m fine land cover dynamic monitoring dataset, and the multi-period land use remote sensing monitoring dataset. Additionally, both the quantity and allocation disagreements of the fused LULC data were improved. The proposed multi-source land use data fusion method and its products can provide support and services for urban sustainable construction, resource management, and environmental monitoring and protection, demonstrating significant research value and importance.

Graphical Abstract

1. Introduction

The rate and spatial extent of human-induced alterations to the land surface, primarily manifested as land use and land cover (LULC) changes, are unprecedented and profoundly invasive [1,2]. These changes have fundamentally transformed a large part of the Earth’s terrestrial surface, influencing core elements of the Earth system [3]. As one of the key foundational datasets in information geography, the assessment, analysis, and monitoring of LULC data are thus crucial for revealing the mechanisms of Earth system evolution [4,5], improving the efficiency of land resource use [6,7], optimizing the allocation of urban land resources [8,9], identifying regional environmental issues [10], and formulating land management policies [3,11,12]. In recent decades, improving the accuracy of LULC data to promote in-depth research across various fields of geographical science has been a major research focus in information geography [13,14,15].
Remote sensing image interpretation is one of the primary methods for acquiring LULC data: images are collected from platforms such as satellites and aircraft, preprocessed through steps such as geometric and radiometric correction, and then interpreted using techniques such as image classification, feature extraction, and spatial analysis to extract LULC information and generate LULC datasets [16]. Research has shown that the key factors influencing the accuracy of data obtained through such methods are the choice of remote sensing images and the interpretation algorithms (technologies) [17,18]. Currently, the most commonly used remote sensing image sources are the Landsat, Sentinel, MODIS, and other satellites [19]. However, due to cloud cover caused by temporal variations, atmospheric scattering, and potential discrepancies in orbit, altitude, and observation angle between the different data sources, remote sensing images from different sources can differ in quality, which in turn impacts interpretation accuracy [20,21]. To address the constraints of relying on single-source remote sensing data, several studies have introduced multi-source fusion techniques at the pixel, feature, and decision levels, with the goal of boosting image coherence and information richness, ultimately improving classification accuracy and data reliability [22,23,24,25]. Numerous researchers have also applied interpretation methods, such as supervised classification, unsupervised classification, spectral indices, regression models, machine learning, and deep learning, to derive LULC information from different remote sensing images [3,26,27,28]. However, due to the inherent limitations of single methods, some researchers have proposed that integrating multiple methods can further enhance data accuracy. For instance, Chen et al. found that merging deep learning's robust feature extraction capabilities with the reliability of traditional classifiers enhanced the accuracy of identifying intricate land cover types [29]. By constructing large-scale labeled datasets and incorporating transfer learning techniques, Gao et al. applied pretrained models to data from specific regions, achieving efficient and highly accurate LULC classification; this strategy reduced the demand for labeled data while enhancing the model's generalization ability [30].
Although numerous LULC data products have been generated based on remote sensing images in existing studies, significant spatial differences in interpretation accuracy exist in the same regions due to variations in image data sources and interpretation algorithms [31,32,33]. In geography, researchers have introduced the concept of “geospatial correlation”, which shows the potential interdependence between variables observed within the same spatial domain, highlighting the possible relationships between data points within a shared geographical area [34]. When mapped to LULC data, this concept indicates that the spatial arrangement of different land use types is spatially correlated with other geographical factors. For example, cultivated land is often closely associated with natural factors, such as water sources, topography, and soil [35], whereas urban land development is influenced by factors including transportation networks, population density, and economic levels [36]. Therefore, from the perspective of spatial correlation, identifying the correct land use pixels in multi-source LULC data, and generating new LULC data products, is a feasible approach to improving the accuracy of LULC data.
This study aims to address the limitations of relying on single-source remote sensing data and single interpretation algorithms by proposing a novel approach to improve the accuracy of LULC data through multi-source data fusion. To achieve this, we (1) analyzed the spatial correlations between various land use types and geographical environmental factors; (2) developed a fully connected neural network (FCNN) model that integrates multi-source LULC data with geographical environmental variables to achieve data fusion; and (3) evaluated the performance of the proposed model using the Yangtze River Delta (YRD) as a case study. By leveraging the concept of geospatial correlation, we aimed to generate more accurate and reliable LULC data products, thereby providing a robust foundation for applications in Earth system evolution research, land resource management, and urban planning.

2. Methodology

As illustrated in Figure 1, the procedure followed in this study comprised three main components. Firstly, data acquisition and preprocessing were conducted. Three types of LULC data—the China land cover dataset (CLCD), the global land cover (GLC) 30 m high-resolution dynamic monitoring dataset, and the multi-period land use and cover change (LUCC) remote sensing monitoring dataset—were chosen for fusion. These data were reclassified based on land use type, and the geographical environmental factors associated with each type were analyzed spatially. By employing spatial overlay analysis, the corresponding land use types from the CLCD, GLC, and LUCC data for each visually interpreted point were identified. Subsequently, these points were divided into training and validation datasets. Secondly, a multi-source LULC data fusion model was constructed. Utilizing the training dataset as the input, the model was designed based on the principles of an FCNN. The neuron structure of the input layer was established, and parameters in the hidden layer, including the number of layers, neuron count, weights, and activation functions, were determined. The network structure of the output layer was also designed. Through optimization and iteration, the optimal model parameters were identified, thereby completing the construction of the data fusion model. Finally, multiple evaluation indicators were selected, and a validation dataset was deployed to assess the model’s effectiveness from various perspectives. The proposed model was then applied to fuse multi-source LULC data.
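The spatial overlay step described above, in which each visually interpreted point is matched with its CLCD, GLC, and LUCC land use types, can be illustrated with a short Python sketch. This is not the authors' code; rasterio is assumed as the raster I/O library, and the file names and point coordinates are placeholders.

```python
import rasterio

# Hypothetical sketch of the spatial overlay step: sampling the three LULC rasters
# at the visually interpreted points. Coordinates must be in each raster's CRS,
# and the rasters are assumed to share the 30 m grid described in the paper.
points = [(120.15, 31.30), (121.47, 31.23)]          # placeholder (lon, lat) pairs

labels = {}
for name in ("clcd", "glc", "lucc"):                 # placeholder file names
    with rasterio.open(f"{name}_2015_30m.tif") as src:
        labels[name] = [int(v[0]) for v in src.sample(points)]

# labels["clcd"], labels["glc"], labels["lucc"] now hold the land use codes at
# each interpretation point, ready to be split into training and validation sets.
```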

2.1. Study Area

The Yangtze River Delta (YRD) is situated in the eastern coastal area of China, between the East China Sea and the Yellow Sea, where it forms an alluvial plain at the point where the Yangtze River meets the sea. This area primarily encompasses Shanghai, Jiangsu, Anhui, and Zhejiang Provinces, and is recognized as one of the most economically vibrant and rapidly urbanized regions in China (Figure 2). The YRD features a diverse natural geography, containing rich ecosystems, such as rivers, lakes, grasslands, and forests. This diversity provides a complex scenario for LULC classification and fusion, fully demonstrating the advantages of multi-source data integration. At the same time, the land use in this region presents multiple demands, including high-density urban development, intensive agricultural production, and ecological protection. High-accuracy LULC data provide an important foundation for promoting regional economic integration, environmental protection, and sustainable development. Hence, we chose the YRD as the focal area for our multi-source LULC data fusion study.

2.2. Study Data

As shown in Table 1, the foundational data used in this study included two main categories—multi-source LULC data from 2015 and geographical environmental factor data. The LULC data were derived from current mainstream remote sensing monitoring products, specifically including the CLCD, GLC, and LUCC [37,38,39]. Additionally, we incorporated 13 geographical environmental factors. To ensure consistency in subsequent processing, all data were uniformly resampled to a resolution of 30 m. Visual interpretation combined with high-resolution remote sensing imagery was employed to manually annotate over 6000 sample points. To ensure spatial and categorical uniformity, the interpretation points were randomly generated based on the proportional distribution of different land use types across various cities in the study area.

2.3. Determination of Geographical Environment Factors

Three commonly used LULC datasets were used for the fusion—the CLCD, GLC, and LUCC. Based on the land use classification standards of these three LULC datasets, the new fused LULC data included seven land use categories—cultivated land, forest land, grassland, water area, construction land, bare land, and glaciers (Supplementary Materials).
The rational selection of geographical environmental factors was critical to ensuring the accuracy of the multi-source LULC data fusion. Previous studies have reported that cultivated land, grassland, and forest land exhibit significant spatial locational correlations with factors such as digital elevation model (DEM), slope, aspect, precipitation, and mean temperature [40,41,42]. The Normalized Difference Vegetation Index (NDVI), a spectral index for evaluating vegetation conditions, is a commonly used indicator for distinguishing these three land types [43]. Similarly, water bodies are closely associated with multiple factors [44]. Elevation data from DEMs help delineate watersheds and identify water bodies [45]. Precipitation, as a key source of water replenishment, has a dynamic interaction with the extent of water bodies [46], while surface reflectance has become a core indicator for monitoring the distribution of, and changes in, water bodies [47]. The allocation of construction land is intricately linked to human activities and is shaped by distance to cities, county centers, national highways, expressways, and railways. Nighttime light intensity serves as an important proxy for human activity levels [48,49]. Bare land is typically characterized by sparse vegetation, with the NDVI and surface reflectance effectively representing this feature [50]. The distribution of glaciers is influenced by multiple factors, including DEMs, distance to cities and county centers, surface reflectance, and mean temperature [51].
Building upon the preceding examination of each land use type, we ultimately selected 13 geographical environmental factors as variables for multi-source LULC data fusion—DEM, slope, aspect, distance to cities, distance to county centers, distance to national highways, distance to expressways, distance to railways, nighttime light intensity, precipitation, mean temperature, NDVI, and surface reflectance.

2.4. Construction of Land Use and Land Cover Data Fusion Model

In deep learning, FCNNs are widely used network structures, consisting of three hierarchical levels—the input layer, hidden layers, and the output layer—and characterized by their ability to enhance model learning capacity by capturing nonlinear relationships between layers [52]. Previous studies have shown that FCNNs offer notable benefits in LULC classification by efficiently capturing intricate spatial features and hidden patterns from the input data [53]. Therefore, we employed an FCNN to construct our multi-source LULC data fusion model.

2.4.1. Input Layer

In the input layer, the seven land use categories from each of the three data sources, together with the 13 geographical environmental factors, were each represented as a separate neuron, resulting in a total of 34 neurons (3 × 7 + 13) in the input layer. The input for the LULC categories was processed using one-hot encoding, which assigns a unique bit to each category, thereby avoiding errors caused by sequential ordering and ensuring equality among categories. For the geographical environmental factors, binarization and normalization were applied to standardize the data onto a unified scale. This preprocessing reduces the differences in feature scales and effectively minimizes computational resource requirements.
The input layer receives each raster cell as a vector:
$$X = (x_1, x_2, x_3, \ldots, x_{34})$$
where $x_1$ to $x_7$ represent the one-hot-encoded land use types of the CLCD within a raster cell, corresponding to cultivated land, forest land, grassland, water area, construction land, bare land, and glaciers. Their values are binary: if the raster cell in the CLCD belongs to a specific land use type, the corresponding element is 1, while the other elements are 0. Similarly, $x_8$ to $x_{14}$ and $x_{15}$ to $x_{21}$ represent the land use types in the GLC and LUCC datasets, respectively, with their values determined in the same manner as for the CLCD; and $x_{22}$ to $x_{34}$ are the normalized values of the geographical environmental factors, representing the influence intensity of each factor.
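As an illustration of this encoding, the following Python sketch (a hypothetical helper, not the authors' implementation) assembles the 34-element input vector for a single raster cell from the three LULC labels and 13 normalized environmental factors.

```python
import numpy as np

N_CLASSES = 7  # cultivated, forest, grassland, water, construction, bare land, glacier

def one_hot(label: int, n_classes: int = N_CLASSES) -> np.ndarray:
    """Return a one-hot vector for a land use label coded 0..n_classes-1."""
    v = np.zeros(n_classes, dtype=np.float32)
    v[label] = 1.0
    return v

def build_input_vector(clcd_label, glc_label, lucc_label, env_factors):
    """env_factors: the 13 environmental values, already normalized to [0, 1]."""
    env = np.asarray(env_factors, dtype=np.float32)
    assert env.shape == (13,)
    return np.concatenate([one_hot(clcd_label), one_hot(glc_label),
                           one_hot(lucc_label), env])  # shape (34,)

# Example: a cell labelled cultivated land (0) in CLCD, forest (1) in GLC,
# cultivated land (0) in LUCC, with 13 placeholder normalized factor values.
x = build_input_vector(0, 1, 0, np.random.rand(13))
print(x.shape)  # (34,)
```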

2.4.2. Hidden Layer

The accuracy of an FCNN is greatly influenced by the number of hidden layers and neurons chosen during its construction. For multi-classification problems, excessively deep networks may lead to overfitting and make the model difficult to converge, while a single layer can increase the difficulty of training the model [54]. Therefore, we employed two hidden layers. Furthermore, the number of neurons also affects the final training accuracy. Based on an evaluation formula proposed in previous work [55], we established the numbers of neurons for the two hidden layers, $N_a$ and $N_b$:
$$N_h = \frac{N_s}{\alpha \times (N_i + N_o)}$$
where $N_h$ represents the number of neurons in a hidden layer; $N_s$ is the number of samples in the training set; $\alpha$ is an arbitrary scaling factor; $N_i$ is the number of neurons in the input layer; and $N_o$ is the number of neurons in the output layer. Based on this method, the numbers of neurons in the two hidden layers were determined to be 64 and 32, respectively.
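A small Python sketch of this sizing rule follows; the training sample count and the two $\alpha$ values below are illustrative assumptions chosen only to show how layer sizes close to the reported 64 and 32 can be obtained.

```python
# Rule-of-thumb layer sizing N_h = N_s / (alpha * (N_i + N_o)); illustrative only.
def hidden_neurons(n_samples: int, n_in: int, n_out: int, alpha: float) -> int:
    return round(n_samples / (alpha * (n_in + n_out)))

# Assuming roughly 4200 training points (70% of ~6000), 34 inputs and 7 outputs,
# alpha values of about 1.6 and 3.2 yield the reported layer sizes.
print(hidden_neurons(4200, 34, 7, 1.6))   # ~64
print(hidden_neurons(4200, 34, 7, 3.2))   # ~32
```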
The weights and biases jointly determine the linear transformation of each neuron in the hidden layers, while the activation function, by adding nonlinearity, allows the network to model more intricate nonlinear relationships. This enables the network to effectively learn hidden patterns and characteristics across various land use types. For example, the output $y_j^k$ of the $j$-th neuron in the $k$-th layer is connected to the $n$ neurons in the previous layer as follows:
$$y_j^k = f\left(\sum_{i=1}^{n} y_i^{k-1} \times w_{ij}^k + b_j^k\right)$$
where $w_{ij}^k$ represents the weight of the connection from the $i$-th neuron in the $(k-1)$-th layer to the $j$-th neuron in the $k$-th layer; $b_j^k$ denotes the bias of the $j$-th neuron in the $k$-th layer; and $f$ is the activation function:
$$\mathrm{ReLU}(y) = \max(0, y)$$
where the ReLU activation function is employed, improving the efficiency of gradient descent and backpropagation while effectively preventing problems such as gradient explosion and vanishing gradients [56].

2.4.3. Output Layer

In the neural network, the output layer consisted of seven neurons. We selected the Softmax method, which is suitable for multi-class classification tasks, to calculate the output value of each neuron. The results were then output in the form of a vector, Y.
$$Y = (y_1, y_2, y_3, \ldots, y_7)$$
where $y_1$ to $y_7$ represent the predicted probability values for the seven land use types. The land use type with the highest probability was chosen as the model's predicted result.
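A minimal sketch of the resulting network structure (34 inputs, hidden layers of 64 and 32 neurons with ReLU activations, and a seven-class Softmax output) is given below. The paper does not name an implementation framework, so PyTorch is assumed here purely for illustration.

```python
import torch
import torch.nn as nn

class LULCFusionNet(nn.Module):
    """Illustrative 34 -> 64 -> 32 -> 7 fusion network, not the authors' code."""
    def __init__(self, n_in: int = 34, n_h1: int = 64, n_h2: int = 32, n_classes: int = 7):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(n_in, n_h1), nn.ReLU(),
            nn.Linear(n_h1, n_h2), nn.ReLU(),
            nn.Linear(n_h2, n_classes),        # raw class scores (logits)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.backbone(x)                # logits; softmax applied at prediction time

    @torch.no_grad()
    def predict_proba(self, x: torch.Tensor) -> torch.Tensor:
        # Softmax turns the logits into the probability vector Y = (y_1, ..., y_7)
        return torch.softmax(self.forward(x), dim=-1)

model = LULCFusionNet()
probs = model.predict_proba(torch.rand(5, 34))   # 5 cells -> (5, 7) probabilities
pred = probs.argmax(dim=-1)                      # most probable land use type per cell
```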

2.4.4. Model Optimization and Iteration

In deep learning, the loss function and optimizer are key components. The cross-entropy loss function is a widely used metric that measures the disparity between the model’s predictions and the actual labels. In multi-class classification tasks, the cross-entropy loss function effectively evaluates the model’s accuracy, making it a popular choice for optimizing model parameters [57]. The adaptive moment estimation optimizer adjusts the learning rate for each parameter in real time by incorporating first-order moment estimation (the gradient mean) and second-order moment estimation (the gradient variance). This allows different gradients to be weighted differently, enabling the neural network to converge quickly and stably to the optimal solution with a stable learning rate [58]. Therefore, we selected these two components for the FCNN.
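An illustrative training step combining these two components (cross-entropy loss and the Adam optimizer) is sketched below; the learning rate, batch size, and epoch count are assumptions for the sketch, not the authors' settings, and the data are random stand-ins for the training split.

```python
import torch
import torch.nn as nn

# Simple stand-in for the fusion network described above.
model = nn.Sequential(nn.Linear(34, 64), nn.ReLU(),
                      nn.Linear(64, 32), nn.ReLU(),
                      nn.Linear(32, 7))
criterion = nn.CrossEntropyLoss()                         # expects logits + integer labels
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3) # adaptive moment estimation

# Dummy batch standing in for the training split of the interpretation points.
x_batch = torch.rand(256, 34)
y_batch = torch.randint(0, 7, (256,))

for epoch in range(100):                  # iterate until the loss stabilizes
    optimizer.zero_grad()
    loss = criterion(model(x_batch), y_batch)
    loss.backward()                       # backpropagation through the FCNN
    optimizer.step()                      # Adam update of weights and biases
```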

2.5. Accuracy Verification

Here, 70% of the dataset was designated as the training set, with the remaining 30% used for validation. Stratified sampling was applied so that the class distribution in the training and validation sets remained similar to that of the original dataset, thereby enhancing the model's ability to generalize across different land use types. Scholars have identified the class boundary uncertainty issue, referring to the mismatch between true class boundaries and interpreted boundaries [59,60]. Although the accuracy calculation in this study is based on visual interpretation points and does not directly involve boundary uncertainty, classification errors at visual interpretation points may be indirectly attributed to the quantity uncertainty and spatial uncertainty inherent in boundary uncertainty. Therefore, this research employs a comprehensive evaluation framework comprising the Kappa coefficient, user's accuracy, producer's accuracy, overall accuracy (OA), quantity disagreement, allocation disagreement, and intersection over union (IoU) to evaluate and analyze the effectiveness of the model [61,62,63].
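The stratified 70/30 split described above can be sketched as follows (assuming scikit-learn; the arrays X and y are placeholders standing in for the interpretation-point feature vectors and labels).

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 6000 interpretation points with 34-element input vectors
# and land use labels coded 0..6.
X = np.random.rand(6000, 34)
y = np.random.randint(0, 7, size=6000)

# stratify=y keeps the class proportions similar in the training and validation sets.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)
```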
The Kappa coefficient is commonly used to measure the effectiveness of classification and can effectively demonstrate the consistency of the validation:
$$Kappa = \frac{p_o - p_e}{1 - p_e}$$
where $p_o$ denotes the number of pixels whose predicted outcomes align with the actual results, divided by the total number of pixels in the study region (i.e., the observed accuracy); and $p_e$ is the sum, over all land use categories, of the product of each category's true and predicted counts, divided by the square of the total number of pixels in the study area.
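For reference, the Kappa coefficient can be computed from a confusion matrix as in the following sketch (rows are reference labels, columns are predictions; an illustration, not the authors' code).

```python
import numpy as np

def kappa(cm: np.ndarray) -> float:
    """Cohen's Kappa from a k x k confusion matrix of counts."""
    n = cm.sum()
    p_o = np.trace(cm) / n                                  # observed agreement
    p_e = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2    # chance (expected) agreement
    return (p_o - p_e) / (1 - p_e)
```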
Although the Kappa coefficient serves as an important metric for assessing model accuracy, it is insufficient for a comprehensive evaluation of the classification process of land use data. Therefore, we incorporated the user’s accuracy and the producer’s accuracy to supplement the assessment. Producer’s accuracy measures how many interpretation points of a specific land cover category are correctly classified by the model, reflecting the model’s sensitivity to that particular category. User’s accuracy evaluates how many of the model-classified land cover points are actually correct, primarily assessing the model’s classification reliability.
$$\text{User's Accuracy} = \frac{TP}{TP + FP}$$
$$\text{Producer's Accuracy} = \frac{TP}{TP + FN}$$
where TP (true positive) represents the number of pixels correctly classified as a given category; FN (false negative) denotes the number of pixels that actually belong to the category but were classified as another category; and FP (false positive) refers to the number of pixels classified as the category that actually belong to another category.
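The per-category user's and producer's accuracies follow directly from the same confusion matrix layout, as in this illustrative sketch.

```python
import numpy as np

def users_producers_accuracy(cm: np.ndarray):
    """Per-class accuracies from a confusion matrix (rows: reference, cols: predicted)."""
    tp = np.diag(cm).astype(float)
    users = tp / cm.sum(axis=0)      # TP / (TP + FP): reliability of the classified map
    producers = tp / cm.sum(axis=1)  # TP / (TP + FN): sensitivity to each category
    return users, producers
```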
The OA represents the ratio of correctly classified samples to the total number of samples, serving as a key metric for evaluating classification performance:
$$OA = \frac{True}{All}$$
where True represents the number of pixels where the predicted results match the actual results; and All represents the total number of pixels in the study area.
To further investigate the factors influencing the classification accuracy of the model, we selected quantity disagreement and allocation disagreement as error evaluation metrics. Quantity disagreement refers to the discrepancy between the class quantities in visually interpreted land cover classification results and ground truth interpretation results. This reflects errors in quantitative distribution across land cover categories, which can identify phenomena of overestimation or underestimation in classification quantities. Allocation disagreement manifests as the spatial distribution mismatch between visually interpreted land cover classifications and ground truth results, indicating spatial allocation errors in classification outcomes that reveal locational misclassifications:
$$q_g = \left| \sum_{i=1}^{n} p_{ig} - \sum_{j=1}^{n} p_{gj} \right|$$
$$QD = \frac{\sum_{g=1}^{k} q_g}{2}$$
where $p_{ig}$ represents the number of samples that actually belong to category $g$ in the visual interpretation but are predicted as other categories; $p_{gj}$ denotes the number of samples classified as category $g$ in the prediction results but that actually belong to other categories; $n$ corresponds to the total number of samples meeting the requirements; and $k$ is the total number of categories.
$$v_g = 2 \times \min\left[ \left( \sum_{i=1}^{n} p_{ig} - p_{gg} \right),\ \left( \sum_{j=1}^{n} p_{gj} - p_{gg} \right) \right]$$
$$AD = \frac{\sum_{g=1}^{k} v_g}{2}$$
where $p_{gg}$ represents the number of correctly classified samples; $\min$ takes the minimum of the omission error and the commission error; $n$ corresponds to the total number of samples meeting the requirements; and $k$ is the total number of categories.
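The two disagreement measures can be computed from the validation confusion matrix as in the following sketch, which follows the $q_g$ and $v_g$ formulas above (an illustration, not the authors' implementation).

```python
import numpy as np

def quantity_allocation_disagreement(cm: np.ndarray):
    """Quantity disagreement (QD) and allocation disagreement (AD), in percent.

    cm: confusion matrix of sample counts (rows: reference, columns: predicted).
    """
    cm = cm.astype(float) / cm.sum()                 # convert counts to proportions
    ref_totals, pred_totals = cm.sum(axis=1), cm.sum(axis=0)
    diag = np.diag(cm)
    q = np.abs(ref_totals - pred_totals)                        # per-class quantity error
    v = 2 * np.minimum(ref_totals - diag, pred_totals - diag)   # per-class allocation error
    return 100 * q.sum() / 2, 100 * v.sum() / 2                 # QD %, AD %
```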
IoU is defined as a metric that quantifies the spatial overlap between land cover categories in visually interpreted classification results and ground truth reference data. It measures the similarity between datasets by calculating the ratio of intersection to union, with specific emphasis on evaluating overall spatial accuracy of classification outputs.
$$IoU_i = \frac{\left| A_i \cap B_i \right|}{\left| A_i \cup B_i \right|}$$
where $IoU_i$ represents the overlap between the $i$-th land use category in the LULC data and the visual interpretation points; $A_i$ denotes the spatial distribution of the $i$-th land use category in the LULC data; and $B_i$ is the spatial distribution of the $i$-th land use category in the visual interpretation points.
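A per-category IoU can be computed from the reference and predicted labels of the validation points as in this short sketch (illustrative only).

```python
import numpy as np

def per_class_iou(y_true: np.ndarray, y_pred: np.ndarray, n_classes: int = 7):
    """IoU per land use category from reference and predicted label arrays."""
    ious = []
    for c in range(n_classes):
        inter = np.sum((y_true == c) & (y_pred == c))
        union = np.sum((y_true == c) | (y_pred == c))
        ious.append(inter / union if union else float("nan"))  # NaN if class absent
    return np.array(ious)
```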

3. Results

3.1. Results of the Multi-Source LULC Data Fusion

Based on the FCNN, and integrating multi-source LULC data with geographical environmental factors, we generated the land use type product for the YRD region in 2015. As illustrated in Figure 3, the land use in the YRD shows a clear north–south distribution trend. The area along the Yangtze River and in its northern regions is primarily dominated by agricultural cropland, while the southern low mountain and hilly areas are mainly covered by forest land. Construction areas are predominantly distributed in point and belt patterns, with construction land being particularly dense along the Yangtze River, in Zhejiang Province, and in the intensive development zones of Shanghai. Water bodies are concentrated in the river-network-rich plains, including Taihu Lake, Yangcheng Lake, and numerous rivers and ponds, displaying a typical “water town” pattern. Grasslands and bare land are very limited in scale. Due to its location in a subtropical monsoon climate zone and given the absence of conditions conducive to glacier formation, the YRD does not have any glaciers.

3.2. Comparison and Analysis of Fused Data

3.2.1. Spatial Comparison and Analysis

Figure 3 and Figure 4 present local comparisons of the different LULC data. From the local area in the lower left corner of Figure 3, it can be seen that the fused LULC data have significantly better classification accuracy for water areas and forest land than the other datasets. In Figure 3, the boundaries between water bodies and forests are clearer and more accurate, especially at the interface between the water bodies and the surrounding forests, avoiding the classification noise and boundary uncertainty caused by spatial resolution limitations. The forest classification in Figure 3 is also precise, with uniform color distribution and no obvious misclassifications or blurred areas in which water bodies or bare land are mistakenly classified as forest. In contrast, Figure 4a,c shows lower classification accuracy in this region, with less distinct water body boundaries, leading to broader transitional zones between different land use types.
In the local area in the upper right corner of Figure 3, a significant advantage in classifying urban roads, water bodies, and forest land can be seen. Figure 3 illustrates the ability to delineate urban roads with greater accuracy, particularly in densely urbanized areas, where more accurate road details can be seen compared to Figure 4c. In terms of water body and forest classification, Figure 3 also shows higher accuracy, especially in the transitional zone between water bodies and forests, clearly showing natural boundaries. Compared to Figure 4b, Figure 3 offers a more detailed classification of water bodies and forests, and is closer to the remote sensing image (Figure 4d), providing a more realistic reflection of the land use status on the ground.

3.2.2. Area Comparison and Analysis

Figure 5 illustrates the distribution of each land use type across various LULC datasets. Due to the very small amount of grassland and bare land in the YRD region, we mainly focused on construction land, cultivated land, forest land, and water areas.
Firstly, it can be seen that the land use patterns in the different data sources are similar. Land use in the YRD is primarily characterized by cultivated land and forest land, which together make up approximately 75% of the total area, showcasing a distinct pattern of agricultural and ecological balance. Secondly, when analyzing the area of each land use type across the various data sources, it is evident that the fused data represent an intermediate value between the areas of land use types in the CLCD, GLC, and LUCC datasets. For instance, the area of construction land in the fused data is less than that in the LUCC dataset, but greater than that in the CLCD and GLC datasets.

3.2.3. Accuracy Comparison and Analysis

As shown in Table 2, we compared the accuracy of the multi-source LULC data and the fused LULC data from both global and land use category perspectives. From a global viewpoint, the fused data achieved an OA of 0.869 and a Kappa coefficient of 0.813, which are significantly higher than the accuracy of the CLCD, LUCC, and GLC data, with improvements of 9–20% for OA and 12–29% for Kappa.
In the fused LULC data, the accuracy for each land use category is higher than that of the original data, especially for grassland and bare land, where the Kappa coefficients increase by 13–50% and 36–74%, respectively. The Kappa coefficients for construction land, cultivated land, and forest land also improved by 12–27%, 11–27%, and 4–26%, respectively. An analysis of the accuracy of each land use category in the fused data shows that forest land and grassland achieved the highest accuracy, with Kappa coefficients nearing 0.9. The accuracy of construction land was the lowest, at only 0.7. Overall, the producer’s and user’s accuracy of the proposed model are significantly superior to those of other data sources, showing improvements of approximately 18% and 10%, respectively. This indicates substantial enhancements in the reliability of the classification results and the model’s ability to identify individual land cover types. Particularly for single categories, such as bare land and construction land, the improvement in classification accuracy was notably significant, further validating the effectiveness of the auxiliary factors introduced into the model for enhancing the classification accuracy of these land cover types. Additionally, by comparing the IoU across various land use categories, the advantages of the proposed model became more evident. Whether for individual land cover types or the overall classification results, the fused model demonstrates significantly higher IoU values than the source data, with an overall improvement exceeding 20%. These results demonstrate that the proposed model can more accurately delineate the spatial distribution of various land cover types, fully highlighting its superiority in integrating multi-source data and improving classification accuracy.
From our results (Figure 6), the proposed model exhibits the lowest quantity disagreement, at only 4.473%—significantly lower than the GLC data (19.866%), CLCD data (20.728%), and LUCC data (21.559%). This indicates that the proposed model achieved higher accuracy in the overall quantity distribution of land use types, more accurately reflecting the actual distribution of land categories. In terms of allocation disagreement, the proposed model also performed the best, with a value of only 10.783%, which is notably lower than the other data sources, such as LUCC (17.199%), GLC (17.407%), and CLCD (17.758%). This further demonstrates that the fused data achieves higher accuracy in spatial distribution, more accurately capturing the spatial characteristics of land categories.

4. Discussion

The CLCD, GLC, and LUCC are the three most commonly used LULC data products in geographical studies. The CLCD data integrates Landsat and Sentinel data and time-series surface parameters, using machine learning and time-series analysis methods to identify surface information [64]. Landsat and Sentinel are optical sensors, which suffer from cloud coverage issues, leading to data loss and reduced interpretation accuracy [65]. Although the fusion of synthetic aperture radar data can overcome this problem, the effect is not very significant [66]. Due to the complex land use structure around urban areas, classification bias is likely to occur in these regions [67]. Additionally, the generalization capability of this method is limited, and there are constraints in defining mixed agricultural–forestry–water transition zones [68]. Existing studies have found that the GLC dataset has lower classification accuracy for land covers with similar optical characteristics, and it is difficult to define construction land in areas with low building density [69]. Furthermore, it is challenging to identify land use types specific to certain regions (e.g., mixed agroforestry systems) [70]. In the LUCC dataset, the distribution of “complex features” in urban areas often leads to poor interpretation accuracy [71]. Additionally, different land cover types (such as construction, vegetation, and bare land) often experience spectral mixing at the pixel scale, especially with low-resolution imagery (such as MODIS), which leads to reduced classification accuracy in the LUCC dataset [72].
The improvement in accuracy across all land use categories in this study demonstrates that multi-source data fusion, considering geospatial correlation, is an effective way of enhancing the accuracy of land use products. To further improve the efficiency of the model, we analyzed the sources of classification errors in the fused land use data. The dynamic monitoring and analysis of construction land is critical for obtaining a real-time understanding of land changes and promoting intensive and efficient land use [73]. Although the classification accuracy of construction land was improved through the proposed method, the Kappa coefficient remained at only around 0.7, which is the lowest among all land use categories. Therefore, we used this as an example for error analysis. Firstly, although construction land has the lowest accuracy among all fused data, it also has the lowest accuracy in the three LULC datasets (CLCD, GLC, and LUCC). The accuracy of construction land after fusion increased by 12–27%. The construction land category exhibits notably low IoU values across three benchmark datasets—CLCD (0.501), GLC (0.399), and LUCC (0.495). Through data fusion implementation, the IoU for construction land demonstrates a significant enhancement to 0.580, quantitatively validating the model’s improved capability in spatial allocation accuracy for this critical land cover type. Therefore, the low accuracy of construction land is primarily due to the high errors in the original data, while the proposed model can moderately enhance the classification accuracy of construction land. Overall, the accuracy of the data fusion in this model depends on the accuracy of the original fused data.
Additionally, the method proposed in this study is applicable to areas with two or more contemporaneous LULC datasets. Firstly, because the proposed method involves multi-source data fusion, having two LULC datasets is a necessary condition for implementing the fusion. Secondly, although we utilized three datasets (CLCD, GLC, and LUCC), the neuron structure of the input layer can be adjusted based on the input data. For instance, when only two LULC datasets are fused, the input layer would consist of only 14 land neurons. The parameters of the hidden layer are automatically determined based on the proposed method, while the parameters of the output layer remain unchanged.
Although our proposed multi-source LULC data fusion method can effectively improve land use classification accuracy, there are still some tasks and issues that need further refinement and resolution. Firstly, based on the quantity and allocation disagreement comparison (Figure 6), although we achieved improvements in both quantity accuracy and spatial accuracy, the reduction in allocation disagreement remains relatively modest. Therefore, further enhancement of the spatial allocation accuracy of various land use types is one of the key issues to be addressed in the future. Secondly, based on 13 geographical environmental factors, we obtained high-precision LULC data for the YRD in 2015 by integrating the CLCD, GLC, and LUCC datasets. The next step will be to analyze the contribution of different geographical environmental factors to the accuracy improvement of various land use categories, as well as the sensitivity of the fusion results to the resolution of geographical environmental factor data. This will further enhance the operational efficiency and accuracy of the model. At the same time, there is a need to produce long-term and large-scale LULC data to support the understanding of long-term processes and mechanisms in the Earth system [74]. Last but not least, we developed a multi-source LULC data fusion model based on an FCNN, but other neural network models, such as convolutional neural networks, recurrent neural networks, and generative adversarial networks, have also shown their respective advantages [75]. Therefore, exploring and applying these alternative neural network architectures in the future could further optimize the model's performance.

5. Conclusions

In conclusion, we used an FCNN combined with geographical environmental factors to develop a multi-source LULC data fusion method and generate 30 m resolution LULC data for the YRD region in 2015. The results indicate that the LULC data generated through this method achieved an OA of 0.869, a Kappa coefficient of 0.813, and an IoU score of 0.718. Compared to the CLCD, LUCC, and GLC datasets, the fused LULC data showed improvements in both OA and class-specific accuracy metrics. This demonstrates that the LULC data generated in this study exhibits significant advantages, both from a global perspective and in terms of individual land use categories. Additionally, both the quantity and allocation disagreements of the fused LULC data were improved, with the former showing the most notable enhancement.
The proposed model is applicable to study areas with two or more contemporaneous LULC datasets, and the accuracy of the fused data primarily depends on the accuracy of the original LULC datasets being integrated. Furthermore, the results indicate that there is still room for improvement in spatial allocation consistency. The model and data developed in this study can provide data support for ecological and environmental sciences, climate change research, and spatiotemporal dynamic analysis, as well as supporting land spatial planning, land management, and global sustainable development policy formulation, making it of significant research value and importance.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/rs17071131/s1, Table S1. Mapping relationship of land classification categories of fused LULC, CLCD, GLC and LUCC data.

Author Contributions

Conceptualization, J.Y.; methodology, J.Y. and Y.S.; data acquisition, J.Y. and Y.J.; software, Q.S.; formal analysis, J.Y. and Y.H.; writing and review, J.Y. and Y.S.; editing, Z.W.; visualization, J.Y. and K.L.; supervision, Y.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (NO. 42301483, 42371408), the Natural Science Foundation of Jiangsu Province (NO. BK20230372), the Natural Science Research of Jiangsu Higher Education Institutions of China (NO. 22KJB170018), and the China Postdoctoral Science Foundation (NO. 2024M751464).

Data Availability Statement

The data used in this paper are available upon request from the corresponding author via email.

Conflicts of Interest

No potential conflicts of interest were reported by the author(s).

References

  1. Tan, J.; Yu, D.; Li, Q.; Tan, X.; Zhou, W. Spatial relationship between land-use/land-cover change and land surface temperature in the Dongting Lake area, China. Sci. Rep. 2020, 10, 9245. [Google Scholar] [CrossRef]
  2. Ramzan, M.; Saqib, Z.A.; Hussain, E.; Khan, J.A.; Nazir, A.; Dasti, M.Y.S.; Ali, S.; Niazi, N.K. Remote sensing-based prediction of temporal changes in land surface temperature and land use-land cover (LULC) in urban environments. Land 2022, 11, 1610. [Google Scholar] [CrossRef]
  3. Wang, J.; Bretz, M.; Dewan, M.A.A.; Delavar, M.A. Machine learning in modelling land-use and land cover-change (LULCC): Current status, challenges and prospects. Sci. Total Environ. 2022, 822, 153559. [Google Scholar] [CrossRef] [PubMed]
  4. Chen, B.; Xu, B.; Gong, P. Mapping essential urban land use categories (EULUC) using geospatial big data: Progress, challenges, and opportunities. Big Earth Data 2021, 5, 410–441. [Google Scholar] [CrossRef]
  5. Ye, J.; Hu, Y.; Zhen, L.; Wang, H.; Zhang, Y. Analysis on Land-Use Change and its driving mechanism in Xilingol, China, during 2000–2020 using the google earth engine. Remote Sens. 2021, 13, 5134. [Google Scholar] [CrossRef]
  6. Zhang, H.; Zheng, J.; Hunjra, A.I.; Zhao, S.; Bouri, E. How does urban land use efficiency improve resource and environment carrying capacity? Socio-Econ. Plan. Sci. 2024, 91, 101760. [Google Scholar] [CrossRef]
  7. Koroso, N.H.; Lengoiboni, M.; Zevenbergen, J.A. Urbanization and urban land use efficiency: Evidence from regional and Addis Ababa satellite cities, Ethiopia. Habitat Int. 2021, 117, 102437. [Google Scholar] [CrossRef]
  8. Bodhankar, S.; Gupta, K.; Kumar, P.; Srivastav, S.K. GIS-based multi-objective urban land allocation approach for optimal allocation of urban land uses. J. Indian Soc. Remote Sens. 2022, 50, 763–774. [Google Scholar] [CrossRef]
  9. Mohammadyari, F.; Tavakoli, M.; Zarandian, A.; Abdollahi, S. Optimization land use based on multi-scenario simulation of ecosystem service for sustainable landscape planning in a mixed urban-Forest watershed. Ecol. Model. 2023, 483, 110440. [Google Scholar] [CrossRef]
  10. MohanRajan, S.N.; Loganathan, A.; Manoharan, P. Survey on Land Use/Land Cover (LU/LC) change analysis in remote sensing and GIS environment: Techniques and Challenges. Environ. Sci. Pollut. Res. 2020, 27, 29900–29926. [Google Scholar] [CrossRef]
  11. Lippe, M.; Rummel, L.; Günter, S. Simulating land use and land cover change under contrasting levels of policy enforcement and its spatially-explicit impact on tropical forest landscapes in Ecuador. Land Use Policy 2022, 119, 106207. [Google Scholar] [CrossRef]
  12. Chao, W.; Yu, Y.; Fanzong, G. Using street view images to examine the association between human perceptions of locale and urban vitality in Shenzhen, China. Sustain. Cities Soc. 2023, 88, 104291. [Google Scholar] [CrossRef]
  13. Qu, L.A.; Chen, Z.; Li, M.; Zhi, J.; Wang, H. Accuracy improvements to pixel-based and object-based lulc classification with auxiliary datasets from Google Earth engine. Remote Sens. 2021, 13, 453. [Google Scholar] [CrossRef]
  14. Wang, H.; Yan, H.; Hu, Y.; Xi, Y.; Yang, Y. Consistency and accuracy of four high-resolution LULC datasets—Indochina Peninsula case study. Land 2022, 11, 758. [Google Scholar] [CrossRef]
  15. Dash, P.; Sanders, S.L.; Parajuli, P.; Ouyang, Y. Improving the accuracy of land use and land cover classification of landsat data in an agricultural watershed. Remote Sens. 2023, 15, 4020. [Google Scholar] [CrossRef]
  16. Vivekananda, G.N.; Swathi, R.; Sujith, A.V.L.N. RETRACTED ARTICLE: Multi-temporal image analysis for LULC classification and change detection. Eur. J. Remote Sens. 2021, 54 (Suppl. S2), 189–199. [Google Scholar] [CrossRef]
  17. Richards, J.A.; Richards, J.A. Remote Sensing Digital Image Analysis; Springer: Berlin/Heidelberg, Germany, 2022; Volume 5, pp. 256–258. [Google Scholar]
  18. Kotaridis, I.; Lazaridou, M. Remote sensing image segmentation advances: A meta-analysis. ISPRS J. Photogramm. Remote Sens. 2021, 173, 309–322. [Google Scholar] [CrossRef]
  19. Song, X.P.; Huang, W.; Hansen, M.C.; Potapov, P. An evaluation of Landsat, Sentinel-2, Sentinel-1 and MODIS data for crop type mapping. Sci. Remote Sens. 2021, 3, 100018. [Google Scholar] [CrossRef]
  20. Sun, X.; Wang, B.; Wang, Z.; Li, H.; Li, H.; Fu, K. Research progress on few-shot learning for remote sensing image interpretation. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 2387–2402. [Google Scholar] [CrossRef]
  21. Tian, L.; Cao, Y.; He, B.; Zhang, Y.; He, C.; Li, D. Image enhancement driven by object characteristics and dense feature reuse network for ship target detection in remote sensing imagery. Remote Sens. 2021, 13, 1327. [Google Scholar] [CrossRef]
  22. Wang, Z.; Ma, Y.; Zhang, Y. Review of pixel-level remote sensing image fusion based on deep learning. Inf. Fusion 2023, 90, 36–58. [Google Scholar] [CrossRef]
  23. Lyu, P.; He, L.; He, Z.; Liu, Y.; Deng, H.; Qu, R.; Wang, J.; Zhao, Y.; Wei, Y. Research on remote sensing prospecting technology based on multi-source data fusion in deep-cutting areas. Ore Geol. Rev. 2021, 138, 104359. [Google Scholar] [CrossRef]
  24. Mohammadpour, P.; Viegas, C. Applications of Multi-Source and Multi-Sensor Data Fusion of Remote Sensing for Forest Species Mapping. In Advances in Remote Sensing for Forest Monitoring; John Wiley & Sons Ltd.: Hoboken, NJ, USA, 2022; pp. 255–287. [Google Scholar] [CrossRef]
  25. Li, R.; Zhou, M.; Zhang, D.; Yan, Y.; Huo, Q. A survey of multi-source image fusion. Multimed. Tools Appl. 2024, 83, 18573–18605. [Google Scholar] [CrossRef]
  26. Vali, A.; Comai, S.; Matteucci, M. Deep learning for land use and land cover classification based on hyperspectral and multispectral earth observation data: A review. Remote Sens. 2020, 12, 2495. [Google Scholar] [CrossRef]
  27. Macarringue, L.S.; Bolfe, É.L.; Pereira, P.R.M. Developments in land use and land cover classification techniques in remote sensing: A review. J. Geogr. Inf. Syst. 2022, 14, 1–28. [Google Scholar] [CrossRef]
  28. Ali, K.; Johnson, B.A. Land-use and land-cover classification in semi-arid areas from medium-resolution remote-sensing imagery: A deep learning approach. Sensors 2022, 22, 8750. [Google Scholar] [CrossRef]
  29. Chen, W.; Ouyang, S.; Tong, W.; Li, X.; Zheng, X.; Wang, L. GCSANet: A global context spatial attention deep learning network for remote sensing scene classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 1150–1162. [Google Scholar] [CrossRef]
  30. Gao, Y.; Ruan, Y.; Fang, C.; Yin, S. Deep learning and transfer learning models of energy consumption forecasting for a building with poor information data. Energy Build. 2020, 223, 110156. [Google Scholar] [CrossRef]
  31. Zheng, Q.H.; Chen, W.; Li, S.L.; Yu, L.; Zhang, X.; Liu, L.F.; Singh, R.P.; Liu, C.Q. Accuracy comparison and driving factor analysis of LULC changes using multi-source time-series remote sensing data in a coastal area. Ecol. Inform. 2021, 66, 101457. [Google Scholar] [CrossRef]
  32. Balha, A.; Mallick, J.; Pandey, S.; Gupta, S.; Singh, C.K. A comparative analysis of different pixel and object-based classification algorithms using multi-source high spatial resolution satellite data for LULC mapping. Earth Sci. Inform. 2021, 14, 2231–2247. [Google Scholar] [CrossRef]
  33. Asenso Barnieh, B.; Jia, L.; Menenti, M.; Yu, L.; Nyantakyi, E.K.; Kabo-Bah, A.T.; Jiang, M.; Zhou, J.; Lv, Y.; Zeng, Y.; et al. Spatiotemporal Patterns in Land Use/Land Cover Observed by Fusion of Multi-Source Fine-Resolution Data in West Africa. Land 2023, 12, 1032. [Google Scholar] [CrossRef]
  34. Sillero, N.; Barbosa, A.M. Common mistakes in ecological niche models. Int. J. Geogr. Inf. Sci. 2021, 35, 213–226. [Google Scholar] [CrossRef]
  35. Liang, X.; Jin, X.; Yang, X.; Xu, W.; Lin, J.; Zhou, Y. Exploring cultivated land evolution in mountainous areas of Southwest China, an empirical study of developments since the 1980s. Land Degrad. Dev. 2021, 32, 546–558. [Google Scholar] [CrossRef]
  36. Lu, X.; Shi, Z.; Li, J.; Dong, J.; Song, M.; Hou, J. Research on the impact of factor flow on urban land use efficiency from the perspective of urbanization. Land 2022, 11, 389. [Google Scholar] [CrossRef]
  37. Yang, J.; Huang, X. China Land Cover Dataset (CLCD) [Data Set]; Wuhan University: Wuhan, China, 2021; Available online: https://zenodo.org/record/5816591 (accessed on 19 March 2025).
  38. Gong, P.; Liu, H.; Zhang, M. Global 30 m Land Cover Dynamic Dataset (GLC_FCS30) [Data Set]; CAS Big Earth Data Platform: Beijing, China, 2019. [Google Scholar] [CrossRef]
  39. Liu, J.; Kuang, W.; Zhang, Z. China Multi-Period Land Use/Cover Remote Sensing Monitoring Dataset (LUCC) [Data Set]; Resource and Environment Science and Data Center (RESDC): 2020. Available online: https://www.resdc.cn/DOI/DOI.aspx?DOIid=XXX (accessed on 19 March 2025).
  40. Liang, H.; Kasimu, A.; Ma, H.; Zhao, Y.; Zhang, X.; Wei, B. Exploring the Variations and Influencing Factors of Land Surface Temperature in the Urban Agglomeration on the Northern Slope of the Tianshan Mountains. Sustainability 2022, 14, 10663. [Google Scholar] [CrossRef]
  41. Li, Z.; Fan, Y.; Zhang, R.; Chen, P.; Jing, X.; Lyu, C.; Zhang, R.; Li, Y.; Liu, Y. Synergistic impacts of Landscape, Soil, and environmental factors on the spatial distribution of soil aggregates stability in the Danjiangkou reservoir area. Catena 2024, 237, 107840. [Google Scholar] [CrossRef]
  42. Zhang, X.; Wang, J.; Gao, Y.; Wang, L. Variations and controlling factors of vegetation dynamics on the Qingzang Plateau of China over the recent 20 years. Geogr. Sustain. 2021, 2, 74–85. [Google Scholar] [CrossRef]
  43. Huang, S.; Tang, L.; Hupy, J.P.; Wang, Y.; Shao, G. A commentary review on the use of normalized difference vegetation index (NDVI) in the era of popular remote sensing. J. For. Res. 2021, 32, 1–6. [Google Scholar] [CrossRef]
  44. Ustaoğlu, F.; Tepe, Y.; Taş, B. Assessment of stream quality and health risk in a subtropical Turkey river system: A combined approach using statistical analysis and water quality index. Ecol. Indic. 2020, 113, 105815. [Google Scholar] [CrossRef]
  45. Rocha, J.; Duarte, A.; Silva, M.; Fabres, S.; Vasques, J.; Revilla-Romero, B.; Quintela, A. The importance of high resolution digital elevation models for improved hydrological simulations of a mediterranean forested catchment. Remote Sens. 2020, 12, 3287. [Google Scholar] [CrossRef]
  46. Bakhshianlamouki, E.; Masia, S.; Karimi, P.; van der Zaag, P.; Sušnik, J. A system dynamics model to quantify the impacts of restoration measures on the water-energy-food nexus in the Urmia lake Basin, Iran. Sci. Total Environ. 2020, 708, 134874. [Google Scholar] [CrossRef]
  47. Bijeesh, T.V.; Narasimhamurthy, K.N. Surface water detection and delineation using remote sensing images: A review of methods and algorithms. Sustain. Water Resour. Manag. 2020, 6, 68. [Google Scholar] [CrossRef]
  48. Dong, L.; Li, J.; Xu, Y.; Yang, Y.; Li, X.; Zhang, H. Study on the spatial classification of construction land types in Chinese cities: A case study in Zhejiang province. Land 2021, 10, 523. [Google Scholar] [CrossRef]
  49. Zhai, D.; Zhang, X.; Zhuo, J.; Mao, Y. Driving the Evolution of Land Use Patterns: The Impact of Urban Agglomeration Construction Land in the Yangtze River Delta, China. Land 2024, 13, 1514. [Google Scholar] [CrossRef]
  50. Nguyen, C.T.; Chidthaisong, A.; Kieu Diem, P.; Huo, L.Z. A modified bare soil index to identify bare land features during agricultural fallow-period in southeast Asia using Landsat 8. Land 2021, 10, 231. [Google Scholar] [CrossRef]
  51. Nautiyal, S.; Goswami, M.; Prakash, S.; Rao, K.S.; Maikhuri, R.K.; Saxena, K.G.; Baksi, S.; Banerjee, S. Spatio-temporal variations of geo-climatic environment in a high-altitude landscape of Central Himalaya: An assessment from the perspective of vulnerability of glacial lakes. Nat. Hazards Res. 2022, 2, 343–362. [Google Scholar] [CrossRef]
  52. Upadhyay, S.K.; Kumar, A. A novel approach for rice plant diseases classification with deep convolutional neural network. Int. J. Inf. Technol. 2022, 14, 185–199. [Google Scholar] [CrossRef]
  53. Li, Z.; Chen, B.; Wu, S.; Su, M.; Chen, J.M.; Xu, B. Deep learning for urban land use category classification: A review and experimental assessment. Remote Sens. Environ. 2024, 311, 114290. [Google Scholar] [CrossRef]
  54. Sankararaman, K.A.; De, S.; Xu, Z.; Huang, W.R.; Goldstein, T. The impact of neural network overparameterization on gradient confusion and stochastic gradient descent. In Proceedings of the 37th International Conference on Machine Learning, Virtual, 13–18 July 2020; Available online: https://proceedings.mlr.press/v119/sankararaman20a (accessed on 4 May 2024).
  55. Uzair, M.; Jamil, N. Effects of hidden layers on the efficiency of neural networks. In Proceedings of the 2020 IEEE 23rd International Multitopic Conference (INMIC), Bahawalpur, Pakistan, 5–7 November 2020. [Google Scholar] [CrossRef]
  56. Liu, M.; Chen, L.; Du, X.; Jin, L.; Shang, M. Activated gradients for deep neural networks. IEEE Trans. Neural Netw. Learn. Syst. 2021, 34, 2156–2168. [Google Scholar] [CrossRef]
  57. Mao, A.; Mohri, M.; Zhong, Y. Cross-entropy loss functions: Theoretical analysis and applications. In Proceedings of the 40th International Conference on Machine Learning, Honolulu, HI, USA, 23–29 July 2023; Available online: https://proceedings.mlr.press/v202/mao23a.html (accessed on 4 May 2024).
  58. Reyad, M.; Sarhan, A.M.; Arafa, M. A modified Adam algorithm for deep neural network optimization. Neural Comput. Appl. 2023, 35, 17095–17112. [Google Scholar] [CrossRef]
  59. Guo, X.; Zhang, W.; Zhang, L. Robust structural topology optimization considering boundary uncertainties. Comput. Methods Appl. Mech. Eng. 2013, 253, 356–368. [Google Scholar] [CrossRef]
  60. Roodposhti, S.M.; Aryal, J.; Lucieer, A. Uncertainty Assessment of Hyperspectral Image Classification: Deep Learning vs. Random Forest. Entropy 2019, 21, 78. [Google Scholar] [CrossRef]
  61. Maxwell, A.E.; Warner, T.A. Thematic classification accuracy assessment with inherently uncertain boundaries: An argument for center-weighted accuracy assessment metrics. Remote Sens. 2020, 12, 1905. [Google Scholar] [CrossRef]
  62. Pontius, R.G., Jr.; Millones, M. Death to Kappa: Birth of quantity disagreement and allocation disagreement for accuracy assessment. Int. J. Remote Sens. 2011, 32, 4407–4429. [Google Scholar] [CrossRef]
  63. Choi, H.; Lee, H.J.; You, H.J.; Rhee, S.Y.; Jeon, W.S. Comparative Analysis of Generalized Intersection over Union and Error Matrix for Vegetation Cover Classification Assessment. Sens. Mater. 2019, 31, 3849. [Google Scholar] [CrossRef]
  64. He, T.; Zhang, M.; Guo, A.; Zhai, G.; Wu, C.; Xiao, W. A novel index combining temperature and vegetation conditions for monitoring surface mining disturbance using Landsat time series. Catena 2023, 229, 107235. [Google Scholar] [CrossRef]
  65. Meraner, A.; Ebel, P.; Zhu, X.X.; Schmitt, M. Cloud removal in Sentinel-2 imagery using a deep residual neural network and SAR-optical data fusion. ISPRS J. Photogramm. Remote Sens. 2020, 166, 333–346. [Google Scholar] [CrossRef]
  66. Kahraman, S.; Bacher, R. A comprehensive review of hyperspectral data fusion with lidar and sar data. Annu. Rev. Control. 2021, 51, 236–253. [Google Scholar] [CrossRef]
  67. Ji, X.; Han, X.; Zhu, X.; Huang, Y.; Song, Z.; Wang, J.; Zhou, M.; Wang, X. Comparison and Validation of Multiple Medium-and High-Resolution Land Cover Products in Southwest China. Remote Sens. 2024, 16, 1111. [Google Scholar] [CrossRef]
  68. Li, Z.; Li, L.; Wang, Y.; Man, W.; Liu, W.; Nie, Q. Spatial Change of the Farming–Pastoral Ecotone in Northern China from 1985 to 2021. Land 2022, 11, 2179. [Google Scholar] [CrossRef]
  69. Wu, F.; Wang, C.; Zhang, H.; Li, J.; Li, L.; Chen, W.; Zhang, B. Built-up area mapping in China from GF-3 SAR imagery based on the framework of deep learning. Remote Sens. Environ. 2021, 262, 112515. [Google Scholar] [CrossRef]
  70. Nair, P.R.; Kumar, B.M.; Nair, V.D. Classification of agroforestry systems. In An Introduction to Agroforestry: Four Decades of Scientific Developments; Springer Nature: Berlin/Heidelberg, Germany, 2021; pp. 29–44. [Google Scholar] [CrossRef]
  71. Zhao, D.; Ji, L.; Yang, F.; Liu, X. A Possibility-Based Method for Urban Land Cover Classification Using Airborne Lidar Data. Remote Sens. 2022, 14, 5941. [Google Scholar] [CrossRef]
  72. Zhao, J.; Wang, L.; Yang, H.; Wu, P.; Wang, B.; Pan, C.; Wu, Y. A land cover classification method for high-resolution remote sensing images based on NDVI deep learning fusion network. Remote Sens. 2022, 14, 5455. [Google Scholar] [CrossRef]
  73. Rao, A.S.; Radanovic, M.; Liu, Y.; Hu, S.; Fang, Y.; Khoshelham, K.; Palaniswami, M.; Ngo, T. Real-time monitoring of construction sites: Sensors, methods, and applications. Autom. Constr. 2022, 136, 104099. [Google Scholar] [CrossRef]
  74. Zhang, Y.; Niu, X.; Hu, Y.; Yan, H.; Zhen, L. Temporal and spatial evolution characteristics and its driving mechanism of land use/land cover change in Laos from 2000 to 2020. Land 2022, 11, 1188. [Google Scholar] [CrossRef]
  75. Zhang, J.; Liu, Z.; Jiang, W.; Liu, Y.; Zhou, X.; Li, X. Application of deep generative networks for SAR/ISAR: A review. Artif. Intell. Rev. 2023, 56, 11905–11983. [Google Scholar] [CrossRef]
Figure 1. Flowchart of multi-source LULC data fusion.
Figure 2. Location of the study area.
Figure 3. Spatial distribution of the fused LULC data.
Figure 4. Spatial comparison of different LULC data: (a) CLCD data; (b) LUCC data; (c) GLC data; and (d) remote sensing imagery.
Figure 5. Area of each land use type from different data sources.
Figure 6. Quantity disagreement and allocation disagreement comparison of multi-source LULC data products.
Table 1. Research data.
| Data Type | Data Name | Data Resolution | Data Source | Data Year |
|---|---|---|---|---|
| Land use dataset | CLCD | 30 m | https://open.geovisearth.com/service/resource/31 (accessed on 12 December 2023) | 2015 |
| | GLC | 30 m | https://data.casearth.cn/sdo/detail/6523adf6819aec0c3a438252 (accessed on 12 December 2023) | 2015 |
| | LUCC | 30 m | https://www.resdc.cn/ (accessed on 12 December 2023) | 2015 |
| Geographical environmental factors | DEM | 1 km | https://zhuanlan.zhihu.com/p/30702123 (accessed on 4 May 2024) | 2017 |
| | Slope | 1 km | Author | 2017 |
| | Aspect | 1 km | Author | 2017 |
| | Distance to city | 20 km | Author | 2017 |
| | Distance to county center | 20 km | Author | 2017 |
| | Distance to national highway | 20 km | Author | 2017 |
| | Distance to expressway | 20 km | Author | 2017 |
| | Distance to railway | 20 km | Author | 2017 |
| | Nighttime light intensity | 1 km | https://eogdata.mines.edu/products/vnl/ (accessed on 4 May 2024) | 2015 |
| | Surface reflectance | 30 m | https://data.casearth.cn/thematic/RTU_Data/303 (accessed on 4 May 2024) | 2018 |
| | Average precipitation | 1 km | https://blog.csdn.net/m0_63269495/article/details/135645183 (accessed on 4 May 2024) | 2015 |
| | Normalized Difference Vegetation Index (NDVI) | 1 km | http://www.gisrs.cn/infofordata?id=05b59e69-ba30-4454-a9c0-67ca038fb9f3 (accessed on 6 May 2024) | 2015 |
| | Mean temperature | 1 km | http://www.gisrs.cn/infofordata?id=3f816a8e-ebea-4484-b9e6-c27761fdb85f (accessed on 6 May 2024) | 2015 |
Table 2. Accuracy comparison of multi-source LULC data products.
| Land Use | Evaluating Indicator | Fusion Data | CLCD | LUCC | GLC |
|---|---|---|---|---|---|
| Cultivated land | OA | 0.894 | 0.746 | 0.789 | 0.741 |
| | Kappa | 0.794 | 0.702 | 0.656 | 0.577 |
| | User accuracy | 0.879 | 0.931 | 0.832 | 0.791 |
| | Producer accuracy | 0.897 | 0.746 | 0.789 | 0.741 |
| | IoU | 0.798 | 0.707 | 0.680 | 0.619 |
| Forest land | OA | 0.919 | 0.902 | 0.851 | 0.704 |
| | Kappa | 0.891 | 0.774 | 0.857 | 0.659 |
| | User accuracy | 0.930 | 0.824 | 0.936 | 0.782 |
| | Producer accuracy | 0.925 | 0.902 | 0.851 | 0.704 |
| | IoU | 0.857 | 0.756 | 0.804 | 0.589 |
| Grassland | OA | 0.930 | 0.794 | 0.728 | 0.584 |
| | Kappa | 0.914 | 0.794 | 0.501 | 0.455 |
| | User accuracy | 0.963 | 0.727 | 0.438 | 0.438 |
| | Producer accuracy | 0.907 | 0.915 | 0.728 | 0.584 |
| | IoU | 0.832 | 0.682 | 0.376 | 0.334 |
| Water area | OA | 0.807 | 0.747 | 0.710 | 0.732 |
| | Kappa | 0.750 | 0.690 | 0.714 | 0.615 |
| | User accuracy | 0.758 | 0.719 | 0.764 | 0.583 |
| | Producer accuracy | 0.774 | 0.747 | 0.710 | 0.732 |
| | IoU | 0.620 | 0.578 | 0.583 | 0.481 |
| Construction land | OA | 0.705 | 0.695 | 0.626 | 0.591 |
| | Kappa | 0.700 | 0.616 | 0.619 | 0.510 |
| | User accuracy | 0.726 | 0.642 | 0.702 | 0.551 |
| | Producer accuracy | 0.711 | 0.695 | 0.626 | 0.591 |
| | IoU | 0.577 | 0.501 | 0.495 | 0.399 |
| Bare land | OA | 0.690 | 0.726 | 0.588 | 0.519 |
| | Kappa | 0.764 | 0.195 | 0.369 | 0.491 |
| | User accuracy | 0.800 | 0.224 | 0.284 | 0.482 |
| | Producer accuracy | 0.690 | 0.726 | 0.588 | 0.520 |
| | IoU | 0.625 | 0.207 | 0.237 | 0.333 |
| Global | OA | 0.869 | 0.791 | 0.771 | 0.698 |
| | Kappa | 0.813 | 0.714 | 0.679 | 0.591 |
| | User accuracy | 0.843 | 0.678 | 0.659 | 0.605 |
| | Producer accuracy | 0.817 | 0.788 | 0.715 | 0.645 |
| | IoU | 0.718 | 0.572 | 0.529 | 0.459 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
