Predicting Cropland Non-Agriculturalization Susceptibility Using Multi-Source Data and Graph Attention Networks: A Case Study of Wuhan, China

Wan, Shiqi; Huang, Lina; Xia, Zhangying

doi:10.3390/ijgi15020077


by Shiqi Wan ¹, Lina Huang ¹,²,* and Zhangying Xia ¹

¹ School of Resources and Environmental Sciences, Wuhan University, Wuhan 430079, China

² Hubei Luojia Laboratory, Wuhan 430079, China

* Author to whom correspondence should be addressed.

ISPRS Int. J. Geo-Inf. 2026, 15(2), 77; https://doi.org/10.3390/ijgi15020077

Submission received: 27 October 2025 / Revised: 2 February 2026 / Accepted: 12 February 2026 / Published: 14 February 2026


Abstract

Cropland non-agriculturalization (CNA) threatens food security, ecosystem services, and sustainable development amid accelerating global urbanization. However, existing monitoring methods are often retrospective and lack adequate spatial and temporal resolution for proactive management. This study proposes GS-GAT, a graph-based deep learning framework for predicting CNA susceptibility at the meso-spatial scale. A spatial graph was constructed for the non-central districts of Wuhan, China, and multisource features were extracted across four dimensions: imagery, land cover, topography, and socioeconomics. A comprehensive intensity index was developed to compute susceptibility levels at the street-block level based on multi-year land use data from 2018 to 2022. To address class imbalance, GraphSMOTE was employed to enhance minority node representation. The core GS-GAT model was trained across four temporal snapshots using attention-based feature aggregation and joint optimization of classification and structural reconstruction losses. Experimental results show that GS-GAT achieved an average AUC of 85.6% and an F1 score of 82.6%, rising to 93% and 91%, respectively, under relaxed evaluation criteria, and outperformed baseline models such as SVM and XGBoost. Ablation studies confirm the contributions of feature fusion and GraphSMOTE to model robustness and minority class detection. The proposed framework offers a scalable and interpretable approach for early identification of cropland conversion risks, supporting more targeted land-use management and cropland protection strategies.

Keywords: land use transition; imbalanced data augmentation; graph-based deep learning; semantic embedding


1. Introduction

The accelerated urbanization and industrialization in developing countries such as China have intensified the process of cropland non-agriculturalization (CNA) [1,2,3,4,5,6,7]. CNA refers to the transformation of cropland into non-agricultural land uses driven by economic expansion, population growth, policy orientation, and land-use restructuring, and is characterized by strong spatial heterogeneity and multi-factor coupling. As a core issue in global urban and industrial expansion, CNA has become the focus of interdisciplinary research across geography, land science, ecology, and regional planning. Previous studies have widely acknowledged that CNA emerges from the interaction between macro-level development strategies and micro-level land conversion behaviors, exhibiting strong spatial clustering and diffusion characteristics.

In rapidly growing metropolitan regions, fiscal dependence on land development, low agricultural returns, and urban spillover jointly accelerate CNA. Central cities often exert a siphoning effect, attracting population and capital while displacing surrounding cropland. Typical patterns have been observed in Jakarta’s Jabotabek region [8], Guangzhou [9], and Zhejiang Province [10]. In contrast, developed countries, although they experienced large-scale cropland conversion during industrialization, now rely more on legal regulation and market-based instruments such as zoning control, tax incentives, and farmland protection policies to moderate land-use transitions. Nevertheless, CNA pressures persist globally due to ongoing economic growth, population concentration, and complex policy implementation environments. These processes exert profound impacts on food security, ecosystem services, and rural livelihoods [11,12,13]. In China, cropland conversion from 1990 to 2020 reached annual averages of 1520.60 km² to urban land, 1464.60 km² to rural residential land, and 987.44 km² to other construction land [14]. Despite the implementation of cropland balance and permanent farmland protection policies, both total and per capita cropland areas have continued to decline. Beyond food security, CNA is closely associated with soil degradation, biodiversity loss, and landscape fragmentation, while recent studies have revealed a strong spatial correlation between cropland conversion and carbon emissions, further amplifying its ecological significance [15,16].

To address these challenges, extensive efforts have applied remote sensing, GIS, machine learning, and spatiotemporal modeling techniques to monitor CNA patterns and identify driving mechanisms. In the domain of remote sensing, multi-source optical and SAR data integrated via platforms such as Google Earth Engine have substantially improved cropland mapping and change detection [17,18,19]. High-frequency vegetation index-based methods, such as seasonal NDVI differencing, have demonstrated strong performance in detecting abandoned cropland in mountainous and fragmented landscapes [20]. At the algorithmic level, traditional GIS-based spatial analyses, including kernel density estimation and hot spot detection, have been used to characterize broad spatial patterns of CNA. Spatial econometric models further quantify both direct and spillover effects of socioeconomic and policy drivers [1,2,21]. In recent years, machine learning and deep learning methods have demonstrated superior performance in extracting high-dimensional features and improving predictive accuracy. The XGBoost-SHAP framework reveals the dominant role of socioeconomic variables in driving CNA [22], while semantic segmentation models based on Vision Transformers and ChangeFormers significantly enhance abandoned cropland detection [23]. Dynamic convolution (FADConv) and frequency-based attention mechanisms have achieved a balance between accuracy and computational efficiency in mapping CNA using high-resolution imagery [24]. PSO-optimized XGBoost models have also been applied to cropland degrainization susceptibility mapping [25]. More recently, integrated “Remote Sensing–GIS–Machine Learning” frameworks have been developed to combine multi-scale indicators and simulate future scenarios [26,27,28].

Despite these methodological advances, most existing approaches remain constrained by several structural limitations. NDVI-based methods are sensitive to vegetation sparsity and image quality [20], deep learning models require large volumes of annotated data and often lack transferability [23,29,30], and spatial econometric models struggle to capture nonlinear and multiscale relationships [1,21]. More importantly, four critical challenges persist: limited interpretability, insufficient temporal resolution for regulatory intervention, lack of regulatory integration, and the unresolved trade-off between accuracy and operational cost. These constraints reduce the practical value of many CNA models for land-use governance.

Graph-based modeling provides a natural framework to address these challenges by explicitly encoding spatial relationships and enabling joint modeling of structure and attributes. Graphs have been widely adopted in complex relational systems [30,31,32] and micro-scale scientific domains [33,34]. Graph neural networks (GNNs) extend this paradigm by enabling end-to-end representation learning on graph-structured data [35,36,37,38]. Among them, Kipf et al. [39] proposed Graph Convolutional Networks (GCNs), which applied spectral convolutions to graph data. GraphSAGE [40] supports scalable neighborhood aggregation, while Graph Attention Networks (GAT) introduce adaptive weighting mechanisms to capture heterogeneous spatial influences [41,42]. These properties make GNNs particularly suitable for modeling CNA, which inherently involves spatial proximity, cross-parcel interaction, and multi-source drivers. Spatiotemporal GNNs further enhance the representation of dynamic geographic systems [43,44,45,46,47,48,49]. Nevertheless, the application of GNNs to CNA susceptibility prediction remains limited.

To address the above limitations, this study proposes a unified CNA susceptibility prediction framework that integrates multi-source temporal remote sensing data with a spatially structured Graph Attention Network. Each node represents a block-scale land parcel, and edges are defined jointly by spatial adjacency and attribute similarity, embedding the causal chain from geographic constraints to construction behavior and spectral response. Multi-head attention enables adaptive aggregation of heterogeneous neighborhood features, supporting annual-scale, low-cost, and interpretable prediction. This framework enables meso-scale assessment and provides a practical tool for proactive farmland protection and refined land-use supervision.

2. Materials

2.1. Study Area

Wuhan, located in central China within eastern Hubei Province, lies at the eastern edge of the Jianghan Plain and at the confluence of the Yangtze and Han Rivers (Figure 1a). Its favorable natural conditions, flat terrain, and fertile soils make it particularly well suited to agricultural production. It serves as a key agricultural science and technology innovation hub in central China and functions as the largest economic and cultural hub in the region, with a permanent population exceeding 13 million and a total area of 8569.15 km², of which the built-up area is 973.84 km². As the core of the urban agglomeration in the middle reaches of the Yangtze River, Wuhan plays a leading role in regional economic development. Since the reform and opening-up, rapid economic growth and accelerated urbanization have led to extensive requisitioning of cropland in the non-central urban districts for purposes including industrial parks and residential developments. These developments, driven by lower costs and higher land appreciation potential, have attracted significant investment. However, due to the scattered distribution of cropland in non-central districts and weak enforcement of cropland protection policies, illegal land-use activities have been challenging to monitor and regulate. For example, in Huangpi District, the illegal occupation of cropland to construct a “public performance stage” went undetected for half a year due to its remoteness from major roads, and by the time enforcement via satellite imagery was initiated, some structures had already caused irreversible damage to the tillable layer. From a social perspective, changes in land-use perceptions have made farmers more inclined to convert cropland to non-agricultural uses for higher returns, while large-scale rural labor migration to urban areas has reduced labor input in cropland, further accelerating non-agriculturalization in non-central districts. Consequently, regulating CNA in Wuhan’s non-central urban districts is of urgent importance to reconcile urban development with cropland conservation, thereby ensuring food security and sustainable agriculture.

The study area comprises Wuhan’s non-central urban districts, namely six peripheral administrative districts: Caidian, Jiangxia, Huangpi, Xinzhou, Dongxihu, and Hannan (Figure 1b). Geographically, it spans longitudes 113°41′–115°05′ E and latitudes 29°58′–31°22′ N, covering approximately 7600 km², which accounts for 89% of Wuhan’s total area. The northern subregions, Huangpi and Xinzhou, are dominated by low hills, with cropland concentrated in intermontane basins and alluvial plain pockets; the southern areas, Jiangxia and Hannan, lie on the eastern fringe of the Jianghan Plain, featuring flat topography with dense rivers and lakes, resulting in expansive areas of continuous cropland. The western districts, Caidian and Dongxihu, adjoin the Chenhu wetland nature reserve and belong to the Yangtze-Han River alluvial plains, characterized by low-lying terrain, intricate water networks, and extensive tidal flats and enclosed polders. The study area experiences a humid subtropical monsoon climate in the northern subtropical zone, with synchronous rainfall and temperature patterns and an average annual precipitation of 1200–1400 mm, favorable for crops such as rice and rapeseed. It represents an intermediate belt in the “central city–new city–rural” gradient evolution of Wuhan, typifying an urban-rural transition zone that performs dual roles of urban decentralization and ecological barrier preservation, while facing compounded pressures on ecological and food security. On one hand, frequent land-use transitions are driven by multiple conflicting dynamics, including industrial upgrading (e.g., the transition from traditional to metropolitan agriculture), population flows (e.g., urban village renovation), and policy negotiations (e.g., land quota transfers). On the other hand, the area bears significant agricultural responsibility, with cropland comprising over 90% of land use and overlapping with ecologically sensitive zones such as Chenhu and Liangzi Lake. Therefore, it is a representative region for predicting cropland non-agriculturalization susceptibility.

2.2. Data Sources

The datasets employed in this study include Wuhan’s land use and land cover change (LUCC) data, terrain data, remote sensing imagery, socioeconomic indicators, and volunteered geographic information for the period 2018–2022. The LUCC data consist of land use classification maps and NDVI time series. Land use data are derived from the China Land Cover Dataset (CLCD) developed by the research team of Huang Xin at Wuhan University, with a spatial resolution of 30 m and annual temporal resolution. NDVI data originate from NASA Earthdata, with a spatial resolution of 1 km and monthly temporal resolution. Elevation data are obtained from the Digital Elevation Model (DEM) available via the Geospatial Data Cloud, at 30 m resolution. Remote sensing imagery is sourced from Sentinel-2 Level-2A products hosted by Copernicus, with bands 2, 3, 4, and 8 at 10 m spatial resolution. Socioeconomic data include statistical indicators from the Wuhan Statistical Yearbook, nighttime light data, building roof datasets, and population distributions. Specifically, the yearbook provides annual metrics such as GDP and the value of the secondary industry; nighttime light data are obtained from NPP-VIIRS-like products provided by the National Earth System Science Data Center, with 500 m resolution and annual updates; rooftop area data are sourced from the CBRA dataset at 2.5 m resolution; and population data are derived from WorldPop, with a resolution of 100 m. Volunteered geographic vector data encompass road networks, hydrological features, and bus stop locations. Road and water network data are extracted from OpenStreetMap, while bus stop data are collected from the public transit information portal. For analytical consistency, all non-numeric vector data were reprojected to the GCS_WGS_1984 coordinate system and clipped to the administrative boundary of Wuhan (Table 1).

Data preprocessing was performed using ENVI 5.6 and ArcGIS 10.8, including band merging, reprojection, spatial extent clipping, and simplification of road network topology. Road network processing was conducted using the ArcGIS data management tools and analysis tools. To reduce redundancy, multiple parallel road segments representing the same primary roadway were collapsed into a single centerline. Disconnected or dangling road segments were either removed or reconnected to ensure network continuity and structural integrity, which is essential for subsequent street block unit delineation. Based on the above data sources, cropland non-agriculturalization susceptibility levels for four consecutive periods (2018–2019, 2019–2020, 2020–2021, and 2021–2022) were calculated and extracted. These formed the basis for constructing the CNA graph structures along with their corresponding feature systems.

3. Methods

To predict CNA susceptibility, this study proposes a three-stage hybrid methodological framework that integrates domain-specific knowledge and graph-based deep learning (Figure 2). The approach comprises the following components. First, during the data preparation stage, CNA susceptibility levels are quantified and a spatial graph structure is constructed. In the susceptibility quantification phase, a Comprehensive Intensity of Cropland Non-agriculturalization (CICN) index is developed based on socioeconomic, natural, and policy-related indicators. This index is calculated at the node level and subsequently categorized into four levels of CNA susceptibility, namely none, low, medium, and high, via a combination of expert visual interpretation and the natural breaks classification method. Concurrently, the spatial graph structure is constructed by treating block-level units as nodes and defining edges based on spatial adjacency and attribute similarity. The resulting heterogeneous graph is labeled with the corresponding susceptibility level. Second, during the model development stage, the data are divided into training and testing sets at a 7:3 ratio. To address the issue of class imbalance, a GraphSMOTE-based over-sampling strategy is applied to enhance minority class representation during training. GAT serves as the backbone model, implemented using the PyTorch framework. Finally, during the evaluation and interpretation stage, model performance is assessed using Micro-F1 and Area Under the Curve (AUC) metrics. In addition to standard accuracy evaluation, a relaxed criterion is introduced, where predictions within ±1 susceptibility level of the ground truth (|diff| ≤ 1) are also considered acceptable.
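The relaxed criterion can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function name and the sample labels are hypothetical, and plain accuracy is used as a stand-in because, for single-label multi-class problems, Micro-F1 coincides with accuracy.

```python
import numpy as np

def strict_and_relaxed_accuracy(y_true, y_pred, tolerance=1):
    """Share of nodes predicted exactly, and share predicted within `tolerance` levels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    diff = np.abs(y_pred - y_true)          # |diff| between susceptibility levels 0..3
    strict = float(np.mean(diff == 0))      # exact match only
    relaxed = float(np.mean(diff <= tolerance))  # |diff| <= 1 counts as acceptable
    return strict, relaxed

# Hypothetical susceptibility levels for six nodes (0 = none ... 3 = high).
y_true = [0, 1, 2, 3, 3, 0]
y_pred = [0, 2, 2, 1, 3, 1]
strict, relaxed = strict_and_relaxed_accuracy(y_true, y_pred)
```

In this toy example three of six nodes match exactly, and two more fall within one level, so the relaxed score is higher than the strict one.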

3.1. Evaluation of Cropland Non-Agriculturalization Susceptibility

Most existing studies measure CNA intensity using conversion rates. However, this study posits that CNA is a multifaceted process not solely characterized by quantitative changes, but also shaped by cropland quality, ecological integrity, economic benefits, and restoration feasibility. Accordingly, the concept of Comprehensive Intensity of Cropland Non-agriculturalization (CICN) is introduced, and an evaluation matrix is constructed to capture the multidimensional impacts of CNA across economic, ecological, and social dimensions (Table 2).

Land cover categories are reclassified into six types, namely cropland, water, forest, grassland, construction, and unused land, which results in five types of specific cropland conversion trajectories. CICN is evaluated across four dimensions: cropland quality, ecosystem impact, economic benefit, and difficulty of reclamation. The evaluation matrix is based on a 7-point Likert scale (0: no change, ±1: negligible decline/improvement, ±2: moderate impact, ±3: significant impact) and follows the environmental index evaluation method, in which +3 indicates strongly adverse impacts and −3 denotes strong positive effects, consistent with ecological compensation frameworks. Taking the cropland-construction conversion type as an example, the scores for cropland quality, ecosystem impact, and reclamation difficulty are all +3, indicating severe negative effects: significant quality loss, ecological degradation, and high reclamation difficulty. Its economic benefit score of −3 suggests considerable economic gains that partly offset these adverse impacts. The scoring basis reflects expert judgment calibrated with empirical knowledge of land-use transition processes. The weights were determined on the basis of prior studies on land use and cropland and prioritize long-term food security and ecological sustainability over short-term economic gains, emphasizing the fundamental importance of cropland quality and ecosystem integrity in assessing CNA impacts [18,21,25,28]. The scoring criteria for each transition type and evaluation dimension are summarized in Table A1.

The composite CICN score for each spatial unit is calculated by aggregating the scores across all land conversion types and their respective dimensions, weighted by the importance of each dimension. Specifically, land use data from two adjacent years are compared to compute the annual cropland non-agriculturalization rate for each type of conversion within each unit. The final CICN score for each unit is derived by multiplying the normalized CNA rate by the weighted impact score from the evaluation matrix. This process allows for a comprehensive and time-sensitive quantification of CNA intensity. The CICN score for each unit is calculated using the following formula:

C_{i} = \sum_{j} w_{j} \cdot x_{ij}    (1)

\mathrm{CNA}_{i} = \frac{n_{i}^{(t-1)t}}{n_{\mathrm{cropland}}^{t-1}}    (2)

\mathrm{CICN} = \sum_{i} \mathrm{CNA}_{i}^{*} \cdot (C_{i}^{*} + 1)    (3)

where i denotes the type of cropland non-agriculturalization; j represents the evaluation dimension in the scoring matrix; w_{j} is the weight of dimension j and x_{ij} is the score of conversion type i on dimension j; C_{i} corresponds to the composite coefficient assigned to each CNA type; CNA_{i} refers to the cropland non-agriculturalization rate of type i from year t − 1 to year t, that is, the cropland converted to type i between the two years (n_{i}^{(t-1)t}) divided by the cropland present in year t − 1 (n_{cropland}^{t-1}); CICN represents the comprehensive intensity of cropland non-agriculturalization for the current spatial unit; ∗ indicates that the value has been normalized. The constant term (+1) in Formula (3) plays a critical role in maintaining both numerical stability and physical interpretability. First, it guarantees that any cropland loss contributes a baseline value to the total intensity, preventing neutralized transitions from being ignored. Second, by transforming the multiplier into the range [1, 2], the qualitative severity becomes an enhancement factor, allowing the framework to distinguish scale-driven from impact-driven cropland non-agriculturalization. To enhance the interpretability of the proposed formulation, a numerical example is provided in Appendix B.
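Formulas (1) to (3) can be traced with a short sketch. Several inputs are assumptions made purely for illustration: the weights, every score row except the cropland-construction row (+3, +3, −3, +3) quoted in the text, the pixel counts, the use of min-max scaling for C_i*, and the use of the raw rate as CNA_i*. This section does not state the normalization the authors applied, and the sketch is independent of the worked example in Appendix B.

```python
import numpy as np

# Hypothetical dimension weights w_j: quality, ecosystem, economic benefit, reclamation.
WEIGHTS = np.array([0.35, 0.30, 0.15, 0.20])

# Impact scores x_ij on the -3..+3 scale, one row per conversion type.
# Only the first row is taken from the text; the others are illustrative.
SCORES = np.array([
    [3, 3, -3, 3],   # cropland -> construction
    [1, -1, -1, 1],  # cropland -> forest (illustrative)
    [2, 1, 0, 2],    # cropland -> unused land (illustrative)
])

def min_max(x):
    """Rescale to [0, 1]; an assumed normalization, not confirmed by the source."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return np.zeros_like(x) if span == 0 else (x - x.min()) / span

def composite_coefficients(weights, scores):
    """Formula (1): C_i = sum_j w_j * x_ij."""
    return scores @ weights

def cna_rates(converted, cropland_prev):
    """Formula (2): cropland converted to type i divided by cropland in year t-1."""
    return np.asarray(converted, dtype=float) / float(cropland_prev)

def cicn(cna_star, c_star):
    """Formula (3): CICN = sum_i CNA_i* * (C_i* + 1)."""
    return float(np.sum(cna_star * (c_star + 1.0)))

# One spatial unit: 1000 cropland pixels in year t-1 (pixel counts are an assumption),
# of which 20, 5, and 10 convert to the three types above by year t.
c_star = min_max(composite_coefficients(WEIGHTS, SCORES))
rates = cna_rates([20, 5, 10], 1000)   # used directly as CNA_i*, already in [0, 1]
unit_cicn = cicn(rates, c_star)
```

Because C_i* lies in [0, 1], the factor (C_i* + 1) lies in [1, 2]: the least severe conversion type still contributes its full rate, and the most severe type counts double.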

After calculating CICN for each spatial unit, the natural breaks method is used to categorize CICN into four susceptibility levels: none (0), low (1), medium (2), and high (3). These preliminary levels are further refined through visual interpretation using historical imagery from Google Earth, to correct potential misclassifications in land cover data. Units with low CICN but rapid cropland transition processes are manually upgraded to higher susceptibility classes. Figure 3 outlines the visual interpretation rules corresponding to different susceptibility levels. For example, low-susceptibility areas exhibit surface hardening or bare land, while medium-susceptibility areas contain scattered factories or low-rise buildings. High-susceptibility zones are characterized by extensive non-agricultural built-up areas replacing cropland. To minimize subjectivity, two independent analysts optimized the preliminary levels based on CICN results using historical imagery from Google Earth. Inter-observer agreement reached 85%, indicating substantial consistency.
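The natural breaks step can be sketched as a small dynamic program that splits sorted values into classes with minimal within-class squared deviation. This is a generic illustration, not the authors' tooling: the function names and the sample CICN values are hypothetical, and the subsequent manual refinement by visual interpretation is not modeled.

```python
import numpy as np

def natural_breaks(values, n_classes):
    """Return the upper bound of each class under a Jenks-style optimal partition."""
    x = np.sort(np.asarray(values, dtype=float))
    n = len(x)
    # Prefix sums give the squared deviation of any slice x[i:j] in constant time.
    s1 = np.concatenate(([0.0], np.cumsum(x)))
    s2 = np.concatenate(([0.0], np.cumsum(x ** 2)))

    def ssd(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    cost = np.full((n_classes + 1, n + 1), np.inf)
    split = np.zeros((n_classes + 1, n + 1), dtype=int)
    cost[0, 0] = 0.0
    for k in range(1, n_classes + 1):        # number of classes used so far
        for j in range(k, n + 1):            # first j sorted values are covered
            for i in range(k - 1, j):        # last class is x[i:j]
                c = cost[k - 1, i] + ssd(i, j)
                if c < cost[k, j]:
                    cost[k, j] = c
                    split[k, j] = i
    # Walk back through the stored split points to recover class upper bounds.
    bounds, j = [], n
    for k in range(n_classes, 0, -1):
        bounds.append(float(x[j - 1]))
        j = split[k, j]
    return bounds[::-1]

def to_levels(values, bounds):
    """Map each value to a level 0..len(bounds)-1 (0 = none ... 3 = high)."""
    return np.searchsorted(bounds[:-1], values, side="left")

# Hypothetical CICN values for nine blocks.
scores = [0.0, 0.01, 0.02, 0.30, 0.32, 0.60, 0.62, 0.95, 1.0]
bounds = natural_breaks(scores, n_classes=4)
levels = to_levels(scores, bounds)
```

The triple loop is quadratic in the number of units per class, which is acceptable for a few thousand blocks; a library implementation would be preferable for larger inputs.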

3.2. Construction of the Cropland Non-Agriculturalization Feature System

Currently, there is no consensus regarding the determinants of cropland non-agriculturalization susceptibility. Based on the intrinsic mechanisms of CNA, this study systematically constructs a multi-dimensional feature system that integrates natural attributes, landscape structure, human activity, and spatiotemporal dynamics. Features relevant to CNA susceptibility are derived from four primary dimensions: imagery, land cover, topography, and socioeconomics. The rationale behind each feature type and its explanatory power in the CNA process is discussed in detail. As shown in Figure 4, the proposed four-dimensional feature system balances natural constraints and anthropogenic disturbances, incorporates both static background and dynamic signals, and provides a robust data foundation for subsequent modeling and mechanism interpretation.

3.2.1. Imagery

Remote sensing imagery is the most direct data source for monitoring surface cover. Spectral and texture information respectively reflect the physical attributes and spatial configuration of land features. Therefore, both spectral and texture characteristics are utilized as imagery features. On one hand, different land cover types exhibit significant spectral separability in visible–NIR wavelengths; for example, healthy vegetation reflects strongly in the near-infrared band, whereas construction land shows markedly lower reflectance. Thus, mean band values are selected as spectral features to distinguish cropland from built-up areas based on inherent radiometric responses without prior knowledge. On the other hand, cropland non-agriculturalization is often accompanied by either homogenization (e.g., contiguous factory roofs) or fragmentation (e.g., scattered sheds) of land surface patterns. Gray-Level Co-occurrence Matrix (GLCM) metrics are used to quantify these structural changes [50]. Four indicators, namely contrast (texture sharpness), entropy (complexity), angular second moment (uniformity), and inverse difference moment (local homogeneity), are employed to capture texture variation caused by human activities in multiple directions, compensating for the limited spatial sensitivity of spectral features.

3.2.2. Land Cover

Land cover data not only represent land classification but also imply human intentions and spatial interaction patterns. This study combines static semantic features and dynamic NDVI time-series features to construct land cover indicators. Traditional categorical encoding is insufficient to represent functional associations. To address this, an analogy with text is used to build a land category corpus: “region as document”, “land type as word”, and “land sequence as sentence”. Using Word2Vec, each land category is embedded into a high-dimensional semantic space to capture potential transition logics embedded in spatial contexts [51]. Weighted average and Principal Component Analysis (PCA) are applied to derive semantic features of each region, preserving contextual meaning while unifying data across multiple spatial scales. Furthermore, because CNA frequently exhibits seasonal exposure differences, manifested by distinct NDVI curves across spring tillage, summer growth, autumn harvest, and winter abandonment, quarterly NDVI statistics are used to identify permanently abandoned, seasonally fallowed, or transitional cropland, thereby mitigating misclassification risks from static land cover data.

3.2.3. Topography

Topography indirectly influences CNA through development cost and land suitability. Plains, hills, and mountainous areas exhibit substantial variation in slope, aspect, and elevation. Flat areas offer low development costs and are more susceptible to urban or industrial encroachment. South-facing terraced cropland, although highly efficient for farming, may be repurposed for tourism because of its scenic value. Hilly and mountainous areas often have fragmented, marginal cropland that may be abandoned or passively converted under policy interventions. Thus, slope, aspect, elevation range, and coefficient of variation are selected to characterize topographic constraints on land use change.

3.2.4. Socioeconomy

Socioeconomic factors are direct drivers of CNA, and their spatial heterogeneity governs the probability of land-use transformation. Transport accessibility is one of the most critical factors influencing CNA and is a leading indicator of urban expansion. Road network density and bus stop counts jointly reflect agricultural logistics efficiency and the accessibility of construction land, thus influencing the location of urban expansion and non-agricultural projects. Cropland-to-construction is the most common CNA pathway. Higher building density and expanded built-up area indicate stronger land development intensity and increased risk of surrounding cropland loss. Economic development and industrial upgrading, especially the expansion of secondary industries, are tightly linked to the conversion of cropland to non-agricultural uses. High levels of economic activity further exacerbate land use competition. Population totals and densities influence CNA through housing, employment, and infrastructure demands, and spatial population distribution helps locate urbanization hotspots. Therefore, four indicators, namely transport accessibility, building density, economic vitality, and population pressure, are selected to represent the socioeconomic dimension of CNA drivers.

3.3. Graph Model Construction

3.3.1. Node and Edge Construction

Geographical entities, characterized by fixed spatial relationships and dynamically changing attribute features, are inherently well-suited for graph-based modeling. A graph structure consists of nodes and edges, in which nodes represent the smallest spatial unit of analysis. In geographical research, the spatial scale determines the extent and granularity of the study area. Within the context of cropland non-agriculturalization (CNA), macro-scale studies often focus on urban or provincial patterns, benefiting from higher data accessibility, whereas micro-scale studies emphasize spatial distribution and transitions at the grid-cell level, which are advantageous for understanding the underlying drivers and mechanisms.

This study adopts a meso-scale representation based on street-block units to construct the spatial graph structure for CNA prediction. The rationale is threefold: (1) As a meso-scale spatial unit, street blocks can capture complex interactions across adjacent units, which reflect spatial spillover effects. For instance, extensive non-agricultural development within a given block may accelerate CNA processes in its neighboring blocks, a pattern that can be effectively captured through graph edge connections. (2) Compared to raster-based representations, block-level graph structures are better suited for integrating multi-temporal and multi-resolution heterogeneous datasets, and for expressing the complex relationships inherent in CNA processes. (3) Street blocks also serve as fundamental administrative units in urban planning and management. Thus, predictions made at this scale are more aligned with practical land-use decision-making, enabling early identification and prioritization of high-risk areas for policy intervention.

To ensure consistent delineation of street-block units across heterogeneous urban-rural contexts, the road network extracted from OpenStreetMap was first refined through manual correction using Sentinel-2 and Google Earth high-resolution imagery. In sparsely roaded rural areas, visually identifiable field paths, irrigation ditches, and linear settlement boundaries were used to supplement missing road segments, thereby forming enclosed polygons. Blocks smaller than 40 ha were merged and those larger than 4000 ha were subdivided to reduce size heterogeneity. As shown in Figure 5, 1290 blocks were identified, with an average area of 795.8 ha. However, the resulting coefficient of variation (CV) of 1.04 indicates substantial heterogeneity in block size, reflecting pronounced spatial variability across the urban–rural transition zones. To mitigate the influence of uneven block size, all density-related features (e.g., building, road, and population densities) were normalized by block area. This normalization reduces first-order scale bias, so that feature values are comparable across heterogeneous spatial units. Since block size variation may still affect spatial connectivity and intra-block heterogeneity, a sensitivity analysis (±20% block area adjustment) was conducted to verify the robustness of the model results; the results are presented in Table A2. When block areas were adjusted by ±20%, the variations in model AUC and F1-score were within 1.5%, indicating that the model is robust to moderate changes in block size. All delineations were checked by two independent analysts, and discrepancies below 5% in block boundaries confirmed the internal consistency of the segmentation.
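The block-size heterogeneity measure and the area normalization can be written out in a few lines. The block areas and building counts below are hypothetical, and the use of the population standard deviation in the CV is an assumption, since the text does not say which estimator was used.

```python
import numpy as np

def coefficient_of_variation(x):
    """CV = standard deviation / mean (population standard deviation assumed)."""
    x = np.asarray(x, dtype=float)
    return float(np.std(x) / np.mean(x))

def per_area_density(counts, area_ha):
    """Normalize a per-block count by block area so blocks of different size compare."""
    return np.asarray(counts, dtype=float) / np.asarray(area_ha, dtype=float)

# Hypothetical blocks within the 40-4000 ha size limits described above.
area_ha = np.array([50.0, 200.0, 800.0, 3200.0])
building_count = np.array([10, 40, 80, 160])

cv = coefficient_of_variation(area_ha)
building_density = per_area_density(building_count, area_ha)  # buildings per hectare
```

Here the largest block holds the most buildings in absolute terms but has the lowest density, which is the first-order scale bias that the normalization removes.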

3.3.2. Feature Calculation

Following the data sources outlined in Section 2.2 and the feature framework developed in Section 3.2, a complete computational pipeline was established to translate conceptual definitions into quantifiable indicators for each node.

Imagery Features

Spectral features were calculated by computing the arithmetic mean of surface reflectance values for each node in four Sentinel-2 bands: Blue (B2/0.490 μm), Green (B3/0.560 μm), Red (B4/0.665 μm), and Near-Infrared (B8/0.842 μm), resulting in a four-dimensional spectral feature vector. Texture features were derived using the gray-level co-occurrence matrix (GLCM) method, calculated across four directions, specifically the horizontal, vertical, and two diagonal axes. From these, four indices were extracted: contrast, entropy, angular second moment (ASM), and inverse difference moment (IDM), capturing image sharpness, complexity, uniformity, and local consistency, respectively.
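The four GLCM indices can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions rather than the processing chain used in the study: a pixel distance of 1, a symmetric and normalized co-occurrence matrix, the natural logarithm for entropy, and simple averaging over the four directions are all assumptions, and the function names are hypothetical.

```python
import math


def glcm(image, dx, dy, levels):
    """Normalized, symmetric gray-level co-occurrence matrix for one offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[a][b] += 1.0
                counts[b][a] += 1.0  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_indices(p):
    """Contrast, entropy, ASM and IDM of a normalized co-occurrence matrix."""
    contrast = entropy = asm = idm = 0.0
    for i, row in enumerate(p):
        for j, v in enumerate(row):
            contrast += v * (i - j) ** 2
            asm += v * v
            idm += v / (1.0 + (i - j) ** 2)
            if v > 0.0:
                entropy -= v * math.log(v)
    return {"contrast": contrast, "entropy": entropy, "asm": asm, "idm": idm}


# Offsets (dx, dy): horizontal, vertical, and the two diagonals at distance 1.
OFFSETS = [(1, 0), (0, 1), (1, -1), (1, 1)]


def mean_texture(image, levels):
    """Average each index over the four directions (an assumed aggregation)."""
    per_dir = [texture_indices(glcm(image, dx, dy, levels)) for dx, dy in OFFSETS]
    return {k: sum(d[k] for d in per_dir) / len(per_dir) for k in per_dir[0]}


# A perfectly uniform patch: no contrast, no entropy, maximal ASM and IDM.
flat = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
features = mean_texture(flat, levels=4)
```

A uniform patch gives the limiting values of all four indices, which makes it a convenient sanity check before applying the functions to quantized reflectance data.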

Land Cover Features

As shown in Figure 6, for static semantic representation of land cover, each region was treated as a “document”, each pixel’s land cover type as a “word”, and sequences generated by random walks as “sentences”. For instance, with a walk length of five, a sample sentence could be “water–cropland–water–water–forest”. These sequences formed a land cover corpus, used to generate high-dimensional semantic embeddings via the Word2Vec Skip-Gram model. The weighted average of embeddings in each node was reduced to 10 principal components via PCA to form semantic land cover features. Additionally, dynamic NDVI time-series features were derived by computing quarterly averages from monthly NDVI data to capture seasonal phenological patterns and differentiate fallow, transitional, or intensively cultivated areas.
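The construction of the land cover corpus can be illustrated with a short sketch. The example below generates random-walk "sentences" over a small categorical grid. The 4-connected neighborhood, the uniformly random start pixel, and all function names are assumptions made for illustration, because the walk rules are not specified in the text.

```python
import random


def random_walk_sentence(grid, length, rng):
    """Walk over 4-connected pixels and record the land cover class at each step."""
    rows, cols = len(grid), len(grid[0])
    r, c = rng.randrange(rows), rng.randrange(cols)
    words = [grid[r][c]]
    while len(words) < length:
        moves = [
            (r + dr, c + dc)
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= r + dr < rows and 0 <= c + dc < cols
        ]
        r, c = rng.choice(moves)
        words.append(grid[r][c])
    return words


def build_corpus(grid, n_sentences, length, seed=0):
    """Collect several walks from one region ("document") into a corpus."""
    rng = random.Random(seed)
    return [random_walk_sentence(grid, length, rng) for _ in range(n_sentences)]


# Hypothetical 3 x 3 land cover patch for one street block.
block = [
    ["water", "cropland", "cropland"],
    ["water", "cropland", "forest"],
    ["water", "water", "forest"],
]
corpus = build_corpus(block, n_sentences=4, length=5)
```

A corpus of this form is the usual input of a Skip-Gram implementation, for example gensim's `Word2Vec` with `sg=1`; the text does not name the library that was actually used.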

Topographic Features

Topographic variables were derived from DEM data. Using ArcGIS zonal statistics, each node’s slope, aspect, and elevation range were computed. Elevation variability was quantified using the coefficient of variation of elevation, resulting in seven topographic features.

Socioeconomic Features

Socioeconomic indicators were extracted using ArcGIS 10.8 from both remote sensing and crowdsourced vector data. Accessibility metrics, including road density and bus stop counts, were computed from OpenStreetMap and bus data. Building density was calculated from rooftop area datasets. Economic vitality was characterized using GDP, secondary industry output and growth, and nighttime light indices (TNLI and ANLI). Population pressure was measured by aggregating WorldPop pixel-level data within each node to obtain total population and dividing by node area to calculate population density.

3.4. Cropland Non-Agriculturalization Susceptibility Prediction Model

The proposed Cropland Non-Agriculturalization Graph Attention Network (GS-GAT) comprises two core components: a GraphSMOTE-based sample balancing module and a Graph Attention Network (GAT) module. As illustrated in Figure 7, the geographical graph containing susceptibility prediction features and class labels is first augmented using the GraphSMOTE strategy to improve minority class representation. The resulting augmented graph is then fed into the GAT model. Through a multi-head attention mechanism driven by attribute similarity, node features are aggregated within the convolution layers. Following multi-layer feature updates guided by both classification and structural reconstruction losses, the model outputs predictions of CNA susceptibility levels for the subsequent year.

3.4.1. Minority Node Augmentation Based on GraphSMOTE

In CNA susceptibility prediction, certain classes are underrepresented in the dataset, which introduces bias toward majority classes, leading to reduced prediction accuracy. To address this issue, this study incorporates a sample balancing strategy based on GraphSMOTE, a graph-based extension of the classical SMOTE algorithm proposed by Zhao et al. [52]. GraphSMOTE is specifically designed to tackle class imbalance in graph-structured data by oversampling minority class nodes while preserving the topological structure and attribute distribution of the original graph. As shown in Figure 8, the algorithm operates by leveraging the neighborhood relationships of minority nodes. Using feature vectors of original minority nodes and their nearby neighbors, it generates synthetic samples through linear interpolation, and also creates new edges connecting synthetic and real nodes to maintain the graph’s connectivity. The interpolation process accounts for both attribute similarity and graph topology, ensuring the generated nodes are realistic and topologically consistent. To prevent label leakage, augmentation is only applied during training.
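The interpolation idea can be sketched as follows. This is a deliberately simplified illustration and not the GraphSMOTE algorithm of Zhao et al. [52]: here interpolation is done directly on raw feature vectors, the partner node is the nearest node of the same class, and each synthetic node is simply linked to its two parent nodes. All of these choices, as well as the function names and the toy data, are assumptions.

```python
import random


def nearest_same_class(idx, features, labels):
    """Index of the closest node (Euclidean) that shares the label of node idx."""
    best, best_d = None, float("inf")
    for j, (f, y) in enumerate(zip(features, labels)):
        if j == idx or y != labels[idx]:
            continue
        d = sum((a - b) ** 2 for a, b in zip(features[idx], f))
        if d < best_d:
            best, best_d = j, d
    return best


def oversample(features, labels, edges, minority, n_new, seed=0):
    """Return an augmented copy of the graph with n_new synthetic minority nodes."""
    rng = random.Random(seed)
    seeds = [i for i, y in enumerate(labels) if y == minority]
    partner = {i: nearest_same_class(i, features, labels) for i in seeds}
    new_features, new_labels, new_edges = list(features), list(labels), list(edges)
    for _ in range(n_new):
        i = rng.choice(seeds)
        j = partner[i]
        t = rng.random()  # interpolation coefficient in [0, 1)
        synthetic = [a + t * (b - a) for a, b in zip(features[i], features[j])]
        new_id = len(new_features)
        new_features.append(synthetic)
        new_labels.append(minority)
        # Simplification: connect the synthetic node to both parent nodes.
        new_edges += [(new_id, i), (new_id, j)]
    return new_features, new_labels, new_edges


# Toy training graph: class 1 is the minority class.
X = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]
y = [1, 1, 0, 0, 0]
E = [(0, 1), (1, 2), (2, 3), (3, 4)]
X_aug, y_aug, E_aug = oversample(X, y, E, minority=1, n_new=2)
```

Consistent with the leakage constraint stated above, such a routine would be called on the training nodes only, leaving the test nodes and their labels untouched.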

3.4.2. GCNN with Graph Attention Mechanism

A three-layer Graph Attention Network (GAT) is employed to model spatial feature aggregation and susceptibility classification. Originally proposed by Veličković et al. [41], GAT addresses the limitations of traditional GNNs by introducing an attention mechanism that allows for the adaptive assignment of weights to neighboring nodes based on their features. This flexibility enables the model to better capture both local and global graph structure.

Aligned with Tobler’s First Law of Geography and the spatiotemporal nature of CNA, an attribute-driven attention mechanism is adopted. At each layer, node embeddings are updated by computing the similarity between a center node and its neighbors in terms of attribute features. Higher similarity yields greater attention weights, and the final embedding is a weighted sum of neighbor features. The attention-based update is formally expressed as:

r_{ij} = a\left(\left[W f_{i}^{n} \,\|\, W f_{j}^{n}\right]\right)

(4)

w_{ij} = \frac{\exp\left(\mathrm{LeakyReLU}(r_{ij})\right)}{\sum_{k \in N_{i}} \exp\left(\mathrm{LeakyReLU}(r_{ik})\right)}

(5)

f_{i}^{n+1} = \delta\left(\sum_{j \in N_{i}} w_{ij} W f_{j}^{n}\right)

(6)

where r_{ij} represents the integrated attribute features of node i and node j, \| denotes the feature concatenation operation, W is a shared learnable weight matrix, a is the attention scoring function, N_{i} is the set of neighbors of node i, w_{ij} is the normalized attention weight coefficient, \delta is a nonlinear activation function, f_{i}^{n+1} stands for the updated node features, and n indicates the current number of graph convolutional layers.
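The attention update of Equations (4)–(6) can be traced step by step in a small single-head sketch. The code below is illustrative only: the scoring function a is taken to be a dot product with a weight vector, the activation is ReLU, neighborhoods include self-loops, and the toy features, weights, and function names are all assumptions rather than the configuration used in GS-GAT.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def gat_layer(features, neighbors, W, a):
    """One single-head attention update following Eqs. (4)-(6)."""
    h = [matvec(W, f) for f in features]  # W f for every node
    out, weights = [], []
    for i, nbrs in enumerate(neighbors):
        # Eq. (4): score each neighbor from the concatenation [W f_i || W f_j].
        r = [sum(ak * v for ak, v in zip(a, h[i] + h[j])) for j in nbrs]
        # Eq. (5): softmax of the LeakyReLU scores over the neighborhood.
        s = [leaky_relu(v) for v in r]
        m = max(s)  # subtract the maximum for numerical stability
        e = [math.exp(v - m) for v in s]
        z = sum(e)
        w = [v / z for v in e]
        # Eq. (6): weighted sum of transformed neighbor features, then ReLU.
        agg = [sum(wj * h[j][k] for wj, j in zip(w, nbrs)) for k in range(len(h[i]))]
        out.append([max(0.0, v) for v in agg])
        weights.append(w)
    return out, weights


# Toy graph with three nodes; neighbor lists include self-loops (assumed).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = [[0, 1], [0, 1, 2], [1, 2]]
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.5, -0.5, 0.5, 0.5]
updated, attention = gat_layer(features, neighbors, W, a)
```

Each row of `attention` sums to one, so the update is a convex combination of transformed neighbor features before the activation is applied. A multi-head layer would repeat this computation with separate `W` and `a` per head.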

During model training, both classification loss and structural reconstruction loss are jointly optimized. After applying the GraphSMOTE strategy to augment minority-class nodes within the geographical graph, which comprises CNA susceptibility features and corresponding labels, several temporally distinct augmented graphs are simultaneously fed into the GAT model for parallel training. The model computes the error between the node-level outputs from each GAT layer and the one-hot encoded ground truth labels, and uses backpropagation to iteratively update the trainable parameters. This process yields susceptibility predictions across all time periods. In each training iteration, the model takes the node features and edge indices of the current graph batch as input and performs forward propagation, producing two outputs, unnormalized node-level logits and raw edge-level logits. The classification loss is computed using categorical cross-entropy between the node logits and the ground truth labels. Simultaneously, the structural reconstruction loss is calculated using binary cross-entropy between the predicted edge logits and the ground-truth adjacency matrix. These two loss components are combined using a weighted sum to form the total loss:

\mathrm{loss} = \mathrm{loss}_{cls} + \lambda \times \mathrm{loss}_{rec}

(7)

Here, λ is a tunable hyperparameter that balances the contribution of structural consistency to the overall loss.

This formulation ensures that classification loss primarily drives the node-level prediction task, while the structure-aware reconstruction loss acts as a regularization term. By preserving the original graph topology in the learned embeddings, this design mitigates overfitting, enhances generalization, and partially offsets the bias introduced by class imbalance during training.
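The joint objective of Equation (7) can be written out in a few lines. The following sketch is a plain-Python illustration, not the training code of the study: averaging the classification loss over nodes and the reconstruction loss over all node pairs is an assumption, and the function names and toy inputs are hypothetical.

```python
import math


def cross_entropy(logits, label):
    """Categorical cross-entropy for one node from unnormalized logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]


def bce_with_logit(logit, target):
    """Binary cross-entropy for one edge from a raw logit (numerically stable)."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def joint_loss(node_logits, labels, edge_logits, adjacency, lam):
    """Eq. (7): classification loss plus lambda times reconstruction loss."""
    loss_cls = sum(cross_entropy(l, y) for l, y in zip(node_logits, labels)) / len(labels)
    n = len(adjacency)
    pairs = [(edge_logits[i][j], adjacency[i][j]) for i in range(n) for j in range(n)]
    loss_rec = sum(bce_with_logit(z, t) for z, t in pairs) / len(pairs)
    return loss_cls + lam * loss_rec, loss_cls, loss_rec


# Toy example: two nodes, four susceptibility levels, uninformative logits.
node_logits = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
labels = [0, 3]
edge_logits = [[0.0, 0.0], [0.0, 0.0]]
adjacency = [[0, 1], [1, 0]]
total, loss_cls, loss_rec = joint_loss(node_logits, labels, edge_logits, adjacency, lam=0.5)
```

With uninformative logits the two terms reduce to ln 4 and ln 2, which gives a simple reference point for checking an implementation before training.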

3.5. Model Training and Evaluation Settings

The model was implemented in Python 3.8 using the PyTorch 1.7.0+cu101 deep learning framework. The experimental environment consisted of an Intel Core i7-10700 CPU and an NVIDIA GeForce GTX 1660 SUPER GPU. Based on the feature dataset established in Section 3.2, the entire dataset was divided into a training set and a test set at a ratio of 7:3. Each temporal period contained 896 training samples and 394 test samples. Model hyperparameters, including the number of epochs, learning rate, number of layers, number of attention heads, and the weight λ of the joint loss, were optimized via grid search. The hyperparameters and cross-validation results are shown in Table A3. The training batch size was set to 4. The model was optimized using the Adam optimizer, the cross-entropy loss function, and a cosine learning rate scheduler.
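For readers unfamiliar with cosine learning rate scheduling, the standard annealing rule is sketched below. The maximum learning rate, the number of steps, and the absence of warm-up or restarts are assumptions chosen for illustration; the actual settings are those reported in Table A3.

```python
import math


def cosine_lr(step, total_steps, lr_max, lr_min=0.0):
    """Cosine annealing from lr_max at step 0 down to lr_min at total_steps."""
    progress = min(step, total_steps) / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


# Hypothetical schedule: 100 steps starting from a learning rate of 0.01.
schedule = [cosine_lr(t, 100, lr_max=0.01) for t in range(101)]
```

The rate decays slowly at first, fastest at the midpoint, and slowly again near the end. PyTorch provides a scheduler of this shape as `torch.optim.lr_scheduler.CosineAnnealingLR`, although the text does not state which scheduler class was used.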

Model evaluation serves as a critical mechanism for quantifying the performance of classification algorithms. In this study, two widely recognized metrics, namely the Receiver Operating Characteristic Area Under the Curve (ROC-AUC) and the Micro-F1 score, are adopted to evaluate the classification performance of the proposed CNA susceptibility prediction model from multiple perspectives.

ROC-AUC is one of the most commonly used metrics for multi-class classification problems. It evaluates the model’s ability to distinguish between positive and negative instances, offering a comprehensive measure of discriminative performance. This metric is particularly well-suited for imbalanced classification tasks such as CNA susceptibility prediction. The ROC curve plots the true positive rate (TPR) against the false positive rate (FPR) across various classification thresholds. The AUC value, defined as the area under the ROC curve, ranges from 0 to 1, with higher AUC values indicating stronger classification performance. In this study, the One-vs-One (OvO) strategy is adopted to compute multi-class ROC-AUC values.

The Micro-F1 score, defined as the harmonic mean of precision and recall, evaluates the model’s performance by treating all classes collectively. This metric emphasizes holistic model performance across all instances, making it especially effective under conditions of class imbalance. It avoids overemphasizing dominant classes and better reflects the model’s practical classification capability.

These performance metrics are derived from four fundamental statistics. True Positives (TP): the number of instances correctly predicted as class i. False Positives (FP): the number of instances from other classes incorrectly predicted as class i. True Negatives (TN): the number of instances not in class i correctly predicted as such. False Negatives (FN): the number of instances in class i incorrectly predicted as belonging to another class. In the context of CNA susceptibility prediction, these metrics help quantify the model’s ability to identify high-risk regions while minimizing misclassification across different susceptibility levels. For multi-class classification tasks, confusion matrices are also employed to visualize prediction accuracy and misclassification tendencies across all classes.
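The per-class definitions above translate directly into operations on a confusion matrix. The sketch below assumes that rows index the true class and columns the predicted class; the matrix values are hypothetical and do not come from the experiments.

```python
def per_class_counts(confusion):
    """confusion[t][p] = number of samples with true class t predicted as p."""
    n = len(confusion)
    total = sum(map(sum, confusion))
    stats = []
    for i in range(n):
        tp = confusion[i][i]
        fn = sum(confusion[i]) - tp                        # class i, predicted otherwise
        fp = sum(confusion[t][i] for t in range(n)) - tp   # other classes, predicted i
        tn = total - tp - fn - fp
        stats.append({"TP": tp, "FP": fp, "TN": tn, "FN": fn})
    return stats


# Hypothetical 4-level confusion matrix (None, Low, Mid, High), illustration only.
cm = [
    [50, 5, 0, 0],
    [6, 20, 4, 0],
    [0, 3, 15, 2],
    [0, 0, 1, 9],
]
stats = per_class_counts(cm)
```

For every class the four counts add up to the total number of samples, which is a quick consistency check on the one-vs-rest decomposition.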

The formulas used to compute these evaluation metrics are as follows:

\mathrm{TNR} = \frac{TN}{TN + FP}

(8)

\mathrm{Precision} = \mathrm{PPV} = \frac{TP}{TP + FP}

(9)

\mathrm{Recall} = \mathrm{TPR} = \frac{TP}{TP + FN}

(10)

\mathrm{FPR} = \frac{FP}{FP + TN}

(11)

\mathrm{ROC\text{-}AUC}_{OvO} = \frac{2}{C(C-1)} \sum_{1 \leq i < j \leq C} \frac{\sum_{k=1}^{n_{i,j}} \mathrm{TPR}_{k} \times (\mathrm{FPR}_{k} - \mathrm{FPR}_{k-1})}{\mathrm{FPR}_{max}}

(12)

\mathrm{Micro\text{-}F1} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}

(13)

Here, C represents the number of classes, (i, j) denotes a pair of classes, and n_{i,j} refers to the number of thresholds used to calculate the AUC value for the class pair (i, j). In Equation (13), Precision and Recall are computed from the TP, FP, and FN counts pooled over all classes.
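A compact way to see how the two headline metrics behave is to compute them on a toy example. The sketch below is an illustration under assumptions: the pairwise AUC is computed with the rank (Mann-Whitney) formulation, which gives the area under the ROC curve without enumerating thresholds as in Equation (12), each class pair is scored in both directions and averaged, and the labels and probabilities are hypothetical.

```python
def binary_auc(pos_scores, neg_scores):
    """Probability that a positive outranks a negative (ties count one half)."""
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            wins += 1.0 if p > q else 0.5 if p == q else 0.0
    return wins / (len(pos_scores) * len(neg_scores))


def ovo_auc(y_true, probs, n_classes):
    """Unweighted mean of pairwise AUCs over all class pairs i < j."""
    total, n_pairs = 0.0, 0
    for i in range(n_classes):
        for j in range(i + 1, n_classes):
            idx_i = [k for k, y in enumerate(y_true) if y == i]
            idx_j = [k for k, y in enumerate(y_true) if y == j]
            a_ij = binary_auc([probs[k][i] for k in idx_i], [probs[k][i] for k in idx_j])
            a_ji = binary_auc([probs[k][j] for k in idx_j], [probs[k][j] for k in idx_i])
            total += 0.5 * (a_ij + a_ji)
            n_pairs += 1
    return total / n_pairs


def micro_f1(y_true, y_pred):
    """Micro-F1 from TP, FP and FN pooled over all classes."""
    tp = sum(t == p for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    fp = fn = len(y_true) - tp  # every error is one FP and one FN
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical three-class example.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
probs = [
    [0.8, 0.1, 0.1], [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2], [0.1, 0.5, 0.4],
    [0.1, 0.2, 0.7], [0.2, 0.2, 0.6],
]
auc = ovo_auc(y_true, probs, n_classes=3)
f1 = micro_f1(y_true, y_pred)
```

Because each sample carries exactly one true and one predicted label, pooled precision and recall coincide, so Micro-F1 equals overall accuracy in this single-label setting.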

4. Results

4.1. Prediction Results of CNA Susceptibility

Based on the spatial partitioning and dataset described earlier, CNA susceptibility prediction experiments were conducted. Figure 9 displays the prediction results for the test area (Hannan and Jiangxia districts) across four distinct time periods. As shown in Table 3, the model achieved a highest F1 score of 82.6% and a highest AUC of 85.6% in certain periods. Furthermore, given that CNA susceptibility is ordinal in nature, a “relaxed correctness” criterion is defined, in which predictions that deviate from the true level by no more than one level (|diff| ≤ 1) are considered acceptable. Under this condition, the model’s F1 and AUC scores further improve, with maximum values reaching 91% and 93%, respectively. The true/false prediction map is shown in Figure 10a. To assess the robustness of the model and its spatial generalization beyond specific regional divisions, a 5-fold spatial cross-validation experiment was conducted. The AUC and F1 scores for each fold, along with their corresponding means and standard deviations, are summarized in Table 4. Comparing the mean cross-validation results with those from the original fixed spatial split shows that the performance differences (Δ) were consistently within the expected variability range (|Δ| < KFold_{std}), indicating that the model is not sensitive to regional partitioning and demonstrates stable spatial generalization.
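The stability check described above amounts to comparing one difference with one standard deviation. The sketch below illustrates it with hypothetical scores that are not taken from Table 4; the use of the sample standard deviation is an assumption, and the function name is introduced only for this example.

```python
from statistics import mean, stdev


def within_fold_variability(fixed_split_score, fold_scores):
    """Return (|delta| < fold std, delta) for delta = fixed score - fold mean."""
    delta = fixed_split_score - mean(fold_scores)
    return abs(delta) < stdev(fold_scores), delta


# Hypothetical fold scores and fixed-split score, for illustration only.
folds = [0.84, 0.86, 0.85, 0.87, 0.83]
stable, delta = within_fold_variability(0.86, folds)
```

When the check holds, the fixed split is statistically indistinguishable from a typical fold at the level of fold-to-fold variation, which is the sense in which the partitioning is described as not influential.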

As shown in the relaxed prediction confusion matrix (Figure 10b), when prediction errors do not exceed one level (|diff| ≤ 1), the model tends to predict low susceptibility as none (1→0) and moderate susceptibility as high (2→3). In practical applications, such near-miss misclassifications are tolerable in the implementation of cropland protection policies. For example, treating low and none as equivalent may not significantly affect regulatory actions, and predicting a moderate area as high may prompt stricter protective measures, thereby better safeguarding cropland. Thus, relaxed prediction metrics serve as a meaningful supplement to strict classification metrics: they reflect the model’s ability to achieve near-correct classification, which aligns with the real-world tolerance of policy implementation.
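One simple way to operationalize the relaxed criterion is to count a prediction as correct whenever it lies within one level of the true label. The sketch below does this by mapping tolerated predictions onto the true level before scoring. This is an assumed procedure for illustration, since the text does not specify how the relaxed F1 and AUC were computed, and the labels are hypothetical.

```python
def relax_predictions(y_true, y_pred, tolerance=1):
    """Replace a prediction by the true level when it is within `tolerance` levels."""
    return [t if abs(p - t) <= tolerance else p for t, p in zip(y_true, y_pred)]


def accuracy(y_true, y_pred):
    """Share of exact matches (equal to Micro-F1 for single-label data)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels on the four-level scale (0: None ... 3: High).
y_true = [0, 1, 2, 3, 3, 0]
y_pred = [0, 0, 3, 3, 0, 2]
strict = accuracy(y_true, y_pred)
relaxed = accuracy(y_true, relax_predictions(y_true, y_pred))
```

In the toy data the adjacent-level errors (1→0, 2→3) are forgiven, while the two-level and three-level errors (0→2, 3→0) still count against the model.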

Overall, the GS-GAT model demonstrates strong predictive performance across all four test years, offering robust evidence for its practical applicability in forecasting CNA susceptibility. The use of relaxed correctness criteria further highlights the model’s flexibility and real-world suitability.

4.2. Ablation Experiment Results

4.2.1. Feature Dimension Ablation

To evaluate the contribution of different feature dimensions to model performance, a series of ablation experiments were conducted by selectively removing one or more dimensions from the input features. As shown in Table 5 and Table 6, the average AUC and F1 scores showed consistent decreases when individual or combined feature dimensions were removed. However, the extent of the performance drop varied across feature types. The removal of socioeconomic features led to a relatively modest decline, whereas the exclusion of imagery and land cover features resulted in more pronounced performance degradation. These findings were further supported by the results in Table 7, where models trained with only a single feature dimension yielded substantially lower scores compared to the full-feature model.

4.2.2. Ablation of the GraphSMOTE Strategy

As shown in Table 8, applying the GraphSMOTE strategy significantly increased the number and diversity of minority-class samples in the training set. Given the marked class imbalance in the original dataset, where one dominant class substantially outnumbers the others, GraphSMOTE was employed to augment the three underrepresented susceptibility levels. This oversampling approach aimed to balance the training set and improve classification performance across all classes. As shown in Table 9 and Figure 11, the baseline GAT model achieves a test F1 of 0.64, which is significantly lower than the training results (0.76), indicating limited generalization under class imbalance. In comparison, the under-sampling variant converges slightly faster but attains a lower test F1 of 0.62, suggesting that the reduction of majority samples weakens overall representation capacity. Moreover, the GAT with Focal Loss exhibits oscillations during early training due to the adaptive weighting of difficult samples and later stabilizes with improved generalization, reaching a test F1 of 0.69. By contrast, the proposed GAT with GraphSMOTE maintains stable training behavior similar to the baseline while achieving the highest test F1 of 0.77, demonstrating that the synthetic nodes generated by GraphSMOTE enhance minority class representation without inducing overfitting, thereby improving both discrimination and generalization performance.

4.3. Comparative Experiment Results

To further evaluate the effectiveness of the proposed GS-GAT model, comparative experiments were conducted against two widely used baseline models: Support Vector Machine (SVM) and Extreme Gradient Boosting (XGBoost). SVM is a supervised learning model grounded in statistical learning theory. It constructs an optimal hyperplane in feature space to maximize the margin between different classes, and is particularly suitable for small-sample, nonlinear, and high-dimensional problems. It has been widely applied in text classification and image recognition. XGBoost, on the other hand, is an ensemble learning algorithm based on decision trees. It builds multiple decision trees iteratively, optimizes a target loss function using gradient descent, and incorporates regularization to prevent overfitting. XGBoost is highly effective on structured and large-scale datasets and has been extensively used in sales forecasting and risk assessment. As shown in Table 10, the GS-GAT model outperforms both SVM and XGBoost in terms of average performance across all periods and peak performance in individual periods. This is particularly noticeable in AUC values. Under relaxed prediction criteria (|diff| ≤ 1), although SVM and XGBoost can achieve F1 scores comparable to GS-GAT, their AUC values are significantly lower. This indicates that GS-GAT has a superior overall discrimination ability across all classification thresholds and is more robust to imbalanced data distributions. The superior performance of GS-GAT is largely attributable to its ability to leverage spatial relationships and semantic interactions between nodes within the graph structure, as well as its capacity to integrate multi-source features comprehensively.

Quantitative results and qualitative comparisons are shown in Figure 12, which presents prediction outputs under different land use scenarios. (1) Fragmented Farmland: In scenarios where cropland is irregular, fragmented, and dispersed, GS-GAT effectively captures spatial complexity and predicts high-susceptibility zones using topological and contextual information. In contrast, SVM and XGBoost rely solely on node-level features and often misclassify large-area nodes by ignoring localized transitions. SVM is particularly sensitive to noise, which can lead to false positives in cases of bare fields or greenhouse reflections. (2) Continuous Farmland: When cropland is contiguous and CNA susceptibility is generally low, GS-GAT effectively captures the smooth spatial transitions between nodes, yielding consistent predictions. SVM and XGBoost, lacking spatial context, tend to overfit to spectral changes, resulting in prediction artifacts. (3) High-Density Built-Up Areas: Both GS-GAT and SVM perform well, accurately identifying susceptibility levels. However, XGBoost frequently misclassifies high-risk areas as medium or low risk due to its lower sensitivity to subtle transitions in built-up environments. (4) Mixed Land Use Areas: In complex regions at the cropland-construction or cropland-water interfaces, GS-GAT demonstrates strong integrative capability, producing results highly consistent with actual labels. SVM is overly sensitive to spectral variation, leading to over-prediction of susceptibility, while XGBoost underestimates susceptibility, especially in built-up areas.

As shown in the confusion matrices in Figure 13, all three models achieve high precision when predicting extreme classes (0: None, 3: High), but perform less accurately on intermediate classes (1: Low, 2: Mid). This is likely due to less distinct feature separability between these mid-range categories. GS-GAT’s misclassifications are primarily between adjacent levels (e.g., 1→0, 2→3), indicating structured error tendencies. An exception was observed during the 2020–2021 period, when a spike in misclassification from High (3) to None (0) occurred; this phenomenon is potentially attributable to COVID-19 disruptions, which altered land-use behavior and weakened the correlations the model had learned between socioeconomic or spectral signals and CNA. In contrast, SVM and XGBoost exhibit more systemic underestimation of high-risk classes, with 18%–25% more misclassifications of Mid (2) and High (3) as None (0) compared to GS-GAT. Such underestimation substantially increases the risk of missed detections in high-susceptibility areas.

5. Discussion

The prediction of CNA susceptibility lies at the intersection of geographical information science, environmental economics, and sustainable land management. While existing monitoring methods have predominantly focused on retrospective change detection [53] and high-frequency pixel-level analysis [54], this study adopts a different perspective. By conceptualizing CNA as a structural outcome arising from the complex interplay of multi-source features within street-block units, it moves beyond simple spectral change detection toward a systematic understanding of the latent propensity for land-use transition. The experimental results, which demonstrate an average AUC of 85.6% and a relaxed F1 score of up to 91%, provide evidence that the GS-GAT framework effectively captures the functional structure of the urban-rural transition zone.

5.1. Meso-Scale Representation and Multi-Resolution Data Fusion

Previous research concerning spatial patterns of land-use change is frequently limited by a scale mismatch between data resolution and the operational units utilized in land governance. Although traditional pixel-based models provide indispensable value in the identification of specific land-cover transitions, their efficacy is often reduced when aggregated to areal units where the synergistic effects of various facilities and socioeconomic drivers must be analyzed [55]. Within the intricate structural composition of Wuhan’s non-central districts, CNA is rarely initiated by an isolated factory or road segment; instead, it is determined by a foundational risk landscape established by dominant functional attributes. Frequent land-use transitions are shaped by the interaction of multiple, often competing forces, such as industrial restructuring, demographic mobility, and policy bargaining.

A primary challenge in multi-source geographic modeling pertains to the integration of datasets characterized by highly disparate spatial resolutions. A feasible methodology for fusing multi-source and multi-resolution data, ranging from 2.5-m rooftop data to 1-km NDVI products, is established via a meso-scale aggregation strategy centered on street-block units. These units function as information containers that encapsulate the statistical properties of multiple features, thereby eliminating resampling errors and artificial patterns that typically arise from pixel-level alignment. Scale-induced bias is further alleviated by the normalization of density-related features, such as building and population, relative to the respective block area. Results from the block size sensitivity analysis demonstrate that fluctuations in AUC and F1-score performance are maintained below 1.5% even when block sizes vary by ±20%, confirming that the GS-GAT framework is robust against the inherent resolution disparities of the input data. A transition from micro-facility counts to macro-functional dimensions is realized through the implementation of this meso-scale representation. Owing to the fragmentation of semantically coherent urban units and the loss of structural continuity caused by uniform grid decomposition, raster-based frameworks are inherently limited in representing cross-unit spatial interactions [56]. In contrast, the block-level graph representation method adopted in this study facilitates the explicit modeling of inter-block dependencies, thereby providing an effective mechanism for capturing spatial spillover effects.

5.2. Geographic Principles and Semantic Landscape Embedding

From a theoretical perspective, the superior performance of the GS-GAT model is rooted in its close coupling with the fundamental principles of geography. While traditional machine learning models are limited by treating spatial units as independent samples [25], the attribute-driven attention mechanism explicitly encodes Tobler’s First Law of Geography. At each layer, node embeddings are updated by computing the semantic similarity between a center block and its neighbors. This ensures that the model captures not only the static attributes of a parcel but also the dynamic pressures exerted by its geographical context.

Furthermore, the “region as document” metaphor for land cover semantic embedding represents a significant methodological innovation. Treating pixels as words and spatial sequences as sentences for Word2Vec embedding moves the representation beyond simple categorical encoding and captures the transition logic embedded in the spatial configuration of land features. For instance, a block containing a specific sequence such as cropland-water-construction might signal a higher latent tendency for further development than a purely agricultural cropland-cropland sequence. This high-dimensional semantic space preserves the contextual meaning of the landscape, making it possible to distinguish stable agricultural zones from unstable transitional zones characterized by fragmented land patterns and spectral heterogeneity.

5.3. Assessment of Model Robustness and Mitigation of Data Imbalance

The feature dimension ablation experiments show that overall performance remains relatively stable even with reduced feature inputs. Specifically, the decrease in Micro-F1 never exceeded 5%, indicating that the GS-GAT model has a high degree of feature robustness: it maintains strong performance in the presence of missing or degraded features by making effective use of the remaining information. This robustness implies better generalization and stability, since the model learns generalized patterns relevant to CNA susceptibility without over-relying on any specific feature subset. The ablation results also indirectly reflect the benefits of the attention mechanism embedded within GS-GAT. By assigning attention weights based on content similarity and node associations, the model dynamically shifts its focus toward the most informative features under varying input conditions, which keeps predictions stable. Additionally, the GraphSMOTE strategy effectively mitigated the bias inherent in the original, highly imbalanced dataset. By synthesizing minority-class nodes while preserving topological consistency, the model achieved a test F1 of 0.77. This confirms that addressing class imbalance is a theoretical necessity for modeling rare but high-impact geographic events.

5.4. Strategic Utility for Proactive Cropland Governance

A key strength of the proposed GS-GAT framework is its ability to facilitate proactive cropland protection by transforming land monitoring from post-event damage assessment to anticipatory risk management. Conventional regulatory and observation systems largely rely on post hoc detection of land use and land cover change, such as satellite-based identification of unauthorized cropland conversion, which is inherently limited by temporal latency and omission errors, particularly in remote, fragmented, or topographically complex landscapes. In contrast, the GS-GAT model achieved a reduction in missed detection rates of 18% to 25% relative to conventional machine learning baselines, including support vector machines and XGBoost, with particularly strong performance in heterogeneous and mixed-use land systems. By identifying high-risk parcels prior to irreversible land conversion, inspection efforts can be prioritized, regulatory resources can be allocated more efficiently, and early intervention can be implemented to mitigate long-term land degradation.

Proactive monitoring and risk prediction frameworks have been increasingly recognized as essential components of sustainable land system governance and food security assurance. Recent studies have demonstrated that rapid cropland non-agriculturalization monitoring based on multi-source remote sensing data can support large-scale protection initiatives by enabling timely detection of conversion trajectories that are often overlooked by conventional methods [17]. Furthermore, systematic near real-time crop type and land cover mapping from satellite observations has become a foundational element of adaptive agricultural management, enhancing the characterization of spatial and temporal dynamics across heterogeneous agroecosystems [57].

Beyond regulatory enforcement, proactive cropland protection plays a critical role in strengthening agricultural system resilience and long-term food security. Cropland loss through abandonment or conversion to non-agricultural land uses has been widely associated with increased food supply vulnerability and ecosystem degradation, which has led to growing calls for integrated monitoring and early warning systems within both scientific and policy communities [58]. Within this context, predictive models that integrate geospatial, temporal, and environmental drivers provide actionable decision support for land managers and policymakers, thereby enabling evidence-based land use planning and sustainable agricultural development.

Finally, the adoption of a relaxed correctness metric in the evaluation process is intended to reflect the operational tolerances of land management and regulatory practice. Although the classification of moderate-risk parcels as high risk may be considered an overestimation from a purely statistical perspective, such conservative labeling is consistent with precautionary land protection strategies and contributes to a reduced probability of cropland loss. Early identification of vulnerable areas enables the implementation of preemptive management measures that help preserve soil productivity, landscape stability, and associated ecosystem services.

5.5. Prospective Research Avenues

Building on the present framework, several promising directions can be pursued to further advance CNA susceptibility modeling. Future research may refine the representation of transitional categories by incorporating more discriminative semantic, temporal, and regulatory indicators, thereby enhancing the resolution of intermediate risk patterns. In addition, the integration of explicit policy and institutional constraints, such as redline boundaries and permanent basic farmland protection zones, would enable closer alignment between model outputs and real-world land governance processes. The analytical depth of the framework can also be strengthened through model interpretability techniques, including SHAP-based feature attribution and attention weight visualization, which would clarify how neighborhood interactions and functional contexts shape local conversion risks. Such tools would not only support scientific interpretation but also provide actionable insights for land managers. More broadly, continued development of macro-functional representations will facilitate a conceptual transition from micro-facility counting toward systematic functional environments, enabling a more comprehensive understanding of how urban–rural opportunity structures condition future cropland trajectories.

6. Conclusions

This study proposes GS-GAT, a novel method for predicting cropland non-agriculturalization (CNA) susceptibility that integrates multi-source remote sensing and socioeconomic data with a graph attention network architecture. By constructing a street-block-level heterogeneous graph structure, the model fuses four types of features (imagery, land cover, topography, and socioeconomics) while incorporating the GraphSMOTE strategy to address class imbalance. The framework enables early prediction of CNA susceptibility levels for the following year. Using four consecutive annual transitions between 2018 and 2022 (2018–2019 to 2021–2022), cross-year parallel training and validation were conducted to systematically evaluate model performance across fragmented and mixed-use farmland scenarios, thereby offering a new technical pathway for cropland protection and refined land-use supervision in urban fringe zones.

Experimental results from non-central districts in Wuhan demonstrate the feasibility and effectiveness of the proposed method. The GS-GAT model achieved an average AUC of 85.6% and an average F1 score of 82.6% across the four test years. Under the relaxed prediction criterion, the AUC and F1 scores increased to 93% and 91%, respectively. The ablation study demonstrated that removing the GraphSMOTE strategy significantly decreased the number of minority-class samples and led to a sharp drop in AUC from 80.53% to 60.4%, confirming the necessity of minority augmentation strategies in high-heterogeneity regions. Comparative experiments with traditional models such as SVM and XGBoost further validated the superiority of GS-GAT, particularly in identifying fragmented and mixed-use farmland. GS-GAT reduced the missed detection rate by 18%–25%, enabling early identification of high-risk plots, supporting targeted on-site inspection and policy design for cropland protection, and reducing monitoring and management costs.

Nevertheless, the current feature system does not yet incorporate soil physicochemical properties or detailed land policy variables, limiting the model’s ability to interpret complex driving mechanisms. Although the attention weights in the GAT reflect the importance of neighborhood features, they lack explicit mapping to policy constraints such as redlines and permanent basic farmland boundaries. Future work will incorporate additional observational and regulatory constraint data, improve interpretability through SHAP values and attention weight visualization, and perform long-term validation in more representative regions. These efforts aim to assess the model’s generalizability and stability across varying climatic and socioeconomic conditions, ultimately providing stronger support for cropland protection and sustainable development.

Author Contributions

Conceptualization, Shiqi Wan and Lina Huang; methodology, Shiqi Wan and Lina Huang; validation, Shiqi Wan, Lina Huang, and Zhangying Xia; formal analysis, Shiqi Wan; investigation, Shiqi Wan; resources, Lina Huang; data curation, Shiqi Wan, Lina Huang, and Zhangying Xia; writing—original draft preparation, Shiqi Wan and Lina Huang; writing—review and editing, Shiqi Wan and Zhangying Xia; visualization, Shiqi Wan; supervision, Lina Huang; funding acquisition, Lina Huang. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (NSFC), grant numbers 42394060 and 42394062, and by the International (Regional) Cooperation and Exchange (ICE) Projects of the NSFC, grant number W2421057.

Data Availability Statement

All the data used in this study appear in Section 2.2 of this article.

Acknowledgments

We are thankful to the anonymous reviewers and handling editors for their constructive comments in improving this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Scoring criteria of evaluation matrix for the comprehensive effects of cropland.

Conversion Type	Cropland Quality	Ecosystem Impact	Economic Benefit	Reclamation Difficulty
Cropland–Water	The formation of water bodies alters surface hydrological conditions, which can readily lead to soil salinization and nutrient loss, disrupt the structure of the arable layer, and significantly reduce farmland fertility and quality.	The formation of water bodies alters local habitat patterns but also provides new habitats for aquatic organisms, partially enhancing regional ecological diversity.	The conversion of land to water bodies can generate economic benefits, such as fisheries and recreational tourism, but it simultaneously reduces arable land and affects food supply, resulting in a slightly positive overall effect.	The formation of water bodies is largely irreversible, and reclamation requires drainage and amelioration of saline-alkaline soils, leading to high restoration costs and prolonged recovery periods.
Score	+2	−1	−1	+2
Cropland–Forest	Forest vegetation, with well-developed roots, improves soil structure and organic matter but has lower fertility and cultivation suitability than arable land, slightly reducing overall farmland quality.	The conversion of land to forest significantly enhances biodiversity, strengthens soil and water conservation and carbon sequestration, and positively influences regional ecosystem structure and function.	Forest development can generate long-term economic benefits, such as timber production and ecotourism, resulting in an overall positive socio-economic impact.	Forest reclamation requires tree removal and reconstruction of the arable layer, demanding substantial investment and extended time, and is partly irreversible.
Score	+1	−2	−1	+2
Cropland–Grassland	Grassland generally exhibits lower soil fertility and poorer texture, providing suboptimal conditions for cultivation and exerting a mild negative effect on farmland quality.	The conversion of land to grassland enhances landscape connectivity and improves ecosystem stability, demonstrating a positive ecological effect.	Grassland utilization can provide livestock-related benefits while supporting ecological conservation, thereby having a moderately positive socio-economic impact.	Grassland reclamation requires some land preparation and fertilization but is more feasible than forest or water body restoration, with moderate difficulty.
Score	+1	−2	−1	+1
Cropland–Construction	The development of construction land results in soil coverage or compaction, severe degradation of soil physicochemical properties, and irreversible damage to the arable layer, exerting the strongest negative impact on farmland quality.	The expansion of construction land disrupts original ecosystem structures, leading to habitat loss and a sharp decline in ecological functions, constituting a major driver of ecosystem degradation.	Construction land generates high-intensity economic outputs, such as from industry and real estate, making a significant contribution to the socio-economic system.	Permanent soil sealing renders the land nearly impossible to reclaim, with extremely high restoration costs and the strongest irreversibility.
Score	+3	+3	−3	+3
Cropland–Unused Land	Unused land is characterized by low soil fertility and poor texture, providing suboptimal conditions for cultivation and exerting a negative effect on farmland quality.	Compared to arable land, unused land exhibits reduced ecological functionality and limited ecosystem services.	The development potential of unused land is limited, and the loss of original farmland’s economic benefits results in an overall decline in socio-economic contribution.	Although some investment is required for soil improvement and land preparation, reclamation potential is relatively high and restoration difficulty low.
Score	+2	+1	+1	+1

Positive scores indicate negative (adverse) impacts on cropland quality, ecosystem function, or socio-economic balance; negative scores indicate positive (beneficial) effects such as ecological enhancement or sustainable economic benefits. Larger absolute values denote stronger influence intensity.

Table A2. Results of block size sensitivity analysis.

Adjustment	AUC (%)	F1 (%)	AUC (%), $|diff| \leq 1$	F1 (%), $|diff| \leq 1$
Baseline	82.23	76.95	91.01	87.83
Area +20%	82.20	76.37	91.14	87.24
Area −20%	81.85	75.65	90.75	86.78

Table A3. Hyperparameter tuning results on the validation set (TOP 5).

Epochs	Learning Rate	Heads	Hidden Dims	$λ$	Val F1	Val AUC
300	0.0005	[8, 6, 6]	[512, 384]	0.0001	0.7715	0.7855
200	0.0005	[8, 6, 6]	[512, 384]	0.001	0.7708	0.7848
500	0.001	[4, 4, 4]	[512, 384]	0.0001	0.7708	0.7855
500	0.0005	[8, 6, 6]	[512, 384]	0.001	0.7708	0.7855
400	0.0005	[8, 8, 6]	[256, 128]	0.01	0.7702	0.7855

The grid search range covered epochs (200–500), learning rates ($1 \times 10^{-3}$–$1 \times 10^{-4}$), attention heads ([4, 4, 4]–[8, 8, 6]), hidden dimensions ([256, 128]–[512, 384]), and the joint loss weight $\lambda$ ($1 \times 10^{-5}$–$1 \times 10^{-2}$). Only the top five hyperparameter combinations are listed. The best configuration achieved the highest validation micro-F1 (0.7715) and AUC (0.7855), corresponding to epochs = 300, learning rate = $5 \times 10^{-4}$, heads = [8, 6, 6], hidden dimensions = [512, 384], and $\lambda = 10^{-4}$.
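
The search described for Table A3 can be sketched in Python as an exhaustive enumeration followed by a ranking step. Only the end points of each range are stated, so the intermediate grid values below are assumptions taken from the rows of Table A3, and ranking by validation F1 with AUC as a tie-breaker is likewise an assumed selection rule. The tabulated scores stand in for a real training loop.

```python
from itertools import product

# Hypothetical grid: intermediate values are assumptions drawn from Table A3.
grid = {
    "epochs": [200, 300, 400, 500],
    "lr": [1e-3, 5e-4, 1e-4],
    "heads": [(4, 4, 4), (8, 6, 6), (8, 8, 6)],
    "hidden_dims": [(256, 128), (512, 384)],
    "lam": [1e-5, 1e-4, 1e-3, 1e-2],  # joint loss weight
}

# Every combination of the grid values, one dict per candidate configuration.
configs = [dict(zip(grid, combo)) for combo in product(*grid.values())]

# Top-five rows of Table A3, used here in place of actual training runs.
COLUMNS = ("epochs", "lr", "heads", "hidden_dims", "lam", "val_f1", "val_auc")
TABLE_A3 = [
    (300, 5e-4, (8, 6, 6), (512, 384), 1e-4, 0.7715, 0.7855),
    (200, 5e-4, (8, 6, 6), (512, 384), 1e-3, 0.7708, 0.7848),
    (500, 1e-3, (4, 4, 4), (512, 384), 1e-4, 0.7708, 0.7855),
    (500, 5e-4, (8, 6, 6), (512, 384), 1e-3, 0.7708, 0.7855),
    (400, 5e-4, (8, 8, 6), (256, 128), 1e-2, 0.7702, 0.7855),
]
results = [dict(zip(COLUMNS, row)) for row in TABLE_A3]

# Assumed rule: highest validation F1 first, validation AUC as tie-breaker.
best = max(results, key=lambda r: (r["val_f1"], r["val_auc"]))
```

Under these assumptions the selected row is the first entry of Table A3 (epochs = 300, learning rate = $5 \times 10^{-4}$, heads = [8, 6, 6]).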

Appendix B. Illustrative Example of CICN Calculation

To clarify the calculation logic of the comprehensive intensity of cropland non-agriculturalization ($CICN$), a numerical example is presented for a representative street-block unit $k$. Assume that unit $k$ has an initial cropland area of 100 ha in year $t - 1$. During the transition to year $t$, 10 ha of cropland is converted to construction land ($i = 1$), and 5 ha is converted to water ($i = 2$).

Appendix B.1. Step 1: Comprehensive Coefficients C_i

Using the evaluation matrix in Table 2, the coefficients are:

$$C_{\text{construction}} = (3 \times 0.4) + (3 \times 0.25) + (-3 \times 0.15) + (3 \times 0.2) = 2.1,$$

$$C_{\text{water}} = (2 \times 0.4) + (-1 \times 0.25) + (-1 \times 0.15) + (2 \times 0.2) = 0.8.$$
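
The same weighted sum can be applied to every conversion type in Table 2. The following Python sketch is illustrative only; the dictionary keys and the function name are assumptions, while the scores and weights are copied from Table 2.

```python
# Weights from Table 2, ordered as (cropland quality, ecosystem impact,
# economic benefit, reclamation difficulty).
WEIGHTS = (0.4, 0.25, 0.15, 0.2)

# Scores from Table 2 in the same column order; key names are illustrative.
SCORES = {
    "water": (2, -1, -1, 2),
    "forest": (1, -2, -1, 2),
    "grassland": (1, -2, -1, 1),
    "construction": (3, 3, -3, 3),
    "unused": (2, 1, 1, 1),
}


def comprehensive_coefficient(scores, weights=WEIGHTS):
    """Weighted sum of the four dimension scores for one conversion type."""
    return sum(s * w for s, w in zip(scores, weights))


coefficients = {name: comprehensive_coefficient(s) for name, s in SCORES.items()}
```

For construction land and water this reproduces the values 2.1 and 0.8 given above, up to floating-point rounding.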

Appendix B.2. Step 2: Conversion Rates CNA_i

The conversion rates for unit k are:

$$CNA_{1} = \frac{10}{100} = 0.1, \quad CNA_{2} = \frac{5}{100} = 0.05.$$

Appendix B.3. Step 3: Final Aggregation of CICN

Assuming normalized values $C_{1}^{*} = 1.0$ and $C_{2}^{*} = 0.38$ (based on min–max normalization of the sample pool), and $CNA_{i}^{*} = CNA_{i}$, the final $CICN$ value of unit $k$ is calculated using Formula (3):

$$CICN_{k} = [0.1 \times (1.0 + 1)] + [0.05 \times (0.38 + 1)] = 0.269.$$

This example demonstrates how both the scale of cropland conversion and the qualitative severity of different transition types are jointly reflected in the proposed $CICN$ index.
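
The worked example can be reproduced with a short Python sketch. The aggregation form, a sum of $CNA_{i}^{*} \times (C_{i}^{*} + 1)$ over conversion types, is inferred from the numerical example above rather than quoted from Formula (3). The normalized coefficients are the values assumed in the example, and all variable names are illustrative.

```python
def cicn(conversions):
    """Sum CNA_i* x (C_i* + 1) over (rate, coefficient) pairs, as in the worked example."""
    return sum(rate * (coef + 1.0) for rate, coef in conversions)


initial_cropland_ha = 100.0                              # cropland area in year t - 1
converted_ha = {"construction": 10.0, "water": 5.0}      # area converted by year t
normalized_coef = {"construction": 1.0, "water": 0.38}   # C_i* values assumed in the example

# Conversion rate per type: converted area divided by initial cropland area.
rates = {k: area / initial_cropland_ha for k, area in converted_ha.items()}

cicn_k = cicn((rates[k], normalized_coef[k]) for k in converted_ha)
```

Because each rate is multiplied by $(C_{i}^{*} + 1)$, a conversion type with a normalized coefficient of zero still contributes its conversion rate to the index under this reading.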

Figure 1. Overview of the study area. (a) Map of Hubei. (b) Wuhan study area.

Figure 2. Research workflow for CNA susceptibility prediction.

Figure 3. Visual interpretation features and descriptions for CNA susceptibility levels.

Figure 4. Cropland Non-Agriculturalization feature system.

Figure 5. Spatial partitioning of node units.

Figure 6. Extraction of land cover semantic embeddings.

Figure 7. Framework of the GS-GAT.

Figure 8. Mechanism of the GraphSMOTE.

Figure 9. CNA susceptibility prediction results based on the GS-GAT model. (a) CNA susceptibility from 2018 to 2019. (b) CNA susceptibility from 2019 to 2020. (c) CNA susceptibility from 2020 to 2021. (d) CNA susceptibility from 2021 to 2022.

Figure 10. Predicted results under relaxed correctness criterion. (a) Map of true/false prediction. (b) Confusion matrix ($|diff| \leq 1$).

Figure 11. Comparison of training performance under different data imbalance handling strategies: (a) Training F1-score curves. (b) Training loss curves.

Figure 12. Comparison of prediction results under different land use scenarios.

Figure 13. Confusion matrices of different models for susceptibility prediction.

Table 1. Information and origins of data.

Data Name	Temporal Resolution	Spatial Resolution	Data Source
China’s Land-Use/Cover Datasets	1 year	30 m	https://zenodo.org/records/12779975 (accessed on 8 November 2024)
NDVI data	1 month	1 km	https://www.earthdata.nasa.gov/ (accessed on 8 November 2024)
DEM	-	30 m	https://www.gscloud.cn (accessed on 8 November 2024)
Sentinel-2 L2A RS image	1 year	10 m	https://dataspace.copernicus.eu/explore-data/data-collections/sentinel-data/sentinel-2 (accessed on 10 November 2024)
Socio-economic data	1 year	-	https://tjj.wuhan.gov.cn/tjfw/tjnj/ (accessed on 10 November 2024)
NPP-VIIRS-like NTL Data	1 year	500 m	https://www.geodata.cn (accessed on 10 November 2024)
CBRA rooftop area data	-	2.5 m	https://zenodo.org/records/7861676 (accessed on 11 November 2024)
WorldPop population data	-	100 m	https://www.worldpop.org/ (accessed on 10 November 2024)
Road networks and waterways	-	-	https://www.openstreetmap.org/ (accessed on 12 November 2024)
Bus stop data	-	-	http://www.bus-info.cn/ (accessed on 12 November 2024)

Table 2. Evaluation Matrix for the Comprehensive Effects of Cropland.

Conversion Type	Cropland Quality	Ecosystem Impact	Economic Benefit	Reclamation Difficulty
Cropland–Water	2	−1	−1	2
Cropland–Forest	1	−2	−1	2
Cropland–Grassland	1	−2	−1	1
Cropland–Construction	3	3	−3	3
Cropland–Unused Land	2	1	1	1
Weight	0.4	0.25	0.15	0.2

Table 3. GS-GAT Performance.

Period	AUC (%)	F1 (%)	AUC (%), $|diff| \leq 1$	F1 (%), $|diff| \leq 1$
2018–2019	80.52	76.82	90.46	90.62
2019–2020	78.24	68.49	88.36	83.07
2020–2021	85.75	79.95	93.13	85.94
2021–2022	83.63	82.55	88.23	91.67

Table 4. Comparison between K-fold cross-validation and fixed train–test split results.

	AUC (%)	F1 (%)	AUC (%), $|diff| \leq 1$	F1 (%), $|diff| \leq 1$
Fold_1	79.65	72.22	90.81	84.92
Fold_2	80.91	75.79	89.31	86.76
Fold_3	84.28	78.57	91.27	89.78
Fold_4	78.67	67.09	85.35	88.24
Fold_5	78.31	74.50	84.97	90.30
KFold_avr	80.36	73.63	88.34	88.00
Origin_split	82.23	76.95	91.01	87.83
$|\Delta|$	1.87	3.32	2.67	0.17
KFold_std	2.16	3.86	2.68	1.98

$|\Delta|$ denotes the absolute difference between the cross-validation mean and the original fixed-split result.
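
The summary rows of Table 4 can be recomputed from the per-fold values, as in the Python sketch below for the AUC column. Using the population standard deviation is an assumption about how KFold_std was obtained; it is chosen because it reproduces the tabulated 2.16, whereas the sample form would give a larger value. Variable names are illustrative.

```python
from statistics import mean, pstdev

fold_auc = [79.65, 80.91, 84.28, 78.67, 78.31]  # Fold_1 to Fold_5, AUC (%) in Table 4
origin_auc = 82.23                               # Origin_split, AUC (%)

kfold_mean = mean(fold_auc)               # KFold_avr
kfold_std = pstdev(fold_auc)              # population standard deviation (assumed form)
delta = abs(origin_auc - kfold_mean)      # |Delta| row
```

Rounded to two decimals, these give 80.36, 2.16, and 1.87, matching the KFold_avr, KFold_std, and $|\Delta|$ entries for AUC.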

Table 5. Ablation Results after Removing Individual Feature Dimensions.

Removed Feature	AUC (%)	F1 (%)	AUC (%), $|diff| \leq 1$	F1 (%), $|diff| \leq 1$
Imagery	80.77	74.48	90.54	85.87
Land Cover	80.97	74.8	90.74	86.39
Topography	81.70	75.26	90.67	87.43
Socio-economic	81.92	76.36	90.97	87.43
Full	82.23	76.95	91.01	87.83

Table 6. Ablation Results after Removing Combinations of Feature Dimensions.

Removed Feature	AUC (%)	F1 (%)	AUC (%), $|diff| \leq 1$	F1 (%), $|diff| \leq 1$
Imagery & Land Cover	80.60	73.24	90.12	85.16
Imagery & Topography	80.99	74.22	90.45	86.33
Imagery & Socio-economic	80.83	74.61	90.64	85.35
Land Cover & Topography	80.84	74.87	90.7	86.72
Land Cover & Socio-economic	80.64	75.26	90.85	86.59
Topography & Socio-economic	81.83	75.52	90.9	87.43
Full	82.23	76.95	91.01	87.83

Table 7. Ablation Results with One Feature Dimension Retained.

Retained Feature	AUC (%)	F1 (%)	AUC (%), $|diff| \leq 1$	F1 (%), $|diff| \leq 1$
Imagery	80.7	75.19	90.81	86.65
Land Cover	80.92	73.89	90.77	86.07
Topography	79.41	73.05	90.07	84.05
Socio-economic	80.11	72.92	89.91	85.74
Full	82.23	76.95	91.01	87.83

Table 8. Distribution of Training Samples Before and After GraphSMOTE Augmentation.

	Year	None	Low	Mid	High
Before GraphSMOTE	2018–2019	607	148	115	26
	2019–2020	585	124	93	94
	2020–2021	542	112	112	130
	2021–2022	607	86	97	106
After GraphSMOTE	2018–2019	607	296	230	52
	2019–2020	585	248	186	188
	2020–2021	542	224	224	260
	2021–2022	607	172	194	212

Table 9. Ablation results of different imbalance handling strategies on the test set.

Method	AUC (%)	F1 (%)	AUC (%), $|diff| \leq 1$	F1 (%), $|diff| \leq 1$
GAT (base)	65.20	64.12	68.20	85.03
GAT + Under-sampling	66.67	62.38	70.15	84.18
GAT + Focal loss	69.07	69.40	65.64	86.39
GAT + GraphSMOTE	82.23	76.95	91.01	87.83

Table 10. Comparative Results: GS-GAT vs. SVM and XGBoost.

Method	AUC (%), 2018–2019	AUC (%), 2019–2020	AUC (%), 2020–2021	AUC (%), 2021–2022	F1 (%)	AUC (%), $|diff| \leq 1$	F1 (%), $|diff| \leq 1$
SVM	54.89	64.69	69.91	68.15	66.47	61.82	83.6
XGBoost	67.67	66.37	76.03	70.24	66.15	65.06	81.25
GS-GAT	80.53	78.51	85.61	83.3	76.95	91.01	87.83

© 2026 by the authors. Published by MDPI on behalf of the International Society for Photogrammetry and Remote Sensing. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
