Article

ECO-DEAU: An Ecologically Constrained Deep Learning Autoencoder for Sub-Pixel Land Cover Unmixing in Arid and Semi-Arid Regions

1 State Key Laboratory of Remote Sensing and Digital Earth, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
2 University of Chinese Academy of Sciences, Beijing 100049, China
3 International Institute for Earth System Science, Nanjing University, Nanjing 210023, China
* Author to whom correspondence should be addressed.
Remote Sens. 2026, 18(6), 941; https://doi.org/10.3390/rs18060941
Submission received: 31 January 2026 / Revised: 13 March 2026 / Accepted: 17 March 2026 / Published: 19 March 2026
(This article belongs to the Special Issue Remote Sensing for Landscape Dynamics)

Highlights

What are the main findings?
  • ECO-DEAU significantly outperforms traditional linear and unconstrained deep learning models, achieving a maximum overall R² of 0.749 in heterogeneous zones and effectively decoupling spectrally similar classes such as impervious surfaces and bare soil.
What are the implications of the main findings?
  • Embedding ecological priors into deep autoencoders effectively overcomes local optima limitations of traditional unmixing methods, ensuring both high accuracy and biophysical interpretability.

Abstract

Arid and semi-arid regions are critical to terrestrial ecosystems and regional carbon cycle regulation, directly contributing to peak carbon and carbon neutrality goals. However, the fragmented landscapes in these regions pose significant challenges to conventional pixel-based classification, which often struggles with mixed pixel issues and lacks biophysical interpretability. To address these limitations, this study develops an Ecologically Constrained Deep Learning Autoencoder (ECO-DEAU) framework for sub-pixel land cover mapping by integrating biophysical constraints. Specifically, ECO-DEAU employs spectral indices to extract standard spectral signatures for five primary land cover types, which serve as initial weights to guide the autoencoder in estimating fractional abundances. The model was trained across ten representative landscape zones in the Inner Mongolia section of the Yellow River Basin and validated against high-resolution Gaofen-2 data. Results demonstrated that ECO-DEAU yielded an average R² of 0.687, reaching a maximum R² of 0.749 in spatially heterogeneous transition zones, representing a substantial improvement over the baseline unconstrained Deep Autoencoder (DEAU). By effectively resolving the blind source separation problem and improving decomposition accuracy, ECO-DEAU serves as a robust tool for addressing mixed pixel challenges in heterogeneous environments, thereby facilitating large-scale, high-resolution carbon sink monitoring.

1. Introduction

Arid and Semi-arid Lands (ASALs), encompassing approximately 41% of the Earth’s land surface, constitute a pivotal component of the global terrestrial ecosystem [1]. Given their vast spatial extent, ASALs play a substantial role in the global carbon cycle. However, their contribution to regulating the interannual variability (IAV) and long-term trends of the global terrestrial carbon sink has frequently been underestimated [2,3]. In China, ASALs are particularly extensive, accounting for approximately 52.5% of the national territory [4]. These regions function as crucial ecological security barriers, yet they remain ecologically fragile and sensitive zones [5]. Over recent decades, multi-source satellite remote sensing and ground-based observations have revealed a pronounced “greening” trend across China’s ASALs [6]. This phenomenon is primarily attributed to the synergistic effects of a warming-wetting climate and large-scale ecological restoration initiatives, notably the Three-North Shelterbelt Program and the Grain for Green Program [7,8]. Consequently, the accurate quantification of these land cover changes and their specific contributions to regional carbon sink dynamics holds profound scientific and strategic significance. This process is indispensable for elucidating the response mechanisms of the dryland ecosystem carbon cycle and for scientifically assessing the carbon sequestration efficacy of major ecological engineering projects [9,10].
In ASALs, vegetation cover—particularly cropland—exhibits highly heterogeneous and patchy distributions driven by constrained hydrothermal conditions [11]. Here, diverse vegetation types (e.g., forests, shrubs, and grasses) are interspersed with bare soil and saline-alkali land, forming intricate landscape mosaics [12]. This spatial fragmentation results in the prevalence of the “mixed pixel” phenomenon in regional-scale monitoring, which relies heavily on medium-resolution imagery such as Sentinel-2 and Landsat. Consequently, the spectral signal recorded within a single pixel often represents a composite of multiple fragmented land cover types [13,14]. Such mixed signals are difficult for traditional “hard” classification algorithms to resolve because they are based on the assumption of spectral homogeneity within a pixel. This limitation severely compromises the accuracy of land cover quantification and subsequent regional carbon cycle assessments [15,16]. Moreover, this challenge is exacerbated by frequent land use transitions, such as farmland abandonment and recultivation [17], which result in high spectral similarity between fallow fields and surrounding natural vegetation.
To address these spectral complexities, Spectral Unmixing (SU) has emerged as a critical methodological solution. By decomposing mixed spectra at the sub-pixel scale, SU quantitatively estimates the fractional abundances of individual endmembers within a pixel. This capability is essential for accurately characterizing the fragmented spatial distribution of vegetation in ASALs [18,19]. Over time, SU methodologies have evolved through various modeling frameworks to handle diverse unmixing scenarios. Among these, the Linear Mixing Model (LMM) has long served as the mainstream approach for vegetation monitoring in ASALs, favored for its clear physical interpretation and computational efficiency. However, LMMs typically rely on the assumption that pure “endmembers” exist within the image—a premise central to algorithms such as the Vertex Component Analysis (VCA) [20] and N-FINDR algorithm [21]. This assumption is frequently violated in the highly fragmented and heterogeneous landscapes of ASALs, thereby severely limiting unmixing accuracy.
In contrast, while Nonlinear Mixing Models (NLMM) effectively characterize complex photon scattering interactions between land covers, they are often impeded by high computational complexity and sensitivity to noise. These limitations restrict their application in large-scale remote sensing monitoring [22,23]. Recently, the advent of artificial intelligence has injected new vitality into the field of spectral unmixing [24]. Recent state-of-the-art (SOTA) unmixing frameworks published in the literature have incorporated multilinear augmented networks to handle nonlinear scattering [25], utilized spatial-spectral feature fusion for refined abundance estimation [26], and introduced variational autoencoders (VAE) with Dirichlet distribution priors to model mixed pixels [27]. Among existing mainstream deep learning unmixing frameworks, unsupervised learning networks based on Autoencoders (AE)—such as mDAU [28] and CNNAEU [29]—have demonstrated superior performance compared to traditional algorithms in resolving mixed pixels over complex surfaces. This success is attributed to their robust capabilities in nonlinear feature extraction and data mapping. Nevertheless, the majority of existing deep unmixing models function primarily as “blind” source separation tools. During training, these models prioritize the minimization of mathematical reconstruction errors, often neglecting critical prior constraints related to the biophysical properties of spectral signatures and ecological distribution patterns [30,31]. This lack of physical interpretability often results in extracted endmembers that do not correspond to actual biophysical features [32]. Furthermore, it renders the models prone to converging to local optima, thereby limiting their practical utility in precision ecological monitoring [28,33].
Although deep learning unmixing models have demonstrated formidable capabilities, they remain driven purely by mathematical optimization. In highly heterogeneous regions such as arid and semi-arid zones, where land cover is fragmented and mixed, purely mathematical unmixing often fails due to severe spectral confusion (e.g., between bare land and impervious surfaces). Therefore, there is an urgent need to embed ecological priors into deep neural networks to ensure the biophysical interpretability of unmixing results while improving accuracy. To address the limitations of existing approaches, this study proposes the Ecologically Constrained Deep Autoencoder Unmixing (ECO-DEAU) framework. This model is designed to overcome the prevalent mixed pixel issue and achieve precise vegetation abundance extraction in the complex landscapes of arid and semi-arid regions. Unlike traditional “blind” unmixing models [34], ECO-DEAU ensures the biophysical plausibility and interpretability of the unmixing results by explicitly integrating ecological principles into the deep learning architecture. In particular, the framework consists of three crucial steps: (i) extracting candidate endmembers for five main land cover types to initialize model weights based on physical spectral signatures; (ii) building ecological constraint functions by combining spatial land cover distribution characteristics with a priori knowledge; and (iii) integrating these constraints into the autoencoder to estimate fractional abundances via spectral unmixing. To implement the method within the Inner Mongolia section of the Yellow River Basin, ten representative sampling zones were selected for model training. The model was then thoroughly validated in two areas with different landscape features, using high-resolution Gaofen-2 imagery as ground truth to evaluate how well it resolved complex spectral mixing.

2. Materials and Methods

2.1. Study Area

This study focuses on the Inner Mongolia section of the Yellow River Basin (Figure 1), strategically situated within the agro-pastoral ecotone of northern China [35]. Encompassing seven leagues and municipalities within the Inner Mongolia Autonomous Region, the study area spans approximately 152,000 km2. Recognized as a typical ecologically fragile zone and a critical region for environmental protection in northern China [36], it plays a pivotal role in ensuring food security and regulating the regional climate.
The region is characterized by complex geomorphological features and significant spatial heterogeneity. Topographically, the Yinshan Mountains traverse the north, the Hetao Plain and two major deserts (Ulan Buh and Kubuqi) occupy the central sector, and the Ordos Plateau dominates the south [37]. Climatically, the region lies within the temperate continental monsoon zone, registering an annual average temperature of approximately 5.6–7.8 °C [37]. Precipitation is scarce, ranging from 150 mm to 450 mm annually and concentrated primarily during the summer months (June to August), whereas potential evapotranspiration exceeds 2000 mm [38]. Constrained by these hydrothermal conditions, the overall vegetation coverage is sparse and fragmented. Dominant land cover types include forest, shrubland, grassland, cropland, bare land, and sandy land [39]. This intricate mosaic of vegetation, interspersed with bare soil and sand, results in a prevalence of mixed pixels in medium-resolution remote sensing imagery, posing significant challenges for the precise estimation of surface vegetation abundance.
To facilitate model initialization and training, ten representative zones (10 km × 10 km) were selected across the basin (Table 1). These zones fall into the following categories: (1) Ecologically Constrained Mixed Zones (ECMZ), which are five heterogeneous transition zones chosen to represent complex mixed scenarios for model optimization; and (2) Pure Endmember Zones (PEZ), which are five homogeneous areas that represent primary land covers and are used to extract high-quality spectral endmembers.

2.2. Datasets

This study incorporates three primary categories of data: (1) multi-temporal optical remote sensing imagery (Sentinel-2); (2) topographic elevation data (SRTM); and (3) reference data for land cover abundance validation. Detailed descriptions regarding the acquisition sources, technical specifications, and preprocessing protocols for each dataset are elaborated in the subsequent subsections.

2.2.1. Multi-Temporal Remote Sensing Data

Sentinel-2 imagery was selected as the primary spectral data source, distinguished by its high spatial resolution (10 m), frequent revisit capability, and comprehensive spectral configuration (12 bands). The availability of red-edge (B5–B7) and shortwave infrared (B11, B12) bands is instrumental for differentiating spectrally similar vegetation endmembers [40,41]. Level-1C Top-of-Atmosphere (TOA) products, covering the study area from April to October 2023, were retrieved from the Copernicus Data Space Ecosystem. This temporal window encompasses the complete vegetation phenological cycle characteristic of the region [42]. A preliminary cloud cover threshold of <30% was applied to maintain data usability.
To derive Surface Reflectance (SR) products suitable for spectral unmixing, a rigorous preprocessing chain was implemented. Initially, the Fmask algorithm [43,44] was employed to generate pixel-level Quality Assurance (QA) masks identifying clouds and shadows. Subsequently, atmospheric correction was executed using the Sen2Cor processor [45] to convert L1C TOA data into Level-2A (L2A) Surface Reflectance [44]. Accurate parameter extraction under complex atmospheric conditions is a fundamental prerequisite for retrieving true spectral signatures in optical remote sensing [46]. This step is a prerequisite for restoring the physical spectral signatures required by unmixing models. To capture distinct phenological features while mitigating noise, the study period was divided into three seasonal windows: Spring, Summer, and Autumn. A hierarchical compositing strategy was adopted [47,48], where cloud-masked L2A data were first aggregated into 10-day mean composites to suppress noise, and subsequently synthesized into representative composites for each season. Finally, all spectral bands were resampled to a unified 10 m spatial resolution and stacked, yielding a seamless, cloud-free, 36-band (12 bands × 3 seasons) surface reflectance dataset across the entire study area.
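The hierarchical compositing strategy described above can be sketched as follows. This is an illustrative NumPy reduction, not the authors' production pipeline; it assumes cloud/shadow-masked pixels are encoded as NaN and that each scene carries a day-of-year stamp:

```python
import numpy as np

def composite_season(scenes, days, period=10):
    """Hierarchically composite cloud-masked scenes into one seasonal image.

    scenes : (T, H, W, B) array with np.nan where the QA mask flagged
             clouds/shadows; days : (T,) acquisition day-of-year.
    Scenes are first averaged into `period`-day (decadal) mean composites
    to suppress noise, then those composites are averaged into a single
    seasonal image, ignoring NaN gaps at both levels.
    """
    scenes = np.asarray(scenes, dtype=float)
    days = np.asarray(days)
    bins = days // period  # assign each scene to a 10-day window
    decadal = [np.nanmean(scenes[bins == b], axis=0) for b in np.unique(bins)]
    return np.nanmean(np.stack(decadal), axis=0)  # seasonal mean composite
```

Repeating this for the three seasonal windows and stacking the 12 bands per season yields the 36-band input described above.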

2.2.2. Topographic Elevation Data

Digital Elevation Model (DEM) data at a spatial resolution of 30 m were acquired from the Shuttle Radar Topography Mission (SRTM) [49]. These data were accessed via the United States Geological Survey (USGS), clipped to the study area extent, and subsequently resampled to 10 m to ensure spatial consistency with the Sentinel-2 imagery. Topography plays a decisive role in defining vegetation ecological niches by governing local hydrothermal redistribution and soil development processes. To incorporate these topographic factors, slope data were calculated from the DEM. Both elevation and slope serve as critical variables for the terrain constraint module within the proposed ecological framework, functioning to penalize biophysically implausible land cover distributions.

2.2.3. Land Cover Abundance Validation Data

To rigorously validate the spectral unmixing results, a high-precision reference dataset was constructed using high-resolution Gaofen-2 (GF-2) imagery. With its sub-meter spatial resolution, GF-2 imagery captures fine-scale ground details, serving as a reliable benchmark for identifying complex land cover features [50]. A hybrid classification workflow, combining supervised classification with rigorous visual interpretation and manual correction, was implemented to generate sub-meter land cover maps. To mitigate the impact of geometric registration errors, a spatial aggregation strategy was adopted. The fine-scale classification maps were upscaled to a 100 m × 100 m grid, within which the fractional abundance of each land cover type was calculated. The resulting dataset represents the “true” abundance distribution, serving as the ground truth for subsequent model accuracy assessment [51].
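The spatial aggregation step can be sketched as follows (illustrative NumPy; the function and parameter names are hypothetical, assuming an integer-labeled fine-scale classification map):

```python
import numpy as np

def fractional_abundance(class_map, block, n_classes):
    """Upscale a fine-scale class map to per-cell fractional abundances.

    class_map : (H, W) integer labels from the fine-scale classification;
    block : number of fine pixels per aggregation-cell side (e.g. 100
    one-metre pixels per 100 m cell). Returns an array of shape
    (H//block, W//block, n_classes) whose fractions sum to 1 per cell.
    """
    H, W = class_map.shape
    h, w = H // block, W // block
    # Reshape into (cell_row, within_row, cell_col, within_col) tiles
    tiles = class_map[:h * block, :w * block].reshape(h, block, w, block)
    fractions = np.stack(
        [(tiles == k).mean(axis=(1, 3)) for k in range(n_classes)], axis=-1)
    return fractions
```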

2.3. Ecologically Constrained Deep Autoencoder (ECO-DEAU) Method

This study proposes the Ecologically Constrained Deep Autoencoder (ECO-DEAU) framework, specifically designed to address the challenge of mixed pixel decomposition within the complex landscapes of arid and semi-arid regions (Figure 2). To systematically implement and evaluate the ECO-DEAU model, the methodology is structured into four primary components: (1) Endmember Extraction and Weight Initialization: Deriving endmember spectra by combining high-resolution imagery with geometric vertex analysis to initialize decoder weights, thereby enhancing model performance; (2) Framework Construction: Establishing the ECO-DEAU network architecture built upon a deep autoencoder backbone; (3) Ecological Loss Function Design: Formulating a composite loss function that integrates multi-source prior knowledge, encompassing vegetation indices and topographic factors; and (4) Abundance Estimation and Validation: Estimating sub-pixel abundance distributions from Sentinel-2 data and conducting a rigorous assessment of accuracy and robustness against reference data and baseline models.

2.3.1. Endmember Extraction and Weight Initialization

In autoencoder-based unmixing networks, the decoder weight matrix physically represents the spectral signatures of the endmembers. However, traditional random initialization strategies often result in suboptimal convergence and entrapment in local optima, yielding extracted endmembers that lack biophysical interpretability. To mitigate this issue, this study proposes a physically driven weight initialization strategy, integrating high-resolution sampling with the VCA algorithm to embed a priori endmember knowledge into the model.
Initially, to ensure endmember purity and representativeness, a rigorous sampling strategy employing multi-source verification was adopted. Sub-meter Google Earth imagery served as a geometric reference, while Sentinel-2 spectral features provided spectral validation for fine-scale sampling within the study area. For the five primary land cover types (forest, grassland, cropland, bareland, and impervious surfaces), 50 spatially distributed, homogeneous pure pixels were selected per class to constitute the candidate endmember set, denoted as X_cand. Subsequently, the VCA algorithm was employed to extract representative multi-temporal endmember spectra from X_cand. The extracted spectral profiles are illustrated in Figure 3 and serve as the initialization weights for the ECO-DEAU decoder.

2.3.2. Architecture of Ecologically Constrained Deep Autoencoder (ECO-DEAU)

The proposed ECO-DEAU model employs an asymmetric deep autoencoder architecture. This design leverages a deep encoder to perform non-linear mapping from the high-dimensional spectral feature space to the low-dimensional endmember abundance space, while utilizing a linear decoder to reconstruct spectra based on the physical mechanism of the Linear Mixture Model (LMM). The network (Figure 4) comprises three core modules: the Deep Feature Encoder, the Abundance Physical Constraint Layer, and the Physics-Driven Decoder.
(1)
Deep Feature Encoder: The encoder is designed to compress and map the input spectral vector x ∈ ℝ^L into a latent feature vector, where L denotes the number of spectral bands (L = 36, corresponding to the stacked multi-seasonal dataset). To capture the inherent non-linearity and complexity of spectral features, a multi-layer Fully Connected Network (FCN) is employed. This process captures complex non-linear spatial-spectral features, an ability that has proven essential in recent deep learning applications for hyperspectral target perception and robust video tracking [52,53]. Hidden layers utilize the Rectified Linear Unit (ReLU) activation function to enhance the network’s capacity for non-linear feature representation. Batch Normalization (BN) layers are interleaved between dense layers to accelerate model convergence and mitigate the vanishing gradient problem. Ultimately, the encoder outputs a feature vector z of dimension K (the number of endmembers, K = 5), yielding a preliminary estimation of abundances.
(2)
Abundance Physical Constraint Layer: This layer enforces strict constraints on the encoder’s output z to guarantee the physical interpretability of the unmixing results. Spectral Unmixing mandates that the estimated abundance vector a = [a_1, a_2, ..., a_K]ᵀ adhere to two physical conditions: the Abundance Non-negative Constraint (ANC) and the Abundance Sum-to-One Constraint (ASC), defined as:
a_k ≥ 0, ∀k ∈ {1, ..., K}   (ANC)
Σ_{k=1}^{K} a_k = 1   (ASC)
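A minimal sketch of such a constraint layer is shown below. The paper does not state which activation enforces ANC and ASC; a softmax over the K outputs, a common choice in autoencoder unmixing networks, is assumed here for illustration:

```python
import numpy as np

def abundance_layer(z):
    """Map raw encoder outputs z (..., K) to abundances that satisfy
    ANC (a_k >= 0) and ASC (sum_k a_k = 1) via a numerically stable
    softmax. Illustrative only: the paper does not specify its exact
    activation, and softmax is one common way to enforce both constraints.
    """
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max(axis=-1, keepdims=True))  # subtract max for stability
    return e / e.sum(axis=-1, keepdims=True)
```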
(3)
Physics-Driven Decoder: The decoder maps the low-dimensional abundance vector a back to the high-dimensional spectral space to reconstruct the original pixel x̂. This decoder is constructed in strict accordance with the LMM assumption. It is implemented as a bias-free single linear layer, mathematically equivalent to a matrix multiplication:
x̂ = E a
where E ∈ ℝ^{L×K} is the spectral endmember matrix. Crucially, the weights of this decoder matrix E are initialized using the multi-temporal spectral profiles of the five land cover types extracted in Section 2.3.1.
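The decoder and its physical initialization reduce to a few lines; the sketch below (illustrative NumPy, with hypothetical function names) makes the bias-free linear mapping explicit:

```python
import numpy as np

def init_decoder(endmembers):
    """Return the decoder weight matrix E (L x K), initialized from the
    extracted multi-temporal endmember spectra, one column per class."""
    return np.asarray(endmembers, dtype=float)

def decode(E, a):
    """Physics-driven decoder under the LMM assumption: reconstruct the
    pixel spectrum as the bias-free linear mixture x_hat = E @ a."""
    return E @ a
```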

2.3.3. Objective Functions

To embed ecological interpretability into the deep learning-based unmixing process, this study establishes an Ecologically Constrained Loss Framework. The total objective function is formulated by integrating the spectral reconstruction loss (L_recon) with the ecological constraint loss (L_eco):
L_total = L_recon + λ_eco(t) · L_eco
where λ_eco(t) denotes a dynamic balancing coefficient dependent on the training epoch t. To mitigate training instability potentially arising from premature constraints, a “warm-up” mechanism is introduced, whereby λ_eco(t) increases linearly from 0 to a preset threshold during the initial epochs. This strategy allows the model to prioritize the extraction of intrinsic spectral features before gradually incorporating ecological regularizations.
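The warm-up schedule can be written as a one-line function (a sketch; the number of warm-up epochs and the threshold value are hyperparameters the paper does not report):

```python
def warmup_lambda(epoch, warmup_epochs, lam_max):
    """Linear warm-up for the ecological constraint weight: lambda rises
    from 0 to lam_max over the first `warmup_epochs` epochs, then stays
    constant, letting the model learn intrinsic spectral features before
    the ecological regularizers take full effect."""
    return lam_max * min(epoch / warmup_epochs, 1.0)
```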
Meanwhile, traditional deep learning models typically rely on fixed hyperparameters. However, a common bottleneck in physics-guided deep learning is the extreme sensitivity of hyperparameters when balancing the main reconstruction loss with multiple ecological priors. To eliminate the need for cumbersome manual weight adjustments and ensure training stability, the ECO-DEAU model introduces an adaptive weight normalization strategy.
During the training process, the model dynamically tracks the magnitude of each ecological loss using an Exponential Moving Average (EMA). After a predefined warm-up period, an adaptive scaling factor is computed at regular intervals for each constraint k (k ∈ {topo, coex, pheno}):
λ_k = L_SAD / (EMA(L_k) + ε)
where ε is a small constant to prevent division by zero.
Ultimately, the final total objective function (L_total) is formulated as the dynamically weighted sum of all components:
L_total = L_recon + λ_eco(t) · L_eco = L_SAD + Σ_{k ∈ {topo, coex, pheno}} λ_k · L_k
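The EMA-based rescaling can be sketched as a small helper class. This is illustrative only: the decay rate and ε are assumed values, as the paper does not report its exact hyperparameters:

```python
class AdaptiveScaler:
    """Track the magnitude of each ecological loss with an exponential
    moving average and rescale it against the SAD reconstruction loss:
    lambda_k = L_SAD / (EMA(L_k) + eps). The decay and eps values are
    assumptions for illustration."""

    def __init__(self, keys, decay=0.99, eps=1e-8):
        self.decay, self.eps = decay, eps
        self.ema = {k: None for k in keys}

    def update(self, losses):
        # EMA update for each constraint's current loss magnitude
        for k, v in losses.items():
            self.ema[k] = v if self.ema[k] is None else (
                self.decay * self.ema[k] + (1 - self.decay) * v)

    def scale(self, l_sad):
        # Adaptive weight that keeps each term comparable to the SAD loss
        return {k: l_sad / (m + self.eps) for k, m in self.ema.items()}
```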
The spectral reconstruction loss (L_recon) is quantified using the Spectral Angle Distance (SAD), which measures the angle between the original (x) and reconstructed (x̂) spectra:
L_recon = arccos( xᵀx̂ / (‖x‖₂ ‖x̂‖₂ + ε) )
where ε is a small constant to prevent division by zero. SAD has been extensively employed in Autoencoder (AE)-based unmixing frameworks, owing to its inherent scale invariance. This property confers strong robustness against variations in local illumination, an advantage critical not only for unmixing but also for fine-grained spectral discrimination tasks such as hyperspectral anomaly detection [54].
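A direct NumPy rendering of the SAD loss illustrates its scale invariance: a uniformly brighter copy of the same spectrum scores near zero, while an orthogonal spectrum scores π/2:

```python
import numpy as np

def sad_loss(x, x_hat, eps=1e-8):
    """Spectral Angle Distance between original and reconstructed spectra:
    arccos( <x, x_hat> / (||x|| * ||x_hat|| + eps) ). Scale-invariant, so
    multiplying a spectrum by a constant factor leaves the angle ~0."""
    cos = np.dot(x, x_hat) / (np.linalg.norm(x) * np.linalg.norm(x_hat) + eps)
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))  # clip guards rounding
```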
The Ecological Constraint Loss (L_eco) comprises three domain-specific penalty terms, designed to rectify mathematically optimal but ecologically implausible results generated by deep learning models:
L_eco = λ₁ L_topo + λ₂ L_coex + λ₃ L_pheno
Topographic Constraint Loss (L_topo): In arid and semi-arid landscapes, topography dictates hydrological distribution and land-use patterns. For instance, intensive croplands exhibit strict topographic upper bounds and rarely occur on steep slopes. To mathematically encode this geographic reality into the deep learning optimization process, we formulated the Terrain Suitability Constraint (L_topo). Leveraging Digital Elevation Model (DEM) data, the prior probability P_topo^(k) for the k-th endmember is modeled via Gaussian and Sigmoid functions based on its preferred elevation (E) and slope (S). This loss penalizes topographic violations by minimizing the Kullback–Leibler (KL) divergence between the topographic prior and the estimated abundance distribution:
L_topo = Σ_{k=1}^{K} P_topo^(k)(E, S) · log( P_topo^(k)(E, S) / (A_pred^(k) + ε) )
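The KL-divergence penalty against an ecological prior can be sketched generically; the same form serves both the topographic and phenological terms. This is an illustrative NumPy version assuming both inputs are probability-like vectors over the K classes:

```python
import numpy as np

def kl_prior_loss(prior, abundance, eps=1e-8):
    """KL divergence between a prior class-probability vector (e.g. the
    topographic suitability P_topo derived from elevation/slope) and the
    predicted abundance vector. Large values flag abundances that
    contradict the ecological prior; identical distributions score ~0."""
    prior = np.asarray(prior, dtype=float)
    abundance = np.asarray(abundance, dtype=float)
    return float(np.sum(prior * np.log(prior / (abundance + eps) + eps)))
```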
Phenological Constraint Loss (L_pheno): The Normalized Difference Vegetation Index (NDVI) is a key biophysical indicator for distinguishing land cover types. This constraint uses the NDVI variation curves of pure land cover areas as prior input: a Gaussian prior P_pheno^(k) is established from the statistical NDVI distribution (μ, σ) derived from pure pixels of each endmember. The term enforces phenological fidelity by constraining the estimated abundances to align with these expectations:
L_pheno = Σ_{k=1}^{K} P_pheno^(k)(NDVI) · log( P_pheno^(k)(NDVI) / (A_pred^(k) + ε) )
Coexistence Constraint Loss (L_coex): Grounded in ecological niche theory, certain land covers (e.g., dense forest and impervious surfaces) are ecologically incompatible and thus mutually exclusive. To suppress spurious mixing, a penalty term is applied to the co-occurrence of mutually exclusive pairs (i, j), where τ is a tolerance threshold below which co-occurrence is not penalized:
L_coex = Σ_{(i,j)} max( 0, A_pred^(i) · A_pred^(j) − τ )
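The hinge-style coexistence penalty translates directly to code. A minimal sketch (the value of τ is an assumption; the paper does not report it):

```python
import numpy as np

def coexistence_loss(abundance, pairs, tau=0.05):
    """Hinge penalty on co-occurrence of ecologically incompatible classes:
    sum over exclusive pairs (i, j) of max(0, a_i * a_j - tau). The product
    a_i * a_j is large only when both classes are abundant in the same
    pixel; tau = 0.05 is an assumed tolerance."""
    a = np.asarray(abundance, dtype=float)
    return float(sum(max(0.0, a[i] * a[j] - tau) for i, j in pairs))
```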

2.4. Estimating Abundance from Sentinel-2 Using ECO-DEAU

Following the formulation of the deep autoencoder architecture and the ecologically constrained loss framework, the ECO-DEAU model was applied to Sentinel-2 imagery to generate sub-pixel abundance maps. To focus the unmixing process on complex terrestrial features, large water bodies were initially masked out using the Otsu (Maximum Between-Class Variance) thresholding algorithm applied to the Near-Infrared (NIR) band. Excluding these spectrally homogeneous pixels prevents them from dominating the loss function during network optimization.
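The water-masking step rests on Otsu's maximum between-class variance criterion; a minimal NumPy implementation applied to the NIR band is sketched below (illustrative; a library implementation would normally be used in practice):

```python
import numpy as np

def otsu_threshold(values, nbins=256):
    """Otsu's maximum between-class variance threshold. Applied to the NIR
    band, where water is strongly absorbing, pixels below the returned
    threshold can be masked out as water before unmixing."""
    hist, edges = np.histogram(values, bins=nbins)
    p = hist.astype(float) / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(p)             # class-0 (below threshold) probability
    mu = np.cumsum(p * centers)   # cumulative mean
    mu_t = mu[-1]                 # global mean
    w1 = 1.0 - w0
    valid = (w0 > 0) & (w1 > 0)
    var_between = np.zeros_like(w0)
    var_between[valid] = (mu_t * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    return centers[np.argmax(var_between)]
```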
Subsequently, representative standard spectral signatures for five target endmembers—Grassland, Forest, Cropland, Bare Land, and Impervious Surfaces—were extracted to serve as initial physical constraints. These standard spectral signatures were explicitly assigned to initialize the ECO-DEAU decoder weight matrix. This endmember initialization strategy significantly enhances model efficacy and facilitates convergence. Simultaneously, ecological priors were instantiated based on the biophysical realities of the study area to penalize conflicting endmember co-occurrences.
Finally, the preprocessed 36-band Sentinel-2 spectral vectors were fed into the deep feature encoder for sub-pixel abundance inversion. Guided by the dynamic “warm-up” training mechanism, the network was optimized using the Adam optimizer with a batch size of 256 and an initial learning rate of 1 × 10⁻³. Upon convergence, the output of the physical constraint layer yielded optimal abundance vectors that strictly satisfied the non-negativity and sum-to-one conditions, resulting in a final, ecologically consistent sub-pixel land cover map.

2.5. Comparison Methods

To comprehensively evaluate the superiority of the proposed ECO-DEAU model, we established a rigorous comparative framework including the classic linear model (FCLS), a baseline Deep Autoencoder (DEAU), and three recent state-of-the-art (SOTA) deep learning spectral unmixing architectures.
Fully Constrained Least Squares (FCLS) is a classic algorithm based on the Linear Spectral Mixture Model (LSMM). It estimates the abundance fractions of endmembers by minimizing the least squares error between the observed pixel spectra and the reconstructed spectra. FCLS, as a classic and widely adopted algorithm in linear spectral unmixing, is employed to benchmark the fundamental performance improvements delivered by non-linear deep learning architectures.
The baseline Deep Autoencoder for Unmixing (Baseline AE) serves as the fundamental deep learning baseline in this study. It adopts a standard encoder–decoder network architecturally identical to ECO-DEAU. However, unlike ECO-DEAU, the Baseline AE is driven purely by the spectral reconstruction loss, without incorporating any ecological prior constraints. It therefore acts as an ablation control within our comparative framework: comparing ECO-DEAU with the Baseline AE directly isolates the performance gains attributable to the ecological constraints.
To comprehensively represent the latest advancements in autoencoder spectral unmixing, this study selected three state-of-the-art (SOTA) deep learning architectures published in recent years as advanced baseline models. Specifically, the Attention-based Autoencoder (Attention-AE) incorporates a channel attention mechanism within the encoder [55,56]. By dynamically recalibrating the importance of different spectral bands, it effectively suppresses noise while focusing on spectral features. The Sparse-regularized Autoencoder (Sparse-AE) incorporates Shannon entropy into its loss function to ensure sparsity in abundance results, mathematically reflecting the physical reality that individual pixels typically consist of only a few dominant endmembers [57]. Additionally, the 1D Convolutional Autoencoder (1DCNN-AE) replaces traditional fully connected layers with 1D convolutions, processing spectral features as continuous sequences [58]. This approach captures local morphological variations such as absorption valleys and reflection peaks, which are crucial for distinguishing materials with high spectral similarity.

3. Results

3.1. Comparison with Baseline Methods

To rigorously validate the overall performance of the proposed ECO-DEAU model—particularly the efficacy of its physical initialization and ecological constraint mechanisms—a comprehensive comparative analysis involving six models was conducted. The benchmarks encompassed the baseline models detailed in Section 2.5: the classic Fully Constrained Least Squares (FCLS), a standard unconstrained deep learning autoencoder (Baseline AE), and three recent state-of-the-art (SOTA) data-driven architectures (Attention-AE, Sparse-AE, and 1DCNN-AE). To ensure absolute fairness beyond architectural differences, all models were evaluated under identical experimental conditions, utilizing the same hardware platform, study regions, and optimal hyperparameters. Finally, the estimated abundance maps from each model are presented and quantitatively validated against ground-truth abundances derived from high-resolution Gaofen-2 satellite imagery.

3.1.1. Endmember Spectral Visualization

Figure 5 presents a comparative visualization of the endmember spectral signatures extracted for the five target land cover types. As observed, endmembers extracted by the Baseline AE (solid blue lines) exhibit significant non-physical oscillations and noise artifacts, failing to capture key biophysical features while optimizing solely for mathematical spectral reconstruction. For instance, the baseline vegetation curves omit the characteristic “red edge” slope (approximately 700 nm) and display erratic fluctuations in the near-infrared (NIR) region. Conversely, the ECO-DEAU-derived endmembers (solid red lines) demonstrate superior spectral smoothness and fidelity, aligning closely with the initial physical endmember signatures (dashed grey lines). This comparison suggests that relying solely on mathematical reconstruction error predisposes the model to local minima and spurious mixing, whereas incorporating physical initialization effectively suppresses artifacts, ensuring the learning of physically meaningful spectral representations.

3.1.2. Quantitative Comparison of Unmixing Accuracy Between the Proposed ECO-DEAU Model and the Baseline Models

Subsequent to the qualitative spectral evaluation, a rigorous quantitative validation was conducted to assess the unmixing accuracy of the proposed framework. High-resolution abundance maps derived from GF-2 imagery served as the ground truth for a pixel-wise accuracy assessment benchmarking ECO-DEAU against the five Baseline models. The comparative performance metrics, including the Coefficient of Determination ( R 2 ), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE), are detailed in Table 2.
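The three accuracy metrics reported in Table 2 can be computed pixel-wise from the estimated and reference abundance fractions; the following is a minimal numpy sketch (the function name is illustrative):

```python
import numpy as np

def unmixing_metrics(est, ref):
    """R^2, RMSE, and MAE between estimated and reference abundances."""
    est = np.asarray(est, dtype=float).ravel()
    ref = np.asarray(ref, dtype=float).ravel()
    resid = est - ref
    ss_res = np.sum(resid ** 2)                    # residual sum of squares
    ss_tot = np.sum((ref - ref.mean()) ** 2)       # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": float(np.sqrt(np.mean(resid ** 2))),
        "MAE": float(np.mean(np.abs(resid))),
    }
```

Computing the metrics per endmember (as in Table 2) simply means calling this on each abundance band separately, while the "Overall" row pools all endmembers together.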
As detailed in Table 2, the quantitative comparison among six unmixing models reveals that the proposed ECO-DEAU achieves the highest overall unmixing accuracy (Overall R 2 = 0.749, R M S E = 0.131, M A E = 0.079), substantially outperforming both the traditional linear method (FCLS) and other advanced deep learning architectures (Baseline AE, Attention-AE, Sparse-AE and 1DCNN-AE).
Most notably, ECO-DEAU demonstrates clear superiority in decoupling spectrally similar land covers, particularly in arid landscapes where Impervious Surfaces and Bare Land are frequently conflated due to their high spectral similarity. In these challenging categories, ECO-DEAU achieves an R 2 of 0.825 for Impervious Surfaces and 0.592 for Bare Land. In contrast, advanced SOTA models such as Sparse-AE, which rely solely on mathematical sparsity priors, suffer severe performance degradation in these classes (Impervious R 2 = 0.701, Bare Land R 2 = 0.496). This result demonstrates that the synergy between the topographic and ecological coexistence constraints within ECO-DEAU, coupled with physical initialization, allows the ecological priors to resolve the mixture correctly even when spectral discrimination alone fails.
Furthermore, ECO-DEAU maintains excellent stability across diverse vegetation types. For Cropland and Grassland, it consistently ranks first ( R 2 of 0.802 and 0.623, respectively), indicating that incorporating phenological and terrain priors effectively constrains the overestimation of bare soil within sparsely vegetated pixels, yielding a more ecologically plausible abundance distribution than the unconstrained Baseline AE ( R 2 of 0.766 and 0.569, respectively).
However, Attention-AE ( R 2 = 0.628) and Sparse-AE ( R 2 = 0.623) slightly outperform ECO-DEAU ( R 2 = 0.593) in the Forest category. This is likely because forests in the study area exhibit distinctive spectral absorption characteristics and pronounced spatial sparsity, which align well with the channel recalibration capability of attention mechanisms and the mathematical assumptions of sparsity regularization. Nevertheless, in contrast to FCLS ( R 2 = 0.542) and 1DCNN-AE ( R 2 = 0.548), which underperformed across the entire dataset, ECO-DEAU achieves the best overall balance. By integrating a nonlinear deep learning model with ecological priors, ECO-DEAU combines high-precision extraction with robust biophysical interpretability in highly heterogeneous arid and semi-arid landscapes.

3.2. Accuracy Assessment with GF-2 Imagery Abundance

To visually and quantitatively validate the spatial fidelity of the unmixing results, two representative validation zones were selected: the intensive agricultural area in Bayannur and the urban-mountain transition zone in Hohhot. These two regions were strategically chosen because they encapsulate the most typical and challenging highly heterogeneous landscape patterns within the arid and semi-arid Yellow River Basin. Specifically, the Bayannur zone provides an ideal area for decoupling complex agricultural-pastoral mixtures governed by phenology, while the Hohhot transition zone rigorously tests the model’s ability to differentiate spectrally ambiguous categories under varying topographic gradients. Together, they offer a comprehensive evaluation of the embedded ecological constraints. High-spatial-resolution Gaofen-2 (GF-2) imagery, acquired concurrently with the Sentinel-2 data, served as the ground truth. To generate the reference abundance maps, these images were first classified into the five target categories using a hybrid approach integrating visual interpretation with machine learning algorithms. Subsequently, the classification results were spatially aggregated to a 100 m scale to mitigate the potential impact of geometric registration errors between the multi-source datasets, ensuring a robust pixel-wise comparison.
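The spatial aggregation step described above—turning a hard GF-2 classification into 100 m reference abundance fractions—can be sketched as a block-wise class-frequency count; the function below is a minimal illustration assuming the map dimensions divide evenly by the block size:

```python
import numpy as np

def aggregate_to_fractions(class_map, block, n_classes):
    """Aggregate a hard classification map into per-block class fractions.

    class_map: (H, W) integer labels; H and W divisible by `block`.
    Returns an (H//block, W//block, n_classes) fractional abundance array.
    """
    H, W = class_map.shape
    # Split into (rows of blocks, block, cols of blocks, block), then
    # flatten each block into its own trailing axis.
    b = class_map.reshape(H // block, block, W // block, block)
    b = b.transpose(0, 2, 1, 3).reshape(H // block, W // block, -1)
    # Fraction of each class within every block.
    return np.stack(
        [(b == c).mean(axis=-1) for c in range(n_classes)], axis=-1
    )
```

For example, aggregating a 4 m classification to a 100 m grid would use `block=25`; each output pixel then holds the ground-truth abundance vector against which the Sentinel-2-scale estimates are compared.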

3.2.1. Accuracy Assessment in Urban-Mountain Transition Zone, Hohhot

To assess the model’s robustness in spatially heterogeneous landscapes, the Urban-Mountain Transition Zone in Hohhot was selected as the primary validation site. This region represents a typical ecotone characterized by complex land cover patterns: the plain area is dominated by a fragmented mosaic of croplands and impervious surfaces, while the mountainous area exhibits a mixed distribution of grassland and forest, significantly constrained by topographic factors such as slope and aspect. To quantify unmixing accuracy, pixel-wise scatter density plots comparing ECO-DEAU estimated abundances with the GF-2 derived ground truth are presented in Figure 6.
Figure 6 visually corroborates the robust unmixing performance of ECO-DEAU within the complex urban-mountain interface. As evidenced by the scatter density plots, estimated abundances for all five endmembers exhibit strong linear agreement with the ground truth, characterized by high-density points clustering tightly along the 1:1 diagonal. Notably, the model demonstrates exceptional precision in delineating anthropogenic and agricultural features, yielding the highest correlation for Impervious Surfaces ( R 2 = 0.825 ) and Cropland ( R 2 = 0.802 ). This validates that the proposed Ecological Coexistence Constraint ( L c o e x ) effectively mitigates the severe spectral confusion between bright impervious surfaces and bare soil, a common challenge in peri-urban mosaics. Furthermore, while the vegetation categories (Forest and Grassland) display slightly wider dispersion due to topographic shadows and the high intra-class spectral variability of semi-arid vegetation, the overall distribution remains statistically unbiased, without significant systematic over- or under-estimation. This confirms that the integration of Topographic Constraints ( L t o p o ) successfully regularized the spatial distribution of vegetation along elevation gradients, ensuring that ECO-DEAU maintains high physical fidelity even in highly heterogeneous transition zones.
Simultaneously, to visually diagnose the spatial distribution of unmixing errors, the abundance residual maps for the validation area were generated, as shown in Figure 7.
The residual maps reveal that absolute errors for the majority of pixels are concentrated within a narrow range of ± 0.1 , indicating a high degree of spatial consistency across the study area. Crucially, residuals do not exhibit significant spatial clustering or systematic bias, such as large patches of consistent overestimation or underestimation. Although slightly elevated residuals are observed at sharp boundaries between agricultural plots and impervious surfaces—a common phenomenon attributed to the mixed pixel effect at transition edges—the overall residual distribution remains stochastic and low in magnitude. This spatial error homogeneity further corroborates that the ECO-DEAU model, constrained by ecological priors, achieves robust generalizability across both homogeneous plains and heterogeneous mountainous terrain, without overfitting to specific land cover types.

3.2.2. Accuracy Assessment in Agricultural Aggregation Area, Bayannur

Following the assessment in Hohhot, the same rigorous validation protocol was applied to the Intensive Agricultural Zone in Bayannur. In contrast to the fragmented urban-mountain transition zone, this region is characterized by a landscape dominated by a homogeneous matrix of large-scale intensive agriculture, interspersed with scattered settlements, shelterbelts, and fallow land. Figure 8 presents the pixel-wise scatter density plots, revealing distinct high-density clustering patterns indicative of accurate endmember extraction across different land cover categories.
Specifically, regarding the Cropland endmember (Figure 8c), scatter points generally follow the 1:1 diagonal, yielding an R 2 of 0.429. While this moderate correlation reflects inherent spectral heterogeneity caused by diverse crop phenology and planting structures, it demonstrates that ECO-DEAU effectively captures the spectral variance induced by abundance changes, successfully distinguishing dominant cultivated areas from the complex mixed background. Notably, Impervious Surfaces (Figure 8e, R 2 = 0.740) and Bare Land (Figure 8d, R 2 = 0.580) exhibit superior linear correlations. This performance is particularly crucial in vegetation-dominated agricultural landscapes, as it confirms the model’s capability to resolve sub-pixel non-vegetated features, such as narrow field paths, irrigation canals, and ridges. Overall, the aggregate scatter plot (Figure 8f) reveals a region-wide R 2 of 0.581, with abundance estimation deviations effectively constrained within 10%, underscoring the model’s efficacy in unmixing spectrally complex agricultural environments.
Additionally, the abundance residual maps for the agricultural zone (Figure 9) provide further insights into the spatial distribution of unmixing errors.
Visual inspection of the residual maps reveals that residuals remain predominantly low and uniform across extensive cropland areas, affirming the model’s stability within homogeneous landscapes. While slightly elevated residuals are discernible along sharp boundaries between crop fields and irrigation canals—a characteristic manifestation of the mixed pixel effect at transition edges—these deviations are spatially constrained to narrow linear features and do not propagate into field interiors. Crucially, the absence of large-scale clustering of positive or negative residuals indicates that ECO-DEAU maintains a high degree of spatial fidelity and is devoid of regional systematic bias, thereby effectively preserving the geometric integrity of the agricultural mosaic.

3.3. Spatial Distribution Patterns of Land Cover Abundances in the Study Area

Based on the validated ECO-DEAU model, pixel-wise fractional abundances for the five target endmembers were derived across the entire study area, as visualized in Figure 10. Overall, unmixing results exhibit distinct spatial heterogeneity that aligns closely with regional topographic and ecological characteristics, effectively capturing the landscape patterns of land cover transitions.
First, the distribution of natural vegetation demonstrates pronounced altitudinal zonation. In the northern Yinshan Mountains, the model identifies high-abundance forest zones (Figure 10b), strongly associated with increased orographic precipitation. Extending from mountains to plains, grassland (Figure 10a) forms a broad transition zone, widely distributed across foothills and undulating plateau slopes. The decreasing trend of grassland abundance from the mountains to the semi-arid plateau accurately reflects the ecological gradient driven by moisture availability. Spatially, this gradient manifests as a gradual shift from a “Forest-Grass” mixture to a “Grass-Bare Land” mixture.
Second, the distribution of Cropland, Bare Land, and Impervious Surfaces reflects the dual influence of Yellow River hydrology and regional climate. The Cropland abundance map (Figure 10c) clearly delineates the Hetao Plain, characterized by low, flat terrain and extensive irrigation networks. In addition to major irrigation districts along the Yellow River, scattered patches of irrigated agriculture are also detected within the southern Ordos Plateau. Bare Land (Figure 10d), representing the most extensive class, displays an increasing trend from southeast to northwest, consistent with the regional precipitation gradient. The pervasive mixing of Bare Land and Grassland further corroborates the necessity of spectral unmixing in this region. Finally, Impervious Surfaces (Figure 10e) are concentrated in major urban centers along the river corridor (e.g., Hohhot and Baotou), gradually transitioning into dispersed rural settlements.

4. Discussion

4.1. Reliability of Data and Validation Strategy

The reliability of the findings presented herein is underpinned by a rigorous data processing framework that ensures spatiotemporal consistency across multi-source datasets, an ecologically representative sampling strategy, and a robust upscaling validation methodology. To mitigate challenges associated with high surface heterogeneity and atmospheric interference in arid and semi-arid regions, a high-precision spatiotemporal alignment strategy was implemented. Temporally, a “ten-day growing season mean compositing” approach was adopted to construct high-quality Sentinel-2 time-series data, effectively suppressing residual cloud cover and noise while preserving critical phenological fidelity. Spatially, SRTM topographic data were co-registered to the 10 m spectral grid via bilinear interpolation, ensuring the physical validity of the terrain constraint function ( L t o p o ) at the pixel scale. Furthermore, the “Process-Oriented, Dual-Purpose Purposive Sampling” strategy—encompassing both Pure Endmember Zones (PEZ) and Ecologically Constrained Mixed Zones (ECMZ)—provided accurate physical initialization priors, endowing the model with robustness in handling highly fragmented habitats, particularly within forest-grass ecotones. Crucially, to circumvent geometric registration errors inherent in multi-source validation, an “upscaling” validation strategy was adopted, wherein both model-derived abundances and high-resolution reference data were spatially aggregated to a 100 m grid. This strategy effectively mitigates the interference of minor positional discrepancies, ensuring that quantitative metrics objectively reflect unmixing performance, thereby confirming the reliability of ECO-DEAU results in terms of both statistical significance and biophysical interpretability.
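The “ten-day growing season mean compositing” step described above can be sketched as a dekadal `nanmean` over a cloud-masked time series; this is a minimal illustration under the assumption that cloudy observations have already been set to NaN (the function name and binning rule are illustrative):

```python
import numpy as np

def dekadal_composite(series, dates_doy, window=10):
    """Ten-day (dekadal) mean compositing of a reflectance time series.

    series: (T, n_pixels) reflectances, with np.nan for masked (cloudy)
    observations; dates_doy: (T,) day-of-year of each acquisition.
    Returns one composited row per occupied ten-day bin.
    """
    dates_doy = np.asarray(dates_doy)
    bins = dates_doy // window               # assign each date to a dekad
    out = []
    for b in np.unique(bins):
        # nanmean ignores cloudy observations within the window
        out.append(np.nanmean(series[bins == b], axis=0))
    return np.array(out)
```

Averaging within each window suppresses residual cloud and noise while the ten-day step preserves the seasonal (phenological) trajectory that the phenology-based constraints depend on.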

4.2. Reliability of Model Architecture and Ecological Constraints

The uniqueness of ECO-DEAU lies in its ability to efficiently separate data while maintaining ecological validity, outperforming both traditional spectral unmixing methods (FCLS) and deep autoencoder models. Its core advantage stems from integrating three key features: endmember-spectrum-based weight initialization, ecological constraints, and feature-level nonlinear mapping. First, this approach avoids the pitfalls of deep learning caused by poor initialization: the endmember initialization method utilizes initial endmembers extracted by Vertex Component Analysis (VCA). These initialized weights guide the model toward improved training efficiency and accuracy.
Second, the ecological constraints, incorporated as loss-function terms, effectively address the issue of “spectral similarity” in arid regions. Specifically, the terrain suitability constraint successfully distinguishes between spectrally similar but topographically distinct land types—such as cropland, grassland, and forest—using prior terrain information. Meanwhile, the coexistence constraint penalizes ecologically implausible mixtures of land cover types. Crucially, these ecological constraints function as soft regularization rather than rigid rules, balancing prior knowledge against empirical data evidence. When the local surface cover conflicts with regional topography or other ecological expectations owing to microclimates or human intervention (e.g., steep artificial terraces), the model does not blindly sacrifice spectral reconstruction accuracy to satisfy the ecological constraints. Instead, the distinct spectral signatures of such anomalous covers incur a reconstruction loss far exceeding the minor ecological penalties. Thus, the ecological priors intervene only when the data evidence is highly ambiguous, with the reconstruction loss remaining the primary discriminator.
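This soft-regularization behavior can be sketched as a weighted total loss in which reconstruction error dominates and the terrain and coexistence terms contribute comparatively small penalties. The weights and penalty inputs below are illustrative placeholders, not the paper's actual formulation:

```python
import numpy as np

def total_loss(x, x_hat, abundances, topo_penalty, coex_penalty,
               w_topo=0.1, w_coex=0.1):
    """Reconstruction loss plus soft ecological regularizers.

    topo_penalty: (n_pixels, n_endmembers) per-class terrain
    implausibility (hypothetical, precomputed from slope/elevation);
    coex_penalty: (n_pixels,) implausible co-occurrence score.
    Small weights keep reconstruction as the primary discriminator.
    """
    l_rec = np.mean((x - x_hat) ** 2)             # dominates the gradient
    l_topo = np.mean(abundances * topo_penalty)   # terrain-implausible mass
    l_coex = np.mean(coex_penalty)                # implausible mixtures
    return l_rec + w_topo * l_topo + w_coex * l_coex
```

With small `w_topo` and `w_coex`, an anomalous but spectrally distinctive cover keeps its correct abundances (its reconstruction term outweighs the ecological penalty), while spectrally ambiguous pixels are nudged toward the ecologically plausible solution.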
Finally, unlike traditional linear spectral unmixing (LSU) constrained by linear assumptions, ECO-DEAU employs an asymmetric “nonlinear encoding, linear decoding” architecture. Its deep encoder captures complex nonlinear features within spectral curves. Its key advantage lies in the linear decoder’s ability to clearly preserve the physical interpretability of the retrieved abundances even under complex environmental conditions, making it particularly advantageous for handling scenes with mixed ground features.
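The asymmetric “nonlinear encoding, linear decoding” design can be illustrated with a minimal numpy forward pass: a softmax abundance layer enforces the non-negativity (ANC) and sum-to-one (ASC) constraints, and a single linear layer reconstructs the spectrum as an abundance-weighted sum of endmembers. All weights here are random placeholders; in ECO-DEAU the decoder would instead be initialized with the physically extracted endmember spectra:

```python
import numpy as np

rng = np.random.default_rng(0)
n_bands, n_end, hidden = 10, 5, 32

W1 = rng.normal(0, 0.1, (n_bands, hidden))   # encoder weights (placeholder)
W2 = rng.normal(0, 0.1, (hidden, n_end))
E = rng.uniform(0, 1, (n_end, n_bands))      # endmember matrix = decoder

def forward(x):
    h = np.maximum(x @ W1, 0)                # nonlinear encoder (ReLU)
    logits = h @ W2
    a = np.exp(logits - logits.max(axis=1, keepdims=True))
    a /= a.sum(axis=1, keepdims=True)        # softmax enforces ANC + ASC
    return a, a @ E                          # linear decode: x_hat = a @ E

x = rng.uniform(0, 1, (4, n_bands))
a, x_hat = forward(x)
assert np.all(a >= 0) and np.allclose(a.sum(axis=1), 1.0)
```

Because the decoder is a single linear map, the latent code `a` is directly interpretable as physical abundance fractions, which is what preserves interpretability even though the encoder is deeply nonlinear.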

4.3. Transferability of ECO-DEAU

Although the ten sampling areas selected in this study effectively represent the complex landscape patterns of arid and semi-arid regions in the Yellow River Basin, it is also important to assess how well the proposed ECO-DEAU model transfers to other ecological contexts. The following analysis evaluates this transferability.
The transferability of this method can be evaluated from two perspectives. First, the core architecture—a deep autoencoder unmixing model with strict physical constraints (e.g., ANC, ASC, and reflectance upper bounds)—possesses mathematical universality and can be directly transferred to any hyperspectral unmixing task [29,32,59]; architectures of this kind have already been tested on unmixing tasks across multiple regions. Second, the endmember initialization strategy proposed in this study, which is designed to enhance training performance and accelerate convergence, functions primarily as a “reference” to guide the training process rather than as a stringent constraint on the model. When transferring the model to a different study area, this strategy therefore remains adaptable to local conditions: one need only select spectral data from pure pixels of the target land cover classes within the new study area as input. It is worth noting, however, that in smaller study areas, or when high-resolution imagery is unavailable, visually identifying or methodically extracting pure pixels may prove challenging; under such conditions, the applicability of the endmember initialization strategy diminishes, and a more suitable approach should be adopted. Furthermore, the transferability of the ecological constraints is more complex. While the underlying logic of the ecological priors is highly adaptable, specific parameters may require localized calibration. The terrain constraint in this study applies to areas where land cover varies systematically with terrain, and it performs well where this variation is pronounced at large scales. When focusing on smaller areas where terrain-dependent variation is weak, alternative constraints should be considered.
Regarding the model’s coexistence and phenological constraints, when transferring to new study areas, seasonal time windows and index thresholds must be adjusted based on local characteristics. This constitutes the primary task in model transfer: first, modifying vegetation growth season intervals according to actual vegetation growth patterns in the target area; second, determining relevant vegetation index thresholds based on regional vegetation types and their approximate distributions.

4.4. Uncertainties and Limitations

Despite the superior performance of ECO-DEAU in unmixing complex landscapes in arid and semi-arid regions, certain uncertainties remain, stemming from intrinsic constraints in data sources and model assumptions. First, the spatial resolution of input data imposes inherent limits on the fine-scale identification of discrete micro-features. Although Sentinel-2 provides 10 m observations, sub-pixel unmixing remains challenging for highly fragmented targets, such as dispersed shrubs or narrow field ridges. Furthermore, the scale discrepancy between SRTM data (30 m) and spectral data may introduce uncertainties in areas with complex micro-topography, potentially attenuating the efficacy of terrain constraints. Second, this study is currently predicated on data spanning the 2023 growing season. Given the significant interannual climatic variability in arid regions, phenological characteristics captured in a single year may not fully represent scenarios under anomalous climatic conditions; consequently, the model’s temporal transferability for long-term monitoring warrants further validation. Finally, while the physical initialization strategy endows the model with explicit physical meaning, it is fundamentally based on the “fixed endmember” assumption. This approach does not explicitly account for Intra-class Spectral Variability (ISV), which constrains the model’s ability to resolve complex vegetation community structures [60,61].

4.5. Future Work

Although the ECO-DEAU model demonstrates excellent unmixing performance in arid and semi-arid regions of the Yellow River Basin, several future research directions remain. First, while the existing ecological constraints effectively guide the network, their specific thresholds still require localized manual calibration. Future studies will focus on developing automated ecological prior generation modules using global geospatial big data, aiming to enable efficient, large-scale land cover extraction. Second, the current model estimates land cover abundance from a single year of data. To address this limitation, subsequent research may incorporate temporal information to construct long-term unmixing models: by ingesting multi-year or even multi-decade time series of remote sensing imagery and vegetation phenology data, the model could support long-term visualization and dynamic analysis of land cover abundance changes in arid and semi-arid regions. Finally, leveraging increasingly available high-spatial- and high-temporal-resolution remote sensing data, future spectral unmixing models can incorporate hyperspectral imagery to capture finer spectral characteristics of land cover elements, enhancing the distinctiveness of discrimination features and improving unmixing performance.

5. Conclusions

To address the critical challenges posed by fragmented landscapes and ubiquitous mixed pixels in arid and semi-arid regions, this study proposes the novel Ecologically Constrained Deep Autoencoder Unmixing (ECO-DEAU) framework. By synergizing the physical interpretability of Linear Mixture Models (LMM) with the non-linear feature extraction capabilities of deep autoencoders, the method introduces an endmember-based physical initialization strategy that significantly accelerates convergence and avoids the local optima pitfalls of traditional random initialization. Crucially, the model innovatively integrates ecological priors—specifically terrain, phenology, and coexistence constraints—as soft regularizations into the network. This mechanism gracefully balances actual data evidence with geographic knowledge, effectively circumventing the “endmember permutation” issues inherent in traditional deep unmixing models and successfully decoupling spectrally ambiguous land covers.
Extensive multi-scale validation in the Inner Mongolia section of the Yellow River Basin demonstrates that ECO-DEAU significantly outperforms both traditional linear methods and SOTA deep learning architectures (including Attention-AE, Sparse-AE, and 1DCNN-AE), achieving a maximum overall R 2 of 0.749 and an RMSE below 0.15. The findings indicate that, when delineating complex forest–grassland transition zones and fine-scale landscape features, deep autoencoders incorporating ecological priors are more robust than models relying solely on spectral information. Ultimately, this study demonstrates the feasibility of achieving high-precision sub-pixel inversion using medium-resolution imagery combined with multi-source ecological priors. This provides a robust methodological foundation for accurate large-scale vegetation carbon stock estimation and offers strong support for achieving regional carbon neutrality goals.

Author Contributions

Conceptualization, L.Z. and K.L.; methodology, L.Z. and S.W.; validation, L.Z. and L.L.; data curation, L.Z. and D.L.; writing—original draft preparation, L.Z. and L.L.; writing—review and editing, D.L. and Y.B.; visualization, L.L. and H.L.; supervision, K.L.; project administration, K.L. and S.W.; funding acquisition, S.W. All authors have read and agreed to the published version of the manuscript.

Funding

This study has been jointly supported by the Foreign Technical Cooperation and Scientific Research Program (Grant No. ZE01) and Science and Technology Plan Project of Hohhot (Grant No. 2022-Social-Key 4-1-1).

Data Availability Statement

All of the data used in this study can be accessed and downloaded at Google Earth Engine (https://earthengine.google.com/). Maps of land cover abundance are available by contacting the authors.

Acknowledgments

Thanks to Google Earth Engine for providing access to datasets and computational resources, which greatly facilitated this research.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Reynolds, J.F.; Smith, D.M.S.; Lambin, E.F.; Turner, B.L.; Mortimore, M.; Batterbury, S.P.J.; Downing, T.E.; Dowlatabadi, H.; Fernández, R.J.; Herrick, J.E.; et al. Global Desertification: Building a Science for Dryland Development. Science 2007, 316, 847–851. [Google Scholar] [CrossRef]
  2. Poulter, B.; Frank, D.; Ciais, P.; Myneni, R.B.; Andela, N.; Bi, J.; Broquet, G.; Canadell, J.G.; Chevallier, F.; Liu, Y.Y.; et al. Contribution of Semi-Arid Ecosystems to Interannual Variability of the Global Carbon Cycle. Nature 2014, 509, 600–603. [Google Scholar] [CrossRef]
  3. Ahlström, A.; Raupach, M.R.; Schurgers, G.; Smith, B.; Arneth, A.; Jung, M.; Reichstein, M.; Canadell, J.G.; Friedlingstein, P.; Jain, A.K.; et al. The Dominant Role of Semi-Arid Ecosystems in the Trend and Variability of the Land CO2 Sink. Science 2015, 348, 895–899. [Google Scholar] [CrossRef]
  4. Huang, J.; Yu, H.; Guan, X.; Wang, G.; Guo, R. Accelerated Dryland Expansion under Climate Change. Nat. Clim. Change 2016, 6, 166–171. [Google Scholar] [CrossRef]
  5. Huang, J.; Li, Y.; Fu, C.; Chen, F.; Fu, Q.; Dai, A.; Shinoda, M.; Ma, Z.; Guo, W.; Li, Z.; et al. Dryland Climate Change: Recent Progress and Challenges. Rev. Geophys. 2017, 55, 719–778. [Google Scholar] [CrossRef]
  6. Chen, C.; Park, T.; Wang, X.; Piao, S.; Xu, B.; Chaturvedi, R.K.; Fuchs, R.; Brovkin, V.; Ciais, P.; Fensholt, R.; et al. China and India Lead in Greening of the World through Land-Use Management. Nat. Sustain. 2019, 2, 122–129. [Google Scholar] [CrossRef]
  7. Li, B.; Liu, D.; Yu, E.; Wang, L. Warming-and-Wetting Trend over the China’s Drylands: Observational Evidence and Future Projection. Glob. Environ. Change 2024, 86, 102826. [Google Scholar] [CrossRef]
  8. Bryan, B.A.; Gao, L.; Ye, Y.; Sun, X.; Connor, J.D.; Crossman, N.D.; Stafford-Smith, M.; Wu, J.; He, C.; Yu, D.; et al. China’s Response to a National Land-System Sustainability Emergency. Nature 2018, 559, 193–204. [Google Scholar] [CrossRef] [PubMed]
  9. Lu, F.; Hu, H.; Sun, W.; Zhu, J.; Liu, G.; Zhou, W.; Zhang, Q.; Shi, P.; Liu, X.; Wu, X.; et al. Effects of National Ecological Restoration Projects on Carbon Sequestration in China from 2001 to 2010. Proc. Natl. Acad. Sci. USA 2018, 115, 4039–4044. [Google Scholar] [CrossRef] [PubMed]
  10. Fang, J.; Yu, G.; Liu, L.; Hu, S.; Chapin, F.S. Climate Change, Human Impacts, and Carbon Sequestration in China. Proc. Natl. Acad. Sci. USA 2018, 115, 4015–4020. [Google Scholar] [CrossRef]
  11. Li, L.; Liu, K.; Wang, S.; Li, H.; Bo, Y.; Li, X. Mapping Irrigated Cropland at 30 m Spatial Resolution in Northern China over the Past Three Decades. GIScience Remote Sens. 2025, 62, 2563394. [Google Scholar] [CrossRef]
  12. Aguiar, M.R.; Sala, O.E. Patch Structure, Dynamics and Implications for the Functioning of Arid Ecosystems. Trends Ecol. Evol. 1999, 14, 273–277. [Google Scholar] [CrossRef]
  13. Keshava, N.; Mustard, J.F. Spectral Unmixing. IEEE Signal Process. Mag. 2002, 19, 44–57. [Google Scholar] [CrossRef]
  14. Small, C. The Landsat ETM+ Spectral Mixing Space. Remote Sens. Environ. 2004, 93, 1–17. [Google Scholar] [CrossRef]
  15. Foody, G.M. Status of Land Cover Classification Accuracy Assessment. Remote Sens. Environ. 2002, 80, 185–201. [Google Scholar] [CrossRef]
  16. DeFries, R.S.; Field, C.B.; Fung, I.; Collatz, G.J.; Bounoua, L. Combining Satellite Data and Biogeochemical Models to Estimate Global Effects of Human-induced Land Cover Change on Carbon Emissions and Primary Productivity. Glob. Biogeochem. Cycles 1999, 13, 803–815. [Google Scholar] [CrossRef]
  17. Liu, X.; Wang, S.; Zhang, X.; Zhen, L.; Ma, C.; Naing, S.Y.; Liu, K.; Li, H. Monitoring Spatiotemporal Dynamics of Farmland Abandonment and Recultivation Using Phenological Metrics. Land 2025, 14, 1745. [Google Scholar] [CrossRef]
  18. Smith, M.O.; Ustin, S.L.; Adams, J.B.; Gillespie, A.R. Vegetation in Deserts: I. A Regional Measure of Abundance from Multispectral Images. Remote Sens. Environ. 1990, 31, 1–26. [Google Scholar] [CrossRef]
  19. Elmore, A.J.; Mustard, J.F.; Manning, S.J.; Lobell, D.B. Quantifying Vegetation Change in Semiarid Environments: Precision and Accuracy of Spectral Mixture Analysis and the Normalized Difference Vegetation Index. Remote Sens. Environ. 2000, 73, 87–102. [Google Scholar] [CrossRef]
  20. Nascimento, J.M.P.; Dias, J.M.B. Vertex Component Analysis: A Fast Algorithm to Unmix Hyperspectral Data. IEEE Trans. Geosci. Remote Sens. 2005, 43, 898–910. [Google Scholar] [CrossRef]
  21. Winter, M.E. N-FINDR: An Algorithm for Fast Autonomous Spectral End-Member Determination in Hyperspectral Data. In Imaging Spectrometry V; SPIE: Bellingham, WA, USA, 1999; Volume 3753, pp. 266–275. [Google Scholar]
  22. Dobigeon, N.; Tourneret, J.-Y.; Richard, C.; Bermudez, J.C.M.; McLaughlin, S.; Hero, A.O. Nonlinear Unmixing of Hyperspectral Images: Models and Algorithms. IEEE Signal Process. Mag. 2014, 31, 82–94. [Google Scholar] [CrossRef]
  23. Heylen, R.; Parente, M.; Gader, P. A Review of Nonlinear Hyperspectral Unmixing Methods. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 1844–1868. [Google Scholar] [CrossRef]
  24. Zhao, D.; Hu, B.; Jiang, W.; Zhong, W.; Arun, P.V.; Cheng, K.; Zhao, Z.; Zhou, H. Hyperspectral Video Tracker based on Spectral Difference Matching Reduction and Deep Spectral Target Perception Features. Opt. Lasers Eng. 2025, 194, 109124. [Google Scholar] [CrossRef]
  25. Su, Y.; Zhu, Z.; Gao, L.; Plaza, A.; Li, P.; Sun, X.; Xu, X. DAAN: A deep autoencoder-based augmented network for blind multilinear hyperspectral unmixing. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5512715. [Google Scholar] [CrossRef]
  26. Zhang, M.; Yang, M.; Xie, H.; Yue, P.; Zhang, W.; Jiao, Q.; Xu, L.; Tan, X. A global spatial-spectral feature fused autoencoder for nonlinear hyperspectral unmixing. Remote Sens. 2024, 16, 3149. [Google Scholar] [CrossRef]
  27. Mantripragada, K.; Qureshi, F.Z. Hyperspectral pixel unmixing with latent dirichlet variational autoencoder. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5507112. [Google Scholar] [CrossRef]
  28. Su, Y.; Li, J.; Plaza, A.; Marinoni, A.; Gamba, P.; Chakravortty, S. DAEN: Deep Autoencoder Networks for Hyperspectral Unmixing. IEEE Trans. Geosci. Remote Sens. 2019, 57, 4309–4321. [Google Scholar] [CrossRef]
  29. Palsson, B.; Ulfarsson, M.O.; Sveinsson, J.R. Convolutional Autoencoder for Spectral–Spatial Hyperspectral Unmixing. IEEE Trans. Geosci. Remote Sens. 2021, 59, 535–549. [Google Scholar] [CrossRef]
  30. Chen, J.; Zhao, M.; Wang, X.; Richard, C.; Rahardja, S. Integration of physics-based and data-driven models for hyperspectral image unmixing: A summary of current methods. IEEE Signal Process. Mag. 2023, 40, 61–74. [Google Scholar] [CrossRef]
  31. Zhao, M.; Chen, J.; Dobigeon, N. AE-RED: A hyperspectral unmixing framework powered by deep autoencoder and regularization by denoising. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5512115. [Google Scholar] [CrossRef]
  32. Hong, D.; Gao, L.; Yao, J.; Yokoya, N.; Chanussot, J.; Heiden, U.; Zhang, B. Endmember-guided unmixing network (EGU-Net): A general deep learning framework for self-supervised hyperspectral unmixing. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 6518–6531. [Google Scholar] [CrossRef]
  33. Bhatt, J.S.; Joshi, M.V. Deep Learning in Hyperspectral Unmixing: A Review. In Proceedings of the IGARSS 2020—2020 IEEE International Geoscience and Remote Sensing Symposium; IEEE: Piscataway, NJ, USA, 2020; pp. 2189–2192. [Google Scholar]
  34. Advances in Nonlinear Spectral Unmixing of Hyperspectral Images-All Databases. Available online: https://webofscience.clarivate.cn/wos/alldb/full-record/CSCD:4934739 (accessed on 21 May 2024).
  35. Dynamic Vegetation Responses to Climate and Land Use Changes over the Inner Mongolia Reach of the Yellow River Basin, China. Available online: https://www.mdpi.com/2072-4292/15/14/3531 (accessed on 30 January 2026).
  36. Yang, J.; Jia, L.; Hao, J.; Luo, Q.; Chi, W.; Wang, Y.; Zheng, H.; Yuan, R.; Na, Y. Temporal and Spatial Variation Characteristics of the Ecosystem in the Inner Mongolia Section of the Yellow River Basin. Atmosphere 2024, 15, 827. [Google Scholar] [CrossRef]
  37. Zhang, H.; Zhang, J.; Lv, Z.; Yao, L.; Zhang, N.; Zhang, Q. Spatio-Temporal Assessment of Landscape Ecological Risk and Associated Drivers: A Case Study of the Yellow River Basin in Inner Mongolia. Land 2023, 12, 1114. [Google Scholar] [CrossRef]
  38. Chen, Q.; Zhu, M.; Zhang, C.; Zhou, Q. The Driving Effect of Spatial-Temporal Difference of Water Resources Carrying Capacity in the Yellow River Basin. J. Clean. Prod. 2023, 388, 135709. [Google Scholar] [CrossRef]
  39. Liu, J.; Zhang, Z.; Xu, X.; Kuang, W.; Zhou, W.; Zhang, S.; Li, R.; Yan, C.; Yu, D.; Wu, S.; et al. Spatial Patterns and Driving Forces of Land Use Change in China during the Early 21st Century. J. Geogr. Sci. 2010, 20, 483–494. [Google Scholar] [CrossRef]
  40. Drusch, M.; Del Bello, U.; Carlier, S.; Colin, O.; Fernandez, V.; Gascon, F.; Hoersch, B.; Isola, C.; Laberinti, P.; Martimort, P.; et al. Sentinel-2: ESA’s Optical High-Resolution Mission for GMES Operational Services. Remote Sens. Environ. 2012, 120, 25–36. [Google Scholar] [CrossRef]
  41. Korhonen, L.; Hadi; Packalen, P.; Rautiainen, M. Comparison of Sentinel-2 and Landsat 8 in the Estimation of Boreal Forest Canopy Cover and Leaf Area Index. Remote Sens. Environ. 2017, 195, 259–274. [Google Scholar] [CrossRef]
  42. Shen, B.; Guo, J.; Li, Z.; Chen, J.; Fang, W.; Kussainova, M.; Amarjargal, A.; Pulatov, A.; Yan, R.; Anenkhonov, O.A.; et al. Comparative Verification of Leaf Area Index Products for Different Grassland Types in Inner Mongolia, China. Remote Sens. 2023, 15, 4736. [Google Scholar] [CrossRef]
  43. Zhu, Z.; Woodcock, C.E. Object-Based Cloud and Cloud Shadow Detection in Landsat Imagery. Remote Sens. Environ. 2012, 118, 83–94. [Google Scholar] [CrossRef]
  44. Zhao, D.; Wang, M.; Huang, K.; Zhong, W.; Arun, P.V.; Li, Y.; Asano, Y.; Wu, L.; Zhou, H. OCSCNet-Tracker: Hyperspectral Video Tracker based on Octave Convolution and Spatial-Spectral Capsule Network. Remote Sens. 2025, 17, 693. [Google Scholar] [CrossRef]
  45. Zhu, Z.; Wang, S.; Woodcock, C.E. Improvement and Expansion of the Fmask Algorithm: Cloud, Cloud Shadow, and Snow Detection for Landsats 4–7, 8, and Sentinel 2 Images. Remote Sens. Environ. 2015, 159, 269–277. [Google Scholar] [CrossRef]
  46. Zhao, D.; Tang, L.; Arun, P.V.; Asano, Y.; Zhang, L.; Xiong, Y.; Tao, X.; Hu, J. City-Scale Distance Estimation via Near-Infrared Trispectral Light Extinction in Bad Weather. Infrared Phys. Technol. 2023, 128, 104507. [Google Scholar] [CrossRef]
  47. Main-Knorn, M.; Pflug, B.; Louis, J.; Debaecker, V.; Müller-Wilm, U.; Gascon, F. Sen2Cor for Sentinel-2. In Proceedings of the Image and Signal Processing for Remote Sensing XXIII; SPIE: Bellingham, DC, USA, 2017; Volume 10427, pp. 37–48. [Google Scholar]
  48. Immitzer, M.; Vuolo, F.; Atzberger, C. First Experience with Sentinel-2 Data for Crop and Tree Species Classifications in Central Europe. Remote Sens. 2016, 8, 166. [Google Scholar] [CrossRef]
  49. Griffiths, P.; van der Linden, S.; Kuemmerle, T.; Hostert, P. A Pixel-Based Landsat Compositing Algorithm for Large Area Land Cover Mapping. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2013, 6, 2088–2101. [Google Scholar] [CrossRef]
  50. Farr, T.G.; Rosen, P.A.; Caro, E.; Crippen, R.; Duren, R.; Hensley, S.; Kobrick, M.; Paller, M.; Rodriguez, E.; Roth, L.; et al. The Shuttle Radar Topography Mission. Rev. Geophys. 2007, 45, RG2004. [Google Scholar] [CrossRef]
  51. Wu, Q.; Zhong, R.; Zhao, W.; Song, K.; Du, L. Land-Cover Classification Using GF-2 Images and Airborne Lidar Data Based on Random Forest. Int. J. Remote Sens. 2019, 40, 2410–2426. [Google Scholar] [CrossRef]
  52. Pettorelli, N.; Vik, J.O.; Mysterud, A.; Gaillard, J.-M.; Tucker, C.J.; Stenseth, N.C. Using the Satellite-Derived NDVI to Assess Ecological Responses to Environmental Change. Trends Ecol. Evol. 2005, 20, 503–510. [Google Scholar] [CrossRef]
  53. Zhao, D.; Zhong, W.; Ge, M.; Jiang, W.; Zhu, X.; Arun, P.V.; Zhou, H. SiamBSI: Hyperspectral Video Tracker based on Band Correlation Grouping and Spatial-Spectral Information Interaction. Infrared Phys. Technol. 2025, 151, 106063. [Google Scholar] [CrossRef]
  54. Zhao, D.; Xu, X.; You, M.; Arun, P.V.; Zhao, Z.; Ren, J.; Wu, L.; Zhou, H. Local Sub-block Contrast and Spatial-spectral Gradient Features Fusion for Hyperspectral Anomaly Detection. Remote Sens. 2025, 17, 695. [Google Scholar] [CrossRef]
  55. Wang, J.; Xu, J. A Spectral-Spatial Attention Autoencoder Network for Hyperspectral Unmixing. In Proceedings of the IGARSS 2023-2023 IEEE International Geoscience and Remote Sensing Symposium; IEEE: Piscataway, NJ, USA, 2023; pp. 7519–7522. [Google Scholar]
  56. Jin, D.; Yang, B. Graph attention convolutional autoencoder-based unsupervised nonlinear unmixing for hyperspectral images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 7896–7906. [Google Scholar] [CrossRef]
  57. Ozkan, S.; Kaya, B.; Akar, G.B. Endnet: Sparse autoencoder network for endmember extraction and hyperspectral unmixing. IEEE Trans. Geosci. Remote Sens. 2018, 57, 482–496. [Google Scholar] [CrossRef]
  58. Dhaini, M.; Berar, M.; Honeine, P.; Van Exem, A. End-to-end convolutional autoencoder for nonlinear hyperspectral unmixing. Remote Sens. 2022, 14, 3341. [Google Scholar] [CrossRef]
  59. Zhao, M.; Wang, M.; Chen, J.; Rahardja, S. Hyperspectral unmixing via deep autoencoder networks for a generalized linear-mixture/nonlinear-fluctuation model. arXiv 2019, arXiv:1904.13017. [Google Scholar]
  60. Borsoi, R.A.; Imbiriba, T.; Bermudez, J.C.M.; Richard, C.; Chanussot, J.; Drumetz, L.; Tourneret, J.L.; Zare, A.; Jutten, C. Spectral variability in hyperspectral data unmixing: A comprehensive review. IEEE Geosci. Remote Sens. Mag. 2021, 9, 223–270. [Google Scholar] [CrossRef]
  61. Somers, B.; Asner, G.P.; Tits, L.; Coppin, P. Endmember variability in spectral mixture analysis: A review. Remote Sens. Environ. 2011, 115, 1603–1616. [Google Scholar] [CrossRef]
Figure 1. The location and topographic characteristics (a) of the study area in the Inner Mongolia section of the Yellow River Basin. The blue stars indicate Pure Endmember Zones (PEZ), and the red triangles indicate Ecologically Constrained Mixed Zones (ECMZ); both were utilized as samples for model training. Panels (b,c) illustrate examples of true-color Gaofen-2 (GF-2) imagery used for model validation.
Figure 2. Flowchart of ECO-DEAU framework. (a) Preprocessing of Sentinel-2 multispectral data and SRTM DEM data; (b) Construction and implementation of the Ecologically Constrained Deep Autoencoder (ECO-DEAU) model; (c) Abundance validation and accuracy assessment against the baseline DEAU model and GF-2 imagery.
Figure 3. Initial endmember spectral signatures for the five primary land cover types (grassland, forest, cropland, bareland, and impervious surfaces).
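As the abstract notes, the standard spectral signatures in Figure 3 are derived with spectral indices. The sketch below illustrates that general idea only: the band order, the NDVI-style rule, and the thresholds are all hypothetical stand-ins, not the authors' actual extraction procedure.

```python
import numpy as np

def extract_endmembers(cube, masks):
    """Average the spectra of index-selected 'pure' pixels per class.

    cube:  (H, W, B) surface-reflectance array (band order is an assumption).
    masks: {class_name: (H, W) boolean mask} marking candidate pure pixels.
    """
    return {name: cube[mask].mean(axis=0) for name, mask in masks.items()}

# Toy example: 4 bands (blue, green, red, nir) and an illustrative NDVI rule.
rng = np.random.default_rng(0)
cube = rng.uniform(0.0, 0.6, size=(50, 50, 4))
red, nir = cube[..., 2], cube[..., 3]
ndvi = (nir - red) / (nir + red + 1e-9)
masks = {"grass": ndvi > 0.3, "bareland": ndvi < 0.1}  # hypothetical cutoffs
ems = extract_endmembers(cube, masks)                  # one mean spectrum per class
```

In practice each resulting mean spectrum would then serve as one row of the initial endmember matrix.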
Figure 4. Architecture of Ecologically Constrained Deep Learning Autoencoder.
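The constraint mechanism behind an architecture of this kind can be shown in a few lines: a softmax abundance layer enforces non-negativity and sum-to-one, and the decoder is a linear mixing layer whose weights would be initialized from reference endmember spectra. This NumPy sketch covers only an untrained forward pass under those assumptions; the dimensions, initialization, and training procedure of the published model are not reproduced here.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def constrained_ae_forward(X, W_enc, E):
    """One forward pass of a constrained unmixing autoencoder sketch.

    X:     (N, B) mixed pixel spectra.
    W_enc: (B, K) encoder weights (learned in a real model).
    E:     (K, B) endmember matrix; seeding it with reference spectra is the
           stand-in for the ecological constraint described in the abstract.
    """
    A = softmax(X @ W_enc, axis=1)  # abundances: non-negative, rows sum to 1
    X_hat = A @ E                   # linear-mixing reconstruction
    return A, X_hat

rng = np.random.default_rng(1)
B, K, N = 6, 5, 100
E = rng.uniform(0.05, 0.5, size=(K, B))      # stand-in endmember spectra
A_true = rng.dirichlet(np.ones(K), size=N)   # synthetic ground-truth fractions
X = A_true @ E                               # synthetic mixed pixels
A, X_hat = constrained_ae_forward(X, rng.normal(size=(B, K)), E)
assert np.allclose(A.sum(axis=1), 1.0) and (A >= 0).all()
```

Training would then minimize the reconstruction error between `X` and `X_hat` while keeping the abundance constraints intact.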
Figure 5. Comparison of endmember spectral signatures among the initialization, the Baseline AE, and the proposed ECO-DEAU model. Dashed lines show the initially extracted endmembers, blue solid lines the endmembers learned by the Baseline AE, and red solid lines the optimized endmembers derived from the proposed ECO-DEAU model.
Figure 6. Pixel-wise scatter density plots evaluating the unmixing accuracy in the Urban-Mountain Transition Zone (Hohhot). The plots compare the reference abundance derived from GF-2 imagery with the estimated abundance from the ECO-DEAU model for the five target endmembers: (a) Grassland, (b) Forest, (c) Cropland, (d) Barren land, (e) Impervious Surfaces and (f) Overall. The color gradient from dark blue to bright red corresponds to the transition from low to high pixel density.
Figure 7. Spatial distribution of abundance residuals in the Urban-Mountain Transition Zone (Hohhot) for (a) Grassland, (b) Forest, (c) Cropland, (d) Barren land, (e) Impervious Surfaces, and (f) Overall. Red indicates overestimation; blue indicates underestimation.
Figure 8. Pixel-wise scatter density plots evaluating the unmixing accuracy in the agricultural aggregation area (Bayannur). The plots compare the reference abundance derived from GF-2 imagery with the estimated abundance from the ECO-DEAU model for the five target endmembers: (a) Grassland, (b) Forest, (c) Cropland, (d) Barren land, (e) Impervious Surfaces and (f) Overall. The color gradient from dark blue to bright red corresponds to the transition from low to high pixel density.
Figure 9. Spatial distribution of abundance residuals in the agricultural aggregation area (Bayannur) for (a) Grassland, (b) Forest, (c) Cropland, (d) Barren land, (e) Impervious Surfaces, and (f) Overall. Red indicates overestimation; blue indicates underestimation.
Figure 10. Spatial distribution patterns of unmixed fractional abundances for the five target endmembers across the study area. (a) Grassland, (b) Forest, (c) Cropland, (d) Barren Land, and (e) Impervious Surfaces. The background color gradient represents the pixel-wise abundance fraction ranging from 0 (low) to 1 (high). Red circles highlight representative high-abundance “hotspots”, while blue circles indicate low-abundance areas.
Table 1. Basic Information of Sample Zones.
| Sample Zone | Mixing Type | Location |
|---|---|---|
| PEZ_1 | Forest | Daqingshan Core Reserve |
| PEZ_2 | Cropland | Core of Hetao Irrigation District |
| PEZ_3 | Impervious | Baotou Old City District |
| PEZ_4 | Bareland | Hinterland of Kubuqi Desert |
| PEZ_5 | Grassland | Daqingshan Mountain Meadow |
| ECMZ_1 | Forest/Grassland/Bareland | Southern Foothills of Daqingshan |
| ECMZ_2 | Cropland/Grassland/Bareland | Hetao Irrigation District |
| ECMZ_3 | Bareland/Grassland/Cropland | Eastern Edge of Ulan Buh Desert |
| ECMZ_4 | Impervious/Cropland/Bareland | Urban-Rural Fringe of Hohhot |
| ECMZ_5 | Bareland/Grassland | Mining Area of Daqingshan |
Table 2. Quantitative comparison of unmixing accuracy between the proposed ECO-DEAU model and five baseline models (Baseline AE, Attention-AE, Sparse-AE, 1DCNN-AE, and FCLS), validated against GF-2 reference abundance maps. For each land cover type and metric, the best value among the six models is shown in bold.
| Land Cover Type | Model | R² | RMSE | MAE |
|---|---|---|---|---|
| Grassland | ECO-DEAU | **0.623** | **0.179** | **0.134** |
| | Baseline AE | 0.569 | 0.192 | 0.146 |
| | Attention-AE | 0.553 | 0.196 | 0.155 |
| | Sparse-AE | 0.541 | 0.198 | 0.154 |
| | 1DCNN-AE | 0.311 | 0.243 | 0.191 |
| | FCLS | 0.287 | 0.247 | 0.195 |
| Forest | ECO-DEAU | 0.593 | 0.157 | 0.102 |
| | Baseline AE | 0.561 | 0.163 | 0.106 |
| | Attention-AE | **0.628** | **0.136** | **0.080** |
| | Sparse-AE | 0.623 | 0.137 | **0.080** |
| | 1DCNN-AE | 0.441 | 0.167 | 0.098 |
| | FCLS | 0.456 | 0.165 | 0.096 |
| Cropland | ECO-DEAU | **0.802** | **0.075** | **0.034** |
| | Baseline AE | 0.766 | 0.081 | 0.037 |
| | Attention-AE | 0.745 | 0.084 | 0.041 |
| | Sparse-AE | 0.777 | 0.079 | 0.039 |
| | 1DCNN-AE | 0.537 | 0.114 | 0.050 |
| | FCLS | 0.536 | 0.114 | 0.050 |
| Bareland | ECO-DEAU | **0.592** | **0.125** | 0.087 |
| | Baseline AE | 0.551 | 0.130 | **0.083** |
| | Attention-AE | 0.501 | 0.138 | 0.093 |
| | Sparse-AE | 0.496 | 0.138 | 0.094 |
| | 1DCNN-AE | 0.314 | 0.162 | 0.109 |
| | FCLS | 0.303 | 0.163 | 0.111 |
| Impervious | ECO-DEAU | **0.825** | **0.090** | **0.049** |
| | Baseline AE | 0.779 | 0.101 | 0.057 |
| | Attention-AE | 0.717 | 0.115 | 0.067 |
| | Sparse-AE | 0.701 | 0.118 | 0.069 |
| | 1DCNN-AE | 0.498 | 0.153 | 0.088 |
| | FCLS | 0.486 | 0.155 | 0.089 |
| Overall | ECO-DEAU | **0.749** | **0.131** | **0.079** |
| | Baseline AE | 0.645 | 0.133 | 0.086 |
| | Attention-AE | 0.708 | 0.138 | 0.086 |
| | Sparse-AE | 0.704 | 0.139 | 0.087 |
| | 1DCNN-AE | 0.548 | 0.172 | 0.107 |
| | FCLS | 0.542 | 0.174 | 0.107 |
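The R², RMSE, and MAE values reported in Table 2 follow their standard definitions. A small helper for computing them from reference and estimated abundance maps is sketched below; how pixels are aggregated (per class versus pooled, as in the "Overall" rows) is an assumption rather than the authors' documented protocol.

```python
import numpy as np

def abundance_metrics(ref, est):
    """R^2, RMSE, and MAE between reference (e.g., GF-2-derived) and
    estimated abundance fractions, pooled over all pixels."""
    ref = np.asarray(ref, dtype=float).ravel()
    est = np.asarray(est, dtype=float).ravel()
    resid = est - ref
    ss_res = float(np.sum(resid ** 2))                 # residual sum of squares
    ss_tot = float(np.sum((ref - ref.mean()) ** 2))    # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": float(np.sqrt(np.mean(resid ** 2))),
        "MAE": float(np.mean(np.abs(resid))),
    }

# Toy check with four pixels of one endmember's abundance.
ref = np.array([0.10, 0.40, 0.70, 0.90])
est = np.array([0.15, 0.35, 0.75, 0.85])
m = abundance_metrics(ref, est)
```

Here every residual is ±0.05, so RMSE and MAE both equal 0.05 and R² stays close to 1.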
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

Zhou, L.; Li, L.; Li, D.; Bo, Y.; Li, H.; Liu, K.; Wang, S. ECO-DEAU: An Ecologically Constrained Deep Learning Autoencoder for Sub-Pixel Land Cover Unmixing in Arid and Semi-Arid Regions. Remote Sens. 2026, 18, 941. https://doi.org/10.3390/rs18060941