Lightweight Depthwise Autoregressive Convolutional Surrogate for Efficient Joint Inversion of Hydraulic Conductivity and Time-Varying Contaminant Sources

Hu, Caiping; Gao, Shuai; Zhao, Yule; Yu, Dalu; Liu, Chunwei; Xu, Qingyu; Jiang, Simin; Xia, Xuemin

doi:10.3390/w18030380

by Caiping Hu ¹,²,³, Shuai Gao ¹,²,³, Yule Zhao ⁴, Dalu Yu ¹,²,³, Chunwei Liu ¹,²,³, Qingyu Xu ¹,²,³, Simin Jiang ⁵ and Xuemin Xia ¹,²,⁴,*

¹ Shandong Provincial Geo-Mineral Engineering Exploration Institute, Shandong Provincial Bureau of Geology & Mineral Resources, Jinan 250014, China

² Shandong Engineering Research Center for Environmental Protection and Remediation on Groundwater, Jinan 250014, China

³ Key Laboratory of Geological Disaster Risk Prevention and Control, Emergency Management Department of Shandong Province, Jinan 250014, China

⁴ School of Environment and Architecture, University of Shanghai for Science and Technology, Shanghai 200083, China

⁵ College of Civil Engineering, Tongji University, Shanghai 200092, China

* Author to whom correspondence should be addressed.

Water 2026, 18(3), 380; https://doi.org/10.3390/w18030380

Submission received: 30 December 2025 / Revised: 20 January 2026 / Accepted: 28 January 2026 / Published: 2 February 2026

(This article belongs to the Section Water Quality and Contamination)

Abstract

Accurate joint estimation of heterogeneous hydraulic conductivity fields and time-varying contaminant source parameters in groundwater systems constitutes a challenging high-dimensional inverse problem, particularly under sparse observational conditions and high computational demands. To alleviate this limitation, this study proposes an autoregressive depthwise convolutional neural network (AR-DWCNN) as a lightweight surrogate model for coupled groundwater flow and contaminant transport simulations. The proposed model employs depthwise separable convolutions and dense connectivity within an encoder–decoder framework to capture nonlinear flow and spatiotemporal transport dynamics while reducing model complexity and computational demand relative to conventional convolutional architectures. The AR-DWCNN is further integrated with an enhanced Iterative Local Updating Ensemble Smoother incorporating Levenberg–Marquardt regularization, enabling efficient joint inversion of high-dimensional hydraulic conductivity fields and multi-period contaminant source strengths. Numerical experiments conducted on a synthetic two-dimensional heterogeneous aquifer demonstrate that the surrogate-assisted inversion framework achieves posterior estimates that closely match those obtained using the numerical forward model, while significantly improving computational efficiency. These results indicate that the AR-DWCNN-based inversion method provides an effective and scalable solution for high-dimensional groundwater contaminant transport inverse problems, offering practical potential for uncertainty quantification and remediation design in complex subsurface systems.

Keywords: groundwater contamination; surrogate modeling; depthwise separable convolution; autoregressive neural network; inverse modeling

1. Introduction

The effective management of groundwater contamination requires the joint estimation of contaminant source parameters and hydraulic conductivity fields to accurately assess impacts and design remediation strategies [1,2,3]. Given the inaccessibility and complexity of subsurface systems, direct measurement of these parameters is impractical, necessitating reliance on sparse indirect observations such as hydraulic heads and contaminant concentrations from limited monitoring wells. This inference constitutes a high-dimensional inverse problem that demands repeated execution of a numerical model coupling groundwater flow and solute transport until the simulated responses adequately match the observations [4,5]. Substantial aquifer heterogeneity, combined with incomplete knowledge of geological structures, necessitates representing hydraulic properties and release histories with a large number of parameters, leading to a super-linear rise in computational expense for joint estimation [6,7]. These challenges highlight the need for efficient surrogate modeling and inversion techniques to achieve reliable parameter estimation under realistic observational constraints.

Given such limited data resources and a vast search space for finding optimal solutions, various methods have been proposed. Traditional optimization-based approaches, such as least-squares regression [8], nonlinear programming [9], and hybrid optimization with genetic algorithms [10,11], can provide acceptable predictions for release histories but often fail to fully quantify uncertainties in the inverse results [12]. Recent studies have extended traditional optimization to metaheuristic algorithms for parameter estimation in transport models [13,14]. A novel inverse model based on teaching–learning-based optimization (TLBO) was developed for the continuous time random walk-truncated power law (CTRW-TPL) model, demonstrating low sensitivity to initial guesses and higher accuracy in estimating key parameters compared with the standard CTRW MATLAB R2017a toolbox, across synthetic, experimental, and direct numerical simulation breakthrough data in porous and fractured media [13]. In contrast, statistical methods, including data assimilation with extended and ensemble Kalman filters [15,16,17], Bayesian inference based on Markov chain Monte Carlo (MCMC) [18], and geostatistical approaches combined with adjoint methodologies [19], offer capabilities for uncertainty estimation. Ensemble-based data assimilation methods have therefore emerged as an efficient alternative, combining Monte Carlo sampling with sequential or batch data incorporation to jointly estimate parameters and their uncertainties at manageable cost [20,21,22]. The Ensemble Smoother (ES) and its iterative variants assimilate all observations in a single global update while avoiding the state-parameter inconsistencies seen in the Ensemble Kalman Filter [23]. To enhance robustness in the strongly nonlinear and multimodal problems common in contaminant transport, Zhang et al. (2018) developed the Iterative Local Updating Ensemble Smoother (ILUES), which conducts localized ensemble updates using a combined parameter-and-response distance metric to mitigate ensemble collapse and improve convergence [24].

To address these computational burdens, surrogate-based approaches have emerged as cost-effective approximations of the input–output relationships in computationally intensive forward simulators. Recently, deep neural network (DNN)-based surrogates have gained prominence for their ability to capture complex mappings [25,26,27]. Zhu and Zabaras (2018) pioneered image-to-image regression using fully convolutional networks (FCNs) to directly model high-dimensional inputs and outputs as images, transforming surrogate construction into an image regression task and yielding efficient solutions for intricate problems [28]. This idea has since been widely adopted and extended into various efficient encoder–decoder architectures, such as the Attention U-Net for steady-state flow fields [29] and the U-Net for transient flow inversion [30]. These models achieve end-to-end learning, reducing the computational time by orders of magnitude while preserving the accuracy.

Further developments have focused on handling temporal dynamics in contaminant transport. Mo et al. (2019) introduced a deep autoregressive strategy in a dense convolutional encoder–decoder to surrogate groundwater contaminant transport, coupling it with ILUES for high-dimensional parameter estimation [31], while Bai and Tahmasebi (2022) combined this autoregressive idea with a transformer-based model for spatiotemporal forecasting [25]. Extensions include residual dense networks for improved feature reuse in multi-source scenarios [32], ResNet surrogates with enhanced particle filters for DNAPL characterization [33], BPNN surrogates coupled with optimizers for monitoring under uncertainty [34], and hybrid kernel models for sensitivity-driven inversion [35]. These innovations collectively advance surrogates from static snapshots to dynamic full-field time-series approximations, demonstrating improved accuracy in capturing nonlinear dynamics and uncertainties.

Many existing convolutional or recurrent architectures rely on a large number of trainable parameters and dense feature transformations, which substantially increase the training cost and memory requirements when high-resolution spatial fields and long temporal horizons are considered [36]. To address these challenges, lightweight convolutional architectures have emerged as a promising alternative. Depthwise separable convolution, originally developed to reduce computational complexity in computer vision tasks, decomposes standard convolution into channel-wise spatial filtering and pointwise feature fusion, leading to a significant reduction in parameter count and floating-point operations without altering the representational capacity of the network [37]. When combined with dense connectivity and encoder–decoder structures, depthwise convolution enables the efficient extraction and reuse of multiscale spatial features, making it well suited for surrogate modeling of groundwater flow and contaminant transport processes characterized by spatial heterogeneity.

In parallel, accurately capturing temporal dynamics remains a central challenge for surrogate models of contaminant transport. Autoregressive learning strategies provide a flexible framework for modeling transient evolution by explicitly conditioning future states on previously predicted system responses. Compared to recurrent architectures, autoregressive convolutional models avoid sequential backpropagation through time and offer improved numerical stability and scalability for long-term simulations. When integrated with a convolution-based surrogate, this strategy allows the generation of spatially continuous concentration fields over the entire simulation domain at each time step, rather than being limited to sparse observation points.

Despite the growing adoption of surrogate models, their effective integration with inverse modeling frameworks remains a challenging task. Ensemble-based data assimilation methods such as ILUES have demonstrated robustness in addressing strong nonlinearity and multimodality by employing localized ensemble updates. However, convergence can be sensitive to ensemble size and the degree of model nonlinearity, particularly in high-dimensional parameter spaces involving both hydraulic conductivity fields and time-dependent source information. Incorporating Levenberg–Marquardt regularization into the iterative ensemble smoother provides an adaptive mechanism between gradient-based correction and stabilization, enhancing the convergence behavior and mitigating ensemble collapse in ill-posed inverse problems [38].

Motivated by these considerations, an autoregressive depthwise convolutional neural network is developed as a computationally efficient surrogate to replace the forward coupled groundwater flow and contaminant transport model. The proposed network preserves the spatial resolution of the simulation domain and enables continuous temporal prediction through an autoregressive formulation, while substantially reducing the computational cost through depthwise separable convolution. This surrogate is further integrated with an improved Iterative Local Updating Ensemble Smoother with Levenberg–Marquardt regularization (ILUES-LM) to jointly estimate contaminant source characteristics and heterogeneous hydraulic conductivity fields under sparse observation conditions. The combined framework balances predictive accuracy, uncertainty quantification, and computational efficiency, thereby enhancing the practicality of joint inversion of high-dimensional groundwater model parameters.

The remainder of this study is organized as follows: Section 2 describes the methodology, including the governing equations, the construction of the AR-DWCNN surrogate, and the formulation of the ILUES-LM inversion algorithm; Section 3 presents the case study setup and evaluation metrics; Section 4 discusses the results; and Section 5 concludes with key findings and future directions.

2. Methods

2.1. Numerical Simulation Model for Groundwater Flow and Contaminant Transport

The forward model governing the groundwater flow and contaminant transport processes is described by a set of partial differential equations, comprising the flow equation and the advection–dispersion equation. In this study, two-dimensional, steady-state saturated flow in an aquifer is considered, and the flow equation is expressed as

\nabla \cdot (K \nabla h) + W = 0,

(1)

where K is the hydraulic conductivity [LT⁻¹], h is the hydraulic head [L], and W denotes the volumetric flux per unit volume representing sources (positive) or sinks (negative) [T⁻¹].

The Darcy velocity q [LT⁻¹] is calculated using Darcy’s law:

q = - K \nabla h .

(2)

The actual velocity v [LT⁻¹] is then obtained by dividing the Darcy velocity by the effective porosity θ (dimensionless):

v = - (K / θ) \nabla h .

(3)

For the contaminant transport, advection and dispersion are assumed to be the dominant mechanisms, with adsorption, chemical reactions, and molecular diffusion neglected due to their relatively minor contributions in the scenarios considered in this proof-of-concept study. The advection–dispersion equation for contaminant concentration C [ML⁻³] is given by

\partial (θ C) / \partial t + \nabla \cdot (θ v C) - \nabla \cdot (θ D \nabla C) - C_{s} W = 0,

(4)

where C is the contaminant concentration [ML⁻³], D is the hydrodynamic dispersion coefficient tensor [L² T⁻¹], and C_s is the concentration associated with sources or sinks [M L⁻³].

The hydrodynamic dispersion coefficient tensor D is defined as [39]:

D_{i j} = α_{T} |v| δ_{i j} + (α_{L} - α_{T}) v_{i} v_{j} / |v|,

(5)

where α_L and α_T denote the longitudinal and transverse dispersivities [L], respectively; δ_{ij} is the Kronecker delta; v_i and v_j are the components of the velocity vector in the i and j directions; and |v| is the magnitude of the velocity vector.
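Equations (3) and (5) can be checked numerically with a few lines of code. The following is a minimal sketch in plain Python, not the MODFLOW/MT3DMS implementation; the function names are hypothetical, and the numerical values (a uniform K = exp(2.0), porosity 0.3, and a head gradient of −0.05 along x) are assumed purely for illustration.

```python
import math

def seepage_velocity(K, theta, grad_h):
    """Eq. (3): v = -(K / theta) * grad(h), applied component-wise in 2D."""
    return tuple(-(K / theta) * g for g in grad_h)

def dispersion_tensor(v, alpha_L, alpha_T):
    """Eq. (5): D_ij = alpha_T |v| delta_ij + (alpha_L - alpha_T) v_i v_j / |v|."""
    speed = math.hypot(v[0], v[1])
    if speed == 0.0:
        # No flow: mechanical dispersion vanishes (molecular diffusion is neglected).
        return [[0.0, 0.0], [0.0, 0.0]]
    D = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            delta = 1.0 if i == j else 0.0
            D[i][j] = (alpha_T * speed * delta
                       + (alpha_L - alpha_T) * v[i] * v[j] / speed)
    return D

# Assumed illustrative inputs: uniform K, porosity 0.3, dh/dx = -0.05, dh/dy = 0.
v = seepage_velocity(K=math.exp(2.0), theta=0.3, grad_h=(-0.05, 0.0))
D = dispersion_tensor(v, alpha_L=1.5, alpha_T=0.15)
```

For flow aligned with the x axis, the tensor reduces to D_xx = α_L |v| and D_yy = α_T |v| with zero off-diagonal terms, which is a convenient sanity check on Equation (5).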

In this study, molecular diffusion, adsorption, decay, and chemical reactions are neglected in the advection–dispersion equation (Equation (4)), as their contributions are assumed to be minor relative to advection and dispersion in the synthetic scenarios considered, following common simplifications in similar inverse modeling studies [28,31]. This assumption limits the model’s applicability to conservative contaminants but allows focus on the core challenges of high-dimensional inversion under sparse observations. The implications of this simplification for the inversion results and applicability to reactive contaminants are further discussed in Section 4.3.

The groundwater flow and contaminant transport equations are numerically solved using MODFLOW [40] coupled with MT3DMS [41]. In this study, uncertainty arises from the spatially heterogeneous log-hydraulic conductivity field and the time-varying contaminant source release history, while porosity, dispersivities, and initial and boundary conditions are assumed known and fixed. Directly coupling the full MODFLOW–MT3DMS simulations with iterative ensemble-based inversion is computationally prohibitive. To address this, a machine-learning surrogate model is employed as the forward operator, to approximate the relationships between input parameters and output head and concentration fields while significantly reducing the computational costs.

2.2. Autoregressive Depthwise Convolutional Neural Network

To efficiently emulate the nonlinear time-dependent dynamics of groundwater flow and contaminant transport, especially for scenarios with time-varying contaminant sources, this study proposes a lightweight autoregressive depthwise convolutional neural network (AR-DWCNN). The theoretical foundation of the proposed method integrates three key concepts: (1) convolutional encoder–decoder learning for spatially distributed physical fields, (2) depthwise separable convolution for computational efficiency, and (3) autoregressive modeling for temporal dependency representation.

Groundwater flow and contaminant transport processes are governed by coupled partial differential equations in space and time, resulting in strong spatial and temporal correlations. Convolutional neural networks (CNNs) are well suited for learning spatial patterns from gridded physical fields (e.g., hydraulic conductivity and concentration maps), while autoregressive formulations naturally reflect the time-marching nature of transport dynamics.

The proposed AR-DWCNN is conceptually built upon the deep autoregressive neural network framework introduced by Mo et al. (2019) [31] for spatiotemporal sequence prediction. In this framework, the system state at a future time step is modeled as a nonlinear function of the current physical inputs and the system state at previous time steps, enabling the network to approximate the underlying temporal evolution operator of the transport process.

2.2.1. Encoder–Decoder Architecture with Dense Connectivity

Groundwater flow and contaminant transport processes are governed by spatially distributed physical parameters and state variables, which are naturally represented as gridded fields. The AR-DWCNN employs a typical encoder–decoder architecture to capture spatial characteristics from high-dimensional input images (e.g., hydraulic conductivity fields K, and source strength S) and map them to output images (e.g., hydraulic head and contaminant concentration fields).

In the encoder, spatial features are progressively extracted through a series of downsampling operations, allowing the network to capture spatial patterns. The decoder then reconstructs the target output fields by gradually restoring spatial resolution via upsampling layers.

Deeper neural networks generally enhance data-fitting capability but can exacerbate overfitting and convergence difficulties. To mitigate these issues while improving parameter efficiency and reducing the need for extensive training samples, a densely connected structure (dense block) is adopted, following Huang et al. (2017) and Zhu and Zabaras (2018) [28,42]. In this structure, each layer receives the feature maps from all preceding layers as input, and its own output feature maps serve as inputs to all subsequent layers. This dense connectivity minimizes information loss, promotes feature reuse, and maximizes propagation efficiency without significantly increasing the network parameters, thereby enhancing the training efficiency of the surrogate model.

The information flow in a dense block can be expressed as

x_{l} = G ([x_{0}, x_{1}, \dots, x_{l - 1}]),

(6)

where [\cdot] denotes concatenation of the output feature maps from all preceding layers, which requires uniform feature map sizes. G represents a composite operation of batch normalization (BN), ReLU activation, and convolution. The block is parameterized by the number of layers (depth) and growth rate k (output channels per layer). As dense connectivity can lead to rapid growth in the number of feature maps, transition blocks are inserted between adjacent dense blocks. These blocks use 1 × 1 convolutions to compress the channel dimensions and adjust the spatial resolution. In the encoder, transition blocks perform downsampling via strided DS-Conv layers, whereas in the decoder, transposed DS-Conv layers are used for upsampling. This design ensures a balanced trade-off between model expressiveness and computational efficiency.
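The concatenation pattern of Equation (6) can be illustrated with a small NumPy sketch. This is a simplified stand-in, not the network used in this study: the composite operation G is replaced by a random 1 × 1 channel mixing followed by ReLU (batch normalization and spatial convolution are omitted), and the initial channel count F₀ = 48 is an assumed value.

```python
import numpy as np

def dense_block(x0, num_layers, growth_rate, rng):
    """Eq. (6): each layer receives the concatenation of all earlier feature maps."""
    features = [x0]                                   # arrays of shape (channels, H, W)
    for _ in range(num_layers):
        stacked = np.concatenate(features, axis=0)    # [x_0, x_1, ..., x_{l-1}]
        c_in = stacked.shape[0]
        # Simplified G: random 1x1 channel mixing + ReLU (assumption, see lead-in).
        weights = rng.standard_normal((growth_rate, c_in)) / np.sqrt(c_in)
        x_l = np.maximum(np.einsum("oc,chw->ohw", weights, stacked), 0.0)
        features.append(x_l)                          # adds k new feature maps
    return np.concatenate(features, axis=0)

rng = np.random.default_rng(0)
x0 = rng.standard_normal((48, 20, 40))                # F0 = 48 is an assumed value
out = dense_block(x0, num_layers=5, growth_rate=40, rng=rng)
print(out.shape)                                      # (248, 20, 40) = (48 + 5 * 40, 20, 40)
```

The output carries F₀ + L·k channels, and the original input x₀ is still present unchanged in the first F₀ channels, which is the feature reuse that dense connectivity is meant to provide.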

2.2.2. Depthwise Separable Convolution

A key innovation of the proposed AR-DWCNN is the systematic replacement of standard convolutional layers with depthwise separable convolutions (DS-Conv), originally introduced by Howard et al. (2017) to enable efficient deep learning on resource-constrained platforms [37]. In a standard convolution, a set of three-dimensional kernels is applied across all input channels simultaneously. For an input tensor of size (H × W × C) (height, width, and number of input channels), a kernel size of (D × D), and (M) output channels, the number of trainable parameters is (C × M × D × D), and the computational cost scales as approximately (H × W × C × M × D × D).

In contrast, DS-Conv factorizes this operation into two sequential steps (Figure 1). First, a depthwise convolution applies an independent (D × D) spatial filter to each of the (C) input channels, producing (C) intermediate feature maps. This step focuses exclusively on spatial feature extraction within each channel and requires only (C × D × D) parameters, with a computational cost of (H × W × C × D × D). Second, a pointwise convolution performs a (1 × 1) convolution across the intermediate channels to generate (M) output channels, enabling linear channel-wise mixing and combination of features. This step introduces (C × M) parameters and a cost of (H × W × C × M).

The total number of parameters for DS-Conv is therefore (C × D × D + C × M), and the total computational cost is (H × W × (C × D × D + C × M)). Compared with standard convolution, this corresponds to a reduction factor of approximately (1/M + 1/D²). From a theoretical perspective, this factorization preserves expressive power because the depthwise convolution effectively captures local spatial correlations (such as heterogeneity patterns in hydraulic conductivity or concentration fields) independently per channel, while the pointwise convolution subsequently recombines these spatially filtered features across channels. As a result, DS-Conv closely approximates the cross-channel interactions of standard convolution with substantially lower computation.
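The parameter counts above are easy to verify directly. The short sketch below uses hypothetical helper names and assumed layer sizes (C = M = 40, D = 3), and ignores bias terms.

```python
def standard_conv_params(C, M, D):
    """Weights in a standard D x D convolution from C to M channels (no bias)."""
    return C * M * D * D

def ds_conv_params(C, M, D):
    """Depthwise (C * D * D) plus pointwise (C * M) weights (no bias)."""
    return C * D * D + C * M

C, M, D = 40, 40, 3   # illustrative channel counts and kernel size (assumed)
ratio = ds_conv_params(C, M, D) / standard_conv_params(C, M, D)
print(standard_conv_params(C, M, D), ds_conv_params(C, M, D), round(ratio, 4))
# 14400 1960 0.1361
```

For these sizes the ratio equals 1/M + 1/D² = 0.025 + 0.111, so the saving is dominated by the kernel-size term whenever the number of output channels is large.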

2.2.3. Autoregressive Strategy for Temporal Dependencies

When the source strength varies with time, the concentration field at a given target time j depends not only on the static hydraulic conductivity field K and the current source intensity S but also on the concentration fields from all previous time steps. To accurately capture these temporal dependencies, an autoregressive (AR) strategy is integrated into the network, adapting the approach commonly used in time-series forecasting. The concentration field at time step j is thus predicted as

c_{j} = f_{θ} (K, S_{j}, c_{j - 1}),

(7)

where f_θ denotes the parameterized AR-DWCNN. Compared with a conventional non-autoregressive model, this formulation dynamically incorporates the output from the previous time step, enabling a more precise representation of the complex input–output relationships in sequential transport processes. Once the network f_θ is trained, this autoregressive loop enables prediction of the full time series of concentration fields (c_1, c_2, …, c_j) for a given conductivity field K and time-varying source term (S_1, S_2, …, S_j). The prediction process follows the autoregressive sequence illustrated in Figure 2, where each time step’s output becomes the input for the next step, forming a computationally efficient, purely data-driven time-marching scheme.
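The time-marching loop implied by Equation (7) can be sketched as follows. The `toy_step` function is a hypothetical stand-in for the trained network (it is not the AR-DWCNN), and the two-cell field and source values are assumed toy inputs chosen so that the result can be checked by hand.

```python
def rollout(step_fn, K, sources, c0):
    """Apply c_j = f(K, S_j, c_{j-1}) repeatedly and collect every predicted field."""
    fields = []
    c_prev = c0
    for S_j in sources:
        c_prev = step_fn(K, S_j, c_prev)   # the prediction is fed back as the next input
        fields.append(c_prev)
    return fields

def toy_step(K, S_j, c_prev):
    """Hypothetical surrogate: halves the old field and adds a K-weighted source."""
    return [0.5 * c + S_j * k for c, k in zip(c_prev, K)]

K = [1.0, 2.0]                 # toy two-cell conductivity field
sources = [4.0, 0.0, 2.0]      # toy source strengths S_1, S_2, S_3
fields = rollout(toy_step, K, sources, c0=[0.0, 0.0])
print(fields)                  # [[4.0, 8.0], [2.0, 4.0], [3.0, 6.0]]
```

Because each prediction becomes the next input, any error made at one step propagates forward, which is why the one-step accuracy of the surrogate matters for long horizons.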

Training samples are generated from high-fidelity forward simulations. Each simulation run, producing outputs at j time steps, is reformatted into j − 1 autoregressive training pairs: the input consists of the static field K, the current source intensity S_j, and the previous concentration field c_{j−1}, paired with the target concentration field c_j. For N independent simulations, this yields N × (j − 1) training samples, substantially augmenting the effective dataset size without additional forward model evaluations.
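The reformatting of one simulation into autoregressive pairs can be sketched as below. The function name and the data layout (one source value and one field per output time, stored in parallel lists) are assumptions made for illustration.

```python
def make_ar_pairs(K, sources, fields):
    """Turn one simulation into ((K, S_t, c_{t-1}), c_t) training pairs.

    sources[t] and fields[t] refer to the same output time. The first stored
    field has no stored predecessor, so n fields yield n - 1 pairs.
    """
    pairs = []
    for t in range(1, len(fields)):
        inputs = (K, sources[t], fields[t - 1])
        target = fields[t]
        pairs.append((inputs, target))
    return pairs

# Toy example: 8 output times, scalar placeholders instead of 2D fields.
K = "K-field"
sources = [float(t) for t in range(8)]
fields = [10.0 * t for t in range(8)]
pairs = make_ar_pairs(K, sources, fields)
print(len(pairs))   # 7
```

Applying the same function to N simulations and concatenating the results gives the N × (j − 1) samples described above.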

2.2.4. Network Architecture and Training Configuration

A schematic of the complete AR-DWCNN architecture is presented in Figure 3. The encoding path begins with a DS-Conv layer (“In Conv”) that applies a depthwise separable convolution (7 × 7 kernel, stride 2, padding 3) to perform early downsampling and increase the feature depth, transforming the input tensor (C_in channels, H × W spatial size) into a higher-channel feature map at half resolution (F₀ channels, H/2 × W/2).

The encoder path (upper dashed box in Figure 3) consists of stacked dense blocks with progressive feature growth. As shown in the diagram, each dense block adds k new feature maps per layer (k = growth rate), resulting in cumulative channel sizes of F₀, F₀ + k, F₀ + 2k, …, F₀ + L₁k (L₁ = the total number of layers in this block). Curved arrows represent the dense connectivity: every layer’s output is concatenated to the input of all subsequent layers within the block, promoting extensive feature reuse and gradient flow.

Transition layers placed between dense blocks in the encoder apply strided DS-Conv for further downsampling (e.g., from H/2 × W/2 down to H/8 × W/8), compressing channels and controlling the computational load. The input to each transition block is therefore the output of the preceding dense block.

The decoder path (lower dashed box in Figure 3) begins with a dense block that processes the encoded feature maps and then progressively upsamples using transposed depthwise separable convolutions in transition blocks to restore spatial resolution step by step (e.g., H/8 × W/8 → H/4 × W/4 → H/2 × W/2 → H × W). Decoder dense blocks also employ dense connectivity with the same growth rate k. The decoder dense block (with L_{d1} layers) receives the upsampled features and produces output channels that accumulate as F_d = F_in + L_{d1} k.

The final output projection layer reduces the multi-channel features from the last decoder block to the desired output channels (C_out: hydraulic head + contaminant concentration). Key layer components are described at the bottom of Figure 3: Depthwise Convolution (per-channel spatial filtering), Pointwise Convolution (1 × 1 channel-wise mixing), Batch Normalization, and ReLU activation, which constitute the core building block of every DS-Conv operation throughout the network.

The architecture demonstrates a progressive learning process in which multi-scale spatial features are extracted from the input fields in the encoder, compacted and deepened via transition layers, and subsequently reconstructed in the decoder to yield accurate full-domain predictions of head and concentration. The encoder contains two successive dense blocks, while the decoder incorporates one dense block. The number of layers in the three dense blocks (L₁, L₂, and L_{d1}) is set to 5, 10, and 5, respectively, and the growth rate k is set to 40, following the configuration used in the reference AR-Net of Mo et al. (2019) [31]. This selection of hyperparameters ensures consistency with the AR-Net architecture and enables a fair and direct comparison of performance and computational cost between the proposed method and AR-Net.

The input tensor at each autoregressive step is thus x_i ∈ ℝ^{3×40×80} (K field + source encoding + last time-step concentration), while the output is y_i ∈ ℝ^{2×40×80} (current head and concentration). The final output layer uses a Softplus activation for concentration to enforce physical non-negativity and a Sigmoid activation for head (after normalization of the head to [0, 1]) to respect the bounded hydraulic potential.
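The effect of the two output activations can be seen in a small scalar sketch. The helper names are hypothetical, and the head range used for de-normalization (8 to 9) is an assumption made only for this example; the actual normalization bounds are not stated here.

```python
import math

def softplus(x):
    """log(1 + exp(x)), arranged to stay finite for large |x|; never negative."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def sigmoid(x):
    """1 / (1 + exp(-x)), evaluated without overflow for negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def denormalize_head(z, h_min=8.0, h_max=9.0):
    """Map a raw network output z to a head value inside [h_min, h_max] (assumed range)."""
    return h_min + sigmoid(z) * (h_max - h_min)

raw_outputs = [-50.0, -1.0, 0.0, 1.0, 50.0]
concentrations = [softplus(z) for z in raw_outputs]   # all >= 0
heads = [denormalize_head(z) for z in raw_outputs]    # all within [8, 9]
```

Whatever value the last linear layer produces, the concentration output cannot become negative and the head output cannot leave the normalization range.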

The network is trained end-to-end using the loss:

L = \frac{1}{N T} \sum_{i = 1}^{N} \sum_{t = 1}^{T} \lVert c_{i, t} - \hat{c}_{i, t} \rVert_{2}^{2} .

(8)

Optimization is performed with Adam (initial learning rate 0.005, batch size 200). A reduce-on-plateau scheduler decreases the learning rate by a factor of 10 if the validation loss does not improve for 12 consecutive epochs. Training proceeds for a maximum of 200 epochs.
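The learning-rate schedule described above can be written out as a few lines of plain Python. This is a minimal sketch with a hypothetical class name, not the scheduler of any particular deep learning library; library implementations may count the patience window slightly differently.

```python
class PlateauScheduler:
    """Divide the learning rate by `factor` after `patience` epochs without improvement."""

    def __init__(self, lr=0.005, factor=10.0, patience=12):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss and return the (possibly reduced) rate."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                self.lr /= self.factor
                self.bad_epochs = 0
        return self.lr

scheduler = PlateauScheduler()
scheduler.step(1.0)                                   # first epoch sets the best loss
rates = [scheduler.step(1.0) for _ in range(12)]      # 12 epochs with no improvement
```

In this sketch the rate stays at 0.005 for the first eleven stagnant epochs and drops to 0.0005 on the twelfth.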

2.3. Iterative Local Updating Ensemble Smoother with Levenberg–Marquardt Regularization (ILUES-LM)

This study proposes an improved ensemble-based data assimilation algorithm, termed Iterative Local Updating Ensemble Smoother with Levenberg–Marquardt Regularization (ILUES-LM), to jointly identify the contaminant source and hydraulic conductivity field. The method builds directly upon the original ILUES framework developed by Zhang et al. (2018) by introducing an adaptive Levenberg–Marquardt (LM) regularization into the local ensemble update step, while preserving the local updating strategy and the simple iterative scheme [24].

The forward model relationship remains

d = F (m) + ε,

(9)

where d \in R^{N_d} is the vector of observations, m \in R^{N_m} represents the unknown model parameters, F(\cdot) is the forward simulation model, and ε ~ N(0, C_D) represents the measurement errors with covariance matrix C_D.

An initial ensemble of N_e prior realizations M_f = [m_f^1, m_f^2, \dots, m_f^{N_e}] is sampled from the prior distribution, and the corresponding model outputs D_f = F(M_f) are generated. For each ensemble member m_f^j, a local ensemble of size N_{loc} = α N_e (α \in (0, 1)) is first constructed by selecting the members that minimize the weighted distance metric

J(m) = J_1(m) / J_1^{max} + J_2(m) / J_2^{max},

(10)

with J_1(m) = [F(m) - d]^T C_D^{-1} [F(m) - d] and J_2(m) = [m - m_f^j]^T C_{MM}^{-1} [m - m_f^j], where C_{MM} is the auto-covariance of m. Within the local ensemble, the update is performed using the Levenberg–Marquardt-regularized analysis:

m_{a}^{j} = m_{f}^{j} + C_{l o c, f}^{M D} {(C_{l o c, f}^{D D} + C_{D} + λ_{i} I)}^{- 1} (d^{j} - F (m_{f}^{j})),

(11)

where C_{loc,f}^{MD} and C_{loc,f}^{DD} are the cross-covariance between the model parameters and the observations and the auto-covariance of the observations, respectively, computed in the local ensemble. λ_i \geq 0 is the damping parameter applied to the N_d × N_d identity matrix I. The damping parameter λ_i is adjusted adaptively using a trust-region approach. After a candidate update m_a^{cand} is computed, the gain ratio is given as

ρ = (J (m_{f}^{j}) - J (m_{a}^{c a n d})) / (L (m_{f}^{j}) - L (m_{a}^{c a n d})),

(12)

where L(\cdot) is the local quadratic model approximated by the LM step. If ρ > 0.75, the update is accepted, and λ is reduced (λ_{i+1} = λ_i / 3); if ρ < 0.25, the update is rejected, λ is increased (λ_{i+1} = 3 λ_i), and Equation (11) is solved again; otherwise, the update is accepted with an unchanged λ. Typically, 1–3 inner iterations are sufficient. The initial λ_1 is set between 0.1 and 1.0, and the final λ from the previous global iteration is carried forward as the starting value for the next iteration. Finally, one updated member is randomly selected from the updated local ensemble to become the new global member m_a^j, and the updated global ensemble M_a is assembled and assigned to M_f for the next global iteration.
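The acceptance rule and the damping adjustment described above can be summarized in a short sketch. The function names are hypothetical, and treating a non-positive predicted reduction as a failed step is an added assumption rather than part of the stated algorithm.

```python
def gain_ratio(J_old, J_new, L_old, L_new):
    """Eq. (12): actual reduction divided by the reduction predicted by the LM model."""
    predicted = L_old - L_new
    if predicted <= 0.0:
        return 0.0            # assumption: no predicted decrease counts as a failed step
    return (J_old - J_new) / predicted

def adapt_damping(rho, lam, factor=3.0):
    """Return (accepted, new_lambda) following the 0.75 / 0.25 thresholds."""
    if rho > 0.75:
        return True, lam / factor     # good agreement: accept and relax the damping
    if rho < 0.25:
        return False, lam * factor    # poor agreement: reject and damp more strongly
    return True, lam                  # intermediate: accept and keep the damping

rho = gain_ratio(J_old=10.0, J_new=4.0, L_old=10.0, L_new=3.0)   # 6 / 7
accepted, lam = adapt_damping(rho, lam=0.9)
```

In the example the actual reduction is 6 and the predicted reduction is 7, so ρ ≈ 0.86, the step is accepted, and λ is divided by 3.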

Upon completion of I_{max} iterations, the final ensemble M_a provides the posterior parameter samples and associated uncertainty estimates. By introducing adaptive Levenberg–Marquardt damping into the localized update, ILUES-LM improves stability, mitigates ensemble collapse, and accelerates convergence compared with the standard ensemble smoother, making it particularly effective for nonlinear contaminant transport inverse problems in hydrogeology and environmental engineering.
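One local update for a single ensemble member, combining the distance metric of Equation (10) with the damped analysis of Equation (11), might look as follows in NumPy. This is a sketch under stated assumptions rather than the implementation used in this study: the function name and argument layout are hypothetical, perturbed observations are drawn from N(d, C_D), and the trust-region loop on λ is left out (λ is passed in as a fixed value).

```python
import numpy as np

def local_lm_update(M_f, D_f, d_obs, C_D, C_MM_inv, j, alpha, lam, rng):
    """Return the new global member j after one LM-damped local update.

    M_f: (N_m, N_e) parameter ensemble; D_f: (N_d, N_e) simulated observations.
    """
    n_e = M_f.shape[1]
    n_loc = max(2, int(round(alpha * n_e)))
    C_D_inv = np.linalg.inv(C_D)

    # Eq. (10): normalized data misfit plus normalized distance to member j.
    resid = D_f - d_obs[:, None]
    J1 = np.einsum("ie,ik,ke->e", resid, C_D_inv, resid)
    dist = M_f - M_f[:, [j]]
    J2 = np.einsum("ie,ik,ke->e", dist, C_MM_inv, dist)
    J = J1 / J1.max() + J2 / J2.max()
    idx = np.argsort(J)[:n_loc]                 # members forming the local ensemble

    # Covariances estimated from the local ensemble.
    M_loc, D_loc = M_f[:, idx], D_f[:, idx]
    dM = M_loc - M_loc.mean(axis=1, keepdims=True)
    dD = D_loc - D_loc.mean(axis=1, keepdims=True)
    C_MD = dM @ dD.T / (n_loc - 1)
    C_DD = dD @ dD.T / (n_loc - 1)

    # Eq. (11) applied to every local member with perturbed observations.
    d_pert = rng.multivariate_normal(d_obs, C_D, size=n_loc).T
    A = C_DD + C_D + lam * np.eye(d_obs.size)
    M_loc_a = M_loc + C_MD @ np.linalg.solve(A, d_pert - D_loc)

    # One updated local member is drawn at random as the new global member.
    return M_loc_a[:, rng.integers(n_loc)]
```

A larger λ shrinks the correction term toward zero, so in the limit of very strong damping the returned member coincides with one of the prior local members. This is the stabilizing behavior that the adaptive rule exploits when a candidate step is rejected.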

3. Numerical Experiment

3.1. Case Setup

A synthetic two-dimensional numerical case study is employed to demonstrate the accuracy and efficiency of the proposed surrogate model in replacing forward simulations for groundwater flow and contaminant transport, as well as facilitating parameter inversion. The flow regime is assumed to be a steady-state saturated flow, as described in Equation (1), to focus on the dominant lateral advection processes while simplifying the transient effects. As illustrated in Figure 4, the hypothetical aquifer domain spans 20 × 10 [L] and is uniformly discretized into 80 × 40 square grid cells, each measuring 0.25 × 0.25 [L]. Constant head boundaries are prescribed on the left and right sides with values of 9 [L] and 8 [L], respectively, while no-flow conditions are imposed on the upper and lower boundaries to simulate lateral flow dominance.
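The grid geometry can be reproduced in a few lines, which is useful when mapping physical coordinates to cell indices. The helper names, the lower-left origin, and the row/column ordering are assumptions made for this sketch. The linear head profile applies only to a homogeneous aquifer without sources and is included as a simple reference, not as the solution of the heterogeneous case.

```python
LX, LY = 20.0, 10.0          # domain size [L]
NX, NY = 80, 40              # number of cells along x and y
DX, DY = LX / NX, LY / NY    # cell size: 0.25 x 0.25 [L]

def cell_index(x, y):
    """Return (row, col) of the cell containing (x, y); origin at the lower-left corner."""
    col = min(int(x / DX), NX - 1)
    row = min(int(y / DY), NY - 1)
    return row, col

def reference_head(x, h_left=9.0, h_right=8.0):
    """Head in a homogeneous, source-free aquifer: linear between the fixed-head sides."""
    return h_left + (h_right - h_left) * x / LX

row, col = cell_index(10.0, 5.0)     # domain centre
print(DX, DY, row, col)              # 0.25 0.25 20 40
```

With this indexing, each gridded field is an array of shape (NY, NX) = (40, 80).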

A contaminant source with time-varying strengths is incorporated to introduce temporal complexity, located at coordinates (3, 5) [L]. The source is modeled as a continuous injection, characterized by mass-loading rates [MT⁻¹]. The injection occurs over four consecutive stress periods, namely [1–2], [3–4], [5–6], and [7–8] [T]. Within each stress period, the contaminant is continuously released at a constant rate, while the rate varies between periods, resulting in a total of 4 unknown source parameters. Table 1 provides the reference time-varying source strengths and their prior distributions, from which the simulation values are sampled. The known parameters include an effective porosity of 0.3, longitudinal dispersivity $\alpha_{L} = 1.5$ [L], and transverse dispersivity $\alpha_{T} = 0.15$ [L]. The observation data consist of 135 measurements with 5% random error, collected at 15 strategically placed observation wells (as shown in Figure 4). These include 15 hydraulic head observations and 120 contaminant concentration readings recorded at t = [2, 4, 6, 8, 10, 12, 14, 16] [T]. This setup is intentionally adopted as a proof-of-concept study, primarily to enable rigorous evaluation of the proposed lightweight AR-DWCNN surrogate model and its integration with the ILUES-LM algorithm in a high-dimensional joint inversion context under sparse observation conditions, while keeping the computational burden manageable for generating training datasets and performing extensive ensemble-based inversion.

The hydraulic conductivity field K is modeled as a log-Gaussian random field, expressed as K(x, y) = exp(G(x, y)), where G(x, y) follows a normal distribution N(m(x, y), C) with a mean m(x, y) = 2.0. The covariance structure is parameterized by an exponential variogram with variance σ² = 0.5 and correlation lengths $\lambda_{x} = 6$ [L] and $\lambda_{y} = 3$ [L]. To enhance the computational efficiency, the Karhunen–Loève expansion (KLE) is applied for dimensionality reduction of the conductivity field, approximating it as

$$\ln K \approx \langle \ln K \rangle + \sum_{i=1}^{N_{KLE}} \xi_{i} \sqrt{\tau_{i}}\, f_{i}(x), \tag{13}$$

where $\xi_{i}$ are independent standard Gaussian variables, $\tau_{i}$ and $f_{i}(x)$ are the eigenvalues and eigenfunctions of the correlation function, and $\langle \ln K \rangle$ denotes the mean. Retaining the first 317 KLE terms preserves approximately 97% of the total variance. Consequently, the inversion process targets a total of 321 parameters: 317 KLE terms for the hydraulic conductivity field plus the 4 source strength parameters.
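The truncation in Equation (13) can be illustrated with a small discrete example. The sketch below is not the authors' code: it assumes a separable exponential covariance, a cell-centred grid, and a matrix eigendecomposition in place of the continuous eigenproblem, and it uses a coarser grid than the case study to stay cheap. The names `kle_basis` and `sample_log_k` are hypothetical, and the number of retained terms depends on the discretization, so it is not expected to reproduce the 317 terms reported above.

```python
import numpy as np


def kle_basis(nx, ny, cell, sigma2, lam_x, lam_y, energy=0.97):
    """Discrete KLE of a separable exponential covariance on a cell-centred grid."""
    x = (np.arange(nx) + 0.5) * cell
    y = (np.arange(ny) + 0.5) * cell
    gx, gy = np.meshgrid(x, y, indexing="ij")
    px, py = gx.ravel(), gy.ravel()
    # Assumed covariance model: C = sigma^2 * exp(-|dx|/lam_x - |dy|/lam_y)
    cov = sigma2 * np.exp(
        -np.abs(px[:, None] - px[None, :]) / lam_x
        - np.abs(py[:, None] - py[None, :]) / lam_y
    )
    tau, f = np.linalg.eigh(cov)      # eigenvalues returned in ascending order
    tau, f = tau[::-1], f[:, ::-1]    # reorder to descending
    tau = np.clip(tau, 0.0, None)     # guard against round-off negatives
    frac = np.cumsum(tau) / np.sum(tau)
    # Smallest number of terms whose cumulative variance fraction reaches `energy`
    n_kle = min(int(np.searchsorted(frac, energy)) + 1, tau.size)
    return tau[:n_kle], f[:, :n_kle]


def sample_log_k(mean_ln_k, tau, f, rng):
    """Draw one ln K realization: mean + sum_i xi_i * sqrt(tau_i) * f_i."""
    xi = rng.standard_normal(tau.size)
    return mean_ln_k + f @ (np.sqrt(tau) * xi)


tau, f = kle_basis(nx=40, ny=20, cell=0.5, sigma2=0.5, lam_x=6.0, lam_y=3.0)
rng = np.random.default_rng(0)
ln_k = sample_log_k(2.0, tau, f, rng).reshape(40, 20)
k_field = np.exp(ln_k)
```

In an inversion of this kind the coefficients $\xi_{i}$, rather than the cell-wise conductivities, are the unknowns, which is why the parameter count equals the number of retained terms plus the source strengths.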

To construct the surrogate model using the proposed AR-DWCNN, input images are generated to represent the contaminant source locations and strengths alongside the hydraulic conductivity field. The network processes these inputs to predict output images of hydraulic heads and contaminant concentrations. The training dataset comprises 1500 samples, each consisting of a randomly generated hydraulic conductivity realization (via KLE), randomly sampled source strength parameters from the prior distribution in Table 1, and the corresponding head and concentration fields at selected time steps computed by the numerical forward model. An independent testing dataset of 500 randomly generated samples, not used during training, is reserved to evaluate the generalization performance of the trained surrogate model.

3.2. Evaluation Indicators

The predictive performance of the proposed AR-DWCNN surrogate model is quantitatively evaluated using a set of statistical metrics that assess the agreement between the surrogate predictions and high-fidelity numerical simulations. These include the widely adopted coefficient of determination (R²) and root mean square error (RMSE), as well as additional indicators such as mean bias error (MBE), standard deviation of errors (SD), t-statistic (TS), uncertainty at the 95% confidence level (U95), and a global performance indicator (GPI) [14].

The coefficient of determination R² is defined as

$$R^{2} = 1 - \frac{\sum_{i=1}^{n_{t}} \| y_{i} - \hat{y}_{i} \|_{2}^{2}}{\sum_{i=1}^{n_{t}} \| y_{i} - \bar{y}_{i} \|_{2}^{2}}, \tag{14}$$

where $y_{i}$ represents the reference output (hydraulic head or contaminant concentration field) of the i-th testing sample obtained from the full forward model, $n_{t}$ denotes the total number of testing samples, $\hat{y}_{i}$ is the corresponding prediction from the AR-DWCNN, and $\bar{y}_{i}$ is the mean of the reference outputs.

The root mean square error (RMSE) is calculated as

$$\mathrm{RMSE} = \sqrt{\frac{1}{n_{t}} \sum_{i=1}^{n_{t}} \| y_{i} - \hat{y}_{i} \|_{2}^{2}}. \tag{15}$$

Both the L₂ norm and summation are performed over all spatial grid points and all predicted variables (head and concentration fields) within each sample.
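Equations (14) and (15) can be evaluated directly on stacked output fields. The following NumPy sketch is illustrative only; it assumes that each sample is stored along the first array axis and that the mean in the denominator of R² is the mean reference field over the testing samples, which the text does not spell out. The function names are hypothetical.

```python
import numpy as np


def r2_fields(y_ref, y_hat):
    """R^2 per Equation (14); sums run over samples and all grid values."""
    y_ref = np.asarray(y_ref, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    residual = np.sum((y_ref - y_hat) ** 2)
    y_bar = y_ref.mean(axis=0)  # assumption: mean field over the testing samples
    total = np.sum((y_ref - y_bar) ** 2)
    return 1.0 - residual / total


def rmse_fields(y_ref, y_hat):
    """RMSE per Equation (15); normalized by the number of samples only."""
    y_ref = np.asarray(y_ref, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    n_t = y_ref.shape[0]
    return float(np.sqrt(np.sum((y_ref - y_hat) ** 2) / n_t))


# Two tiny "fields" with two cells each, one cell mispredicted in sample 2.
y_ref = np.array([[0.0, 0.0], [2.0, 2.0]])
y_hat = np.array([[0.0, 0.0], [2.0, 0.0]])
r2 = r2_fields(y_ref, y_hat)
rmse = rmse_fields(y_ref, y_hat)
```

Because Equation (15) divides by the number of samples rather than the number of grid values, the RMSE computed this way grows with grid size, which is worth keeping in mind when comparing across discretizations.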

The MBE, SD, TS, and U95 of relative errors are computed as

$$\mathrm{MBE} = \frac{1}{n_{t} \bar{y}} \sum_{i=1}^{n_{t}} (y_{i} - \hat{y}_{i}); \tag{16}$$

$$\mathrm{SD} = \frac{1}{\bar{y}} \frac{\sqrt{\sum_{i=1}^{n_{t}} n_{t} \| y_{i} - \hat{y}_{i} \|_{2}^{2} - \left( \sum_{i=1}^{n_{t}} (y_{i} - \hat{y}_{i}) \right)^{2}}}{n_{t}}; \tag{17}$$

$$\mathrm{TS} = \sqrt{\frac{(N - 1)\, \mathrm{MBE}^{2}}{\mathrm{RMSE}^{2} - \mathrm{MBE}^{2}}}; \tag{18}$$

$$\mathrm{U95} = 1.96 \sqrt{\mathrm{SD}^{2} - \mathrm{RMSE}^{2}}. \tag{19}$$

To integrate the above statistical measures into a single scalar index for comparison, a global performance indicator (GPI) is defined as [14]

$$\mathrm{GPI} = \mathrm{MBE} \times \mathrm{RMSE} \times \mathrm{TS} \times \mathrm{U95} \times (1 - R^{2}). \tag{20}$$

Higher R² (approaching 1) and lower RMSE, MBE, SD, TS, U95, and GPI values indicate better surrogate accuracy and reliability. These metrics are computed separately for hydraulic head and contaminant concentration predictions, as well as for their joint prediction, to comprehensively assess the capability of the surrogate in reproducing the complex coupled flow and transport processes across the entire spatial–temporal domain.

4. Results and Discussion

4.1. Performance of the Proposed Surrogate

The proposed Autoregressive Depthwise Separable Convolutional Neural Network (AR-DWCNN) surrogate achieves predictive accuracy comparable to the baseline AR-Net on the testing set, while offering substantial advantages in training efficiency and model compactness. Across 10 independent realizations, the AR-DWCNN achieves a higher coefficient of determination (R² = 0.989) and a lower root mean square error (RMSE = 0.056), compared with R² = 0.984 and RMSE = 0.067 obtained by AR-Net for concentration fields at the final time step (t = 16 [T]). Additionally, AR-DWCNN exhibits a smaller mean bias error (MBE = 0.017 vs. 0.021), reduced uncertainty (U95 = 0.280 vs. 0.332), and a lower global performance indicator (GPI = 5.004 × 10⁻⁵ vs. 1.405 × 10⁻⁴), indicating improved overall accuracy, stability, and reliability of the surrogate predictions. This comparable fidelity is further illustrated in Figure 5 and Figure 6 for a randomly selected testing sample, which compare plume evolution at selected time steps (t = 4, 8, 12, and 16 [T]) for both surrogates against the reference fields generated by the physics-based forward model. Both AR-DWCNN (Figure 5) and AR-Net (Figure 6) successfully capture the overall shape and migration patterns of the contaminant plume under the influence of heterogeneous hydraulic conductivity and time-varying source releases. The predicted plumes closely align with the reference in terms of spatial distribution and concentration magnitude at all displayed time steps, confirming the effectiveness of the autoregressive strategy in modeling temporal dependencies.

Minor discrepancies between the surrogates and the reference are primarily observed near the contaminant source and in areas where concentration gradients are steep and local variability is high. In these regions, small spatial shifts or smoothing effects inherent to convolutional approximations can lead to relatively larger absolute errors. Such deviations are common in data-driven surrogates trained on limited samples and are more pronounced in areas of rapid concentration change driven by source proximity [29,30]. The visual and quantitative agreement validates the capability of AR-DWCNN to capture the full-field groundwater flow and contaminant transport process, providing a reliable and computationally lightweight alternative to validated architectures such as AR-Net.

Figure 7 provides point-wise scatter plots comparing the predicted versus reference concentrations for all grid cells across the selected testing set at 8 time steps. Both surrogate models perform well in low-concentration regimes (near-zero values), where scatter points closely follow the 1:1 (45°) reference line across all observation times. This indicates an accurate representation of background concentrations and the contaminant plume margins, which occupy the majority of the spatial domain. Such strong agreement in low-concentration regions is consistent with the plume morphology comparisons shown in Figure 5 and Figure 6, demonstrating that the overall plume and low-concentration areas are well captured by both surrogate models.

In high-concentration regions, both models exhibit noticeable deviations, with points scattering below the 45° line, indicating a tendency to underestimate the peak values. These discrepancies are most evident near the contaminant sources and within plume cores, where steep gradients and rapid temporal changes prevail, due to time-varying releases and heterogeneous hydraulic conductivity. The underestimation arises from convolutional smoothing effects and limited training realizations failing to fully resolve extreme localized peaks, a common problem in data-driven surrogates for solute transport. AR-DWCNN shows slightly tighter clustering and higher R² values overall (ranging from 0.995 at t = 2 [T] to 0.983 at later steps) compared to AR-Net (0.979 at t = 2 [T]), demonstrating marginally better handling of high-value variability.

The accuracy of the surrogate predictions at observation locations is particularly relevant for subsequent data assimilation, as these represent the sparse measurements typically available in real-world inverse problems. Figure 8 displays breakthrough curves at 15 monitoring wells across the full observation period for the same randomly selected test realization shown in Figure 5 and Figure 6. Because the contaminant release has a finite duration (it stops after t = 8 [T]), observation wells closer to the source or on the main flow path (e.g., wells 6–9) exhibit complete "rise–peak–decline" features within the simulation time, while more distant or transversely influenced wells (e.g., wells 2–5 and 11–15) remain in the rising phase during the 0–16 [T] period and have not yet reached their peak, and thus do not display a complete breakthrough curve. Both surrogates reproduce general trends effectively, including peak arrival times, maximum concentrations, and tailing behavior at most wells. This agreement confirms that both models provide reliable concentration data at observation points, which is important for driving accurate inversion in data assimilation. AR-DWCNN generally tracks the reference curves more closely than AR-Net, particularly during the rapid rise and peak phases. For instance, at upstream wells (e.g., wells 5, 7, 9) influenced by source proximity, AR-DWCNN better matches the peak timings and magnitudes, while AR-Net shows slight delays or overestimations in some cases. Downstream wells (e.g., wells 11, 13, 14) exhibit smoother, lower-amplitude responses due to dispersion; here, both surrogates perform well, though AR-DWCNN exhibits marginally less deviation in the tailing sections. Across the testing set, the mean RMSE of breakthrough curves for AR-DWCNN is 0.019, compared to 0.032 for AR-Net, confirming higher prediction accuracy at the observation locations despite the lightweight architecture.

The training metrics demonstrate the efficiency of the proposed AR-DWCNN relative to AR-Net when both models are trained for the same number of epochs (200). Despite identical training epochs, AR-DWCNN converges to a lower mean validation loss of 0.012, compared to 0.018 for AR-Net. More importantly, the AR-DWCNN contains only 1,813,730 trainable parameters, a reduction of approximately 48% relative to the 3,490,020 parameters of AR-Net. This decrease in model complexity translates directly into computational savings: training time on a single NVIDIA GeForce RTX 3090 is reduced by about 37% (1590 s versus 2536 s). After training, the single-realization prediction time is comparable for the two models, amounting to 0.1462 s for AR-DWCNN and 0.1572 s for AR-Net. These efficiency gains primarily arise from the adoption of depthwise separable convolutions, which replace standard convolutions and reduce redundant computations without sacrificing the capacity to capture complex spatiotemporal dependencies in groundwater flow and contaminant transport processes. Even under the same 200-epoch training constraint, the AR-DWCNN offers approximately 48% fewer trainable parameters and 37% shorter training time than AR-Net, while delivering comparable or slightly superior predictive accuracy. This lightweight design is particularly advantageous for constructing surrogates under limited computational resources, enabling faster iteration during model development and more efficient deployment in ensemble-based inverse methods.
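The source of the parameter savings can be made concrete with the standard weight counts for a single convolution layer. The sketch below is a generic illustration, not a description of the AR-DWCNN layers: the kernel size and channel widths are hypothetical, biases are ignored, and the function names are made up for this example. It also re-derives the two reductions quoted above from the reported totals.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out


# Hypothetical layer: 3 x 3 kernel, 48 input and 48 output channels.
k, c_in, c_out = 3, 48, 48
std = standard_conv_params(k, c_in, c_out)
dws = depthwise_separable_params(k, c_in, c_out)
layer_ratio = dws / std  # equals 1 / c_out + 1 / k**2

# Network-level figures reported in the text.
param_reduction = 1.0 - 1_813_730 / 3_490_020
time_reduction = 1.0 - 1590 / 2536

print(f"layer weights: {std} -> {dws} ({layer_ratio:.3f} of standard)")
print(f"parameter reduction: {param_reduction:.1%}, training-time reduction: {time_reduction:.1%}")
```

For the hypothetical layer the separable form needs roughly one-eighth of the weights, while the reported network-level totals correspond to reductions of about 48% in parameters and 37% in training time.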

4.2. Inversion Results of ILUES-LM

The ILUES-LM algorithm is employed to invert the 321 parameters using the AR-DWCNN surrogate model, which is trained on a dataset of 1500 forward-model input–output samples. The forward model is fully replaced by this surrogate, eliminating additional forward simulation executions during the inversion process. To evaluate the accuracy and computational efficiency of the high-dimensional parameter inversion combining the AR-DWCNN surrogate with ILUES-LM (surrogate-based ILUES-LM), the results from ILUES-LM using the original physics-based forward model (physics-based ILUES-LM) serve as a reference.

For both approaches, an ensemble size of 3300 and 10 iterations are used to handle the high-dimensional joint inversion and to quantify parameter uncertainty adequately. Figure 9 illustrates how the estimates of the 4 contaminant source strength parameters evolve over the iterations for physics-based ILUES-LM (left column) and surrogate-based ILUES-LM (right column). In both cases, the ensemble means move toward the true values within the early iterations, accompanied by a progressive reduction in ensemble spread. Compared to the physics-based implementation, the surrogate-based ILUES-LM exhibits a slightly slower convergence rate; nevertheless, it continues to evolve steadily and ultimately converges to values close to the reference. The surrogate-based approach therefore shows convergence behavior that is qualitatively consistent with the physics-based inversion, confirming its capability to identify source strengths accurately.

Figure 10 and Figure 11 present the reference hydraulic conductivity field alongside the estimated mean field for the final ensemble, a randomly selected posterior realization, and the variance field of the final ensemble, obtained from the physics-based ILUES-LM and surrogate-based ILUES-LM, respectively. As shown in Figure 10, the physics-based ILUES-LM successfully reconstructs the main patterns of the reference field. The mean estimate clearly shows the continuous high-conductivity zones (yellow-orange areas) and surrounding low-conductivity regions (blue areas). The randomly selected posterior realization preserves the spatial connectivity and variability of these zones. The variance field is generally low across most of the domain, with slightly higher values in a few isolated points, reflecting limited influence from distant observations.

The surrogate-based ILUES-LM produces very similar results (Figure 11). The mean estimate captures the overall structure of high- and low-conductivity zones, with only minor underestimation. The selected posterior realization exhibits heterogeneity and continuity comparable to the physics-based case. The variance field is marginally higher, which can be attributed to residual discrepancies between the selected observations and the true values, leading to a systematic bias in the ensemble inversion and, consequently, elevated uncertainty levels. Another possible reason is the sparsity of the observations relative to the high-dimensional parameter space, which amplifies the impact of these surrogate errors on uncertainty propagation.

Executing physics-based ILUES-LM requires 36,300 forward model runs (3300 for initial ensemble plus 33,000 across 10 iterations), imposing a significant computational burden and reducing the inversion efficiency. In contrast, surrogate-coupled ILUES-LM yields comparable results for high-dimensional parameter estimation to physics-based ILUES-LM. Notably, it requires no additional forward runs during inversion, with only 2000 simulations (1500 training + 500 testing) used for surrogate construction. The dramatic reduction underscores the efficiency gains enabled by the AR-DWCNN surrogate while maintaining an inversion accuracy nearly identical to the physics-based reference.

To quantify the efficiency gains, the total computational costs are compared. For physics-based ILUES-LM, the cost is T_F = T_f × N_f, where T_f is the time per forward model run and N_f = 36,300. For surrogate-based ILUES-LM, the cost includes the forward runs for the surrogate samples (T_f × N₀, with N₀ = 2000) plus the surrogate training time T_t = 1590 s, giving T_s = T_f × N₀ + T_t. Assuming T_f ≈ 10 s (a conservative estimate for the physics-based simulator in the numerical case of this study), T_F ≈ 363,000 s, while T_s ≈ 21,590 s, representing an acceleration of approximately 17 times. Even larger speedups are expected for higher-dimensional problems or more expensive forward models, as the fixed surrogate construction cost is amortized over the inversion process.
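The cost comparison above is simple enough to check numerically. The short Python sketch below reproduces it with the figures quoted in the text; the function names are hypothetical. The optional `t_pred` and `n_eval` arguments are an addition that is not part of the estimate in the text: they add surrogate evaluation time during inversion, under the assumption of sequential, unbatched predictions at the reported 0.1462 s each.

```python
def physics_cost(t_f: float, n_f: int) -> float:
    """Total time when every ensemble evaluation uses the forward model."""
    return t_f * n_f


def surrogate_cost(t_f: float, n_0: int, t_train: float,
                   t_pred: float = 0.0, n_eval: int = 0) -> float:
    """Sample generation plus training, optionally plus surrogate evaluations."""
    return t_f * n_0 + t_train + t_pred * n_eval


T_f = 10.0      # s per forward run (value assumed in the text)
N_f = 36_300    # forward runs for physics-based ILUES-LM
N_0 = 2_000     # forward runs for training and testing samples
T_t = 1590.0    # s, surrogate training time

T_F = physics_cost(T_f, N_f)
T_s = surrogate_cost(T_f, N_0, T_t)
speedup = T_F / T_s

# Assumption beyond the text: include sequential surrogate predictions.
T_s_eval = surrogate_cost(T_f, N_0, T_t, t_pred=0.1462, n_eval=N_f)
speedup_eval = T_F / T_s_eval

print(f"T_F = {T_F:.0f} s, T_s = {T_s:.0f} s, speedup = {speedup:.1f}x")
print(f"with surrogate evaluations: speedup = {speedup_eval:.1f}x")
```

With surrogate evaluations included under that sequential assumption the ratio is somewhat lower, though batched prediction on a GPU would likely narrow the gap.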

These results align with the trends in recent surrogate-coupled ensemble frameworks, where lightweight deep learning approximations dramatically cut forward evaluations without compromising posterior fidelity [28,29,30]. The AR-DWCNN surrogate thus enables the practical application of data assimilation to complex real-world contaminant source identification tasks that would otherwise be computationally prohibitive.

4.3. Limitations and Potential Improvements

While the proposed AR-DWCNN surrogate model coupled with the ILUES-LM algorithm demonstrates promising performance in terms of predictive accuracy and computational efficiency for the joint inversion of hydraulic conductivity fields and time-varying contaminant source strengths, several limitations inherent to the current study should be acknowledged.

First, the numerical experiments are conducted in a synthetic two-dimensional domain under steady-state saturated flow conditions. This setup serves primarily as a proof-of-concept, allowing focused evaluation of the surrogate’s ability to capture spatiotemporal transport dynamics under high-dimensional parameter uncertainty and sparse observations, while keeping the computational demands manageable for extensive ensemble-based inversion. However, real-world aquifers are typically three-dimensional, exhibit transient flow regimes, and are influenced by complex recharge–discharge processes, vertical flow components, and multi-scale geological structures. These simplifications limit the direct transferability of the results to field-scale applications and may underestimate the challenges associated with increased dimensionality, data scarcity, and non-stationarity in practical scenarios.

Second, the solute transport model considers only advection and mechanical dispersion (Equations (4) and (5)), neglecting molecular diffusion, adsorption, decay, and biogeochemical reactions. This assumption is reasonable for conservative contaminants in the synthetic cases examined and aligns with many previous inverse modeling studies focusing on source identification and heterogeneity. However, in real-world groundwater systems, if the reactive parameters of the contaminant are known a priori from site characterization or laboratory data, the proposed surrogate modeling and inversion framework holds strong application potential: by simply incorporating the relevant reaction terms into the high-fidelity forward simulations used to generate the training dataset for the AR-DWCNN, the surrogate can learn to approximate the full reactive transport dynamics without requiring fundamental changes to the network architecture or inversion algorithm. This extension represents a direction for future research to enhance the method’s applicability to more complex reactive contaminant problems.

Third, the AR-DWCNN is designed as a data-driven surrogate without explicit incorporation of governing physical equations as hard constraints, in contrast to physics-informed neural networks (PINNs). While this choice prioritizes computational efficiency and scalability for high-dimensional problems, it may lead to minor deviations from fundamental physical principles in regions that are sparsely represented in the training data or characterized by sharp spatial gradients. Future research will therefore focus on incorporating physical mechanisms into the learning framework, for instance by introducing physics-based regularization terms into the loss function or by developing hybrid data–physics modeling strategies. Such extensions are expected to further improve the physical consistency, interpretability, and robustness of the surrogate model, particularly under more challenging extrapolation and generalization scenarios.

Finally, the current case does not account for pumping or injection stresses, which are common in managed aquifers and can significantly alter flow paths and contaminant migration patterns. This omission was intentional to isolate the effects of natural-gradient heterogeneity and time-varying source release, but it reduces the framework’s immediate relevance to actively managed groundwater systems. However, pumping rates can be incorporated as additional time-varying input channels in the training data and inversion parameters without changing the network architecture or algorithm. Extending the framework to actively managed sites constitutes another direction for future investigation.

Despite these limitations, the developed lightweight surrogate-assisted inversion framework provides a robust and efficient foundation for addressing high-dimensional groundwater inverse problems. Future research will focus on extending the methodology to three-dimensional transient systems, incorporating reactive transport processes, embedding physical constraints into the surrogate model, and including anthropogenic stresses such as pumping to enhance its applicability to real-world contaminated sites.

5. Conclusions

This study introduces a computationally efficient and robust framework for the joint inversion of heterogeneous hydraulic conductivity and time-varying contaminant source strengths under sparse and noisy observational data. The key innovation is a lightweight autoregressive depthwise convolutional neural network (AR-DWCNN) that serves as a high-fidelity surrogate for the coupled groundwater flow and solute transport processes. By replacing standard convolutions with depthwise separable convolutions and incorporating dense connectivity within an encoder–decoder architecture, the AR-DWCNN achieves predictive accuracy comparable to the established AR-Net while using only 1,813,730 trainable parameters (a 48% reduction) and requiring 37% less training time. Under identical training budgets of 200 epochs, the proposed model exhibits faster convergence, reaching a lower validation loss than AR-Net (0.012 versus 0.018), demonstrating that depthwise separable convolutions effectively reduce computational cost without sacrificing the ability to capture complex spatiotemporal dependencies.

The surrogate model is integrated with an enhanced Iterative Local Updating Ensemble Smoother incorporating adaptive Levenberg–Marquardt regularization (ILUES-LM). This coupling achieves convergence stability in nonlinear and high-dimensional inverse problems. In a synthetic two-dimensional heterogeneous aquifer case, the surrogate-assisted ILUES-LM yields posterior distributions of hydraulic conductivity and source parameters that closely match those obtained using the numerical forward model, while reducing the overall computational cost by more than one order of magnitude (an approximately 17-fold acceleration). The framework successfully reproduces plume evolution, identifies time-varying source strengths, and provides uncertainty-aware inverse estimates, despite the challenges posed by high-dimensional parameters and limited observational data.

The combination of AR-DWCNN and ILUES-LM offers an effective balance between predictive accuracy, computational efficiency, and uncertainty quantification. The results, while obtained in a synthetic setting, validate the main methodological contributions: the lightweight surrogate design, the autoregressive formulation for transient dynamics, and the regularized local ensemble updating strategy for robust inversion in nonlinear and high-dimensional problems. The proposed framework shows the potential to extend to more realistic and challenging hydrogeological scenarios. Future research will focus on extending to three-dimensional transient systems, incorporating reactive transport processes, embedding physical constraints (e.g., through regularization or hybrid approaches), and integrating pumping stresses as additional input channels or parameters, thereby enhancing the scalability, robustness, and real-world applicability.

Author Contributions

Conceptualization, X.X. and S.J.; methodology, X.X.; software, Y.Z.; validation, C.H., S.G., and D.Y.; formal analysis, C.L.; investigation, Q.X.; resources, S.G., D.Y., C.L., and Q.X.; data curation, D.Y.; writing—original draft preparation, C.H., S.G., and X.X.; writing—review and editing, X.X. and S.J.; visualization, Y.Z. and X.X.; supervision, X.X. and S.J.; project administration, C.H. and S.G.; funding acquisition, S.G., S.J., and X.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Open Fund Project of Shandong Engineering Research Center for Environmental Protection and Remediation on Groundwater (Grant No. 801KF2024-5), the National Natural Science Foundation of China (Grant No. 42402252 and No. 42572309), and the open project program of MOE Key Laboratory of Groundwater Circulation and Environmental Evolution, China University of Geosciences (Beijing).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Song, X.; Demirkanli, I.; Hou, Z.; Lin, X.; Karanovic, M.; Tonkin, M.; Mackley, R. Integrating analytical solutions and U-Net model for predicting groundwater contaminant plumes in pump-and-treat systems. Adv. Water Resour. 2025, 202, 105002.
Wang, Z.; Lu, W.; Chang, Z.; Bai, Y.; Xu, Y. Simultaneous identification of groundwater contamination source information, model parameters, and boundary conditions under an unknown boundary mode. Stoch. Environ. Res. Risk Assess. 2024, 38, 4085–4106.
Chen, J.; Dai, Z.; Yin, S.; Zhang, M.; Soltanian, M.R. Enhancing inverse modeling in groundwater systems through machine learning: A comprehensive comparative study. Hydrol. Earth Syst. Sci. 2025, 29, 4251–4279.
Swetha, K.; Eldho, T.I.; Singh, L.G.; Kumar, A.V. Flow and transport parameter estimation of a confined aquifer using simulation-optimization model. Model. Earth Syst. Environ. 2024, 10, 4013–4026.
Sahranavard, H.; Mohtashami, A.; Mohtashami, E.; Akbarpour, A. Inverse modeling application for aquifer parameters estimation using a precise simulation-optimization model. Appl. Water Sci. 2023, 13, 58.
Panzeri, M.; Riva, M.; Guadagnini, A.; Neuman, S.P. Data assimilation and parameter estimation via ensemble Kalman filter coupled with stochastic moment equations of transient groundwater flow. Water Resour. Res. 2013, 49, 1334–1344.
Thomas, A.; Majumdar, P.; Eldho, T.I.; Rastogi, A.K. Simulation optimization model for aquifer parameter estimation using coupled meshfree point collocation method and cat swarm optimization. Eng. Anal. Bound. Elem. 2018, 91, 60–72.
Moravej, M.; Amani, P.; Hosseini-Moghari, S.M. Groundwater level simulation and forecasting using interior search algorithm-least square support vector regression (ISA-LSSVR). Groundw. Sustain. Dev. 2020, 11, 100447.
Safavi, H.R.; Darzi, F.; Mariño, M.A. Simulation-optimization modeling of conjunctive use of surface water and groundwater. Water Resour. Manag. 2010, 24, 1965–1988.
Sreekanth, J.; Datta, B. Coupled simulation-optimization model for coastal aquifer management using genetic programming-based ensemble surrogate models and multiple-realization optimization. Water Resour. Res. 2011, 47, W04516.
Patel, S.; Eldho, T.I.; Rastogi, A.K. Hybrid-metaheuristics based inverse groundwater modelling to estimate hydraulic conductivity in a nonlinear real-field large aquifer system. Water Resour. Manag. 2020, 34, 2011–2028.
Yoon, H.; Hart, D.B.; McKenna, S.A. Parameter estimation and predictive uncertainty in stochastic inverse modeling of groundwater flow: Comparing null-space Monte Carlo and multiple starting point methods. Water Resour. Res. 2013, 49, 536–553.
Mehdinejadiani, B. A novel inverse model insensitive to initial guesses for estimating parameters of continuous time random walk-truncated power law model. J. Hydrol. 2025, 658, 133206.
Maroufi, H.; Mehdinejadiani, B. A comparative study on using metaheuristic algorithms for simultaneously estimating parameters of space fractional advection-dispersion equation. J. Hydrol. 2021, 602, 126757.
Zovi, F.; Camporese, M.; Franssen, H.J.H.; Huisman, J.A.; Salandin, P. Identification of high-permeability subsurface structures with multiple point geostatistics and normal score ensemble Kalman filter. J. Hydrol. 2017, 548, 208–224.
Keller, J.; Franssen, H.J.H.; Nowak, W. Investigating the pilot point ensemble Kalman filter for geostatistical inversion and data assimilation. Adv. Water Resour. 2021, 155, 104010.
Xu, T.; Gómez-Hernández, J.J. Joint identification of contaminant source location, initial release time, and initial solute concentration in an aquifer via ensemble Kalman filtering. Water Resour. Res. 2016, 52, 6587–6595.
Bai, Y.; Lu, W.; Li, J.; Chang, Z.; Wang, H. Groundwater contamination source identification using improved differential evolution Markov chain algorithm. Environ. Sci. Pollut. Res. 2022, 29, 19679–19692.
Zhuang, C.; Illman, W.A.; Yu, X.; Yan, L.; Wu, J.; Dou, Z.; Wang, J.; Zhou, Z. Geostatistical inverse modeling to characterize the transience of streambed hydraulic conductivity. J. Hydrol. 2023, 626, 130325.
Yang, S.; Tsai, F.T.C.; Bacopoulos, P.; Kees, C.E. Comparative analyses of covariance matrix adaptation and iterative ensemble smoother on high-dimensional inverse problems in high-resolution groundwater modeling. J. Hydrol. 2023, 625, 130075.
Kang, X.; Kokkinaki, A.; Kitanidis, P.K.; Shi, X.; Lee, J.; Mo, S.; Wu, J. Hydrogeophysical characterization of nonstationary DNAPL source zones by integrating a convolutional variational autoencoder and ensemble smoother. Water Resour. Res. 2021, 57, e2020WR028538.
He, L.; Cheng, H.; Nan, Z.; Gong, Y.; Guo, H.; Mao, J.; Zhang, J. Improving joint identification of groundwater contaminant source and non-Gaussian distributed conductivity field using a deep learning-based ensemble smoother. J. Hydrol. 2025, 658, 133202.
Emerick, A.A.; Reynolds, A.C. Ensemble smoother with multiple data assimilation. Comput. Geosci. 2013, 55, 3–15.
Zhang, J.; Lin, G.; Li, W.; Wu, L.; Zeng, L. An iterative local updating ensemble smoother for estimation and uncertainty assessment of hydrologic model parameters with multimodal distributions. Water Resour. Res. 2018, 54, 1716–1733.
Bai, T.; Tahmasebi, P. Characterization of groundwater contamination: A transformer-based deep learning model. Adv. Water Resour. 2022, 164, 104217.
Müller, J.; Park, J.; Sahu, R.; Varadharajan, C.; Arora, B.; Faybishenko, B.; Agarwal, D. Surrogate optimization of deep neural networks for groundwater predictions. J. Global Optim. 2021, 81, 203–231.
Chen, J.; Dai, Z.; Dong, S.; Zhang, X.; Sun, G.; Wu, J.; Ershadnia, R.; Yin, S.; Soltanian, M.R. Integration of deep learning and information theory for designing monitoring networks in heterogeneous aquifer systems. Water Resour. Res. 2022, 58, e2022WR032429.
Zhu, Y.; Zabaras, N. Bayesian deep convolutional encoder-decoder networks for surrogate modeling and uncertainty quantification. J. Comput. Phys. 2018, 366, 415–447.
Taccari, M.L.; Nuttall, J.; Chen, X.; Wang, H.; Minnema, B.; Jimack, P.K. Attention U-Net as a surrogate model for groundwater prediction. Adv. Water Resour. 2022, 163, 104169.
Lauzon, D. A U-Net architecture as a surrogate model combined with a geostatistical spectral algorithm for transient groundwater flow inverse problems. Adv. Water Resour. 2024, 189, 104726. [Google Scholar] [CrossRef]
Mo, S.; Zabaras, N.; Shi, X.; Wu, J. Deep autoregressive neural networks for high-dimensional inverse problems in groundwater contaminant source identification. Water Resour. Res. 2019, 55, 3856–3881. [Google Scholar] [CrossRef]
Xia, X.; Jiang, S.; Zhou, N.; Cui, J.; Li, X. Groundwater contamination source identification and high-dimensional parameter inversion using residual dense convolutional neural network. J. Hydrol. 2023, 617, 129013. [Google Scholar] [CrossRef]
Xu, Y.; Lu, W.; Pan, Z.; Wang, Z.; Luo, C.; Bai, Y. Intelligent enhanced particle filter with deep residual network surrogate for accurate groundwater pollution source characterization. J. Hydrol. 2024, 642, 131904. [Google Scholar] [CrossRef]
Guo, X.; Luo, J.; Lu, W.; Dong, G.; Pan, Z. Optimal design of groundwater pollution monitoring network based on a back-propagation neural network surrogate model and grey wolf optimizer algorithm under uncertainty. Environ. Monit. Assess. 2024, 196, 132. [Google Scholar] [CrossRef] [PubMed]
Hou, Z.; Zhao, K.; Wang, S.; Wang, Y.; Lu, W. Bayesian hybrid-kernel machine-learning-assisted sensitivity analysis and sensitivity-relevant inverse modeling for groundwater DNAPL contamination. J. Hydrol. 2024, 633, 131009. [Google Scholar] [CrossRef]
Luo, J.; Ma, X.; Ji, Y.; Li, X.; Song, Z.; Lu, W. Review of machine learning-based surrogate models of groundwater contaminant modeling. Environ. Res. 2023, 238, 117268. [Google Scholar] [CrossRef]
Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861. [Google Scholar] [CrossRef]
Chen, Y.; Oliver, D.S. Levenberg-Marquardt forms of the iterative ensemble smoother for efficient history matching and uncertainty quantification. Comput. Geosci. 2013, 17, 689–703. [Google Scholar] [CrossRef]
Zheng, N.; Li, Z.; Xia, X.; Gu, S.; Li, X.; Jiang, S. Estimating line contaminant sources in non-Gaussian groundwater conductivity fields using deep learning-based framework. J. Hydrol. 2024, 630, 130727. [Google Scholar] [CrossRef]
Harbaugh, A.W.; Banta, E.R.; Hill, M.C.; McDonald, M.G. The US Geological Survey Modular Ground-Water Model-the Ground-Water Flow Process. US Geol. Surv. Tech. Water Resour. Investig. 2005, 6, 253.
Zheng, C.; Wang, P.P. MT3DMS: A Modular Three-Dimensional Multispecies Transport Model for Simulation of Advection, Dispersion, and Chemical Reactions of Contaminants in Groundwater Systems; Documentation and User’s Guide; Contract Report SERDP-99-1; U.S. Army Engineer Research and Development Center: Vicksburg, MS, USA, 1999.
Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708.

Figure 1. Depthwise separable convolution neural network architecture with depthwise and pointwise convolution operations.

Figure 2. Sequential prediction of concentration fields from time t₁ to t_j, given conductivity K and time-varying source S, using the trained autoregressive neural network f_θ.
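The sequential prediction described in the Figure 2 caption can be sketched as a simple rollout loop, in which each predicted field is fed back as input for the next step. This is only an illustrative sketch: `rollout` and `toy_step` are hypothetical names, `toy_step` is a stand-in for the trained network f_θ rather than the AR-DWCNN itself, and the assumption that f_θ takes the conductivity field, one source strength, and the previous field as inputs is ours, not a detail confirmed by this section.

```python
from typing import Callable, List, Sequence

Field = List[List[float]]  # a 2-D grid stored as nested lists


def rollout(
    f_theta: Callable[[Field, float, Field], Field],
    K: Field,
    sources: Sequence[float],
    c0: Field,
) -> List[Field]:
    """Predict one field per source value, feeding each output back as input."""
    fields: List[Field] = []
    c_prev = c0
    for s in sources:
        c_prev = f_theta(K, s, c_prev)  # prediction at t_j depends on t_{j-1}
        fields.append(c_prev)
    return fields


def toy_step(K: Field, s: float, c_prev: Field) -> Field:
    """Hypothetical stand-in for f_theta: halve the old field, add s at cell (0, 0).

    K is accepted only to mirror the assumed signature; this toy ignores it.
    """
    c_next = [[0.5 * v for v in row] for row in c_prev]
    c_next[0][0] += s
    return c_next


K = [[1.0, 1.0], [1.0, 1.0]]    # placeholder conductivity field
c0 = [[0.0, 0.0], [0.0, 0.0]]   # initial concentration field
preds = rollout(toy_step, K, [2.0, 4.0], c0)
```

With a trained surrogate, `toy_step` would be replaced by a call to the network; the loop itself stays the same, which is also why prediction errors can accumulate over successive steps.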

Figure 3. Schematic diagram of the autoregressive depthwise separable convolutional neural network (AR-DWCNN) architecture: encoder–decoder structure with depthwise separable convolutions.

Figure 4. Flow domain configuration and reference log-conductivity field of the study area, with the specified boundary conditions. The red triangle denotes the contaminant source and the black circles denote the observation wells. The number next to each observation well is its well number.

Figure 5. Comparison of contaminant concentration fields (top four rows) and hydraulic head field (bottom row) at selected time steps (t = 4, 8, 12, and 16 [T]) for a randomly selected test sample: reference fields from the high-fidelity forward model (left column), AR-DWCNN predictions (middle column), and error fields (right column).

Figure 6. Comparison of contaminant concentration fields (top four rows) and hydraulic head field (bottom row) at selected time steps (t = 4, 8, 12, and 16 [T]) for a randomly selected test sample: reference fields from the high-fidelity forward model (left column), AR-Net predictions (middle column), and error fields (right column).

Figure 7. Comparison of predicted and actual concentrations for AR-DWCNN and AR-Net at multiple time steps.

Figure 8. Comparison of breakthrough curves at 15 observation locations obtained from the physics-based model and two surrogate models (AR-DWCNN and AR-Net).

Figure 9. Comparison of physics-based and AR-DWCNN model inversion results for multiple source strength parameters using ILUES-LM. The horizontal red dashed lines indicate the actual values of the source strengths.

Figure 10. The reference log-conductivity field, a random posterior realization, and the mean and variance of the final ensemble generated by the physics-based ILUES-LM method.

Figure 11. The reference log-conductivity field, a random posterior realization, and the mean and variance of the final ensemble generated by the AR-DWCNN-based ILUES-LM method.

Table 1. Prior distributions and actual values of the contaminant release strengths.

Parameter	S1	S2	S3	S4
Actual value [MT⁻¹]	6.224	6.057	3.242	5.615
Prior range	U[0, 8]	U[0, 8]	U[0, 8]	U[0, 8]
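
As a minimal sketch of how an initial ensemble of source strengths could be drawn from the uniform priors in Table 1, the snippet below samples S1 to S4 independently from U[0, 8]. The function name `sample_prior_ensemble`, the ensemble size of 100, and the seed are illustrative assumptions, not values taken from the paper; the conductivity field, which is also part of the joint inversion, is not sampled here.

```python
import random
from typing import List

PRIOR_LOW, PRIOR_HIGH = 0.0, 8.0   # U[0, 8] for every release period (Table 1)
N_PERIODS = 4                      # S1..S4

# Actual (reference) values from Table 1, in [M T^-1]
ACTUAL = [6.224, 6.057, 3.242, 5.615]


def sample_prior_ensemble(n_members: int, seed: int = 0) -> List[List[float]]:
    """Draw n_members independent samples of (S1, ..., S4) from the prior."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    return [
        [rng.uniform(PRIOR_LOW, PRIOR_HIGH) for _ in range(N_PERIODS)]
        for _ in range(n_members)
    ]


# Hypothetical ensemble size, chosen only for illustration
ensemble = sample_prior_ensemble(100)
```

All four reference values lie inside the prior support, so the prior does not exclude the true source strengths.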


Share and Cite

MDPI and ACS Style

Hu, C.; Gao, S.; Zhao, Y.; Yu, D.; Liu, C.; Xu, Q.; Jiang, S.; Xia, X. Lightweight Depthwise Autoregressive Convolutional Surrogate for Efficient Joint Inversion of Hydraulic Conductivity and Time-Varying Contaminant Sources. Water 2026, 18, 380. https://doi.org/10.3390/w18030380

