Connectivity-Aware LSTM-PSO for Water Injection Allocation in Offshore Waterflooding Reservoirs

Wei, Feng; Chen, Xiaoquan; Pang, Guoqiang; Li, Wei; Chen, Peng; Jiao, Shixiang

doi:10.3390/pr14132065


by Feng Wei ¹, Xiaoquan Chen ¹, Guoqiang Pang ², Wei Li ¹, Peng Chen ² and Shixiang Jiao ³,*

¹ CNOOC (China) Zhanjiang Branch, Zhanjiang 524057, China
² CNOOC EnerTech-Drilling & Production Co., Zhanjiang 524057, China
³ School of Computer Science and Software Engineering, Southwest Petroleum University, Chengdu 610500, China
* Author to whom correspondence should be addressed.

Processes 2026, 14(13), 2065; https://doi.org/10.3390/pr14132065 (registering DOI)

Submission received: 19 May 2026 / Revised: 11 June 2026 / Accepted: 23 June 2026 / Published: 25 June 2026

(This article belongs to the Topic Advanced Technology for Oil and Nature Gas Exploration)


Abstract

Water injection allocation is critical for maintaining pressure support in mature offshore waterflooding reservoirs, but its optimization is complicated by delayed injection–production responses, interwell interference, limited intervention windows, and incomplete field labels for injector–producer connectivity. This study proposes a connectivity-aware optimization framework that couples an attention-based connectivity identification network, a group-level long short-term memory (LSTM) production surrogate, and particle swarm optimization (PSO). The methodological novelty lies in using prescribed connectivity labels in a field-informed semi-synthetic benchmark to quantitatively test whether dynamic injection–production sequences and static well-pair attributes can be transformed into interpretable connectivity estimates for injection allocation decision support. The benchmark contains five injectors, ten producers, daily injection and production histories, static well-pair attributes, response lags, and normalized connectivity coefficients generated under practical injection rate, lag, water cut, and adjustment constraints. The attention model recovered the dominant injector–producer relationships with MAE = 0.0146, RMSE = 0.0240, R² = 0.9835, cosine similarity = 0.9962, and top-three overlap = 100%. The group-level LSTM achieved MAE = 4.524 m³/d, RMSE = 5.963 m³/d, MAPE = 1.255%, and R² = 0.964 on the chronological test set. Across 15 optimization cases, the PSO module generated feasible injection reallocations under single-well rate, total-injection balance, and ±15% adjustment constraints. The results should be interpreted as controlled methodological validation rather than direct field deployment; further testing with anonymized field data is required.

Keywords:

water injection allocation; injector–producer connectivity; long short-term memory; particle swarm optimization; semi-synthetic waterflooding dataset

1. Introduction

Waterflooding is one of the most widely used development strategies for maintaining reservoir pressure and improving oil recovery in mature oilfields. By injecting water into the reservoir, pressure support and displacement efficiency can be enhanced, thereby delaying production decline and improving the sweep efficiency of remaining oil. Classical waterflooding theory indicates that the effectiveness of water injection development is strongly controlled by reservoir heterogeneity, injection–production balance, and spatial communication between injectors and producers [1,2]. As waterflooding reservoirs enter the middle-to-high water-cut stage, the injection–production relationship becomes increasingly complex due to interwell interference, preferential flow paths, uneven displacement, and delayed production responses. These challenges are particularly prominent in offshore oilfields, where production operations are constrained by limited offshore intervention windows, high operation costs, and strict safety requirements.

Water injection allocation is a key operational task in waterflooding reservoir management. An appropriate injection allocation scheme can improve reservoir energy distribution, enhance the effective displacement of remaining oil, and reduce ineffective water circulation. In conventional field practice, injection–production adjustment is usually conducted based on production performance analysis, engineering experience, and periodic reservoir surveillance. Although these approaches have played an important role in reservoir management, they are often limited by low analysis frequency, strong dependence on expert knowledge, and insufficient quantitative description of dynamic injector–producer relationships. In addition, the production response to injection adjustment is usually delayed and nonlinear, making it difficult to rapidly evaluate candidate injection schemes and continuously update injection allocation strategies.

Injector–producer connectivity is an important concept for characterizing the dynamic relationship between water injectors and oil producers. It reflects the contribution of each injector to the production response of surrounding producers and provides an interpretable basis for injection allocation optimization. Albertoni and Lake [3] proposed a practical method to infer interwell connectivity from well-rate fluctuations in waterfloods, demonstrating that production and injection rate data can provide useful information about interwell communication. Capacitance–resistance models have also been widely used as reduced-order tools for rapid waterflood performance evaluation and connectivity analysis [4]. de Holanda et al. [5] provided a comprehensive review of capacitance–resistance models and summarized their applications in reservoir characterization and performance forecasting. Compared with tracer tests, interference tests, and full-physics numerical reservoir simulations, these data-driven or semi-analytical methods are computationally efficient and easier to apply in field-scale production management. However, their performance may still be affected by simplified assumptions, limited nonlinear representation ability, and insufficient adaptability to complex dynamic production conditions.

In recent years, neural-network-based methods have further expanded the capability of dynamic interwell connectivity analysis. Jiang et al. [6] developed an interpretable recurrent neural network with a self-attention mechanism to characterize flow disequilibrium in waterflooding reservoirs. Du et al. [7] combined graph convolutional networks for adaptive interwell-connectivity correction with gated recurrent units for performance prediction. Huang et al. [8] proposed an improved graph neural network for dynamic interwell connectivity analysis in multi-layer waterflooding reservoirs. More recently, attention-guided fusion and state-variable capacitance models have been introduced to improve the dynamic characterization of injector–producer relationships [9,10]. These studies indicate that combining dynamic production data with interpretable learning structures is a promising direction for connectivity-aware reservoir management.

With the development of digital oilfields and intelligent reservoir management, data-driven methods have also been increasingly used for production forecasting and injection optimization. Machine learning models can learn nonlinear mapping relationships from historical injection–production data without explicitly solving complex multiphase flow equations. Long short-term memory networks are particularly suitable for time-series modeling because they can capture temporal dependencies and delayed responses in sequential data [11]. Previous studies have shown that LSTM and other deep learning models can provide efficient alternatives to traditional reservoir simulation for production forecasting and dynamic performance evaluation [12,13,14,15]. In addition, attention mechanisms can adaptively assign weights to different input features or response channels, thereby improving both representation ability and model interpretability [16]. For example, Pan et al. [17] proposed a CNN-LSTM model with a self-attention mechanism for oil well production prediction, and Lu et al. [18] introduced a Transformer-based Seq2Seq method with attention for post-liquid-lifting production forecasting.

Although single-well production prediction has been widely studied, practical water injection allocation is generally performed at the well-group or pattern scale. Therefore, group-level production forecasting is more consistent with field-scale injection optimization than isolated single-well prediction. Yang et al. [19] recently proposed a data-driven framework for well-group oil production prediction and water injection recommendation, showing the value of linking production forecasting with injection scheme optimization. These studies provide useful references for the present work, in which a group-level production surrogate is developed to support fast evaluation of candidate injection allocation schemes.

Optimization algorithms provide another important component for intelligent water injection management. Data-driven interwell numerical simulation has been used for waterflood history matching and production optimization, providing an efficient alternative to repeated full-physics simulation in some field-management tasks [20]. Particle swarm optimization, originally proposed by Kennedy and Eberhart [21], has been widely used for nonlinear and constrained engineering optimization problems because of its simple implementation, global search capability, and flexibility in handling complex objective functions. Multi-objective variants of PSO have also been developed for constrained engineering optimization [22]. In reservoir engineering, PSO and related evolutionary algorithms have been applied to production optimization, well control optimization, and water injection allocation. Jia et al. [23] proposed a data-driven optimization method for fine water injection in a mature oilfield. Farahi et al. [24] used multi-objective particle swarm optimization for production optimization under geological and economic uncertainties. Rostamian et al. [25] further reviewed multi-objective model-based oil and gas field development optimization and emphasized the computational burden of simulation-driven optimization workflows. These studies demonstrate the potential of PSO-based optimization, but the integration of dynamic connectivity identification, group-level production surrogate modeling, and constrained injection allocation remains insufficiently explored.

Despite these advances, three issues remain insufficiently resolved for practical water injection allocation. First, connectivity identification, production forecasting, and injection allocation are often treated as separate tasks, so the optimized scheme is difficult to interpret from an injector–producer-response perspective. Second, many data-driven optimization studies focus on numerical improvement but provide limited evidence that the learned relationships are consistent with reservoir-engineering intuition. Third, complete offshore field datasets with daily injection–production histories and independently verified interwell connectivity labels are rarely available, which makes controlled validation difficult. Therefore, a reproducible benchmark and an integrated workflow are required to evaluate whether dynamic injection–production sequences and static well-pair attributes can jointly support interpretable and constrained injection optimization.

The present study addresses these gaps by developing a connectivity-aware LSTM-PSO framework and testing it on a field-informed semi-synthetic waterflooding dataset. The dataset is not intended to replace a complete offshore field case. Instead, it provides a controlled validation environment in which interwell connectivity, response lags, operational constraints, and production responses are known and can be quantitatively assessed. This distinction is important because the proposed workflow should be interpreted as methodological validation before field deployment.

Recent machine-learning studies have demonstrated the value of integrating geological and operational factors for subsurface engineering problems, including fracture-stimulation design, shale-gas production forecasting, and neural-network-based geomechanical sensitivity analysis [26,27,28]. These studies support the broader motivation of combining static geological-spatial descriptors with dynamic operational data, but the joint solution of connectivity identification, group-level forecasting, and constrained injection allocation remains insufficiently developed for offshore waterflooding scenarios.

The main contributions of this study are summarized as follows.

First, an attention-based injector–producer connectivity identification model is proposed. The model integrates historical injection sequences, producer-response sequences, and static well-pair features to infer normalized connectivity coefficients, providing an interpretable representation of dynamic interwell relationships.

Second, a group-level LSTM production surrogate model is developed for short-term oil production forecasting. Instead of predicting individual-well production separately, the model forecasts well-group oil production responses, which are directly consistent with the objective of injection allocation optimization.

Third, a constrained LSTM–PSO workflow is established for daily water injection allocation. The PSO algorithm searches for feasible injection allocation schemes under practical field constraints, including single-well injection-rate limits, total-injection balance, and adjustment-amplitude restrictions.

Fourth, a field-informed semi-synthetic dataset is constructed for reproducible evaluation of the proposed method. The dataset includes well locations, static well-pair attributes, daily injection data, production responses, prescribed connectivity ground truth, and optimization cases, enabling systematic validation of connectivity identification, production forecasting, and injection optimization.

The remainder of this paper is organized as follows. Section 2 describes the construction of the field-informed semi-synthetic waterflooding dataset and formulates the injection allocation optimization problem. Section 3 presents the proposed methodology, including the attention-based connectivity model, the group-level LSTM surrogate model, and the PSO-based optimization algorithm. Section 4 reports the experimental results and performance evaluation. Section 5 summarizes the main conclusions and discusses the limitations and future work.

2. Dataset Construction and Problem Formulation

2.1. Rationale and Dataset Overview

Complete offshore oilfield datasets that simultaneously contain daily injection records, production responses, operating constraints, and independently verified injector–producer connectivity labels are difficult to obtain because of confidentiality restrictions and the high cost of direct connectivity verification. In routine waterflooding management, interwell connectivity is usually inferred indirectly from tracer tests, interference tests, production response analysis, or calibrated reservoir models. Therefore, a field-informed semi-synthetic dataset was constructed in this study to provide a controlled and reproducible benchmark for method validation.

The dataset was not generated by arbitrary random sampling. Instead, it was constrained by typical offshore waterflooding characteristics, including practical injection-rate ranges, layered injection behavior, delayed injection–production responses, gradual water-cut increase, and feasible injection adjustment limits. This design allows the underlying connectivity coefficients to be prescribed as ground truth, while the generated dynamic data still retain the main engineering features of offshore waterflooding reservoirs.

The constructed well group contains five water injectors and ten oil producers. Daily dynamic data were generated over 720 days. A three-layer reservoir setting was considered to approximate layered injection and vertical heterogeneity. The dataset was used to support three connected tasks: injector–producer connectivity identification, well-group oil production forecasting, and constrained water injection allocation optimization. Figure 1 illustrates the well pattern and prescribed injector–producer connectivity network, and Figure 2 presents representative injection and production response dynamics.

The semi-synthetic dataset was designed as a field-informed benchmark rather than as a full substitute for a confidential offshore oilfield dataset. Its representativeness is reflected by practical injection-rate constraints, delayed and smoothed injection–production response, gradual water-cut increase, and prescribed dominant injector–producer communication patterns. Nevertheless, it cannot fully reproduce geological uncertainty, surveillance errors, workover events, facility constraints, or long-term reservoir evolution. Therefore, the results are used for controlled methodological validation, and further validation with anonymized field data or history-matched numerical models is required before deployment.

Table 1 and Table 2 summarize the dataset components and key construction settings. The dataset contains five injectors, ten producers, 50 injector–producer pairs, and 720 daily records. The chronological LSTM sample split contains 669 sliding-window samples, including 468 training samples, 100 validation samples, and 101 test samples. Continuous input variables were standardized using statistics fitted only on the training set.

2.2. Static Well-Pair Features and Connectivity Ground Truth

For each injector–producer pair, a set of static features was constructed to describe the spatial and geological relationship between the two wells. These features include interwell distance, layer consistency, effective-thickness overlap, and a permeability-link proxy. The interwell distance between injector i and producer j was calculated using Equation (1):

$$d_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2} \tag{1}$$

where $d_{ij}$ is the interwell distance, and $(x_i, y_i)$ and $(x_j, y_j)$ are the coordinates of injector i and producer j, respectively.

Based on distance, layer consistency, thickness overlap, and permeability linkage, the static connectivity prior was constructed using Equation (2):

$$P_{ij} = \exp\left(-\frac{d_{ij}}{D_0}\right)\left(0.70 + 0.30\,L_{ij}\right)\left(0.75 + 0.25\,T_{ij}\right)\left(0.80 + \frac{0.20\,K_{ij}}{K_0}\right) \tag{2}$$

where $P_{ij}$ is the static prior between injector i and producer j, $D_0$ is the characteristic interwell influence distance, $L_{ij}$ is the layer-consistency indicator, $T_{ij}$ is the effective-thickness overlap ratio, $K_{ij}$ is the permeability-link proxy, and $K_0$ is the normalization factor. In this benchmark, $D_0$ was set to 700 m and $K_0$ was set to 1.5, consistent with the scaling used for the permeability-link proxy.
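As a worked illustration of Equations (1) and (2), the following minimal sketch evaluates the distance and static prior for a single well pair. The coordinates and attribute values are hypothetical and chosen only for illustration; only $D_0$ = 700 m and $K_0$ = 1.5 come from the text.

```python
import math

D0 = 700.0  # characteristic influence distance, m (value from the text)
K0 = 1.5    # permeability-link normalization factor (value from the text)


def interwell_distance(inj_xy, prod_xy):
    """Equation (1): Euclidean distance between two well locations."""
    return math.hypot(inj_xy[0] - prod_xy[0], inj_xy[1] - prod_xy[1])


def static_prior(d_ij, same_layer, thickness_overlap, perm_link):
    """Equation (2): distance decay scaled by three bounded modifiers."""
    return (
        math.exp(-d_ij / D0)
        * (0.70 + 0.30 * same_layer)
        * (0.75 + 0.25 * thickness_overlap)
        * (0.80 + 0.20 * perm_link / K0)
    )


# Hypothetical well pair: coordinates and attributes are made up.
d = interwell_distance((0.0, 0.0), (300.0, 400.0))
p = static_prior(d, same_layer=1, thickness_overlap=0.6, perm_link=1.2)
print(f"d = {d:.1f} m, P = {p:.4f}")
```

Each modifier is bounded between its constant term and 1 when the attribute lies in its nominal range, so the exponential distance decay sets the overall scale of the prior.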

The static prior was used only to guide dataset construction and to provide a baseline for comparison. It was not treated as the final connectivity result, because actual injector–producer communication is also affected by dynamic production responses, operational changes, and time-lag effects.

For each producer, the three injectors with the strongest prior scores were retained as dominant contributing injectors, while the remaining injectors were assigned weak background connectivity. A stochastic perturbation term was introduced to represent unresolved geological uncertainty. The resulting raw scores were normalized using Equation (3) to obtain the ground-truth connectivity coefficients:

$$C_{ij} = \frac{S_{ij}}{\sum_{i=1}^{N_I} S_{ij}}, \quad C_{ij} \geq 0, \quad \sum_{i=1}^{N_I} C_{ij} = 1 \tag{3}$$

where $C_{ij}$ represents the normalized contribution of injector i to the production response of producer j, $S_{ij}$ is the perturbed raw connectivity score, and $N_I$ is the number of injectors. A larger coefficient indicates stronger injector–producer communication.
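The label construction for one producer can be sketched as follows. This is an illustrative reading of the procedure, not the authors' generator: it assumes the settings reported in Section 2.3 (weak-connection factor uniform on 0 to 0.08, lognormal perturbation with sigma = 0.18) and assumes the lognormal factor has zero log-mean. The prior values are hypothetical.

```python
import random


def connectivity_column(prior_scores, rng, top_k=3, weak_max=0.08, sigma=0.18):
    """Normalized coefficients C_ij for one producer j (Equation (3)).

    prior_scores holds the static priors P_ij of all injectors for this producer.
    """
    order = sorted(range(len(prior_scores)),
                   key=lambda i: prior_scores[i], reverse=True)
    dominant = set(order[:top_k])
    raw = []
    for i, prior in enumerate(prior_scores):
        if i in dominant:
            # Dominant injector: lognormal perturbation (zero log-mean assumed).
            raw.append(prior * rng.lognormvariate(0.0, sigma))
        else:
            # Non-dominant injector: weak background connectivity.
            raw.append(prior * rng.uniform(0.0, weak_max))
    total = sum(raw)
    return [score / total for score in raw]


rng = random.Random(0)
priors = [0.42, 0.15, 0.31, 0.08, 0.27]  # hypothetical priors, five injectors
c = connectivity_column(priors, rng)
print([round(v, 3) for v in c])
```

The column sums to one by construction, and the three injectors with the largest priors keep almost all of the weight.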

A response lag time was also assigned to each well pair. The lag time increases with interwell distance and decreases when the injector and producer share the same dominant layer, as expressed in Equation (4).

To make the prescribed connectivity labels reproducible, the static well-pair attributes were defined as interwell distance, layer-consistency flag, effective-thickness overlap, and permeability-link proxy. The static prior, stochastic perturbation, and normalization in Equations (2) and (3) define benchmark labels for controlled validation and should not be interpreted as universal physical laws.

$$\tau_{ij} = \mathrm{clip}\left(10 + \frac{d_{ij}}{55} + \varepsilon_{ij} - 6\,L_{ij},\ 5,\ 45\right) \tag{4}$$

where $\tau_{ij}$ is the response lag time, $\varepsilon_{ij}$ is a random perturbation term, and the clipping function restricts the lag to 5–45 days. This setting is used to mimic delayed subsurface flow response in heterogeneous waterflooding reservoirs.
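A short sketch of Equation (4) makes the clipping behaviour concrete. The function name and the example inputs are illustrative; the perturbation is passed in explicitly so that the deterministic part can be checked by hand.

```python
def response_lag(d_ij, same_layer, eps_ij):
    """Equation (4): lag in days, clipped to the 5-45 day range."""
    raw = 10.0 + d_ij / 55.0 + eps_ij - 6.0 * same_layer
    return min(max(raw, 5.0), 45.0)


# Hypothetical pairs: 550 m apart in different layers, then a very distant pair.
print(response_lag(550.0, same_layer=0, eps_ij=0.0))
print(response_lag(5000.0, same_layer=0, eps_ij=0.0))
```

With no perturbation, a 550 m pair in different layers gives 10 + 10 = 20 days, and sharing the dominant layer shortens that lag by 6 days.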

2.3. Generation of Dynamic Injection and Production Data

Daily injection data were generated for each injector by combining a base injection level, periodic operational fluctuation, gradual random variation, stepwise adjustment events, and short abnormal operating intervals. The injection rate was constrained within 300–1200 m³/d. Injection pressure was generated according to Equation (5):

$$p_i(t) = p_{i,0} + 0.010\left[q_i^{\mathrm{inj}}(t) - q_{i,\mathrm{base}}^{\mathrm{inj}}\right] + 0.25\sin\left(\frac{2\pi t}{90} + \varphi_i\right) + \varepsilon_{p,i}(t) \tag{5}$$

where $p_{i,0}$ is the baseline injection pressure, $q_i^{\mathrm{inj}}(t)$ is the daily injection rate, $q_{i,\mathrm{base}}^{\mathrm{inj}}$ is the base injection rate, $\varphi_i$ is a phase term, and $\varepsilon_{p,i}(t)$ is the pressure perturbation. In this benchmark, $p_{i,0}$ was sampled from 10.5 to 16.5 MPa, $\varepsilon_{p,i}(t)$ followed a zero-mean Gaussian perturbation with a standard deviation of 0.18 MPa, and the generated pressure was clipped to 8.0–22.5 MPa. This pressure variable was used to reproduce practical pressure-rate co-variation in the semi-synthetic benchmark, not to represent calibrated field bottom-hole pressure. Layer-wise allocation ratios were also generated for the three-layer system, with the ratios varying slowly over time to represent layered injection adjustment.

Producer responses were generated by combining natural production decline, delayed injection-driven response, operational fluctuation, and water-cut evolution. The liquid production response of each producer was controlled by the ground-truth connectivity coefficients and the corresponding response lags. Water cut was generated as a gradually increasing variable influenced by development time and cumulative injection influence. Daily oil production was then calculated from liquid production and water cut. The well-group oil production rate was obtained by summing the oil rates of all producers, as shown in Equation (6).

The semi-synthetic data generation procedure is summarized as follows. First, five injectors and ten producers were placed in a 2500 m × 1800 m well-group domain, with three reservoir layers and injector/producer effective-thickness and kh-proxy values sampled from the ranges reported in Table 2. Second, static injector–producer attributes were calculated using interwell distance, same-layer flag, effective-thickness overlap, and permeability-link proxy. The static prior used D₀ = 700 m and K₀ = 1.5, the top three dominant injectors were retained for each producer, and non-dominant weak connections were multiplied by a random factor uniformly sampled from 0 to 0.08. The raw dominant-connectivity scores were perturbed by a lognormal factor with σ = 0.18 and then normalized. Third, response lags were generated using Equation (4), where ε_{ij} follows N(0, 3²) days. Fourth, daily injector rates were generated from a base rate of 520–920 m³/d, a sinusoidal component with amplitude 40 m³/d, a random-walk component with daily noise N(0, 1.8²), five step-change events, and measurement/operation noise N(0, 18²), followed by clipping to 300–1200 m³/d. Fifth, injection pressure, layer-wise allocation ratios, producer liquid rates, water-cut evolution, and daily oil rates were generated using the rules described above. These parameter values define the reproducible benchmark used in this study rather than universal reservoir laws.

$$Q^{\mathrm{oil}}(t) = \sum_{j=1}^{N_P} q_j^{\mathrm{oil}}(t) \tag{6}$$

where $Q^{\mathrm{oil}}(t)$ is the well-group oil production rate at time t, $q_j^{\mathrm{oil}}(t)$ is the daily oil production rate of producer j, and $N_P$ is the number of producers.
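The fourth step of the generation procedure (daily injector rates) can be sketched as follows. Only the base-rate range, sinusoid amplitude, noise levels, number of step events, and clipping range come from the text. The sinusoid period, the step magnitude, and the timing of step events are not stated and are assumed here; short abnormal operating intervals are omitted.

```python
import math
import random


def injector_rates(n_days, rng, period=90.0, n_steps=5, step_max=80.0):
    """Daily rates (m3/d) for one injector; period and step_max are assumed."""
    base = rng.uniform(520.0, 920.0)          # base rate range from the text
    phase = rng.uniform(0.0, 2.0 * math.pi)
    step_days = set(rng.sample(range(1, n_days), n_steps))
    walk, level, rates = 0.0, 0.0, []
    for t in range(n_days):
        walk += rng.gauss(0.0, 1.8)           # random-walk component
        if t in step_days:
            level += rng.uniform(-step_max, step_max)  # step-change event
        q = (
            base + level + walk
            + 40.0 * math.sin(2.0 * math.pi * t / period + phase)
            + rng.gauss(0.0, 18.0)            # measurement/operation noise
        )
        rates.append(min(max(q, 300.0), 1200.0))  # clip to operating range
    return rates


rates = injector_rates(720, random.Random(42))
print(len(rates), round(min(rates), 1), round(max(rates), 1))
```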

2.4. Production Forecasting Samples

The production forecasting task was defined at the well-group level rather than the individual-well level. This design is consistent with the optimization objective, which aims to maximize the cumulative oil production of the whole well group under operational constraints. For each sample, a historical dynamic window was used as input. The input features include injector-level injection rates and pressures, total injection rate, average injection pressure, well-group oil production rate, well-group liquid production rate, well-group water cut, and smoothed or differential dynamic indicators.

A supervised sample is defined by one valid chronological sliding window of the well-group dynamic sequence. With 720 daily records, a 45-day look-back window, and a 7-day forecasting horizon, the valid windows are constructed chronologically and then split by target start date into training, validation, and test subsets. This definition avoids random shuffling across future dates and prevents information leakage from later production responses into earlier training windows.

The input dynamic sequence is denoted by Equation (7):

$$X_t = \left[x_{t-T_L},\ x_{t-T_L+1},\ \ldots,\ x_{t-1}\right] \tag{7}$$

where $X_t$ is the historical input sequence, $x_t$ is the multivariate dynamic feature vector at time t, and $T_L$ is the look-back window length.

A 7-day forecasting horizon was used in this study. The prediction target is defined by Equation (8).

The production surrogate was defined at the well-group level because the injection allocation decision is implemented under well-group constraints. The total injection rate is kept constant, and PSO redistributes daily injection among injectors to improve the predicted cumulative oil production of the entire group. However, group-level forecasting may mask adverse responses at individual producers, such as local water-channeling or early water breakthrough; field application therefore requires individual-well screening.

$$Y_t = \left[Q^{\mathrm{oil}}(t),\ Q^{\mathrm{oil}}(t+1),\ \ldots,\ Q^{\mathrm{oil}}(t+H_f-1)\right] \tag{8}$$

where $H_f$ is the forecasting horizon. To improve training stability, the model was trained to predict the relative change in well-group oil production rather than directly predicting absolute production rates. The predicted relative changes were then converted back to absolute oil production rates for performance evaluation.
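The window construction in Equations (7) and (8) can be sketched as below, using a toy univariate series in place of the multivariate feature vectors. The relative-change transform is an assumption: the text does not state the reference value, so the last observed rate in the look-back window is used here.

```python
LOOK_BACK, HORIZON = 45, 7  # days, as stated in the text


def make_windows(series, look_back=LOOK_BACK, horizon=HORIZON):
    """Return (t, X_t, Y_t) for every valid target start index t."""
    samples = []
    for t in range(look_back, len(series) - horizon + 1):
        history = series[t - look_back:t]  # x_{t-T_L}, ..., x_{t-1}
        target = series[t:t + horizon]     # Q(t), ..., Q(t+H_f-1)
        samples.append((t, history, target))
    return samples


def to_relative(history, target):
    """Assumed transform: change relative to the last observed rate."""
    ref = history[-1]
    return [(q - ref) / ref for q in target]


def to_absolute(history, relative):
    """Invert to_relative so errors can be reported in m3/d."""
    ref = history[-1]
    return [ref * (1.0 + r) for r in relative]


# Toy declining oil-rate series, 720 daily records.
series = [400.0 - 0.05 * day for day in range(720)]
samples = make_windows(series)
# Chronological split by target start date; no shuffling.
train, val, test = samples[:468], samples[468:568], samples[568:]
print(len(samples), len(train), len(val), len(test))  # 669 468 100 101
```

The count follows from 720 − 45 − 7 + 1 = 669 valid windows, which matches the sample totals reported for the benchmark.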

2.5. Water Injection Allocation Optimization Problem

The objective of water injection allocation optimization is to determine the daily injection rates of all injectors so that the predicted cumulative oil production of the well group over a future 30-day horizon is maximized. The decision variable is the injection allocation vector composed of the daily injection rates of all injectors, as defined in Equation (9):

$$q^{\mathrm{inj}} = \left[q_1^{\mathrm{inj}},\ q_2^{\mathrm{inj}},\ \ldots,\ q_{N_I}^{\mathrm{inj}}\right]^{T} \tag{9}$$

The optimization objective is formulated in Equation (10):

$$\max_{q^{\mathrm{inj}}} \sum_{t=1}^{H} \hat{Q}^{\mathrm{oil}}\left(t \mid q^{\mathrm{inj}}\right) \tag{10}$$

where H = 30 days is the optimization horizon, and $\hat{Q}^{\mathrm{oil}}$ is the predicted well-group oil production rate under the candidate injection allocation vector. A 30-day optimization horizon was selected to match short-term offshore injection allocation review and adjustment cycles. This horizon is long enough to capture delayed injection–production responses but short enough to limit cumulative surrogate uncertainty. Therefore, the optimization objective is interpreted as a short-term decision-support metric rather than long-term recovery optimization.

The optimization was performed under three practical constraints. First, each injector must satisfy its lower and upper injection-rate limits, as given by Equation (11):

$$q_{i,\min}^{\mathrm{inj}} \leq q_i^{\mathrm{inj}} \leq q_{i,\max}^{\mathrm{inj}} \tag{11}$$

Second, the total injection rate of the well group is kept equal to the current total injection rate to maintain injection balance, as given by Equation (12):

$$\sum_{i=1}^{N_I} q_i^{\mathrm{inj}} = \sum_{i=1}^{N_I} q_{i,0}^{\mathrm{inj}} \tag{12}$$

Third, the adjustment amplitude of each injector is limited to ±15% of its current injection rate to ensure field operability, as given by Equation (13):

$$\left|q_i^{\mathrm{inj}} - q_{i,0}^{\mathrm{inj}}\right| \leq \gamma\, q_{i,0}^{\mathrm{inj}} \tag{13}$$

where $q_{i,0}^{\mathrm{inj}}$ is the current injection rate before optimization, and $\gamma$ is the maximum allowable adjustment ratio. In this study, $\gamma$ was set to 15%. This formulation ensures that the optimized injection allocation scheme improves the predicted oil production response while remaining consistent with practical injection capacity and operational constraints.
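The three constraints in Equations (11) to (13) can be checked, and an infeasible candidate repaired, with a sketch such as the following. The repair rule (clip to the box, then spread the remaining imbalance in proportion to the available room) is one simple option assumed here for illustration; it is not stated to be the constraint handling used in this study. The rate limits of 300 and 1200 m³/d are borrowed from the data-generation range, and the example rates are hypothetical.

```python
def bounds(q0, q_min=300.0, q_max=1200.0, gamma=0.15):
    """Per-injector interval implied by Equations (11) and (13)."""
    lo = [max(q_min, (1.0 - gamma) * q) for q in q0]
    hi = [min(q_max, (1.0 + gamma) * q) for q in q0]
    return lo, hi


def is_feasible(q, q0, tol=1e-6):
    """True if q satisfies the box limits and the total balance (Eq. (12))."""
    lo, hi = bounds(q0)
    in_box = all(l - tol <= v <= h + tol for v, l, h in zip(q, lo, hi))
    return in_box and abs(sum(q) - sum(q0)) <= tol


def repair(q, q0):
    """Assumed repair: clip, then distribute the imbalance by available room."""
    lo, hi = bounds(q0)
    q = [min(max(v, l), h) for v, l, h in zip(q, lo, hi)]
    gap = sum(q0) - sum(q)
    room = [(h - v) if gap > 0 else (v - l) for v, l, h in zip(q, lo, hi)]
    total_room = sum(room)
    if total_room > 0.0:
        share = min(1.0, abs(gap) / total_room)
        sign = 1.0 if gap > 0 else -1.0
        q = [v + sign * share * r for v, r in zip(q, room)]
    return q


q0 = [520.0, 680.0, 750.0, 820.0, 920.0]         # hypothetical current rates
candidate = [700.0, 600.0, 700.0, 900.0, 800.0]  # e.g. a raw PSO particle
fixed = repair(candidate, q0)
print(is_feasible(candidate, q0), is_feasible(fixed, q0))
```

Because the current allocation always lies inside its own box, the available room is at least as large as the imbalance, so a single proportional step restores the total without leaving the box.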

2.6. Evaluation Metrics

The proposed framework was evaluated from three aspects: connectivity identification accuracy, production forecasting performance, and injection optimization effect. For connectivity identification, the predicted connectivity matrix was compared with the prescribed ground-truth matrix using MAE, RMSE, R², cosine similarity, and Top-k overlap accuracy. The Top-k metric evaluates whether the model can correctly identify the dominant injectors for each producer.

For production forecasting, the group-level LSTM surrogate model was evaluated using MAE, RMSE, MAPE, and R². For injection optimization, the optimization effect was measured by the predicted incremental oil production over the 30-day horizon using Equation (14):

$$\Delta Q_{30\mathrm{d}} = \hat{Q}_{\mathrm{oil,opt}}^{30\mathrm{d}} - \hat{Q}_{\mathrm{oil,base}}^{30\mathrm{d}} \tag{14}$$

where $\hat{Q}_{\mathrm{oil,opt}}^{30\mathrm{d}}$ is the predicted 30-day cumulative oil production after optimization, and $\hat{Q}_{\mathrm{oil,base}}^{30\mathrm{d}}$ is the predicted 30-day cumulative oil production under the initial injection allocation.

3. Methodology

3.1. Overall Workflow

The proposed workflow consists of three connected modules: attention-based injector–producer connectivity identification, group-level LSTM production forecasting, and PSO-based injection allocation optimization. The connectivity module first extracts normalized injector–producer influence coefficients from historical injection–production sequences and static well-pair attributes. The group-level LSTM model then acts as a fast production surrogate to predict short-term well-group oil production responses. Finally, the PSO algorithm searches for feasible daily injection allocation schemes under practical injection constraints.

This design links interpretability and optimization. The connectivity matrix explains which injectors dominate the response of each producer, while the LSTM surrogate provides rapid production evaluation for candidate injection schemes. The PSO module converts the predictive model into an operational injection allocation strategy. Figure 3 summarizes the overall workflow, Table 3 lists the main module inputs and outputs, and Table 4 reports the implementation and hyperparameter settings.

3.2. Attention-Based Connectivity Identification Model

The attention-based connectivity model is designed to estimate the relative contribution of each injector to each producer. For a given producer, the model receives three types of inputs: the historical injection-rate sequence of all injectors, the historical production-response sequence of the target producer, and the static features of each injector–producer pair. These inputs allow the model to learn both dynamic response patterns and static spatial-geological relationships.

A recurrent encoder is first used to extract the temporal injection–production response context. Meanwhile, injector-specific temporal features and static pair features are encoded separately. These encoded features are then fused and passed to an attention scoring layer.

For injector i and producer j, the attention score is calculated using Equation (15):

$$e_{ij} = w^{T} \tanh\left(W_{h} h_{ij} + W_{s} s_{ij} + b\right) \tag{15}$$

where $e_{ij}$ is the learned attention score, $h_{ij}$ represents the encoded dynamic response feature, $s_{ij}$ represents the encoded static well-pair feature, and $w$, $W_{h}$, $W_{s}$, and $b$ are trainable parameters.

The final connectivity coefficients are obtained using the Softmax normalization in Equation (16):

$$\hat{C}_{ij} = \frac{\exp(e_{ij})}{\sum_{i'} \exp(e_{i'j})} \tag{16}$$

where $\hat{C}_{ij}$ is the predicted normalized connectivity coefficient between injector i and producer j, and the sum in the denominator runs over all injectors. This normalization ensures that the coefficients are non-negative and that their sum equals one for each producer.
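The scoring and normalization steps in Equations (15) and (16) can be illustrated with a minimal standard-library sketch for a single producer. The feature vectors, weight matrices, and dimensions below are made-up toy values, not the trained parameters of this study, and the function names `attention_scores` and `softmax` are hypothetical.

```python
import math

def attention_scores(h, s, W_h, W_s, b, w):
    """Additive attention score per injector: e_i = w^T tanh(W_h h_i + W_s s_i + b)."""
    scores = []
    for h_i, s_i in zip(h, s):
        hidden = []
        for k in range(len(b)):
            z = sum(W_h[k][m] * h_i[m] for m in range(len(h_i)))
            z += sum(W_s[k][m] * s_i[m] for m in range(len(s_i)))
            hidden.append(math.tanh(z + b[k]))
        scores.append(sum(w_k * u_k for w_k, u_k in zip(w, hidden)))
    return scores

def softmax(scores):
    """Softmax over the injectors of one producer, shifted for numerical stability."""
    top = max(scores)
    exps = [math.exp(e - top) for e in scores]
    total = sum(exps)
    return [x / total for x in exps]

# Toy values (assumed): 3 injectors, 2-dim dynamic features, 1-dim static feature.
h = [[0.9, 0.1], [0.2, 0.4], [-0.5, 0.3]]
s = [[1.0], [0.5], [0.1]]
W_h = [[0.8, -0.2], [0.1, 0.6]]
W_s = [[0.5], [-0.3]]
b = [0.0, 0.1]
w = [1.0, 0.5]

c_hat = softmax(attention_scores(h, s, W_h, W_s, b, w))
print([round(c, 4) for c in c_hat])
```

Subtracting the maximum score before exponentiation leaves the Softmax output unchanged and only guards against overflow.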

The connectivity loss combines distribution similarity and numerical accuracy using the weighted KL-divergence and mean-square-error form in Equation (17):

$$L_{conn} = \sum_{j} \sum_{i} C_{ij} \log\left(\frac{C_{ij}}{\hat{C}_{ij}}\right) + \lambda \sum_{j} \sum_{i} \left(C_{ij} - \hat{C}_{ij}\right)^{2} \tag{17}$$

where $C_{ij}$ is the prescribed ground-truth connectivity coefficient, $\hat{C}_{ij}$ is the predicted coefficient, and $\lambda$ is the weight of the numerical error term. In the supplementary sensitivity test, $\lambda$ was varied from 0 to 2.0 to examine whether the numerical-error term changes coefficient-scale accuracy or dominant injector ranking. This design encourages the model to recover both the dominant injector ranking and the actual coefficient magnitude.
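A minimal sketch of the combined loss in Equation (17) is given here for illustration. It assumes that each row of the matrices holds the coefficients of all injectors for one producer; the small constant `eps` and the skipping of zero-valued labels are numerical safeguards added in this sketch, not details reported in the paper.

```python
import math

def connectivity_loss(C, C_hat, lam=1.0, eps=1e-12):
    """KL(C || C_hat) plus lam times the summed squared error, in the form of Eq. (17).

    C and C_hat are lists of rows; each row holds the coefficients of all
    injectors for one producer and is assumed to sum to one.
    """
    kl, sq_err = 0.0, 0.0
    for row_true, row_pred in zip(C, C_hat):
        for c, c_hat in zip(row_true, row_pred):
            if c > 0.0:  # the term C*log(C/C_hat) tends to 0 as C -> 0
                kl += c * math.log(c / max(c_hat, eps))
            sq_err += (c - c_hat) ** 2
    return kl + lam * sq_err

# Toy matrices (assumed values): 2 producers x 3 injectors.
C = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
C_hat = [[0.6, 0.3, 0.1], [0.2, 0.1, 0.7]]
loss_kl_only = connectivity_loss(C, C_hat, lam=0.0)
loss_full = connectivity_loss(C, C_hat, lam=1.0)
```

With `lam=0.0` only the KL term remains, which corresponds to the lower end of the sensitivity range described above.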

3.3. Group-Level LSTM Production Surrogate Model

The production forecasting module is formulated at the well-group level rather than at the individual-well level. This setting is consistent with the optimization objective, which focuses on maximizing the cumulative oil production of the entire well group. The model input is a historical dynamic window containing injector-level injection rates and pressures, total injection rate, mean injection pressure, group oil rate, group liquid rate, group water cut, and smoothed or differential dynamic indicators.

The LSTM network maps the historical dynamic sequence to the future well-group oil production response over a seven-day forecasting horizon using Equation (18):

$$\hat{Y}_{t} = f_{LSTM}(X_{t}) \tag{18}$$

where $X_{t}$ is the multivariate historical dynamic sequence and $\hat{Y}_{t}$ is the predicted future group oil production response.

To improve training stability, the model predicts relative production changes rather than absolute oil rates, as defined in Equation (19):

$$\Delta Q^{oil}(t + h) = \frac{Q^{oil}(t + h)}{Q^{oil}(t - 1)} - 1, \quad h = 0, 1, \ldots, H_{f} - 1 \tag{19}$$

The predicted relative changes are then converted back to absolute group oil rates for evaluation. This strategy reduces the influence of production-level differences and improves the robustness of short-term forecasting.
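The target transformation in Equation (19) and its inverse can be sketched as follows. The sketch assumes that $Q^{oil}(t - 1)$ is the last observed group oil rate before the forecast window; the rates are made-up values and the function names are hypothetical.

```python
def to_relative(q_future, q_ref):
    """Relative change of each future rate with respect to the reference rate Q(t-1)."""
    return [q / q_ref - 1.0 for q in q_future]

def to_absolute(delta, q_ref):
    """Invert Eq. (19): Q(t+h) = Q(t-1) * (1 + delta)."""
    return [q_ref * (1.0 + d) for d in delta]

q_ref = 400.0                      # last observed group oil rate, m3/d (assumed)
q_future = [404.0, 396.0, 410.0]   # following days, m3/d (assumed)
delta = to_relative(q_future, q_ref)
restored = to_absolute(delta, q_ref)
```

Because the targets are dimensionless ratios, windows taken at different production levels share a similar numerical range, which is the stabilizing effect described above.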

3.4. PSO-Based Injection Allocation Optimization Method

The injection allocation problem is solved using particle swarm optimization. Each particle represents a candidate daily injection allocation vector for all injectors. The fitness value is calculated from the predicted 30-day cumulative oil production obtained from the production surrogate model, with penalties for constraint violations if necessary. Candidate solutions are restricted by single-well injection-rate bounds, total-injection balance, and adjustment-amplitude constraints.

For the 30-day optimization horizon, the candidate injection allocation is assumed to remain constant within the operational adjustment window. The 7-day LSTM surrogate is applied recursively in a rolling manner: after each 7-day prediction block, the dynamic input window is updated using the candidate injector rates and the predicted group-level production indicators until the 30-day horizon is covered. The last block is truncated to match the remaining days. The resulting daily predictions are accumulated to obtain the 30-day objective value. Because recursive use of a short-horizon surrogate can amplify forecast uncertainty, the optimized allocation is interpreted as a short-term decision-support recommendation rather than a deterministic field-control instruction.
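The rolling use of the 7-day surrogate can be sketched as below. This is a simplified illustration under stated assumptions: the look-back window here holds only group oil rates, whereas the actual input window contains multiple dynamic features, and `toy_surrogate` is a hypothetical stand-in for the trained LSTM that ignores the injection rates.

```python
def rolling_forecast(predict_block, window, q_inj, horizon=30, block=7):
    """Apply a short-horizon surrogate recursively until `horizon` days are covered."""
    window = list(window)
    daily = []
    while len(daily) < horizon:
        preds = list(predict_block(window, q_inj))[:block]
        preds = preds[: horizon - len(daily)]        # truncate the last block
        daily.extend(preds)
        window = (window + preds)[-len(window):]     # slide the look-back window
    return daily

calls = []  # records the window length seen at each surrogate call

def toy_surrogate(window, q_inj):
    """Hypothetical stand-in for the LSTM: mild decline from the last known rate."""
    calls.append(len(window))
    last = window[-1]
    return [last * (0.999 ** (k + 1)) for k in range(7)]

history = [500.0] * 45                               # 45-day look-back (toy values)
q_inj = [600.0, 700.0, 800.0, 650.0, 750.0]          # candidate allocation (toy values)
daily = rolling_forecast(toy_surrogate, history, q_inj)
cumulative_30d = sum(daily)
```

Covering 30 days with 7-day blocks takes five surrogate calls, of which the last contributes only two days.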

The velocity and position of each particle are updated iteratively according to its own best historical position and the global best position of the swarm using Equations (20) and (21):

$$v_{i}^{k+1} = \omega v_{i}^{k} + c_{1} r_{1} \left(p_{i}^{k} - x_{i}^{k}\right) + c_{2} r_{2} \left(g^{k} - x_{i}^{k}\right) \tag{20}$$

$$x_{i}^{k+1} = x_{i}^{k} + v_{i}^{k+1} \tag{21}$$

where $x_{i}^{k}$ and $v_{i}^{k}$ are the position and velocity of particle i at iteration k, $p_{i}^{k}$ is the personal best position, $g^{k}$ is the global best position, $\omega$ is the inertia weight, $c_{1}$ and $c_{2}$ are acceleration coefficients, and $r_{1}$ and $r_{2}$ are random factors.

The optimization target is written as Equation (22):

$$q^{*} = \arg\max_{q^{inj}} \sum_{t=1}^{H} \hat{Q}^{oil}\left(t \mid q^{inj}\right) \tag{22}$$

where $q^{*}$ is the optimized injection allocation vector. In this study, the maximum adjustment amplitude of each injector is set to +/−15%, and the total injection rate is kept unchanged before and after optimization to ensure field operability.
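A compact sketch of the constrained PSO search is given here for illustration. Several choices are assumptions of this sketch rather than reported details: constraints are enforced by a repair step (`repair`) instead of penalties, the inertia weight decreases linearly, the global best is updated within each iteration, the current allocation is seeded as the first particle, and a toy linear objective replaces the LSTM-based 30-day surrogate. The initial rates are assumed to lie within the single-well bounds.

```python
import random

def repair(x, lo, hi, total, iters=50):
    """Clip x to [lo, hi], then spread the remaining gap so that sum(x) == total."""
    x = [min(max(v, l), h) for v, l, h in zip(x, lo, hi)]
    for _ in range(iters):
        gap = total - sum(x)
        if abs(gap) < 1e-9:
            break
        # Room each injector still has in the direction of the gap.
        room = [(h - v) if gap > 0 else (v - l) for v, l, h in zip(x, lo, hi)]
        room_sum = sum(room)
        if room_sum <= 0.0:
            break
        step = min(1.0, abs(gap) / room_sum)
        sign = 1.0 if gap > 0 else -1.0
        x = [v + sign * step * r for v, r in zip(x, room)]
    return x

def pso_allocate(objective, q0, n_particles=45, n_iter=120, w_start=0.90, w_end=0.40,
                 c1=1.8, c2=1.8, q_min=300.0, q_max=1200.0, max_adj=0.15, seed=0):
    rng = random.Random(seed)
    lo = [max(q_min, (1.0 - max_adj) * q) for q in q0]
    hi = [min(q_max, (1.0 + max_adj) * q) for q in q0]
    total, n = sum(q0), len(q0)
    # The first particle is the current allocation, so the result is never worse than it.
    xs = [list(q0)] + [repair([rng.uniform(l, h) for l, h in zip(lo, hi)], lo, hi, total)
                       for _ in range(n_particles - 1)]
    vs = [[0.0] * n for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_f = [objective(x) for x in xs]
    g = max(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    for k in range(n_iter):
        omega = w_start - (w_start - w_end) * k / max(n_iter - 1, 1)  # linear decay (assumed)
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vs[i] = [omega * v + c1 * r1 * (p - x) + c2 * r2 * (gb - x)
                     for v, x, p, gb in zip(vs[i], xs[i], pbest[i], gbest)]
            xs[i] = repair([x + v for x, v in zip(xs[i], vs[i])], lo, hi, total)
            f = objective(xs[i])
            if f > pbest_f[i]:
                pbest[i], pbest_f[i] = xs[i][:], f
                if f > gbest_f:
                    gbest, gbest_f = xs[i][:], f
    return gbest, gbest_f

weights = [0.2, 0.3, 0.9, 0.8, 0.5]   # toy response weights (assumed)

def objective(q):
    """Toy linear score standing in for the predicted 30-day cumulative oil."""
    return sum(w * v for w, v in zip(weights, q))

q0 = [600.0, 700.0, 800.0, 650.0, 750.0]   # initial injection rates, m3/d (assumed)
best, best_f = pso_allocate(objective, q0)
```

The repair step keeps every particle feasible at all times, so the returned allocation satisfies the rate bounds, the adjustment limit, and the total-injection balance by construction.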

3.5. Implementation Details

The dataset was divided chronologically into training, validation, and test sets to avoid information leakage from future production data. The original 7-day LSTM samples consisted of 669 chronological sliding-window samples, including 468 training samples, 100 validation samples, and 101 test samples. For the additional direct multi-step experiments, independent 14-day and 21-day LSTM surrogates were trained using the same 45-day look-back window and chronological split protocol, resulting in 661 samples for the 14-day model and 654 samples for the 21-day model. All continuous input variables were standardized using statistics calculated only from the training set. The attention model used a 60-day injector-rate sequence, a 60-day producer-response sequence, and static pair features as inputs. The optimizer was Adam with a learning rate of 8 × 10⁻⁴ and early stopping. The PSO settings were 45 particles, 120 iterations, an inertia weight decreasing from 0.90 to 0.40, and acceleration coefficients c1 = c2 = 1.8. Candidate injection rates were constrained by 300–1200 m³/d single-well bounds, unchanged total injection, and a +/−15% adjustment limit.
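The chronological split and training-only standardization can be sketched as follows. The window-count formula and the 70/15/15 split fractions are assumptions of this sketch; they reproduce the reported 669/468/100/101 counts for the 7-day model with 720 days and a 45-day look-back, but the same formula gives 662 and 655 for the 14-day and 21-day horizons, one more than the reported 661 and 654, so the exact windowing used in the study may differ.

```python
import math

def count_windows(n_days, lookback, horizon):
    """Number of sliding windows with a full look-back and a full forecast target."""
    return n_days - lookback - horizon + 1

def chronological_split(n_samples, train_frac=0.70, val_frac=0.15):
    """Earliest samples for training, then validation, most recent for testing."""
    n_train = int(n_samples * train_frac)
    n_val = int(n_samples * val_frac)
    return n_train, n_val, n_samples - n_train - n_val

def fit_standardizer(train_rows):
    """Per-feature mean and standard deviation computed from training rows only."""
    n = len(train_rows)
    cols = list(zip(*train_rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
            for c, m in zip(cols, means)]
    return means, stds

def standardize(rows, means, stds):
    """Apply the training-set statistics to any subset (train, validation, or test)."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

n_samples = count_windows(720, 45, 7)
n_train, n_val, n_test = chronological_split(n_samples)

# Toy feature rows (assumed): [total injection rate, group oil rate] per sample.
rows = [[3500.0 + 2.0 * d, 480.0 - 0.1 * d] for d in range(n_samples)]
means, stds = fit_standardizer(rows[:n_train])
train_scaled = standardize(rows[:n_train], means, stds)
test_scaled = standardize(rows[n_train + n_val:], means, stds)
```

Fitting the statistics on the first `n_train` samples only means that later validation and test windows never influence the scaling, which is the leakage safeguard described above.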

The final implementation produced three categories of outputs: predicted connectivity matrices, group-level production forecasting results, and optimized injection allocation schemes. These outputs were further used to analyze model interpretability, forecasting stability, PSO convergence behavior, and the predicted oil-increment potential of the optimized injection strategies.

4. Results

4.1. Connectivity Identification Performance

The attention-based connectivity model was first evaluated by comparing the predicted injector–producer connectivity matrix with the prescribed ground-truth matrix. As shown in Figure 4, the predicted matrix reproduces the main spatial pattern of the ground-truth connectivity. Strong connections, such as the dominant injector contributions to producers P01, P05, P08, and P10, are correctly identified, while weak background connections remain close to zero. The absolute error matrix indicates that most prediction errors are lower than 0.03, with only a few local deviations observed in moderate-connectivity pairs.

The quantitative results further confirm the accuracy of the proposed connectivity identification model. The attention model achieved an MAE of 0.0146, RMSE of 0.0240, R² of 0.9835, and cosine similarity of 0.9962. In contrast, the static-prior baseline yielded a substantially larger MAE of 0.080 and RMSE of 0.093. This comparison demonstrates that the dynamic injection–production sequences provide additional information beyond static well-pair relationships. Because many weak connectivity coefficients are close to zero, MAPE is not used as the primary metric for connectivity evaluation; MAE, RMSE, R², cosine similarity, and Top-k overlap are more informative for this task. The quantitative comparison is summarized in Table 5.
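The evaluation metrics named above can be computed as in the following sketch. It assumes that MAE, RMSE, R², and cosine similarity are taken over the flattened matrix and that Top-k overlap is the per-producer fraction of shared top-k injectors averaged over producers; the exact definitions used in the study may differ. The matrices are made-up toy values.

```python
import math

def connectivity_metrics(C, C_hat, k=3):
    """Flattened-matrix error metrics plus mean per-producer top-k injector overlap."""
    true = [c for row in C for c in row]
    pred = [c for row in C_hat for c in row]
    n = len(true)
    mae = sum(abs(t - p) for t, p in zip(true, pred)) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(true, pred))
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in true)
    r2 = 1.0 - ss_res / ss_tot
    dot = sum(t * p for t, p in zip(true, pred))
    norm_true = math.sqrt(sum(t * t for t in true))
    norm_pred = math.sqrt(sum(p * p for p in pred))
    cosine = dot / (norm_true * norm_pred)
    overlaps = []
    for row_t, row_p in zip(C, C_hat):   # one row per producer
        top_t = set(sorted(range(len(row_t)), key=lambda i: row_t[i], reverse=True)[:k])
        top_p = set(sorted(range(len(row_p)), key=lambda i: row_p[i], reverse=True)[:k])
        overlaps.append(len(top_t & top_p) / k)
    return {"mae": mae, "rmse": rmse, "r2": r2, "cosine": cosine,
            "topk": sum(overlaps) / len(overlaps)}

# Toy matrices (assumed values): 2 producers x 5 injectors, rows sum to one.
C = [[0.50, 0.30, 0.12, 0.05, 0.03], [0.04, 0.06, 0.15, 0.25, 0.50]]
C_hat = [[0.48, 0.31, 0.13, 0.05, 0.03], [0.05, 0.06, 0.14, 0.27, 0.48]]
m = connectivity_metrics(C, C_hat)
```

A perfect Top-k overlap can coexist with non-zero MAE and RMSE, which is why ranking and coefficient-scale metrics are reported separately.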

A $\lambda$-sensitivity analysis is reported in Supplementary Table S1 using $\lambda$ = 0, 0.1, 0.5, 1.0, and 2.0. Across these settings, the top-three injector overlap remains 100%, while MAE decreases slightly from 0.0143 to 0.0142, RMSE decreases from 0.0227 to 0.0225, and R² increases from 0.9852 to 0.9855. These results indicate that the KL-divergence term already constrains the dominant injector ranking, whereas the MSE-weighted term mainly refines coefficient-scale accuracy. The sensitivity range is therefore small, and $\lambda$ = 1.0–2.0 provides a stable balance between ranking preservation and numerical coefficient fitting.

The scatter plot in Figure 5 shows that the predicted connectivity coefficients are closely distributed around the 1:1 line. This indicates that the model not only identifies the dominant injectors correctly but also provides reliable quantitative coefficient estimates. The top-three injector overlap accuracy reaches 100% for both the attention model and the static-prior baseline; therefore, the main improvement of the attention model lies in substantially reducing coefficient-scale errors while preserving the correct dominant injector ranking.

The high connectivity identification accuracy should be interpreted in the context of the controlled semi-synthetic benchmark. A physical-consistency check shows that the predicted connectivity coefficients are negatively correlated with interwell distance (Pearson r = −0.842; Spearman r = −0.862) and positively correlated with the static connectivity prior (Pearson r = 0.791; Spearman r = 0.846). These trends are consistent with reservoir-engineering intuition and support the interpretability of the learned coefficient matrix.

4.2. Group-Level Production Forecasting Performance

The group-level LSTM production surrogate was evaluated on the chronological test set to examine whether the constructed dynamic features can support short-term well-group production forecasting. Because the surrogate is later used inside the PSO optimizer, both overall test accuracy and horizon-dependent degradation were examined.

The overall test results are summarized in Table 6. The group-level LSTM model achieved an MAE of 4.524 m³/d, RMSE of 5.963 m³/d, MAPE of 1.255%, and R² of 0.964, indicating high short-term forecasting accuracy. Figure 6 shows the comparison between the actual and predicted group oil production rates, where the predicted curve follows the overall production trend well.

The forecasting performance under different prediction horizons is shown in Table 7 and Figure 7. As the forecasting horizon increases from 1 to 7 days, the MAPE increases from 0.894% to 1.644%, while R² decreases from 0.985 to 0.938. This gradual degradation is expected for multi-step time-series forecasting. Nevertheless, the model maintains MAPE below 1.65% and R² above 0.938 for all forecasting horizons, indicating robust short-term predictive capability.

To address multi-step error accumulation beyond the direct 1–7-day horizon, independent direct 14-day and 21-day LSTM surrogates were trained using the same input features, look-back length, chronological split strategy, optimizer, and early-stopping protocol as the original 7-day model. The direct 14-day model achieved an overall MAE of 6.340 m³/d, RMSE of 8.457 m³/d, MAPE of 1.745%, R of 0.975, and R² of 0.932 on the test set. The direct 21-day model achieved an MAE of 7.107 m³/d, RMSE of 9.371 m³/d, MAPE of 1.957%, R of 0.971, and R² of 0.922. The comparison in Supplementary Table S2 and Supplementary Figure S2 shows the expected degradation from the 7-day model to the longer direct horizons, but the MAPE remains below 2.0% and R² remains above 0.92 for both additional models.

The error-distribution analysis further indicates that the test-set residuals are centered close to zero. The mean error is 1.226 m³/d, the standard deviation is 5.840 m³/d, the 5th and 95th percentiles are −7.081 and 12.223 m³/d, respectively, and the mean absolute error is 4.524 m³/d. These results suggest that no strong systematic bias is observed in the group-level short-term forecasts, although multi-step extrapolation still accumulates uncertainty as the forecast horizon increases.

4.3. Results of PSO-Based Injection Allocation Optimization

After connectivity identification and production forecasting, PSO was used to optimize daily injection allocation under single-well injection-rate limits, total-injection balance, and adjustment-amplitude constraints. Each particle represents one candidate injection allocation vector. The group-level LSTM surrogate model provides the predicted 30-day cumulative oil production used to evaluate each candidate scheme.

The PSO convergence curve in Figure 8 shows that the objective function increases rapidly during the early iterations and then becomes stable, indicating that the algorithm can efficiently search for a feasible and improved allocation scheme. Figure 9 compares the injection rates before and after optimization for a representative case. The optimized scheme redistributes injection among injectors while satisfying the total-injection constraint and the maximum adjustment limit.

The optimization results for 15 cases are summarized in Table 8. The predicted 30-day incremental oil production ranges from 1595.50 to 1943.96 m³, with an average increment of 1737.36 m³ per case, as shown in Figure 10. The average predicted 30-day cumulative oil production increases from 14,718.94 m³ before optimization to 16,456.30 m³ after optimization. These results indicate that the proposed LSTM-PSO workflow can generate feasible injection allocation schemes with measurable production benefits.

The uncertainty analysis was separated into PSO stochastic stability and surrogate-prediction uncertainty. The multi-seed analysis evaluates optimizer stability, whereas the 30-day oil increment in Table 8 is reported in physical production units. The raw objective uplift used during PSO is an internal normalized surrogate score and should not be interpreted directly as m³ of oil. The conversion to Table 8 is obtained by accumulating the predicted daily group-oil-rate difference between the optimized and initial injection schemes over the 30-day horizon. In the 15 optimization cases, the empirical conversion ratio ranges from 7.07 to 9.66 m³ per raw-objective unit. A residual-bootstrap analysis based on test-set LSTM residuals is reported in Supplementary Table S3 and Supplementary Figure S3; the 95% confidence intervals remain positive for all cases, with lower bounds ranging from 1505.67 to 1853.62 m³.
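One possible form of a residual-bootstrap interval for the 30-day increment is sketched below. The resampling scheme is an assumption of this sketch, not the documented procedure of the study: daily residuals are drawn independently with replacement for the optimized and the initial scheme, added to the respective daily predictions, and the accumulated difference is summarized by percentiles. All rates and residuals are made-up values.

```python
import random

def bootstrap_increment_ci(q_opt, q_base, residuals, n_boot=2000, level=0.95, seed=0):
    """Percentile interval for sum(q_opt) - sum(q_base) under resampled daily residuals."""
    rng = random.Random(seed)
    point = sum(q_opt) - sum(q_base)          # increment accumulated over the horizon
    draws = []
    for _ in range(n_boot):
        # Independent residual draws for the two schemes (assumption of this sketch).
        noise_opt = sum(rng.choice(residuals) for _ in q_opt)
        noise_base = sum(rng.choice(residuals) for _ in q_base)
        draws.append(point + noise_opt - noise_base)
    draws.sort()
    lo_idx = int((1.0 - level) / 2.0 * (n_boot - 1))
    hi_idx = int((1.0 + level) / 2.0 * (n_boot - 1))
    return point, draws[lo_idx], draws[hi_idx]

q_base = [490.0] * 30                         # predicted daily group oil, initial scheme (toy)
q_opt = [550.0] * 30                          # predicted daily group oil, optimized scheme (toy)
residuals = [-6.0, -3.0, 0.0, 3.0, 6.0]       # stand-in for test-set LSTM residuals (toy)
point, lower, upper = bootstrap_increment_ci(q_opt, q_base, residuals)
```

The point estimate is the accumulated daily difference between the two schemes, matching the conversion to physical units described above; a constant residual bias cancels in the difference, so the interval width reflects residual spread only.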

A representative multi-seed stability analysis was performed for five optimization cases and is summarized in Supplementary Table S4. Each case was repeated using five random seeds. The mean raw objective uplift ranges from 188.590 to 225.731 over the 30-day horizon, with a standard deviation below 1 × 10⁻⁶ at the reported precision. This indicates stable convergence for these representative cases under the present objective function and constraints. For Case 1, the corresponding allocation changed I01 and I02 by −15%, I03 and I04 by +15%, and I05 by +7.35%, while satisfying total-injection balance and single-well adjustment limits.

5. Discussion

The results demonstrate the advantage of integrating connectivity identification, production forecasting, and injection optimization into a unified workflow. The attention-based connectivity model provides interpretable injector–producer relationships, which helps bridge the gap between purely data-driven prediction and engineering decision-making. The group-level LSTM model acts as a fast surrogate for evaluating candidate injection schemes, while PSO provides a flexible search mechanism under practical operational constraints.

For practical use, the workflow requires daily injector rates and pressures, producer oil/liquid rates and water cut, static well-pair attributes, and operational injection constraints. The recommended procedure is to conduct quality control and time alignment, standardize inputs using training-set statistics, infer injector–producer connectivity, forecast short-term group oil response, search feasible injection reallocations using PSO, and screen the recommended scheme using individual-well water cut, pressure, facility, and reservoir-engineering constraints before implementation.

6. Conclusions

This study developed a connectivity-aware LSTM-PSO workflow for short-term water injection allocation in offshore waterflooding reservoirs. A field-informed semi-synthetic benchmark was constructed to provide prescribed connectivity labels, delayed injection–production responses, operational constraints, and production histories. Within this benchmark, the attention-based model recovered the dominant injector–producer relationships with high coefficient accuracy and correct top-three injector ranking. The group-level LSTM surrogate provided stable short-term group oil forecasting, and the PSO module generated feasible injection reallocations under single-well rate, total-injection balance, and adjustment-amplitude constraints. These results indicate that dynamic injection–production sequences and static well-pair attributes can be integrated into an interpretable decision-support workflow for constrained injection allocation evaluation.

The results should be regarded as controlled methodological validation rather than direct field deployment. The semi-synthetic dataset cannot fully reproduce field-scale geological uncertainty, surveillance errors, workover events, facility constraints, or long-term waterflooding behavior. Future work should validate the workflow using anonymized field data, tracer or interference tests, or history-matched numerical models, and should extend the objective function to include water cut, pressure maintenance, operational cost, injection efficiency, and long-term recovery.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/pr14132065/s1. Table S1: Lambda sensitivity analysis for connectivity identification. Figure S1: Lambda sensitivity of connectivity-coefficient errors and R². Table S2: Independent direct 7/14/21-day LSTM forecast performance on the test set. Figure S2: Direct 7/14/21-day LSTM comparison in terms of MAPE, R, and R². Table S3: Residual-bootstrap uncertainty intervals for PSO-predicted 30-day incremental oil. Figure S3: Residual-bootstrap 95% confidence intervals for PSO-predicted 30-day incremental oil. Table S4: PSO multi-seed stability for five representative optimization cases. Table S5: Input-output ranges of the group-level forecasting and optimization dataset. Table S6: Representative chronological sample records for the group-level LSTM forecasting task. Dataset-scope and split statistics are reported in Section 2.1 and in Tables 1 and 2.

Author Contributions

Conceptualization, F.W. and S.J.; methodology, S.J. and G.P.; software, S.J.; validation, F.W., X.C. and W.L.; formal analysis, S.J. and G.P.; investigation, F.W., X.C., G.P., W.L. and P.C.; resources, F.W. and P.C.; data curation, X.C. and G.P.; writing—original draft preparation, S.J.; writing—review and editing, F.W., G.P. and S.J.; visualization, S.J.; supervision, F.W. and S.J.; project administration, F.W. and P.C.; funding acquisition, F.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The field-informed semi-synthetic dataset, generated result tables, and source code used in this study are available from the corresponding author upon reasonable request. No confidential well-level field production data are disclosed in this study.

Acknowledgments

The authors would like to thank the engineers and technical staff involved in offshore waterflooding reservoir management for their valuable field knowledge and technical support.

Conflicts of Interest

Authors Feng Wei, Xiaoquan Chen, Guoqiang Pang and Wei Li were employed by the company CNOOC (China), Zhanjiang Branch. Author Peng Chen was employed by the CNOOC EnerTech-Drilling & Production Co. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figure 1. Well pattern and prescribed injector–producer connectivity network in the field-informed semi-synthetic waterflooding dataset.

Figure 2. Representative dynamic injection and production responses in the constructed semi-synthetic dataset: (a) injector dynamics and (b) producer response. Specifically, in panel (a), the blue and orange lines represent injection rate and injection pressure, respectively; in panel (b), the green, blue, and red lines represent oil rate, liquid rate, and water cut, respectively.

Figure 3. Overall workflow of the proposed connectivity-aware LSTM–PSO framework. The solid horizontal arrows indicate the sequential information flow among the main workflow stages, and the dashed downward arrows indicate the intermediate outputs generated by the corresponding modules.

Figure 4. Comparison of ground-truth, predicted, and absolute-error injector–producer connectivity matrices.

Figure 5. Connectivity identification evaluation: (a) predicted versus ground-truth connectivity coefficients and (b) top-three injector identification accuracy.

Figure 6. Group-level LSTM production forecasting performance: (a) time-series comparison for the 1-day-ahead test sequence and (b) predicted versus actual group oil production rates for all flattened 7-day test predictions.

Figure 7. Forecasting performance of the group-level LSTM model under different prediction horizons.

Figure 8. PSO convergence curve for a representative optimization case.

Figure 9. Comparison of injection allocation before and after optimization for a representative case.

Figure 10. Predicted 30-day incremental oil production of the 15 optimization cases.

Table 1. Main components of the field-informed semi-synthetic dataset.

Component	Main Variables	Role in This Study
Well information	Well type, coordinates, dominant layer, effective thickness, injectivity/productivity proxy	Defines the well-group structure and static properties
Well-pair features	Distance, layer consistency, thickness overlap, permeability-link proxy, static prior	Provides auxiliary geological-spatial constraints
Injection dynamics	Daily injection rate, injection pressure, cumulative injection, layer allocation	Provides dynamic injector-side inputs
Production responses	Liquid rate, oil rate, water cut, cumulative oil/liquid production	Provides producer-side response sequences and forecasting targets
Connectivity labels	Normalized connectivity coefficient and response lag	Provides ground truth for evaluating connectivity identification
Optimization cases	Initial/optimized injection rates, injection bounds, predicted 30-day oil increment	Supports evaluation of constrained injection optimization

Table 2. Key settings used in the field-informed semi-synthetic dataset.

Item	Setting or Range	Purpose
Number of injectors	5	Injector-side control variables
Number of producers	10	Production response system
Data length	720 days	Daily dynamic sequence modeling
Reservoir layers	3 layers	Layered injection and heterogeneity representation
Injection-rate range	300–1200 m³/d	Practical injection capacity constraint
Connectivity response lag	5–45 days	Delayed injection–production response
Forecasting horizon	7 days	Short-term production prediction
Optimization horizon	30 days	Injection allocation evaluation
Maximum injection adjustment	±15%	Field operability constraint
Injection pressure generation	p_i,0 = 10.5–16.5 MPa; pressure-rate coefficient = 0.010 MPa/(m³/d); sinusoidal amplitude = 0.25 MPa; eps_p ~ N(0, 0.18^2); clipped to 8.0–22.5 MPa	Pressure-rate co-variation in the semi-synthetic benchmark

Table 3. Summary of the proposed connectivity-aware LSTM-PSO workflow.

Module	Main Input	Main Output	Role in Workflow
Connectivity identification	Injection histories, production responses, static pair features	Injector–producer connectivity coefficients	Provides interpretable dynamic interwell relationships
Production forecasting	Historical group-level dynamic features	Future group oil production response	Provides a fast surrogate model for evaluating candidate schemes
Injection optimization	Current injection rates, constraints, surrogate predictions	Optimized daily injection allocation	Generates feasible injection schemes with improved predicted oil production

Table 4. Implementation details and hyperparameter settings of the proposed workflow.

Item	Setting
Random seed	2026 for dataset generation, connectivity model, and group-level LSTM training; PSO V2 uses seed 2026 + 321 for the reported optimization cases.
Dataset split	Chronological split by target start date; 669 sliding-window samples in total, with 468 for training, 100 for validation, and 101 for testing. No random shuffle across future dates.
Normalization	Continuous inputs were standardized using statistics fitted only on the training set.
Attention input	60-day injector-rate sequence, 60-day producer-response sequence, and static injector–producer pair features; sample stride = 5 days.
Attention temporal encoder	Two LSTM layers with 96 and 64 units; dropout = 0.10; layer normalization after the first LSTM layer.
Attention static/temporal branches	Injector temporal branch: TimeDistributed Dense 32 → 16; static pair branch: TimeDistributed Dense 32 → 16; attention hidden layers: 64 → 32 with tanh activation; Softmax over injectors.
Attention training	Adam optimizer; learning rate 8 × 10⁻⁴; batch size 64; maximum epochs = 220; early-stopping patience = 30; combined KL divergence and MSE loss.
Group LSTM input	45-day look-back window including injector rates/pressures, total injection, mean pressure, group oil/liquid rate, water cut, moving averages, and difference features.
Group LSTM architecture	Two LSTM layers with 96 and 64 units; dropout = 0.08; layer normalization after the first LSTM layer; Dense 64 → Dropout 0.12 → Dense 32 → linear 7-day output.
Group LSTM training	Adam optimizer; learning rate 8 × 10⁻⁴; Huber loss; batch size 64; maximum epochs = 180; early-stopping patience = 25; ReduceLROnPlateau patience = 10.
Longer-horizon tests	Independent direct 14-day and 21-day LSTM surrogates were trained using the same look-back window, chronological split protocol, optimizer, and early-stopping strategy.
PSO settings	45 particles; 120 iterations; inertia weight 0.90 to 0.40; c1 = c2 = 1.8; injection-rate bounds 300–1200 m³/d; total injection unchanged; ±15% adjustment limit.
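
The sample counts in Table 4 can be checked with a short sketch of sliding-window construction and chronological splitting. With 720 days, a 45-day look-back window, and a 7-day horizon, the number of windows is 720 − 45 − 7 + 1 = 669. The 70/15/15 split fractions below are an assumption, chosen because they reproduce the reported 468/100/101 counts; the function names and the random placeholder data are hypothetical.

```python
import numpy as np

def make_windows(features, target, look_back=45, horizon=7):
    """Build (X, y) pairs: X is the look-back window, y the next `horizon` days."""
    n = len(target) - look_back - horizon + 1
    X = np.stack([features[i:i + look_back] for i in range(n)])
    y = np.stack([target[i + look_back:i + look_back + horizon] for i in range(n)])
    return X, y

def chronological_split(n_samples, train_frac=0.70, val_frac=0.15):
    """Split indices in time order, with no shuffling across future dates."""
    n_train = int(n_samples * train_frac)   # ASSUMED fraction
    n_val = int(n_samples * val_frac)       # ASSUMED fraction
    idx = np.arange(n_samples)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# Placeholder data: 720 days, 4 dynamic features, one group-level target.
rng = np.random.default_rng(2026)
features = rng.normal(size=(720, 4))
target = rng.normal(size=720)

X, y = make_windows(features, target)
train_idx, val_idx, test_idx = chronological_split(len(X))

# Standardize with statistics fitted on the training windows only.
mu = X[train_idx].mean(axis=(0, 1))
sd = X[train_idx].std(axis=(0, 1)) + 1e-8
X_std = (X - mu) / sd
```

Fitting `mu` and `sd` on the training windows alone keeps validation and test statistics out of the normalization, in line with the Normalization row of Table 4.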

Table 5. Connectivity identification performance compared with the static-prior baseline.

Method	MAE	RMSE	R²	Cosine Similarity	Top-3 Overlap
Attention model	0.0146	0.0240	0.9835	0.9962	100%
Static-prior baseline	0.0797	0.0934	-	0.9462	100%
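
The metrics in Table 5 can be computed from a true and an estimated connectivity matrix as in the minimal sketch below. It assumes a producers-by-injectors layout (10 × 5), cosine similarity averaged over producers, and top-3 overlap measured as the shared fraction of each producer's three strongest injectors. These conventions, the function name `connectivity_metrics`, and the random test matrices are assumptions and may differ from the definitions used in the paper.

```python
import numpy as np

def connectivity_metrics(c_true, c_pred, k=3):
    """Compare connectivity matrices of shape (n_producers, n_injectors)."""
    err = c_pred - c_true
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((c_true - c_true.mean()) ** 2)
    # Row-wise cosine similarity, then averaged over producers (ASSUMED convention).
    cos = np.sum(c_true * c_pred, axis=1) / (
        np.linalg.norm(c_true, axis=1) * np.linalg.norm(c_pred, axis=1)
    )
    # Indices of the k largest coefficients per producer.
    top_true = np.argsort(-c_true, axis=1)[:, :k]
    top_pred = np.argsort(-c_pred, axis=1)[:, :k]
    overlap = np.mean([len(set(a) & set(b)) / k for a, b in zip(top_true, top_pred)])
    return {
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "r2": float(1.0 - ss_res / ss_tot),
        "cosine": float(np.mean(cos)),
        "top_k_overlap": float(overlap),
    }

# Placeholder matrices: each producer's coefficients sum to 1 over the injectors.
rng = np.random.default_rng(2026)
c_true = rng.dirichlet(np.ones(5), size=10)
c_noisy = np.clip(c_true + rng.normal(0.0, 0.02, size=c_true.shape), 1e-6, None)
c_noisy = c_noisy / c_noisy.sum(axis=1, keepdims=True)

perfect = connectivity_metrics(c_true, c_true)
noisy = connectivity_metrics(c_true, c_noisy)
```

A perfect estimate yields zero error, unit cosine similarity, and full top-3 overlap, which gives a quick sanity check before scoring a trained model.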

Table 6. Overall performance of the group-level LSTM production surrogate model.

MAE (m³/d)	RMSE (m³/d)	MAPE (%)	R²
4.524	5.963	1.255	0.964

Table 7. Forecasting performance under different prediction horizons.

Horizon (Day)	MAE (m³/d)	RMSE (m³/d)	MAPE (%)	R²
1	3.215	3.954	0.894	0.985
2	3.456	4.295	0.960	0.982
3	4.248	5.164	1.173	0.973
4	4.334	5.627	1.195	0.968
5	4.865	6.486	1.347	0.957
6	5.639	7.580	1.570	0.939
7	5.909	7.550	1.644	0.938

Table 8. Summary of PSO-based water injection allocation optimization results.

Cases	Minimum Increment (m³)	Maximum Increment (m³)	Mean Increment (m³)	Std. (m³)	Mean Before (m³)	Mean After (m³)
15	1595.50	1943.96	1737.36	103.89	14,718.94	16,456.30
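
The constrained search summarized in Table 8 can be sketched as a PSO loop with a feasibility repair step, using the swarm settings listed in Table 4. This is a minimal sketch, not the authors' implementation. The repair by Euclidean projection onto the feasible set, the linear inertia schedule, the function names, and the toy concave objective with hypothetical weights (a stand-in for the LSTM surrogate) are all assumptions.

```python
import numpy as np

def project(x, lo, hi, total, iters=60):
    """Project each row of x onto {lo <= q <= hi, sum(q) = total} by bisection."""
    a = (x - hi).min(axis=1, keepdims=True)   # shift giving q = hi (sum too large)
    b = (x - lo).max(axis=1, keepdims=True)   # shift giving q = lo (sum too small)
    for _ in range(iters):
        m = 0.5 * (a + b)
        too_much = np.clip(x - m, lo, hi).sum(axis=1, keepdims=True) > total
        a = np.where(too_much, m, a)
        b = np.where(too_much, b, m)
    return np.clip(x - 0.5 * (a + b), lo, hi)

def pso_allocate(q0, objective, n_particles=45, n_iter=120, w_start=0.90,
                 w_end=0.40, c1=1.8, c2=1.8, q_min=300.0, q_max=1200.0,
                 max_adj=0.15, seed=2026 + 321):
    rng = np.random.default_rng(seed)
    lo = np.maximum(q_min, (1.0 - max_adj) * q0)   # rate bound and -15% limit
    hi = np.minimum(q_max, (1.0 + max_adj) * q0)   # rate bound and +15% limit
    total = q0.sum()                               # total injection unchanged
    x = project(rng.uniform(lo, hi, size=(n_particles, q0.size)), lo, hi, total)
    x[0] = q0                                      # keep the current plan in the swarm
    v = np.zeros_like(x)
    pbest, pbest_f = x.copy(), objective(x)
    g = pbest[pbest_f.argmax()].copy()
    for it in range(n_iter):
        w = w_start + (w_end - w_start) * it / (n_iter - 1)  # ASSUMED linear decay
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = project(x + v, lo, hi, total)          # repair to a feasible allocation
        f = objective(x)
        better = f > pbest_f
        pbest[better], pbest_f[better] = x[better], f[better]
        g = pbest[pbest_f.argmax()].copy()
    return g, float(pbest_f.max())

# Toy stand-in for the surrogate: hypothetical weights, concave in each rate.
weights = np.array([1.0, 0.6, 1.4, 0.8, 1.2])
def toy_objective(q):
    return (weights * np.sqrt(q)).sum(axis=1)

q0 = np.array([600.0, 800.0, 700.0, 900.0, 750.0])   # hypothetical initial rates, m3/d
q_opt, f_opt = pso_allocate(q0, toy_objective)
```

Because the initial allocation is seeded into the swarm and every candidate is projected back onto the constraint set, the returned allocation is feasible and never scores worse than the starting plan under the supplied objective.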

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wei, F.; Chen, X.; Pang, G.; Li, W.; Chen, P.; Jiao, S. Connectivity-Aware LSTM-PSO for Water Injection Allocation in Offshore Waterflooding Reservoirs. Processes 2026, 14, 2065. https://doi.org/10.3390/pr14132065


Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
