Rational Design of High-Performance Viscosifying Polymers in Confined Systems via a Machine-Learning-Accelerated Multiscale Framework for Enhanced Hydrocarbon Recovery

Alvarez-Cruz, Arturo; Mayoral-Villa, Estela; García-Márquez, Alfonso Ramón; Klapp, Jaime

doi:10.3390/fluids11040086

by Arturo Alvarez-Cruz ¹, Estela Mayoral-Villa ¹,*, Alfonso Ramón García-Márquez ² and Jaime Klapp ¹

¹ Departamento de Química y Física, Instituto Nacional de Investigaciones Nucleares, Carretera México-Toluca Km. 36.5, La Marquesa, Ocoyoacac 52750, Estado de México, Mexico

² Departamento de Química Inorgánica y Nuclear, Facultad de Química, UNAM, Circuito Exterior S/N, Ciudad Universitaria, Coyoacán 04510, Ciudad de México, Mexico

* Author to whom correspondence should be addressed.

Fluids 2026, 11(4), 86; https://doi.org/10.3390/fluids11040086

Submission received: 28 February 2026 / Revised: 17 March 2026 / Accepted: 20 March 2026 / Published: 26 March 2026

(This article belongs to the Special Issue Pipe Flow: Research and Applications, 2nd Edition)

Abstract

Rational design of high-performance viscosifying polymers is critical for enhancing supercritical CO₂ flooding efficiency in enhanced oil recovery (EOR). Traditional experimental and simulation approaches are limited in exploring the vast design space of polymer architecture, flexibility, and intermolecular interactions. This work presents an integrated machine learning (ML) and mesoscopic simulation framework using Dissipative Particle Dynamics (DPD) to accelerate the development of tailored polymeric thickeners. We systematically investigate the synergistic effects of linear and branched polymer blends on solvent viscosity under Poiseuille flow, representative of flow in micro-fractures and pore throats. Key molecular descriptors are varied to generate a comprehensive rheological database. These data are used to train a deep neural network (DNN) surrogate model linking molecular parameters to macroscopic viscosity. The DNN is coupled with gradient ascent optimization for inverse design, enabling rapid virtual screening of thousands of formulations. A focused case study demonstrates that star-like architectures with associative cores and semi-flexible backbones outperform linear analogs for supercritical CO₂ viscosity enhancement. The optimal candidate, a four-arm star polymer with linear side chains, was validated by DPD simulation. This multiscale “simulation-to-surrogate” methodology bridges molecular design with continuum-scale flow behavior, offering a transformative tool for formulating cost-effective, efficient, and sustainable next-generation EOR chemicals.

Keywords:

Dissipative Particle Dynamics; multiscale modeling; surrogate model; inverse design; enhanced oil recovery; polymer rheology; machine learning

1. Introduction

The transition towards sustainable industrial processes and efficient energy resource extraction necessitates the accelerated development of advanced functional materials, with high-performance polymeric additives playing a pivotal role. A key research frontier is the design of new thickening additives that can efficiently increase fluid viscosity at low concentrations, are environmentally benign, and cost-effective [1,2,3]. These additives are critical for diverse applications, including fine-tuning the rheology of water-based paints, treating industrial water for reuse, and—most notably—enhancing the viscosity of supercritical CO₂ for use as a water substitute in hydraulic fracturing and enhanced oil recovery (EOR) [2].

Traditional material design, reliant on empirical experimentation or high-fidelity simulations such as density functional theory (DFT) or Molecular Dynamics (MD), is often prohibitively slow and computationally expensive for exploring the vast parameter space governing polymer–solvent interactions. Mesoscopic methods such as Dissipative Particle Dynamics (DPD) have proven to be an alternative that yields accurate qualitative and quantitative results at low computational cost [4,5,6]. However, analyzing the results and generating property predictions in real time remains out of reach, which limits the pace of intelligent materials design for the energy industry. This work presents a methodological framework for Machine Learning (ML)-assisted multiscale modeling as a transformative approach to accelerate the discovery and optimization of polymeric thickeners. We synthesize the state of the art in integrating ML with simulations across scales [7,8,9], from coarse-grained mesoscopic models to continuum-level property prediction, to establish robust, computationally efficient design pipelines.

The role of machine learning in oil development is becoming increasingly prominent, offering powerful tools for predicting complex phenomena and optimizing processes. For instance, ML techniques have been successfully applied to screen effective enhanced oil recovery methods [10], to evaluate the environmental impact and performance of nanomaterials for heavy oil viscosity improvement [11], and to compare drag reduction methods in complex drilling environments such as continental shale formations [12]. These studies underscore the potential of data-driven approaches to complement traditional experimental and simulation techniques, a concept our work builds upon by integrating ML with mesoscopic simulations for polymer design.

In this work, we demonstrate this integrated approach through a targeted case study: optimizing the viscosity enhancement of supercritical CO₂ via mixtures of branched and linear polymers under Poiseuille flow. The workflow is structured around four key molecular descriptor parameters: (a) polymer architecture (Structure), (b) chain flexibility (k_θ), (c) polymer–solvent/polymer–polymer Flory–Huggins interaction parameters χ_ij, and (d) additive concentration (C). By varying these descriptors, a large dataset is generated via high-throughput DPD simulations, which compute the solvent’s shear viscosity using a coupled methodology for confined solutions under steady flow [13].

The comprehensive dataset generated from DPD simulations [14,15,16] provides the essential input for constructing supervised machine learning models. Specifically, we employ deep neural networks (DNNs) to capture the high-dimensional, nonlinear mapping from fundamental molecular descriptors (e.g., architecture, χ parameter) to the emergent rheological response (viscosity). Once trained and validated, this surrogate model predicts viscosity with high fidelity (mean absolute error < 10%) and computational efficiency orders of magnitude greater than the original simulations. This paradigm shift from simulation-to-prediction unlocks two powerful capabilities: the high-throughput virtual screening of thousands of polymer candidates and the gradient-based inverse design of novel thickeners tailored to specific reservoir conditions.

2. Models and Methodology

2.1. DPD-ML-Enhanced Strategy Integration for Thickener Design

Descriptor Definition and High-Throughput Simulation: The system is parameterized by critical molecular descriptors (structure, flexibility, interaction parameter and composition). A designed suite of DPD simulations, varying these parameters systematically, generates a comprehensive database of the resulting solvent shear viscosity under Poiseuille flow.

Machine Learning Model Development: The simulation database is used to train supervised ML models (DNNs). The models learn the functional relationship η = f (structure, k_θ, χ, C).

Virtual Screening and Inverse Design: The trained DNN surrogate is deployed to predict the viscosity of vast numbers of virtual polymer mixtures. Optimization algorithms (gradient ascent method) can then be applied to this model to identify the combination of descriptors that maximizes viscosity enhancement at minimal cost or concentration—solving the inverse design problem.

Validation and Active Learning: Top-performing candidates identified by DNN are validated with targeted, high-fidelity DPD simulations. The new data points are fed back into the training set, refining the DNN model in an active learning loop for continuous improvement.

This ML-enhanced multiscale strategy bridges the gap between molecular-level interactions (captured by DPD) and target macroscale rheological properties, providing a fast, predictive, and rational tool for designing the next generation of high-performance, sustainable thickening agents for the energy and chemical industries.

2.2. Integrated Mesoscale Methodology for Viscosifier Design Flow Simulation via DPD

We employ Dissipative Particle Dynamics (DPD), a well-known particle-based mesoscale method that preserves hydrodynamic behavior and thermal fluctuations [14,15,16] to model the flow of architecturally complex polymers under confinement. This approach is computationally efficient for capturing the relevant length and time scales of polymeric fluid dynamics in confined geometries.

The system consists of soft, coarse-grained beads representing groups of atoms/molecules. Newton’s equations of motion are integrated with pairwise forces:

\vec{F}_{i} = \sum_{j \neq i} [\vec{F}_{i j}^{C} + \vec{F}_{i j}^{D} + \vec{F}_{i j}^{R}]

(1)

where \vec{F}_{i j}^{C} is the conservative force,

\vec{F}_{i j}^{C} = a_{i j} (1 - \frac{r_{i j}}{r_{c}}) \hat{r}_{i j}

and \vec{F}_{i j}^{D} and \vec{F}_{i j}^{R} are, respectively, the dissipative and the random forces, which are coupled via the fluctuation–dissipation theorem and thus provide a built-in thermostat. All forces act within the cutoff radius r_c, with the repulsion parameters a_ij related to the Flory–Huggins interaction parameters χ_ij as (see details in [4,16,17])

a_{i j} = a_{i i} + 3.5 χ_{i j}

(2)
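As a small illustration of Equation (2), the following Python sketch converts between χ_ij and a_ij in both directions. The function names and the like-like repulsion value `A_II = 78.0` are assumptions chosen for illustration only; they are not specified at this point in the text.

```python
def a_from_chi(chi_ij: float, a_ii: float) -> float:
    """Equation (2): a_ij = a_ii + 3.5 * chi_ij (reduced DPD units)."""
    return a_ii + 3.5 * chi_ij


def chi_from_a(a_ij: float, a_ii: float) -> float:
    """Inverse of Equation (2): chi_ij = (a_ij - a_ii) / 3.5."""
    return (a_ij - a_ii) / 3.5


A_II = 78.0  # assumed like-like repulsion parameter, for illustration only

a_example = a_from_chi(2.0, A_II)     # repulsion parameter for chi_ij = 2
chi_example = chi_from_a(50.0, A_II)  # negative chi: unlike beads attract relative to like beads
print(a_example, chi_example)
```

Under this assumed a_ii, values of a_ij below a_ii map to negative χ_ij, that is, to a net attraction between unlike beads.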

To model the monomeric linking to construct the polymer, we use the Kremer–Grest bead-spring model with harmonic bonds [4]:

\vec{F}_{S} (r_{i j}) = - k (r_{i j} - r_{0}) \hat{r}_{i j}

(3)

Based on previous systematic studies of similar DPD applications [4], we set the density to ρ = 3, the equilibrium bond length to r₀ = 0.7, and the spring constant to k = 100.

We prepare a mixture of linear and branched polymers in scCO₂ as the solvent (see Figure 1a). Branched polymers were chosen because they are known to promote intermolecular associations that improve viscosifying power [6] compared with linear chains. Linear chains were also added at different concentrations to analyze a possible synergistic effect [2,5]. We model four-arm star polymers (Bm) with the following features:

(1): Six central “A” beads exhibiting attractive interactions (mimicking π-π stacking).
(2): Four “B” arm beads in good solvent conditions (scCO₂ solvent).
(3): Chain stiffness controlled via harmonic angular potential:

F_{i j k}^{s} = - k_{θ} (θ_{i j k} - θ_{0})

(4)

where θ_ijk is the angle formed by three consecutive beads, k_θ is the angular spring constant and determines the chain flexibility, and θ₀ = 180° is the equilibrium angle [18,19]. This design promotes intermolecular association through the core while allowing tunable flexibility.

2.3. Confined Flow Setup and Simulation Parameters

We simulate stationary Poiseuille flow between parallel plates separated by Λ = 17 r_c. The walls were modeled via a short-ranged repulsive force with amplitude a_w = 115.0, and slip was suppressed by constraining beads within z ≤ 0.15 r_c of the walls to have zero z-velocity. The flow was driven by a constant body force F_p = 0.02 in the x-direction. Periodic boundaries were imposed in the x- and y-directions. Details of this methodology can be found in references [4,6,10,19]. The setup is presented in Figure 1b.

Figure 1. (a) Branched (Bm) and linear polymers (Lm) bead design. (b) Confined flow setup. Polymers in (b) were visualized with VMD 2.0 [20].

All simulations were performed in the NVT ensemble with N_DPD = 15,000 beads, using reduced units m* = 1, T* = 1, r_c* = 1, and δt* = 0.03. The production run comprised 15,000 time steps after equilibration. For the analysis, we monitored the velocity profile v_x(z), which yields the shear rate γ˙(z) = −dv_x/dz, following references [4,10]. The viscoelastic regime was set by applying maximum shear rates γ˙(z) ∼ 0.07−0.10, which correspond to Weissenberg numbers We ∼ 1, indicating comparable polymer relaxation and flow timescales. A constant body force drives the Poiseuille flow, and the resulting velocity profile v_x(z) is used to compute the apparent shear viscosity η from the stress field or via the integrated momentum balance. This setup directly mimics the shear-thinning behavior and confinement effects critical for flow in microporous networks.

2.4. Shear Viscosity Calculation via Poiseuille Flow Simulations

The shear viscosity (η) was computed using non-equilibrium molecular dynamics (NEMD) simulations of steady-state Poiseuille flow. The simulation geometry (Figure 1b) consists of a fluid confined between two parallel, stationary walls oriented perpendicular to the z-axis. The walls are separated by a total distance Λ = 2d = 17r_c, where r_c is the DPD cutoff radius.

Wall Model: The walls are modeled as featureless, purely repulsive surfaces via a short-ranged, linearly decaying force acting along the z-direction:

F_{i}^{W} = \begin{cases} a_{w} (1 - \frac{z_{i}}{z_{c}}) \hat{z}_{i}, & z_{i} < z_{c} \\ 0, & z_{i} \geq z_{c} \end{cases}

(5)

where a_w = 115.0 (in reduced DPD units) is the force amplitude, and z_c = 0.4 r_c is the force cutoff length. This soft-wall model effectively prevents particle penetration, eliminates the need for high-density frozen particle walls, and provides tunable slip control.
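The soft-wall force of Equation (5) can be sketched in Python as follows. This is a minimal illustration, assuming that the walls sit at z = 0 and z = Λ, that z_i in Equation (5) is the distance to the nearer wall, and that each wall pushes beads back into the fluid; the function name and coordinate convention are hypothetical.

```python
def wall_force_z(z: float, gap: float = 17.0, a_w: float = 115.0, z_c: float = 0.4) -> float:
    """Signed z-force on a bead at height z in a slit with walls at z = 0 and z = gap.

    Assumed convention: the lower wall pushes towards +z, the upper wall towards -z.
    """
    if not 0.0 <= z <= gap:
        raise ValueError("bead is outside the slit")
    dist_lower = z          # distance to the lower wall
    dist_upper = gap - z    # distance to the upper wall
    force = 0.0
    if dist_lower < z_c:
        force += a_w * (1.0 - dist_lower / z_c)  # linear decay, zero at z_c
    if dist_upper < z_c:
        force -= a_w * (1.0 - dist_upper / z_c)
    return force


print(wall_force_z(0.2))   # halfway into the lower wall layer
print(wall_force_z(8.5))   # channel centre: no wall force
```

The force is largest at contact (a_w) and vanishes continuously at z_c, so beads in the bulk of the channel feel no wall force.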

Flow Generation and Boundary Conditions: A constant external body force, F_p = 0.02 (in reduced units), was applied to all fluid beads in the positive x-direction to drive the flow. To suppress slip at the walls, beads within a distance z ≤ 0.15r_c of either wall were constrained to have zero velocity in the z-direction. Periodic boundary conditions were applied only in the x- and y-directions.

Steady-State Velocity Profile: Under these conditions, the system evolves to a steady state characterized by a parabolic velocity profile v_x(z) (Figure 1b), as expected for plane Poiseuille flow of a Newtonian fluid. The maximum shear rates (γ˙(z)= ∣dv_x/dz∣) extracted from the profiles ranged from 0.07 to 0.1. For the polymers studied, these shear rates correspond to Weissenberg numbers We ∼ O(1), indicating that chain relaxation times (λ ≈ 10.0–14.3) are comparable to the flow timescale.
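As a quick arithmetic check of the We ∼ O(1) statement, the sketch below evaluates We = λγ˙ at the two ends of the reported ranges. Pairing the shortest relaxation time with the highest shear rate (and vice versa) is an assumption made here for illustration; the text does not state which relaxation time belongs to which shear rate.

```python
def weissenberg(relaxation_time: float, shear_rate: float) -> float:
    """Weissenberg number We = lambda * shear rate (both in reduced DPD units)."""
    return relaxation_time * shear_rate


# Assumed pairing of the range end points quoted in the text
cases = [(10.0, 0.10), (14.3, 0.07)]
we_values = [weissenberg(lam, rate) for lam, rate in cases]

for (lam, rate), we in zip(cases, we_values):
    print(f"lambda = {lam}, shear rate = {rate}: We = {we:.2f}")
```

With this pairing, both end points give We close to 1, consistent with the regime described above.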

Viscosity Extraction: The dynamic shear viscosity η was determined by fitting the simulated velocity profile v_x(z) to the analytical steady-state solution of the Navier–Stokes equation for Poiseuille flow:

v_{0} = - \frac{d^{2}}{2 ρ ν} (\frac{Δ p}{L})

(6)

where v₀ is the maximum centerline velocity and d = Λ/2. The applied body force F_p is related to the pressure gradient and the fitted parameters by [21]

|F_{p}| = \frac{1}{ρ} \frac{\partial p}{\partial x} = \frac{1}{ρ} (\frac{Δ p}{L}) = - \frac{2 ν v_{0}}{d^{2}} = - \frac{2 η v_{0}}{ρ d^{2}}

(7)

Here, ρ is the number density, and ν = η/ρ is the kinematic viscosity. From the fit to Equation (6), we obtain v₀ and the quantity v₀/d². Substituting these into Equation (7) yields the kinematic viscosity ν, from which the dynamic shear viscosity is calculated as η = νρ.
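The extraction step can be sketched as a quadratic fit followed by the substitution described above. This is a minimal NumPy illustration, not the authors' analysis script: it assumes z is measured from the channel centre, uses the magnitudes in Equation (7) so that ν = F_p d² / (2 v₀), and tests the procedure on a synthetic parabolic profile rather than on DPD output.

```python
import numpy as np


def viscosity_from_profile(z, vx, body_force=0.02, rho=3.0):
    """Return (eta, nu) from a quadratic fit of the velocity profile v_x(z).

    For v_x(z) = v0 * (1 - (z/d)**2) the quadratic coefficient equals -v0/d**2,
    so nu = body_force / (2 * v0/d**2) and eta = rho * nu.
    """
    c2, _c1, _c0 = np.polyfit(z, vx, 2)  # coefficients, highest power first
    if c2 >= 0.0:
        raise ValueError("profile is not concave; cannot extract a viscosity")
    v0_over_d2 = -c2
    nu = body_force / (2.0 * v0_over_d2)
    return rho * nu, nu


# Synthetic test profile (illustrative values taken from the text)
d, rho, f_p, eta_true = 8.5, 3.0, 0.02, 1.53
v0 = rho * f_p * d**2 / (2.0 * eta_true)   # centreline velocity implied by eta_true
z = np.linspace(-d, d, 41)
vx = v0 * (1.0 - (z / d) ** 2)

eta_fit, nu_fit = viscosity_from_profile(z, vx, f_p, rho)
print(f"eta = {eta_fit:.3f}, nu = {nu_fit:.3f}")
```

Fitting the curvature rather than a single peak velocity uses the whole profile, which should make the estimate less sensitive to noise in individual bins; with real simulation data, bins adjacent to the walls may need to be excluded.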

This methodology is relevant for fluid dynamics simulations because it enables direct computation of velocity profiles in confined geometries, of the shear-dependent viscosity from the stress tensor or the velocity curvature, of the microstructural evolution under flow, and of the effects of molecular architecture on hydrodynamic behavior, thereby bridging molecular design and macroscopic rheology. The approach is particularly suited for studying thickening efficiency in applications such as enhanced oil recovery, where branched architectures in confined flows exhibit complex viscoelastic responses that challenge continuum models.

2.5. Machine Learning Surrogate Model

The dataset {Structure, k_θ, χ, C} → η from DPD simulations is used to train supervised ML models as follows:

○: Input Features: Molecular descriptors (topology index, k_θ, a_ij (χ_ij) matrix, concentration [Lm] and [Bm]).
○: Target Output: Computed shear viscosity η at specified shear rates.
○: Algorithms: Our methodological pipeline integrates three core computational techniques: Principal Component Analysis (PCA) for preliminary data exploration and feature analysis, Deep Neural Networks (DNNs) to construct accurate, nonlinear surrogate models linking molecular descriptors to macroscopic viscosity, and the Gradient Ascent optimization algorithm to solve the inverse design problem by navigating the DNN’s parameter space towards optimal formulations.
○: Validation: Models are validated on hold-out simulation data and tested for extrapolation to unseen parameter combinations.

2.6. Optimization and Reinforcement Loop Design

The trained surrogate model is coupled with a gradient-based optimization algorithm (the gradient ascent method) to solve the inverse problem:

Targ_max{ η(γ˙(z))} → {S, k_θ, a_ij, [Lm], [Bm]}

(8)

This enables the identification of polymer structures that maximize viscosity under economic and synthetic constraints. Gradient ascent, a method widely used in reinforcement learning, improves the performance of a system within a given environment; here, the trained DNN serves as the environment. Starting from a randomly chosen point in the parameter space, we compute the gradient of the predicted viscosity and follow it along a path of increasing viscosity until a local maximum is reached. The resulting candidate is then simulated with DPD, and the new parameter point is added to the training data.
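The gradient ascent step can be illustrated with a short NumPy sketch. Everything below is an assumption made for illustration: `toy_surrogate` is a hypothetical smooth function standing in for the trained DNN, the four inputs are taken to be descriptors rescaled to [0, 1], gradients are approximated by central finite differences rather than by backpropagation, and the step size and tolerance are arbitrary.

```python
import numpy as np


def toy_surrogate(x):
    """Hypothetical stand-in for the trained DNN: smooth, one interior maximum."""
    peak = np.array([0.25, 0.10, 0.80, 0.30])
    return 3.0 - np.sum((x - peak) ** 2)


def numerical_gradient(f, x, h=1e-5):
    """Central finite-difference gradient of a scalar function f at x."""
    grad = np.zeros_like(x)
    for k in range(x.size):
        step = np.zeros_like(x)
        step[k] = h
        grad[k] = (f(x + step) - f(x - step)) / (2.0 * h)
    return grad


def gradient_ascent(f, x0, lower, upper, lr=0.1, max_iter=500, tol=1e-8):
    """Projected gradient ascent: follow the gradient, clip to the allowed box."""
    x = np.clip(np.asarray(x0, dtype=float), lower, upper)
    for _ in range(max_iter):
        x_new = np.clip(x + lr * numerical_gradient(f, x), lower, upper)
        if np.linalg.norm(x_new - x) < tol:  # no further progress: stop
            break
        x = x_new
    return x, f(x)


rng = np.random.default_rng(0)
start = rng.uniform(0.0, 1.0, size=4)  # random starting point in the design space
best_x, best_eta = gradient_ascent(toy_surrogate, start, 0.0, 1.0)
print(best_x, best_eta)
```

Because gradient ascent only finds a local maximum, a practical workflow would likely restart from several random points and keep the best candidate before passing it to a validation simulation.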

3. Results

3.1. Finite Size Effect Study and Database Generation

To determine the minimum representative size of the training systems, a finite-size analysis was performed for ten systems with particle numbers N_DPD = 5000, 7500, 8750, 10,000, 11,250, 12,500, 13,750, 15,000, 17,500 and 20,000 (corresponding to simulation box lengths L = 11.85, 13.57, 14.29, 14.94, 15.54, 16.09, 16.61, 17.10, 18.00, and 18.82 r_c in DPD units, respectively). All simulations were conducted using k_θ = 50 and a_ij = 70, with a polymer concentration of 26 wt% (linear and branched polymers in a 1:1 ratio) to maintain a total system density of ρ = 3 in DPD units. The parabolic velocity profiles, characteristic of Poiseuille flow, were obtained, and the shear viscosity η was calculated after fitting the velocity profiles with the steady-state solution for the Navier−Stokes equation (see Equations (6) and (7)).
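The quoted box lengths follow from the bead number and the density if the simulation box is assumed to be cubic, so that L = (N_DPD/ρ)^(1/3). The short check below, with a hypothetical helper name, reproduces the listed values to within about 0.01 r_c.

```python
def box_length(n_beads: int, rho: float = 3.0) -> float:
    """Edge length of a cubic box holding n_beads at number density rho (assumed cubic)."""
    return (n_beads / rho) ** (1.0 / 3.0)


bead_numbers = [5000, 7500, 8750, 10000, 11250, 12500, 13750, 15000, 17500, 20000]
reported_lengths = [11.85, 13.57, 14.29, 14.94, 15.54, 16.09, 16.61, 17.10, 18.00, 18.82]

for n, reported in zip(bead_numbers, reported_lengths):
    print(f"N = {n:6d}: L = {box_length(n):.3f} (reported {reported})")
```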

The results for the relative viscosity (η/η₀), where η₀ = 1.53 DPD units is the viscosity of the solvent (scCO₂) obtained with the same procedure, are presented in Figure 2.

The results indicate that for system sizes of L ≥ 11.85 DPD units, the relative viscosity remains essentially constant, fluctuating around an average value of η/η₀ = 3.31. This establishes L = 11.85 as the minimum representative box size required for generating our training systems. Consequently, we performed 1000 distinct DPD simulations at this size (L = 11.85, N = 5000), following the described methodology by using a uniform random sampling strategy that allows exploring the multidimensional design space in an efficient and unbiased way. In these simulations, we randomly vary each of the following parameters:

(a): The number of molecules of linear [Lm] and branched [Bm] polymers in an interval of [0, 100];
(b): The flexibility of the linear polymer chain via the spring constant k_θ in an interval of [5, 500];
(c): The affinity (a_ij) between the branched and the linear polymers in an interval of [10, 200].

For each case, the target property (the relative viscosity η/η₀) was calculated by fitting the velocity profiles v_x(z) and using the maximum asymptotic velocity v_x(z)_max. The setup and the structures of the branched and linear polymers remained unchanged, as shown in Figure 1. This process generated the finely tuned training database for the system.

3.2. Principal Component Analysis (PCA)

Automating the mesoscopic simulations yielded 1000 systems with different descriptor parameters: (1) flexibility (k_θ), (2) interaction (a_ij), (3) branched polymer concentration ([Bm]), and (4) linear polymer concentration ([Lm]). From the simulation output we extracted the target property, the relative viscosity (η/η₀), obtained via v_x(z)_max. Although the system has only a few parameters, principal component analysis is still useful to explore their mutual dependence and sensitivity and to test whether the variables can be reduced through linear combinations of the components. The intermolecular interaction between the head and tail of the arms and the linear polymers, the formulation, and the flexibility were found to be highly correlated, so they were analyzed independently.
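A PCA of the four descriptors can be sketched with a singular value decomposition. The example below is illustrative only: the descriptor table is randomly generated placeholder data drawn from the sampling intervals quoted earlier, not the simulation database, and standardizing each column before the decomposition is an assumed preprocessing choice.

```python
import numpy as np


def pca(X):
    """PCA via SVD on standardised columns.

    Returns the explained-variance ratio of each component and the loadings
    (one row per principal component, one column per descriptor).
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    _, s, vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return explained, vt


rng = np.random.default_rng(0)
n_systems = 1000
# Placeholder descriptor table, NOT the database from the simulations
X = np.column_stack([
    rng.uniform(5, 500, n_systems),    # k_theta
    rng.uniform(10, 200, n_systems),   # a_ij
    rng.integers(0, 101, n_systems),   # [Bm]
    rng.integers(0, 101, n_systems),   # [Lm]
])

explained, loadings = pca(X)
print(np.round(explained, 3))
```

For independently sampled descriptors such as these placeholders, each component carries roughly a quarter of the variance; a strongly uneven spectrum in real data would indicate correlated descriptors.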

3.3. Training Supervised Machine Learning Models for Viscosity Prediction

An iterative architecture search was performed to identify the optimal Deep Neural Network (DNN) surrogate model for predicting dynamic viscosity (η).

For the neural network training, we used: (1) the ReLU (Rectified Linear Unit) activation function for all hidden layers, (2) RMSProp as optimizer, (3) an initial learning rate set to 3 × 10⁻⁴, (4) a maximum of 200 epochs to train the network, implementing early stopping with a patience of 50 epochs to prevent overfitting, and (5) a training validation split with the dataset of 1000 simulations, randomly split into 80% for training (800 samples) and 20% for validation (200 samples).

After evaluating multiple configurations, a DNN with five fully connected hidden layers, each with 2000 neurons, achieved the lowest mean absolute error on the validation set. This final architecture was subsequently trained on the comprehensive dataset derived from Dissipative Particle Dynamics (DPD) simulations, mapping molecular descriptors {a_ij, k_θ, [Bm], [Lm]} to the target property η.
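The training recipe described above (ReLU hidden layers, RMSProp, learning rate 3 × 10⁻⁴, early stopping, 80/20 split) can be sketched in plain NumPy. This is a reduced illustration, not the authors' implementation: the hidden layers are shrunk from 2000 to 64 neurons and from five to two layers for speed, and the mean-squared-error loss, batch size of 32, RMSProp decay of 0.9, and the synthetic data are all assumptions not stated in the text.

```python
import numpy as np


def init_mlp(sizes, rng):
    """He initialisation, suited to ReLU hidden layers."""
    return [(rng.normal(0.0, np.sqrt(2.0 / m), (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def forward(params, X):
    """Return the activations of every layer (ReLU hidden layers, linear output)."""
    acts = [X]
    for k, (W, b) in enumerate(params):
        z = acts[-1] @ W + b
        acts.append(z if k == len(params) - 1 else np.maximum(z, 0.0))
    return acts


def mae(params, X, y):
    return float(np.mean(np.abs(forward(params, X)[-1][:, 0] - y)))


def gradients(params, X, y):
    """Backpropagation for a mean-squared-error loss (assumed loss)."""
    acts = forward(params, X)
    delta = (acts[-1][:, 0] - y)[:, None] * (2.0 / len(y))
    grads = []
    for k in reversed(range(len(params))):
        grads.append((acts[k].T @ delta, delta.sum(axis=0)))
        if k > 0:
            delta = (delta @ params[k][0].T) * (acts[k] > 0)  # ReLU derivative
    return grads[::-1]


def train(X, y, hidden=(64, 64), lr=3e-4, epochs=200, patience=50,
          batch_size=32, decay=0.9, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_train = int(0.8 * len(X))                      # 80/20 split
    tr, va = idx[:n_train], idx[n_train:]
    params = init_mlp((X.shape[1], *hidden, 1), rng)
    cache = [(np.zeros_like(W), np.zeros_like(b)) for W, b in params]
    best_val, best_params, wait = np.inf, params, 0
    for _ in range(epochs):
        order = rng.permutation(tr)
        for start in range(0, n_train, batch_size):
            batch = order[start:start + batch_size]
            grads = gradients(params, X[batch], y[batch])
            new_params, new_cache = [], []
            for (W, b), (gW, gb), (cW, cb) in zip(params, grads, cache):
                cW = decay * cW + (1.0 - decay) * gW**2   # RMSProp running average
                cb = decay * cb + (1.0 - decay) * gb**2
                new_params.append((W - lr * gW / (np.sqrt(cW) + 1e-8),
                                   b - lr * gb / (np.sqrt(cb) + 1e-8)))
                new_cache.append((cW, cb))
            params, cache = new_params, new_cache
        val = mae(params, X[va], y[va])
        if val < best_val:
            best_val, best_params, wait = val, params, 0
        else:
            wait += 1
            if wait >= patience:                     # early stopping
                break
    return best_params, best_val, len(tr), len(va)


# Synthetic demonstration data: four inputs scaled to [0, 1], smooth target
rng = np.random.default_rng(1)
X_demo = rng.uniform(0.0, 1.0, size=(1000, 4))
y_demo = 1.0 + 2.0 * X_demo[:, 0] * X_demo[:, 1] - X_demo[:, 2] + 0.5 * X_demo[:, 3] ** 2
model, val_mae, n_train, n_val = train(X_demo, y_demo, epochs=20)
print(n_train, n_val, val_mae)
```

In practice a deep learning library would normally be used for networks of the size reported here; the sketch only makes the individual ingredients of the recipe explicit.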

Initial model performance was quantitatively assessed by comparing DNN-predicted η values against ground-truth DPD simulation results for a representative subset of polymer formulations (Table 1). In this first training iteration, prediction errors exceeded 10% for high-viscosity systems (η > 3). Notably, the error is significantly influenced by, and decreases relative to, the reference viscosity of the pure solvent (η₀ for scCO₂ = 1.53).

For these initial predictions, the concentrations of branched ([Bm]) and linear ([Lm]) polymers were held constant, while the DNN screened a wide range of the remaining variables: arm flexibility (k_θ, range [0, 500]) and intermolecular affinity (a_ij, range [0, 200]). The parameter a_ij corresponds to Flory–Huggins interaction parameters, where values above a critical threshold (χ_c) indicate segregation-driven system behavior.

A detailed analysis of the maximum flow velocity (v_x(z)_max) for select systems from Table 1 is presented in Figure 3.

This analysis reveals that the lowest velocities (v_x(z)_max < 0.3), corresponding to the most viscous systems, occur at a_ij < 78 (the point of maximum affinity between identical DPD beads). The influence of arm flexibility (k_θ) is less straightforward. Systems with a low concentration of branched polymer ([Bm] = 13; Figure 3d–f) show improved performance (lower v_x(z)_max) at low flexibility, with a slight positive dependence on linear polymer concentration. In contrast, systems with high [Bm] (>20) and low [Lm] (<10) (Figure 3a,b) perform better with rigid branched polymers (k_θ > 230), while flexibility enhances viscosity when linear polymer content is sufficiently high (Figure 3c). The high correlation among system parameters makes it difficult to identify a single optimal combination for maximizing viscosity from this initial data, underscoring the need for more exhaustive sampling and improved prediction accuracy.

To address this, a second training loop was conducted by integrating the initial predictions into the expanded dataset. The results (Table 2) show a marked improvement, with prediction errors reduced to ≤10%, and for the best estimate, achieving an error of only 0.33%.

3.4. Virtual Screening and Inverse Design Framework

Once validated, the trained DNN surrogate model was deployed as a high-throughput virtual laboratory. This framework enables the rapid, computationally inexpensive prediction of dynamic viscosity for millions of hypothetical polymer formulations, a task prohibitively expensive for direct DPD simulation or experimental synthesis. The workflow consists of two synergistic phases:

3.4.1. Virtual High-Throughput Screening

The model was used to screen expansive combinatorial libraries generated by systematically varying key molecular descriptors (k_θ, a_ij, [Bm], [Lm]). This process efficiently maps the vast chemical and topological space to the target property η, identifying promising regions that exhibit enhanced thickening performance. Using the improved predictions from the second training loop, a more detailed analysis of v_x(z)_max was performed with fixed polymer concentrations while scanning a_ij and k_θ intervals (Figure 4). The results confirm that, for high concentrations of branched polymer ([Bm] > 25), the lowest velocities (highest viscosities) consistently correspond to attractive values of a_ij (Figure 4a,b). We observe that a_ij < 50 is the optimal range for increasing the viscosity, indicating that attractive intermolecular interactions between the tails of the polymers are necessary to form synergistic structures between components. When both the branched and linear polymer concentrations are low ([Bm] = 15, [Lm] = 8), little improvement in viscosity is observed (Figure 4c). When the quantity of linear polymer is increased to [Lm] = 20 at a low concentration of branched polymer, [Bm] = 13 (Figure 4d), a synergistic effect occurs, and the best candidates are branched polymers with flexible arms (k_θ < 30) and affinities in the range 25 < a_ij < 50. Nevertheless, the influence of arm rigidity (k_θ) remains complex, with viscosity showing non-monotonic behavior as flexibility varies.

3.4.2. Analysis of Composition–Flexibility Interplay

To systematically elucidate the influence of branched polymer arm flexibility and synergy, we analyzed systems with a fixed, high intermolecular affinity (a_ij = 50) while varying the concentrations of branched ([Bm]: 0–30) and linear ([Lm]: 0–30) polymers. This analysis was performed for two contrasting arm stiffness regimes: flexible (k_θ = 50) and rigid (k_θ = 200). The resulting viscosity response surfaces are presented in Figure 5.

The results reveal distinct behavior in each regime. For flexible-arm polymers k_θ = 50 (Figure 5a), viscosity exhibits a strong positive dependence on linear polymer concentration: increasing [Lm] consistently leads to higher η. This suggests that chain flexibility and the affinity promote interaction or entanglement with linear chains, enhancing the thickening capability of the blend.

Conversely, in the rigid-arm regime (Figure 5b), the composition–viscosity relationship is notably more complex and non-monotonic. Regions of higher viscosity are observed at high linear polymer concentrations ([Lm] > 30) and low branched polymer concentration ([Bm] < 10), indicating that rigidity may foster intrinsic self-structuring of the branched component that contributes to thickening. However, the response shows local variations and non-systematic behavior in certain composition ranges, suggesting the presence of competitive interactions or non-linear synergies between components.

These findings underscore the coupled and non-trivial dependence of viscosity on both composition and molecular flexibility. The clear divergence in thickening mechanisms between the two regimes demonstrates that optimal formulation design cannot rely on simple heuristic analysis or one-dimensional parameter extrapolation.

The complexity observed in this analysis, together with the virtual screening results, highlights the need for a systematic and computationally guided approach to efficiently navigate the high-dimensional design space. This necessity motivates the implementation of advanced gradient-based inverse design techniques, described in the following section, which aim to automatically and optimally identify polymer architectures that maximize the target viscosity.

3.5. Inverse Design via Gradient-Based Optimization

To advance beyond passive screening, we formulated a targeted inverse design problem: identify the optimal combination of molecular descriptors that maximizes target viscosity under defined physical and compositional constraints.

This was achieved by treating the trained, differentiable DNN surrogate model as an objective function, η = f(x), where x is the vector of input descriptors {a_ij, k_θ, [Bm], [Lm]}. We then employed gradient ascent optimization to navigate the high-dimensional design space. By computing the gradients of the predicted viscosity with respect to the input features (∇η), the algorithm iteratively follows the direction of steepest ascent toward higher η values. This gradient-based framework enables the de novo computational design of polymer formulations that maximize thickening efficiency. The output of this inverse design process is a set of optimized molecular blueprints. Selected candidates predicted to be high-performance viscosifiers (Table 3) are presented below. Their performance was validated against DPD simulations to assess the predictive fidelity of the optimized designs, also presented in Table 3.

When validating the selected systems, we observe an error of approximately 20%, which we attribute to the model extrapolating to unexplored regions of the design space. However, as expected, the values obtained in the simulations are lower in all cases, reaching viscosities of up to η = 4.04. To quantify the percentage increase in viscosity, we use the following equation:

\% \, \mathrm{increase} = (\frac{η}{η_{0}} - 1) \times 100 \%

(9)

where η/η₀ is the predicted relative viscosity, and a value of 1 means that η = η₀, so no increase in viscosity is observed.

For the previous example (η = 4.04), we have estimated a 164% increase in the viscosity of scCO₂. Nevertheless, the cost in terms of added polymer weight is high; therefore, only those candidates where wt% < 35 are considered for further analysis with more robust simulations.
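The 164% figure can be reproduced directly from Equation (9) with η = 4.04 and the solvent viscosity η₀ = 1.53 quoted earlier. The helper name below is hypothetical.

```python
def percent_increase(eta: float, eta0: float = 1.53) -> float:
    """Equation (9): percentage viscosity increase relative to the pure solvent."""
    return (eta / eta0 - 1.0) * 100.0


increase = percent_increase(4.04)
print(f"{increase:.0f}%")  # prints "164%"
```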

3.6. Validation and Active Learning

Top-performing candidates identified by DNN are validated with targeted, high-fidelity DPD simulations. These simulations were carried out using N = 15,000 in a simulation box of L = 17.1 while maintaining the density ρ = 3. The results are presented in Table 4.

The new data points are fed back into the training set, refining the DNN model in an active learning loop for continuous improvement.

3.7. Study of the Synergistic Effect of a Mixture of Branched and Linear Polymers

One of these systems was chosen to analyze the synergistic effect. DPD simulations were carried out with N = 15,000 particles, performing two experiments. In the first, the concentration of linear polymer [Lm] = 72 was kept constant, while the concentration of branched polymer [Bm] was varied from 0 to 72. In the second experiment, the concentration of branched polymer [Bm] = 72 was kept constant, and the concentration of linear polymer [Lm] was varied from 0 to 72. The results obtained are shown in Figure 6.

At constant branched-polymer concentration [Bm] = 72, the system exhibits a markedly elevated viscosity from the outset: even before adding any linear polymer [Lm] = 0, the viscosity is already 40% higher than in the analog system containing only the linear polymer. As the linear-polymer fraction increases, the viscosity rises accordingly and eventually reaches a plateau once both polymer concentrations become equal. Conversely, when the linear-polymer concentration is held constant, the viscosity increases linearly, attaining a value of 3.2 at equal polymer concentrations. Collectively, these trends demonstrate that branched polymers impart a substantially stronger thickening effect than their linear counterparts, producing high initial viscosities followed by only modest increases at elevated concentrations. In contrast, starting from a fixed amount of linear polymer, the incremental addition of branched polymer enables precise tuning of the fluid’s viscosity.

4. Conclusions

This work presents an integrated methodological framework that combines Dissipative Particle Dynamics (DPD) simulations with machine learning (ML) to accelerate the rational design of thickening polymers for enhanced oil recovery applications. From a fluid dynamics perspective, the results demonstrate that an appropriate combination of linear and branched polymer architectures fundamentally alters the rheological response of supercritical CO₂. Branched polymers with flexible arms show superior thickening performance compared to linear polymers, increasing the apparent viscosity through more effective formation of intermolecular networks. Nevertheless, the relative viscosity can be tailored by varying the concentration of the linear polymer at a given branched-polymer concentration.

The implementation of a deep neural network (DNN) as a surrogate model enabled prediction errors to be reduced below 10% after successive training cycles, facilitating the virtual screening of thousands of formulations and solving the inverse design problem through gradient-based optimization. The high initial correlation among molecular descriptors—intermolecular affinity (a_ij), flexibility (k_θ), and concentrations of branched (Bm) and linear (Lm) polymers—explains the prediction errors exceeding 10% during the first DNN training iteration, reflecting the nonlinear and coupled nature of these variables on the resulting flow curves. However, incorporating new data through successive training cycles significantly reduced these errors, demonstrating the effectiveness of the active learning approach for model refinement.

The ML model identifies a specific mixture of four-arm star polymer and linear polymer with a_ij = 203 and a semi-flexible backbone (k_θ = 58) in an 81:24 ratio ([Bm]:[Lm]) as the top candidate, predicting a viscosity increase of over 230% at approximately 26 wt% concentration (Table 4). Subsequent DPD simulation validated this result and confirmed the predicted flow curve.

Although inverse design via gradient ascent identified formulations that increase scCO₂ viscosity by up to 300%, these require high polymer loadings (>30 wt%). We therefore discarded them in favor of the best compromise between technical efficiency and economic viability. This methodological framework not only accelerates the discovery of new viscosifiers but also provides a mechanistic understanding of how molecular architecture and interactions determine macroscopic rheology.

Nevertheless, further research is needed to reduce the effective polymer concentration, potentially by exploring alternative molecular architectures or synergies with other additives.

Thus, this closed-loop framework transforms the predictive model from a passive analyzer into an active design engine, accelerating the discovery of next-generation additives with tailored rheological properties. The integration of mesoscale hydrodynamic simulations with machine learning creates a powerful paradigm for designing polymers optimized for specific flow conditions in EOR applications.

Author Contributions

All authors participated in conceptualization, formal analysis, writing—original draft preparation, writing—review and editing. E.M.-V. and A.A.-C. performed calculations, chose methodology and adapted ML-DPD software. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Project COMECYT, CIKAS-FICDTEM-25-030.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. For institutional security reasons, permission must be requested before the data can be shared.

Acknowledgments

The authors thank the DGTIC-Miztli project Nano2DhyME for supercomputing facilities and the Laboratorio de Supercómputo y Visualización en Paralelo (LSVP-Yoltla) of UAM-Iztapalapa, where part of the simulations were carried out.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Cummings, S.; Xing, D.; Enick, R.; Rogers, S.; Heenan, R.; Grillo, I.; Eastoe, J. Design principles for supercritical CO₂ viscosifiers. Soft Matter 2012, 8, 7044–7055.
2. Orr, F.M.; Taber, J.J. Use of Carbon Dioxide in Enhanced Oil Recovery. Science 1984, 224, 563–569.
3. Enick, R.M.; Klara, S.M. Effects of CO₂ Solubility in Brine on the Compositional Simulation of CO₂ Floods. SPE Reserv. Eng. 1992, 7, 253–258.
4. Mayoral, E.; Goicochea, A.G. Modeling of Branched Thickening Polymers under Poiseuille Flow Gives Clues as to How to Increase a Solvent’s Viscosity. J. Phys. Chem. B 2021, 125, 1692–1704.
5. Mayoral, E.; Arcos-Casarrubias, J.; Goicochea, A.G. Self-assembly of model surfactants as reverse micelles in nonpolar solvents and their role as interfacial tension modifiers. Colloids Surf. A Physicochem. Eng. Asp. 2021, 615, 126244.
6. Mayoral, E.; Velázquez, J.D.H.; Goicochea, A.G. The viscosity of polyelectrolyte solutions and its dependence on their persistence length, concentration and solvent quality. RSC Adv. 2022, 12, 35494.
7. Iskandarov, J.; Ahmed, S.; Fanourgakis, G.S.; Alameri, W.; Froudakis, G.E.; Karanikolos, G.N. Predicting and optimizing CO₂ foam performance for enhanced oil recovery: A machine learning approach to foam formulation focusing on apparent viscosity and interfacial tension. Mar. Pet. Geol. 2024, 170, 107108.
8. Cheraghi, Y.; Kord, S.; Mashayekhizadeh, V. Application of machine learning techniques for selecting the most suitable enhanced oil recovery method; challenges and opportunities. J. Pet. Sci. Eng. 2021, 205, 108761.
9. Shakeel, M.; Pourafshary, P.; Hashmet, M.R.; Muneer, R. Application of machine learning techniques to predict viscosity of polymer solutions for enhanced oil recovery. Energy Syst. 2023, 1–24.
10. Ali, J.; Ansari, U.; Ali, F.; Javed, T.; Hullio, I.A. Application of machine learning for effective screening of enhanced oil recovery methods. Reserv. Sci. 2026, 2, 65–80.
11. Xu, N.; Wang, Y. Effect of nanomaterials on improving the apparent viscosity of heavy oil and the environmental evaluation of reservoir environment. Reserv. Sci. 2026, 2, 1–15.
12. Hu, Y.; Yang, Y. A comparative study on drag reduction methods for continental shale drilling in the Fuxing Block, southeastern Sichuan Basin. Reserv. Sci. 2026, 2, 81–96.
13. Gama Goicochea, A.; Mayoral, E.; Klapp, J.; Pastorino, C. Nanotribology of biopolymer brushes in aqueous solution using dissipative particle dynamics simulations: An application to PEG covered liposomes in a theta solvent. Soft Matter 2014, 10, 166–174.
14. Hoogerbrugge, P.J.; Koelman, J.M.V.A. Simulating Microscopic Hydrodynamic Phenomena with Dissipative Particle Dynamics. Europhys. Lett. 1992, 19, 155–160.
15. Español, P.; Warren, P. Statistical Mechanics of Dissipative Particle Dynamics. Europhys. Lett. 1995, 30, 191–196.
16. Groot, R.D.; Warren, P.B. Dissipative Particle Dynamics: Bridging the Gap Between Atomistic and Mesoscopic Simulation. J. Chem. Phys. 1997, 107, 4423–4435.
17. Maiti, A.; McGrother, S. Bead–bead interaction parameters in dissipative particle dynamics: Relation to bead-size, solubility parameter, and surface tension. J. Chem. Phys. 2004, 120, 1594–1601.
18. Grest, G.S.; Kremer, K. Molecular Dynamics Simulation for Polymers in the Presence of a Heat Bath. Phys. Rev. A 1986, 33, 3628–3631.
19. Hernández-Fragoso, J.S.; Alas, S.d.J.; Goicochea, A.G. Polymer Chains of a Large Persistence Length in a Polymer Brush Require a Lower Force to Be Compressed Than Chains with a Short Persistence Length. ACS Appl. Polym. Mater. 2020, 2, 5006–5013.
20. Humphrey, W.; Dalke, A.; Schulten, K. VMD-Visual Molecular Dynamics. J. Molec. Graph. 1996, 14, 33–38.
21. Sigalotti, L.D.G.; Klapp, J.; Sira, E.; Meleán, Y.; Hasmy, A. SPH simulations of time-dependent Poiseuille flow at low Reynolds numbers. J. Comp. Phys. 2003, 191, 622–638.
22. Origin 2026 (Academic), version 2026; OriginLab Corporation: Northampton, MA, USA, 2026.

Figure 2. Finite size effect study of viscosity (see text for details). Plots were created with Origin 2026 [22].

Figure 3. Detailed analysis of v_x(z)_max behavior for the systems in Table 1, with k_θ varied over [0, 500] (x-axis) and a_ij over [0, 200] (y-axis). The range of v_xmax values (color scale, third dimension) is indicated on the right side of each plot. Yellow corresponds to the highest v_xmax value (lowest viscosity), and purple to the lowest v_xmax value (highest viscosity).

Figure 4. Detailed analysis of v_x(z)_max behavior for the systems in Table 2, with k_θ varied over [0, 500] (x-axis) and a_ij over [0, 200] (y-axis). The range of v_xmax values (color scale) is indicated on the right side of each plot. Yellow corresponds to the highest v_xmax value (lowest viscosity), and purple to the lowest v_xmax value (highest viscosity).

Figure 5. Viscosity (η) response surfaces as a function of branched ([Bm]) and linear ([Lm]) polymer concentrations for polymers with (a) flexible (k_θ = 50) and (b) rigid (k_θ = 200) arms, at fixed intermolecular affinity (a_ij = 50). Left plots show the whole dataset. Center plots zoom in on the 0–30 range for both [Bm] and [Lm]. The range of v_xmax values (color scale) is indicated on the right side of each plot. Yellow corresponds to the highest v_xmax value (lowest viscosity), and purple to the lowest v_xmax value (highest viscosity).

Figure 6. (a) Viscosity as a function of one polymer concentration with the other polymer concentration fixed at 72; (b) relative viscosity as a function of polymer concentration. In both panels, [Bm] is shown in black and [Lm] in red.

Table 1. First training loop: DNN predictions for η against DPD simulation results for a representative subset.

[Bm]	[Lm]	k_θ	a_ij	wt%	η Prediction	η DPD Simulation	Error %	η/η₀
25	6	132	195	24.2	4.0561	3.0131	25.71	1.97
22	27	56	143	25.64	3.7664	3.0131	20.00	1.97
24	24	50	70	26.88	4.0618	3.0131	25.81	1.97
13	21	126	23	16.16	2.7752	2.3435	15.56	1.54
13	25	136	49	16.16	2.7041	2.3968	11.36	1.57
13	9	123	100	13.76	2.5722	2.3435	8.89	1.54
30	30	190	150	33.6	1.4647	1.4251	2.70	0.93
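The table does not state how the Error % column is normalized. As a hedged reading, the tabulated values appear consistent with the absolute difference divided by the DNN prediction (not by the DPD value); the sketch below recomputes the column under that assumption. The helper name `relative_error_percent` is illustrative.

```python
# (eta_prediction, eta_dpd, tabulated_error_percent) copied from Table 1.
table_1 = [
    (4.0561, 3.0131, 25.71),
    (3.7664, 3.0131, 20.00),
    (4.0618, 3.0131, 25.81),
    (2.7752, 2.3435, 15.56),
    (2.7041, 2.3968, 11.36),
    (2.5722, 2.3435, 8.89),
    (1.4647, 1.4251, 2.70),
]


def relative_error_percent(predicted: float, simulated: float) -> float:
    """Assumed definition: |prediction - simulation| / prediction * 100."""
    return abs(predicted - simulated) / predicted * 100.0


recomputed = [relative_error_percent(p, s) for p, s, _ in table_1]
# Largest disagreement between the recomputed and the tabulated column.
max_gap = max(abs(r - t) for r, (_, _, t) in zip(recomputed, table_1))

for (p, s, t), r in zip(table_1, recomputed):
    print(f"pred = {p:.4f}  DPD = {s:.4f}  tabulated = {t:5.2f}%  recomputed = {r:5.2f}%")
print(f"largest gap: {max_gap:.3f} percentage points")
```

Under this assumed definition the recomputed values match the tabulated ones to within rounding.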

Table 2. Second training loop: improved DNN predictions for η against DPD simulation results.

[Bm]	[Lm]	k_θ	a_ij	wt%	η Prediction	η DPD Simulation	Error %	η/η₀
27	22	58	141	29.24	3.9059	3.6365	6.90	2.38
28	29	32	104	31.56	3.9059	3.5153	10.00	2.30
25	7	57	103	24.4	3.6365	3.01	14.71	2.03
13	20	114	92	15.96	2.7041	2.5722	4.88	1.69
11	8	77	63	11.72	2.3435	2.1522	8.16	1.41
5	30	50	50	10.6	2.0328	2.0394	0.33	1.34
1	1	10	130	1.12	1.5420	1.5587	1.08	1.02

Table 3. Top-performing viscosifiers identified via gradient-based inverse design. Formulations are optimized for high predicted η, with subsequent validation via DPD simulation.

[Bm]	[Lm]	k_θ	a_ij	wt%	η Prediction	η DPD Simulation	Error %	η/η₀
27	8	58	203	26.44	4.12	3.28	20.49	2.15
30	15	61	226	30.6	4.47	3.46	22.53	2.27
30	35	10	147	34.6	4.54	3.65	19.64	2.39
38	5	8	137	35.96	5.24	3.96	24.35	2.60
39	24	4	216	40.68	5.16	4.04	21.77	2.64
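The wt% column is not defined in this part of the text. As a consistency check only, the values in this table happen to be reproduced by a linear relation, wt% ≈ 0.92·[Bm] + 0.20·[Lm]. The two coefficients are fitted here from the tabulated numbers and are an assumption, not a formula given by the authors.

```python
# ([Bm], [Lm], tabulated wt%) copied from Table 3.
table_3 = [
    (27, 8, 26.44),
    (30, 15, 30.6),
    (30, 35, 34.6),
    (38, 5, 35.96),
    (39, 24, 40.68),
]

# Assumed coefficients, fitted by hand from the tables (not from the source).
WT_PER_BRANCHED = 0.92
WT_PER_LINEAR = 0.20


def estimated_wt_percent(bm: float, lm: float) -> float:
    """Empirical estimate of total polymer loading from the two concentrations."""
    return WT_PER_BRANCHED * bm + WT_PER_LINEAR * lm


gaps = [abs(estimated_wt_percent(bm, lm) - wt) for bm, lm, wt in table_3]

for (bm, lm, wt), gap in zip(table_3, gaps):
    print(f"[Bm] = {bm:2d}  [Lm] = {lm:2d}  tabulated = {wt:5.2f}  "
          f"estimated = {estimated_wt_percent(bm, lm):5.2f}  gap = {gap:.3f}")
```

A check of this kind is a convenient way to spot transcription slips in the concentration or loading columns.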

Table 4. Top-performing viscosifiers identified via DNN. Formulations are optimized for high predicted η, with subsequent validation via DPD simulation.

k_θ	a_ij	wt%	[Bm]	[Lm]	η (N = 15,000) DPD Simulation	η/η₀ (N = 15,000)
50	70	26.88	72	72	2.65	3.18
58	203	26.44	81	24	2.82	3.38
61	226	30.6	90	45	2.95	3.54
10	147	34.6	90	105	3.41	4.08
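The percentage increases quoted in the Conclusions can be read off the η/η₀ column. The sketch below assumes that η/η₀ is the relative viscosity with respect to a polymer-free reference η₀, so that the percentage increase is (η/η₀ − 1) × 100; it also back-calculates the η₀ implied by each row. Both helper names are illustrative.

```python
# (eta, eta_over_eta0) copied from Table 4 (N = 15,000).
table_4 = [
    (2.65, 3.18),
    (2.82, 3.38),
    (2.95, 3.54),
    (3.41, 4.08),
]


def percent_increase(relative_viscosity: float) -> float:
    """Viscosity increase over the reference, in percent."""
    return (relative_viscosity - 1.0) * 100.0


def implied_reference_viscosity(eta: float, relative_viscosity: float) -> float:
    """Reference viscosity eta_0 implied by one table row."""
    return eta / relative_viscosity


increases = [percent_increase(rel) for _, rel in table_4]
references = [implied_reference_viscosity(eta, rel) for eta, rel in table_4]

for (eta, rel), inc, ref in zip(table_4, increases, references):
    print(f"eta = {eta:.2f}  eta/eta0 = {rel:.2f}  increase = {inc:.0f}%  implied eta0 = {ref:.3f}")
```

All four rows imply nearly the same reference viscosity, which supports reading the column as a ratio to a single reference value.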
