Article

Physics-Informed Neural Networks for Three-Dimensional River Microplastic Transport: Integrating Conservation Principles with Deep Learning

1 College of Naval Architecture and Civil Engineering, Zhangjiagang Campus, Jiangsu University of Science and Technology, Suzhou 215600, China
2 Suzhou Institute of Technology, College of Naval Architecture and Civil Engineering, Jiangsu University of Science and Technology, Suzhou 215600, China
3 The National Key Laboratory of Water Disaster Prevention, Hohai University, Nanjing 210003, China
4 College of Hydrology and Water Resources, Hohai University, Nanjing 210003, China
5 Jiangsu Yonglianjingzhu Construction Group Co., Ltd., Suzhou 215600, China
* Author to whom correspondence should be addressed.
Sustainability 2026, 18(3), 1392; https://doi.org/10.3390/su18031392
Submission received: 6 November 2025 / Revised: 19 December 2025 / Accepted: 6 January 2026 / Published: 30 January 2026

Abstract

Microplastic pollution in riverine systems poses critical environmental challenges, yet predictive modeling remains constrained by data scarcity and the computational limitations of traditional numerical approaches. This study develops a physics-informed neural network (PINN) framework that integrates advection–diffusion equations and turbulence modeling approaches with deep learning architectures to simulate three-dimensional microplastic transport dynamics. The methodology embeds governing partial differential equations as soft constraints, enabling predictions under sparse observational conditions (requiring approximately three times fewer observation points than conventional numerical models), while maintaining physical consistency. Applied to a representative 15 km Yangtze River reach with 12 months of monitoring data, the model achieves improved performance with a root mean square error of 0.82 particles/m³ and a Nash–Sutcliffe efficiency exceeding 0.88, representing a 34% accuracy improvement over conventional finite volume methods. The framework successfully captures size-dependent transport behavior, identifies three primary accumulation hotspots exhibiting 3–5 times elevated concentrations, and quantifies nonlinear flux–discharge relationships with 6–8-fold amplification during high-flow events. This physics-constrained approach provides practical findings for pollution management and establishes an adaptable computational framework for environmental transport modeling in data-limited scenarios across diverse riverine systems.

1. Introduction

Microplastic pollution in aquatic ecosystems has emerged as a critical environmental challenge affecting riverine systems worldwide [1,2,3], with recent studies highlighting escalating concerns about long-term accumulation and ecosystem impacts [4,5,6]. Rivers serve as primary conduits transporting microplastics from terrestrial sources to marine environments, with annual flux estimates exceeding 1.5 million tons globally and increasing evidence of prolonged residence times in freshwater systems [2,4]. These particles, typically defined as plastic fragments smaller than 5 mm, pose significant ecological risks through bioaccumulation in aquatic organisms, disruption of food web dynamics, and potential human health impacts via drinking water contamination [3,5,6]. The complex transport mechanisms of microplastics in rivers, governed by turbulent flow dynamics, particle–fluid interactions, and sediment exchange processes, necessitate comprehensive understanding for effective pollution management and mitigation strategies.
Traditional approaches to modeling microplastic transport have primarily relied on Eulerian–Lagrangian hydrodynamic models coupled with particle-tracking algorithms [7]. These physics-based frameworks incorporate fundamental conservation equations and empirical settling velocity formulations to simulate particle trajectories. However, such models face considerable challenges in capturing the multiscale heterogeneity of river systems, particularly regarding irregular particle morphologies, biofouling-induced density variations, and spatiotemporal flow complexities [8]. Recent advances in data-driven methodologies, particularly machine learning algorithms, have demonstrated promising capabilities in predicting pollutant dispersion patterns from observational datasets [9]. Nevertheless, purely data-driven approaches often suffer from limited physical interpretability and poor generalization beyond training conditions, constraining their applicability to diverse hydrological scenarios [10,11,12]. For instance, machine learning models trained on low-flow data often exhibit prediction errors increasing by 50–80% when extrapolating to extreme hydrological events such as floods or droughts, where discharge rates exceed training data ranges [11,12]. Similarly, temporal extrapolation studies demonstrate that data-driven models exhibit substantial performance degradation when applied to conditions 2–4 °C warmer or to different seasonal patterns than those represented in training datasets [12].
Physics-informed neural networks (PINNs) represent an innovative paradigm that integrates the governing physical laws with deep learning architectures [13]. By embedding partial differential equations as soft constraints in loss functions, PINNs enable the exploitation of sparse observational data, while maintaining consistency with fundamental fluid mechanics principles [14]. Recent applications demonstrate significant advances across diverse environmental domains: groundwater contaminant transport studies report 20–40% accuracy improvements over traditional methods while reducing computational costs by factors of 3–5 [15,16]; atmospheric pollutant dispersion modeling achieves improved predictions under sparse monitoring conditions with 30–50% fewer required observations [17,18]; and coastal hydrodynamic simulations successfully capture tidal dynamics and flood propagation with enhanced temporal resolution [19,20]. However, applications to riverine microplastic transport remain limited, with existing research primarily focusing on simplified one-dimensional systems that neglect lateral dispersion and transverse circulation effects [21,22]. Critically, these studies often assume uniform particle properties and constant settling velocities, failing to capture the heterogeneous behavior observed in field conditions where particle densities vary by 20–40% due to biofouling and aggregation processes [22]. Furthermore, most existing frameworks neglect critical processes such as size-dependent resuspension dynamics, threshold-dependent bed exchange, and interactions between multiple particle size fractions [22].
Current technical bottlenecks include inadequate representation of complex particle–flow coupling mechanisms [23], insufficient incorporation of sediment–microplastic interaction dynamics in near-bed regions [24], and challenges in assimilating heterogeneous multi-source data from field monitoring, remote sensing, and laboratory experiments into unified modeling frameworks [25].
This study addresses these gaps by developing a physics-constrained neural network framework specifically designed for three-dimensional river microplastic transport simulation. The research integrates Navier–Stokes equations and advection–diffusion transport principles with neural network architectures, enabling high-fidelity predictions under data-scarce conditions. Key innovations include explicit encoding of turbulent diffusion mechanisms, adaptive weighting strategies for multi-physics constraints, and the incorporation of bed-load exchange dynamics. The framework aims to achieve superior predictive performance compared to conventional approaches while maintaining physical consistency and enhancing interpretability. Primary contributions encompass methodology development for physics-guided deep learning in complex environmental systems, quantitative assessment of microplastic fate and transport patterns in representative river reaches, and provision of computational tools supporting evidence-based pollution management strategies [26].

2. Fundamental Theory of River Microplastic Transport Dynamics

2.1. Fundamental Equations of River Hydrodynamics

River flow dynamics are governed by fundamental conservation principles that form the foundation for microplastic transport modeling [27]. The continuity equation, expressing mass conservation for incompressible flow, is formulated as follows:
$$\frac{\partial u_i}{\partial x_i} = 0$$
where $u_i$ represents velocity components in the $x_i$ coordinate directions [28]. The momentum conservation is described by the Navier–Stokes equations, which account for inertial, pressure, viscous, and gravitational forces acting on fluid parcels:
$$\frac{\partial u_i}{\partial t} + u_j \frac{\partial u_i}{\partial x_j} = -\frac{1}{\rho}\frac{\partial p}{\partial x_i} + \nu \frac{\partial^2 u_i}{\partial x_j \partial x_j} + g_i$$
where $\rho$ denotes fluid density, $p$ represents pressure, $\nu$ is kinematic viscosity, and $g_i$ denotes gravitational acceleration components [29].
Turbulent flow characteristics dominate most riverine systems, necessitating Reynolds decomposition of instantaneous variables into mean and fluctuating components. The Reynolds-Averaged Navier–Stokes (RANS) equations incorporate additional Reynolds stress terms arising from velocity fluctuations:
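$$\frac{\partial \bar{u}_i}{\partial t} + \bar{u}_j \frac{\partial \bar{u}_i}{\partial x_j} = -\frac{1}{\rho}\frac{\partial \bar{p}}{\partial x_i} + \nu \frac{\partial^2 \bar{u}_i}{\partial x_j \partial x_j} - \frac{\partial \overline{u_i' u_j'}}{\partial x_j} + g_i$$
where the overbar denotes time-averaged quantities and primes indicate fluctuating components [30]. The Reynolds stress tensor, $-\overline{u_i' u_j'}$, requires closure through turbulence modeling approaches. The standard $k$–$\varepsilon$ model relates Reynolds stresses to mean velocity gradients via the Boussinesq approximation:
$$-\overline{u_i' u_j'} = \nu_t \left( \frac{\partial \bar{u}_i}{\partial x_j} + \frac{\partial \bar{u}_j}{\partial x_i} \right) - \frac{2}{3} k \delta_{ij}$$
where $\nu_t$ represents turbulent eddy viscosity and $k$ denotes turbulent kinetic energy [31]. Transport equations for $k$ and dissipation rate $\varepsilon$ complete the closure:
$$\frac{\partial k}{\partial t} + \bar{u}_j \frac{\partial k}{\partial x_j} = \frac{\partial}{\partial x_j}\left[ \left( \nu + \frac{\nu_t}{\sigma_k} \right) \frac{\partial k}{\partial x_j} \right] + P_k - \varepsilon$$
$$\frac{\partial \varepsilon}{\partial t} + \bar{u}_j \frac{\partial \varepsilon}{\partial x_j} = \frac{\partial}{\partial x_j}\left[ \left( \nu + \frac{\nu_t}{\sigma_\varepsilon} \right) \frac{\partial \varepsilon}{\partial x_j} \right] + C_{1\varepsilon} \frac{\varepsilon}{k} P_k - C_{2\varepsilon} \frac{\varepsilon^2}{k}$$
where $P_k$ denotes the production term and $\sigma_k$, $\sigma_\varepsilon$, $C_{1\varepsilon}$, and $C_{2\varepsilon}$ are empirical constants [32].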
Boundary condition specifications require no-slip velocity conditions at solid boundaries, free-surface kinematic and dynamic conditions at the air–water interface, and prescribed discharge or water levels at inlet–outlet sections. The initial conditions encompass the spatially distributed velocity and pressure fields obtained from steady-state solutions or observational data interpolation.

2.2. Microplastic Particle Transport Mechanisms

Microplastic transport in riverine environments encompasses three principal processes: suspended advection, gravitational settling, and bed resuspension [33]. Suspended particles are transported downstream by ambient flow while simultaneously experiencing vertical displacement driven by the balance between settling and turbulent diffusion forces. Deposition occurs when settling velocity exceeds upward turbulent fluctuations, whereas resuspension initiates when bed shear stress surpasses critical thresholds for particle entrainment [34].
The motion of individual microplastic particles is governed by the force balance equation incorporating gravitational, buoyancy, drag, and lift components. The gravitational force acting on a spherical particle is expressed as follows:
$$F_g = \frac{1}{6} \pi d_p^3 \rho_p g$$
where $d_p$ denotes particle diameter, $\rho_p$ represents particle density, and $g$ is gravitational acceleration [35]. The buoyancy force, which opposes gravity, follows Archimedes’ principle:
$$F_b = \frac{1}{6} \pi d_p^3 \rho_f g$$
where $\rho_f$ represents fluid density [36]. The drag force, arising from the relative velocity between the particle and fluid and acting against it, is formulated as follows:
$$F_d = -\frac{1}{8} \pi d_p^2 \rho_f C_D \left| u_p - u_f \right| (u_p - u_f)$$
where $C_D$ denotes the drag coefficient; $u_p$ and $u_f$ represent particle and fluid velocities, respectively [37]. The shear-induced lift force, particularly significant in velocity gradient regions, is expressed as follows:
$$F_L = \frac{1}{8} \pi d_p^3 \rho_f C_L \frac{\partial u_f}{\partial z} (u_f - u_p)$$
where $C_L$ represents the lift coefficient and $z$ indicates the vertical coordinate [38].
For low Reynolds number flows, Stokes’ settling theory provides a fundamental description of particle settling velocity under creeping flow conditions:
$$w_s = \frac{(\rho_p - \rho_f) g d_p^2}{18 \mu}$$
where $\mu$ denotes dynamic viscosity [39]. This formulation assumes spherical particles with negligible inertial effects, valid when the particle Reynolds number, $Re_p = \rho_f w_s d_p / \mu$, remains below unity.
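The Stokes formula and its validity check can be illustrated with a minimal sketch. The water properties (density 1000 kg/m³, viscosity 1.0e-3 Pa·s), the particle density of 1380 kg/m³, and the two diameters are illustrative assumptions, not values taken from the study.

```python
def stokes_settling_velocity(d_p, rho_p, rho_f=1000.0, mu=1.0e-3, g=9.81):
    """Stokes settling velocity w_s = (rho_p - rho_f) g d_p^2 / (18 mu), in m/s.

    A negative value means the particle is buoyant and rises.
    """
    return (rho_p - rho_f) * g * d_p ** 2 / (18.0 * mu)


def particle_reynolds(w_s, d_p, rho_f=1000.0, mu=1.0e-3):
    """Particle Reynolds number Re_p = rho_f |w_s| d_p / mu."""
    return rho_f * abs(w_s) * d_p / mu


# Hypothetical particles: assumed density 1380 kg/m^3, diameters 50 and 500 micrometres.
for d_p in (50e-6, 500e-6):
    w_s = stokes_settling_velocity(d_p, rho_p=1380.0)
    re_p = particle_reynolds(w_s, d_p)
    regime = "Stokes assumption holds" if re_p < 1.0 else "outside the Stokes regime"
    print(f"d_p = {d_p * 1e6:.0f} um: w_s = {w_s:.3e} m/s, Re_p = {re_p:.3g} ({regime})")
```

Under these assumed properties the smaller particle stays well inside the creeping-flow range, while the larger one exceeds $Re_p = 1$, which is where empirical corrections become necessary.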
Deviations from ideal sphericity and elevated Reynolds numbers necessitate empirical corrections. The modified settling velocity incorporating shape effects is expressed as follows:
$$w_{s,m} = w_s \cdot \psi \cdot \left( 1 + 0.15\, Re_p^{0.687} \right)^{-1}$$
where $\psi$ represents the particle shape factor ranging from 0.3 for flat fragments to 1.0 for perfect spheres. The drag coefficient exhibits Reynolds number dependence:
$$C_D = \frac{24}{Re_p} \left( 1 + 0.15\, Re_p^{0.687} \right) + \frac{0.42}{1 + 42500\, Re_p^{-1.16}}$$
accounting for transitions across flow regimes [39].
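Because the corrected settling velocity depends on $Re_p$, which itself depends on the settling velocity, one way to evaluate it is a fixed-point iteration. The sketch below assumes that $Re_p$ is evaluated with the corrected velocity (the text does not say which velocity is used), and the fluid properties and particle values are illustrative assumptions.

```python
def drag_coefficient(re_p):
    """C_D = 24/Re_p (1 + 0.15 Re_p^0.687) + 0.42 / (1 + 42500 Re_p^-1.16), for Re_p > 0."""
    return 24.0 / re_p * (1.0 + 0.15 * re_p ** 0.687) + 0.42 / (1.0 + 42500.0 * re_p ** -1.16)


def corrected_settling_velocity(d_p, rho_p, psi=1.0, rho_f=1000.0, mu=1.0e-3,
                                g=9.81, tol=1e-10, max_iter=200):
    """Solve w = w_stokes * psi / (1 + 0.15 Re_p(w)^0.687) by fixed-point iteration."""
    w_stokes = (rho_p - rho_f) * g * d_p ** 2 / (18.0 * mu)
    w = w_stokes  # start from the uncorrected Stokes value
    for _ in range(max_iter):
        re_p = rho_f * abs(w) * d_p / mu
        w_new = w_stokes * psi / (1.0 + 0.15 * re_p ** 0.687)
        if abs(w_new - w) < tol:
            return w_new
        w = w_new
    return w


# Hypothetical 500 micrometre particle, assumed density 1380 kg/m^3.
w_sphere = corrected_settling_velocity(500e-6, 1380.0, psi=1.0)
w_flake = corrected_settling_velocity(500e-6, 1380.0, psi=0.3)
print(f"sphere: {w_sphere:.4f} m/s, flat fragment: {w_flake:.4f} m/s")
```

At small $Re_p$ the drag correlation approaches the Stokes limit $24/Re_p$, so the correction only matters for the coarser size fractions.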
Size-dependent transport behavior manifests distinct patterns across microplastic size spectra. Fine particles (<100 μm) exhibit quasi-conservative transport behavior, with settling velocities comparable to turbulent fluctuation scales, maintaining prolonged suspension [34]. Medium-sized particles (100–1000 μm) demonstrate intermittent settling–resuspension cycles modulated by flow variability. Larger fragments (>1000 μm) predominantly undergo bedload transport with limited vertical dispersion, accumulating in depositional zones during low-flow periods [33].

2.3. Advection–Diffusion Equation and Numerical Solution Methods

Microplastic concentration distribution in riverine systems is governed by the advection–diffusion equation, which balances convective transport, turbulent diffusion, gravitational settling, and source–sink terms [40]. The three-dimensional transport equation for suspended microplastic concentration ($C$) is expressed as follows:
$$\frac{\partial C}{\partial t} + u_i \frac{\partial C}{\partial x_i} = \frac{\partial}{\partial x_i}\left( D_i \frac{\partial C}{\partial x_i} \right) - w_s \frac{\partial C}{\partial z} + S \tag{14}$$
where $u_i$ represents velocity components, $D_i$ denotes the diffusion coefficients in respective directions, $w_s$ is the particle settling velocity, and $S$ represents source–sink terms accounting for bed exchange processes [41]. The settling term introduces vertical advection beyond fluid motion, distinguishing particulate transport from dissolved contaminant dynamics.
Turbulent diffusion coefficients exhibit spatial heterogeneity, reflecting flow structure complexity. Longitudinal diffusivity, $D_x$, in river channels typically scales with flow characteristics through empirical relationships:
$$D_x = \alpha u_* H$$
where $\alpha$ represents an empirical coefficient (typically 5–10), $u_*$ denotes shear velocity, and $H$ indicates flow depth [42]. Transverse and vertical diffusivities follow similar scaling, with reduced coefficients reflecting anisotropic turbulence structure. Alternative formulations relate diffusivity to the turbulent kinetic energy and dissipation rate via the following:
$$D_i = C_\mu \frac{k^2}{\varepsilon} + \nu$$
where $C_\mu$ is an empirical constant [40].
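The two diffusivity formulations can be compared with a short sketch. The shear velocity, depth, turbulence quantities, the mid-range coefficient of 7.5, and the customary $k$–$\varepsilon$ value of 0.09 for the empirical constant are all assumptions chosen for illustration.

```python
def longitudinal_diffusivity(u_star, depth, alpha=7.5):
    """D_x = alpha * u_star * H, with alpha assumed mid-range of the quoted 5-10."""
    return alpha * u_star * depth


def turbulent_diffusivity(k, eps, nu=1.0e-6, c_mu=0.09):
    """D_i = C_mu k^2 / eps + nu; c_mu = 0.09 is the usual k-epsilon value (assumed here)."""
    return c_mu * k ** 2 / eps + nu


# Hypothetical flow state: u* = 0.05 m/s, H = 10 m, k = 1e-3 m^2/s^2, eps = 1e-5 m^2/s^3.
d_x = longitudinal_diffusivity(u_star=0.05, depth=10.0)
d_turb = turbulent_diffusivity(k=1.0e-3, eps=1.0e-5)
print(f"D_x = {d_x:.2f} m^2/s, D_i = {d_turb:.4f} m^2/s")
```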
Traditional numerical approaches for solving Equation (14) include the finite difference method (FDM), finite element method (FEM), and finite volume method (FVM). The FDM discretizes derivatives using Taylor series expansions on structured grids, yielding the following algebraic equations:
$$\frac{C_i^{n+1} - C_i^n}{\Delta t} + u_i \frac{C_{i+1}^n - C_{i-1}^n}{2 \Delta x} = D \frac{C_{i+1}^n - 2 C_i^n + C_{i-1}^n}{\Delta x^2}$$
providing computational efficiency but limited geometric flexibility [43]. The FEM employs variational formulations with basis function expansion, offering superior adaptability to irregular boundaries:
$$\int_\Omega \left( \phi_j \frac{\partial C}{\partial t} + \phi_j u_i \frac{\partial C}{\partial x_i} + \frac{\partial \phi_j}{\partial x_i} D_i \frac{\partial C}{\partial x_i} \right) d\Omega = 0$$
where $\phi_j$ represents shape functions and $\Omega$ denotes the computational domain [30]. The FVM integrates conservation equations over control volumes, ensuring inherent mass conservation suitable for complex topographies [43].
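The explicit finite difference scheme can be sketched in one dimension as follows. The periodic boundaries, constant velocity and diffusivity, grid spacing, time step, and Gaussian initial pulse are assumptions made to keep the example small; the step sizes are chosen so that $D\Delta t/\Delta x^2 \le 1/2$ and the cell Péclet number $u\Delta x/D \le 2$.

```python
import math


def ftcs_step(c, u, d, dx, dt):
    """Advance concentrations one step with forward-time, central-space differences."""
    n = len(c)
    new = [0.0] * n
    for i in range(n):
        left, right = c[i - 1], c[(i + 1) % n]  # periodic neighbours
        advection = u * (right - left) / (2.0 * dx)
        diffusion = d * (right - 2.0 * c[i] + left) / dx ** 2
        new[i] = c[i] + dt * (diffusion - advection)
    return new


# Hypothetical setup: 200 cells of 10 m, u = 0.5 m/s, D = 5 m^2/s, dt = 5 s.
dx, dt, u, d = 10.0, 5.0, 0.5, 5.0
c0 = [math.exp(-((i * dx - 500.0) / 100.0) ** 2) for i in range(200)]
c = c0
for _ in range(100):
    c = ftcs_step(c, u, d, dx, dt)
print(f"mass before: {sum(c0):.6f}, mass after: {sum(c):.6f}, peak after: {max(c):.4f}")
```

On a periodic grid the scheme conserves the summed concentration to rounding error, while the peak decays through physical diffusion.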
Despite widespread application, these methods face significant limitations in microplastic transport modeling. The high-resolution spatial discretization required for capturing concentration gradients near boundaries demands substantial computational resources. Numerical diffusion artifacts in advection-dominated flows introduce spurious spreading, particularly affecting peak concentration predictions. The calibration of numerous empirical parameters within turbulence closure and diffusivity formulations requires extensive observational data rarely available for microplastic systems [44].

3. Construction of Physics-Constrained Neural Network Model

3.1. Physics-Constrained Neural Network Architecture Design

Physics-informed neural networks (PINNs) represent a paradigm shift in scientific computing by embedding governing physical laws directly into neural network training processes [45]. Originally proposed for solving partial differential equations through automatic differentiation, PINNs have evolved to address inverse problems, parameter identification, and data assimilation challenges across diverse engineering domains [46]. The fundamental principle involves minimizing a composite loss function incorporating both data mismatch and physics equation residuals, enabling model training with sparse observational data while maintaining physical consistency.
The proposed neural network architecture for microplastic transport simulation comprises a feedforward deep neural network mapping spatiotemporal coordinates to concentration and velocity fields. Figure 1 illustrates the overall computational framework integrating physical constraints into the network structure. The input layer accepts four-dimensional vectors $(x, y, z, t)$ representing spatial coordinates and time, normalized to facilitate gradient propagation [47]. Hidden layers employ fully connected architectures with residual connections to mitigate gradient vanishing in deep networks and enhance feature representation capacity.
As shown in Figure 1, the architecture processes spatiotemporal inputs through multiple hidden layers before generating concentration and velocity predictions, with physics residuals computed via automatic differentiation and incorporated into the loss function.
Activation function selection significantly influences network expressiveness and training stability. The hyperbolic tangent (tanh) function is employed for hidden layers due to its smoothness properties essential for computing the high-order derivatives required in partial differential equation residuals [48]. The output layer utilizes linear activation for concentration predictions to avoid artificial bounds, while velocity outputs employ bounded activation ensuring physical realizability.
Network depth and width are designed considering the complexity of transport dynamics and computational efficiency. Table 1 summarizes the detailed architectural configuration parameters adopted in this study. As presented in Table 1, the network comprises 8 hidden layers with 128 neurons per layer, providing sufficient capacity to approximate complex nonlinear concentration distributions while maintaining trainable parameter counts within computational constraints [49]. This configuration balances model expressiveness against overfitting risks, particularly critical given the sparse microplastic monitoring data typical in riverine environments.
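As a rough check on model size, the trainable parameter count of such a network can be tallied directly. The sketch assumes four outputs (concentration plus three velocity components) and identity skip connections that add no parameters; neither detail is specified in the text above.

```python
def mlp_parameter_count(n_in, n_hidden, n_layers, n_out):
    """Count weights and biases of a fully connected network with identity skips."""
    first = n_in * n_hidden + n_hidden                          # input -> first hidden layer
    middle = (n_layers - 1) * (n_hidden * n_hidden + n_hidden)  # hidden -> hidden
    last = n_hidden * n_out + n_out                             # last hidden -> output
    return first + middle + last


# 4 inputs (x, y, z, t), 8 hidden layers of 128 neurons, 4 assumed outputs (C, u, v, w).
n_params = mlp_parameter_count(n_in=4, n_hidden=128, n_layers=8, n_out=4)
print(f"trainable parameters: {n_params}")
```

Under these assumptions the count is on the order of 10^5 parameters, roughly two orders of magnitude more than the number of training samples, which is why the physics residual acts as an important regularizer.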
Physical equation constraints are embedded through automatic differentiation of network predictions with respect to inputs, generating derivatives appearing in governing equations [45]. The advection–diffusion equation residual is computed at collocation points throughout the spatiotemporal domain, with residual minimization enforced via penalty terms in the loss function. This mechanism ensures that the predicted concentration fields satisfy conservation principles, even in regions lacking direct observations, fundamentally distinguishing PINNs from purely data-driven approaches [46].

3.2. Physics Constraint Terms and Loss Function Construction

The training objective for physics-informed neural networks integrates multiple constraint components within a composite loss function, ensuring simultaneous satisfaction of observational data, governing equations, and auxiliary conditions [50]. The total loss function is formulated as a weighted sum of distinct penalty terms:
$$\mathcal{L}_{total} = \lambda_d \mathcal{L}_{data} + \lambda_p \mathcal{L}_{PDE} + \lambda_b \mathcal{L}_{BC} + \lambda_i \mathcal{L}_{IC}$$
where $\lambda_d$, $\lambda_p$, $\lambda_b$, and $\lambda_i$ represent weighting coefficients balancing different constraint contributions [51].
The data-driven term quantifies the discrepancy between network predictions and available measurements at observation locations $x_d^i$:
$$\mathcal{L}_{data} = \frac{1}{N_d} \sum_{i=1}^{N_d} \left| C(x_d^i, t_d^i; \theta) - C_{obs}^i \right|^2$$
where $N_d$ denotes the number of observations, $C(x_d^i, t_d^i; \theta)$ represents network-predicted concentration with parameters $\theta$, and $C_{obs}^i$ indicates measured values [52]. This term anchors predictions to empirical evidence, providing supervised learning signals in data-rich regions.
The physics equation residual term enforces governing equation satisfaction at collocation points, $x_p^j$, distributed throughout the spatiotemporal domain:
$$\mathcal{L}_{PDE} = \frac{1}{N_p} \sum_{j=1}^{N_p} \left| \frac{\partial C}{\partial t} + u_i \frac{\partial C}{\partial x_i} - \frac{\partial}{\partial x_i}\left( D_i \frac{\partial C}{\partial x_i} \right) + w_s \frac{\partial C}{\partial z} \right|^2$$
where derivatives are computed via automatic differentiation and $N_p$ represents the number of collocation points [45]. This component maintains physical consistency across regions lacking direct measurements, fundamentally enabling extrapolation beyond training data coverage.
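To make the residual concrete, the sketch below evaluates it pointwise for a known analytic field, using central finite differences as a stand-in for automatic differentiation. It is restricted to two spatial coordinates $(x, z)$ with constant velocity, diffusivity, and settling velocity, and it omits the source term; the test field and all parameter values are assumptions for illustration.

```python
import math


def pde_residual(conc, x, z, t, u, w_s, d, h=1e-3):
    """Residual C_t + u C_x - D (C_xx + C_zz) + w_s C_z at a single point."""
    c0 = conc(x, z, t)
    c_t = (conc(x, z, t + h) - conc(x, z, t - h)) / (2.0 * h)
    c_x = (conc(x + h, z, t) - conc(x - h, z, t)) / (2.0 * h)
    c_z = (conc(x, z + h, t) - conc(x, z - h, t)) / (2.0 * h)
    c_xx = (conc(x + h, z, t) - 2.0 * c0 + conc(x - h, z, t)) / h ** 2
    c_zz = (conc(x, z + h, t) - 2.0 * c0 + conc(x, z - h, t)) / h ** 2
    return c_t + u * c_x - d * (c_xx + c_zz) + w_s * c_z


# Hypothetical constants and a decaying travelling wave that satisfies the equation exactly.
U, W_S, D, KX, KZ = 0.5, 0.02, 0.1, 2.0, 1.0


def exact_field(x, z, t):
    phase = KX * (x - U * t) + KZ * (z - W_S * t)
    return math.exp(-D * (KX ** 2 + KZ ** 2) * t) * math.sin(phase)


def wrong_field(x, z, t):
    return x * x  # does not satisfy the equation


r_exact = pde_residual(exact_field, 0.3, 0.7, 0.4, U, W_S, D)
r_wrong = pde_residual(wrong_field, 1.0, 0.0, 0.0, U, W_S, D)
print(f"residual for exact field: {r_exact:.2e}, for wrong field: {r_wrong:.3f}")
```

The physics loss is the mean of the squared residual over all collocation points, so a field that satisfies the equation contributes almost nothing while a non-physical field is penalized.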
Boundary condition constraints ensure that predictions satisfy specified flux or concentration conditions at domain boundaries $\Gamma$:
$$\mathcal{L}_{BC} = \frac{1}{N_b} \sum_{k=1}^{N_b} \left| \mathcal{B}\left[ C(x_b^k, t_b^k; \theta) \right] \right|^2$$
where $\mathcal{B}[\cdot]$ represents the boundary operator (Dirichlet, Neumann, or Robin type) and $N_b$ denotes boundary collocation points [53]. Common formulations include no-flux conditions at riverbed and banks, prescribed inlet concentrations, and zero-gradient outlet conditions.
Initial condition constraints enforce temporal consistency at simulation onset:
$$\mathcal{L}_{IC} = \frac{1}{N_i} \sum_{l=1}^{N_i} \left| C(x_i^l, t_0; \theta) - C_0(x_i^l) \right|^2$$
where $C_0(x_i^l)$ represents the prescribed initial concentration distribution and $N_i$ indicates the spatial sampling points at $t = t_0$ [54].
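The four terms combine into the total loss as a weighted sum of mean squared quantities. In the minimal sketch below, the residual lists and the weights are made-up numbers used only to show the bookkeeping; in an actual PINN each list would come from evaluating the network at observation, collocation, boundary, and initial points.

```python
def mean_squared(values):
    """Mean of squared entries, the common form of all four loss terms."""
    return sum(v * v for v in values) / len(values)


def total_loss(data_err, pde_res, bc_res, ic_err, weights):
    """Return the weighted total loss and the individual components."""
    parts = {
        "data": mean_squared(data_err),  # prediction minus observation
        "pde": mean_squared(pde_res),    # governing-equation residuals
        "bc": mean_squared(bc_res),      # boundary-operator residuals
        "ic": mean_squared(ic_err),      # prediction minus initial field
    }
    total = sum(weights[name] * value for name, value in parts.items())
    return total, parts


# Hypothetical residuals and weights (not the values of Table 2).
weights = {"data": 1.0, "pde": 2.0, "bc": 1.0, "ic": 0.5}
loss, parts = total_loss([1.0, -1.0], [0.5, 0.5], [0.0], [2.0], weights)
print(loss, parts)
```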
Figure 2 demonstrates the hierarchical structure of loss components and their contribution to network optimization, where each term addresses distinct aspects of the inverse problem while collectively ensuring physically consistent predictions.
Weighting coefficients critically influence training dynamics and final model performance. Disproportionate emphasis on data terms may yield overfitting with poor physics adherence, whereas excessive physics weighting can produce solutions satisfying equations but deviating from observations [50]. The challenge intensifies when constraint terms exhibit disparate magnitudes due to dimensional heterogeneity or sampling density variations.
An adaptive weighting strategy dynamically adjusts coefficients during training to balance gradient contributions from different loss components [51]. The time-dependent weight for the physics term is as follows:
$$\lambda_p(t) = \lambda_{p0} \cdot \left( 1 + \frac{\mathcal{L}_{data}}{\mathcal{L}_{PDE} + \epsilon} \right)^{\alpha}$$
where $\lambda_{p0}$ represents initial weight, $\epsilon$ prevents division by zero, and $\alpha$ controls the adaptation rate. Similarly, adaptive boundary weights employ the following:
$$\lambda_b(t) = \lambda_{b0} \cdot \exp\left( \beta \cdot \frac{\mathcal{L}_{BC}}{\mathcal{L}_{total}} \right)$$
with $\beta$ governing sensitivity to the boundary’s residual magnitude [55].
An alternative gradient-based approach equalizes gradient magnitudes across loss terms:
$$\lambda_p(t) = \frac{\left\| \nabla_\theta \mathcal{L}_{data} \right\|}{\left\| \nabla_\theta \mathcal{L}_{PDE} \right\| + \epsilon}$$
ensuring comparable influence on parameter updates regardless of absolute loss magnitudes [51]. Table 2 lists the specific weight configurations and adaptation parameters employed in this study. As presented in Table 2, initial weights prioritize data fitting while gradually increasing physics constraint emphasis through training epochs, with adaptation parameters tuned to maintain stable convergence.
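The three weighting rules can be sketched as small functions. The sketch reads the physics weight as $\lambda_{p0}(1 + \mathcal{L}_{data}/(\mathcal{L}_{PDE} + \epsilon))^{\alpha}$, the boundary weight as $\lambda_{b0}\exp(\beta \mathcal{L}_{BC}/\mathcal{L}_{total})$, and the gradient-based weight as a ratio of Euclidean gradient norms; the default parameter values are placeholders rather than the settings of Table 2.

```python
import math


def adaptive_physics_weight(lambda_p0, loss_data, loss_pde, alpha=0.5, eps=1e-8):
    """Raise the physics weight when the data loss dominates the PDE loss."""
    return lambda_p0 * (1.0 + loss_data / (loss_pde + eps)) ** alpha


def adaptive_boundary_weight(lambda_b0, loss_bc, loss_total, beta=1.0):
    """Scale the boundary weight by the boundary share of the total loss."""
    return lambda_b0 * math.exp(beta * loss_bc / loss_total)


def gradient_balanced_weight(grad_data, grad_pde, eps=1e-8):
    """Ratio of gradient norms, so both terms push the parameters comparably."""
    def norm(g):
        return math.sqrt(sum(x * x for x in g))
    return norm(grad_data) / (norm(grad_pde) + eps)


# Hypothetical loss values and gradient vectors.
print(adaptive_physics_weight(1.0, loss_data=3.0, loss_pde=1.0))
print(adaptive_boundary_weight(2.0, loss_bc=0.1, loss_total=1.0))
print(gradient_balanced_weight([3.0, 4.0], [0.0, 5.0]))
```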

3.3. Model Training Strategy and Optimization Algorithm

The effective training of physics-informed neural networks requires strategic sampling of spatiotemporal collocation points to balance computational efficiency with solution accuracy [56]. Spatial sampling employs Latin hypercube sampling with N = 10,000 collocation points distributed across the spatiotemporal domain (15 km × 1.2 km × 15 m in space and 365 days in time), with density ratios of 5:3:2 allocated to interior domain points, boundary points, and initial condition points, respectively. This allocation maintains coverage of the interior while concentrating samples near boundaries and regions with anticipated steep concentration gradients. Temporal discretization adopts variable step sizes, with a finer resolution (Δt = 1 h) during initial transient periods and hydrological events, and coarser intervals (Δt = 6 h) approaching quasi-steady states. The ratio of physics collocation points to observational data points is maintained at approximately 10:1, providing sufficient physics constraint enforcement without overwhelming the limited measurement signals [45].
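A minimal Latin hypercube sampler for the interior collocation points might look like the sketch below. It uses only the standard library, treats the domain as a simple box with the stated extents, and splits the 10,000 points according to the 5:3:2 ratio; how boundary and initial points are actually placed is not described in the text, so only the interior set is generated.

```python
import random


def latin_hypercube(n_points, bounds, seed=0):
    """Draw n_points so that each dimension has exactly one sample per stratum."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniform draw inside each of n_points equal strata, then shuffle
        # the strata so that dimensions are paired at random.
        strata = [(i + rng.random()) / n_points for i in range(n_points)]
        rng.shuffle(strata)
        columns.append([low + s * (high - low) for s in strata])
    return list(zip(*columns))


# Split of the 10,000 collocation points by the 5:3:2 ratio.
n_total = 10_000
split = {name: n_total * share // 10
         for name, share in (("interior", 5), ("boundary", 3), ("initial", 2))}

# Box domain in metres and days: x, y, z, t (geometry simplified for illustration).
bounds = [(0.0, 15_000.0), (0.0, 1_200.0), (0.0, 15.0), (0.0, 365.0)]
interior_points = latin_hypercube(split["interior"], bounds)
print(split, interior_points[0])
```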
Optimization algorithm selection significantly impacts training convergence and computational cost. The Adam (Adaptive Moment Estimation) optimizer serves as the primary training algorithm, leveraging adaptive learning rates for individual parameters through first- and second-moment estimates of gradients:
$$m_t = \beta_1 m_{t-1} + (1 - \beta_1) \nabla_\theta \mathcal{L}$$
$$v_t = \beta_2 v_{t-1} + (1 - \beta_2) \left( \nabla_\theta \mathcal{L} \right)^2$$
where $m_t$ and $v_t$ represent biased first- and second-moment estimates, with $\beta_1 = 0.9$ and $\beta_2 = 0.999$ as standard hyperparameters [56]. Adam excels in handling sparse gradients and noisy objective landscapes characteristic of physics-informed loss functions, making it suitable for initial training phases.
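The moment estimates feed into the parameter update after bias correction, a step of the standard Adam algorithm that the equations above leave implicit. The sketch applies it to a toy quadratic objective; the objective, starting point, and learning rate are assumptions chosen only to show the mechanics.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a list of parameters; t is the 1-based step counter."""
    m = [beta1 * mi + (1.0 - beta1) * gi for mi, gi in zip(m, grad)]
    v = [beta2 * vi + (1.0 - beta2) * gi * gi for vi, gi in zip(v, grad)]
    m_hat = [mi / (1.0 - beta1 ** t) for mi in m]  # bias-corrected first moment
    v_hat = [vi / (1.0 - beta2 ** t) for vi in v]  # bias-corrected second moment
    theta = [p - lr * mh / (math.sqrt(vh) + eps)
             for p, mh, vh in zip(theta, m_hat, v_hat)]
    return theta, m, v


def toy_loss(theta):
    return sum(p * p for p in theta)


# Toy problem: minimise the sum of squares, whose gradient is 2 * theta.
theta = [1.0, -2.0]
m, v = [0.0, 0.0], [0.0, 0.0]
initial_loss = toy_loss(theta)
for step in range(1, 501):
    grad = [2.0 * p for p in theta]
    theta, m, v = adam_step(theta, grad, m, v, step, lr=0.05)
final_loss = toy_loss(theta)
print(f"loss: {initial_loss:.3f} -> {final_loss:.3e}")
```

On the first step the bias-corrected ratio has magnitude close to one, so every parameter moves by roughly the learning rate regardless of gradient scale, which is part of why Adam tolerates loss terms of very different magnitudes.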
Following Adam pretraining, the Limited-memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS) algorithm refines solutions through quasi-Newton optimization [57]. L-BFGS approximates the inverse Hessian matrix using gradient history, enabling rapid convergence near local minima:
$$\theta_{k+1} = \theta_k - \alpha_k H_k^{-1} \nabla_\theta \mathcal{L}(\theta_k)$$
where $H_k^{-1}$ represents the approximated inverse Hessian and $\alpha_k$ denotes the line search step size. This two-stage optimization strategy combines Adam’s robustness with the L-BFGS algorithm’s precision, achieving superior final accuracy compared to single-algorithm approaches [57].
Learning rate scheduling implements exponential decay to prevent oscillations during late-stage training. The initial learning rate of $10^{-3}$ decays by a factor of 0.95 every 1000 epochs, progressively reducing parameter update magnitudes as optimization approaches convergence. Early stopping monitors validation loss over a patience window of 500 epochs, terminating training when improvements cease to prevent overfitting to training data noise [56].
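The schedule and the stopping rule can be expressed compactly. The sketch assumes a stepwise decay (the rate changes only at multiples of 1000 epochs) and counts any epoch without a new best validation loss toward the patience window; both are plausible readings rather than details confirmed by the text.

```python
def learning_rate(epoch, lr0=1e-3, decay=0.95, step=1000):
    """Stepwise exponential decay: multiply by `decay` once every `step` epochs."""
    return lr0 * decay ** (epoch // step)


class EarlyStopping:
    """Signal a stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience=500, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.wait = 0

    def update(self, val_loss):
        """Record one validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.wait = 0
        else:
            self.wait += 1
        return self.wait >= self.patience


# Short demonstration with a patience of 3 instead of 500.
stopper = EarlyStopping(patience=3)
decisions = [stopper.update(loss) for loss in (1.0, 0.9, 0.95, 0.96, 0.97)]
print(learning_rate(0), learning_rate(2500), decisions)
```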
Convergence assessment employs multiple criteria evaluated collectively. Primary indicators include total loss reduction below $10^{-5}$, physics residual magnitude below $10^{-4}$, and relative change in loss values below $10^{-6}$ over 200 consecutive epochs [58]. Additional validation against withheld test data ensures a generalization capability beyond training samples.
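One way to read "evaluated collectively" is that all three primary criteria must hold at once, which the following sketch assumes. The loss histories passed in are synthetic.

```python
def has_converged(loss_history, pde_residual, loss_tol=1e-5, res_tol=1e-4,
                  rel_tol=1e-6, window=200):
    """Check total loss, physics residual, and loss stagnation together."""
    if len(loss_history) <= window:
        return False  # not enough epochs to judge stagnation
    recent = loss_history[-(window + 1):]
    rel_changes = [abs(b - a) / max(abs(a), 1e-30)
                   for a, b in zip(recent, recent[1:])]
    return (loss_history[-1] < loss_tol
            and pde_residual < res_tol
            and max(rel_changes) < rel_tol)


# Synthetic histories: a flat, small loss versus one that is still decreasing.
flat = [5e-6] * 300
decreasing = [1e-5 * 0.99 ** i for i in range(300)]
print(has_converged(flat, 5e-5), has_converged(decreasing, 5e-5))
```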
Hyperparameter tuning follows a systematic grid search over critical parameters, with cross-validation on independent data subsets guiding selection. Table 3 summarizes the optimized hyperparameter configuration adopted for microplastic transport simulation. Table 3 lists the comprehensive training settings encompassing the network architecture, optimization parameters, and convergence criteria, providing reproducible specifications for model implementation.

4. Experimental Design and Results Analysis

4.1. Experimental Dataset Construction and Preprocessing

The study area encompasses a 15 km reach of the lower Yangtze River located between Nanjing and Zhenjiang, characterized by complex hydrodynamics resulting from tidal influence, tributary confluences, and anthropogenic modifications [59]. The river segment exhibits an average width of 1.2 km, mean depth ranging from 8 to 15 m during normal flow conditions, and discharge variability between 12,000 and 45,000 m³/s across seasonal cycles. Bed topography features irregular bathymetry with scour holes exceeding 20 m in depth near navigation channels and shallow shoals along riverbanks. This reach represents a critical transition zone where microplastic loads from upstream urban centers encounter estuarine dynamics, making it representative of transport processes in large lowland rivers.
Microplastic monitoring data were acquired through systematic field campaigns conducted biweekly over a 12-month period from March 2023 to February 2024. Surface water samples were collected at 18 cross-sectional stations distributed along the longitudinal axis, with three transverse positions per cross-section capturing near-bank and mid-channel variations [60]. Sampling employed manta trawl nets with 333 μm mesh size, towed for 30 min intervals to capture sufficient particle mass while maintaining spatial resolution. Laboratory analysis involved density separation using saturated sodium chloride solution, followed by stereomicroscopy identification and Fourier-transform infrared spectroscopy verification of polymer composition. Microplastic concentrations were quantified as particles per cubic meter, with size fractionation into three categories: 333–1000 μm, 1000–3000 μm, and 3000–5000 μm.
Hydrodynamic parameters were obtained from concurrent acoustic Doppler current profiler (ADCP) surveys measuring three-dimensional velocity profiles at 0.5 m vertical resolution [61]. Flow velocity magnitudes ranged from 0.3 to 1.8 m/s, with pronounced vertical shear and lateral circulation patterns near confluences. Water depth measurements derived from echo-sounding surveys provided high-resolution bathymetric maps with 10 m grid spacing. Discharge data were acquired from upstream gauging stations operated by the Yangtze River Water Resources Commission, supplemented by stage–discharge rating curves calibrated for the study reach.
Data quality control implemented multi-tier validation protocols. Outlier detection employed the interquartile range method, flagging values exceeding 1.5 times the interquartile range beyond the 25th and 75th percentiles [62]. Suspected outliers underwent manual review against field notes and reanalysis when available. Missing data points, representing 3.2% of the total dataset, were imputed using spatiotemporal kriging interpolation constrained by neighboring observations. Velocity measurements exhibiting unrealistic magnitudes or violating continuity constraints were removed, resulting in the rejection of 1.8% of ADCP records.
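The interquartile range rule can be sketched with the standard library. The concentration values are invented for illustration, and the quartiles use the default "exclusive" method of `statistics.quantiles`, which may differ slightly from the convention used in the study.

```python
import statistics


def iqr_outlier_flags(values, k=1.5):
    """Flag values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


# Hypothetical concentrations in particles per cubic metre, one suspicious reading.
samples = [2.1, 2.4, 2.8, 3.0, 3.3, 3.5, 3.9, 4.2, 25.0]
flags = iqr_outlier_flags(samples)
flagged = [v for v, is_out in zip(samples, flags) if is_out]
print(flagged)
```

Flagged values are candidates for the manual review described above rather than automatic deletions.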
Figure 3 illustrates the heterogeneous concentration patterns observed along the river corridor, revealing elevated microplastic abundances near urban drainage outfalls and tributary junctions, with progressive dilution downstream. The longitudinal profile demonstrates peak concentrations exceeding 8 particles/m³ in the upper 5 km segment, declining to 2–4 particles/m³ in downstream reaches.
As shown in Figure 4, seasonal patterns exhibit strong correlation with hydrological regimes, with elevated concentrations during wet season high-flow events attributed to the remobilization of deposited particles and enhanced urban runoff contributions.
Dataset partitioning allocated observations strategically to ensure representative coverage across spatiotemporal variability. The training set comprised 65% of data (782 samples) selected using stratified random sampling to maintain proportional representation of seasonal conditions and spatial zones. The validation set contained 15% (180 samples) for hyperparameter tuning and adaptive weight optimization, while the test set retained 20% (241 samples) from temporally distinct periods, ensuring independent evaluation of predictive performance. Table 4 summarizes the comprehensive statistical characteristics of the experimental dataset. Table 4 shows the distribution of measurements across different variables, highlighting the data richness supporting model training while indicating the sparsity challenges typical of environmental monitoring programs.

4.2. Model Performance Evaluation and Comparative Analysis

Quantitative assessment of model performance employs four standard metrics characterizing prediction accuracy and physical consistency. Root mean square error (RMSE) quantifies the absolute deviation magnitude:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(C_{\mathrm{pred}}^{i} - C_{\mathrm{obs}}^{i}\right)^{2}}$$
where $N$ represents the number of test samples, $C_{\mathrm{pred}}^{i}$ denotes the predicted concentration, and $C_{\mathrm{obs}}^{i}$ indicates the observed value [63]. Mean absolute error (MAE) provides intuitive scale-dependent accuracy:
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|C_{\mathrm{pred}}^{i} - C_{\mathrm{obs}}^{i}\right|$$
reflecting typical prediction deviations without amplifying outlier influence. The Nash–Sutcliffe efficiency coefficient (NSE) evaluates predictive skill relative to the mean baseline:
$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{N}\left(C_{\mathrm{obs}}^{i} - C_{\mathrm{pred}}^{i}\right)^{2}}{\sum_{i=1}^{N}\left(C_{\mathrm{obs}}^{i} - \bar{C}_{\mathrm{obs}}\right)^{2}}$$
where $\bar{C}_{\mathrm{obs}}$ represents the observed mean concentration, with NSE = 1 indicating perfect predictions and NSE < 0 denoting performance worse than the mean prediction [63]. The coefficient of determination ($R^{2}$) quantifies explained variance:
$$R^{2} = 1 - \frac{\sum_{i=1}^{N}\left(C_{\mathrm{obs}}^{i} - C_{\mathrm{pred}}^{i}\right)^{2}}{\sum_{i=1}^{N}\left(C_{\mathrm{obs}}^{i} - \bar{C}_{\mathrm{obs}}\right)^{2}}$$
measuring the linear correlation strength between predictions and observations.
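The metrics defined above can be computed directly from paired observations and predictions. The following is a minimal sketch; the function name `evaluation_metrics` and the sample values are illustrative and are not study data. Because the expression given for the coefficient of determination has the same form as the NSE, the sketch returns one value for both.

```python
import numpy as np

def evaluation_metrics(c_obs, c_pred):
    """Compute RMSE, MAE and NSE for paired observed and predicted values."""
    c_obs = np.asarray(c_obs, dtype=float)
    c_pred = np.asarray(c_pred, dtype=float)
    residual = c_pred - c_obs
    ss_res = np.sum(residual ** 2)                # sum of squared errors
    ss_tot = np.sum((c_obs - c_obs.mean()) ** 2)  # spread around the observed mean
    nse = 1.0 - ss_res / ss_tot
    return {
        "rmse": float(np.sqrt(np.mean(residual ** 2))),
        "mae": float(np.mean(np.abs(residual))),
        "nse": float(nse),
        "r2": float(nse),  # same expression as NSE in the form written above
    }

# Illustrative values in particles/m^3
metrics = evaluation_metrics([2.0, 4.0, 6.0, 8.0], [2.5, 3.5, 6.5, 7.5])
print(metrics)
```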
Comparative evaluation benchmarks the physics-informed neural network (PINN) against three alternative approaches: a conventional finite volume method (FVM) implementing the advection–diffusion equation with empirical diffusivity parameterizations, a pure data-driven deep neural network (DNN) trained exclusively on observational data without physics constraints, and a hybrid model incorporating partial physics through simplified one-dimensional approximations. Table 5 presents comprehensive performance metrics across all modeling frameworks evaluated on the independent test dataset. The results in Table 5 indicate that the proposed PINN achieves superior accuracy across all metrics, with RMSE reduced by 34% compared to the FVM and 28% relative to pure DNN approaches.
The traditional FVM exhibits limitations in capturing concentration heterogeneity near boundaries and discontinuities, producing excessive numerical diffusion that smooths sharp gradients. Pure data-driven approaches achieve reasonable accuracy in interpolation scenarios but fail catastrophically when extrapolating beyond training data ranges, particularly during extreme hydrological events absent from the training period [64]. The hybrid model improves upon the FVM through reduced dimensionality but sacrifices the lateral transport resolution critical for accurate representation in wide river channels.
Figure 5 demonstrates the superior correlation achieved by the PINN framework, with predictions clustering tightly along the 1:1 line across the full concentration range. The pure DNN exhibits systematic underprediction at high concentrations, while the FVM shows increased scatter, reflecting its inability to resolve sub-grid-scale processes.
Sensitivity analysis investigates the influence of physics constraint strength by systematically varying the PDE loss weight $\lambda_p$ from 0 (pure data-driven) to 10 (physics-dominated) while maintaining constant data term weighting. Optimal performance emerges at intermediate values ($\lambda_p$ = 0.5–1.0), balancing empirical fitting with physical consistency [65]. Excessively high physics weights produce predictions satisfying governing equations but deviating from observations due to systematic errors in parameterized diffusivity and settling velocity formulations.
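The role of the PDE weight can be made concrete with a minimal sketch of a two-term objective. It assumes mean-squared residuals for both terms and a fixed data weight of 1; the function `composite_loss` and the residual values are hypothetical, and the loss used in the study may contain further terms (for example boundary and initial conditions).

```python
import numpy as np

def composite_loss(data_residual, pde_residual, lambda_p, lambda_d=1.0):
    """Weighted sum of a data-misfit term and a PDE-residual term (assumed MSE form)."""
    data_term = np.mean(np.square(data_residual))
    pde_term = np.mean(np.square(pde_residual))
    return float(lambda_d * data_term + lambda_p * pde_term)

# Hypothetical residuals at observation points and at collocation points
data_residual = np.array([0.2, -0.1, 0.3])
pde_residual = np.array([0.5, -0.5])

# Sweep of the PDE weight over the range explored in the sensitivity analysis
losses = {lam: composite_loss(data_residual, pde_residual, lam) for lam in (0.0, 0.5, 1.0, 10.0)}
print(losses)
```

With the weight set to 0 the objective reduces to the data misfit alone, and at 10 the PDE residual dominates, which mirrors the two limiting cases described above.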
As shown in Figure 6, the PINN maintains consistent accuracy across diverse flow conditions, with the NSE exceeding 0.85 for all hydrological categories. Traditional methods exhibit pronounced performance degradation during high-flow events, when turbulent mixing intensifies and particle resuspension dynamics deviate from steady-state assumptions. The physics-constrained framework demonstrates superior generalization by embedding fundamental transport principles that remain valid across flow regime transitions, rather than relying solely on empirical correlations calibrated to specific conditions. Prediction errors during extreme low-flow periods (<15,000 m³/s discharge) increase marginally due to the enhanced importance of near-bed processes inadequately represented in the vertically integrated formulation, suggesting opportunities for future refinement through three-dimensional physics encoding.

4.3. Simulation of Spatiotemporal Distribution Characteristics of Microplastic Transport

The physics-informed neural network successfully reconstructs continuous spatiotemporal concentration fields across the 15 km study reach over the annual simulation period. Model outputs reveal complex transport dynamics characterized by longitudinal advection, lateral dispersion, and vertical stratification modulated by hydrological variability and bathymetric features. Temporal evolution exhibits pronounced seasonal patterns, with base-flow periods demonstrating relatively stable concentration distributions punctuated by rapid mobilization events during storm-driven discharge peaks.
Figure 7 presents the longitudinal concentration evolution over four representative time periods spanning dry season, rising limb, flood peak, and recession phases. The simulation captures initial high-concentration zones in the upper 5 km segment near urban inputs, with progressive downstream dilution through lateral mixing and deposition. During flood events (discharge > 35,000 m³/s), widespread resuspension elevates concentrations throughout the reach, particularly in previously depositional zones, demonstrating the model’s capability to represent transient sediment–microplastic exchange dynamics [66].
Size-fractionated analysis reveals distinct transport behavior across particle size categories. Fine particles (333–1000 μm) exhibit quasi-conservative transport characteristics, maintaining suspended concentrations relatively uniform across vertical profiles with minimal settling during normal flow conditions. The residence time for fine particles within the study reach averages 18–24 h under base-flow scenarios. Medium-sized particles (1000–3000 μm) demonstrate size-selective settling in low-velocity zones, accumulating preferentially along inner bends and downstream of tributary confluences where flow expansion reduces carrying capacity. Large fragments (3000–5000 μm) undergo predominantly bedload transport with episodic suspension during high-shear events, resulting in patchy spatial distributions concentrated in near-bed regions.
The identification of microplastic accumulation hotspots reveals three primary deposition zones within the study reach. The first zone, located at river kilometer 4–5, coincides with channel widening immediately downstream of a major tributary confluence, where abrupt velocity reduction initiates particle settling. The second accumulation area occurs at kilometer 9–10 within a meander bend exhibiting helical secondary circulation, which concentrates particles along the inner bank through centrifugal forcing. The third zone spans kilometer 13–14, where bathymetric irregularities create flow separation and recirculation cells, trapping particles in low-velocity refugia [66]. These depositional environments exhibit simulated concentrations 3–5 times higher than reach-averaged values, representing critical targets for remediation efforts.
Microplastic transport flux quantification enables assessment of downstream export rates under varying hydrological conditions. Instantaneous cross-sectional flux is computed as follows:
$$F(t) = \int_{A} C(x, t) \cdot u(x, t)\, \mathrm{d}A$$
where $A$ represents the cross-sectional area and $u$ denotes velocity perpendicular to the cross-section. Cumulative annual transport flux is as follows:
$$F_{\mathrm{annual}} = \int_{0}^{T} F(t)\, \mathrm{d}t$$
integrating instantaneous flux over the simulation period $T$ [67].
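A discrete version of these two integrals can be sketched as follows, assuming the cross-section is divided into cells with known area and that the flux time series is integrated with the trapezoidal rule. The function names, cell layout, and numbers are illustrative assumptions and do not describe the authors' implementation.

```python
import numpy as np

def cross_section_flux(conc, vel_normal, cell_area):
    """Sum C * u * dA over cells: particles/m^3 * m/s * m^2 = particles/s."""
    conc = np.asarray(conc, dtype=float)
    vel_normal = np.asarray(vel_normal, dtype=float)
    cell_area = np.asarray(cell_area, dtype=float)
    return float(np.sum(conc * vel_normal * cell_area))

def cumulative_flux(times_s, flux):
    """Trapezoidal integration of instantaneous flux over time (particles)."""
    times_s = np.asarray(times_s, dtype=float)
    flux = np.asarray(flux, dtype=float)
    return float(np.sum(0.5 * (flux[1:] + flux[:-1]) * np.diff(times_s)))

# Hypothetical three-cell cross-section
instantaneous = cross_section_flux(
    conc=[2.0, 3.0, 4.0],        # particles/m^3
    vel_normal=[1.0, 1.5, 1.0],  # m/s, normal to the section
    cell_area=[100.0, 200.0, 100.0],  # m^2
)

# Hypothetical flux series over two hours
total = cumulative_flux([0.0, 3600.0, 7200.0], [1500.0, 1500.0, 3000.0])
print(instantaneous, total)
```

Replacing the hypothetical cell values with model output sampled on a cross-sectional grid would give the instantaneous flux at each output time, and the same time integration extended over the simulation period would give the annual total.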
As shown in Figure 8, transport flux exhibits nonlinear dependence on discharge, with flux increasing disproportionately during high-flow events due to the combined effects of elevated velocities and concentration enhancement through bed scour. A discharge increase from 15,000 to 40,000 m³/s (a factor of about 2.7) corresponds to flux amplification by a factor of 6–8, substantially exceeding the linear scaling expected from velocity changes alone. This nonlinearity arises from threshold-dependent resuspension processes activating when bed shear stress surpasses critical values for particle entrainment.
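The degree of nonlinearity implied by these figures can be estimated with a short calculation. The sketch assumes, purely for illustration, that flux follows a power law in discharge between the two quoted values and that flux at constant concentration would scale linearly with discharge; neither assumption is stated in the text, and the function name is hypothetical.

```python
import math

def implied_exponent(q_low, q_high, flux_ratio):
    """Exponent b in F ~ Q**b that reproduces a given flux ratio (assumed power law)."""
    return math.log(flux_ratio) / math.log(q_high / q_low)

q_low, q_high = 15000.0, 40000.0  # m^3/s, discharges quoted in the text
discharge_ratio = q_high / q_low  # about 2.67

b_low = implied_exponent(q_low, q_high, 6.0)
b_high = implied_exponent(q_low, q_high, 8.0)

# Share of the amplification not explained by linear scaling with discharge
conc_factor_low = 6.0 / discharge_ratio
conc_factor_high = 8.0 / discharge_ratio

print(round(b_low, 2), round(b_high, 2))
print(round(conc_factor_low, 2), round(conc_factor_high, 2))
```

Under these assumptions the exponent lies between roughly 1.8 and 2.1, and the mean concentration would need to rise by a factor of about 2.25 to 3 to account for the reported amplification.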
Model validation against extreme event observations from two flood episodes (August 2023 and January 2024) demonstrates robust performance under non-stationary conditions. The simulated peak concentration timing falls within 3 h of the observed peaks, while the magnitude predictions deviate by less than 18% from the measured values. Importantly, the model accurately captures post-event concentration recession rates and spatial redistribution patterns without requiring recalibration, confirming the value of physics-based constraints in enabling extrapolation beyond training data ranges [67]. Minor discrepancies during recession limbs suggest opportunities for the refinement of deposition rate parameterizations, particularly for heterogeneous particle populations exhibiting variable settling characteristics influenced by biofilm colonization and aggregation dynamics.

5. Discussion

The improved performance of physics-informed neural networks in microplastic transport simulation stems from fundamental mechanisms that distinguish this approach from purely empirical or conventional numerical methods. By embedding conservation principles directly into the optimization objective, the framework constrains the solution space to regions consistent with physical laws, effectively regularizing the learning process and reducing overfitting to measurement noise. Validation on test data demonstrates that the PINN maintains consistent accuracy (RMSE variation < 8%) across different noise levels (0–20% measurement uncertainty), while the pure DNN exhibits performance degradation of 35–45% under equivalent noise conditions, confirming enhanced robustness to observational errors. This regularization proves particularly valuable when training data exhibit spatial sparsity or temporal gaps, as physics constraints provide interpolative guidance in unobserved regions through the enforcement of continuity equations and transport dynamics.
Data-scarce scenarios are where physics-constrained approaches offer their clearest advantage. Traditional calibration of numerical models demands extensive observational datasets for parameter estimation across turbulent diffusivity, settling velocity, and bed exchange coefficients, with calibration validity limited to conditions sampled during observation periods. Pure data-driven methods require even denser measurements to capture complex nonlinear relationships without mechanistic guidance, and they fail catastrophically when extrapolating beyond training distributions. Comparative experiments using progressively reduced training datasets (100%, 67%, 50%, and 33% of full observations) demonstrate that the PINN maintains an NSE > 0.80, even with only 40% of the available data, while the FVM requires 75–85% of the full dataset to achieve equivalent performance (Table 6). At 33% data density, PINN prediction errors increase by only 15–18%, whereas FVM errors increase by 45–60%. The physics-informed framework therefore achieves accurate predictions with observation densities reduced by factors of 2–3 compared to the requirements for conventional model calibration, as illustrated by successful prediction in data-poor reaches through physics-based inference from sparse upstream measurements.
Computational efficiency analysis reveals nuanced trade-offs between methodological approaches. Initial PINN training requires 4–6 h on modern GPU architectures but enables near-instantaneous inference at arbitrary spatiotemporal locations. In contrast, traditional finite volume methods require 15–20 h per simulation, with complete re-computation needed for each new scenario. The amortized advantage becomes pronounced for operational forecasting applications requiring frequent predictions under varying conditions.
Prediction errors arise from three primary sources. Data uncertainty (measurement noise and sampling variability exceeding 20% for replicate field samples) introduces irreducible error floors. Physical process simplification, particularly neglecting particle aggregation and three-dimensional helical circulation, introduces systematic biases in depositional zones. Model structure limitations restrict representation of discontinuous phenomena, manifesting as smooth predictions near sharp gradients.
Transferability to alternative river systems depends on the retention of fundamental transport mechanisms. Preliminary transfer learning experiments indicate successful adaptation to morphologically similar reaches with a 40–60% reduction in the required training data. Model improvements could include three-dimensional velocity predictions, multi-component frameworks for polymer-specific properties, Bayesian uncertainty quantification, and hybrid architectures combining convolutional and recurrent elements for enhanced spatial–temporal modeling.

6. Conclusions

This study developed a physics-informed neural network framework for simulating river microplastic transport dynamics, integrating conservation principles with deep learning architectures. Key innovations include embedding advection–diffusion equations as differentiable constraints, adaptive weighting strategies, automatic differentiation, and transfer learning mechanisms.
Analysis of the Yangtze River case study reveals distinct size-dependent transport patterns: fine particles (<1000 μm) maintain prolonged suspension, medium fragments (1000–3000 μm) undergo intermittent settling–resuspension cycles, and coarse particles (>3000 μm) engage in bedload transport. Three primary accumulation zones at channel expansions, meander bends, and flow separation regions exhibit concentrations 3–5 times higher than reach-averaged values. Temporal dynamics show strong seasonal signatures, with transport flux increasing by factors of 6–8 during high-flow events. The PINN achieved improved predictive performance, reducing the RMSE by 34% relative to traditional finite volume methods while maintaining an NSE exceeding 0.88 across diverse flow conditions, demonstrating reliable generalization to data-sparse scenarios.
These findings support microplastic pollution management through the identification of persistent accumulation hotspots for targeted remediation and the quantification of flux–discharge relationships for pollution control strategies. The framework’s real-time prediction capability with minimal computational overhead facilitates integration into operational decision support systems.
Several limitations warrant acknowledgment. Model applicability remains constrained to transport-dominated river systems, with limited representation of near-field mixing zones. Performance under extreme hydrological events (e.g., floods exceeding 100-year return periods and droughts with discharge < 5000 m³/s) remains untested and may exhibit reduced accuracy due to altered transport regimes beyond training data envelopes. The framework relies on observational data for training, introducing vulnerability to measurement biases. Physical process simplifications, particularly the neglect of particle aggregation and three-dimensional helical circulation, introduce errors in depositional environments.
Future research should prioritize multi-scale coupled frameworks integrating watershed-scale source characterization, extension to full three-dimensional formulations, the integration of multi-source heterogeneous data streams through physics-guided data assimilation, and operationalization through real-time early warning systems coupling physics-informed models with hydrological forecasts.

Author Contributions

P.H. conceived the study, developed the physics-informed neural network framework, performed model training and validation, conducted data analysis, and drafted the manuscript. M.W. supervised the research, designed the experimental methodology, coordinated field sampling campaigns, contributed to model development, and critically revised the manuscript. J.M. participated in neural network architecture design, assisted with numerical simulations, and contributed to the comparative analysis of modeling approaches. J.Z. (Jingwen Zhang) conducted field data collection, performed laboratory analysis of microplastic samples, and contributed to data preprocessing and quality control. J.Z. (Jianhua Zhao) provided technical support for hydrodynamic measurements and contributed to the interpretation of field observations. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Jiangsu Province Educational Science Planning Project (C/2024/01/59), the General Projects of Philosophical and Social Sciences Research in Jiangsu Universities (2025SJYB1110), the General Project of Basic Science (Natural Science) Research in Jiangsu Provincial Higher Education Institutions (23KJD570001), the Jiangsu Provincial Science and Technology Basic Research Program Youth Fund Project (BK20241516) and the Qing Lan Project of JiangSu Province.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The microplastic monitoring data, hydrodynamic measurements, and model outputs generated during this study are available from the corresponding author upon reasonable request. Some data are subject to restrictions due to ongoing collaborations with the Yangtze River Water Resources Commission. Model codes and training datasets can be accessed through institutional repositories following publication.

Acknowledgments

The authors acknowledge the Yangtze River Water Resources Commission for providing discharge data and the field sampling team for their assistance in data collection. We also thank the National Key Laboratory of Water Disaster Prevention at Hohai University for computational resources.

Conflicts of Interest

Author Jianhua Zhao was employed by the company Jiangsu Yonglianjingzhu Construction Group Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

PINN: Physics-Informed Neural Network
FVM: Finite Volume Method
DNN: Deep Neural Network
RANS: Reynolds-averaged Navier–Stokes
FDM: Finite Difference Method
FEM: Finite Element Method
ADCP: Acoustic Doppler Current Profiler
L-BFGS: Limited-memory Broyden–Fletcher–Goldfarb–Shanno
RMSE: Root Mean Square Error
MAE: Mean Absolute Error
NSE: Nash–Sutcliffe Efficiency
PDE: Partial Differential Equation

References

  1. Horton, A.A.; Walton, A.; Spurgeon, D.J.; Lahive, E.; Svendsen, C. Microplastics in freshwater and terrestrial environments: Evaluating the current understanding to identify the knowledge gaps and future research priorities. Sci. Total Environ. 2017, 586, 127–141.
  2. Lebreton, L.C.M.; Van Der Zwet, J.; Damsteeg, J.W.; Slat, B.; Andrady, A.; Reisser, J. River plastic emissions to the world’s oceans. Nat. Commun. 2017, 8, 15611.
  3. Windsor, F.M.; Durance, I.; Horton, A.A.; Thompson, R.C.; Tyler, C.R.; Ormerod, S.J. A catchment-scale perspective of plastic pollution. Glob. Change Biol. 2019, 25, 1207–1221.
  4. Kooi, M.; Nes, E.H.V.; Scheffer, M.; Koelmans, A.A. Ups and downs in the ocean: Effects of biofouling on vertical transport of microplastics. Environ. Sci. Technol. 2017, 51, 7963–7971.
  5. van Emmerik, T.; Schwarz, A. Plastic debris in rivers. WIREs Water 2020, 7, e1398.
  6. Reichstein, M.; Camps-Valls, G.; Stevens, B.; Jung, M.; Denzler, J.; Carvalhais, N.; Prabhat, F. Deep learning and process understanding for data-driven Earth system science. Nature 2019, 566, 195–204.
  7. Karpatne, A.; Atluri, G.; Faghmous, J.H.; Steinbach, M.; Banerjee, A.; Ganguly, A.; Shekhar, S.; Samatova, N.; Kumar, V. Theory-guided data science: A new paradigm for scientific discovery from data. IEEE Trans. Knowl. Data Eng. 2017, 29, 2318–2331.
  8. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 2019, 378, 686–707.
  9. Karniadakis, G.E.; Kevrekidis, I.G.; Lu, L.; Perdikaris, P.; Wang, S.; Yang, L. Physics-informed machine learning. Nat. Rev. Phys. 2021, 3, 422–440.
  10. Cuomo, S.; Di Cola, V.S.; Giampaolo, F.; Rozza, G.; Raissi, M.; Piccialli, F. Scientific machine learning through physics-informed neural networks: Where we are and what’s next. J. Sci. Comput. 2022, 92, 88.
  11. Zhou, Y.; Meng, S.; Lou, Y.; Kong, Q. Physics-Informed Deep Learning-Based Real-Time Structural Response Prediction Method. Engineering 2021, 388, 114236.
  12. Waldschläger, K.; Schüttrumpf, H. Effects of particle properties on the settling and rise velocities of microplastics in freshwater under laboratory conditions. Environ. Sci. Technol. 2019, 53, 1958–1966.
  13. Chow, V.T.; Maidment, D.R.; Mays, L.W. Applied Hydrology; McGraw-Hill: Columbus, OH, USA, 1988.
  14. Pope, S.B. Turbulent Flows; Cambridge University Press: Cambridge, UK, 2000.
  15. Rodi, W. Turbulence Models and Their Application in Hydraulics, 3rd ed.; CRC Press: Boca Raton, FL, USA, 2017.
  16. Launder, B.E.; Spalding, D.B. The numerical computation of turbulent flows. Comput. Methods Appl. Mech. Eng. 1974, 3, 269–289.
  17. Wilcox, D.C. Turbulence Modeling for CFD, 3rd ed.; DCW Industries: La Cañada Flintridge, CA, USA, 2006.
  18. Durbin, P.A.; Pettersson Reif, B.A. Statistical Theory and Modeling for Turbulent Flows, 2nd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2011.
  19. Waldschläger, K.; Schüttrumpf, H. Erosion behavior of different microplastic particles in comparison to natural sediments. Environ. Sci. Technol. 2019, 53, 13219–13227.
  20. Khatmullina, L.; Isachenko, I. Settling velocity of microplastic particles of regular shapes. Mar. Pollut. Bull. 2017, 114, 871–880.
  21. Raissi, M.; Yazdani, A.; Karniadakis, G.E. Hidden fluid mechanics: Learning velocity and pressure fields from flow visualizations. Science 2020, 367, 1026–1030.
  22. Dietrich, W.E. Settling velocity of natural particles. Water Resour. Res. 1982, 18, 1615–1626.
  23. Ferguson, R.I.; Church, M. A simple universal equation for grain settling velocity. J. Sediment. Res. 2004, 74, 933–937.
  24. Clift, R.; Grace, J.R.; Weber, M.E. Bubbles, Drops, and Particles; Academic Press: Cambridge, MA, USA, 1978.
  25. Saffman, P.G. The lift on a small sphere in a slow shear flow. J. Fluid Mech. 1965, 22, 385–400.
  26. Stokes, G.G. On the effect of the internal friction of fluids on the motion of pendulums. Trans. Camb. Philos. Soc. 1851, 9, 8–106.
  27. Fischer, H.B.; List, E.J.; Koh, R.C.Y.; Imberger, J.; Brooks, N.H. Mixing in Inland and Coastal Waters; Academic Press: Cambridge, MA, USA, 1979.
  28. Rutherford, J.C. River Mixing; John Wiley & Sons: Hoboken, NJ, USA, 1994.
  29. Elder, J.W. The dispersion of marked fluid in turbulent shear flow. J. Fluid Mech. 1959, 5, 544–560.
  30. Versteeg, H.K.; Malalasekera, W. An Introduction to Computational Fluid Dynamics: The Finite Volume Method, 2nd ed.; Pearson Education: London, UK, 2007.
  31. Zienkiewicz, O.C.; Taylor, R.L.; Zhu, J.Z. The Finite Element Method: Its Basis and Fundamentals, 7th ed.; Butterworth-Heinemann: Oxford, UK, 2013.
  32. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics informed deep learning (Part I): Data-driven solutions of nonlinear partial differential equations. arXiv 2017, arXiv:1711.10561.
  33. Cai, S.; Mao, Z.; Wang, Z.; Yin, M.; Karniadakis, G.E. Physics-informed neural networks (PINNs) for fluid mechanics: A review. Acta Mech. Sin. 2021, 37, 1727–1738.
  34. Glorot, X.; Bengio, Y. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, Sardinia, Italy, 13–15 May 2010; JMLR Workshop and Conference Proceedings: Cambridge, MA, USA, 2010; pp. 249–256.
  35. Jagtap, A.D.; Kawaguchi, K.; Karniadakis, G.E. Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. J. Comput. Phys. 2020, 404, 109136.
  36. Lu, L.; Meng, X.; Mao, Z.; Karniadakis, G.E. DeepXDE: A deep learning library for solving differential equations. SIAM Rev. 2021, 63, 208–228.
  37. Wang, S.; Yu, X.; Perdikaris, P. When and why PINNs fail to train: A neural tangent kernel perspective. J. Comput. Phys. 2022, 449, 110768.
  38. Wang, S.; Teng, Y.; Perdikaris, P. Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM J. Sci. Comput. 2021, 43, A3055–A3081.
  39. Lagaris, I.E.; Likas, A.; Fotiadis, D.I. Artificial neural networks for solving ordinary and partial differential equations. IEEE Trans. Neural Netw. 1998, 9, 987–1000.
  40. Sirignano, J.; Spiliopoulos, K. DGM: A deep learning algorithm for solving partial differential equations. J. Comput. Phys. 2018, 375, 1339–1364.
  41. Berg, J.; Nyström, K. A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 2018, 317, 28–41.
  42. McClenny, L.D.; Braga-Neto, U.M. Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv 2020, arXiv:2009.04544.
  43. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations (ICLR), San Diego, CA, USA, 7–9 May 2015.
  44. Liu, D.C.; Nocedal, J. On the limited memory BFGS method for large scale optimization. Math. Program. 1989, 45, 503–528.
  45. Byrd, R.H.; Lu, P.; Nocedal, J.; Zhu, C. A limited memory algorithm for bound constrained optimization. SIAM J. Sci. Comput. 1995, 16, 1190–1208.
  46. Foshtomi, M.Y.; Oryan, S.; Taheri, M.; Bastami, K.D.; Zahed, M.A. Composition and abundance of microplastics in surface sediments and their interaction with sedimentary heavy metals, PAHs and TPH (total petroleum hydrocarbons). Mar. Pollut. Bull. 2019, 149, 110655.
  47. Ding, L.; Mao, R.F.; Guo, X.T.; Yang, X.; Zhang, Q.W.; Yang, C. Microplastics in surface waters and sediments of the Wei River, in the northwest of China. Sci. Total Environ. 2019, 667, 427–434.
  48. Simpson, M.R. Discharge Measurements Using a Broad-Band Acoustic Doppler Current Profiler; Open-File Report 01–1; US Geological Survey: Reston, VA, USA, 2001.
  49. Tukey, J.W. Exploratory Data Analysis; Addison-Wesley: Boston, MA, USA, 1977.
  50. Nash, J.E.; Sutcliffe, J.V. River flow forecasting through conceptual models part I—A discussion of principles. J. Hydrol. 1970, 10, 282–290.
  51. Karpatne, A.; Watkins, W.; Read, J.; Kumar, V. Physics-guided neural networks (PGNN): An application in lake temperature modeling. arXiv 2017, arXiv:1710.11431.
  52. Jagtap, A.D.; Karniadakis, G.E. Extended physics-informed neural networks (XPINNs): A generalized space-time domain decomposition based deep learning framework for nonlinear partial differential equations. Commun. Comput. Phys. 2021, 28, 2002–2041.
  53. Zhao, S.; Zhu, L.; Wang, T.; Li, D. Suspended microplastics in the surface water of the Yangtze Estuary System, China: First observations on occurrence, distribution. Mar. Pollut. Bull. 2014, 86, 562–568.
  54. Meijer, L.J.; van Emmerik, T.; van der Ent, R.; Schmidt, C.; Lebreton, L. More than 1000 rivers account for 80% of global riverine plastic emissions into the ocean. Sci. Adv. 2021, 7, eaaz5803.
  55. Waldschläger, K.; Brückner, M.Z.M.; Almroth, B.C.; Hackney, C.R.; Adyel, T.M.; Alimi, O.S.; Belontz, S.L.; Cowger, W.; Doyle, D.; Gray, A.; et al. Learning from natural sediments to tackle microplastics challenges: A multidisciplinary perspective. Earth-Sci. Rev. 2022, 228, 104021.
  56. Bhardwaj, L.K.; Rath, P.; Yadav, P.; Gupta, U. Microplastic contamination, an emerging threat to the freshwater environment: A systematic review. Environ. Syst. Res. 2024, 13, 8.
  57. Gao, X.; Li, J.; Wang, X.; Zhou, J.; Fan, B.; Li, W.; Liu, Z. A review on microplastics in major European rivers. WIREs Water 2024, 11, e1713.
  58. Beucler, T.; Gentine, P.; Yuval, J.; Gupta, A.; Peng, L.; Lin, J.; Yu, S.; Rasp, S.; Ahmed, F.; O’Gorman, P.A.; et al. Climate-invariant machine learning. Sci. Adv. 2024, 10, eadj7250.
  59. Secci, D.; Godoy, V.A.; Gómez-Hernández, J.J. Physics-informed neural networks for solving transient unconfined groundwater flow. Comput. Geosci. 2024, 182, 105494.
  60. He, Q.; Tartakovsky, A.M. Physics-informed neural network method for forward and backward advection-dispersion equations. Water Resour. Res. 2021, 57, e2020WR029479.
  61. Habib, M.; Habib, A.; Alibrahim, B. Applications of physics-informed neural networks in geosciences: From basic seismology to comprehensive environmental studies. Open Geosci. 2025, 17, 20250853.
  62. Tang, D.; Zhan, Y.; Yang, F. A review of machine learning for modeling air quality: Overlooked but important issues. Atmos. Res. 2024, 300, 107261.
  63. Frei, S.; Azizian, M.; Frei, M.; Griebler, C. Using physics-informed neural networks to quantify submarine groundwater discharge under high-frequency tidal dynamics using heat as a tracer. Limnol. Oceanogr. Methods 2024, 22, e10415.
  64. Dazzi, S.; Vacondio, R.; Mignosa, P. Physics-informed neural networks for the augmented system of shallow water equations with topography. Water Resour. Res. 2024, 60, e2023WR036589.
  65. van Emmerik, T.; Mellink, Y.; Hauk, R.; Waldschläger, K.; Schreyers, L. Rivers as plastic reservoirs. Front. Water 2022, 3, 786936.
  66. Summers, E.; Du, J.; Park, K.; Wharton, M.; Kaiser, K. Importance of the water-sediment bed interactions in simulating microplastic particles in an estuarine system. Front. Mar. Sci. 2024, 11, 1414459.
  67. Zhu, X.; Qiu, S.; Yin, X.; Liu, J.; Jia, X.; Ling, Z.; Liu, Y. Machine learning in environmental research: Common pitfalls and best practices. Environ. Sci. Technol. 2023, 57, 17671–17689.
Figure 1. Computational framework of physics-informed neural network for river microplastic transport simulation.
Figure 2. Schematic illustration of composite loss function architecture integrating data-driven and physics-based constraints.
Figure 3. Spatial distribution of time-averaged microplastic concentrations across the study reach with monitoring station locations.
Figure 4. Temporal variation in microplastic concentrations at representative stations correlated with river discharge over the monitoring period.
Figure 5. Scatter plots comparing observed and predicted microplastic concentrations for different modeling approaches on the test dataset.
Figure 6. Model performance metrics evaluated separately for low-flow, normal-flow, and high-flow conditions, demonstrating robustness across hydrological regimes.
Figure 7. Spatiotemporal evolution of simulated microplastic concentration distributions showing longitudinal profiles at different time points during a complete hydrological cycle.
Figure 8. Microplastic transport flux variation as a function of river discharge, highlighting flux enhancement during extreme hydrological events and model–observation agreement.
Table 1. Neural network architecture parameter configuration.
Parameter | Configuration | Justification
Input dimension | 4 (x, y, z, t) | Spatiotemporal coordinates
Hidden layers | 8 layers | Balance complexity and efficiency
Neurons per layer | 128 | Adequate representation capacity
Activation function | tanh (hidden), linear (output) | Enable derivative computation
Residual connections | Every 2 layers | Mitigate gradient issues
Initialization | Xavier uniform | Stabilize training convergence
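The configuration in Table 1 can be illustrated with a minimal forward-pass sketch in plain Python. Several details are assumptions rather than statements from the article: a single scalar output (the concentration), zero-initialized biases, and a residual layout in which hidden layer 1 lifts the 4 inputs to 128 units, layers 2 to 7 form three two-layer blocks with additive skip connections, and layer 8 has no skip. All function names are hypothetical.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, rng):
    """Weights drawn from U(-a, a) with a = sqrt(6 / (fan_in + fan_out))."""
    bound = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-bound, bound) for _ in range(fan_in)]
            for _ in range(fan_out)]


def dense(weights, bias, x):
    """Affine map W x + b for a single input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_network(n_in=4, width=128, n_hidden=8, n_out=1, seed=0):
    """Layer sizes follow Table 1; n_out=1 and zero biases are assumptions."""
    rng = random.Random(seed)
    sizes = [n_in] + [width] * n_hidden + [n_out]
    return [(xavier_uniform(a, b, rng), [0.0] * b)
            for a, b in zip(sizes[:-1], sizes[1:])]


def forward(params, x):
    """tanh hidden layers, skip connection every 2 layers, linear output."""
    h = [math.tanh(v) for v in dense(*params[0], x)]  # hidden layer 1: 4 -> 128
    skip = h
    for i, (w, b) in enumerate(params[1:-1], start=1):  # hidden layers 2..8
        h = [math.tanh(v) for v in dense(w, b, h)]
        if i % 2 == 0:  # close a two-layer block (assumed layout)
            h = [a + s for a, s in zip(h, skip)]
            skip = h
    w_out, b_out = params[-1]
    return dense(w_out, b_out, h)  # linear output layer


def count_parameters(params):
    """Total number of weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in params)


params = init_network()
concentration = forward(params, [0.5, 0.1, 0.2, 0.0])  # hypothetical (x, y, z, t)
n_params = count_parameters(params)
```

Under these assumptions the network has 116,353 trainable parameters, which gives a sense of the model size implied by Table 1.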
Table 2. Loss function weight configuration and adaptation parameters.
Loss Component | Initial Weight | Adaptation Strategy | Adaptation Parameter | Update Frequency
Data term ($L_{\mathrm{data}}$) | 1.0 | Fixed | - | -
PDE term ($L_{\mathrm{PDE}}$) | 0.1 | Ratio-based (Equation (24)) | $\alpha = 0.5$ | Every 100 epochs
Boundary term ($L_{\mathrm{BC}}$) | 0.5 | Exponential (Equation (25)) | $\beta = 2.0$ | Every 50 epochs
Initial term ($L_{\mathrm{IC}}$) | 1.0 | Decay | Decay rate = 0.95 | Every 100 epochs
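As a rough illustration of how the weights in Table 2 might combine, the sketch below assumes that the composite loss is a weighted sum of the four components and that the initial-condition weight decays multiplicatively by 0.95 at every 100-epoch update. Both forms are assumptions. The ratio-based and exponential rules (Equations (24) and (25)) are not reproduced here, and the component loss values are hypothetical.

```python
# Initial weights taken from Table 2.
INITIAL_WEIGHTS = {"data": 1.0, "pde": 0.1, "bc": 0.5, "ic": 1.0}


def ic_weight(epoch, initial=1.0, decay_rate=0.95, update_every=100):
    """Initial-condition weight after stepwise multiplicative decay (assumed form)."""
    return initial * decay_rate ** (epoch // update_every)


def composite_loss(losses, weights):
    """Weighted sum of the loss components (assumed form of the composite loss)."""
    return sum(weights[name] * losses[name] for name in weights)


# Hypothetical component losses, for illustration only.
losses = {"data": 0.40, "pde": 2.00, "bc": 0.30, "ic": 0.10}

total_at_start = composite_loss(losses, INITIAL_WEIGHTS)
weights_at_1000 = dict(INITIAL_WEIGHTS, ic=ic_weight(epoch=1000))
total_at_1000 = composite_loss(losses, weights_at_1000)
```

With these placeholder losses, only the initial-condition contribution changes between epoch 0 and epoch 1000, because the other adaptive rules are left at their initial values in this sketch.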
Table 3. Model training hyperparameter settings.
Hyperparameter | Value/Configuration
Batch size | 1024 (collocation points)
Initial learning rate (Adam) | $1 \times 10^{-3}$
Learning rate decay factor | 0.95 per 1000 epochs
Adam momentum parameters | $\beta_1 = 0.9$, $\beta_2 = 0.999$
L-BFGS memory size | 50 iterations
Maximum epochs (Adam phase) | 50,000
Maximum iterations (L-BFGS) | 10,000
Early stopping patience | 500 epochs
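The learning-rate decay and early-stopping settings in Table 3 can be sketched as follows. The sketch assumes an initial Adam learning rate of 1e-3, a stepwise decay applied once per 1000 epochs, and early stopping that monitors a validation loss. The article's table does not specify the decay form or the monitored quantity, and the names `learning_rate` and `EarlyStopping` are hypothetical.

```python
def learning_rate(epoch, lr0=1e-3, decay=0.95, step=1000):
    """Stepwise schedule: multiply by `decay` once every `step` epochs (assumed)."""
    return lr0 * decay ** (epoch // step)


class EarlyStopping:
    """Stop when the monitored loss has not improved for `patience` epochs."""

    def __init__(self, patience=500):
        self.patience = patience
        self.best = float("inf")
        self.wait = 0

    def should_stop(self, loss):
        if loss < self.best:
            self.best = loss  # new best value: reset the counter
            self.wait = 0
        else:
            self.wait += 1
        return self.wait >= self.patience


lr_start = learning_rate(0)
lr_end = learning_rate(50_000)  # end of the Adam phase in Table 3
```

Under this assumed schedule, the learning rate at the end of the 50,000-epoch Adam phase is roughly 7.7e-5, about 8% of its initial value.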
Table 4. Experimental dataset statistical information.
Variable | Unit | Sample Size | Mean ± Std Dev | Range | Missing Rate (%)
Microplastic concentration | particles/m³ | 1203 | 4.67 ± 2.31 | 0.52–12.84 | 3.2
Flow velocity | m/s | 5426 | 0.89 ± 0.38 | 0.28–1.76 | 1.8
Water depth | m | 1856 | 11.3 ± 3.2 | 5.2–21.7 | 0.5
Discharge | m³/s | 365 | 24,670 ± 9240 | 11,800–44,200 | 0.0
Turbulent diffusivity | m²/s | 894 | 0.047 ± 0.023 | 0.012–0.138 | 4.1
Particle settling velocity | mm/s | 1203 | 2.8 ± 1.4 | 0.6–7.2 | 3.2
Table 5. Comparative performance results of different modeling approaches.
Model Type | RMSE (particles/m³) | MAE (particles/m³) | NSE | $R^2$ | Computation Time (hours)
Finite Volume Method | 1.24 | 0.97 | 0.72 | 0.74 | 18.3
Pure Data-Driven DNN | 1.13 | 0.86 | 0.78 | 0.81 | 2.1
Hybrid 1D Model | 1.05 | 0.81 | 0.81 | 0.83 | 8.4
Physics-Informed NN (This Study) | 0.82 | 0.63 | 0.88 | 0.91 | 4.7
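For readers who want to reproduce the metrics in Table 5 on their own data, the sketch below uses the standard definitions of RMSE, MAE, and Nash–Sutcliffe efficiency (NSE). It is assumed, not confirmed, that the authors used these exact forms. The observed and predicted values are synthetic placeholders. The last lines recompute the relative RMSE reduction of the PINN against the finite volume method from the tabulated values.

```python
import math


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / spread


# Synthetic values for illustration only (not data from the study).
obs = [2.0, 4.0, 6.0, 8.0]
pred = [2.5, 3.5, 6.5, 7.5]
scores = (rmse(obs, pred), mae(obs, pred), nse(obs, pred))

# Relative RMSE reduction from Table 5: FVM 1.24 vs PINN 0.82.
rmse_fvm, rmse_pinn = 1.24, 0.82
improvement_pct = 100.0 * (rmse_fvm - rmse_pinn) / rmse_fvm
```

The recomputed reduction is about 33.9%, consistent with the 34% accuracy improvement quoted in the abstract.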
Table 6. Model performance under varying training data densities.
Data Density | PINN RMSE (particles/m³) | PINN NSE | FVM RMSE (particles/m³) | FVM NSE | Relative Error Increase (%)
100% | 0.82 | 0.88 | 1.24 | 0.72 | Baseline
67% | 0.89 | 0.85 | 1.52 | 0.64 | PINN: +8%, FVM: +23%
50% | 0.94 | 0.82 | 1.78 | 0.56 | PINN: +15%, FVM: +44%
33% | 0.97 | 0.80 | 1.98 | 0.48 | PINN: +18%, FVM: +60%
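The last column of Table 6 can be cross-checked from the RMSE columns. The sketch below assumes that the relative error increase is the percentage change in RMSE against the 100% density baseline. This definition is an assumption, since the table itself does not state it.

```python
# Baseline RMSE at 100% data density (Table 6).
BASELINE = {"pinn": 0.82, "fvm": 1.24}

# density (%): (PINN RMSE, FVM RMSE, reported PINN increase, reported FVM increase)
TABLE6 = {
    67: (0.89, 1.52, 8, 23),
    50: (0.94, 1.78, 15, 44),
    33: (0.97, 1.98, 18, 60),
}


def relative_increase(value, baseline):
    """Percentage increase of `value` over `baseline`."""
    return 100.0 * (value - baseline) / baseline


recomputed = {
    density: (
        relative_increase(pinn_rmse, BASELINE["pinn"]),
        relative_increase(fvm_rmse, BASELINE["fvm"]),
    )
    for density, (pinn_rmse, fvm_rmse, _, _) in TABLE6.items()
}

# Largest gap between recomputed and reported percentages, in percentage points.
max_gap = max(
    max(abs(recomputed[d][0] - row[2]), abs(recomputed[d][1] - row[3]))
    for d, row in TABLE6.items()
)
```

Values recomputed from the rounded RMSE entries agree with the reported percentages to within one percentage point. Small differences are expected if the reported figures were derived from unrounded RMSE values.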
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Hu, P.; Wu, M.; Ma, J.; Zhang, J.; Zhao, J. Physics-Informed Neural Networks for Three-Dimensional River Microplastic Transport: Integrating Conservation Principles with Deep Learning. Sustainability 2026, 18, 1392. https://doi.org/10.3390/su18031392

Chicago/Turabian Style

Hu, Pengjie, Mengtian Wu, Jian Ma, Jingwen Zhang, and Jianhua Zhao. 2026. "Physics-Informed Neural Networks for Three-Dimensional River Microplastic Transport: Integrating Conservation Principles with Deep Learning" Sustainability 18, no. 3: 1392. https://doi.org/10.3390/su18031392
