Implementation of a GPU-Accelerated Lagrangian Particle Dispersion Model for Atmospheric Transport of Radioactive Nuclides
Abstract
1. Introduction
2. Materials and Methods
2.1. Lagrangian Particle Model and the FLEXPART Framework
2.2. Fine-Grained Parallel Acceleration Architecture
2.2.1. Memory Allocation and Data Transfer
2.2.2. Parallelization of Particle Grid Reordering and the Two-Level Loop Structure
2.3. Fast Arithmetic Instruction Optimization
2.3.1. Fast Division Operations
2.3.2. Fast Square-Root Operations
2.3.3. Newton–Raphson Iterative Refinement
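
A minimal Python sketch of Newton–Raphson refinement for the two operations named in the preceding headings (a reciprocal for division, a reciprocal square root for square roots). It assumes a low-precision initial estimate is already available, as a fast hardware instruction would provide. The function names and starting values are hypothetical and are not taken from the model's source code.

```python
def refine_reciprocal(a, x0, iterations=1):
    """Refine an estimate x0 of 1/a with Newton-Raphson steps.

    Each step applies x <- x * (2 - a * x), which roughly squares the
    relative error of the estimate.
    """
    x = x0
    for _ in range(iterations):
        x = x * (2.0 - a * x)
    return x


def refine_rsqrt(a, y0, iterations=1):
    """Refine an estimate y0 of 1/sqrt(a) with Newton-Raphson steps.

    Each step applies y <- y * (1.5 - 0.5 * a * y * y).
    """
    y = y0
    for _ in range(iterations):
        y = y * (1.5 - 0.5 * a * y * y)
    return y


# Hypothetical coarse estimates standing in for a fast approximate instruction.
approx_inv_3 = refine_reciprocal(3.0, 0.33)          # one step from 0.33
approx_rsqrt_2 = refine_rsqrt(2.0, 0.7)              # one step from 0.7
quotient = 10.0 * refine_reciprocal(3.0, 0.33, 2)    # 10 / 3 via reciprocal
```

Because the error shrinks quadratically, one or two steps are usually enough to recover most of the precision lost by a coarse initial estimate, at the cost of a few extra multiply-add operations.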
2.4. GPU Parallel Execution Strategy and Resource Utilization Optimization
2.4.1. Parallel Granularity and Thread Organization Strategy
2.4.2. Register Usage Control Optimization
2.5. Decoupling of Background-Field Preprocessing
2.6. Multi-GPU Scalability and Load Balancing
2.7. Validation
2.7.1. Validation Experiments and Reference Benchmarks
2.7.2. Validation Metrics and Accuracy Assessment
2.7.3. Performance Evaluation Design
- Single-GPU acceleration assessment. This evaluation was conducted on a platform equipped with an AMD Ryzen Threadripper 7970X CPU (32 cores) and an NVIDIA GeForce RTX 5080 GPU (hereafter Platform A). The runtime of the GPU-accelerated program was compared with that of the reference benchmark. The fine-grained parallel architecture, the fast arithmetic instruction optimization, and the parallel execution and resource utilization strategies were introduced successively, so that the contribution of each optimization to the overall speedup could be quantified.
- Multi-GPU scalability evaluation. The number of GPUs was increased progressively, and the total execution time and speedup were analyzed as functions of GPU count to assess the parallel scalability of the program in multi-device environments. Because characterizing these trends requires substantial GPU resources, this evaluation was performed on a server equipped with an Intel® Xeon® Platinum 8160 CPU @ 2.10 GHz and eight NVIDIA Tesla V100 GPUs (hereafter Platform B).
- Heterogeneous GPU load-balancing evaluation. An NVIDIA GeForce RTX 5070 GPU was added to Platform A to construct a heterogeneous multi-GPU environment (hereafter Platform C) for evaluating the proposed load-balancing strategy. The distribution of computation time across GPUs was compared with and without the strategy enabled to assess how effectively it balances task allocation and improves overall parallel efficiency under heterogeneous computing conditions.
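
To make the idea behind the heterogeneous load-balancing test concrete, the sketch below splits a particle count across devices in proportion to a per-device throughput figure. This is one simple static scheme chosen for illustration and is not necessarily the strategy evaluated here. The function name and the throughput values are hypothetical.

```python
def split_particles(n_particles, throughputs):
    """Split n_particles across devices in proportion to their throughputs.

    Uses the largest-remainder method so the integer counts always sum
    to n_particles exactly.
    """
    total = sum(throughputs)
    exact = [n_particles * t / total for t in throughputs]
    counts = [int(share) for share in exact]  # floor of each exact share
    leftover = n_particles - sum(counts)
    # Hand the remaining particles to the devices with the largest
    # fractional parts.
    by_remainder = sorted(
        range(len(exact)), key=lambda i: exact[i] - counts[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts


# Hypothetical relative throughputs for a faster and a slower GPU.
counts = split_particles(1000, [2.0, 1.0])
print(counts)
```

With an even split, the slower device would finish last and set the wall-clock time; weighting by throughput aims to make all devices finish at about the same time.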
3. Results and Discussion
3.1. Accuracy Validation
3.2. Computational Performance Evaluation
3.2.1. Single-GPU Acceleration Performance
3.2.2. Multi-GPU Scalability Assessment
3.2.3. Load Balancing Across Heterogeneous GPUs
4. Conclusions
5. Limitations and Future Works
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Deep Convective Particle Redistribution Procedure
Appendix B. Design, Computational Details, and Intermediate Variables of the Preprocessing Program grib_to_bin_mpi
- MPI initialization and process management. Upon startup, the program calls MPI_Init to initialize the MPI environment and retrieves the total number of processes and the rank of the calling process. The root process is responsible for overall coordination.
- Configuration file reading and format detection. The root process reads the COMMAND, RELEASES, and AVAILABLE configuration files to obtain the simulation parameters. The meteorological data format (ECMWF or NCEP) is then identified, and the corresponding format-specific routine is invoked to analyze the grid dimensions. The resulting metadata are written to header.t for subsequent use by FLEXPART.
- Global broadcast and memory allocation. The identified format, grid dimensions, and related parameters are broadcast to all processes via MPI_Bcast. Each process then allocates memory for the required data structures (such as the three-dimensional field arrays), ensuring consistency across all ranks.
- Parallel file processing. GRIB files are distributed among processes according to a pair of per-process assignment arrays. Each process reads its assigned files and invokes the format-specific reading routine to extract wind-field data. After the parameter calculations and necessary transformations are complete, the results are written to binary files. An MPI_Barrier synchronizes all processes, after which the root process generates the AVAILABLE_bin file listing the produced binary files.
- Memory deallocation and program termination.
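
The file-distribution step above can be sketched without MPI as a contiguous block partition. The sketch assumes that each process receives a start index and a file count; the function name, array names, and file names are hypothetical, and the actual assignment rule used by grib_to_bin_mpi may differ.

```python
def block_partition(n_files, n_procs):
    """Return (starts, counts) assigning contiguous file blocks to ranks.

    The first (n_files % n_procs) ranks receive one extra file, so block
    sizes differ by at most one.
    """
    base, extra = divmod(n_files, n_procs)
    counts = [base + (1 if rank < extra else 0) for rank in range(n_procs)]
    starts = [sum(counts[:rank]) for rank in range(n_procs)]
    return starts, counts


# Hypothetical input: ten GRIB files shared by four processes.
files = [f"wind_{i:02d}.grib" for i in range(10)]
starts, counts = block_partition(len(files), 4)
assignment = [files[s:s + c] for s, c in zip(starts, counts)]

for rank, chunk in enumerate(assignment):
    print(rank, chunk)
```

In an MPI program, each rank would evaluate the same partition locally and process only its own slice, so no communication is needed until the final barrier.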
| Variable Short Name | Description |
|---|---|
| | temperature data on half model levels |
| | specific humidity data on half model levels |
| 1 | surface pressure |
| 1 | total cloud cover |
| 1 | 2 m temperature |
| 1 | 2 m dew point |
| 1 | large scale total precipitation |
| 1 | convective precipitation |
| 1 | orography |
| 1 | land sea mask |
| 3 | logical, indicating whether clwc is available (see 1) |
| 1 | friction velocity |
| 1 | convective velocity scale |
| 1 | mixing height |
| 1 | altitude of thermal tropopause |
| 1 | inverse Obukhov length (1/L) |
| 1 | deposition velocity |
| () | wind components in x |
| () | wind components in y |
| () | wind components in z |
| () | wind components in polar stereographic projection |
| () | wind components in polar stereographic projection |
| () | temperature data on internal model levels |
| | specific humidity data on internal model levels |
| () | potential vorticity |
| () | air density |
| () | vertical air density gradient |
| 1 | total cloud water content (= liquid + ice) |
| 1 | cloud bottom height |
| 1 | cloud top |
| 2 | heights of all levels |
| 2 ( 2) | model level heights |
| 2 ( 2) | half-model level heights |
| 3 | number of levels up to maximum PBL height (3500 m) |
| | pressure on half model levels |
| | air pressure RLT |
| Variable Name | Description |
|---|---|
| | storing the input data type (ECMWF/NCEP) |
| | Size of windfield |
| | Size of windfield |
| | Size of windfield |
| | Size of windfield |
| | Size of windfield |
| | actual dimensions of wind fields in x, y and z direction |
| | actual dimensions of wind fields in x, y and z direction |
| | actual dimensions of wind fields in x, y and z direction |
| | same as for limited area fields, but for global fields |
| | vertical dimension of original data (u, v components; staggered grid) |
| | vertical dimension of original data (w component) |
| | number of levels ECMWF model |
| | T for global fields, F for limited area fields |
| | T for global fields, F for limited area fields |
| | T for global fields, F for limited area fields |
| | coefficients which regulate vertical discretization |
| | coefficients which regulate vertical discretization |
| | model discretization coefficients at the centre of the layers |
| | model discretization coefficients at the centre of the layers |
| | grid distance in x direction |
| | grid distance in y direction |
| | auxiliary variables for utransform |
| | auxiliary variables for utransform |
| | geographical longitude of lower left grid point |
| | geographical latitude of lower left grid point |
| | define stereographic projections at the two poles |
| | define stereographic projections at the two poles |
| | use polar stereographic threshold in grid units |
| | use polar stereographic threshold in grid units |
| | maximum number of levels for convection |
| | parameter used in Emanuel’s convect subroutine |
Appendix C. Configuration Parameters in the Validation Benchmark
| Parameter | Setting | Description |
|---|---|---|
| LDIRECT | 1 | Forward simulation |
| IBDATE | 19941023 | Start date of the simulation (YYYYMMDD) |
| IBTIME | 160000 | Start time of the simulation (HHMMSS, UTC) |
| IEDATE | 19941027 | End date of the simulation (YYYYMMDD) |
| IETIME | 100000 | End time of the simulation (HHMMSS, UTC) |
| PARTS | 1,000,000 | Total number of released Lagrangian particles |
| LSYNCTIME | 300 s | Synchronization time step of the simulation |
| DXOUT/DYOUT | | Longitude/latitude resolution of the output grid |
| NX/NY | 320/250 | Number of output grid cells in the longitude/latitude directions |
| LCONVECTION | 1 | Switch on convection parameterization |
| LTURBULENCE | 1 | Switch on turbulence parameterization |
| CBLFLAG | 1 | Skewed turbulence parameterization scheme used in the CBL |
| CTL | 10 | Reduction factor for the turbulence-integration time step |
| IFINE | 10 | Reduction factor for the vertical-transport time step |
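
As a worked example of how the date, time, and LSYNCTIME settings above combine, the sketch below computes the simulated period and the number of synchronization steps it contains. The helper name is hypothetical; the values are taken from the table.

```python
from datetime import datetime


def parse_flexpart_datetime(date, time):
    """Parse YYYYMMDD and HHMMSS integers into a datetime (UTC assumed)."""
    return datetime.strptime(f"{date:08d}{time:06d}", "%Y%m%d%H%M%S")


start = parse_flexpart_datetime(19941023, 160000)  # IBDATE, IBTIME
end = parse_flexpart_datetime(19941027, 100000)    # IEDATE, IETIME
lsynctime_s = 300                                  # LSYNCTIME in seconds

duration_s = int((end - start).total_seconds())
n_sync_steps = duration_s // lsynctime_s

print(duration_s / 3600, "hours,", n_sync_steps, "synchronization steps")
```

The benchmark therefore covers 90 h of simulated time, or 1080 synchronization intervals of 300 s.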
Appendix D. Long-Term Accuracy Assessment of the GPU Model

Appendix E. Resource Utilization of Individual Components Without Imposing a Per-Thread Maximum Register Limit
| Computational Module | Registers per Thread | Occupancy | Computational Throughput | Memory Throughput |
|---|---|---|---|---|
| Advection-Diffusion | 254 | 16.67% | ∼20% | ∼46% |
| Convective mixing | 116 | 33.33% | ∼13% | ∼38% |
| Wet deposition | 250 | 16.67% | ∼2% | ∼4.5% |
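
The link between registers per thread and occupancy in the table above can be illustrated with a simplified register-limited occupancy estimate. The hardware figures (65,536 registers and at most 48 resident warps per streaming multiprocessor, with registers allocated per warp in units of 256) and the block size of 256 threads are assumptions made for this sketch, not values stated in the source. Shared-memory and block-count limits are ignored.

```python
def register_limited_occupancy(regs_per_thread, block_threads=256,
                               regs_per_sm=65536, max_warps_per_sm=48,
                               warp_size=32, alloc_unit=256):
    """Estimate occupancy when registers are the only limiting resource.

    All hardware parameters are assumed defaults for illustration.
    """
    warps_per_block = -(-block_threads // warp_size)  # ceiling division
    # Registers are assumed to be allocated per warp, rounded up to a
    # multiple of alloc_unit.
    regs_per_warp = regs_per_thread * warp_size
    regs_per_warp = -(-regs_per_warp // alloc_unit) * alloc_unit
    warps_by_registers = regs_per_sm // regs_per_warp
    # Only whole blocks can be resident on a multiprocessor.
    resident_blocks = min(warps_by_registers // warps_per_block,
                          max_warps_per_sm // warps_per_block)
    return resident_blocks * warps_per_block / max_warps_per_sm


for regs in (254, 116, 250):
    print(regs, f"{100 * register_limited_occupancy(regs):.2f}%")
```

Under these assumptions the estimate gives 16.67% for 254 and 250 registers per thread and 33.33% for 116, matching the occupancies in the table, which is consistent with register pressure being the limiting factor when no per-thread register cap is imposed.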
Appendix F. Performance Comparison for Different Particle Numbers
| Particle Number | CPU Total Time (s) | GPU Total Time (s) | Speedup |
|---|---|---|---|
| 100,000 | 1168.47 | 37.79 | 30.92× |
| 200,000 | 1830.58 | 48.61 | 37.66× |
| 500,000 | 3678.58 | 87.27 | 42.15× |
| 1,000,000 | 6598.00 | 126.77 | 52.05× |
| 2,000,000 | 13,550.26 | 278.87 | 48.59× |
| 5,000,000 | 30,103.03 | 630.61 | 47.74× |
References
- Zhang, X.; Wang, J. Atmospheric dispersion of chemical, biological, and radiological hazardous pollutants: Informing risk assessment for public safety. J. Saf. Sci. Resil. 2022, 3, 372–397.
- Yao, R. Atmospheric dispersion of radioactive material in radiological risk assessment and emergency response. Prog. Nucl. Sci. Technol. 2011, 1, 7–13.
- Sugiyama, G.; Nasstrom, J.; Pobanz, B.; Foster, K.; Simpson, M.; Vogt, P.; Aluzzi, F.; Homann, S. Atmospheric Dispersion Modeling: Challenges of the Fukushima Daiichi Response. Health Phys. 2012, 102, 493–508.
- Hernández-Ceballos, M.A.; Sangiorgi, M.; García-Puerta, B.; Montero, M.; Trueba, C. Dispersion and ground deposition of radioactive material according to airflow patterns for enhancing the preparedness to N/R emergencies. J. Environ. Radioact. 2020, 216, 106178.
- Ulimoen, M.; Berge, E.; Klein, H.; Salbu, B.; Lind, O.C. Comparing model skills for deterministic versus ensemble dispersion modelling: The Fukushima Daiichi NPP accident as a case study. Sci. Total Environ. 2022, 806, 150128.
- Xu, Y.; Li, X.; Luo, H.; Wang, W.; Fang, S. Source reconstruction for atmospheric radionuclide leakage: Recent advances in decoding information from atmospheric transport physics. J. Hazard. Mater. 2025, 497, 139534.
- Rakesh, P.T.; Venkatesan, R.; Srinivas, C.V. Formulation of TKE based empirical diffusivity relations from turbulence measurements and incorporation in a Lagrangian particle dispersion model. Environ. Fluid Mech. 2013, 13, 353–369.
- Zhang, X.; Efthimiou, G.; Wang, Y.; Huang, M. Comparisons between a new point kernel-based scheme and the infinite plane source assumption method for radiation calculation of deposited airborne radionuclides from nuclear power plants. J. Environ. Radioact. 2018, 184–185, 32–45.
- Pudykiewicz, J. Simulation of the Chernobyl dispersion with a 3-D hemispheric tracer model. Tellus B 1989, 41, 391–412.
- Leelőssy, Á.; Mészáros, R.; Lagzi, I. Short and long term dispersion patterns of radionuclides in the atmosphere around the Fukushima Nuclear Power Plant. J. Environ. Radioact. 2011, 102, 1117–1121.
- Christoudias, T.; Proestos, Y.; Lelieveld, J. Atmospheric Dispersion of Radioactivity from Nuclear Power Plant Accidents: Global Assessment and Case Study for the Eastern Mediterranean and Middle East. Energies 2014, 7, 8338–8354.
- Hu, X.; Li, D.; Huang, H.; Shen, S.; Bou-Zeid, E. Modeling and sensitivity analysis of transport and deposition of radionuclides from the Fukushima Dai-ichi accident. Atmos. Chem. Phys. 2014, 14, 11065–11092.
- Christoudias, T.; Lelieveld, J. Modelling the global atmospheric transport and deposition of radionuclides from the Fukushima Dai-ichi nuclear accident. Atmos. Chem. Phys. 2013, 13, 1425–1438.
- Pisso, I.; Sollum, E.; Grythe, H.; Kristiansen, N.I.; Cassiani, M.; Eckhardt, S.; Arnold, D.; Morton, D.; Thompson, R.L.; Groot Zwaaftink, C.D.; et al. The Lagrangian particle dispersion model FLEXPART version 10.4. Geosci. Model Dev. 2019, 12, 4955–4997.
- Bakels, L.; Tatsii, D.; Tipka, A.; Thompson, R.; Dütsch, M.; Blaschek, M.; Seibert, P.; Baier, K.; Bucci, S.; Cassiani, M.; et al. FLEXPART version 11: Improved accuracy, efficiency, and flexibility. Geosci. Model Dev. 2024, 17, 7595–7627.
- Cassiani, M.; Stohl, A.; Brioude, J. Lagrangian stochastic modelling of dispersion in the convective boundary layer with skewed turbulence conditions and a vertical density gradient: Formulation and implementation in the FLEXPART model. Bound.-Layer Meteorol. 2015, 154, 367–390.
- Van Thielen, S.; Turcanu, C.; Camps, J.; Keppens, R. Optimizing the calculation grid for atmospheric dispersion modelling. J. Environ. Radioact. 2015, 142, 103–112.
- Sørensen, J.H.; Bartnicki, J.; Blixt Buhr, A.M.; Feddersen, H.; Hoe, S.C.; Israelson, C.; Klein, H.; Lauritzen, B.; Lindgren, J.; Schönfeldt, F.; et al. Uncertainties in atmospheric dispersion modelling during nuclear accidents. J. Environ. Radioact. 2020, 222, 106356.
- International Nuclear Safety Advisory Group. The Chernobyl Accident: Updating of INSAG-1; International Atomic Energy Agency: Vienna, Austria, 1992.
- Saunier, O.; Mathieu, A.; Didier, D.; Tombette, M.; Quélo, D.; Winiarek, V.; Bocquet, M. An inverse modeling method to assess the source term of the Fukushima Nuclear Power Plant accident using gamma dose rate observations. Atmos. Chem. Phys. 2013, 13, 11403–11421.
- Jammal, R.; Vincze, P.; Heitsch, M.; Dobrzynski, L.; Dolganov, K.; Duspiva, J.; Grant, I.; Guerpinar, A.; Hirano, M.; Khouaja, H. The Fukushima Daiichi Accident; International Atomic Energy Agency: Vienna, Austria, 2015.
- Snoun, H.; Bellakhal, G.; Kanfoudi, H.; Zhang, X.; Chahed, J. One-way coupling of WRF with a Gaussian dispersion model: A focused fine-scale air pollution assessment on southern Mediterranean. Environ. Sci. Pollut. Res. 2019, 26, 22892–22906.
- He, J.; Lyu, M.; Qiu, Z.; He, X.; Lu, B.; Wang, J.; Shen, S.; Zhang, X. Physics-informed optimization for emergency radiation assessment with temporal correction under meteorological uncertainty. J. Environ. Radioact. 2026, 291, 107817.
- Zhang, X.; Raskob, W.; Landman, C.; Trybushnyi, D.; Li, Y. Sequential multi-nuclide emission rate estimation method based on gamma dose rate measurement for nuclear emergency management. J. Hazard. Mater. 2017, 325, 288–300.
- Dong, X.; Fang, S.; Zhuang, S.; Xu, Y.; Zhao, Y.; Sheng, L. Objective inversion of the continuous atmospheric 137Cs release following the Fukushima accident. J. Hazard. Mater. 2023, 447, 130786.
- Dong, X.; Zhuang, S.; Xu, Y.; Hu, H.; Li, X.; Fang, S. Multi-scenario validation of the robust inversion method with biased plume range and values. J. Environ. Radioact. 2024, 272, 107363.
- Xu, Y.; Fang, S.; Dong, X.; Zhuang, S. A spatiotemporally separated framework for reconstructing the sources of atmospheric radionuclide releases. Geosci. Model Dev. 2024, 17, 4961–4982.
- Xu, Y.; Dong, X.; Luo, H.; Fang, S. Robust source reconstruction of atmospheric radionuclides from observations of different sparsity with spatial preselection and non-smooth constraints. J. Hazard. Mater. 2025, 486, 136919.
- Zhang, X.L.; Su, G.F.; Yuan, H.Y.; Chen, J.G.; Huang, Q.Y. Modified ensemble Kalman filter for nuclear accident atmospheric dispersion: Prediction improved and source estimated. J. Hazard. Mater. 2014, 280, 143–155.
- Zhang, X.L.; Li, Q.B.; Su, G.F.; Yuan, M.Q. Ensemble-based simultaneous emission estimates and improved forecast of radioactive pollution from nuclear power plant accidents: Application to ETEX tracer experiment. J. Environ. Radioact. 2015, 142, 78–86.
- Zhang, X.L.; Su, G.F.; Chen, J.G.; Raskob, W.; Yuan, H.Y.; Huang, Q.Y. Iterative ensemble Kalman filter for atmospheric dispersion in nuclear accidents: An application to Kincaid tracer experiment. J. Hazard. Mater. 2015, 297, 329–339.
- Jianyao, Y.; Yuan, H.; Su, G.; Wang, J.; Weng, W.; Zhang, X. Machine learning-enhanced high-resolution exposure assessment of ultrafine particles. Nat. Commun. 2025, 16, 1209.
- Huang, S.X.; Zhang, J.P.; Yang, W.D.; Wang, Z.F.; Hu, F.; Liu, F.; Sheng, L.; Zeng, Q.C. Predicting and Controlling Nuclear Accident Hazards: Issues and Challenges. Aerosol Air Qual. Res. 2016, 16, 417–429.
- Sun, S.; Li, H.; Fang, S. A forward-backward coupled source term estimation for nuclear power plant accident: A case study of loss of coolant accident scenario. Ann. Nucl. Energy 2017, 104, 64–74.
- Saunier, O.; Korsakissok, I.; Didier, D.; Doursout, T.; Mathieu, A. Real-time use of inverse modeling techniques to assess the atmospheric accidental release of a nuclear power plant. Radioprotection 2020, 55, 107–115.
- Fang, S.; Dong, X.; Zhuang, S.; Tian, Z.; Chai, T.; Xu, Y.; Zhao, Y.; Sheng, L.; Ye, X.; Xiong, W. Oscillation-free source term inversion of atmospheric radionuclide releases with joint model bias corrections and non-smooth competing priors. J. Hazard. Mater. 2022, 440, 129806.
- Andronopoulos, S.; Kovalets, I.V. Method of Source Identification Following an Accidental Release at an Unknown Location Using a Lagrangian Atmospheric Dispersion Model. Atmosphere 2021, 12, 1305.
- Cui, W.; Cao, B.; Fan, Q.; Fan, J.; Chen, Y. Source term inversion of nuclear accident based on deep feedforward neural network. Ann. Nucl. Energy 2022, 175, 109257.
- Hoffmann, L.; Haghighi Mood, K.; Herten, A.; Hrywniak, M.; Kraus, J.; Clemens, J.; Liu, M. Accelerating Lagrangian transport simulations on graphics processing units: Performance optimizations of Massive-Parallel Trajectory Calculations (MPTRAC) v2.6. Geosci. Model Dev. 2024, 17, 4077–4094.
- Ling, Y.; Liu, C.; Shan, Q.; Hei, D.; Zhang, X.; Shi, C.; Jia, W.; Yue, Q.; Wang, J. Source term inversion of short-lived nuclides in complex nuclear accidents based on machine learning using off-site gamma dose rate. J. Hazard. Mater. 2024, 465, 133388.
- Zhao, Y.; Liu, Y.; Wang, L.; Cheng, J.; Wang, S.; Li, Q. Source Reconstruction of Atmospheric Releases by Bayesian Inference and the Backward Atmospheric Dispersion Model: An Application to ETEX-I Data. Sci. Technol. Nucl. Install. 2021, 2021, 5558825.
- Li, Q.-Y.; Zhang, J.; Lian, B.; Liu, L.; Qiu, R.; Li, J. A Bayesian Source Term inversion Method Based on Spatiotemporal Trajectory Prior and Joint Adaptive MCMC Sampling. ChinaXiv 2025.
- Xu, Y.; Dong, X.; Fang, S. Efficient Bayesian source reconstruction and uncertainty quantification of atmospheric radionuclide releases by replacing release rate sampling with Maximum-A-Posteriori estimation of time-varying release rates. J. Hazard. Mater. 2025, 492, 138171.
- Harvey, P.; Hameed, S.; Vanderbauwhede, W. Accelerating Lagrangian particle dispersion in the atmosphere with OpenCL across multiple platforms. In IWOCL ’14: Proceedings of the International Workshop on OpenCL 2013 & 2014; Association for Computing Machinery: New York, NY, USA, 2014.
- Santos, M.C.; Pinheiro, A.; Schirru, R.; Pereira, C.M.N.A. GPU-based implementation of a real-time model for atmospheric dispersion of radionuclides. Prog. Nucl. Energy 2019, 110, 245–259.
- Yu, F.; Strazdins, P.; Henrichs, J.; Pugh, T. Shared Memory and GPU Parallelization of an Operational Atmospheric Transport and Dispersion Application. In Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Rio de Janeiro, Brazil, 20–24 May 2019.
- Kong, B.; Dai, T.; Xu, L.; Li, B.; Dai, N.; Xiao, B. CPU-GPU concurrent computing algorithm of particle transport using discontinuous finite element discrete ordinates with unstructured grids. ChinaXiv 2025.
- Stohl, A.; Hittenberger, M.; Wotawa, G. Validation of the Lagrangian particle dispersion model FLEXPART against large-scale tracer experiment data. Atmos. Environ. 1998, 32, 4245–4264.
- Muhammad, H.; Xuan, W.; Wang, M.; Su, G. Review of spatial scale dispersion models (ATDMs) to simulate environmental dispersion and deposition of radionuclides and the overview of GIS coupling with dispersion models. Int. J. Adv. Nucl. React. Des. Technol. 2024, 6, 256–280.
- Zeng, J.; Matsunaga, T.; Mukai, H. Using NVIDIA GPU for modelling the Lagrangian particle dispersion in the atmosphere. In Proceedings of the 5th International Congress on Environmental Modelling and Software, Ottawa, ON, Canada, 5–8 July 2010.
- Van Dop, H.; Addis, R.; Fraser, G.; Girardi, F.; Graziani, G.; Inoue, Y.; Kelly, N.; Klug, W.; Kulmala, A.; Nodop, K.; et al. ETEX: A European tracer experiment; observations, dispersion modelling and emergency response. Atmos. Environ. 1998, 32, 4089–4094.
- Emanuel, K.A.; Živković Rothman, M. Development and Evaluation of a Convection Scheme for Use in Climate Models. J. Atmos. Sci. 1999, 56, 1766–1782.
- Seinfeld, J.H.; Pandis, S.N. Atmospheric Chemistry and Physics: From Air Pollution to Climate Change; John Wiley & Sons: Hoboken, NJ, USA, 2016.
- Lin, J.C.; Gerbig, C.; Wofsy, S.C.; Andrews, A.E.; Daube, B.C.; Davis, K.J.; Grainger, C.A. A near-field tool for simulating the upstream influence of atmospheric observations: The Stochastic Time-Inverted Lagrangian Transport (STILT) model. J. Geophys. Res. Atmos. 2003, 108, 4493.
- Song, C.K.; Kim, C.H.; Lee, S.H.; Park, S.U. A 3-D Lagrangian particle dispersion model with photochemical reactions. Atmos. Environ. 2003, 37, 4607–4623.
- Forster, C.; Stohl, A.; Seibert, P. Parameterization of convective transport in a Lagrangian particle dispersion model and its evaluation. J. Appl. Meteorol. Climatol. 2007, 46, 403–422.
- Dietz, R. Perfluorocarbon Tracer Technology; Technical Report; Brookhaven National Lab.: Upton, NY, USA, 1985.
- Nodop, K.; Connolly, R.; Girardi, F. The field campaigns of the European Tracer Experiment (ETEX): Overview and results. Atmos. Environ. 1998, 32, 4095–4108.
- Sakdhnagool, P.; Sabne, A.; Eigenmann, R. RegDem: Increasing GPU performance via shared memory register spilling. arXiv 2019, arXiv:1907.02894.











| Author (Year) | Convergence Iterations |
|---|---|
| Zhao et al. [41] | ∼3000 |
| Li et al. [42] | ∼2400 |
| Xu et al. [43] | 1000–5000 |
| Metric | Mathematical Definition |
|---|---|
| FB | FB |
| RMSE | RMSE |
| FA2 | FA2 |
| FA5 | FA5 |
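
For reference, the four metrics listed above can be computed as in the following Python sketch. It uses commonly cited textbook forms, which are an assumption here: the sign convention of FB, any normalization of RMSE, and the treatment of zero-valued pairs in FA2 and FA5 may differ from the definitions used in this study. The sample values are hypothetical.

```python
import math


def fractional_bias(obs, pred):
    """FB = 2 * (mean(pred) - mean(obs)) / (mean(pred) + mean(obs))."""
    mean_obs = sum(obs) / len(obs)
    mean_pred = sum(pred) / len(pred)
    return 2.0 * (mean_pred - mean_obs) / (mean_pred + mean_obs)


def rmse(obs, pred):
    """Root-mean-square error over paired values (no normalization)."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def fraction_within_factor(obs, pred, k):
    """Fraction of strictly positive pairs with obs / k <= pred <= obs * k.

    Pairs containing a zero or negative value are dropped (an assumption).
    """
    pairs = [(o, p) for o, p in zip(obs, pred) if o > 0 and p > 0]
    hits = sum(1 for o, p in pairs if o / k <= p <= o * k)
    return hits / len(pairs)


# Hypothetical paired observed and simulated concentrations.
observed = [1.0, 2.0, 4.0, 8.0]
simulated = [1.0, 3.0, 1.0, 8.0]

fb = fractional_bias(observed, simulated)
err = rmse(observed, simulated)
fa2 = fraction_within_factor(observed, simulated, 2.0)
fa5 = fraction_within_factor(observed, simulated, 5.0)
```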
| Metric | CPU Reference Implementation | GPU Implementation |
|---|---|---|
| FB | 0.43 | 0.4 |
| RMSE | 0.58 | 0.57 |
| FA2 | 0.7 | 0.7 |
| FA5 | 0.74 | 0.74 |
| Computational Module | Execution Time (s) | Time Fraction (%) |
|---|---|---|
| Advection-Diffusion | 55.07 | 9.53 |
| Get fields | 454.04 | 78.56 |
| Convective mixing | 12.95 | 2.24 |
| Wet deposition | 4.72 | 0.82 |
| Statistics & Output | 20.42 | 3.53 |
| Others | 30.76 | 5.32 |
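
The profile above shows that reading the background fields dominates the runtime. A short Amdahl's-law calculation, sketched below from the tabulated times, indicates how much the total could shrink if only that module were accelerated. Treating the modules as strictly sequential and non-overlapping is an assumption of this sketch, and the function name is hypothetical.

```python
# Execution times in seconds, copied from the profile table.
module_times = {
    "Advection-Diffusion": 55.07,
    "Get fields": 454.04,
    "Convective mixing": 12.95,
    "Wet deposition": 4.72,
    "Statistics & Output": 20.42,
    "Others": 30.76,
}

total_time = sum(module_times.values())
get_fields_fraction = module_times["Get fields"] / total_time


def amdahl_speedup(fraction, factor):
    """Overall speedup when a fraction of the runtime is sped up by factor."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# Upper bound: the accelerated part takes no time at all.
upper_bound = 1.0 / (1.0 - get_fields_fraction)

print(f"total = {total_time:.2f} s")
print(f"Get fields share = {100 * get_fields_fraction:.2f}%")
print(f"upper bound on overall speedup = {upper_bound:.2f}x")
print(f"overall speedup if Get fields runs 10x faster = "
      f"{amdahl_speedup(get_fields_fraction, 10.0):.2f}x")
```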
| Number of Processes | Execution Time (s) | Speedup |
|---|---|---|
| CPU baseline | 454.04 | N/A |
| 1 | 490.27 | 0.93 |
| 2 | 252.24 | 1.80 |
| 3 | 172.80 | 2.63 |
| 6 | 93.31 | 4.87 |
| 9 | 65.23 | 6.99 |
| 18 | 39.20 | 11.58 |
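
The speedups in the table above are ratios of the CPU baseline time to the time measured with a given number of processes. The sketch below recomputes them from the tabulated times and adds the parallel efficiency (speedup divided by process count), a derived quantity that is not reported in the table. The function names are hypothetical.

```python
# Execution times in seconds, copied from the table.
baseline_s = 454.04
mpi_times_s = {1: 490.27, 2: 252.24, 3: 172.80, 6: 93.31, 9: 65.23, 18: 39.20}


def speedup(n_procs):
    """Speedup of the n_procs run relative to the CPU baseline."""
    return baseline_s / mpi_times_s[n_procs]


def efficiency(n_procs):
    """Parallel efficiency: speedup per process."""
    return speedup(n_procs) / n_procs


for n in sorted(mpi_times_s):
    print(f"{n:2d} processes: speedup {speedup(n):5.2f}, "
          f"efficiency {100 * efficiency(n):5.1f}%")
```

Efficiency falls as processes are added, which is the expected pattern when per-file work is fixed and shared costs such as file-system bandwidth and synchronization do not shrink with the process count.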
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Li, Q.; He, T.; Li, M.; Zhang, J.; Lian, B.; Liu, L.; Qiu, R.; Li, J. Implementation of a GPU-Accelerated Lagrangian Particle Dispersion Model for Atmospheric Transport of Radioactive Nuclides. Atmosphere 2026, 17, 573. https://doi.org/10.3390/atmos17060573

