# A Hybrid MPI/OpenMP Parallelization Scheme Based on Nested FDTD for Parametric Decay Instability


## Abstract


## 1. Introduction

The PDI process involves widely separated spatial scales: the large-scale simulation region ($10^{5}$ m), pump wave wavelengths ($10^{1}$ m), and electrostatic waves excited by wave mode conversion ($10^{-1}$ m) [17,18,19]. When simulating PDI processes, the minimum scale of the discrete grid should therefore be less than $10^{-1}$ m, and the time step must satisfy the Courant–Friedrichs–Lewy (CFL) condition [20]. Consequently, the spatial grid reaches $10^{6}$ points, and a 1D simulation over a 200 km region with a grid resolution of 2 decimeters (the same discretization scheme as the coarse-mesh region in Figure 1b) takes up to 600 h. Furthermore, to examine the general characteristics of the interaction between EM waves and a suddenly generated density cavity, several simulations with different cavity depths must be designed. With a serial code, the total simulation time becomes unacceptable.

## 2. Mathematical Model

#### 2.1. Governing Equation

${N}_{\alpha}$ is the number density in m$^{-3}$, and ${\mathit{U}}_{\alpha}$ is the time-varying fluid bulk velocity vector. Here, $\alpha$ represents an electron or an oxygen ion. $\mathit{E}$, ${\mathit{U}}_{\alpha}$, $\mathit{H}$, and ${N}_{\alpha}$ are all functions of the spatial coordinates (x, y, z) and time (T).

#### 2.2. Discretization Scheme

## 3. Serial Program

#### 3.1. Serial Program Design

The minimum scale of the discrete grid is less than $10^{-1}$ m, and the time step needs to meet the CFL condition. The total number of updates was determined by the time step. Within each cycle, every spatial grid point was updated for each physical variable ($H$, $U$, $N$, and $E$) in the data update module (module 1), and the storage operation for the updated results was performed in the data storage module (module 2).
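The two-module structure just described can be sketched as the following serial skeleton. The field arrays, the placeholder update arithmetic, and the snapshot stub are illustrative only; they are not the paper's discretized equations.

```c
/* Skeleton of the serial time-marching structure described above:
   module 1 updates H, U, N, and E over all cells each time step, and
   module 2 stores a snapshot once every n steps (one sampling period).
   The update arithmetic is a placeholder, not the paper's scheme. */
#define NCELLS 8 /* tiny grid for illustration */

typedef struct {
    double H[NCELLS], U[NCELLS], N[NCELLS], E[NCELLS];
} Fields;

static void update_fields(Fields *f) { /* module 1 (placeholder math) */
    for (int i = 0; i < NCELLS; i++) {
        f->H[i] += 1.0;
        f->U[i] += 1.0;
        f->N[i] += 1.0;
        f->E[i] += 1.0;
    }
}

static int store_snapshot(const Fields *f) { /* module 2 (stub) */
    (void)f; /* a real code would write the arrays to disk here */
    return 1;
}

/* march `steps` time steps, storing every `n`; returns snapshot count */
int run_serial(Fields *f, int steps, int n) {
    int snapshots = 0;
    for (int t = 1; t <= steps; t++) {
        update_fields(f);                   /* module 1 */
        if (t % n == 0)
            snapshots += store_snapshot(f); /* module 2 */
    }
    return snapshots;
}
```

The key point for the parallelization that follows is that module 1 dominates the per-step cost while module 2 runs only once per sampling period, so the two modules can be treated separately.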

#### 3.2. Serial Program Performance Analysis

## 4. MPI and OpenMP Parallelization of FDTD

The spatial grid of the model contains $10^{5}$–$10^{6}$ points, and OpenMP is more suitable for parallelizing this workload than MPI or CUDA, since those tend to target much larger spatial grids [29]. OpenMP also avoids some of the extra parallelism overhead, such as the message passing required by MPI and CUDA. Based on the above discussion, a parallel program framework for the hybrid programming of OpenMP and MPI was established for the full model, as shown in the third panel of Figure 3.
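As a minimal illustration of the loop-level OpenMP approach (a sketch under assumed per-cell independence, not the paper's code): within one time step the per-cell field updates do not depend on each other, so the cell loop can be distributed across threads with a single directive.

```c
#include <stddef.h>

/* Illustrative OpenMP parallelization of one field-update sweep: cells are
   independent within a step, so the loop splits cleanly across threads.
   If compiled without OpenMP support, the pragma is ignored and the loop
   runs serially with the same result. The update formula is a placeholder,
   not the paper's discretized equation. */
void update_E_parallel(double *E, const double *H, size_t n, double coef) {
    #pragma omp parallel for
    for (long i = 1; i < (long)n; i++)
        E[i] += coef * (H[i] - H[i - 1]);
}
```

Because each iteration touches only `E[i]` and reads `H`, no synchronization beyond the implicit barrier at the end of the loop is needed, which is what keeps the OpenMP overhead low relative to message passing.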

#### 4.1. Data Update Module Parallel Scheme

#### 4.2. Data Storage Module Parallel Scheme

#### 4.3. Adaptive Allocation of the Number of Threads and Tasks

## 5. Results

## 6. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

1. Pedersen, T.; Gustavsson, B.; Mishin, E.; Kendall, E.; Mills, T.; Carlson, H.C.; Snyder, A.L. Creation of artificial ionospheric layers using high-power HF waves. *Geophys. Res. Lett.* **2010**, *37*, L02106.
2. Rietveld, M.T.; Kohl, H.; Kopka, H.; Stubbe, P. Introduction to ionospheric heating at Tromsø—I. Experimental overview. *J. Atmos. Terr. Phys.* **1993**, *55*, 577–599.
3. Rietveld, M.T.; Senior, A.; Markkanen, J.; Westman, A. New capabilities of the upgraded EISCAT high-power HF facility. *Radio Sci.* **2016**, *51*, 1533–1546.
4. Streltsov, A.V.; Berthelier, J.J.; Chernyshov, A.A.; Frolov, V.L.; Honary, F.; Kosch, M.J.; McCoy, R.P.; Mishin, E.V.; Rietveld, M.T. Past, present and future of active radio frequency experiments in space. *Space Sci. Rev.* **2018**, *214*, 118.
5. Zhou, C.; Wang, X.; Liu, M.; Ni, B.; Zhao, Z. Nonlinear processes in ionosphere: Report on the mechanisms of ionospheric heating by the powerful radio waves. *Chin. J. Geophys.* **2018**, *61*, 4323–4336.
6. Hocke, K.; Liu, H.X.; Pedatella, N.; Ma, G.Y. Global sounding of F region irregularities by COSMIC during a geomagnetic storm. *Ann. Geophys.* **2019**, *37*, 235–242.
7. Gurevich, A.V. Nonlinear effects in the ionosphere. *Phys. Usp.* **2007**, *50*, 1091.
8. Doe, R.A.; Mendillo, M.; Vickrey, J.F.; Zanetti, L.J.; Eastes, R.W. Observations of nightside auroral cavities. *J. Geophys. Res. Space Phys.* **1993**, *98*, 293–310.
9. Streltsov, A.V.; Lotko, W. Coupling between density structures, electromagnetic waves and ionospheric feedback in the auroral zone. *J. Geophys. Res. Space Phys.* **2008**, *113*, A05212.
10. Zettergren, M.; Lynch, K.; Hampton, D.; Nicolls, M.; Wright, B.; Conde, M.; Moen, J.; Lessard, M.; Miceli, R.; Powell, S. Auroral ionospheric F region density cavity formation and evolution: MICA campaign results. *J. Geophys. Res. Space Phys.* **2014**, *119*, 3162–3178.
11. Robinson, T.R. The heating of the high latitude ionosphere by high power radio waves. *Phys. Rep.* **1989**, *179*, 79–209.
12. Yee, K. Numerical solution of initial boundary value problems involving Maxwell's equations in isotropic media. *IEEE Trans. Antennas Propag.* **1966**, *14*, 302–307.
13. Simpson, J.J. Current and future applications of 3-D global Earth–ionosphere models based on the full-vector Maxwell's equations FDTD method. *Surv. Geophys.* **2009**, *30*, 105–130.
14. Young, J.L. A full finite difference time domain implementation for radio wave propagation in a plasma. *Radio Sci.* **1994**, *29*, 1513–1522.
15. Yu, Y.; Simpson, J.J. An E-J collocated 3-D FDTD model of electromagnetic wave propagation in magnetized cold plasma. *IEEE Trans. Antennas Propag.* **2010**, *58*, 469–478.
16. Gondarenko, N.A.; Ossakow, S.L.; Milikh, G.M. Generation and evolution of density irregularities due to self-focusing in ionospheric modifications. *J. Geophys. Res. Space Phys.* **2005**, *110*, A09304.
17. Eliasson, B. Full-scale simulation study of the generation of topside ionospheric turbulence using a generalized Zakharov model. *Geophys. Res. Lett.* **2008**, *35*, L11104.
18. Eliasson, B. A nonuniform nested grid method for simulations of RF induced ionospheric turbulence. *Comput. Phys. Commun.* **2008**, *178*, 8–14.
19. Eliasson, B. Full-scale simulations of ionospheric Langmuir turbulence. *Mod. Phys. Lett. B* **2013**, *27*, 1330005.
20. Huang, J.; Zhou, C.; Liu, M.-R.; Wang, X.; Zhang, Y.-N.; Zhao, Z.-Y. Study of parametric decay instability in ionospheric heating of powerful waves (I): Numerical simulation. *Chin. J. Geophys.* **2017**, *60*, 3693–3706.
21. Cannon, P.D.; Honary, F. A GPU-accelerated finite-difference time-domain scheme for electromagnetic wave interaction with plasma. *IEEE Trans. Antennas Propag.* **2015**, *63*, 3042–3054.
22. Chaudhury, B.; Gupta, A.; Shah, H.; Bhadani, S. Accelerated simulation of microwave breakdown in gases on Xeon Phi based cluster—Application to self-organized plasma pattern formation. *Comput. Phys. Commun.* **2018**, *229*, 20–35.
23. Yang, Q.; Wei, B.; Li, L.; Ge, D. Analysis of the calculation of a plasma sheath using the parallel SO-DGTD method. *Int. J. Antennas Propag.* **2019**, *2019*, 7160913.
24. Sharma, K.K.; Joshi, S.D.; Sharma, S. Advances in Shannon sampling theory. *Def. Sci. J.* **2013**, *63*, 41–45.
25. Gabriel, E.; Fagg, G.E.; Bosilca, G.; Angskun, T.; Dongarra, J.J.; Squyres, J.M.; Sahay, V.; Kambadur, P.; Barrett, B.; Lumsdaine, A.; et al. Open MPI: Goals, concept, and design of a next generation MPI implementation. In Proceedings of the 11th European PVM/MPI Users' Group Meeting, Budapest, Hungary, 19–22 September 2004; Volume 3241, pp. 97–104.
26. Gropp, W.; Lusk, E.; Doss, N.; Skjellum, A. A high-performance, portable implementation of the MPI message passing interface standard. *Parallel Comput.* **1996**, *22*, 789–828.
27. Dagum, L.; Menon, R. OpenMP: An industry standard API for shared-memory programming. *IEEE Comput. Sci. Eng.* **1998**, *5*, 46–55.
28. Rabenseifner, R.; Hager, G.; Jost, G. Hybrid MPI/OpenMP parallel programming on clusters of multi-core SMP nodes. In Proceedings of the 17th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, Weimar, Germany, 18–20 February 2009; IEEE: Piscataway, NJ, USA, 2009; pp. 427–436.
29. Miri Rostami, S.R.; Ghaffari-Miab, M. Finite difference generated transient potentials of open-layered media by parallel computing using OpenMP, MPI, OpenACC, and CUDA. *IEEE Trans. Antennas Propag.* **2019**, *67*, 6541–6550.
30. Quinn, M.J. Parallel Programming in C with MPI and OpenMP; McGraw-Hill: New York, NY, USA, 2000; ISBN 0-07-282256-2.
31. Chen, F.F. Introduction to Plasma Physics and Controlled Fusion; Plenum Press: New York, NY, USA, 1984; pp. 82–94.

**Figure 1.** (**a**) Schematic of the simulation model of EM wave propagation in the ionosphere: $B_{0}$ is the geomagnetic field; $k_{EM}$ is the propagation direction of the injected EM wave; $E_{x}$ and $H_{y}$ are the initial polarization directions of the electric field and magnetic field of the injected EM wave, respectively; the dotted line represents the approximate location of the cavity. (**b**) Spatial–temporal discretization scheme of the physical parameters, with the positions of field nodes indicated.

**Figure 6.**Vertical electric field $\left|{E}_{z}\right|$ in the altitude range of 266.85–267 km at 1.5194 ms with different resolutions.

**Figure 8.** (**a**) The height distribution of the Langmuir wave number. (**b**) Spectrogram analysis for ${E}_{z}$ at an altitude of 266.9032 km.

**Figure 9.** Vertical electric field $\left|{E}_{z}\right|$ (**top**) and ion density perturbation $\Delta {N}_{i}$ (**bottom**) in the altitude range of 266.85–267 km at 1.5194 ms.

**Figure 11.** (**a**) Ion density profile with a cavity depth of 15%; (**b**,**d**) absolute value of the vertical electric field $\left|{E}_{z}\right|$ and ion density perturbation $\Delta {N}_{i}$ with a cavity depth of 15% in the altitude range of 266.85–267 km at 1.5194 ms; (**c**,**e**) the same quantities without a cavity.

**Figure 12.** (**a**) Power spectral density over the height range of 266.85–267 km at 1.5194 ms with different cavity depths; (**b**) peak of the power spectral density as a function of the cavity depth over the height range of 266.85–267 km at different times.

Grid Cells | Single Cyclical Update Time T1 (s) | Storage Time T2 (s) | Sampling Period (n = Δ/Δt) | Time per Sampling Period T3 = T1 × n + T2 (s) | Ratio of Storage to Total Time (T2/T3)
---|---|---|---|---|---
35,000 | 0.0089 | 1.576 | 6 | 1.656 | 96.72%
350,000 | 0.0936 | 11.94 | 90 | 20.364 | 58.63%
2,000,000 | 1.04 | 76.35 | 500 | 596.4 | 12.80%
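The accounting in the last two columns follows directly from the formula in the table header; as a small sketch:

```c
/* Cost model from the table above: time per sampling period
   T3 = T1 * n + T2, and the fraction of that period spent on
   storage is T2 / T3. */
double sampling_period_time(double t1, int n, double t2) {
    return t1 * (double)n + t2;
}

double storage_fraction(double t1, int n, double t2) {
    return t2 / sampling_period_time(t1, n, t2);
}
```

For the 350,000-cell row this gives T3 = 20.364 s with a storage fraction of about 58.63%, reproducing the table and showing why the storage module, not the update module, dominates at small grid sizes.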

Grid Cells | 1 Thread (s) | 4 Threads (s) | 6 Threads (s) | 12 Threads (s) | 24 Threads (s) | 32 Threads (s)
---|---|---|---|---|---|---
35,000 | 0.008921 | 0.009782 | 0.008573 | 0.008273 | 0.006159 | 0.005948
350,000 | 0.09364 | 0.07574 | 0.073465 | 0.071703 | 0.047205 | 0.039629
2,000,000 | 1.04 | 0.49 | 0.417 | 0.36 | 0.258 | 0.174

**Table 3.**Performance of the storage module code running with different numbers of MPI tasks. T1 represents the single storage time, and T2 is the ideal data forwarding time. The wall-clock consumption for point-to-point non-blocking communication is defined as the ideal data forwarding time, and MPI_Wtime() is used to compute it.

Grid Cells | T1 (s) | 2 Tasks (s) | 4 Tasks (s) | 8 Tasks (s) | 16 Tasks (s) | 32 Tasks (s) | T2 (s)
---|---|---|---|---|---|---|---
35,000 | 1.576 | 0.8302 | 0.4418 | 0.1843 | 0.1217 | 0.0872 | 0.080
350,000 | 11.93 | 0.2343 | 0.0401 | 0.0142 | 0.0124 | 0.0115 | 0.0095
2,000,000 | 76.35 | 0.0674 | 0.0658 | 0.0637 | 0.0648 | 0.0657 | 0.064

**Table 4.**Speedup comparison of the adaptive allocation of tasks and threads with manual allocations of threads and tasks on different CPUs.

Code Num | Thread Num (i7 9700K, 8 cores) | Task Num (i7) | Speed-Up (i7) | Thread Num (XEON GOLD 6254, 18 cores) | Task Num (XEON) | Speed-Up (XEON)
---|---|---|---|---|---|---
(1) | 4 | 12 | 14.84 | 17 | 19 | 22.45
(2) | 2 | 14 | 15.35 | 14 | 22 | 23.86

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

He, L.; Chen, J.; Lu, J.; Yan, Y.; Yang, J.; Yuan, G.; Hao, S.; Li, Q.
A Hybrid MPI/OpenMP Parallelization Scheme Based on Nested FDTD for Parametric Decay Instability. *Atmosphere* **2022**, *13*, 472.
https://doi.org/10.3390/atmos13030472
