MDPI - Publisher of Open Access Journals

17 pages, 7511 KiB

Open Access Article

Acceleration of a Production-Level Unstructured Grid Finite Volume CFD Code on GPU

by Jian Zhang, Zhe Dai, Ruitian Li, Liang Deng, Jie Liu and Naichun Zhou

Appl. Sci. 2023, 13(10), 6193; https://doi.org/10.3390/app13106193 - 18 May 2023

Cited by 8 | Viewed by 2883

Abstract

Owing to complex topological relationships, poor data locality, and data races in unstructured CFD computing, parallelizing finite volume method algorithms in shared memory so as to exploit the hardware capabilities of many-core GPUs efficiently has become a significant challenge. Based on a production-level unstructured CFD code, three shared-memory parallel programming strategies (atomic operations, colouring, and reduction) were designed and implemented after a detailed analysis of its computing behaviour and memory access patterns. Several data locality optimization methods (grid reordering, loop fusion, and multi-level memory access) were proposed. To address the sequential nature of the LU-SGS solution, two methods based on cell colouring and hyperplanes were implemented. All the parallel methods and optimization techniques were analysed and evaluated on three-dimensional grids of the M6 wing and the CHN-T1 aeroplane. The results show that Cuthill–McKee grid renumbering and loop fusion can improve memory access performance by 10%. The proposed reduction strategy, combined with multi-level memory access optimization, has a significant acceleration effect, speeding up the hot-spot subroutine that suffers from data races by a factor of three. Compared with the serial CPU version, the overall speed-up of the GPU code can reach 127 times. Compared with the parallel CPU version at the same number of Message Passing Interface (MPI) ranks, the overall speed-up of the GPU code exceeds thirty times.
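The colouring strategy named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' GPU implementation: the function names `colour_faces` and `accumulate_by_colour`, the four-face mesh, and the flux values are hypothetical and chosen only for illustration. The idea is that faces sharing a cell receive different colours, so all faces of one colour could update their cells concurrently without atomic operations.

```python
def colour_faces(faces):
    """Greedy colouring: two faces that share a cell never get the same colour."""
    cell_colours = {}  # cell index -> colours already used by its faces
    colours = []
    for left, right in faces:
        used = cell_colours.setdefault(left, set()) | cell_colours.setdefault(right, set())
        c = 0
        while c in used:
            c += 1
        colours.append(c)
        cell_colours[left].add(c)
        cell_colours[right].add(c)
    return colours


def accumulate_by_colour(faces, flux, n_cells):
    """Add each face flux to its left cell and subtract it from its right cell."""
    colours = colour_faces(faces)
    residual = [0.0] * n_cells
    for c in range(max(colours, default=-1) + 1):
        # Within one colour no cell appears twice, so on a GPU this inner
        # loop could run in parallel without a data race (assumption of the sketch).
        for (left, right), f, fc in zip(faces, flux, colours):
            if fc == c:
                residual[left] += f
                residual[right] -= f
    return residual, colours


# Hypothetical ring of four cells connected by four faces.
faces = [(0, 1), (1, 2), (2, 3), (3, 0)]
flux = [1.0, 2.0, 3.0, 4.0]
residual, colours = accumulate_by_colour(faces, flux, n_cells=4)
```

The trade-off, as a general observation, is that colouring removes the race but processes faces colour by colour, which can scatter memory accesses; this is one reason the abstract evaluates it alongside atomic operations and reduction.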

(This article belongs to the Topic Theory and Applications of High Performance Computing)

16 pages, 5998 KiB

Open Access Article

Accelerated Parallel Numerical Simulation of Large-Scale Nuclear Reactor Thermal Hydraulic Models by Renumbering Methods

by Huajian Zhang, Xiao-Wei Guo, Chao Li, Qiao Liu, Hanwen Xu and Jie Liu

Appl. Sci. 2022, 12(20), 10193; https://doi.org/10.3390/app122010193 - 11 Oct 2022

Cited by 1 | Viewed by 1956

Abstract

Numerical simulation of the thermal hydraulics of nuclear reactors has attracted wide attention, but large-scale fluid simulation is still prohibitive owing to the complexity of the components and the huge computational effort. Some open-source CFD programs still lag well behind commercial CFD programs in the comprehensiveness of their physical models, computational accuracy, and computational efficiency. Therefore, it is necessary to improve the computational performance of in-house CFD software (YHACT, the parallel analysis code of thermohydraulics) so that it can process large-scale mesh data with better parallel efficiency. In this paper, we form a unified framework of meshing and mesh renumbering for solving fluid dynamics problems with unstructured meshes. The Greedy, RCM (reverse Cuthill–McKee), and CQ (cell quotient) grid renumbering algorithms are integrated into the YHACT software. An important metric, named median point average distance (MDMP), is applied as the discriminant of sparse matrix quality to select the renumbering method that works better for each physical model. Finally, a parallel test of the turbulence model with 39.5 million grid volumes is performed using a pressurized water reactor engineering case component with 3 × 3 rod bundles. The computational results before and after renumbering are also compared to verify the robustness of the program. Experiments show that the CFD framework integrated in this paper can correctly perform simulations of the thermal hydraulics of large nuclear reactors. The program scales to a maximum of 3072 processes. The renumbering acceleration effect reaches its maximum, 56.72%, at a parallel scale of 1536 processes. This provides a basis for our future implementation of open-source CFD software that supports efficient large-scale parallel simulations.

(This article belongs to the Topic Fluid Mechanics)

19 pages, 7610 KiB

Open Access Article

Accelerating FVM-Based Parallel Fluid Simulations with Better Grid Renumbering Methods

by Huajian Zhang, Xiao-Wei Guo, Chao Li, Qiao Liu, Hanwen Xu and Jie Liu

Appl. Sci. 2022, 12(15), 7603; https://doi.org/10.3390/app12157603 - 28 Jul 2022

Cited by 2 | Viewed by 2052

Abstract

Grid renumbering techniques have been shown to be effective in improving the efficiency of computational fluid dynamics (CFD) numerical simulations based on the finite volume method (FVM). However, with the increasing complexity of real-world engineering scenarios, choosing better ordering techniques to improve parallel simulation performance remains a major challenge. This paper designed an improved metric (MDMP) to evaluate the structure of sparse matrices. The metric takes the aggregation of non-zero elements inside the sparse matrix as an evaluation criterion. Meanwhile, drawing on the features of the cell-centered finite volume method supporting unstructured grids, we proposed the cell quotient (CQ) renumbering algorithm to further reduce the maximum bandwidth and profile of large sparse matrices arising from finite volume discretization. Finally, with real-world engineering cases, we quantitatively analyzed the evaluation effect of MDMP and the optimization effect of different renumbering algorithms. The results showed that the classical Greedy algorithm reduces the maximum bandwidth of the sparse matrix by at most 60.34% and the profile by at most 95.38%. Correspondingly, the CQ algorithm reduced them by at most 92.94% and 98.70%. However, in terms of MDMP, the CQ algorithm was 83.43% less optimized than the Greedy algorithm. In terms of overall computational speed, the Greedy algorithm achieved a maximum improvement of 38.19%, and the CQ algorithm a maximum improvement of 27.31%. This is in accordance with the evaluation results of the MDMP metric. Thus, our new metric can more accurately evaluate renumbering methods for numerical fluid simulations, which is of great value in selecting a better mesh renumbering method in engineering applications of CFD.
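The two classical quantities this abstract reports, maximum bandwidth and profile, can be made concrete with a small Python sketch. It does not reproduce the paper's CQ or Greedy algorithms, nor the MDMP metric, whose definition is not given here; instead it uses a plain reverse Cuthill–McKee ordering as a stand-in renumbering. The function names and the eight-cell chain with scrambled labels are hypothetical.

```python
from collections import deque


def bandwidth(adj, order):
    """Largest index distance between two connected cells under `order`."""
    pos = {v: k for k, v in enumerate(order)}
    return max((abs(pos[u] - pos[v]) for u in adj for v in adj[u]), default=0)


def profile(adj, order):
    """Sum over rows of the distance from the diagonal to the leftmost non-zero."""
    pos = {v: k for k, v in enumerate(order)}
    return sum(pos[u] - min([pos[v] for v in adj[u]] + [pos[u]]) for u in adj)


def reverse_cuthill_mckee(adj):
    """Breadth-first ordering from low-degree cells, then reversed."""
    visited, order = set(), []
    for start in sorted(adj, key=lambda v: (len(adj[v]), v)):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            order.append(u)
            # Visit unvisited neighbours in order of increasing degree.
            for v in sorted(adj[u], key=lambda w: (len(adj[w]), w)):
                if v not in visited:
                    visited.add(v)
                    queue.append(v)
    return order[::-1]


# Hypothetical chain of eight cells whose labels are scrambled.
edges = [(0, 4), (4, 1), (1, 5), (5, 2), (2, 6), (6, 3), (3, 7)]
adj = {v: set() for v in range(8)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

natural = list(range(8))
rcm = reverse_cuthill_mckee(adj)
print(bandwidth(adj, natural), profile(adj, natural))  # 4 16
print(bandwidth(adj, rcm), profile(adj, rcm))          # 1 7
```

On this toy chain the renumbering recovers the physical ordering of the cells, so both quantities drop to their minimum. Real unstructured meshes will not behave this cleanly, and, as the abstract notes, lower bandwidth and profile alone did not predict the faster ordering in the authors' experiments.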

(This article belongs to the Section Fluid Science and Technology)

22 pages, 7877 KiB

Open Access Article

ARCFIRE: Experimentation with the Recursive InterNetwork Architecture

by Sander Vrijders, Dimitri Staessens, Didier Colle, Eduard Grasa, Miquel Tarzan, Sven van der Meer, Marco Capitani, Vincenzo Maffione, Diego Lopez, Lou Chitkushev and John Day

Computers 2020, 9(3), 59; https://doi.org/10.3390/computers9030059 - 22 Jul 2020

Cited by 1 | Viewed by 4458

Abstract

European-funded research into the Recursive InterNetwork Architecture (RINA) started with IRATI, which developed an initial prototype implementation for OS/Linux. IRATI was quickly succeeded by the PRISTINE project, which developed different policies, each tailored to specific use cases. Both projects were development-driven, and most experimentation was limited to unit testing and smaller-scale integration testing. In order to assess the viability of RINA as an alternative to current network technologies, larger-scale experimental deployments are needed. The opportunity arose for a project that shifted focus from development towards experimentation, leveraging Europe’s investment in Future Internet Research and Experimentation (FIRE+) infrastructures. The ARCFIRE project took this next step, developing a user-friendly framework for automating RINA experiments. This paper reports and discusses the implications of the experimental results achieved by the ARCFIRE project, using open-source RINA implementations deployed on FIRE+ testbeds. Experiments analyze the properties of RINA relevant to fast network recovery, network renumbering, Quality of Service, distributed mobility management, and network management. Results highlight RINA properties that can greatly simplify the deployment and management of real-world networks; hence, the next steps should focus on addressing very specific use cases with complete RINA-based networking solutions that can be transferred to the market.

(This article belongs to the Special Issue Post-IP Networks: Advances on RINA and other Alternative Network Architectures)

Search Results (4)