Accelerating Conjugate Heat Transfer Simulations in Squared Heated Cavities through Graphics Processing Unit (GPU) Computing

: This research develops an innovative framework for accelerating Conjugate Heat Transfer (CHT) simulations within squared heated cavities through the application of Graphics Processing Units (GPUs). Although leveraging GPUs for computational speed improvements is well recognized, this study distinguishes itself by formulating a tailored optimization strategy utilizing the CUDA-C programming language. This approach is specifically designed to tackle the inherent challenges of modeling squared cavity configurations in thermal simulations. Comparative performance evaluations reveal that our GPU-accelerated framework reduces computation times by up to 99.7% relative to traditional mono-core CPU processing. More importantly, it demonstrates an increase in accuracy in heat transfer predictions compared to existing CPU-based models. These results highlight not only the technical feasibility but also the substantial enhancements in simulation efficiency and accuracy, which are crucial for critical engineering applications such as aerospace component design, electronic device cooling, and energy system optimization. By advancing GPU computational techniques, this work contributes significantly to the field of thermal management, offering a potential for broader application and paving the way for more efficient, sustainable engineering solutions.


Introduction
The early exploration into Conjugate Heat Transfer (CHT) problems by Perelman [1] highlighted the complexity of solving heat conduction equations for solid bodies and their surrounding fluids simultaneously.Further advancements were made by Luikov et al. [2], who developed analytical methods to address convective heat transfer issues, acknowledging the crucial role of heat propagation in solids in contact with moving fluids.In the mid-1970s, Chida and Katto [3] utilized vectorial dimensional analysis to propose new dimensionless groups that encapsulate the conjugate nature of heat transfer.The impact of porous media within cavities on heat transfer was experimentally explored by Seki et al. [4].Inaba and Fukuda [5] conducted a groundbreaking study on the effects of water's density inversion near 4 • C within inclined square cavities, revealing unique natural convection patterns that deviate significantly from those observed in traditional Boussinesq fluids.The late 1980s saw Ho and Yih [6] numerically investigate the attenuated heat transfer rates in air-filled rectangular cavities with partitions.Frederick and Valencio [7] studied natural convection in square cavities with conducting partitions, demonstrating the significant impact of partition.These pioneering studies collectively provide a comprehensive foundation for the ongoing exploration of conjugated heat transfer phenomena, particularly in configurations involving square cavities and conducting solids, including the effect of geometry and thermal properties on convective heat transfer mechanisms.
The exploration of CHT within square cavities has seen notable advancements over the last decade, characterized by innovative methodologies and diverse applications.For instance, Alkhalid et al. [8] studied buoyancy-driven rarefied gas dynamics within conjugate cavities, examining the effects of cavity aspect ratios, conductivity ratios, and tilt angles.Alsabery et al. [9] demonstrated the potential of nanofluids to significantly enhance thermal management through heatline visualization.Khatamifar et al. [10] optimized partitioned square cavities for improved thermal management.Gijon-Rivera et al. [11] simulated CHT in rooms with glazed windows, modeled as square cavities.Chen et al. [12] introduced a novel lattice Boltzmann approach, tackling the modeling complexities associated with fluid-porous interfaces.Alvarado-Juarez et al. [13] characterized thermal mechanisms in solar receivers, showcasing the crucial role of radiatively participating media in thermal analysis.
The acceleration of CHT problem resolutions through the use of GPUs has been a significant area of research over the last decade.Klimeš and Stetina [14] presented a fully three-dimensional GPU-based heat transfer and solidification model for the continuous casting of steel, highlighting the necessity for rapid computation in real-time applications like casting control and optimization.Zhang et al. [15] presented a GPU-assisted finite element methodology for the modeling and analysis of bio-heat transfer processes in thermal ablation treatment, demonstrating computational performance improvements of up to 55.3 times with GPU acceleration.Narang et al. [16] explored the use of GPUs for accelerating the numerical solution of heat and mass transfer equations, demonstrated drastic performance enhancements, and highlighted the potential of GPUs in speeding up complex simulations.Silvestri and Pecnik [17] implemented a fast Monte Carlo algorithm on GPUs for radiative heat transfer in turbulent flows; the study achieved significant speed-ups, enabling accurate solutions for radiative heat transfer that can be coupled to direct numerical simulations of turbulent flows.Dugast et al. [18] demonstrated an efficient GPU-based thermal process simulator for laser powder bed fusion additive manufacturing; the paper proposed a matrix-free preconditioned conjugate gradient algorithm, resulting in significant computational speedups.Luo et al. [19] focused on the heat transfer in a porous brick roof filled with phase change materials; the numerical study employed a GPU-accelerated multiple-relaxation-time Lattice Boltzmann Method to investigate thermal buffering capacity, demonstrating the efficiency of GPU acceleration in simulating complex thermal phenomena.Gou and Shen [20] introduced a new GPU-based CFD-DEM model designed to simulate gas-solid flow involving large numbers of particles and complex geometries.They developed an innovative coupling strategy between the CFD and DEM GPU-based solvers, which enhanced both the efficiency and stability of simulations, reaching 95.6% simulation time reduction.Wang et al. [21] utilized the Lattice Boltzmann Method to simulate thermal convective flows, demonstrating its effectiveness in handling complex boundary conditions and providing a comparative benchmark for our GPU-accelerated approach.
GPU computing has become increasingly pivotal in CFD due to its ability to significantly accelerate complex simulations.GPUs are being leveraged for their superior parallel processing capabilities, allowing for faster computation of large-scale simulations.Narang et al. [16] exemplify this; in their study, GPUs facilitated the efficient acceleration of heat and mass transfer equations, showcasing the potential for rapid processing in real-time applications.In terms of integration with traditional methods, there is a growing trend of integrating GPU computing with traditional computational methods like the FVM to enhance computational efficiency while maintaining accuracy.This integration is highlighted in studies such as those by Moukalled et al. [22], which discuss the adaptation of FVM for GPU implementation.Furthermore, researchers are focusing on developing and optimizing solvers that are specifically designed for GPU architectures.For instance, Silvestri and Pecnik [17] implemented a fast GPU Monte Carlo algorithm for radiative heat transfer, which showed significant computational speed-ups.This method serves as another validation point for the efficiency of GPU utilization in complex simulations.These studies exemplify the potent combination of GPU acceleration and CHT analysis, demonstrating the significant computational advantages and broad application potential of this approach in engineering and scientific research.With the advancement in GPU technologies, there is an increased focus on conducting real-time simulations and visualizations of complex fluid dynamics scenarios.This application is crucial for industries where real-time data and feedback are essential for operational success and safety.As computational demands grow, there is also a concurrent focus on making GPU computing more sustainable and energy efficient.This involves developing algorithms that not only run faster but also consume less power, aligning with global sustainability goals.
This work investigates the application of GPU technologies to accelerate CHT simulations within squared heated cavities, focusing on increased parallel computing capabilities to reduce the computational time.The goal is to formulate a comprehensive methodology that not only addresses the specifics of GPU computing but also the unique challenges presented by square heated cavity configurations.In summary, this paper aims to bridge the gap between the computational demands of high-fidelity CHT simulations and the capabilities of contemporary computing hardware by leveraging GPU acceleration.Given the importance of accurate and efficient thermal management in various engineering applications, the development of optimized computational strategies is imperative.This work presents a novel approach that combines the parallel processing power of GPUs with advanced numerical methods to significantly enhance the speed of CHT simulations without compromising accuracy.By showcasing the application of this technology in squared heated cavity configurations, the work highlights the potential for widespread adoption and further innovation in the simulation of complex thermal phenomena.
The GPU simulations were conducted using an inhouse code developed in CUDA-C and operated on a MSI Nvidia™ GeForce ® RTX™ 4090 graphic card (made in Hsinchu, Taiwan), equipped with 24 GB of video memory.Additionally, this CUDA ® code was run on a desktop PC powered by an Intel ® Core™ i9 12900KF processor (made in Hillsboro, Oregon, United States of America).Figure 1 presents the general numerical fluid solution developed utilizing both CPU and GPU capabilities.
and optimizing solvers that are specifically designed for GPU architectures.For instance, Silvestri and Pecnik [17] implemented a fast GPU Monte Carlo algorithm for radiative heat transfer, which showed significant computational speed-ups.This method serves as another validation point for the efficiency of GPU utilization in complex simulations.These studies exemplify the potent combination of GPU acceleration and CHT analysis, demonstrating the significant computational advantages and broad application potential of this approach in engineering and scientific research.With the advancement in GPU technologies, there is an increased focus on conducting real-time simulations and visualizations of complex fluid dynamics scenarios.This application is crucial for industries where real-time data and feedback are essential for operational success and safety.As computational demands grow, there is also a concurrent focus on making GPU computing more sustainable and energy efficient.This involves developing algorithms that not only run faster but also consume less power, aligning with global sustainability goals.
This work investigates the application of GPU technologies to accelerate CHT simulations within squared heated cavities, focusing on increased parallel computing capabilities to reduce the computational time.The goal is to formulate a comprehensive methodology that not only addresses the specifics of GPU computing but also the unique challenges presented by square heated cavity configurations.In summary, this paper aims to bridge the gap between the computational demands of high-fidelity CHT simulations and the capabilities of contemporary computing hardware by leveraging GPU acceleration.Given the importance of accurate and efficient thermal management in various engineering applications, the development of optimized computational strategies is imperative.This work presents a novel approach that combines the parallel processing power of GPUs with advanced numerical methods to significantly enhance the speed of CHT simulations without compromising accuracy.By showcasing the application of this technology in squared heated cavity configurations, the work highlights the potential for widespread adoption and further innovation in the simulation of complex thermal phenomena.
The GPU simulations were conducted using an inhouse code developed in CUDA-C and operated on a MSI Nvidia™ GeForce ® RTX™ 4090 graphic card (made in Hsinchu, Taiwan), equipped with 24 GB of video memory.Additionally, this CUDA ® code was run on a desktop PC powered by an Intel ® Core™ i9 12900KF processor (made in Hillsboro, Oregon, United States of America).Figure 1 presents the general numerical fluid solution developed utilizing both CPU and GPU capabilities.

Mathematical Model
Computational Fluid Dynamics (CFD) problems typically involve solving sets of partial differential equations (PDEs) that describe the behavior of fluid flow.The governing

Mathematical Model
Computational Fluid Dynamics (CFD) problems typically involve solving sets of partial differential equations (PDEs) that describe the behavior of fluid flow.The governing equations for CHT, for instance, include the simultaneous solution of the Navier-Stokes equations and the Energy-Transport Balance equation [22].
The conservation of mass, known as the continuity equation, considering that the flow is incompressible, is expressed as where v is the velocity vector.The linear momentum conservation, in turn, considering Equation (1) and the fluid viscosity being constant, can be written as where ρ f is the fluid density, t is the time, p is the fluid pressure, µ is the fluid viscosity coefficient, and f b is the body forces vector.Finally, the energy-transport balance, in terms of temperature, is given by where c is the specific heat, T is the temperature, k is the thermal conductivity, and Q T is the source term.
Considering the Boussinesq approximation, the equation for density becomes where T ∞ is the environment temperature, ρ f ,∞ is the fluid density at T ∞ , and β is the volume expansion coefficient.

Boundary and Initial Conditions 2.2.1. Lid-Driven Cavity
Figure 2 shows the scheme for a classical problem, the Lid-Driven Cavity.It involves a square cavity filled with an ideal gas, with the top lid moving at a constant speed while the other walls remain stationary.This movement creates complex flow patterns within the cavity due to the shear forces at the moving boundary.The primary interest in this problem lies in understanding how the fluid moves, how vortices develop within the cavity, and how these vortices evolve with changes in the speed of the lid and the viscosity of the fluid.This configuration is particularly relevant for applications involving lubrication technologies and microfluidic devices, where fluid shear and boundary layer interactions are critical.The specific conditions were chosen to simulate typical microscale flow environments, enabling the study of shear-driven flow patterns, which is essential for optimizing device performance.For this case, Table 1 presents all boundary and initial conditions used in this simulation case.

Natural Convection in a Square Cavity
Figure 3 presents the scheme for a square cavity subject to natural convection inside In this scenario, a fluid-filled square cavity typically has vertical walls at different temper atures, causing the fluid near the hot wall to heat up, decrease in density, and rise, while the fluid near the cold wall cools down, increases in density, and sinks.This creates a circulating motion of the fluid within the cavity, known as a convection current.The sim plicity of the square cavity model makes it ideal for studying fundamental aspects of con vection without the complexities of external influences.The prescribed boundary and in itial conditions mimic real-world conditions, where convection is crucial for heat dissipa tion, thereby aiding in the design of efficient heat exchangers and cooling systems.Table 2 presents all boundary and initial conditions used in this simulation case.

Boundary/Initial Conditions
Description u = 0 m/s horizontal velocity for all faces v = 0 m/s vertical velocity for all faces ∂T/∂y = 0 K/m insulation at top and bottom faces ∂p/∂x = 0 N/m 3  pressure gradient for vertical faces ∂p/∂y = 0 N/m 3  pressure gradient for horizontal faces TH and TC [K] prescript temperature (TH > TC)

Natural Convection in a Square Cavity
Figure 3 presents the scheme for a square cavity subject to natural convection inside.In this scenario, a fluid-filled square cavity typically has vertical walls at different temperatures, causing the fluid near the hot wall to heat up, decrease in density, and rise, while the fluid near the cold wall cools down, increases in density, and sinks.This creates a circulating motion of the fluid within the cavity, known as a convection current.The simplicity of the square cavity model makes it ideal for studying fundamental aspects of convection without the complexities of external influences.The prescribed boundary and initial conditions mimic real-world conditions, where convection is crucial for heat dissipation, thereby aiding in the design of efficient heat exchangers and cooling systems.Table 2 presents all boundary and initial conditions used in this simulation case.In this case, the authors considered an ideal gas as a fluid inside the cavity.

Conjugate Heat Transfer
Figure 4 presents the scheme for a square cavity with a solid block inside, where U

Boundary/Initial Conditions Description
u = 0 m/s horizontal velocity for all faces v = 0 m/s vertical velocity for all faces ∂T/∂y = 0 K/m insulation at top and bottom faces ∂p/∂x = 0 N/m 3  pressure gradient for vertical faces ∂p/∂y = 0 N/m 3  pressure gradient for horizontal faces In this case, the authors considered an ideal gas as a fluid inside the cavity.

Conjugate Heat Transfer
Figure 4 presents the scheme for a square cavity with a solid block inside, where U and V are the zero horizontal and vertical dimensionless velocities; ∂T dimless /∂y = 0 is the horizontal face insulation; ∂p dimless /∂x = 0 and ∂p dimless /∂y = 0 are the zero dimensionless pressure gradients at the vertical and horizontal faces, respectively; T H,dimless and T C,dimless are prescript dimensionless temperatures; and T dimless (T = 0) = 0.5 is the initial dimensionless temperature.The boundary conditions for the CHT simulations, involving a square cavity with a solid block inside, are crucial for understanding the thermal interactions in industrial applications such as electronic component cooling and reactor safety assessments.The chosen conditions aim to closely represent the thermal behavior observed in systems where solid-structure and fluid interactions determine the overall thermal management efficiency.This is essential for evaluating the effectiveness of various material properties and configurations in real-life engineering applications.

Numerical Model
In this section, the authors detail the numerical methodologies employed to solv governing equations.The Fractional Step Method (FSM) was selected due to its robus in handling the coupling between velocity and pressure fields in fluid dynamics sim tions.FSM effectively splits the Navier-Stokes equations into simpler sub-steps, allo for more stable and accurate solutions by treating the momentum and continuity e tions separately.This method is particularly advantageous in complex flow simulat where precision in pressure-velocity coupling is crucial.
Additionally, the Finite Volume Method (FVM) was applied due to its high fid In this case, the authors considered that the fluid inside of cavity is an ideal gas.Furthermore, the governing equations are made dimensionless according to [23].Then, where x and y are Cartesian coordinates, X and Y are dimensionless Cartesian coordinates, and L is the square cavity dimension.
where u and v are the horizontal and vertical velocities, respectively; U and V are the horizontal and vertical dimensionless velocities, respectively; and α is the thermal diffusivity.
where p dimless is the dimensionless pressure, and g y is the gravity acceleration.
where T dimless is the dimensionless temperature, and T H and T C are the temperatures of the heated and cooled faces, respectively.

Numerical Model
In this section, the authors detail the numerical methodologies employed to solve the governing equations.The Fractional Step Method (FSM) was selected due to its robustness in handling the coupling between velocity and pressure fields in fluid dynamics simulations.FSM effectively splits the Navier-Stokes equations into simpler sub-steps, allowing for more stable and accurate solutions by treating the momentum and continuity equations separately.This method is particularly advantageous in complex flow simulations, where precision in pressure-velocity coupling is crucial.
Additionally, the Finite Volume Method (FVM) was applied due to its high fidelity in discretizing the governing equations on structured meshes.FVM conserves fluxes across control volume boundaries, ensuring conservation of mass, momentum, and energy, which is essential for accurately capturing the nuanced behaviors of thermal and flow fields in natural convection scenarios.By using FVM, it is possible to achieve detailed spatial discretization and gain precise control over the numerical diffusion, which is particularly important in the study of thermal transport in fluid systems.
The displaced mesh approach complements these methods by addressing the staggered arrangement of velocity components, which helps in reducing numerical errors associated with grid alignment.The co-located arrangement of pressure and temperature fields on the same mesh, alongside the staggered velocity fields, enhances the stability and accuracy of our simulation framework, particularly in handling the complex boundary layers and thermal gradients observed in natural convection within square cavities.
Figure 5a illustrates the mesh within a rectangular domain, segmented into volumes known as finite volumes, with each featuring its respective nodes positioned at the domain's center.This mesh configuration specifically accounts for the nodes associated with pressure and temperature.Given the presence of boundary conditions, the introduction of ghost cells is deemed necessary.As depicted in Figure 5b, the mesh representing the vertical component of the velocity field is displaced upwards, while the mesh for the horizontal component is shifted to the right.
The numerical solution consists of use the SIMPLE (Semi-Implicit Method for Pressure Linked Equations) Algorithm in three stages to solve Navier-Stokes equations.If the problem involves a temperature field, it is necessary to solve the Energy-Transport Balance Equation.The process begins by calculating the initial conditions for velocity and pressure to estimate the velocity field.Then, the linear system is solved using the Successive Over-Relaxation (SOR) method, aimed at determining pressure correction values.If a temperature field is present, the Energy-Transport Balance Equation must be solved The numerical solution consists of use the SIMPLE (Semi-Implicit Method for Pr sure Linked Equations) Algorithm in three stages to solve Navier-Stokes equations.If t problem involves a temperature field, it is necessary to solve the Energy-Transport B ance Equation.The process begins by calculating the initial conditions for velocity a pressure to estimate the velocity field.Then, the linear system is solved using the Succ sive Over-Relaxation (SOR) method, aimed at determining pressure correction values a temperature field is present, the Energy-Transport Balance Equation must be solved e plicitly.This iterative process continues until the convergence criterion is met, based the residue value representing mass conservation in the discretized volumes.
Equation ( 9) represents the linear system to solve the estimated velocity field and t pressure correction.Substituting Equation (10) in Equation ( 9), we obtain The value of pressure correction of the central node can be obtained using Equati (11), and its value of the next iteration can be calculated using Equation (12).where the superscript e is the index of the pressure iteration, and WP is the correction.Equation ( 9) represents the linear system to solve the estimated velocity field and the pressure correction.
where the subscript P is the center of the control volume (CV); the subscripts E (east), W (west), N (north), and S (south) are the CV neighbors of P; and S is the source term.
The SOR method, in turn, can be written as Substituting Equation (10) in Equation ( 9), we obtain The value of pressure correction of the central node can be obtained using Equation (11), and its value of the next iteration can be calculated using Equation (12).
where the superscript e is the index of the pressure iteration, and W P is the correction.Hence, the iterative solution of the linear system can be written as a P P ′ e+1 P Computation 2024, 12, 106 9 of 18 Finally, the solution for the pressure correction can be written as With the estimated velocity field and the pressure correction, the update for these can be written as P = P ′ + P 0 ( 16) where P is the pressure correction, and u and v are horizontal and vertical velocity fields updates.
In scenarios involving CHT problems, there are two distinct domains: a solid and a fluid.The numerical solution for each domain is computed independently.However, these domains exchange information at the interface, such as heat flux or temperature, ensuring a cohesive solution across the boundary between the solid and fluid regions.
Figure 6 illustrates the partitioned domain model utilizing the CHT problem approach.Within this model, I 1 and I 2 represent the information exchanged between the solid and fluid domains.I 1 corresponds to the heat flux at the interface between the fluid and solid, derived from the calculations performed in the solid domain.Conversely, I 2 represents the temperature at the same location, determined based on the fluid domain's results and then transferred to the solid domain.This bidirectional exchange-where one domain contributes heat flux and the other temperature-is crucial for achieving stable and convergent numerical model results.
Finally, the solution for the pressure correction can be written as With the estimated velocity field and the pressure correction, the update for can be written as 0 ' where P is the pressure correction, and u and v are horizontal and vertical velocity f updates.
In scenarios involving CHT problems, there are two distinct domains: a solid a fluid.The numerical solution for each domain is computed independently.How these domains exchange information at the interface, such as heat flux or tempera ensuring a cohesive solution across the boundary between the solid and fluid region Figure 6 illustrates the partitioned domain model utilizing the CHT problem proach.Within this model, I1 and I2 represent the information exchanged between solid and fluid domains.I1 corresponds to the heat flux at the interface between the and solid, derived from the calculations performed in the solid domain.Converse represents the temperature at the same location, determined based on the fluid dom results and then transferred to the solid domain.This bidirectional exchange-where domain contributes heat flux and the other temperature-is crucial for achieving s and convergent numerical model results.Equations ( 19) and ( 20) represent the information I1 and I2, respectively.
( ) The parameters βq and βT in Equations ( 19) and ( 20) are sub-relaxation parame and the stability of the numerical model depends of the choice of these.Equations ( 19) and ( 20) represent the information I 1 and I 2 , respectively.
The parameters β q and β T in Equations ( 19) and ( 20) are sub-relaxation parameters, and the stability of the numerical model depends of the choice of these.

Results and Discussion
This section presents findings for three distinct scenarios: the Lid-Driven Cavity, Natural Convection in a Square Cavity, and the CHT Problem.For each case, the validation of the GPU computational approach was conducted against the existing literature, which utilized alternative methodologies.Additionally, this section will highlight the mesh convergence observed in the numerical model when processed via GPU, particularly exemplifying this through the Lid-Driven Cavity case with a Reynolds number of 1000.Finally, an analysis of the CPU's performance in simulating the problem addressed in this paper will be provided.

Lid-Driven Cavity
In this CFD study, the numerical solutions were obtained for four distinct scenarios.In each scenario, the cavity's upper boundary is subjected to a velocity of 1.0 m/s, while the remaining boundaries of the square geometry, measuring 1 m × 1 m, maintain consistent boundary conditions according Figure 2. The variable parameter in this investigation is the Reynolds number, which is set to 100, 400, 1000, and 3200 for each case.Figure 7 presents the streamlines of the velocity field in Lid-Driven Cavity simulations across a range of Reynolds numbers.The mesh configuration for all cases is 150 × 150.

Results and Discussion
This section presents findings for three distinct scenarios: the Lid-Driven Cavity, Natural Convection in a Square Cavity, and the CHT Problem.For each case, the validation of the GPU computational approach was conducted against the existing literature, which utilized alternative methodologies.Additionally, this section will highlight the mesh convergence observed in the numerical model when processed via GPU, particularly exemplifying this through the Lid-Driven Cavity case with a Reynolds number of 1000.Finally, an analysis of the CPU's performance in simulating the problem addressed in this paper will be provided.

Lid-Driven Cavity
In this CFD study, the numerical solutions were obtained for four distinct scenarios.In each scenario, the cavity's upper boundary is subjected to a velocity of 1.0 m/s, while the remaining boundaries of the square geometry, measuring 1 m × 1 m, maintain consistent boundary conditions according Figure 2. The variable parameter in this investigation is the Reynolds number, which is set to 100, 400, 1000, and 3200 for each case.A significant observation from the results of the first three cases is the presence of three recirculation zones within the cavity.There is a large recirculation region dominating the space, accompanied by two smaller recirculation areas in the bottom corners.As the Reynolds number increases, as seen in Figure 7a-c, the intensity of the recirculation in these corner regions becomes more pronounced.Furthermore, Figure 7d reveals an additional recirculation zone emerging in the upper left corner of the cavity, in addition to the similar recirculation regions observed in each case.Figures 8 and 9 present the comparison results among this work and Guia et al. [24].In this comparison, the authors present the vertical velocity at x = 0.5 m and the horizontal velocity at y = 0.5 m across a range of Reynolds numbers, Re = {100, 400, 1000, 3200}.
The results for each scenario, corresponding to their respective Reynolds numbers, closely align with those reported by Ghia [24].This similarity underscores the accuracy of the methodology employed in this study.

Natural Convection in a Square Cavity
The second scenario involves Natural Convection in a Square Cavity, with the dimensions being 1.0 m × 1.0 m.Here, the Rayleigh number serves as the variable parameter across four cases, set to 10 3 , 10 4 , 10 5 , and 10 6 .The gravitational acceleration, consistent in all cases and directed along the y-axis, is 9.81 m/s 2 .Parameters such as dynamic viscosity, specific mass, specific heat, dissipation constant, and the adjustable boundary conditions on the cavity's vertical walls are fine-tuned to yield the Rayleigh number.For the horizontal walls, the boundary conditions remain consistent across all cases, characterized by a heat flux equal to zero, according Figure 3. Furthermore, the authors considered the fluid as an ideal gas, and β = 1/T 0 , where T 0 is the reference temperature.Figure 10 presents the isothermal lines results of Natural Convection in a Square Cavity across a range of Rayleigh numbers.
It is clear that with a low Rayleigh number, the isothermal lines tend to be vertical, with little detour in the horizontal direction.With the increase of this dimensionless parameter, the isothermal lines, when near the vertical walls, show a large inclination in relation to the horizontal direction; when distant from the vertical walls, they clearly show a horizontal direction.This means that with a low Rayleigh number, the heat flux does not have zero relation to the horizontal direction, and with the increase of the Rayleigh number, the heat flux in the horizontal direction can be null, with the exception of near the vertical walls.Table 3 presents the comparison of the present work and other works [21,25,26] for the Nusselt number at the vertical wall across a range of Rayleigh numbers.these corner regions becomes more pronounced.Furthermore, Figure 7d reveals an additional recirculation zone emerging in the upper left corner of the cavity, in addition to the similar recirculation regions observed in each case.Figures 8 and 9 present the comparison results among this work and Guia et al. [24].In this comparison, the authors present the vertical velocity at x = 0.5 m and the horizontal velocity at y = 0.5 m across a range of Reynolds numbers, Re = {100, 400, 1000, 3200}.The results for each scenario, corresponding to their respective Reynolds numbers, closely align with those reported by Ghia [24].This similarity underscores the accuracy of the methodology employed in this study.It is clear that with a low Rayleigh number, the isothermal lines tend to be vertical, with little detour in the horizontal direction.With the increase of this dimensionless parameter, the isothermal lines, when near the vertical walls, show a large inclination in relation to the horizontal direction; when distant from the vertical walls, they clearly show a horizontal direction.This means that with a low Rayleigh number, the heat flux does not have zero relation to the horizontal direction, and with the increase of the Rayleigh number, the heat flux in the horizontal direction can be null, with the exception of near the vertical walls.Table 3 presents the comparison of the present work and other works [21,25,26] for the Nusselt number at the vertical wall across a range of Rayleigh numbers.Reviewing Table 3 in the context of this study compared to others, it is evident that the numerical model achieves a close approximation to previous works, albeit with a slight deviation in the case of a Rayleigh number equal to 10 6 .While the initial hypothesis attributed the discrepancy primarily to the application of a single iteration of the SIMPLE algorithm, several other aspects could also influence the outcome.In terms of numerical stability and convergence, the use of a single iteration of the SIMPLE algorithm might not sufficiently capture the complex interactions between buoyancy forces and thermal gradients at such a high Rayleigh number.This high value intensifies the non-linearities due to increased convective heat transfer, possibly requiring multiple iterations for the algorithm to achieve a stable and accurate solution.On the other hand, the chosen mesh resolution might not be adequate for capturing the finer details of the flow and temperature fields at high Rayleigh numbers.A grid independence study could reveal whether a denser mesh could help in reducing the observed discrepancy by providing a more refined spatial discretization.In terms of physical modeling assumptions such as the Boussinesq approximation, which simplifies the density variations to be solely a function of temperature, these might not hold perfectly at higher temperature differentials.Deviations could also arise from boundary layer approximations or simplifications in the properties of the fluid used in the simulations.Finally, the SIMPLE algorithm itself, while robust for a wide range of applications, has limitations in handling the highly dynamic and complex flow patterns that emerge at very high Rayleigh numbers.Exploring alternative algorithms or more advanced iterations within the SIMPLE framework could provide insights into the sensitivity of the results to the numerical method employed.

Conjugate Heat Transfer
The final analysis discussed in this paper necessitates the nondimensionalization of the equations governing both the fluid and solid domains, with the approach for the solid domain detailed by [26].This study focuses on a square cavity of dimensions L × L, containing a concentric solid body with dimensions 0.5 L × 0.5 L. Within the fluid domain, the assumption is that the fluid behaves as an ideal gas.Two scenarios were simulated: the first with a dimensionless thermal conductivity ratio k* set to 0.2, representing the ratio of the solid's dimensionless thermal conductivity to that of the fluid, and the second with k* = 5.0.The mesh resolution for the entire system, encompassing both the solid and fluid, was set to 201 × 201. Figure 11 presents the isothermal lines of CHT simulations.Furthermore, Table 4 presents a comparison of the present work and other works [26,27] for the Nusselt number at the vertical wall for the CHT problem.
denser mesh could help in reducing the observed discrepancy by providing a more refined spatial discretization.In terms of physical modeling assumptions such as the Boussinesq approximation, which simplifies the density variations to be solely a function of temperature, these might not hold perfectly at higher temperature differentials.Deviations could also arise from boundary layer approximations or simplifications in the properties of the fluid used in the simulations.Finally, the SIMPLE algorithm itself, while robust for a wide range of applications, has limitations in handling the highly dynamic and complex flow patterns that emerge at very high Rayleigh numbers.Exploring alternative algorithms or more advanced iterations within the SIMPLE framework could provide insights into the sensitivity of the results to the numerical method employed.

Conjugate Heat Transfer
The final analysis discussed in this paper necessitates the nondimensionalization of the equations governing both the fluid and solid domains, with the approach for the solid domain detailed by [26].This study focuses on a square cavity of dimensions L × L, containing a concentric solid body with dimensions 0.5 L × 0.5 L. Within the fluid domain, the assumption is that the fluid behaves as an ideal gas.Two scenarios were simulated: the first with a dimensionless thermal conductivity ratio k* set to 0.2, representing the ratio of the solid's dimensionless thermal conductivity to that of the fluid, and the second with k* = 5.0.The mesh resolution for the entire system, encompassing both the solid and fluid, was set to 201 × 201. Figure 11 presents the isothermal lines of CHT simulations.Furthermore, Table 4 presents a comparison of the present work and other works [26,27] for the Nusselt number at the vertical wall for the CHT problem.For k * = 0.2 (Figure 11a), the isotherms bend around the square, indicating that it has a lower thermal conductivity compared to the surrounding medium, acting as a thermal resistance.This results in a noticeable 'shadowing' effect downstream of the square, which creates regions of lower temperature gradients.For k * = 5.0 (Figure 11b), although the square still influences the temperature field, the isotherms are less disturbed by its presence.This could imply that the square's thermal conductivity is closer to that of the surrounding medium or that the overall system has a higher thermal diffusivity, allowing for a more uniform temperature distribution despite the presence of the square.The curvature of the isotherms near the boundaries and the square's edges in both cases suggests a complex interaction between conductive and convective heat transfer mechanisms.The nature of this interaction is likely to depend on the thermal properties of the materials involved as well as the boundary conditions of the system.
The data presented in Table 4 illustrate the high accuracy of the simulation outcomes achieved using GPU processing.This demonstrates that, in addition to its efficiency in terms of processing time, GPU-based simulation also delivers precise results.The findings from GPU-accelerated CHT simulations are significant not only in terms of computational efficiency but also for their potential applications in real-world engineering problems, such as enhanced thermal management in aerospace and automotive industries, energy systems optimization, and electronic cooling systems.

Mesh Convergence
The analysis presents mesh convergence in the Lid-Driven Cavity problem at a Reynolds umber of 1000, employing simulations across mesh resolutions of 25 × 25, 50 × 50, 100 × 100, and 200 × 200.The L 2 norm was calculated to compare the 50 × 50 mesh against the 25 × 25, the 100 × against the 50 × 50, and finally, the 200 × 200 mesh compared to the 100 × 100. Figure 12a,b compares the results from all four mesh configurations with those reported by Ghia [24].On the other hand, Figure 12c shows the L2 norm as a function of the number of cells.

Simulation Performance
To evaluate the computational performance of GPU processing, three simulations were conducted, each incorporating variations of a specific dimensionless parameter.These simulations achieved results closely matching those found in the literature, which were obtained using conventional CPU-based methodologies.Notably, the use of GPU processing in this study resulted in a substantial reduction in simulation time, achieving a 99.7% decrease compared to mono-core CPU processing.
In the case of the Lid-Driven Cavity problem, which involves complex flow structures and high Reynolds numbers (notably at 1000 and 3200), GPU processing provided results that closely approximated those reported by Ghia [24].This underscores the significant time-saving benefits of utilizing GPUs for such simulations.
The study also explored Natural Convection in a Square Cavity, a scenario that introduces an additional equation for heat transport and involves varying Rayleigh numbers.Here too, GPU-accelerated algorithms delivered results that aligned well with findings from other research, demonstrating the GPU's effectiveness in handling complex problems.
Furthermore, the final simulation addressed a more intricate issue of solid-fluid in-

Simulation Performance
To evaluate the computational performance of GPU processing, three simulations were conducted, each incorporating variations of a specific dimensionless parameter.These simulations achieved results closely matching those found in the literature, which were obtained using conventional CPU-based methodologies.Notably, the use of GPU processing in this study resulted in a substantial reduction in simulation time, achieving a 99.7% decrease compared to mono-core CPU processing.
In the case of the Lid-Driven Cavity problem, which involves complex flow structures and high Reynolds numbers (notably at 1000 and 3200), GPU processing provided results that closely approximated those reported by Ghia [24].This underscores the significant time-saving benefits of utilizing GPUs for such simulations.

Figure 1 .
Figure 1.Flowchart of the developed CUDA-C language code.

Figure 1 .
Figure 1.Flowchart of the developed CUDA-C language code.
top lid u = 0 m/s horizontal velocity at bottom, left.and right faces v = 0 m/s vertical velocities for all faces ∂p/∂x = 0 N/m 3 pressure gradient for vertical faces ∂p/∂y = 0 N/m 3 pressure gradient for horizontal faces g y = 9.81 m/s 2 vertical gravitational acceleration p (t = 0) = p 0 [N/m 2 ] initial pressure condition

Figure 2 .
Figure 2. Scheme and boundary conditions for the simulations of Lid-Driven Cavity.

Figure 2 .
Figure 2. Scheme and boundary conditions for the simulations of Lid-Driven Cavity.

Figure 3 .
Figure 3. Square cavity subject to natural convection.

T
is the dimensionless temperature, and TH and TC are the temperature the heated and cooled faces, respectively.

Figure 4 .
Figure 4. Square cavity with solid block inside.

Figure 4 .
Figure 4. Square cavity with solid block inside.

Figure 5 .
Figure 5. Mesh model illustration: (a) square domain mesh for the pressure and the temperatu with the ghost cells in the boundaries; (b) scheme of displaced mesh for the vertical and horizon components of the velocity field, the vertical volume component is highlighted in red, and the h izontal volume component is highlighted in blue.

Figure 5 .
Figure 5. Mesh model illustration: (a) square domain mesh for the pressure and the temperature with the ghost cells in the boundaries; (b) scheme of displaced mesh for the vertical and horizontal components of the velocity field, the vertical volume component is highlighted in red, and the horizontal volume component is highlighted in blue.

Figure 6 .
Figure 6.Partitioned domain model with the sequence adopted in the CHT problem.

Figure 6 .
Figure 6.Partitioned domain model with the sequence adopted in the CHT problem.

Table 1 .
Boundary and initial conditions for Lid-Driven Cavity.

Table 2 .
Boundary and initial conditions for Natural Convection in a Square Cavity.

Table 2 .
Boundary and initial conditions for Natural Convection in a Square Cavity.

Table 3 .
Comparison results for the Nusselt number at the vertical wall across a range of Rayleigh numbers.

Table 3 .
Comparison results for the Nusselt number at the vertical wall across a range of Rayleigh numbers.

Table 4 .
Comparison results for the Nusselt number at the vertical wall for the CHT problem.