Super-Resolution Reconstruction of Turbulence via Spatio-Frequency Distillation and Physics-Guided Learning

Liu, Xiuyan; Yuan, Jingtong; Zhang, Yufei; Song, Dalei; Qi, Qi

doi:10.3390/jmse14111035


by Xiuyan Liu ¹,*, Jingtong Yuan ¹, Yufei Zhang ¹, Dalei Song ²,³,⁴ and Qi Qi

¹ School of Information Management, Qingdao University of Technology, Qingdao 266525, China

² Engineering College, Ocean University of China, Qingdao 266100, China

³ Key Laboratory of Ocean Observation and Information of Hainan Province, Sanya Oceanographic Institution, Ocean University of China, Sanya 572024, China

⁴ Sanya Oceanographic Laboratory, Sanya 572024, China

* Author to whom correspondence should be addressed.

J. Mar. Sci. Eng. 2026, 14(11), 1035; https://doi.org/10.3390/jmse14111035

Submission received: 8 May 2026 / Revised: 26 May 2026 / Accepted: 29 May 2026 / Published: 31 May 2026

(This article belongs to the Section Physical Oceanography)


Abstract

High-resolution turbulence data are essential for marine environment monitoring, climate change modeling, and aerospace engineering. However, numerical simulations and experimental measurements are often limited by high computational costs, insufficient sensor resolution, and sparse data acquisition, making it difficult to fully capture the multi-scale details of turbulence. To address this challenge, a spatio-frequency fusion distillation network (SFDN) is developed to reconstruct fine-scale turbulent structures from low-resolution inputs. The proposed model is built upon stacked Mamba-frequency distillation blocks, where each block integrates multiple Mamba-frequency fusion blocks (MFFBs) to refine and fuse the deep features of turbulence. By combining a Mamba-based state space model with a frequency attention mechanism, the MFFB is capable of capturing both long-range spatial dependencies and frequency-domain representations. Additionally, a physics-guided loss function is introduced to constrain the solution space and guide the learning process. To evaluate the performance of the proposed SFDN, comprehensive experiments were conducted on datasets of forced isotropic turbulence and turbulent channel flow, and comparisons were made with bicubic interpolation and several deep learning super-resolution models. The results demonstrate that SFDN achieves comparable or superior performance to state-of-the-art methods in terms of visual quality, quantitative accuracy, and preservation of physical characteristics, while using only 35.38% of the parameters of leading models. These findings highlight the effectiveness and efficiency of the proposed approach, as well as its strong generalization capability for reconstructing complex turbulent flows.

Keywords:

spatio-frequency fusion; Mamba; feature distillation; turbulence super-resolution reconstruction; physics-guided learning

1. Introduction

Turbulence is an irregular, random, and strongly nonlinear fluid phenomenon that appears in both natural and engineered systems, from ocean and atmospheric flows to aerospace propulsion [1,2,3,4]. Understanding its multi-scale vortex structures and energy transfer across scales can deepen our basic knowledge of fluid dynamics and also support more accurate marine environment monitoring and climate change prediction [5,6,7,8].

Traditional turbulence research mainly depends on high-resolution direct numerical simulations (DNS) and laboratory or in situ measurements. To resolve motions across many scales, DNS requires very fine spatial grids and very small time steps. However, this leads to extremely high computational and storage costs [9]. Because of this, researchers often subsample or compress DNS data, keeping only coarse-grained velocity fields and losing important small-scale information. In the same way, laboratory and in situ measurements are limited by sensor resolution, array density, and budget. These limits make it hard to fully capture fine-scale turbulent structures in both space and time [10]. For this reason, low-cost super-resolution reconstruction methods have become an important topic in fluid mechanics.

Existing reconstruction methods can usually be divided into mathematical interpolation methods and data-driven deep learning methods. Traditional interpolation methods, such as bicubic interpolation and spline interpolation, are widely used because they do not depend on training data. But they generally struggle to recover the physical properties of turbulence [11,12,13]. By contrast, data-driven deep learning models can learn turbulence features directly from training data through neural networks. As a result, they often achieve much higher accuracy and have attracted increasing attention in engineering applications.

For example, convolutional neural networks (CNNs) have been studied widely. Models such as SCNN [14] and DSC/MS [15,16] have achieved high upsampling ratios in turbulent wake simulations. Later CNN-based models have further reduced the cost of PIV experiments [17], improved reconstruction from sparse and noisy data [18], and supported the development of physics-informed neural networks (PINNs) [19]. Even so, CNN-based methods still face basic challenges when dealing with complex physical flow fields. CNNs mainly rely on local convolution kernels and deeper stacked layers to enlarge the receptive field. Although deeper CNNs can in theory capture long-range information, their effective receptive field is still mostly local. This is a serious limitation for turbulence, where coherent vortex structures and cross-scale interactions can extend over long spatial distances. As a result, CNNs often have limited ability to recover large-scale spatial correlations, small-scale local details, and global energy distributions at the same time.

To address the locality of CNNs, Transformer-based models [20,21,22] have recently been introduced into turbulence reconstruction. Models such as SRTT [23] and SWINFlow-3D [24] reduce this limitation by using self-attention to build direct connections between distant spatial positions. In addition, Xu et al. [25] showed that self-supervised Transformer learning can achieve "unequivocally superior performance" in flow reconstruction and prediction. The multi-scale hybrid attention model MHASTR [26] further improves the modeling of nonlocal dependencies and global structural correlations. However, standard self-attention has quadratic computational complexity in the number of tokens, which causes very high memory and computational costs, especially for high-resolution inputs. This becomes a major bottleneck and greatly limits the use of Transformer-based models in engineering settings with limited resources.

More importantly, turbulence reconstruction is not simply a visual image restoration task. It also requires the preservation of physically meaningful spectral characteristics. Most existing methods focus mainly on learning spatial features. Although they may achieve good pointwise numerical accuracy, they often perform poorly in recovering energy spectra and small-scale fluctuations at high wavenumbers [27]. This gap motivates the explicit introduction of spatial-frequency information into the reconstruction framework.

To address these issues, this paper proposes a new spatio-frequency fusion distillation network (SFDN) for high-fidelity turbulence super-resolution. SFDN addresses the main challenges through four key designs. First, it uses a Mamba-based state-space model instead of a standard Transformer, which reduces computational complexity to a linear scale while keeping the ability to model long-range dependencies. Second, it adds a frequency branch to recover missing spectral information and improve the reconstruction of high-frequency energy spectra. Third, it uses a feature distillation strategy to improve representation efficiency and keep the model lightweight. Fourth, it introduces a physics-guided loss function to enforce statistical and dynamical physical consistency in the reconstructed flow fields. As shown by the results, SFDN achieves reconstruction accuracy and physical fidelity that are comparable to or better than those of the state-of-the-art MHASTR model [26], while using only 35.38% of its parameters.

The goal of this study is not to recover high-resolution turbulence fields from completely unknown real-world flows in an unconditional or unsupervised way. Instead, the proposed model is designed for supervised super-resolution in cases where high-fidelity reference data, historical numerical simulations, or representative databases are available under the same or similar flow conditions. Typical applications include post-processing DNS/LES data for low-cost storage and later high-fidelity recovery [15,16], improving the resolution of experimental measurements such as PIV using high-fidelity reference simulations [17], and quickly reconstructing flow fields under the same or similar operating conditions. Therefore, the performance of this supervised model is naturally limited by the physical distribution covered by the training data, and it cannot be applied to completely unseen flow regimes without prior high-resolution training data. In addition, zero-shot transfer across fundamentally different turbulent flows is beyond the scope of this work and will be studied in future research.

From a physical point of view, turbulence super-resolution from a single coarse-grained snapshot is inherently an ill-posed problem, because similar large-scale flow patterns may correspond to several possible small-scale realizations. So, the aim of this study is not to claim the exact recovery of a unique instantaneous ground-truth field. Instead, the proposed SFDN is designed to approximately reconstruct plausible fine-scale structures from coarse observations, while preserving statistical and physical consistency with the high-resolution reference data in a supervised learning framework.

The remainder of this paper is organized as follows. Section 2 describes the proposed SFDN model, the turbulence datasets, and the model configuration used in the experiments. Section 3 discusses the results of model validation on the two turbulence datasets. Finally, Section 4 presents the conclusions and future work.

2. Materials and Methods

2.1. Datasets and Model Configuration

2.1.1. Forced Isotropic Turbulence

In this section, the proposed model is evaluated in detail using forced isotropic turbulence. The three-dimensional isotropic turbulence data [28,29] come from the Johns Hopkins Turbulence Database (JHTDB). The original direct numerical simulation (DNS) was carried out with a pseudo-spectral method in a cubic domain of size (2π × 2π × 2π) under periodic boundary conditions, with a grid resolution of 1024 × 1024 × 1024. To maintain the turbulence, a deterministic large-scale forcing scheme was applied at low wavenumbers. The flow has a Taylor microscale Reynolds number of Re_λ ≈ 433. The spatial resolution is fine enough to fully resolve the Kolmogorov length scale η, with k_max·η ≈ 1.01, where k_max is the maximum resolved wavenumber and reaches 512. The dataset contains 5028 frames of physical fields, including pressure and the three velocity components. The time interval between adjacent frames is 0.002 s, giving a total duration of 10.056 s. This time span covers several large-eddy turnover times and is much larger than the Kolmogorov time scale. More details on the numerical setup and validation can be found in the JHTDB references [28,29].

All experiments are performed on two-dimensional x-y slices extracted from the 3D turbulence field. From these slices, 2000 high-resolution DNS velocity-field frames are randomly selected for training, and another 400 frames are used for testing. To generate the low-resolution inputs, the original high-resolution DNS data are processed with a coarse-graining procedure. A spatial top-hat filter, that is, a local box average, is first applied to the velocity field with a kernel size set by the upscaling factor r. This step removes unresolved high-frequency fluctuations, reduces spectral aliasing, and mimics the spatial averaging effect in coarse-resolution measurements. The filtered data are then resampled to produce the final low-resolution patches, and bicubic interpolation is used in the implementation pipeline. To improve computational efficiency, the low-resolution input size of SFDN is fixed at 3 × 64 × 64. When a low-resolution snapshot is larger than this size, it is split into multiple 3 × 64 × 64 slices, and each slice is reconstructed separately by the model. The final snapshot is then obtained by mean fusion. The detailed settings of the proposed model are listed in Table 1.
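The coarse-graining step described above can be illustrated with a minimal NumPy sketch. It assumes non-overlapping r × r box averages, which merges the top-hat filter and the subsampling into one operation and omits the bicubic resampling mentioned in the text; the function name `box_filter_downsample` and the example sizes are hypothetical and are not taken from the authors' code.

```python
import numpy as np

def box_filter_downsample(field, r):
    """Average a (C, H, W) field over non-overlapping r x r blocks.

    Simplified stand-in for the top-hat filter plus resampling;
    it is not the authors' exact pipeline.
    """
    c, h, w = field.shape
    if h % r or w % r:
        raise ValueError("H and W must be divisible by r")
    # Expose each r x r block on its own pair of axes, then average it.
    blocks = field.reshape(c, h // r, r, w // r, r)
    return blocks.mean(axis=(2, 4))

# Hypothetical example: 3 velocity components on a 256 x 256 slice, r = 4.
rng = np.random.default_rng(0)
hr = rng.standard_normal((3, 256, 256))
lr = box_filter_downsample(hr, 4)
print(lr.shape)  # (3, 64, 64)
```

Because every high-resolution point contributes to exactly one block, the block average preserves the mean of each velocity component while discarding fluctuations at scales smaller than r grid points.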

For physical context, it is useful to distinguish isotropic turbulence from directionally anisotropic turbulence. In strictly isotropic turbulence, the statistical properties of the flow do not change under rotation or reflection of the coordinate system, so the turbulent fluctuations are statistically the same in all directions. This kind of flow is a standard and idealized benchmark for studying small-scale turbulence physics and energy cascades. By contrast, directionally anisotropic turbulence shows different flow behavior in different spatial directions, which is more typical of real engineering flows influenced by walls, mean shear, or directional forcing. From this view, the two datasets used in this study provide complementary test cases. The forced isotropic turbulence dataset serves as an idealized isotropic benchmark, while the turbulent channel flow dataset in Section 2.1.2 represents a strongly anisotropic wall-bounded flow with clear differences among the streamwise, wall-normal, and spanwise directions. Testing the proposed model on both datasets therefore helps evaluate its robustness and physical consistency across fundamentally different turbulence structures.

2.1.2. Turbulent Channel Flow

Turbulent channel flow is a typical wall-bounded turbulent flow in which fluid moves between two parallel walls. The interaction between the fluid and the walls creates more complex multi-scale structures and stronger anisotropic features near the wall, which makes flow reconstruction more difficult [30]. Because of this, turbulent channel flow is widely used to test a model’s ability to reconstruct complex flows.

The high-resolution turbulent channel flow dataset used in this paper comes from the Johns Hopkins Turbulence Database (JHTDB) [28,29]. The data were generated by direct numerical simulation (DNS) based on a velocity-vorticity formulation, using a pseudo-spectral method in the horizontal directions and B-splines in the wall-normal direction, following the common setup used in wall-turbulence DNS studies and datasets [30,31]. The computational domain has a size of 8π × 2π × 3π and is discretized on a 2048 × 512 × 1536 grid. The simulation corresponds to a friction Reynolds number of Re_τ ≈ 1000. Uniform periodic grids are used in the streamwise (x) and spanwise (z) directions, while a non-uniform grid is used in the wall-normal (y) direction.

The dataset contains 4000 snapshots of physical fields, and each snapshot includes pressure and the three velocity components, u, v, and w. The time interval between two adjacent frames is 0.0065 s, resulting in a total duration of 25.9935 s. Since the grid is non-uniform in the y-direction, the wall-normal position is described by the dimensionless wall distance y⁺ instead of the physical wall distance. Here, y⁺ = u_τ·y/ν, where u_τ = √(τ_w/ρ) is the friction velocity, τ_w is the wall shear stress, ρ is the fluid density, and ν is the kinematic viscosity. The parameter δ denotes the channel half-height. Based on y⁺ and the classical theory of turbulent boundary layers, the channel flow is divided into four regions: the viscous sublayer (y⁺ ≤ 5), the buffer layer (5 < y⁺ ≤ 30), the log-law layer (30 < y⁺ ≤ 0.2δ), and the outer layer (0.2δ < y⁺ ≤ δ). More details on the numerical method and baseline validation can be found in the JHTDB references [28,29].
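As a small illustration of the four-region split, the following sketch converts a wall distance to wall units and returns the corresponding layer. It assumes that the outer bounds are expressed in wall units through δ⁺ = u_τ·δ/ν, which is taken here as 1000 to match the nominal friction Reynolds number; the function names and this default are illustrative assumptions.

```python
def y_plus(y, u_tau, nu):
    """Dimensionless wall distance y+ = u_tau * y / nu."""
    return u_tau * y / nu

def wall_region(yp, delta_plus=1000.0):
    """Name the layer that contains a wall distance given in wall units.

    delta_plus is the channel half-height in wall units (an assumption:
    roughly equal to the friction Reynolds number).
    """
    if yp < 0 or yp > delta_plus:
        raise ValueError("y+ must lie in [0, delta_plus]")
    if yp <= 5:
        return "viscous sublayer"
    if yp <= 30:
        return "buffer layer"
    if yp <= 0.2 * delta_plus:
        return "log-law layer"
    return "outer layer"

for yp in (3, 20, 150, 600):
    print(yp, wall_region(yp))
```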

For training and testing, 300 snapshots on the x-z plane are randomly selected for training, and another 60 snapshots are used for testing. In the simulation data, the proportions of the four regions are 14:24:64:151 [32]. To improve the generalization ability of the model, the training and testing samples are selected from the four regions with a ratio of 1:1:2:2. Most model settings are kept the same as those used for isotropic turbulence, including the Mamba-based sequence modeling strategy [33]. The two balance parameters in the total loss function are set to α = 50 and β = 1. All models are implemented in PyTorch 2.0.0 and trained on an NVIDIA RTX3090 GPU.
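The effect of the 1:1:2:2 sampling ratio can be checked with a few lines of arithmetic. The sketch below assumes the ratio is applied separately to the 300 training and 60 testing snapshots, which the text does not state explicitly; the helper names are hypothetical.

```python
def allocate_by_ratio(total, ratio):
    """Split `total` samples across groups in proportion to an integer ratio."""
    parts = sum(ratio)
    if total % parts:
        raise ValueError("total must be divisible by the sum of the ratio")
    return [total * r // parts for r in ratio]

def shares(ratio):
    """Fraction of the whole assigned to each group."""
    parts = sum(ratio)
    return [r / parts for r in ratio]

regions = ["viscous sublayer", "buffer layer", "log-law layer", "outer layer"]
train_counts = dict(zip(regions, allocate_by_ratio(300, (1, 1, 2, 2))))
test_counts = dict(zip(regions, allocate_by_ratio(60, (1, 1, 2, 2))))
print(train_counts)
print(test_counts)

# Compare the share of near-wall samples with the share in the raw data.
natural = shares((14, 24, 64, 151))
sampled = shares((1, 1, 2, 2))
print(round(natural[0], 3), round(sampled[0], 3))
```

Under this assumption, the viscous sublayer and the buffer layer are sampled more densely than their share of the simulation grid, which is consistent with the stated goal of improving generalization across regions.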

2.2. The Deep Learning Model for Turbulence Super-Resolution

Current deep-learning-based models for turbulence super-resolution face two primary challenges. First, CNN-based architectures have limited capabilities in capturing long-range spatial dependencies and often suffer from the loss of high-frequency spectral information. Although lightweight convolution designs, such as depthwise separable convolutions, can improve parameter efficiency, their representational capacity still depends mainly on local receptive fields [34]. Second, Transformer-based models achieve high reconstruction accuracy but introduce excessive parameter counts and quadratic computational complexity due to their self-attention mechanisms, hindering their deployment in resource-constrained environments. To address these issues, a novel spatio-frequency fusion distillation network (SFDN) is proposed in this paper. The core design philosophy is to utilize the Mamba architecture to reduce computational complexity, introduce a frequency branch to compensate for uncaptured spectral information, and employ a feature distillation strategy to retain critical turbulent features while keeping the model lightweight. The overall architecture is illustrated in Figure 1.

2.2.1. Mamba-Frequency Fusion Block

To comprehensively capture the underlying physical information of turbulence, the Mamba-frequency fusion block (MFFB) is designed, as detailed in Figure 2. Traditional deep learning methods typically extract features solely in the spatial domain, which often leads to over-smoothing or complete loss of the high-frequency content of the energy cascade, especially at high upsampling ratios. To overcome this, the MFFB adopts a dual-branch architecture consisting of a spatial branch and a frequency branch. The spatial branch employs a Mamba module in place of the traditional Transformer structure. Leveraging its state space model (SSM), Mamba can capture long-range spatial dependencies with linear time complexity. This design is also consistent with the recent success of Vision Mamba in efficient visual representation learning [35]. It maintains global perception of large-scale vortex structures while significantly reducing the parameter burden and computational overhead associated with Transformers. The frequency branch employs a frequency-enhanced pixel attention (FPA) mechanism to compensate for the spectral information that spatial convolutions struggle to preserve. By mapping features into the frequency domain via the fast Fourier transform (FFT), the model directly learns spectral representations of turbulence. This is crucial for recovering dissipative scales and high-frequency energy spectra. Finally, the spatial and frequency representations are fused, so that spatial structural morphology and spectral energy characteristics complement each other within a single module.

In the spatial branch, the input feature F_{MFFB\_in} is first refined by a blueprint separable convolution (BSConv) layer [35]. The refined feature is then sent to a standard Mamba module [36] to extract deep features in the spatial domain. At the end of the spatial branch, a BSConv layer and a long skip connection are added to refine and fuse the deep features from different MFFB modules. The spatial feature F_{spatial} is represented as:

F_{spatial} = H_{BSConv}(H_{Mamba}(H_{BSConv}(F_{MFFB\_in}))) + F_{MFFB\_in}

(1)

where H_{BSConv} is the BSConv layer. Compared with a standard convolution layer, the BSConv layer consists of a 1 × 1 pointwise convolution and a channel-wise convolution. This structure can enhance the generalization ability of the model, alleviate overfitting, and significantly reduce the number of parameters and the computational complexity. Meanwhile, the MFFB uses a Mamba module in place of the Transformer architecture. Constrained by the quadratic computational complexity of their self-attention mechanism, traditional Transformer-based models require large parameter matrices and substantial computational resources [37]. Therefore, as the input resolution increases, deploying Transformer-based models on edge or low-resource devices becomes increasingly impractical. By integrating state-space model formulations with hardware-aware optimizations, the Mamba network can reduce computational complexity to a linear scale and significantly improve its applicability to high-resolution inputs [38].
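The parameter saving from BSConv can be made concrete with a short calculation. The sketch below assumes a 1 × 1 pointwise convolution combined with a k × k channel-wise convolution acting on the output channels, ignores bias terms, and uses a hypothetical width of 64 channels that is not taken from the paper.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def bsconv_params(c_in, c_out, k):
    """Weights in a 1 x 1 pointwise plus k x k channel-wise convolution.

    Assumes the channel-wise kernel acts on the c_out channels; bias ignored.
    """
    pointwise = c_in * c_out
    channelwise = c_out * k * k
    return pointwise + channelwise

c_in, c_out, k = 64, 64, 3  # hypothetical layer width
std = standard_conv_params(c_in, c_out, k)
bs = bsconv_params(c_in, c_out, k)
print(std, bs, round(bs / std, 3))  # 36864 4672 0.127
```

For this hypothetical layer, the separable form needs roughly one eighth of the weights of a standard 3 × 3 convolution, which illustrates why it is attractive for a lightweight model.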

In the frequency branch, a frequency-enhanced pixel attention (FPA) mechanism [39] is introduced in the MFFB to extract frequency features. The detailed architecture of FPA is shown in Figure 3. Specifically, the feature F_{f\_init} is first transformed into the frequency domain by a fast Fourier transform (FFT). Then, three convolutional layers are used to extract global frequency features. Finally, an inverse fast Fourier transform (IFFT) maps the frequency features back to the spatial domain. The detailed process of this branch can be formulated as:

F_{f\_init} = H_{CONV}(F_{MFFB\_in})

(2)

F_{frequency} = H_{CONV}(DyT(H_{FPA}(H_{BSConv}(F_{f\_init}))) + F_{f\_init})

(3)
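The FFT, frequency-domain processing, and IFFT sequence described for the frequency branch can be sketched in NumPy. This is a simplified illustration only: the three learned convolution layers of the FPA module are replaced here by a fixed real-valued gain per Fourier mode, and the function name `frequency_gate` is hypothetical.

```python
import numpy as np

def frequency_gate(feat, gain):
    """Scale each Fourier mode of a (C, H, W) feature map and transform back.

    gain has shape (C, H, W // 2 + 1), matching the half-spectrum of rfft2.
    It stands in for the learned frequency-domain layers of the FPA module.
    """
    h, w = feat.shape[-2:]
    spec = np.fft.rfft2(feat, axes=(-2, -1))
    return np.fft.irfft2(spec * gain, s=(h, w), axes=(-2, -1))

rng = np.random.default_rng(0)
feat = rng.standard_normal((2, 8, 8))

# A unit gain leaves the features unchanged (FFT followed by IFFT).
unit = np.ones((2, 8, 5))
print(np.allclose(frequency_gate(feat, unit), feat))  # True

# Keeping only the zero-wavenumber mode returns the per-channel mean.
dc_only = np.zeros((2, 8, 5))
dc_only[:, 0, 0] = 1.0
smooth = frequency_gate(feat, dc_only)
print(np.allclose(smooth, feat.mean(axis=(1, 2), keepdims=True)))  # True
```

Because every Fourier mode depends on all spatial positions, a single operation in the frequency domain has a global receptive field, which is the property the frequency branch relies on.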

where F_{frequency} is the frequency feature, and H_{CONV} and H_{FPA} are the convolution layer and the FPA module, respectively. The dynamic hyperbolic tangent function (Dynamic Tanh, DyT) proposed by Zhu et al. [35] is used to replace the normalization layer. Finally, the spatial and frequency features obtained from the two branches are concatenated and refined to generate the output feature F_{MFFB\_out}:

F_{MFFB\_out} = H_{CONV}(Concat(F_{frequency}, F_{spatial}))

(4)
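To make the DyT operation and the fusion in Equation (4) more tangible, the following NumPy sketch implements DyT in the element-wise form γ·tanh(αx) + β and fuses two feature maps by channel concatenation followed by a 1 × 1 convolution. The parameter values, the random-free toy weights, and the function names are illustrative assumptions rather than the authors' trained configuration.

```python
import numpy as np

def dyt(x, alpha=0.5, gamma=None, beta=None):
    """Dynamic Tanh on a (C, H, W) tensor: gamma * tanh(alpha * x) + beta.

    alpha is a scalar; gamma and beta are per-channel vectors.
    The default values here are illustrative.
    """
    c = x.shape[0]
    gamma = np.ones(c) if gamma is None else np.asarray(gamma, dtype=float)
    beta = np.zeros(c) if beta is None else np.asarray(beta, dtype=float)
    return gamma[:, None, None] * np.tanh(alpha * x) + beta[:, None, None]

def fuse(f_frequency, f_spatial, weight):
    """Concatenate along channels, then mix with a 1 x 1 convolution.

    weight has shape (C_out, C_freq + C_spatial).
    """
    stacked = np.concatenate([f_frequency, f_spatial], axis=0)
    return np.einsum("oc,chw->ohw", weight, stacked)

x = np.array([[[0.0, 100.0, -100.0]]])
print(dyt(x))  # values close to 0, 1, and -1: large inputs saturate

f_freq = np.ones((2, 4, 4))
f_spat = 2.0 * np.ones((2, 4, 4))
out = fuse(f_freq, f_spat, np.ones((3, 4)))
print(out.shape, out[0, 0, 0])  # (3, 4, 4) 6.0
```

Unlike a normalization layer, DyT needs no batch or channel statistics, so it adds only a scalar and two per-channel vectors of parameters.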

Furthermore, as illustrated in both the overall architecture (Figure 1) and the internal structure of the MFFB (Figure 2), 1 × 1 convolutions (Conv-1) are repeatedly used throughout the network. Their inclusion serves two important architectural purposes. First, they act as efficient cross-channel feature mixers. By linearly combining feature maps across the channel dimension without changing the spatial resolution, Conv-1 layers facilitate the interaction and aggregation of heterogeneous information, such as the fusion of concatenated spatial and frequency representations after dual-branch feature extraction. Second, they serve as lightweight dimensionality regulators. When inserted before computationally intensive operations, Conv-1 layers create a bottleneck that reduces the parameter count and floating-point computational cost. When placed after such operations or concatenation steps, they help project refined features into the desired channel space. This design helps keep the SFDN lightweight and computationally efficient while preserving its representational capacity.

2.2.2. Mamba-Frequency Distillation Block

Inspired by the large kernel frequency-enhanced block (LKFB) [36], a Mamba-frequency distillation block (MFDB) is designed based on the MFFB, as shown in the dashed box in Figure 1. As the network depth increases, indiscriminately passing all extracted features forward leads to computational redundancy and noise accumulation. The core concept of the MFDB is feature distillation. Through a series of progressive feature splitting and refinement operations, the MFDB distills the most critical high-frequency turbulent features while filtering out redundant information. This progressive purification mechanism enhances the representational efficiency of the network, retaining multi-scale structural details while maintaining a lightweight model footprint. Additionally, a lightweight local region self-attention (LRSA) module is introduced at the end of the MFDB. This module focuses on enhancing the model’s perception of local, fine-grained vortex details, further improving the fusion of local flow fields with global spatial features without introducing excessive parameters. For the input feature F_{MFDB\_in}, the feature distillation process is defined as:

F_{d1}, F_{r1} = D_{1}(F_{MFDB\_in}), MFFB_{1}(F_{MFDB\_in})

(5)

F_{di}, F_{ri} = D_{i}(F_{r(i-1)}), MFFB_{i}(F_{r(i-1)}) (i = 2, 3)

(6)

F_{r4} = H_{BSConv}(F_{r3})

(7)

where D_{i} is the i-th distillation layer and MFFB_{i} is the i-th refinement layer. F_{di} and F_{ri} are the i-th distilled feature and refined feature, respectively. A BSConv layer is applied after the last MFFB module to refine its output. Then, all distilled features and the final refined feature are concatenated:

F_{concat} = Concat(F_{d1}, F_{d2}, F_{d3}, F_{r4})

(8)
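The data flow of Equations (5) to (8) can be traced with a short NumPy sketch. The distillation layers are modeled here as 1 × 1 convolutions that halve the channel count, the MFFB refinement layers are replaced by a shape-preserving stand-in (`np.tanh`), and the final BSConv is replaced by another 1 × 1 convolution. All of these choices, including the channel widths, are assumptions made only to show how the features are split and concatenated.

```python
import numpy as np

def conv1x1(weight):
    """Return a 1 x 1 convolution acting on (C, H, W) arrays."""
    return lambda f: np.einsum("oc,chw->ohw", weight, f)

def mfdb_distill(f_in, distill_layers, refine_layers, final_refine):
    """Progressive distillation: keep D_i(F) aside, pass MFFB_i(F) onward."""
    distilled, f_r = [], f_in
    for d_i, refine_i in zip(distill_layers, refine_layers):
        distilled.append(d_i(f_r))  # F_di, Eqs. (5)-(6)
        f_r = refine_i(f_r)         # F_ri, Eqs. (5)-(6)
    f_r4 = final_refine(f_r)        # Eq. (7)
    return np.concatenate(distilled + [f_r4], axis=0)  # Eq. (8)

rng = np.random.default_rng(0)
c, c_d = 8, 4  # hypothetical feature and distilled widths
f_in = rng.standard_normal((c, 5, 5))
distill = [conv1x1(rng.standard_normal((c_d, c))) for _ in range(3)]
refine = [np.tanh] * 3  # stand-in for the three MFFB modules
final = conv1x1(rng.standard_normal((c_d, c)))

f_concat = mfdb_distill(f_in, distill, refine, final)
print(f_concat.shape)  # (16, 5, 5)
```

In this toy setting, each stage sets aside a compact distilled copy while the full-width feature continues through further refinement, so the concatenated output mixes shallow and deep representations.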

Additionally, a lightweight local region self-attention (LRSA) module (Figure 4) [38] is introduced into the MFDB. Through its local self-attention mechanism, the LRSA module enhances the expression of local features and improves the model’s understanding of local details in turbulence. It also strengthens the fusion of local and global information, which improves the performance of the proposed SFDN model. To reduce the number of parameters, the DyT module is used to replace the original normalization operation, and this process can be formulated as:

F_{enhanced} = DyT(H_{Conv}(H_{LRSA}(H_{Conv}(F_{concat}))))

(9)

where the two H_{Conv} operators are 1 × 1 convolutions, and H_{LRSA} denotes the LRSA module. The input feature F_{MFDB\_in} is passed to the end of the MFDB through a residual connection and added to F_{enhanced} to enhance the expressive ability of the module:

F_{MFDB\_out} = F_{enhanced} + F_{MFDB\_in}

(10)

2.3. Spatio-Frequency Fusion Distillation Network

In summary, the proposed SFDN pipeline primarily consists of three stages: shallow feature extraction, deep feature extraction, and high-resolution reconstruction. Given a low-resolution input, a 3 × 3 blueprint separable convolution (BSConv) layer is first applied to extract shallow features, mapping the turbulent velocity field into a higher-dimensional feature space. These features are then fed into the deep feature extraction module, which consists of n stacked MFDBs. Through this cascading architecture, the network progressively distills and refines the spatio-frequency features of the turbulence. Finally, a residual connection forwards the structural information from the shallow layers directly to the high-resolution reconstruction module. This module utilizes Pixel-Shuffle layers to upscale the refined features, yielding the final high-resolution turbulence field. Given a low-resolution input X_{LR}, the shallow feature extraction module is first applied to obtain the shallow feature F_{SF}:

F_{SF} = H_{BSConv}(X_{LR})

(11)

A 3 × 3 BSConv layer in the shallow feature extraction module maps the low-resolution turbulent velocity field into a higher-dimensional feature space to enrich the model’s representational capacity. Then, F_{SF} is fed into n stacked MFDBs to extract the deep features, which can be formulated as:

F_{i} = H_{MFDB_{i}}(F_{i-1}), i = 1, \dots, n

(12)

where F_{i-1} and F_{i} are the input and output features of the i-th MFDB module, respectively, with F_{0} = F_{SF}, and H_{MFDB_{i}} is the i-th MFDB module. Through the cascading of MFDBs, the model gradually refines the deep features of the turbulence field. Next, all deep features from the MFDBs are fused and refined at the end of the deep feature extraction module:

F_{DF} = H_{BSConv}(H_{Conv}(Concat(F_{1}, F_{2}, \dots, F_{n})))

(13)

where F_{DF} is the final deep feature from the deep feature extraction module. Finally, the shallow and deep features are sent to the high-resolution turbulence reconstruction module to obtain the reconstructed high-resolution output X_{SR}:

X_{SR} = H_{Upsample}(F_{SF} + F_{DF})

(14)

where H_{Upsample} represents the high-resolution turbulence reconstruction module (Figure 1). Specifically, the high-resolution reconstruction module is made up of several reconstruction layers. Each layer contains a 3 × 3 convolution layer and a Pixel-Shuffle layer, so the module can progressively achieve ×2 super-resolution reconstruction of the turbulence field. A residual connection is also introduced to pass shallow features directly to the reconstruction stage. This helps preserve low-level information and reduces the burden on the feature extraction modules, allowing them to focus more on learning deeper representations.
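The Pixel-Shuffle rearrangement used in each reconstruction layer can be written in a few lines of NumPy. The sketch below follows the usual channel ordering, in which output pixel (h·r + i, w·r + j) of channel c is read from input channel c·r² + i·r + j; the function name and the toy input are illustrative, and the preceding 3 × 3 convolution is omitted.

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange (C * r * r, H, W) into (C, H * r, W * r)."""
    c_r2, h, w = x.shape
    if c_r2 % (r * r):
        raise ValueError("channel count must be divisible by r * r")
    c = c_r2 // (r * r)
    x = x.reshape(c, r, r, h, w)    # axes: (c, i, j, h, w)
    x = x.transpose(0, 3, 1, 4, 2)  # axes: (c, h, i, w, j)
    return x.reshape(c, h * r, w * r)

x = np.arange(4 * 2 * 2, dtype=float).reshape(4, 2, 2)
y = pixel_shuffle(x, 2)
print(y.shape)   # (1, 4, 4)
print(y[0, :2, :2])  # the four input channels at position (0, 0)

# Two x2 stages in sequence give an overall x4 upscaling.
z = pixel_shuffle(pixel_shuffle(np.zeros((16, 64, 64)), 2), 2)
print(z.shape)   # (1, 256, 256)
```

Since each ×2 stage only reshuffles values produced by the preceding convolution, the upscaling itself adds no parameters; larger factors are obtained by chaining stages.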

2.4. Physics-Guided Loss Function

In traditional super-resolution tasks, loss functions primarily minimize point-wise numerical errors, such as the L₁ or L₂ loss, while neglecting the physical nature of the flow field. For turbulence, relying solely on such numerical approximation often distorts physical characteristics. To improve the physical consistency of the reconstructed results, a physics-guided loss function is introduced that integrates data fidelity with physical constraints. The data loss component, L_{data}, ensures that the reconstructed velocity vectors closely match the true DNS values point by point. More importantly, the physical loss component, L_{phys}, incorporates the turbulent kinetic energy (TKE) as a constraining term. This physics-guided learning paradigm encourages the model not only to “look” numerically similar to the true flow field but also to remain consistent with the statistical and dynamic properties of real turbulence during optimization. The total loss is defined as:

L = α L_{phys} + β L_{data}

(15)

where α and β are the parameters that balance the contributions of the two components. The data loss is defined as the mean squared error between the true data u^{DNS} and the reconstructed data \hat{u}:

L_{data} = \frac{1}{N} \sum_{i=1}^{N} ‖u_{i}^{DNS} - \hat{u}_{i}‖^{2}

(16)

This data component ensures that each reconstructed velocity vector closely matches its true value. To reconstruct the physical structure of turbulence, the turbulent kinetic energy (TKE) is used in the loss function, and it is defined as

k = \frac{1}{2} (u'^{2} + v'^{2} + w'^{2})

(17)

where u', v', and w' are the velocity fluctuations in the streamwise, wall-normal, and spanwise directions, respectively. Based on Equation (17), the physics component is defined as [36]:

L_{phys} = \frac{1}{N} \sum_{i=1}^{N} ‖k_{i}^{DNS} - \hat{k}_{i}‖^{2}

(18)

where

k^{D N S}

and

\hat{k}

are the kinetic energy values computed from the true and reconstructed fields.
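As an illustration of Equations (15) to (18), the loss can be sketched in NumPy as follows. This is a minimal sketch rather than the training code: the function names, the `(3, ...)` array layout, and the single broadcastable `mean` argument are assumptions made for the example, and a real training loop would use an automatic-differentiation framework instead of NumPy. The defaults `alpha=1` and `beta=1` follow Table 1.

```python
import numpy as np

def tke(fluct):
    """Pointwise turbulent kinetic energy, Eq. (17).

    fluct: array of shape (3, ...) holding u', v', w'.
    """
    return 0.5 * np.sum(fluct ** 2, axis=0)

def physics_guided_loss(u_dns, u_sr, alpha=1.0, beta=1.0, mean=0.0):
    """Return (L, L_data, L_phys) following Eqs. (15), (16), and (18).

    u_dns, u_sr: velocity arrays of shape (3, ...).
    mean: mean velocity subtracted to obtain fluctuations (assumed
          broadcastable; 0 corresponds to the near-zero-mean isotropic case).
    """
    # Eq. (16): squared Euclidean norm of the velocity error, averaged over points.
    l_data = np.mean(np.sum((u_dns - u_sr) ** 2, axis=0))
    # Eq. (18): squared TKE mismatch, averaged over points.
    k_dns = tke(u_dns - mean)
    k_sr = tke(u_sr - mean)
    l_phys = np.mean((k_dns - k_sr) ** 2)
    # Eq. (15): weighted sum of the two components.
    return alpha * l_phys + beta * l_data, l_data, l_phys
```

Returning the two components alongside the total makes it easy to monitor whether one term dominates during training.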

The fluctuating velocity component is not assumed a priori. For the forced isotropic turbulence dataset, the mean velocity is approximately zero, so the fluctuation is naturally defined with respect to this near-zero mean. For the turbulent channel flow dataset, the mean velocity profile is computed by spatial averaging along the statistically homogeneous directions for each snapshot during training, and the fluctuating component is then obtained by subtracting this mean profile from the total velocity field.
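The decomposition described above can be sketched as follows. The array layout `(3, nx, ny)` and the choice of which axes are statistically homogeneous are assumptions for illustration; the function name is hypothetical.

```python
import numpy as np

def velocity_fluctuations(vel, homogeneous_axes):
    """Split a snapshot into a mean profile and fluctuations.

    vel: array of shape (3, ...) with the velocity components first.
    homogeneous_axes: tuple of spatial axes to average over, e.g. (1,)
        for a channel-flow plane stored as (3, nx, ny) with x homogeneous.
    """
    # Average only along the homogeneous directions so the mean keeps
    # its dependence on the remaining (e.g. wall-normal) coordinate.
    mean_profile = vel.mean(axis=homogeneous_axes, keepdims=True)
    return vel - mean_profile, mean_profile
```

For the isotropic dataset, passing all spatial axes reduces the mean to one value per component, which is close to zero and leaves the field essentially unchanged.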

The coefficient α is introduced to balance the contribution of the fluctuation-related physical term relative to the data loss during optimization. Because the magnitude of the fluctuation term differs substantially between the forced isotropic turbulence and channel flow datasets, α is selected in a dataset-dependent manner to ensure that the physical loss and the data loss remain at comparable orders of magnitude at the beginning of training, which helps maintain stable optimization and physically consistent reconstruction. In practice, α should be chosen by first estimating the scale of the fluctuation energy in the target dataset, and then adjusting the weight so that the physical loss does not dominate or become negligible compared with the data loss. An excessively large α may overemphasize the physical term and slightly reduce pointwise data fidelity, whereas an excessively small α may weaken the physical constraint. Our sensitivity analysis indicates that the selected value provides a good balance between numerical accuracy and physical consistency.
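One simple way to keep the two loss terms at comparable orders of magnitude at the start of training is sketched below. This heuristic is an assumption for illustration only; the text does not give the exact selection procedure, and the function name is hypothetical.

```python
import math

def balance_alpha(l_data0, l_phys0):
    """Pick a power-of-ten weight so that alpha * L_phys ~ L_data initially.

    l_data0, l_phys0: loss components measured on a few batches before
    (or at the very beginning of) training, with beta fixed to 1.
    """
    if l_data0 <= 0 or l_phys0 <= 0:
        raise ValueError("both initial losses must be positive")
    # Round the ratio in log space so only the order of magnitude is matched.
    return 10.0 ** round(math.log10(l_data0 / l_phys0))
```

The value obtained this way would only be a starting point; a sensitivity study such as the one mentioned above is still needed to confirm the final choice.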

Overall, this physics-guided loss function forces the model not only to reconstruct accurate velocity at each point but also to reproduce the physical information of turbulence.

3. Results

3.1. The Results of Forced Isotropic Turbulence

In this subsection, high-resolution turbulence fields are reconstructed from the forced isotropic turbulence dataset using super-resolution methods. The proposed SFDN is systematically compared with five baselines: bicubic interpolation, a data-agnostic mathematical method; SCNN [14] and DSC/MS [15], representative CNN-based models; and SRTT [23] and MHASTR [26], advanced Transformer-based models. The comparison is carried out from three aspects: qualitative results, quantitative results, and physics-based evaluation.

The upscaling factor r is defined as the ratio of the high-resolution output size to the low-resolution input size along each spatial direction. In this study, two upscaling factors are considered: r = 4, which corresponds to a medium-resolution input, and r = 8, which corresponds to a low-resolution input. These two settings are used to test the model under different reconstruction difficulties.

Figure 5 and Figure 6 show the reconstruction error fields of all methods with respect to the DNS reference under the two upscaling factors. A fixed color scale is used in both figures to allow a fair visual comparison. To show local structural differences more clearly, representative 64 × 64 patches are also presented.

At r = 4 (Figure 5), most deep learning methods keep the reconstruction errors at a relatively low level, while bicubic interpolation shows larger local deviations. This again shows that traditional interpolation methods have clear limits in turbulence super-resolution. At r = 8 (Figure 6), the error fields of the CNN-based methods become larger and more scattered because more high-frequency details are lost in the low-resolution input. By contrast, MHASTR and the proposed SFDN still maintain lower error levels and more coherent local structures, which suggests better reconstruction fidelity under more challenging conditions.

This trend is generally consistent with the results reported by Fukami et al. [16], who showed that machine-learning-based super-resolution can recover the main turbulent structures from coarse inputs, although the task becomes much harder as the resolution gap increases. Compared with other methods, the proposed SFDN and the Transformer-based models show more stable performance and reproduce turbulent features with higher fidelity under low-resolution conditions.

The $L_{2}$ error norm represents the error between the original high-resolution turbulence field and the reconstruction results [14]:

$$L_{2} = \frac{\| X_{HR} - X_{SR} \|_{2}}{\| X_{HR} \|_{2}}$$

where $X_{HR}$ and $X_{SR}$ are the original high-resolution data and the data reconstructed by the super-resolution method, and $\| \cdot \|_{2}$ is the Euclidean norm. The structural similarity (SSIM) reflects the consistency in structural information between the original high-resolution data and the reconstructed data [26]:

$$SSIM = \frac{(2 μ_{DNS} μ_{SR} + c_{1}) (2 σ_{DNS,SR} + c_{2})}{(μ_{DNS}^{2} + μ_{SR}^{2} + c_{1}) (σ_{DNS}^{2} + σ_{SR}^{2} + c_{2})}$$

where $μ$, $σ$, and $σ_{DNS,SR}$ denote the mean, standard deviation, and cross-covariance of the DNS data and SR data, respectively, and $c_{1}$ and $c_{2}$ are stabilization constants.
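The two metrics can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions: the SSIM is evaluated from global statistics of the whole field in its standard form (with the factor 2 on the cross-covariance), whereas many image-processing implementations average it over local windows, and the default values of `c1` and `c2` are the common choices for data with unit dynamic range rather than values taken from the paper.

```python
import numpy as np

def relative_l2(x_hr, x_sr):
    """Relative L2 error norm: ||X_HR - X_SR||_2 / ||X_HR||_2."""
    diff = np.ravel(x_hr - x_sr)
    return np.linalg.norm(diff) / np.linalg.norm(np.ravel(x_hr))

def global_ssim(x_dns, x_sr, c1=1e-4, c2=9e-4):
    """SSIM computed from global means, variances, and cross-covariance.

    c1, c2: stabilization constants (assumed values, not from the paper).
    """
    mu_x, mu_y = x_dns.mean(), x_sr.mean()
    var_x, var_y = x_dns.var(), x_sr.var()
    cov = np.mean((x_dns - mu_x) * (x_sr - mu_y))
    numerator = (2.0 * mu_x * mu_y + c1) * (2.0 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

A perfect reconstruction gives a relative error of 0 and an SSIM of 1, which is a quick sanity check before evaluating real snapshots.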

The $L_{2}$ error norm and the SSIM are used to quantify the reconstruction accuracy. Table 2 summarizes the results for two upscaling factors and reveals that the bicubic interpolation yields the highest errors, SCNN and DSC/MS achieve moderate accuracy, SRTT outperforms the CNN-based methods, and SFDN and MHASTR achieve the lowest error norms and the highest SSIM.

Figure 7 and Figure 8 show the high-pass filtered velocity fields, demonstrating the effectiveness of SFDN in reconstructing different wavenumber bands, corresponding to medium (k > 32), fine (k > 64), and dissipative (k > 128) scales. At the fine scale (k > 64), SFDN accurately captures the complexity and connectivity of vortex filaments, closely matching the DNS reference. In contrast, bicubic interpolation produces overly smooth and distorted small-scale features, especially at r = 8.

For quantitative evaluation, Table 3 lists the root mean square (RMS) of the velocity fluctuations in the high-pass-filtered fields for the two upscaling factors. Fidelity is expressed here as the ratio of the reconstructed RMS to the DNS RMS. At r = 4, SFDN achieves near-perfect fidelity: 99.7% (0.355 RMS) at k > 64 and 99.2% (0.253 RMS) at k > 128, comparable to MHASTR. At the more challenging r = 8, SFDN maintains the best performance at the dissipative scale (k > 128) with 90.2% fidelity (0.230 RMS). The performance degradation from r = 4 to r = 8 is only 4.8% (k > 64) and 9.1% (k > 128), indicating that SFDN remains robust at the larger upscaling factor.
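The band-wise RMS values can be obtained with a spectral high-pass filter, sketched below. The sharp cutoff in Fourier space, the periodic 2D field, and the integer wavenumbers (domain length 2π) are assumptions for this example; the function name is hypothetical and the filter used in the paper may differ.

```python
import numpy as np

def highpass_rms(field, k_cut):
    """RMS of a periodic 2D field after removing modes with |k| <= k_cut."""
    n0, n1 = field.shape
    # Integer wavenumbers along each axis (domain length assumed to be 2*pi).
    k0 = np.fft.fftfreq(n0, d=1.0 / n0)
    k1 = np.fft.fftfreq(n1, d=1.0 / n1)
    kmag = np.sqrt(k0[:, None] ** 2 + k1[None, :] ** 2)
    field_hat = np.fft.fft2(field)
    field_hat[kmag <= k_cut] = 0.0  # sharp spectral cutoff
    filtered = np.fft.ifft2(field_hat).real
    return np.sqrt(np.mean(filtered ** 2))
```

The fidelity percentages are then ratios of such RMS values; for example, the 99.7% quoted for SFDN at k > 64 and r = 4 corresponds to 0.355/0.356 in Table 3.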

Physically, accurate reconstruction of these high-wavenumber fluctuations is important because they represent the small-scale turbulent motions associated with the dissipation range. Preserving these structures helps maintain the correct distribution of turbulent kinetic energy and supports more reliable estimation of small-scale mixing and dissipation processes. This is particularly relevant for environmental and engineering applications where unresolved turbulent fluctuations can strongly affect transport, dispersion, and energy transfer.

In turbulence super-resolution, physical consistency is as important as numerical accuracy. For this reason, we further evaluate all methods using the energy spectra and the probability density functions (PDFs) of the normalized longitudinal and transverse velocity gradients. In Figure 9, all methods deviate from the DNS curve at high wavenumbers, reflecting the challenge of reconstructing missing high-frequency content: the bicubic interpolation diverges first, followed by SCNN, DSC/MS, and SRTT, but MHASTR and SFDN maintain the best consistency. The very small magnitude of E(k) in this region is physically expected, as it corresponds to the dissipative subrange of turbulence, where the energy transferred from larger scales is rapidly dissipated and the spectral density naturally decays to a very low level. In addition, at r = 8, deviations shift to lower wavenumbers, indicating an increased difficulty in reconstructing missing high-frequency content. Accordingly, the present evaluation does not rely solely on pointwise reconstruction errors, but also emphasizes spectral consistency, probability density distributions, statistical quantities, and other physics-related indicators, so as to more comprehensively assess whether the reconstructed fields remain consistent with the DNS reference in physically meaningful aspects.
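A shell-summed kinetic energy spectrum of the kind compared in Figure 9 can be sketched as follows. The square periodic 2D snapshot, the integer-shell binning, and the function name are assumptions for illustration; the spectra in the paper may be computed with a different convention.

```python
import numpy as np

def energy_spectrum(vel):
    """Shell-summed kinetic energy spectrum of a periodic 2D snapshot.

    vel: array of shape (C, n, n) holding C velocity components.
    Returns (k, E) where E[k] sums 0.5*|u_hat|^2 over modes with round(|k|) == k.
    """
    n = vel.shape[-1]
    assert vel.shape[-2] == n, "a square snapshot is assumed"
    k1d = np.fft.fftfreq(n, d=1.0 / n)  # integer wavenumbers
    kmag = np.sqrt(k1d[:, None] ** 2 + k1d[None, :] ** 2)
    shell = np.rint(kmag).astype(int)
    # Normalised FFT so that the sum of |u_hat|^2 equals the spatial mean of u^2.
    vel_hat = np.fft.fft2(vel, axes=(-2, -1)) / (n * n)
    energy_density = 0.5 * np.sum(np.abs(vel_hat) ** 2, axis=0)
    spectrum = np.bincount(shell.ravel(), weights=energy_density.ravel())
    return np.arange(spectrum.size), spectrum
```

With this normalization the spectrum sums to the mean kinetic energy of the snapshot, so the DNS and reconstructed curves can be compared directly on the same axes.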

Figure 10 shows the probability density functions of the normalized longitudinal and transverse velocity gradients, which capture the small-scale non-Gaussian features and intermittency. For all methods, the errors are concentrated mainly in the tails of the gradient distributions, and the PDF of SFDN matches the DNS data most closely.
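The gradient PDFs can be estimated as sketched below. The periodic central differences, the normalization by the standard deviation, and the histogram range are assumptions for this example, and the function name is hypothetical.

```python
import numpy as np

def gradient_pdfs(u, dx=1.0, bins=101, limit=10.0):
    """PDFs of the normalized longitudinal and transverse gradients of u.

    u: periodic 2D array of the x-velocity, with x along axis 0.
    Returns (centers, pdf_longitudinal, pdf_transverse).
    """
    # Periodic central differences: du/dx is longitudinal, du/dy is transverse.
    dudx = (np.roll(u, -1, axis=0) - np.roll(u, 1, axis=0)) / (2.0 * dx)
    dudy = (np.roll(u, -1, axis=1) - np.roll(u, 1, axis=1)) / (2.0 * dx)
    edges = np.linspace(-limit, limit, bins + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    pdfs = []
    for grad in (dudx, dudy):
        normalized = grad / grad.std()  # normalize by the standard deviation
        pdf, _ = np.histogram(normalized, bins=edges, density=True)
        pdfs.append(pdf)
    return centers, pdfs[0], pdfs[1]
```

Plotting the PDFs on a logarithmic vertical axis makes the tail behaviour, where the methods differ most, easier to see.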

These findings confirm that the proposed SFDN not only performs excellently in numerical metrics, but also faithfully restores the key physical features of turbulent flows, thereby emphasizing its effectiveness and competitiveness for the turbulent super-resolution reconstruction task.

3.2. The Results of Channel Turbulent Flow

To evaluate the generalization ability of the proposed SFDN in complex turbulence, this subsection examines its performance on turbulent channel flow and compares it with five methods. Figure 11 and Figure 12 show the reconstruction error fields of all methods relative to the DNS reference at two upscaling factors. Both figures use the same color scale to ensure a fair visual comparison. To better show the local structural differences, representative 64 × 64 patches are also displayed.

At r = 4 (Figure 11), bicubic interpolation shows clear local deviations, while the deep learning methods generally keep lower error levels and preserve fine-scale turbulent structures better. At r = 8 (Figure 12), the error fields of both CNN-based and Transformer-based methods become more obvious because more high-frequency information is lost in the low-resolution input. By contrast, the proposed SFDN still keeps lower reconstruction errors and more coherent local patterns, which suggests better reconstruction fidelity under more challenging low-resolution conditions.

Table 4 reports the reconstruction errors of turbulent channel flow at the two upscaling factors. The results are consistent with those for isotropic turbulence. At r = 4, bicubic interpolation gives the largest errors, while the CNN-based models reach moderate accuracy. SFDN and MHASTR perform best among all methods. To make the comparison with previous studies more direct, the baseline models in Table 4, including DSC/MS [15], SRTT [23], and MHASTR [26], were re-implemented and evaluated under the same training and testing protocol on the JHTDB channel-flow dataset. This makes the results directly comparable with recent representative studies on the same benchmark.

Notably, SFDN outperforms MHASTR by a clear margin in the L₂ error norm. At r = 8, SFDN still performs better than MHASTR, achieving the lowest error norms and the highest SSIM. These results show that, for complex low-resolution anisotropic turbulence, SFDN has a clear advantage over the second-best method, MHASTR.

Considering the anisotropic nature and layered structure of turbulent channel flow, the performance of SFDN is further evaluated for reconstructing the three velocity components, u, v, and w, in four flow regions under two upscaling factors. As shown in Figure 13, all methods can recover the main streamwise component u with high accuracy. Their performance drops slightly for the spanwise component w and becomes worst for the wall-normal component v.
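A region-wise correlation of the kind summarized in Figure 13 can be sketched as follows. The Pearson definition, the array layout with the wall-normal coordinate on the last axis, and the function names are assumptions for illustration; the region bounds are left as arguments rather than fixed values.

```python
import numpy as np

def correlation(a, b):
    """Pearson correlation coefficient between two fields of equal shape."""
    a = np.ravel(a) - np.mean(a)
    b = np.ravel(b) - np.mean(b)
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

def region_correlation(dns, sr, y_plus, y_lo, y_hi):
    """Correlation restricted to the wall-normal band y_lo <= y+ < y_hi.

    dns, sr: arrays whose last axis is the wall-normal direction.
    y_plus: 1D array of dimensionless wall distances for that axis.
    """
    mask = (y_plus >= y_lo) & (y_plus < y_hi)
    return correlation(dns[..., mask], sr[..., mask])
```

Applying `region_correlation` separately to u, v, and w, and averaging over snapshots, would give one value per component and region.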

The results also show that reconstruction in Regions III and IV is generally better than in Regions I and II. One reason is that Regions III and IV cover larger volumes and provide more data. Another reason is that Regions I and II are closer to the wall, where fluid-wall interactions are stronger and flow features vary more sharply. This makes reconstruction more difficult in the near-wall region.

Among all methods, bicubic interpolation gives the worst overall performance. The CNN-based methods improve on interpolation, but they still perform worse than the Transformer-based methods. The proposed SFDN performs comparably to MHASTR and achieves strong reconstruction accuracy across different regions.

3.3. Model Complexity

To examine model complexity, SFDN is compared with several representative super-resolution methods, including SCNN, DSC/MS, SRTT, and MHASTR. According to their network designs, these methods fall into two groups: CNN-based models, represented by SCNN and DSC/MS, and Transformer-based models, represented by SRTT and MHASTR.

Table 5 compares the computational cost of these models for reconstructing full snapshots (3 × 1024 × 1024) from the isotropic turbulence dataset in the ×4 super-resolution setting. The comparison includes parameter count and floating-point operations (FLOPs). As the table shows, CNN-based models are more efficient in both respects. However, this lower cost is usually accompanied by weaker reconstruction performance in turbulence super-resolution.

By contrast, Transformer-based models make use of the attention mechanism and usually achieve higher reconstruction accuracy. But this also leads to much larger parameter sizes and higher computational cost. For example, SRTT has 214 million parameters and requires 1618G FLOPs, which greatly reduces its inference speed.

To evaluate the physical consistency of the reconstruction results, the mean velocity profile and the Reynolds stresses normalized by the friction velocity are examined as functions of the dimensionless wall distance $y^{+}$. Figure 14 shows that all super-resolution methods capture the overall shape of the mean velocity profile at both upscaling factors, indicating that this fundamental turbulence characteristic is well recovered.

Figure 15 and Figure 16 show the Reynolds stresses normalized by the friction velocity as a function of dimensionless wall distance at r = 4 and r = 8, respectively. At r = 4, bicubic interpolation shows clear deviations from the DNS results, while SCNN and DSC/MS produce small errors near the wall. SRTT, MHASTR, and the proposed SFDN all agree closely with the DNS data, and the SFDN curve almost overlaps with the DNS curve.
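The wall-unit statistics discussed here can be sketched as follows, using the standard scalings $y^{+} = y u_{τ} / ν$, $U^{+} = \bar{u} / u_{τ}$, and Reynolds stresses divided by $u_{τ}^{2}$. The snapshot layout `(nx, ny)`, averaging along x only, and the choice of returned stress components are assumptions for this example; the function name is hypothetical.

```python
import numpy as np

def wall_unit_statistics(u, v, y, u_tau, nu):
    """Mean velocity profile and Reynolds stresses in wall units.

    u, v: streamwise and wall-normal velocity, shape (nx, ny), x homogeneous.
    y: wall distances, shape (ny,). u_tau: friction velocity. nu: viscosity.
    Returns (y_plus, u_plus, uu_plus, uv_plus).
    """
    u_mean = u.mean(axis=0)
    u_fluct = u - u_mean
    v_fluct = v - v.mean(axis=0)
    y_plus = y * u_tau / nu
    u_plus = u_mean / u_tau
    # Reynolds stresses averaged along x and normalized by u_tau squared.
    uu_plus = (u_fluct * u_fluct).mean(axis=0) / u_tau ** 2
    uv_plus = (u_fluct * v_fluct).mean(axis=0) / u_tau ** 2
    return y_plus, u_plus, uu_plus, uv_plus
```

Evaluating the same function on the DNS and reconstructed snapshots gives profiles that can be overlaid against $y^{+}$ for comparison.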

At r = 8 (Figure 16), the errors of the CNN-based methods increase clearly, which suggests that they have difficulty recovering complex turbulent flows at a high upscaling factor. This result is also consistent with Fukami et al. [40], who pointed out that physically meaningful quantities, such as turbulence statistics, Reynolds stresses, and energy spectra, are important criteria for evaluating reconstructed turbulent flows. The Transformer-based methods also show some performance loss because of the missing high-frequency information. Still, the proposed SFDN keeps the smallest deviation near the peak of the curve, which shows its stronger ability to recover the physical characteristics of complex turbulence under more challenging conditions.

Figure 17 shows the relationship between the L₂ error norm and the number of parameters of each model for the ×4 super-resolution task on the isotropic turbulence dataset. With the Mamba architecture and the frequency-domain feature extraction design, the proposed model reaches reconstruction accuracy comparable to that of MHASTR while using only 35.38% of its parameters. It also slightly outperforms MHASTR on some physical metrics. These results show that the proposed model achieves much better efficiency and remains highly competitive in turbulence super-resolution reconstruction.

4. Conclusions

We propose SFDN, a new turbulence super-resolution model based on a spatio-frequency fusion distillation framework. The model learns flow features in both the spatial and frequency domains, then combines them in a compact end-to-end network.

SFDN is built in three steps. First, the feature extraction block MFFB is redesigned from the standard Mamba module and a frequency attention mechanism. It uses two parallel branches to extract deep turbulence features from spatial and frequency information at the same time. Second, multiple MFFBs are stacked to form the MFDB, which gradually refines these features. An LRSA module is added at the end to improve local detail recovery and global information integration. Finally, multiple MFDBs are stacked to build the full SFDN model.

We evaluate SFDN on two classical turbulence datasets from four aspects: visual quality, quantitative accuracy, physical consistency, and model complexity. The results show that SFDN uses only 35.38% of the parameters of MHASTR, while achieving comparable or better performance on the main metrics. Its advantage becomes clearer when high-frequency information is strongly lost. This shows that the proposed strategy is effective and robust in recovering multi-scale turbulent structures.

The goal of SFDN is not to recover a unique fine-scale truth from coarse data. Instead, within the physical range covered by the training data, it aims to reconstruct fine-scale features that remain close to the DNS reference in statistics, spectra, gradient distributions, and major flow structures.

Overall, SFDN provides an efficient and low-cost framework for reconstructing high-resolution turbulence fields. It is especially useful when data quality is reduced by compression or limited sensor resolution. Although this work focuses on isotropic turbulence and turbulent channel flow, the model may also be extended to more complex flows, such as oceanic, atmospheric, and aeronautical turbulence. In future work, we will test the framework on real data, introduce stronger physical constraints, and explore model compression for real-time applications.

Author Contributions

X.L.: Conceptualization (equal); Investigation (equal); Writing - review and editing (equal); Funding acquisition (equal); Supervision (equal). J.Y.: Investigation (equal); Writing - original draft (equal); Writing - review and editing (equal); Data curation (equal). Y.Z.: Methodology (equal); Software (equal); Writing - original draft (equal); Writing - review and editing (equal); Data curation (equal). Q.Q.: Writing - review and editing (equal); Visualization (equal); Funding acquisition (equal). D.S.: Formal analysis (equal); Resources (equal); Validation (equal); Supervision (equal). All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Natural Science Foundation of Qingdao Municipality (Grant Nos. 24-4-4-zrjj-128-jch and 24-4-4-zrjj-92-jch), the Natural Science Foundation of Shandong Province (Grant No. ZR2025MS1021), and the National Natural Science Foundation of China (Grant No. 62401311).

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

DNS	Direct Numerical Simulation
LES	Large Eddy Simulation
JHTDB	Johns Hopkins Turbulence Database
PIV	Particle Image Velocimetry
SFDN	Spatio-Frequency Fusion Distillation Network
MFFB	Mamba-Frequency Fusion Block
MFDB	Mamba-Frequency Distillation Block
LRSA	Local Region Self-Attention
SSM	State Space Model
FPA	Frequency-Enhanced Pixel Attention
FFT	Fast Fourier Transform
BSConv	Blueprint Separable Convolution
TKE	Turbulent Kinetic Energy
CNN	Convolutional Neural Network
SSIM	Structural Similarity Index Measure
RMS	Root Mean Square
PDF	Probability Density Function
$Re_{λ}$	Taylor Microscale Reynolds Number
$η$	Kolmogorov Length Scale
$k_{max}$	Maximum Resolved Wavenumber
$Re_{τ}$	Friction Reynolds Number
$x, y, z$	Streamwise, Wall-Normal, and Spanwise Spatial Coordinates
$u, v, w$	Velocity Components in the Streamwise, Wall-Normal, and Spanwise Directions
$u', v', w'$	Velocity Fluctuations
$\bar{u}$	Mean Velocity
$y^{+}$	Dimensionless Wall Distance
$u_{τ}$	Friction Velocity
$τ_{w}$	Wall Shear Stress
$ρ$	Fluid Density
$ν$	Kinematic Viscosity
$δ$	Channel Half-Height
$k$	Turbulent Kinetic Energy
$α, β$	Balance Parameters for the Physics-Guided Loss Function

References

1. Mashayek, A.; Reynard, N.; Zhai, F.; Srinivasan, K.; Jelley, A.; Garabato, A.N.; Caulfield, C.-C.P. Deep ocean learning of small scale turbulence. Geophys. Res. Lett. 2022, 49, e2022GL098039.
2. Dangi, N.; Sodja, J.; Ferreira, C.S.; Yu, W. The effect of turbulent coherent structures in atmospheric flow on wind turbine loads. Renew. Energy 2025, 241, 122248.
3. Saito, M.; Nagao, J.; Yamada, T.; Pillai, A.L.; Kurose, R. Large-eddy simulation of blade-turbulence interaction in a cyclorotor system. Aerosp. Sci. Technol. 2024, 146, 108921.
4. Zhang, W.; Xia, M.; Kou, J. A Scientometric Investigation of Artificial Intelligence for Fluid Mechanics: Emerging Topics and Active Groups. Prog. Aerosp. Sci. 2025, 157, 101130.
5. Kim, S.-H.; Kim, J.-H.; Chun, H.-Y.; Sharman, R.D. Global response of upper-level aviation turbulence from various sources to climate change. npj Clim. Atmos. Sci. 2023, 6, 92.
6. de Wit, X.M.; Fruchart, M.; Khain, T.; Toschi, F.; Vitelli, V. Pattern formation by turbulent cascades. Nature 2024, 627, 515–521.
7. Park, D.; Lozano-Durán, A. The coherent structure of the energy cascade in isotropic turbulence. Sci. Rep. 2025, 15, 14.
8. Sutherland, B.R.; DiBenedetto, M.; Kaminski, A.; van den Bremer, T. Fluid dynamics challenges in predicting plastic pollution transport in the ocean: A perspective. Phys. Rev. Fluids 2023, 8, 070701.
9. Kim, Y.; Ghosh, D.; Constantinescu, E.M.; Balakrishnan, R. GPU-accelerated DNS of compressible turbulent flows. Comput. Fluids 2023, 251, 105744.
10. Hughes, K.G.; Moum, J.N.; Rudnick, D.L. A turbulence data reduction scheme for autonomous and expendable profiling floats. Ocean Sci. 2023, 19, 193–207.
11. Fukami, K.; Taira, K. Single-snapshot machine learning for super-resolution of turbulence. J. Fluid Mech. 2024, 1001, A32.
12. Yang, Z.; Yang, H.; Yin, Z. Super-resolution reconstruction for the three-dimensional turbulence flows with a back-projection network. Phys. Fluids 2023, 35, 055123.
13. Du, P.; Parikh, M.H.; Fan, X.; Liu, X.-Y.; Wang, J.-X. Conditional neural field latent diffusion model for generating spatiotemporal turbulence. Nat. Commun. 2024, 15, 10416.
14. Liu, B.; Tang, J.; Huang, H.; Lu, X.-Y. Deep learning methods for super-resolution reconstruction of turbulent flows. Phys. Fluids 2020, 32, 025105.
15. Chen, M.; Wang, L.; Luo, Z.; Xu, J.; Zhang, B.; Li, Y.; Tan, A.C.C. Super-resolution reconstruction framework of wind turbine wake: Design and application. Ocean Eng. 2023, 288, 116099.
16. Fukami, K.; Fukagata, K.; Taira, K. Super-resolution reconstruction of turbulent flows with machine learning. J. Fluid Mech. 2019, 870, 106–120.
17. Zhou, X.; Jin, X.; Laima, S.; Li, H. Large-scale flow field super-resolution via local-global fusion convolutional neural networks. Phys. Fluids 2024, 36, 055130.
18. Sofos, F.; Drikakis, D.; Kokkinakis, I.W. Deep learning architecture for sparse and noisy turbulent flow data. Phys. Fluids 2024, 36, 035155.
19. Liu, X.; Li, X.; Zhang, Y.; Guo, T.; Song, D.; Bao, M. A physics-informed convolutional network based on feature fusion for high-resolution flow field reconstruction from sparse and noisy data. Phys. Fluids 2025, 37, 073622.
20. Hsu, C.-C.; Lee, C.-M.; Chou, Y.-S. DRCT: Saving image super-resolution away from information bottleneck. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–22 June 2024.
21. Islam, S.; Elmekki, H.; Elsebai, A.; Bentahar, J.; Drawel, N.; Rjoub, G.; Pedrycz, W. A comprehensive survey on applications of Transformers for deep learning tasks. Expert Syst. Appl. 2024, 241, 122666.
22. Rao, Y.; Zhao, W.; Tang, Y.; Zhou, J.; Lim, S.N.; Lu, J. HorNet: Efficient high-order spatial interactions with recursive gated convolutions. Adv. Neural Inf. Process. Syst. 2022, 35, 10353–10366.
23. Xu, Q.; Zhuang, Z.; Pan, Y.; Wen, B. Super-resolution reconstruction of turbulent flows with a Transformer-based deep learning framework. Phys. Fluids 2023, 35, 055130.
24. Li, X.; Yang, Z.; Yang, H. Hybrid-attention-based Swin-Transformer super-resolution reconstruction for tomographic particle image velocimetry. Phys. Fluids 2024, 36, 065132.
25. Xu, B.; Zhou, Y.; Bian, X. Self-supervised learning based on Transformer for flow reconstruction and prediction. Phys. Fluids 2024, 36, 023607.
26. Liu, X.; Zhang, Y.; Guo, T.; Li, X.; Song, D.; Yang, H. A multi-scale hybrid attention Swin-Transformer-based model for the super-resolution reconstruction of turbulence. Nonlinear Dyn. 2025, 113, 15815–15844.
27. Cheng, R.; Shamooni, A.; Zirwes, T.; Kronenburg, A. Improved super-resolution reconstruction of turbulent flows with spectral loss function. Phys. Fluids 2025, 37, 035208.
28. Perlman, E.; Burns, R.; Li, Y.; Meneveau, C. Data exploration of turbulence simulations using a database cluster. In Proceedings of the 2007 ACM/IEEE Conference on Supercomputing, Reno, NV, USA, 10–16 November 2007.
29. Li, Y.; Perlman, E.; Wan, M.; Yang, Y.; Meneveau, C.; Burns, R.; Chen, S.; Szalay, A.; Eyink, G. A public turbulence database cluster and applications to study Lagrangian evolution of velocity increments in turbulence. J. Turbul. 2008, 9, N31.
30. Cimarelli, A.; Boga, G.; Pavan, A.; Costa, P.; Stalio, E. Spatially evolving cascades in wall turbulence with and without interface. J. Fluid Mech. 2024, 987, A4.
31. Graham, J.; Kanov, K.; Yang, X.I.A.; Lee, M.; Malaya, N.; Lalescu, C.C.; Burns, R.; Eyink, G.; Szalay, A.; Moser, R.D.; et al. A web services accessible database of turbulent channel flow and its use for testing a new integral wall model for LES. J. Turbul. 2016, 17, 181–215.
32. Keles, F.D.; Wijewardena, P.M.; Hegde, C. On the computational complexity of self-attention. In Proceedings of the International Conference on Algorithmic Learning Theory, Singapore, 20–23 February 2023.
33. Gu, A.; Dao, T. Mamba: Linear-time sequence modeling with selective state spaces. arXiv 2023, arXiv:2312.00752.
34. Haase, D.; Amthor, M. Rethinking depthwise separable convolutions: How intra-kernel correlations lead to improved mobile nets. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020.
35. Zhu, L.; Liao, B.; Zhang, Q.; Wang, X.; Liu, W.; Wang, X. Vision Mamba: Efficient visual representation learning with bidirectional state space model. In Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria, 21–27 July 2024.
36. Chen, J.; Duanmu, C.; Long, H. Large kernel frequency-enhanced network for efficient single image super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–22 June 2024.
37. Zhu, J. Transformers without normalization. arXiv 2025, arXiv:2503.10622.
38. Liu, X.; Liu, J.; Tang, J.; Wu, G. CATANet: Efficient Content-Aware Token Aggregation for Lightweight Image Super-Resolution. In Proceedings of the Computer Vision and Pattern Recognition Conference, Nashville, TN, USA, 11–15 June 2025.
39. Chen, S.; Bao, T.; Givi, P.; Zheng, C.; Jia, X. Reconstructing turbulent flows using spatio-temporal physical dynamics. ACM Trans. Intell. Syst. Technol. 2024, 15, 1–18.
40. Fukami, K.; Fukagata, K.; Taira, K. Machine-learning-based spatio-temporal super resolution reconstruction of turbulent flows. J. Fluid Mech. 2021, 909, A9.

Figure 1. The architecture of the spatio-frequency fusion distillation network for turbulence super-resolution.

Figure 2. The detailed architecture of the Mamba-frequency fusion block.

Figure 3. The detailed architecture of the frequency-enhanced pixel attention module.

Figure 4. The detailed structure of the local region self-attention module.

Figure 5. Reconstruction error fields relative to the DNS reference for six super-resolution methods on representative 64 × 64 patches from the forced isotropic turbulence dataset at r = 4, shown with a fixed color scale for fair comparison.

Figure 6. Reconstruction error fields relative to the DNS reference for six super-resolution methods on representative 64 × 64 patches from the forced isotropic turbulence dataset at r = 8, shown with a fixed color scale for fair comparison.

Figure 7. The comparison of filtered velocity field of six super-resolution methods, the corresponding DNS data, and low-resolution data from the forced isotropic dataset at r = 4.

Figure 8. The comparison of filtered velocity field of six super-resolution methods, the corresponding DNS data and low-resolution data from the forced isotropic dataset at r = 8.

Figure 9. The kinetic energy spectra computed from a representative reconstructed snapshot produced by the six super-resolution methods, together with the corresponding DNS data and low-resolution data from the forced isotropic dataset.

Figure 10. The probability density functions of the normalized longitudinal and transverse velocity gradients computed from a representative reconstructed snapshot produced by the six super-resolution methods, together with the corresponding DNS data and low-resolution data from the forced isotropic dataset. (a) The PDFs of the longitudinal velocity gradients at r = 4; (b) The PDFs of the transverse velocity gradients at r = 4; (c) The PDFs of the longitudinal velocity gradients at r = 8; (d) The PDFs of the transverse velocity gradients at r = 8.

Figure 11. The visualization results of reconstruction patches of six super-resolution methods, the corresponding DNS data and low-resolution data from the channel turbulent flow dataset at r = 4.

Figure 12. The visualization results of reconstruction patches of six super-resolution methods, the corresponding DNS data and low-resolution data from the channel turbulent flow dataset at r = 8.

Figure 13. Average correlation coefficients between reconstruction snapshots by six super-resolution methods and the corresponding DNS data from the channel turbulent flow dataset. (a) The average correlation coefficients of u at r = 4; (b) The average correlation coefficients of v at r = 4; (c) The average correlation coefficients of w at r = 4; (d) The average correlation coefficients of u at r = 8; (e) The average correlation coefficients of v at r = 8; (f) The average correlation coefficients of w at r = 8.

Figure 14. Mean velocity profiles of the reconstructed snapshot by six super-resolution methods, the corresponding DNS data and low-resolution data from the channel turbulent flow dataset.

Figure 15. The Reynolds stresses normalized by the friction velocity as a function of dimensionless wall distance, computed from a representative reconstructed snapshot produced by the six super-resolution methods, together with the corresponding DNS data and low-resolution data from the channel turbulent flow dataset at r = 4.

Figure 16. The Reynolds stresses normalized by the friction velocity as a function of dimensionless wall distance, computed from a representative reconstructed snapshot produced by the six super-resolution methods, together with the corresponding DNS data and low-resolution data from the channel turbulent flow dataset at r = 8.
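The normalization used in Figures 15 and 16 can be sketched as follows, assuming the usual definitions: fluctuations are taken about the mean over the samples at one wall distance, the stress is the mean product of two fluctuating components divided by the square of the friction velocity, and the dimensionless wall distance is y u_τ / ν. The function names, argument names, and sample values are hypothetical and are not taken from the paper.

```python
def wall_units(y, u_tau, nu):
    """Dimensionless wall distance y+ = y * u_tau / nu (assumed definition)."""
    return y * u_tau / nu


def normalized_reynolds_stress(a, b, u_tau):
    """Mean of a'b' over the samples at one wall distance, divided by u_tau**2.

    `a` and `b` are samples of two velocity components (for example u and v)
    taken at the same wall distance; primes denote deviations from the mean.
    """
    if len(a) != len(b) or not a:
        raise ValueError("components must be non-empty and of equal length")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    stress = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    return stress / u_tau ** 2


# Toy samples at a single wall distance (illustrative values only).
u = [1.0, 2.0, 3.0]
v = [3.0, 2.0, 1.0]
print(normalized_reynolds_stress(u, u, u_tau=1.0))  # normal stress
print(normalized_reynolds_stress(u, v, u_tau=1.0))  # shear stress
```

Applying the same function to each wall-normal index of a snapshot yields one profile per stress component, which can then be plotted against `wall_units`.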

Figure 17. The relationship between the $L_{2}$ error norms and the parameters of five deep learning models for the $\times 4$ super-resolution task evaluated on the isotropic turbulence dataset.

Table 1. The configuration parameters of the SFDN network.

Configuration Parameter	Value
$α$	1
$β$	1
Number of MFDB	8
Learning rate	$1.0 \times 10^{-4}$
Input data size	$3 \times 64 \times 64$
Batch size	16
Epoch	3000
Optimization algorithm	Adam

Table 2. Quantitative comparisons of six super-resolution models for isotropic turbulence.

Ratio	Evaluation Index	Bicubic	SCNN	DSC/MS	SRTT	MHASTR	SFDN
r = 4	L₂	0.0566	0.0376	0.0348	0.0318	0.0221	0.0231
r = 4	SSIM	0.9320	0.9698	0.9743	0.9784	0.9895	0.9886
r = 8	L₂	0.1105	0.0987	0.0955	0.0871	0.0815	0.0835
r = 8	SSIM	0.7737	0.8106	0.8214	0.8427	0.8617	0.8525
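For readers who want to compute an error metric of the kind reported as L₂ in Table 2, the sketch below uses a relative L₂ norm, ‖prediction − reference‖₂ / ‖reference‖₂, on flattened fields. This particular normalization is an assumption for illustration, since the exact definition is given elsewhere in the paper, and `relative_l2_error` is a hypothetical name.

```python
import math


def relative_l2_error(pred, ref):
    """Relative L2 error ||pred - ref||_2 / ||ref||_2 for flattened fields.

    Assumption: the error is normalized by the norm of the reference field.
    """
    if len(pred) != len(ref):
        raise ValueError("fields must have the same number of samples")
    diff_norm = math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)))
    ref_norm = math.sqrt(sum(r ** 2 for r in ref))
    if ref_norm == 0.0:
        raise ValueError("reference field has zero norm")
    return diff_norm / ref_norm


# Toy data: a perfect reconstruction and an all-zero reconstruction.
dns = [3.0, 4.0]
print(relative_l2_error([3.0, 4.0], dns))
print(relative_l2_error([0.0, 0.0], dns))
```

Under this normalization an all-zero prediction scores 1, which gives a useful upper reference point when reading the tabulated values.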

Table 3. The RMS of filtered velocity field for the isotropic turbulence dataset for three cases at r = 4 and r = 8.

Ratio	Case	DNS	LR	Bicubic	SCNN	DSC/MS	SRTT	MHASTR	SFDN
r = 4	k > 32	0.466	0.446	0.448	0.463	0.464	0.464	0.465	0.465
r = 4	k > 64	0.356	0.332	0.333	0.353	0.354	0.355	0.356	0.355
r = 4	k > 128	0.255	0.225	0.225	0.251	0.251	0.253	0.254	0.254
r = 8	k > 32	0.466	0.446	0.437	0.443	0.445	0.448	0.452	0.452
r = 8	k > 64	0.356	0.332	0.317	0.327	0.329	0.333	0.338	0.338
r = 8	k > 128	0.255	0.223	0.199	0.215	0.218	0.224	0.228	0.229

Table 4. Quantitative comparisons for six super-resolution models for channel turbulent flow.

Ratio	Evaluation Index	Bicubic	SCNN	DSC/MS	SRTT	MHASTR	SFDN
r = 4	L₂	0.0630	0.0471	0.0426	0.0426	0.0300	0.0339
r = 4	SSIM	0.9552	0.9742	0.9769	0.9761	0.9861	0.9832
r = 8	L₂	0.1140	0.0987	0.0955	0.0871	0.0815	0.8247
r = 8	SSIM	0.8532	0.8797	0.8892	0.8843	0.9040	0.8988

Table 5. Comparison of computational cost of five deep learning models for the $\times 4$ super-resolution task on a full snapshot of the isotropic turbulence dataset.

Models	Params (M)	FLOPs (G)
SCNN	0.1842	18.8416
DSC/MS	0.0929	82.1518
SRTT	214.3480	1618.5491
MHASTR	9.0871	963.5684
SFDN	3.2155	438.6078
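The relative cost of SFDN can be read directly off Table 5. The short sketch below computes parameter and FLOP ratios between any two tabulated models. Taking MHASTR as the reference is an assumption made here because it has the lowest L₂ errors in Tables 2 and 4; since the table rounds to four decimals, the last digit of a ratio may differ slightly from the percentage quoted in the abstract.

```python
# Values copied from Table 5: (parameters in millions, FLOPs in billions).
COSTS = {
    "SCNN": (0.1842, 18.8416),
    "DSC/MS": (0.0929, 82.1518),
    "SRTT": (214.3480, 1618.5491),
    "MHASTR": (9.0871, 963.5684),
    "SFDN": (3.2155, 438.6078),
}


def cost_ratio(model, reference):
    """Return (parameter ratio, FLOP ratio) of `model` relative to `reference`."""
    params, flops = COSTS[model]
    ref_params, ref_flops = COSTS[reference]
    return params / ref_params, flops / ref_flops


# Assumption: MHASTR is used as the reference model for the comparison.
params_ratio, flops_ratio = cost_ratio("SFDN", "MHASTR")
print(f"parameters: {params_ratio:.2%} of MHASTR")
print(f"FLOPs:      {flops_ratio:.2%} of MHASTR")
```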

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Liu, X.; Yuan, J.; Zhang, Y.; Song, D.; Qi, Q. Super-Resolution Reconstruction of Turbulence via Spatio-Frequency Distillation and Physics-Guided Learning. J. Mar. Sci. Eng. 2026, 14, 1035. https://doi.org/10.3390/jmse14111035

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
